当前位置: 首页> 游戏> 手游 > 北京互联网_彭阳门户网站建设_全网营销推广方案_十大少儿编程教育品牌

北京互联网_彭阳门户网站建设_全网营销推广方案_十大少儿编程教育品牌

时间:2025/7/9 23:22:02来源:https://blog.csdn.net/weixin_30777913/article/details/146484900 浏览次数:1次
北京互联网_彭阳门户网站建设_全网营销推广方案_十大少儿编程教育品牌

Python控制台的程序,实现遍历MySQL中所有的SQL对象(表、视图、用户定义函数、存储过程和触发器等),并取得它们之间之前的依赖性关系,并列出三张表,第一张表的第一列是所有的SQL对象名称,第二列是它的数据的生成路径,路径中的相邻SQL对象之间用“->”隔开,如果有多条路径,就存储多条记录,第二张表是根据依赖性生成所有表的列表,依懒性从上到下依次递增,第三张表是根据依赖性生成所有第二张表中表数据的存储过程列表,依懒性从上到下依次递增,存储在Excel文件中。

程序总体设计

1. 系统架构
+-----------------+     +-----------------+     +-----------------+     +-----------------+
| MySQL元数据采集 | --> | 依赖关系分析引擎 | --> | 数据路径生成器 | --> | Excel输出模块 |
+-----------------+     +-----------------+     +-----------------+     +-----------------+
2. 模块划分
  • 元数据采集模块:通过information_schema获取所有数据库对象
  • 依赖分析模块:解析SQL定义获取对象依赖关系
  • 路径生成模块:构建依赖图谱并生成所有可能路径
  • 拓扑排序模块:为表和存储过程生成依赖顺序
  • Excel输出模块:使用pandas生成多Sheet的Excel文件
3. 技术选型
  • Python 3.8+
  • 数据库驱动:mysql-connector-python
  • SQL解析:sqlparse + 正则表达式
  • 数据处理:pandas
  • Excel输出:openpyxl
4. 数据结构设计
class DBObject:def __init__(self, name, obj_type, definition):self.name = name          # 对象名称self.obj_type = obj_type # 对象类型(TABLE/VIEW/PROCEDURE等)self.definition = definition # 原始定义SQLself.dependencies = []    # 直接依赖对象列表

详细设计方案

1. 元数据采集(示例SQL)
# 获取所有表/视图
SELECT table_name, table_type 
FROM information_schema.tables 
WHERE table_schema = DATABASE()# 获取存储过程/函数
SELECT routine_name, routine_type 
FROM information_schema.routines 
WHERE routine_schema = DATABASE()# 获取触发器
SELECT trigger_name, event_object_table 
FROM information_schema.triggers 
WHERE trigger_schema = DATABASE()
2. 依赖关系分析算法
def parse_dependencies(obj: DBObject):if obj.obj_type == 'VIEW':# 使用正则匹配FROM和JOIN子句pattern = r'\b(?:FROM|JOIN)\s+([\w`]+)'elif obj.obj_type in ('PROCEDURE', 'FUNCTION'):# 匹配表引用和CALL语句pattern = r'\b(?:FROM|INTO|UPDATE|JOIN|CALL)\s+([\w`]+)'elif obj.obj_type == 'TRIGGER':# 触发器关联表已在元数据中pattern = Noneif pattern:matches = re.findall(pattern, obj.definition, re.IGNORECASE)obj.dependencies = [m.strip('`') for m in matches]
3. 依赖图谱构建
class DependencyGraph:def __init__(self):self.graph = defaultdict(list)  # 邻接表:{节点: [依赖项]}self.reverse_graph = defaultdict(list)  # 逆邻接表def add_edge(self, from_node, to_node):self.graph[to_node].append(from_node)self.reverse_graph[from_node].append(to_node)
4. 路径生成算法(DFS实现)
def find_all_paths(graph, start):paths = []def dfs(node, path):if node not in graph or not graph[node]:paths.append('->'.join(path[::-1]))returnfor neighbor in graph[node]:dfs(neighbor, path + [neighbor])dfs(start, [start])return paths
5. 拓扑排序算法(Kahn算法)
def topological_sort(graph):in_degree = {u:0 for u in graph}for u in graph:for v in graph[u]:in_degree[v] += 1queue = deque([u for u in in_degree if in_degree[u] == 0])sorted_list = []while queue:u = queue.popleft()sorted_list.append(u)for v in graph[u]:in_degree[v] -= 1if in_degree[v] == 0:queue.append(v)return sorted_list
6. Excel输出结构
with pd.ExcelWriter('output.xlsx') as writer:# Sheet1: 对象路径表path_df.to_excel(writer, sheet_name='对象路径', index=False)# Sheet2: 表依赖顺序table_order_df.to_excel(writer, sheet_name='表依赖顺序', index=False)# Sheet3: 存储过程依赖顺序sp_order_df.to_excel(writer, sheet_name='存储过程顺序', index=False)

执行流程

  1. 初始化数据库连接
  2. 采集所有数据库对象元数据
  3. 构建依赖关系图谱
  4. 验证并处理循环依赖
  5. 生成路径数据(Sheet1)
  6. 分别对表和存储过程进行拓扑排序(Sheet2、Sheet3)
  7. 格式化输出到Excel文件

异常处理设计

  1. 数据库连接失败:捕获OperationalError,提示检查连接参数
  2. 循环依赖检测:使用Tarjan算法检测强连通分量
  3. SQL解析失败:记录解析错误日志,跳过当前对象
  4. 文件写入异常:捕获PermissionError,提示关闭已打开的Excel文件

优化策略

  1. 缓存机制:对已解析对象建立缓存字典
  2. 并行处理:使用ThreadPoolExecutor并行解析不同对象
  3. 增量更新:记录最后处理时间戳,支持增量分析
  4. 批处理优化:使用executemany批量查询对象定义

示例输出格式

表1:对象路径表
对象名称生成路径
view1table_a->view1
view1table_b->view1
proc1table_a->view1->proc1
表2:表依赖顺序
表名
table_a
table_b
table_c
表3:存储过程顺序
存储过程名
proc_base
proc_main

这个设计方案可以实现对MySQL数据库对象的全面依赖分析,最终输出结构清晰的Excel报告。实际开发时需要特别注意SQL解析的准确性和异常处理的完备性。

此代码实现了对MySQL数据库对象的元数据采集、依赖关系分析、路径生成和拓扑排序,并将结果输出到Excel文件中。同时包含了基本的异常处理和优化策略。

import mysql.connector
import sqlparse
import re
from collections import defaultdict, deque
import pandas as pd
from tarjan import tarjan as tarjan_algorithm
import traceback
from concurrent.futures import ThreadPoolExecutor
import timeclass DBObject:def __init__(self, name, obj_type, definition):self.name = nameself.obj_type = obj_typeself.definition = definitionself.dependencies = []def get_metadata(cnx):cursor = cnx.cursor()tables = []cursor.execute("""SELECT table_name, table_type FROM information_schema.tables WHERE table_schema = DATABASE()""")for row in cursor:name, obj_type = rowcursor2 = cnx.cursor(dictionary=True)cursor2.execute(f"SHOW CREATE {obj_type.lower()} {name}")result = cursor2.fetchone()tables.append(DBObject(name, obj_type, result[f'Create {obj_type}']))cursor2.close()routines = []cursor.execute("""SELECT routine_name, routine_type FROM information_schema.routines WHERE routine_schema = DATABASE()""")for row in cursor:name, obj_type = rowcursor2 = cnx.cursor(dictionary=True)cursor2.execute(f"SHOW CREATE {obj_type.lower()} {name}")result = cursor2.fetchone()routines.append(DBObject(name, obj_type, result[f'Create {obj_type}']))cursor2.close()triggers = []cursor.execute("""SELECT trigger_name, event_object_table FROM information_schema.triggers WHERE trigger_schema = DATABASE()""")for row in cursor:name, table = rowtriggers.append(DBObject(name, 'TRIGGER', table))all_objects = tables + routines + triggerscursor.close()return all_objectsdef parse_dependencies(obj: DBObject):if obj.obj_type == 'VIEW':pattern = r'\b(?:FROM|JOIN)\s+([\w`]+)'elif obj.obj_type in ('PROCEDURE', 'FUNCTION'):pattern = r'\b(?:FROM|INTO|UPDATE|JOIN|CALL)\s+([\w`]+)'elif obj.obj_type == 'TRIGGER':pattern = Noneif pattern:matches = re.findall(pattern, obj.definition, re.IGNORECASE)obj.dependencies = [m.strip('`') for m in matches]class DependencyGraph:def __init__(self):self.graph = defaultdict(list)self.reverse_graph = defaultdict(list)def add_edge(self, from_node, to_node):self.graph[to_node].append(from_node)self.reverse_graph[from_node].append(to_node)def find_all_paths(graph, start):paths = []def dfs(node, path):if node not in graph or not graph[node]:paths.append('->'.join(path[::-1]))returnfor neighbor in graph[node]:dfs(neighbor, path + [neighbor])dfs(start, [start])return pathsdef topological_sort(graph):in_degree = {u: 0 for u in graph}for u in graph:for v in graph[u]:in_degree[v] += 1queue = deque([u for u in in_degree if in_degree[u] == 0])sorted_list = []while queue:u = queue.popleft()sorted_list.append(u)for v in graph[u]:in_degree[v] -= 1if in_degree[v] == 0:queue.append(v)return sorted_listdef detect_cycles(graph):components = tarjan_algorithm(graph)for component in components:if len(component) > 1:return Truereturn Falsedef main():try:cnx = mysql.connector.connect(user='your_username',password='your_password',host='your_host',database='your_database')all_objects = get_metadata(cnx)cache = {}graph = DependencyGraph()def parse_object(obj):if obj.name in cache:return cache[obj.name]parse_dependencies(obj)for dep in obj.dependencies:graph.add_edge(obj.name, dep)cache[obj.name] = objreturn objwith ThreadPoolExecutor() as executor:all_objects = list(executor.map(parse_object, all_objects))if detect_cycles(graph.graph):print("检测到循环依赖,请检查数据库对象的定义。")returnpath_data = []for obj in all_objects:paths = find_all_paths(graph.graph, obj.name)for path in paths:path_data.append({'对象名称': obj.name, '生成路径': path})path_df = pd.DataFrame(path_data)tables = [obj.name for obj in all_objects if obj.obj_type == 'TABLE']table_graph = {k: v for k, v in graph.graph.items() if k in tables}table_order = topological_sort(table_graph)table_order_df = pd.DataFrame({'表名': table_order})procs = [obj.name for obj in all_objects if obj.obj_type in ('PROCEDURE', 'FUNCTION')]proc_graph = {k: v for k, v in graph.graph.items() if k in procs}proc_order = topological_sort(proc_graph)sp_order_df = pd.DataFrame({'存储过程名': proc_order})with pd.ExcelWriter('output.xlsx') as writer:path_df.to_excel(writer, sheet_name='对象路径', index=False)table_order_df.to_excel(writer, sheet_name='表依赖顺序', index=False)sp_order_df.to_excel(writer, sheet_name='存储过程顺序', index=False)cnx.close()except mysql.connector.OperationalError as e:print(f"数据库连接失败: {e}")except Exception as e:print(f"发生错误: {e}")traceback.print_exc()if __name__ == "__main__":main()
关键字:北京互联网_彭阳门户网站建设_全网营销推广方案_十大少儿编程教育品牌

版权声明:

本网仅为发布的内容提供存储空间,不对发表、转载的内容提供任何形式的保证。凡本网注明“来源:XXX网络”的作品,均转载自其它媒体,著作权归作者所有,商业转载请联系作者获得授权,非商业转载请注明出处。

我们尊重并感谢每一位作者,均已注明文章来源和作者。如因作品内容、版权或其它问题,请及时与我们联系,联系邮箱:809451989@qq.com,投稿邮箱:809451989@qq.com

责任编辑: