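1. Window Functions
A window function performs a calculation over a "window" (a specified set of rows) of the result set. Unlike GROUP BY, it keeps every original row and adds the aggregate or ranking result to it.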
1.1 Core Syntax
SELECT column1, column2,
       window_function() OVER (
           PARTITION BY partition_column
           ORDER BY sort_column
           [ROWS | RANGE frame_definition]
       ) AS alias
FROM table_name;
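1.2 Common Window Functions
- Ranking functions:
ROW_NUMBER()  -- row number (unique, no duplicates)
RANK()        -- rank (ties allowed; the ranks after a tie are skipped)
DENSE_RANK()  -- dense rank (ties allowed; no ranks are skipped)
Example: rank students by score
-- RANK and DENSE_RANK are reserved words in MySQL 8.0, so the aliases use other names
SELECT name, score,
       ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,
       RANK()       OVER (ORDER BY score DESC) AS score_rank,
       DENSE_RANK() OVER (ORDER BY score DESC) AS dense_score_rank
FROM students;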
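- Aggregate functions:
SUM() OVER (PARTITION BY partition_column)  -- sum per partition
AVG() OVER (ORDER BY sort_column ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)  -- moving average over the current row and the two rows before it
Example: running total of student scores in enrollment order
SELECT name, score,
       SUM(score) OVER (ORDER BY enroll_date) AS cumulative_sum
FROM students;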
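- Offset and distribution functions:
LAG(column, n)   -- value from the row n rows before the current row
LEAD(column, n)  -- value from the row n rows after the current row
NTILE(4)         -- divide the rows into 4 groups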
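The offset and distribution functions have no example in this section, so here is a minimal sketch. It assumes the students(name, score) table from the earlier examples; the column aliases are illustrative only.

```sql
-- Assumes students(name, score) as used in the ranking example.
SELECT name, score,
       LAG(score, 1)  OVER (ORDER BY score DESC) AS prev_score,  -- NULL on the first row
       LEAD(score, 1) OVER (ORDER BY score DESC) AS next_score,  -- NULL on the last row
       NTILE(4)       OVER (ORDER BY score DESC) AS quartile     -- 1 = highest-scoring group
FROM students;
```

Each function carries its own OVER clause, so the three columns could use different orderings if needed.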
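2. Stored Procedures and Functions
2.1 Stored Procedures
A stored procedure encapsulates complex logic so that it can be called repeatedly:
-- Create a stored procedure (example: select students by age range)
DELIMITER //
CREATE PROCEDURE GetStudentsByAge(IN min_age INT, IN max_age INT)
BEGIN
    SELECT name, age FROM students WHERE age BETWEEN min_age AND max_age;
END //
DELIMITER ;
-- Call the stored procedure
CALL GetStudentsByAge(18, 25);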
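2.2 User-Defined Functions
A user-defined function returns a single value and can be used inside queries:
-- Create a function (example: compute a discounted price)
DELIMITER //
CREATE FUNCTION CalculateDiscount(price DECIMAL(10,2), discount_rate DECIMAL(3,2))
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
    RETURN price * (1 - discount_rate);
END //
DELIMITER ;
-- Use the function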
SELECT product_name, price, CalculateDiscount(price, 0.1) AS discounted_price
FROM products;
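3. Triggers
A trigger runs automatically before or after a specified event (INSERT, UPDATE, or DELETE):
-- Create a trigger (example: update stock when an order is inserted)
DELIMITER //
CREATE TRIGGER UpdateInventoryAfterOrder
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
    UPDATE products SET stock = stock - NEW.quantity WHERE product_id = NEW.product_id;
END //
DELIMITER ;
-- After an order is inserted, the stock decreases automatically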
INSERT INTO orders (product_id, quantity) VALUES (101, 5);
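4. Dynamic SQL
Dynamic SQL builds flexible queries at run time (example: filter by an optional condition):
-- Use a prepared statement (MySQL example).
-- IF ... END IF is only valid inside a stored program, so this snippet belongs in the
-- body of a procedure that receives age_filter as an INT parameter.
SET @sql = 'SELECT * FROM students WHERE 1=1';
-- Add the condition dynamically
IF age_filter IS NOT NULL THEN
    SET @sql = CONCAT(@sql, ' AND age = ', age_filter);
END IF;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;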
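Concatenating a value into the statement text is a risk when the value is user-supplied. The following sketch shows one way to wrap the same logic in a procedure and pass the value through a ? placeholder instead. The procedure name SearchStudentsByAge and the user variable @age are hypothetical and chosen only for illustration.

```sql
DELIMITER //
-- Hypothetical procedure; assumes a students table with an age column.
CREATE PROCEDURE SearchStudentsByAge(IN age_filter INT)
BEGIN
    SET @sql = 'SELECT * FROM students WHERE 1=1';
    IF age_filter IS NOT NULL THEN
        -- Append a placeholder rather than the value itself
        SET @sql = CONCAT(@sql, ' AND age = ?');
        SET @age = age_filter;          -- EXECUTE ... USING takes user variables
        PREPARE stmt FROM @sql;
        EXECUTE stmt USING @age;
    ELSE
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
    END IF;
    DEALLOCATE PREPARE stmt;
END //
DELIMITER ;

CALL SearchStudentsByAge(20);    -- students aged 20
CALL SearchStudentsByAge(NULL);  -- no filter: all students
```

Only the structure of the statement is built dynamically here; the value is bound at execution time.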
5. JSON Data Handling
Modern databases support a JSON data type and JSON operations:
-- Create a table with a JSON column
CREATE TABLE user_profiles (
    user_id INT PRIMARY KEY,
    profile JSON
);
-- Insert JSON data
INSERT INTO user_profiles VALUES (1, '{"name": "Alice", "hobbies": ["reading", "music"]}');
-- Query JSON fields
SELECT user_id,
       profile->'$.name' AS name,
       JSON_EXTRACT(profile, '$.hobbies[0]') AS first_hobby
FROM user_profiles;
-- Update a JSON field
UPDATE user_profiles
SET profile = JSON_SET(profile, '$.age', 25)
WHERE user_id = 1;
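As a small follow-up sketch, JSON values can also be used for filtering. The example assumes the user_profiles table defined above and MySQL 5.7.13 or later, where the ->> operator is available.

```sql
-- Assumes the user_profiles table and sample row from this section.
SELECT user_id,
       profile->>'$.name' AS name   -- ->> unquotes the value: Alice rather than "Alice"
FROM user_profiles
WHERE JSON_CONTAINS(profile, '"music"', '$.hobbies');  -- hobbies array contains "music"
```

The second argument of JSON_CONTAINS is itself a JSON document, which is why the string music is wrapped in double quotes inside the SQL literal.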
6. Performance Optimization Tips
6.1 Analyzing the Execution Plan
Use EXPLAIN to see how the optimizer plans to execute a query:
EXPLAIN SELECT * FROM students WHERE age > 20;
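6.2 Index Optimization
- Covering index: the index contains every column the query needs, so the table rows do not have to be read.
- Avoid full table scans: create indexes on the columns used in WHERE and JOIN conditions.
- Composite index order: as a rule of thumb, put the more selective columns first. A composite index can only be used through its leftmost columns.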
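The three points above can be sketched as follows. The index names are hypothetical, and the example assumes a students table with age, score, and name columns.

```sql
-- Hypothetical index names; assumes students(id, name, age, score).
-- Single-column index to support filters on age
CREATE INDEX idx_students_age ON students (age);

-- Composite index: usable for filters on age, or on age and score together,
-- but not for a filter on score alone (leftmost-prefix rule)
CREATE INDEX idx_students_age_score_name ON students (age, score, name);

-- This query reads only age, score, and name, so the composite index can cover it
EXPLAIN SELECT name, score FROM students WHERE age = 20 AND score >= 90;
-- In MySQL, "Using index" in the Extra column of the EXPLAIN output
-- indicates that a covering index is being used.
```

Which index the optimizer actually chooses depends on the data distribution, so the EXPLAIN output is worth checking on real data.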
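6.3 Avoiding Implicit Type Conversion
Compare a column only with values of its own type. A mismatch forces a conversion, and when the column side is converted (for example, a string column compared with a number), the index on that column may not be used.
-- Avoid: compares a numeric column with a string literal
SELECT * FROM students WHERE id = '100';
-- Preferred: the literal matches the column type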
SELECT * FROM students WHERE id = 100;
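6.4 Pagination Optimization
LIMIT offset, size performs poorly at large offsets because the skipped rows are still read and then discarded. Filter on the last value seen instead:
-- Optimized form (based on an ordered, unique column)
SELECT * FROM students
WHERE id > 1000 -- largest id from the previous page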
ORDER BY id
LIMIT 10;
7. Recursive Queries (CTE)
Recursive common table expressions handle hierarchical data such as an organization chart:
-- Find all subordinates (example table: employee(id, name, manager_id))
WITH RECURSIVE Subordinates AS (
    SELECT id, name, manager_id FROM employee WHERE id = 1 -- root node (CEO)
    UNION ALL
    SELECT e.id, e.name, e.manager_id
    FROM employee e
    INNER JOIN Subordinates s ON e.manager_id = s.id
)
SELECT * FROM Subordinates;
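The query above also returns the root row itself. A possible variant, sketched here under the same employee(id, name, manager_id) assumption, carries a depth column so that the root can be excluded and the result ordered by level. The column name depth is illustrative.

```sql
WITH RECURSIVE Subordinates AS (
    -- Anchor member: the root node starts at depth 0
    SELECT id, name, manager_id, 0 AS depth
    FROM employee
    WHERE id = 1
    UNION ALL
    -- Recursive member: each direct report is one level deeper than its manager
    SELECT e.id, e.name, e.manager_id, s.depth + 1
    FROM employee e
    INNER JOIN Subordinates s ON e.manager_id = s.id
)
SELECT id, name, depth
FROM Subordinates
WHERE depth > 0          -- exclude the root itself
ORDER BY depth, id;
```

The depth column can also serve as a guard, for example by adding a condition such as s.depth < 10 to the recursive member when the data might contain cycles.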
8. Views and Materialized Views
8.1 Views
A view is a virtual table that simplifies complex queries:
CREATE VIEW HighScoreStudents AS
SELECT name, score
FROM students
WHERE score >= 90;
-- Use the view
SELECT * FROM HighScoreStudents;
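8.2 Materialized Views
A materialized view physically stores the query result. It requires database support (PostgreSQL has it, for example; MySQL has no native materialized views):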
CREATE MATERIALIZED VIEW SalesSummary AS
SELECT product_id, SUM(quantity) AS total_sales
FROM orders
GROUP BY product_id;
-- Refresh the materialized view
REFRESH MATERIALIZED VIEW SalesSummary;
9. Case Study: E-commerce Data Analysis
-- Order count, total amount, and most recent purchase date for each user
SELECT u.user_id,
       u.name,
       COUNT(o.order_id) AS order_count,
       SUM(o.amount) AS total_amount,
       MAX(o.order_date) AS last_order_date
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name;
-- Month-over-month sales growth (window function)
SELECT DATE_FORMAT(order_date, '%Y-%m') AS month,
       SUM(amount) AS monthly_sales,
       LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')) AS prev_month_sales,
       (SUM(amount) - LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')))
           / LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')) * 100 AS growth_rate -- percent; NULL for the first month
FROM orders
GROUP BY month;
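The growth query repeats the same LAG expression three times. One possible restructuring, sketched here for MySQL 8.0, aggregates in a CTE first and then applies a named window. The names monthly, sales_month, and w are illustrative, and the NULLIF guard against a zero previous month is an addition that the original query does not have.

```sql
WITH monthly AS (
    -- Step 1: one row per month
    SELECT DATE_FORMAT(order_date, '%Y-%m') AS sales_month,
           SUM(amount) AS monthly_sales
    FROM orders
    GROUP BY DATE_FORMAT(order_date, '%Y-%m')
)
-- Step 2: compare each month with the previous one
SELECT sales_month,
       monthly_sales,
       LAG(monthly_sales) OVER w AS prev_month_sales,
       (monthly_sales - LAG(monthly_sales) OVER w)
           / NULLIF(LAG(monthly_sales) OVER w, 0) * 100 AS growth_rate  -- percent
FROM monthly
WINDOW w AS (ORDER BY sales_month)
ORDER BY sales_month;
```

Separating aggregation from the window step keeps each part readable and makes the window definition a single point of change.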
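10. Chapter Exercises
- Use a window function to compute the difference between each student's score and the class average.
- Write a stored procedure that deletes orders by user ID and automatically restores the corresponding stock.
- Optimize the following paginated query (assume the table holds millions of rows):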
SELECT * FROM orders ORDER BY order_date DESC LIMIT 100000, 10;