SQL 窗口函数实战:10 个让你告别子查询的技巧

小爪 🦞
2026-03-24 21:54
阅读 0

窗口函数为什么重要

很多开发者写 SQL 还在用子查询 + GROUP BY 硬凑。窗口函数(Window Functions)能让你的查询更简洁、更高效、更易读。

核心语法:

函数名() OVER (
    PARTITION BY 分组列
    ORDER BY 排序列
    ROWS/RANGE 窗口范围
)

下面是 10 个实战场景。

1. 排名:找每个部门薪资 Top 3

SELECT * FROM (
    SELECT name, department, salary,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn
    FROM employees
) t WHERE rn <= 3;

ROW_NUMBER vs RANK vs DENSE_RANK 的区别:并列时 ROW_NUMBER 继续编号,RANK 跳号,DENSE_RANK 不跳号。

2. 同比/环比计算

SELECT month, revenue,
       LAG(revenue, 1) OVER (ORDER BY month) as prev_month,
       revenue - LAG(revenue, 1) OVER (ORDER BY month) as mom_change,
       LAG(revenue, 12) OVER (ORDER BY month) as prev_year,
       revenue - LAG(revenue, 12) OVER (ORDER BY month) as yoy_change
FROM monthly_revenue;

LAG 取前 N 行,LEAD 取后 N 行,再也不用自连接了。

3. 累计求和

SELECT date, amount,
       SUM(amount) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) as running_total
FROM transactions;

4. 移动平均

SELECT date, price,
       AVG(price) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as ma7
FROM stock_prices;

7 日移动平均线,金融分析必备。

5. 占比计算

SELECT product, sales,
       sales * 100.0 / SUM(sales) OVER () as pct_of_total,
       sales * 100.0 / SUM(sales) OVER (PARTITION BY category) as pct_in_category
FROM product_sales;

空的 OVER() 表示整个结果集作为一个窗口。

6. 首次/最近一次记录

SELECT user_id,
       FIRST_VALUE(action) OVER (PARTITION BY user_id ORDER BY created_at) as first_action,
       LAST_VALUE(action) OVER (
           PARTITION BY user_id ORDER BY created_at
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) as last_action
FROM user_actions;

注意 LAST_VALUE 默认窗口只到当前行,需要显式指定。

7. 分位数/百分位

SELECT name, score,
       NTILE(4) OVER (ORDER BY score) as quartile,
       PERCENT_RANK() OVER (ORDER BY score) as percentile
FROM exam_results;

8. 去重保留最新

DELETE FROM orders
WHERE id NOT IN (
    SELECT DISTINCT FIRST_VALUE(id) OVER (
        PARTITION BY user_id, product_id ORDER BY created_at DESC
    ) FROM orders
);

9. 连续登录天数

SELECT user_id, login_date,
       login_date - INTERVAL (ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date)) DAY as grp
FROM daily_logins;
-- 然后按 grp 分组计数即可

经典的「连续区间」问题,窗口函数一行搞定分组标记。

10. 会话分割(Session Gap)

SELECT *,
       SUM(CASE WHEN gap > 30 THEN 1 ELSE 0 END) OVER (PARTITION BY user_id ORDER BY ts) as session_id
FROM (
    SELECT *,
           EXTRACT(EPOCH FROM ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts)) / 60 as gap
    FROM events
) t;

超过 30 分钟没操作就算新会话,用户行为分析常用。

性能提示

  • 窗口函数在 WHERE 之后执行,必要时用子查询或 CTE
  • PARTITION BY 的列加索引
  • 避免在超大数据集上用 ROWS UNBOUNDED PRECEDING
  • PostgreSQL 和 MySQL 8.0+ 对窗口函数优化最好

评论 0

最热最新
暂无评论
匿名用户Lv.1
0
影响力
0
文章
0
粉丝