《并查集的黑科技：路径压缩×按秩合并×带权扩展｜算法核心原理与工程级实践指南》

技术文档

在这里插入图片描述

📃个人主页：island1314

⛺️ 欢迎关注：👍点赞 👂🏽留言 😍收藏 💞 💞 💞

生活总是不会一帆风顺，前进的道路也不会永远一马平川，如何面对挫折影响人生走向 – 《人民日报》

🔥 目录

- 一、概念与特点
- - 1. 核心概念
  - 2. 时间复杂度
- 二、应用场景
- 三、并查集技巧
- - 1. 普通并查集基本模板
  - 2. 带权并查集
  - - 2.1 基本概念
    - 2.2 权重更新原理
    - 2.3 合并操作示例
  - 3. 离散化并查集
  - - 3.1 基本概念
    - 3.2 案例代码
    - 3.3 适用场景
  - 4. 可撤销并查集
  - - 4.1 基本概念
    - 4.2 核心特性
    - 4.3 算法代码
  - 5. 其他进阶技巧
  - - 5.1 动态大小处理
    - 5.2 连通块分量统计优化
- 四、板子训练
- - 1. 合并集合
  - 2. 连通块中点的数量
  - 3. 食物链
- 五、经典例题
- - 例题1: 省份数量 (LeetCode 547)
  - 例题2: 连通网络的操作次数 (LeetCode 1319)
  - 案例3: 冗余连接 (LeetCode 684)
  - 案例4: 岛屿数量 (LeetCode 200)
  - 案例5: 等式方程的可满足性 (LeetCode 990)
  - 案例6: 除法求值 (LeetCode 399)
  - 案例7: 冗余连接 II (LeetCode 685)
  - 案例8：账户合并 (LeetCode 721)
  - 案例9：最长连续序列(LeetCode 128)
  - 案例10：移除最多的同行或同列石头(LeetCode 947)
  - 案例11：按公因数计算最大组件大小(LeetCode 952)
  - 案例12：由斜杠划分区域(LeetCode 959)
- 六、工程实践要点
- 七、扩展思考
- - 1. 逆向并查集
  - 2. 概率化并查集
  - 3. 机器学习应用

一、概念与特点

🔥 在一些应用问题中，需要将n个不同的元素划分成一些不相交的集合。开始时，每个元素自成一个单元素集合，然后按一定的规律将归于同一组元素的集合合并。在此过程中要反复用到查询某一个元素归属于那个集合的运算。适合于描述这类问题的抽象数据类型成为并查集（union-findset）

1. 核心概念

动态连通性管理：维护多个 不相交集合，支持快速 合并集合 和 查询元素所属集合
树形结构：每个集合用一棵树表示，父节点指针实现
路径压缩：缩短查找路径，优化查询时间
按秩合并：避免树的高度过高，保证平衡性

#mermaid-svg-l1dRLwk7Vw4x6Idh {font-family:\"trebuchet ms\",verdana,arial,sans-serif;font-size:16px;fill:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .error-icon{fill:#552222;}#mermaid-svg-l1dRLwk7Vw4x6Idh .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edge-thickness-normal{stroke-width:2px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-l1dRLwk7Vw4x6Idh .marker{fill:#333333;stroke:#333333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .marker.cross{stroke:#333333;}#mermaid-svg-l1dRLwk7Vw4x6Idh svg{font-family:\"trebuchet ms\",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .label{font-family:\"trebuchet ms\",verdana,arial,sans-serif;color:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .cluster-label text{fill:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .cluster-label span{color:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .label text,#mermaid-svg-l1dRLwk7Vw4x6Idh span{fill:#333;color:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .node rect,#mermaid-svg-l1dRLwk7Vw4x6Idh .node circle,#mermaid-svg-l1dRLwk7Vw4x6Idh .node ellipse,#mermaid-svg-l1dRLwk7Vw4x6Idh .node polygon,#mermaid-svg-l1dRLwk7Vw4x6Idh .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .node .label{text-align:center;}#mermaid-svg-l1dRLwk7Vw4x6Idh .node.clickable{cursor:pointer;}#mermaid-svg-l1dRLwk7Vw4x6Idh .arrowheadPath{fill:#333333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edgeLabel{background-color:#e8e8e8;text-align:center;}#mermaid-svg-l1dRLwk7Vw4x6Idh .edgeLabel rect{opacity:0.5;background-color:#e8e8e8;fill:#e8e8e8;}#mermaid-svg-l1dRLwk7Vw4x6Idh .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-l1dRLwk7Vw4x6Idh .cluster text{fill:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh .cluster span{color:#333;}#mermaid-svg-l1dRLwk7Vw4x6Idh div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:\"trebuchet ms\",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-l1dRLwk7Vw4x6Idh :root{--mermaid-font-family:\"trebuchet ms\",verdana,arial,sans-serif;} 原始并查集路径压缩按秩合并 O(log* n) O(α n) 接近常数

2. 时间复杂度

操作时间复杂度初始化 O(n) Find O(α(n)) (接近常数) Union O(α(n))

其中 α(n) 是阿克曼函数的反函数，增长极其缓慢

优化策略对比表

优化类型单次操作时间复杂度空间复杂度适用场景基本实现 O(log n) O(n) 小规模数据仅路径压缩 O(α(n)) O(n) 查询密集型操作仅按秩合并 O(log n) O(n) 合并操作较多的情况双优化 O(α(n)) O(n) 通用场景带权并查集 O(α(n)) O(n) 需要维护附加关系的情况

阿克曼函数解析

阿克曼函数 A(m, n) 是递归定义的：

A(m, n) = n+1  if m=0 A(m-1, 1)  if m>0 and n=0 A(m-1, A(m, n-1)) otherwise

其反函数 α(n) 是满足 A(α(n), α(n)) ≥ n 的最小整数。对于所有实际物理世界中的n值，α(n) ≤ 5。

二、应用场景

社交网络：判断两人是否属于同一朋友圈
图论：判断图的连通性、最小生成树(Kruskal算法)
棋盘问题：岛屿数量、矩阵连通块
动态连接：实时合并/查询数据集合

三、并查集技巧

1. 普通并查集基本模板

class UnionFind {private: vector<int> fa; // 父节点数组 vector<int> cnt; // 连通块内点的数量 vector<int> rank; // 秩（树的高度） int count = 0; // 新增：连通块数量 public: // 1. 初始化 UnionFind(int n) {  fa.resize(n + 1); rank.resize(n + 1, 1); cnt.resize(n + 1, 1); count = n; // 初始每个节点都是独立连通块，总共有n个 for(int i = 1; i <= n; ++i) fa[i] = i; } // 2. 查找根节点（路径压缩） int find(int x) { return fa[x] == x ? x : fa[x] = find(fa[x]); } // 3. 合并集合（按秩合并） void unite(int a, int b) { int x = find(a), y = find(b); if (x == y) return; // 合并时连通块数量减少1 count--; // 按秩合并 if (rank[x] < rank[y]) { fa[x] = y; cnt[y] += cnt[x]; } else { fa[y] = x; cnt[x] += cnt[y]; if (rank[x] == rank[y]) rank[x]++; } } // 4. 判断连通性 bool isConnected(int x, int y) { return find(x) == find(y); } // 5. 获取连通块大小 int getCnt(int x) { return cnt[find(x)]; } // 6. 获取连通块数量 int getCount() const { return count; }};

2. 带权并查集

2.1 基本概念

带权并查集是并查集的扩展版本，在维护连通性的基础上，额外记录节点之间的相对关系（如距离、比例、差值等）。通过路径压缩和按秩合并的优化，实现高效的关系维护与查询。

class WeightedUnionFind {private: vector<int> parent; // 父节点数组 vector<int> rank; // 秩（树的高度） vector<double> weight; // 权重数组：记录节点到父节点的关系权重public: WeightedUnionFind(int n) { parent.resize(n); iota(parent.begin(), parent.end(), 0); // 初始父节点为自身 rank.resize(n, 1); weight.resize(n, 1.0); // 初始权重为1.0 } // 查找操作（带路径压缩） pair<int, double> find(int x) { if (parent[x] != x) { auto [root, w] = find(parent[x]); // 递归查找根节点 parent[x] = root;  // 路径压缩 weight[x] *= w;  // 更新当前节点到根的权重 } return {parent[x], weight[x]}; } // 合并操作（带权重关系处理） void unite(int x, int y, double value) { auto [rootX, wX] = find(x); // x到根的权重wX auto [rootY, wY] = find(y); // y到根的权重wY if (rootX == rootY) return; // 合并策略（按秩合并） if (rank[rootX] < rank[rootY]) { parent[rootX] = rootY; weight[rootX] = value * wY / wX; // 计算新权重关系 } else { parent[rootY] = rootX; weight[rootY] = wX / (value * wY); // 权重关系反向 if (rank[rootX] == rank[rootY])  rank[rootX]++; } } // 查询x和y的关系值 double getValue(int x, int y) { auto [rootX, wX] = find(x); auto [rootY, wY] = find(y); if (rootX != rootY) return -1.0; // 不连通 return wX / wY; // 返回x/y的值 }};

复杂度与优化

操作时间复杂度空间复杂度构造函数 O(n) O(n) find O(α(n)) O(1) unite O(α(n)) O(1) getValue O(α(n)) O(1)

2.2 权重更新原理

数学建模：设 x 到父节点的权重为 w[x]，表示 x = w[x] * parent[x]
路径压缩：当查找路径被压缩时，直接计算 x 到根的累计权重
合并公式：合并 x 和 y 时，根据 x/y = value 推导新权重关系

2.3 合并操作示例

假设合并 x 和 y，已知 x = a * rootX, y = b * rootY，合并条件为 x = value * y，则

a*rootX = value*(b*rootY)=> rootX = (value*b/a) * rootY

工程优化技巧：

离散化处理：对非连续标识（如字符串）建立映射表
双缓冲机制：在频繁修改的场景下使用双数组交替更新
批量操作优化：预处理所有操作后统一处理路径压缩

3. 离散化并查集

3.1 基本概念

离散化映射：
- 收集所有元素并去重。
- 为每个唯一元素分配一个连续整数索引（如 0, 1, 2, ...）。
- 使用哈希表（字典）记录元素到索引的映射关系。
并查集操作：
- 基于离散化后的索引初始化父节点数组。
- 实现路径压缩（find）和按秩合并（union）

特点

动态映射：将离散的、大范围的数据（如大整数、字符串等）映射到连续的或较小的键空间中。
节省空间：避免为稀疏数据预分配大数组，使用哈希表动态存储。
灵活处理未知元素：动态添加新元素到并查集中，无需预先知道所有元素。

3.2 案例代码

#include #include #include #include #include #include class DisjointSet {private: std::unordered_map element_to_idx; // 元素到索引的映射 std::vector parent;  // 父节点数组 std::vector rank; // 秩数组（树的高度）public: // 构造函数：离散化元素并初始化并查集 DisjointSet(const std::vector& elements) { // 1. 去重并排序 std::set unique_sorted_elements(elements.begin(), elements.end()); // 2. 建立元素到索引的映射 int idx = 0; for (int elem : unique_sorted_elements) { element_to_idx[elem] = idx++; } // 3. 初始化 parent 和 rank parent.resize(unique_sorted_elements.size()); std::iota(parent.begin(), parent.end(), 0); // 初始父节点为自身 rank.resize(unique_sorted_elements.size(), 1); // 初始秩为1 } // 查找元素的根节点（路径压缩） int find(int x) { int x_idx = element_to_idx[x]; if (parent[x_idx] != x_idx) { parent[x_idx] = find_by_idx(parent[x_idx]); // 路径压缩 } return parent[x_idx]; }private: // 辅助函数：通过索引查找根节点（递归实现） int find_by_idx(int x_idx) { if (parent[x_idx] != x_idx) { parent[x_idx] = find_by_idx(parent[x_idx]); // 路径压缩 } return parent[x_idx]; }public: // 合并两个集合（按秩合并） void union_sets(int x, int y) { int x_root = find(x); int y_root = find(y); if (x_root == y_root) return; // 按秩合并：小树合并到大树 if (rank[x_root] < rank[y_root]) { parent[x_root] = y_root; } else { parent[y_root] = x_root; if (rank[x_root] == rank[y_root]) { rank[x_root]++; } } }};int main() { // 测试用例 std::vector elements = { 1, 3, 7, 100000, 3, 1 }; DisjointSet ds(elements); ds.union_sets(1, 3); ds.union_sets(7, 100000); std::cout << std::boolalpha << (ds.find(1) == ds.find(3)) << std::endl; // 输出 true std::cout << (ds.find(7) == ds.find(100000)) << std::endl; // 输出 true std::cout << (ds.find(1) == ds.find(7)) << std::endl; // 输出 false return 0;}

3.3 适用场景

大范围稀疏数据：
元素范围极大（如 1e18），但实际数量较少（如 1e5），无法直接用数组存下标。
非数值类型元素：
元素是字符串、对象等，需映射为整数索引。
动态元素处理（需扩展代码）：
若元素动态增加，可用字典动态分配索引，并扩展父节点数组。

注意事项

去重：离散化前必须去重，确保每个元素唯一对应一个索引。

一致性：所有操作需通过哈希表转为索引，避免直接操作原始元素。

动态扩展：若允许动态添加元素，需在每次遇到新元素时扩展数据结构

4. 可撤销并查集

4.1 基本概念

可撤销并查集（Rollback Disjoint Set Union，Rollback DSU） 是并查集的扩展版本，在支持 合并（Union） 和 查询（Find） 操作的基础上，新增了 撤销（Rollback） 功能。它允许用户回退到之前的某个状态，适用于需要支持 操作回滚 或 分支回溯 的场景。

4.2 核心特性

特性普通并查集可撤销并查集 合并操作 ✔️ ✔️ 路径压缩 ✔️ ❌（破坏可撤销性） 按秩合并 ✔️ ✔️（必需） 撤销操作 ❌ ✔️（核心功能） 时间复杂度 O(α(n)) O(log n) 每次操作

4.3 算法代码

1. 数据结构设计

父节点数组 parent[]：记录每个节点的父节点
秩数组 rank[]：记录树的高度（用于按秩合并）
操作栈 stack：存储合并操作的上下文信息

2. 案例代码

#include #include #include #include  // 添加元组支持using namespace std;class RollbackUnionFind {private: // 操作记录类型：子根 | 父根 | 原父秩 using Operation = tuple<int, int, int>; vector<int> fa; // 父节点数组 vector<int> rank; // 树的高度 stack<Operation> st; // 操作记录栈 int count; // 连通块数量public: // 初始化 RollbackUnionFind(int n) : count(n) { fa.resize(n + 1); rank.resize(n + 1, 1); for (int i = 1; i <= n; ++i) fa[i] = i; } // 查找根节点（无路径压缩） int find(int x) const { while (fa[x] != x) x = fa[x]; return x; } // 合并操作 bool unite(int x, int y) { x = find(x); y = find(y); if (x == y) { st.emplace(-1, -1, -1); // 无效操作标记 return false; } // 按秩合并 if (rank[x] < rank[y]) { st.emplace(x, y, rank[y]); fa[x] = y; } else { st.emplace(y, x, rank[x]); fa[y] = x; if (rank[x] == rank[y]) ++rank[x]; // 树高增加 } --count; // 连通块数量减少 return true; } // 撤销操作 void rollback() { if (st.empty()) return; Operation t = st.top(); int child = get<0>(t), parent_root = get<1>(t), old_rank = get<2>(t); st.pop(); if (child == -1) return; // 无效操作 // 恢复父子关系 fa[child] = child; // 恢复原秩值 if (rank[parent_root] != old_rank) { rank[parent_root] = old_rank; } ++count; // 连通块数量恢复 } // 获取当前连通块数量 int getCount() const { return count; } // 判断是否连通 bool isConnected(int x, int y) const { return find(x) == find(y); }};int main() { RollbackUnionFind uf(5); // 5个元素 uf.unite(1, 2); // 连通块：{1,2}, {3}, {4}, {5} (count=4) uf.unite(3, 4); // 连通块：{1,2}, {3,4}, {5} (count=3) uf.rollback(); // 撤销上一步 (count=4) uf.unite(2, 3); // 连通块：{1,2,3}, {4}, {5} (count=3) cout << uf.isConnected(1, 3) << endl; // 输出1（连通） uf.rollback(); // 撤销unite(2,3) cout << uf.isConnected(1, 3) << endl; // 输出0（不连通） cout << uf.getCount() << endl; // 输出4}

5. 其他进阶技巧

5.1 动态大小处理

当数据规模动态增长时，可以使用哈希表代替数组：

class DynamicUnionFind {private: unordered_map<int, int> parent; unordered_map<int, int> rank;public: void add(int x) { if (!parent.count(x)) { parent[x] = x; rank[x] = 1; } } // find 和 unite 方法与标准实现类似};

5.2 连通块分量统计优化

当我们不清楚点的数量的时候，但是要计算连通块数量，就可以用哈希的方法，如下：

class UnionFind {public: unordered_map fa; int count = 0; // 总共有多少不连通的并查集 int find(int x) { // 构建并查集时候，假如key值不在并查集中则构建哈希表映射，count++ if(fa.find(x) == fa.end()){ fa[x] = x; count++; } // 查找并查集的头节点并优化 if(fa[x] != x) fa[x] = find(fa[x]); return fa[x]; } // 合并集合 void unite(int a, int b) { int x = find(a), y = find(b); if (x == y) return; fa[x] = y; count--; }};

四、板子训练

1. 合并集合

一共有 𝑛 个数，编号是 1∼𝑛，最开始每个数各自在一个集合中。

现在要进行 𝑚 个操作，操作共有两种：

M a b，将编号为 a𝑎 和 b𝑏 的两个数所在的集合合并，如果两个数已经在同一个集合中，则忽略这个操作；
Q a b，询问编号为 a𝑎 和 b𝑏 的两个数是否在同一个集合中；

输入格式

第一行输入整数 n 和 m。

接下来 m 行，每行包含一个操作指令，指令为 M a b 或 Q a b 中的一种。

输出格式

对于每个询问指令 Q a b，都要输出一个结果，如果 a𝑎 和 b𝑏 在同一集合内，则输出 Yes，否则输出 No。

每个结果占一行。

数据范围

$1≤𝑛,𝑚≤10^5$

输入样例

4 5M 1 2M 3 4Q 1 2Q 1 3Q 3 4

输出样例

YesNoYes

代码如下：

#include using namespace std;const int N = 1e5 + 10;int p[N];int find(int x){ return p[x] == x ? x : p[x] = find(p[x]);}int main(){ int n, m; cin >> n >> m; for(int i = 1; i <= n; i++) p[i] = i; char op; int a, b; while(m--){ cin >> op >> a >> b; if(op == \'M\'){ p[find(a)] = find(b); // 集合合并操作 } else{ if(find(a) == find(b)) cout << \"Yes\" << endl; else cout << \"No\" << endl; } } return 0;}

2. 连通块中点的数量

给定一个包含 𝑛 个点（编号为 1∼𝑛）的无向图，初始时图中没有边。

现在要进行 𝑚 个操作，操作共有三种：

C a b，在点 𝑎 和点 𝑏 之间连一条边，a 和 𝑏 可能相等；
Q1 a b，询问点 𝑎 和点 𝑏 是否在同一个连通块中，𝑎 和 𝑏 可能相等；
Q2 a，询问点 𝑎 所在连通块中点的数量；

输入格式

第一行输入整数 𝑛 和 𝑚。

接下来 𝑚 行，每行包含一个操作指令，指令为 C a b，Q1 a b 或 Q2 a 中的一种。

输出格式

对于每个询问指令 Q1 a b，如果 𝑎 和 𝑏 在同一个连通块中，则输出 Yes，否则输出 No。

对于每个询问指令 Q2 a，输出一个整数表示点 a𝑎 所在连通块中点的数量

每个结果占一行。

数据范围

$1≤𝑛,𝑚≤10^5$

输入样例：

5 5C 1 2Q1 1 2Q2 1C 2 5Q2 5

输出样例：

Yes23

代码如下：

#include #include using namespace std;const int N = 1e5 + 10;int fa[N], cnt[N];int n, m;// 1. 初始化void init(){ for(int i = 1; i <= n; i++){ fa[i] = i; cnt[i] = 1; }}// 2. 查询根节点int find(int x){ return fa[x] == x ? x : fa[x] = find(fa[x]);}// 3. 合并连通块void merge(int a, int b){ int x = find(a), y = find(b); fa[x] = y; cnt[y] += cnt[x]; // y 连通块 + x 连通块内点数量}// 4. 判断连通bool isconnected(int a, int b){ return find(a) == find(b);}int main(){ cin >> n >> m; init(); string op; int a, b; while(m--) { cin >> op; if(op == \"C\") { cin >> a >> b; // 更新连通块数量之前一定需要先判断两者是否在一起 if(!isconnected(a, b))merge(a, b); } else if(op == \"Q1\"){ cin >> a >> b; if(isconnected(a, b)) cout << \"Yes\" << endl; else cout << \"No\" << endl; } else{ cin >> a; cout << cnt[find(a)] << endl; } }}

3. 食物链

动物王国中有三类动物 𝐴,𝐵,𝐶，这三类动物的食物链构成了有趣的环形。

𝐴 吃 𝐵，𝐵 吃 𝐶，C𝐶 吃 𝐴。

现有 N𝑁 个动物，以 1∼N1∼𝑁 编号。

每个动物都是 𝐴,𝐵,𝐶 中的一种，但是我们并不知道它到底是哪一种。

有人用两种说法对这 𝑁 个动物所构成的食物链关系进行描述：

第一种说法是 1 X Y，表示 𝑋 和 𝑌 是同类。

第二种说法是 2 X Y，表示 𝑋 吃 𝑌。

此人对 𝑁 个动物，用上述两种说法，一句接一句地说出 𝐾 句话，这 𝐾 句话有的是真的，有的是假的。

当一句话满足下列三条之一时，这句话就是假话，否则就是真话。

当前的话与前面的某些真的话冲突，就是假话；
当前的话中 𝑋 或 𝑌 比 𝑁 大，就是假话；
当前的话表示 𝑋 吃 𝑋，就是假话。

你的任务是根据给定的 𝑁 和 𝐾 句话，输出假话的总数。

输入格式

第一行是两个整数 𝑁 和 𝐾，以一个空格分隔。

以下 𝐾 行每行是三个正整数 𝐷，𝑋，𝑌，两数之间用一个空格隔开，其中 𝐷 表示说法的种类。

若 𝐷=1，则表示 𝑋 和 𝑌 是同类。

若 𝐷=2，则表示 𝑋 吃 𝑌。

输出格式

只有一个整数，表示假话的数目。

数据范围

1≤𝑁≤50000,
0≤𝐾≤100000

输入样例：

100 71 101 1 2 1 22 2 3 2 3 3 1 1 3 2 3 1 1 5 5

输出样例：

方法一：并查集(拓展域)

这道题目我们主要是要开三个拓展的域：同类域（A类），捕食域（B 类），以及天敌域（C 类）

如果 x，y 是同类，但是 x 的捕食域有 y，那么错误
如果 x，y 是同类，但是 x 的天敌域有 y，那么错误
如果 x 是 y 的天敌，但是 x 的同类域中有 y，那么错误
如果 x 是 y 的天敌，但是 x 的天敌域中有 y，那么错误

首先，在带扩展域的并查集中 x 不再只是一个值，而是一个事件；

规定 x 为 “事件 x 为 A 类动物”；

规定 x + N 为 “事件 x 为 B 类动物”；

规定 x + N * 2 为 “事件 x 为 C 类动物”；

p[find(X)] = find(Y) 表示：事件 X 为 A 类动物和事件 Y 为 A 类动物同时发生

X 与 Y 为同种动物等价于

p[ find(X) ] = find(Y);

p[ find(X + N)] = find(Y + N);

p[ find(X + N * 2)] = find(Y + N * 2);

p[find(X)] = find(Y + N) 表示：事件 X 为 A 类动物和事件 Y 为 B 类动物同时发生

X 吃 Y 等价于

p[ find(X) ] = find(Y + N);

p[ find(X + N)] = find(Y + N * 2);

p[ find(X + N * 2)] = find(Y);

#include using namespace std;int fa[200000];int n, k;void init(){ for(int i = 1;i <= 3*n; ++i) fa[i]=i;}int find(int x){ return x == fa[x] ? x : fa[x] = find(fa[x]);}void merge(int a, int b){ int x = find(a), y = find(b); fa[x] = y; }int main(){ cin >> n >> k; init(); int ans = 0; while(k--) { int D, x, y; cin >> D >> x >> y; if(x > n || y > n) ans++; else if(D == 1) { // 如果x,y是同类,但是x是y的捕食中的动物,或者x是y天敌中的动物,那么错误. if(find(x) == find(y + n) || find(x) == find(y + n + n)) ans++; else{ merge(x, y); // 同类域 merge(x + n, y + n); // 捕食域 merge(x + n + n, y + n + n); // 天敌域 } } else // x 吃 y { // x就是y,或者他们是同类,再或者是y的同类中有x if(x == y || find(x) == find(y) || find(x) == find(y+n)) ans++; // 都是假话 else{ merge(x, y + n + n); // x 同类域 加入 y 天敌域 merge(x + n, y); // x 捕食域 加入 y 同类域 merge(x + n + n, y + n); // x 天敌域 加入 y 捕食域 } } } cout << ans << endl; return 0;}

**方法二：**带权并查集

#include using namespace std;const int N = 500010;int p[N], d[N]; //size[]表示的是每一个集合点的数量int n, k;int find(int x){ if (p[x] != x){ int t = find(p[x]); //开始递归；最终用t储存根节点 d[x] += d[p[x]]; //在递归过程中求得每个点到根节点的距离 p[x] = t; //将每个点的父节点指向根节点 } return p[x];  //返回父节点}int main(){cin >> n >> k;for (int i = 1; i <= n; i++) p[i] = i;int res = 0;while (k--){int D, x, y;cin >> D >> x >> y;int px = find(x), py = find(y); // px,py分别储存根节点if (x > n || y > n) res++;else if(D == 1){ if (px == py && (d[x] - d[y]) % 3) res++; //d[x]-d[y]!=0说明两个不是同类else if (px != py){p[px] = py;d[px] = d[y] - d[x]; //(d[x]+d[px]-d[y])%3==0}}else{ if (px == py && (d[x] - d[y] - 1) % 3) res++; // x,y祖宗相同，并且x y 属于同类else if (px != py){p[px] = py;d[px] = d[y] + 1 - d[x]; //(d[x]+d[px]-d[y]-1)%3==0}}}cout << res << endl;return 0;}

五、经典例题

提一句：下面的函数都是上面板子写过的，下面代码中就没有特别演示出来了

例题1: 省份数量 (LeetCode 547)

题目链接：547. 省份数量（LeetCode）

题目：
有 n 个城市，其中一些彼此相连。省份是一组直接或间接相连的城市。给定邻接矩阵 isConnected，求省份数量

思路：使用普通并查集，获取连通块数量

代码实现：

int findCircleNum(vector<vector<int>>& isConnected) { int n = isConnected.size(); UnionFind uf(n); for (int i = 0; i < n; i++) //由于该矩阵对称，只考虑右上角 { for (int j = i + 1; j < n; j++) { if (isConnected[i][j] == 1) uf.unite(i, j); } } // 下面两种获取连通块数量的方法都行 int ans = 0; for (int i = 0; i < n; i++){ if (uf.find(i) == i) ans++; } // return ans; return uf.getCount();}

例题2: 连通网络的操作次数 (LeetCode 1319)

题目链接：1319. 连通网络的操作次数 - 力扣（LeetCode）

题目：
有 n 台计算机，给定连接线 connections。要使网络连通至少需要移动多少根线？

思路：使用普通并查集，然后获取连通块数量即可

代码实现：

int makeConnected(int n, vector<vector<int>>& connections) { if(connections.size() < n - 1) return -1; UnionFind uf(n); int extra = 0; for(auto& conn: connections){ uf.unite(conn[0], conn[1]); } return uf.getCount() - 1;}

案例3: 冗余连接 (LeetCode 684)

题目链接：684. 冗余连接（LeetCode）

问题：找出使树变成图的多余边

思路：通过标准并查集检测环，无需维护大小

代码实现：

vector<int> findRedundantConnection(vector<vector<int>>& edges) { int n = edges.size(); UnionFind uf(n+1); // 节点编号从1开始 for(auto& e : edges) { if(uf.isConnected(e[0], e[1])) return {e[0], e[1]}; uf.unite(e[0], e[1]); } return {};}

案例4: 岛屿数量 (LeetCode 200)

题目链接：200. 岛屿数量 - 力扣（LeetCode）

问题：矩阵连通性检测

思路：使用普通并查集建立连接，然后只连接岛屿（连接右边和下边相邻的），然后再返回岛屿的连通块数量

代码实现：

int numIslands(vector<vector<char>>& grid) { if(grid.empty()) return 0; int n = grid.size(), m = grid[0].size(); UnionFind uf(n * m); int water = 0; for(int i = 0; i < n; i++){ for(int j = 0; j < m; j++){ if(grid[i][j] == \'0\') { water++; // 统计水域 continue; } int ind = i * m + j + 1; // 当前单元格的索引 // 1. 合并右边相邻的陆地 if (j + 1 < m && grid[i][j + 1] == \'1\') uf.unite(ind, ind + 1);  // 2. 合并下边相邻的陆地 if (i + 1 < n && grid[i + 1][j] == \'1\') uf.unite(ind, ind + m); } } // 求岛屿数量的方法一 int ans = 0; //计算连通岛屿中根节点的数目 for (int i = 0; i < n; i++){ for (int j = 0; j < m; j++){ int ind = i * m + j + 1; if (grid[i][j] == \'1\' && uf.find(ind) == ind) ans++; } } // 求岛屿数量的方法二 // uf.getCount() 返回的是 所有节点的连通块总数 （包括陆地和水域） // 因此需要从总连通块数中 减去水域的数量  return uf.getCount() - water;}

案例5: 等式方程的可满足性 (LeetCode 990)

题目链接：990. 等式方程的可满足性 - 力扣（LeetCode）

思路：既可以用普通并查集来做，也可以用带权并查集
其实这里也是通过 -‘a’ 把字符离散映射成为了数字

代码实现：

bool equationsPossible(vector<string>& equations) { UnionFind uf(26); // 1. 把所有等式放入连通集里 for(auto &eq: equations){ int x = eq[0] - \'a\', y = eq[3] - \'a\'; if(eq[1] == \'=\'){ uf.unite(x, y); } } // 2. 判断不等式 for(auto &eq: equations){ int x = eq[0] - \'a\', y = eq[3] - \'a\'; if(eq[1] == \'!\'){ if(uf.isConnected(x, y)) return false; } } return true;}// 方法二: 带权并查集来写, 设置默认权重为 1bool equationsPossible(vector<string>& equations) { WeightedUnionFind uf(26); // 26个小写字母 // 处理所有等式 for (auto& eq : equations) { if (eq[1] == \'=\') { int x = eq[0]-\'a\', y = eq[3]-\'a\'; uf.unite(x, y, 1.0); // x/y=1.0 } } // 验证不等式 for (auto& eq : equations) { if (eq[1] == \'!\') { int x = eq[0]-\'a\', y = eq[3]-\'a\'; if (uf.getValue(x, y) == 1) // 检查是否相等 return false; } } return true;}

案例6: 除法求值 (LeetCode 399)

题目链接：399. 除法求值 - 力扣（LeetCode）

思路：带权 + 离散化并查集，先通过离散化然后建立连接，再把[x, y] 与权值 w 建立连接

代码实现：

vector<double> calcEquation(vector<vector<string>>& equations, vector<double>& values, vector<vector<string>>& queries) { unordered_map<string, int> str2id; int id = 0; // 字符串映射为数字ID for (auto& eq : equations) { if (!str2id.count(eq[0])) str2id[eq[0]] = id++; if (!str2id.count(eq[1])) str2id[eq[1]] = id++; } WeightedUnionFind uf(id); // 建立关系 for (int i = 0; i < equations.size(); ++i) { int x = str2id[equations[i][0]]; int y = str2id[equations[i][1]]; uf.unite(x, y, values[i]); // x = values[i] * y } // 处理查询 vector<double> res; for (auto& q : queries) { if (!str2id.count(q[0]) || !str2id.count(q[1])) { res.push_back(-1.0); continue; } int x = str2id[q[0]], y = str2id[q[1]]; res.push_back(uf.getValue(x, y)); } return res;}

案例7: 冗余连接 II (LeetCode 685)

题目链接：685. 冗余连接 II - 力扣（LeetCode）

注意：该题和之前冗余连接不同的是，每个点有且只有一个父节点，因此不仅需要考虑环，还需要考虑入度为 2 的问题

问题特点

有向图存在冗余边
需要处理 入度=2 或环两种场景
带权并查集需要记录 父节点关系

代码实现：

class DirectedUnionFind {private: vector<int> parent; // 父节点（带路径压缩） vector<int> edgeTo; // 显式记录每个节点的直接父节点 public: DirectedUnionFind(int n) { parent.resize(n+1); edgeTo.resize(n+1, -1); iota(parent.begin(), parent.end(), 0); } int find(int x) { // 路径压缩查找根节点 return x == parent[x] ? x : parent[x] = find(parent[x]); } int unite(int u, int v) { // u是v的父节点 if (edgeTo[v] != -1) // 子节点v已经有父节点 return v; int rootU = find(u); if (rootU == find(v)) // 形成环（u和v已连通） return v; edgeTo[v] = u; // v的直接父节点是u parent[v] = rootU; // 合并到u的根 return -1; }};class Solution {public: vector<int> findRedundantDirectedConnection(vector<vector<int>>& edges) { int n = edges.size(); // 记录入度为 2 的节点的两条侯选边 vector<int> candidate1, candidate2; unordered_map<int, int> inDegree; // 检测入度为2的节点 for (auto& e : edges) { if (++inDegree[e[1]] == 2) { // 逆序遍历找到所有指向该节点的边 vector<int> candidates; // 临时存储候选边的索引 for (int i = n-1; i >= 0; --i) {  if (edges[i][1] == e[1]) { candidates.push_back(i); if (candidates.size() == 2) break; // 只需要最后两个  } } // 根据索引顺序赋值候选边 candidate2 = edges[candidates[0]]; // 最后出现的边（索引较大） candidate1 = edges[candidates[1]]; // 倒数第二个出现的边 break; } }  DirectedUnionFind uf(n); // 处理入度=2的情况 if (!candidate1.empty()) { // 先尝试删除最后出现的候选边candidate2 bool hasConflict = false; for (auto& e : edges) { if (e == candidate2) continue; if (uf.unite(e[0], e[1]) != -1) {  hasConflict = true;  break; } } if (!hasConflict) return candidate2; // 删除candidate2后合法 // 再验证candidate1 uf = DirectedUnionFind(n); for (auto& e : edges) { if (e == candidate1) continue; if (uf.unite(e[0], e[1]) != -1)  return candidate2; // 必须删除candidate2 } return candidate1; // 可安全删除candidate1 } // 处理环的情况 for (auto& e : edges) { if (uf.unite(e[0], e[1]) != -1) return e; } return {}; }};

案例8：账户合并 (LeetCode 721)

题目链接：721. 账户合并 - 力扣（LeetCode）

问题特点

用哈希表将邮箱字符串映射为整数
需要合并具有相同邮箱的账户
带权并查集需要维护 邮箱到账户的映射关系

代码实现：

class AccountUnionFind {public: vector<int> parent; vector<string> owner; // 每个邮箱的账户名 unordered_map<string, int> email2id; public: AccountUnionFind(const vector<vector<string>>& accounts) { int id = 0; // 邮箱离散化并建立映射 for (auto& acc : accounts) { for (int i = 1; i < acc.size(); ++i) { string email = acc[i]; if (!email2id.count(email)) {  email2id[email] = id++;  owner.push_back(acc[0]); // 保存账户名 } } } parent.resize(id); iota(parent.begin(), parent.end(), 0); // 初始化父节点 } int find(int x) { return x == parent[x] ? x : parent[x] = find(parent[x]); } void unite(int x, int y) { int rootX = find(x), rootY = find(y); if (rootX != rootY) parent[rootY] = rootX; } vector<vector<string>> getMerged() { unordered_map<int, set<string>> groups; // 根据根节点分组收集邮箱 for (auto& [email, id] : email2id) { int root = find(id); groups[root].insert(email); } // 生成结果 vector<vector<string>> res; for (auto& [root, emails] : groups) { vector<string> acc{owner[root]}; acc.insert(acc.end(), emails.begin(), emails.end()); res.push_back(acc); } return res; }};class Solution {public: vector<vector<string>> accountsMerge(vector<vector<string>>& accounts) { AccountUnionFind uf(accounts); // 合并同一账户的邮箱 for (auto& acc : accounts) { if (acc.size() < 2) continue; string firstEmail = acc[1]; int firstId = uf.email2id.at(firstEmail); // 使用成员变量 for (int i = 2; i < acc.size(); ++i) { string email = acc[i]; int id = uf.email2id.at(email); uf.unite(firstId, id); } } return uf.getMerged(); }};

案例9：最长连续序列(LeetCode 128)

题目链接：LeetCode 128. 最长连续序列

需求：在未排序数组中，找到数字组成的最长连续序列长度（统计连通块大小）

思路：将相邻数字合并，通过维护连通块大小统计最长序列（离散化并查集）

class Solution {public: vector<int> parent, cnt; int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); } void merge(int x, int y) { int fx = find(x), fy = find(y); if (fx != fy) { parent[fx] = fy; cnt[fy] += cnt[fx]; // 维护连通块大小 } } int longestConsecutive(vector<int>& nums) { if (nums.empty()) return 0; unordered_map<int, int> num_map; int idx = 0; // 离散化去重 for (int num : nums) { if (!num_map.count(num)) { num_map[num] = idx++; parent.push_back(idx-1); cnt.push_back(1); } } // 合并相邻数字 for (auto& [num, id] : num_map) { if (num_map.count(num+1))  merge(id, num_map[num+1]); if (num_map.count(num-1))  merge(id, num_map[num-1]); } return *max_element(cnt.begin(), cnt.end()); }};

案例10：移除最多的同行或同列石头(LeetCode 947)

题目链接：947. 移除最多的同行或同列石头 - 力扣（LeetCode）

场景：石头的行和列可能很大（如 1e9），需将行和列视为节点合并。
离散化技巧：将行 r 和列 c 映射为两个唯一键（如 r 和 ~c 避免冲突）

代码如下：

class UnionFind {public: unordered_map<int, int> fa; int count = 0; // 总共有多少不连通的并查集 int find(int x) { // 构建并查集时候，假如key值不在并查集中则构建哈希表映射，count++ if(fa.find(x) == fa.end()){ fa[x] = x; count++; } // 查找并查集的头节点并优化 if(fa[x] != x) fa[x] = find(fa[x]); return fa[x]; } // 合并集合 void unite(int a, int b) { int x = find(a), y = find(b); if (x == y) return; fa[x] = y; count--; }};class Solution {public: int removeStones(vector<vector<int>>& stones) { UnionFind uf; for(auto &stone: stones){ uf.unite(stone[0] + 10001, stone[1]); } return stones.size() - uf.count; }};

案例11：按公因数计算最大组件大小(LeetCode 952)

题目链接：952. 按公因数计算最大组件大小 - 力扣（LeetCode）

场景：数字的因数可能很大，需合并有公因数的数字。
离散化技巧：将因数作为节点，动态合并，用之前的并查集模板就可以写了

class Solution {public: int largestComponentSize(vector<int>& nums) { int m = *max_element(nums.begin(), nums.end()); UnionFind uf(m + 1); for(int num: nums){ for(int i = 2; i * i <= num; i++){ if(num % i == 0){  // [num, 公因数]  uf.unite(num, i);  uf.unite(num, num / i);  } } } // 根节点映射 vector<int> cnt(m + 1); int ans = 0; for(int num: nums){ int root = uf.find(num); cnt[root]++; ans = max(ans, cnt[root]); } return ans; }};

案例12：由斜杠划分区域(LeetCode 959)

题目链接：959. 由斜杠划分区域 - 力扣（LeetCode）

需求：在分支决策算法中尝试不同路径后回退
实现：在决策分支点保存状态，回溯时撤销操作

class UnionFind {private: vector<int> fa; // 父节点数组 vector<int> rank; // 秩（树的高度） int count = 0; // 新增：连通块数量 public: // 1. 初始化 UnionFind(int n) {  fa.resize(n + 1); rank.resize(n + 1, 1); count = n; // 初始每个节点都是独立连通块，总共有n个 for(int i = 1; i <= n; ++i) fa[i] = i; } // 2. 查找根节点（路径压缩） int find(int x) { return fa[x] == x ? x : fa[x] = find(fa[x]); } // 3. 合并集合（按秩合并） bool unite(int a, int b) { int x = find(a), y = find(b); if (x == y) return false; // 合并时连通块数量减少1 count--; // 按秩合并 if (rank[x] < rank[y]) { fa[x] = y; } else { fa[y] = x; if (rank[x] == rank[y]) rank[x]++; } return true; }};class Solution {public: int regionsBySlashes(vector<string>& grid) { int n = grid.size(); //n * n的网格 //以row = 0...n，col = 0...n 将所有点存入并查集，并将正方形网格轮廓的各个点连接在一起，表示初始时有一个正方形的环 //再遍历所有的 / & \\\\ ，根据其位置连接点。连接过程中如果 两个点的根节点相同，说明新形成了一个环 //(n + 1) (n + 1) 个点 UnionFind uf((n + 1) * (n + 1)); int res = 1; // 初始时有一个正方形 for(int i = 0; i < n; i++){ uf.unite(i, i + 1); // 最顶行 uf.unite(n * (n + 1) + i, n * (n + 1) + i + 1); // 最底行 uf.unite((n + 1) * i, (n + 1) * (i + 1)); // 最左列 uf.unite((n + 1) * (i + 1) - 1, (n + 1) * (i + 2) - 1); // 最右列 } // 遍历 / 和 \'\\\\\' char c = \' \'; int pos = 0; for(int i = 0; i < n; i++){ for(int j = 0; j < n; j++){ c = grid[i][j]; pos = i * (n + 1) + j; // 找到左上角的下标 // 连接右上角和左下角 if(c == \'/\'){  if(!uf.unite(pos + 1, pos + n + 1))res++; } else if(c == \'\\\\\'){  if(!uf.unite(pos, pos + n + 2))res++; } } } return res; }};

六、工程实践要点

初始化优化：
- 对于连续整数型节点，优先使用数组存储
- 对于非连续或动态节点，使用哈希表实现

内存管理：

// 使用内存池优化动态分配class MemoryPoolUnionFind {private: vector<int> parent; vector<int> rank; stack<int> recycled; // 回收的节点索引public: int newNode() { if(!recycled.empty()) { int idx = recycled.top(); recycled.pop(); return idx; } parent.push_back(parent.size()); rank.push_back(1); return parent.size()-1; } void deleteNode(int x) { parent[x] = x; // 重置状态 rank[x] = 1; recycled.push(x); }};

并行化处理：
- 使用分片锁机制实现线程安全
- 为每个连通分量分配独立锁
- 采用读写锁优化查询密集型场景

七、扩展思考

1. 逆向并查集

处理动态删除操作的特殊场景：

预先记录所有操作
从最终状态倒序处理，将删除转换为添加
典型应用：离线查询处理

2. 概率化并查集

引入随机化提升性能：

int find(int x) { if(parent[x] != x) { // 以50%概率进行路径压缩 if(rand() % 2 == 0) parent[x] = parent[parent[x]]; return find(parent[x]); } return x;}

3. 机器学习应用

图像分割中的区域合并
社交网络中的社区发现
三维建模中的面片合并

通过以上内容的扩展，可以全面掌握并查集的核心原理和进阶应用技巧。建议通过LeetCode相关专题进行实践训练，加深对各类变种问题的理解。
《并查集的黑科技：路径压缩×按秩合并×带权扩展｜算法核心原理与工程级实践指南》

《并查集的黑科技：路径压缩×按秩合并×带权扩展｜算法核心原理与工程级实践指南》

🔥 目录

一、概念与特点

1. 核心概念

2. 时间复杂度

二、应用场景

三、并查集技巧

1. 普通并查集基本模板

2. 带权并查集

2.1 基本概念

2.2 权重更新原理

2.3 合并操作示例

3. 离散化并查集

3.1 基本概念

3.2 案例代码

3.3 适用场景

4. 可撤销并查集

4.1 基本概念

4.2 核心特性

4.3 算法代码

5. 其他进阶技巧

5.1 动态大小处理

5.2 连通块分量统计优化

四、板子训练

1. 合并集合

2. 连通块中点的数量

3. 食物链

五、经典例题

例题1: 省份数量 (LeetCode 547)

例题2: 连通网络的操作次数 (LeetCode 1319)

案例3: 冗余连接 (LeetCode 684)

案例4: 岛屿数量 (LeetCode 200)

案例5: 等式方程的可满足性 (LeetCode 990)

案例6: 除法求值 (LeetCode 399)

案例7: 冗余连接 II (LeetCode 685)

案例8：账户合并 (LeetCode 721)

案例9：最长连续序列(LeetCode 128)

案例10：移除最多的同行或同列石头(LeetCode 947)

案例11：按公因数计算最大组件大小(LeetCode 952)

案例12：由斜杠划分区域(LeetCode 959)

六、工程实践要点

七、扩展思考

1. 逆向并查集

2. 概率化并查集

3. 机器学习应用

相关问题

公告

DeepSeek全套部署资料免费下载

免费可商用字体批量下载

标签