可持久化线段树

可持久化线段树（又称函数式线段树）是一种可持久化数据结构（英語：Persistent data structure）。这种数据结构在普通线段树的基础之上支持查询某个历史版本，同时时间复杂度与线段树是同级，空间复杂度相较而言更高。^[1]^[2]在中国信息学奥林匹克竞赛中，由于引入者黄嘉泰姓名的缩写与前中共中央总书记、国家主席胡锦涛(H.J.T.)相同，因此这种数据结构也可被称为总书记树或主席树。^{[來源請求]}

原理[编辑]

与大部分可持久化数据结构相似，可持久化线段树在更新时尽可能与之前某一个旧版本共用一部分结点，从而节省空间。

例如，有一颗维护着区间和的线段树，现在以这颗线段树作为基础，将下标为 $5$ 的元素数值减去一，同时另存为一个新的版本。按照一般线段树中使用的思路，当位于根结点时我们应该递归操作它的右子树，如果是暴力实现的可持久化线段树，那么则需要再次递归将一整颗左子树复制一遍然后保存指针，但是复制的这一整颗子树是完全一模一样的，因此可以先只复制一个根结点，然后将它的左子树直接指向原先版本根结点的左子树，代表上一个版本和这一个新版本这颗子树保存的信息是完全一样的，然后再按照相似的方法，递归地处理下标 $5$ 存在的右子树。

性能分析[编辑]

静态可持久化线段树在每次更新版本时，总是最大程度上减少结点的复制，这样不仅减少了时间的开销，更避免了不必要的空间浪费。与线段树相同，可持久化线段树由于在每一次更新版本时没有访问到不必要的结点，所以每一次查询和修改（即建立一个新的版本）时间复杂度为 $O(\log N)$ , 在这个过程中，会同时建立 $O(\log N)$ 个新的结点。如果总操作数量为 $m$ , 那么可持久化线段树可以以 $O(N+M\log N)$ 的时空复杂度解决问题。

实践[编辑]

静态区间第k大数值[编辑]

这类问题需要求解在一个长度为 $n$ 的数列中，第 $i$ 个数为 $a_{i}$ . 现在给定一些形如 $(l,r,k)$ 的三元组作为询问，设计程序计算数列第 $l~r$ 这些元素中出现次数排在第 $k$ 位的是多少。

利用可持久化线段树，维护区间 $(l,r)$ 代表区间 $[l,r]$ 中的元素出现了多少次，以此作为原始版本 $S_{0}$ ，此后每次建立一个新版本 $S_{i}$ ，代表去掉原数列中 $a_{0}~a_{i-1}$ 的元素之后建立的线段树，维护目标与上述相同。具体过程可以每次将 $a_{i}$ 的出现次数减一，并保存此时生成的新版本。

参考程序[编辑]

C++[编辑]

#include<bits/stdc++.h>

constexpr auto MAXN = (int)2e5 + 500;

std::map<int, int> val, mp;

struct Node {
	int fr, to, sum;
	Node *lft, *rgt;
	Node& Copy(Node *targ) {
		fr = targ->fr; to = targ->to; sum = targ->sum;
		lft = targ->lft; rgt = targ->rgt;
		return *this;
	}
};
Node* NewNode() {
	Node* ret = (Node*)malloc(sizeof(Node));
	ret->lft = ret->rgt = nullptr;
	return ret;
}
std::vector<Node*> version;

int num[MAXN], cnt[MAXN], orig[MAXN], fr[MAXN], to[MAXN], rank[MAXN];
std::queue<Node*> que, add;

signed main(void)
{
	int totNums, totQuery, cnt;

	//Read
	scanf("%d%d", &totNums, &totQuery);
	for (int i = 0; i < totNums; i++)scanf("%d", num + i);
	for (int i = 0; i < totQuery; i++)scanf("%d%d%d", fr + i, to + i, rank + i);

	for (int i = 0; i < totNums; i++) orig[i] = num[i];
	std::sort(num, num + totNums);
	cnt = 0; int tmp;
	mp[val[0] = *num] = cnt++;
	for (int i = 1; i < totNums; i++)
		if (num[i] != num[i - 1])
			tmp = cnt,mp[val[tmp] = num[i]] = cnt++;	
	for (int i = 0; i < totNums; i++)::cnt[num[i] = mp[orig[i]]]++;

	//Build
	Node *a, *b, *t;
	for (int i = 0; i < cnt; i++) {
		t = NewNode(); t->fr = t->to = i;
		t->sum = ::cnt[i];
		que.push(t);
	}
	for (; que.size() >= 2; std::swap(que, add)) {
		while (que.size() >= 2) {
			a = que.front(); que.pop(); b = que.front(); que.pop();
			t = NewNode(); t->fr = a->fr; t->to = b->to; t->sum = a->sum + b->sum;
			t->lft = a; t->rgt = b;
			add.push(t);
		}
		if (!que.empty()) { add.push(que.front()); que.pop(); }
	}version.push_back(que.front());
	//New Versions
	for (int del = 0; del < totNums; del++) {
		const int &target = num[del];
		t = a = NewNode(); b = version.back();
		while (true) {
			a->Copy(b); a->sum--;
			if (a->fr == target && a->to == target)break;
			if (a->lft->fr <= target && target <= a->lft->to) {
				a->lft = NewNode(); a = a->lft; b = b->lft;
			} else {
				a->rgt = NewNode(); a = a->rgt; b = b->rgt;
			}
		}version.push_back(t);
	}

	//Query
	int rnk;
	for (int i = 0; i < totQuery; i++)
	{
		a = version[--fr[i]]; b = version[to[i]]; rnk = rank[i];
		while (true) {
			if (a->lft == nullptr) { printf("%d\n", val[a->fr]); break; }

			if (a->lft->sum - b->lft->sum >= rnk) {
				a = a->lft; b = b->lft;
			}
			else {
				rnk -= a->lft->sum - b->lft->sum;
				a = a->rgt; b = b->rgt;
			}
		}
	}
	
	//system("pause");
	return 0;
}

参见[编辑]

参考文献[编辑]

^ 李煜东. 算法竞赛进阶指南. 中原出版传媒集团·河南电子音像出版社. 2018-1: P255–257. ISBN 978-7-83009-313-6. 请检查|date=中的日期值 (帮助)
^ Antti Laaksonen. Guide to Competitive Programming: Learning and Improving Algorithms Through Contests 1st edition. Springer. 2017: P250–251. ISBN 978-3-319-72546-8 （英语）.

[lyd-1] 李煜东. 算法竞赛进阶指南. 中原出版传媒集团·河南电子音像出版社. 2018-1: P255–257. ISBN 978-7-83009-313-6. 请检查|date=中的日期值 (帮助)

[antti-2] Antti Laaksonen. Guide to Competitive Programming: Learning and Improving Algorithms Through Contests 1st edition. Springer. 2017: P250–251. ISBN 978-3-319-72546-8 （英语）.

[1]

[2]