可持久化線段樹

可持久化線段樹（又稱函數式線段樹）是一種可持久化數據結構（英語：Persistent data structure）。這種數據結構在普通線段樹的基礎之上支持查詢某個歷史版本，同時時間複雜度與線段樹是同級，空間複雜度相較而言更高。^[1]^[2]在中國信息學奧林匹克競賽中，由於引入者黃嘉泰姓名的縮寫與前中共中央總書記、國家主席胡錦濤(H.J.T.)相同，因此這種數據結構也可被稱為總書記樹或主席樹。^{[來源請求]}

原理[編輯]

與大部分可持久化數據結構相似，可持久化線段樹在更新時儘可能與之前某一個舊版本共用一部分結點，從而節省空間。

例如，有一顆維護着區間和的線段樹，現在以這顆線段樹作為基礎，將下標為 $5$ 的元素數值減去一，同時另存為一個新的版本。按照一般線段樹中使用的思路，當位於根結點時我們應該遞歸操作它的右子樹，如果是暴力實現的可持久化線段樹，那麼則需要再次遞歸將一整顆左子樹複製一遍然後保存指針，但是複製的這一整顆子樹是完全一模一樣的，因此可以先只複製一個根結點，然後將它的左子樹直接指向原先版本根結點的左子樹，代表上一個版本和這一個新版本這顆子樹保存的信息是完全一樣的，然後再按照相似的方法，遞歸地處理下標 $5$ 存在的右子樹。

性能分析[編輯]

靜態可持久化線段樹在每次更新版本時，總是最大程度上減少結點的複製，這樣不僅減少了時間的開銷，更避免了不必要的空間浪費。與線段樹相同，可持久化線段樹由於在每一次更新版本時沒有訪問到不必要的結點，所以每一次查詢和修改（即建立一個新的版本）時間複雜度為 $O(\log N)$ , 在這個過程中，會同時建立 $O(\log N)$ 個新的結點。如果總操作數量為 $m$ , 那麼可持久化線段樹可以以 $O(N+M\log N)$ 的時空複雜度解決問題。

實踐[編輯]

靜態區間第k大數值[編輯]

這類問題需要求解在一個長度為 $n$ 的數列中，第 $i$ 個數為 $a_{i}$ . 現在給定一些形如 $(l,r,k)$ 的三元組作為詢問，設計程序計算數列第 $l~r$ 這些元素中出現次數排在第 $k$ 位的是多少。

利用可持久化線段樹，維護區間 $(l,r)$ 代表區間 $[l,r]$ 中的元素出現了多少次，以此作為原始版本 $S_{0}$ ，此後每次建立一個新版本 $S_{i}$ ，代表去掉原數列中 $a_{0}~a_{i-1}$ 的元素之後建立的線段樹，維護目標與上述相同。具體過程可以每次將 $a_{i}$ 的出現次數減一，並保存此時生成的新版本。

參考程序[編輯]

C++[編輯]

#include<bits/stdc++.h>

constexpr auto MAXN = (int)2e5 + 500;

std::map<int, int> val, mp;

struct Node {
	int fr, to, sum;
	Node *lft, *rgt;
	Node& Copy(Node *targ) {
		fr = targ->fr; to = targ->to; sum = targ->sum;
		lft = targ->lft; rgt = targ->rgt;
		return *this;
	}
};
Node* NewNode() {
	Node* ret = (Node*)malloc(sizeof(Node));
	ret->lft = ret->rgt = nullptr;
	return ret;
}
std::vector<Node*> version;

int num[MAXN], cnt[MAXN], orig[MAXN], fr[MAXN], to[MAXN], rank[MAXN];
std::queue<Node*> que, add;

signed main(void)
{
	int totNums, totQuery, cnt;

	//Read
	scanf("%d%d", &totNums, &totQuery);
	for (int i = 0; i < totNums; i++)scanf("%d", num + i);
	for (int i = 0; i < totQuery; i++)scanf("%d%d%d", fr + i, to + i, rank + i);

	for (int i = 0; i < totNums; i++) orig[i] = num[i];
	std::sort(num, num + totNums);
	cnt = 0; int tmp;
	mp[val[0] = *num] = cnt++;
	for (int i = 1; i < totNums; i++)
		if (num[i] != num[i - 1])
			tmp = cnt,mp[val[tmp] = num[i]] = cnt++;	
	for (int i = 0; i < totNums; i++)::cnt[num[i] = mp[orig[i]]]++;

	//Build
	Node *a, *b, *t;
	for (int i = 0; i < cnt; i++) {
		t = NewNode(); t->fr = t->to = i;
		t->sum = ::cnt[i];
		que.push(t);
	}
	for (; que.size() >= 2; std::swap(que, add)) {
		while (que.size() >= 2) {
			a = que.front(); que.pop(); b = que.front(); que.pop();
			t = NewNode(); t->fr = a->fr; t->to = b->to; t->sum = a->sum + b->sum;
			t->lft = a; t->rgt = b;
			add.push(t);
		}
		if (!que.empty()) { add.push(que.front()); que.pop(); }
	}version.push_back(que.front());
	//New Versions
	for (int del = 0; del < totNums; del++) {
		const int &target = num[del];
		t = a = NewNode(); b = version.back();
		while (true) {
			a->Copy(b); a->sum--;
			if (a->fr == target && a->to == target)break;
			if (a->lft->fr <= target && target <= a->lft->to) {
				a->lft = NewNode(); a = a->lft; b = b->lft;
			} else {
				a->rgt = NewNode(); a = a->rgt; b = b->rgt;
			}
		}version.push_back(t);
	}

	//Query
	int rnk;
	for (int i = 0; i < totQuery; i++)
	{
		a = version[--fr[i]]; b = version[to[i]]; rnk = rank[i];
		while (true) {
			if (a->lft == nullptr) { printf("%d\n", val[a->fr]); break; }

			if (a->lft->sum - b->lft->sum >= rnk) {
				a = a->lft; b = b->lft;
			}
			else {
				rnk -= a->lft->sum - b->lft->sum;
				a = a->rgt; b = b->rgt;
			}
		}
	}
	
	//system("pause");
	return 0;
}

參見[編輯]

參考文獻[編輯]

^ 李煜東. 算法竞赛进阶指南. 中原出版傳媒集團·河南電子音像出版社. 2018-1: P255–257. ISBN 978-7-83009-313-6. 請檢查|date=中的日期值 (幫助)
^ Antti Laaksonen. Guide to Competitive Programming: Learning and Improving Algorithms Through Contests 1st edition. Springer. 2017: P250–251. ISBN 978-3-319-72546-8 （英語）.

[lyd-1] 李煜東. 算法竞赛进阶指南. 中原出版傳媒集團·河南電子音像出版社. 2018-1: P255–257. ISBN 978-7-83009-313-6. 請檢查|date=中的日期值 (幫助)

[antti-2] Antti Laaksonen. Guide to Competitive Programming: Learning and Improving Algorithms Through Contests 1st edition. Springer. 2017: P250–251. ISBN 978-3-319-72546-8 （英語）.

[1]

[2]