您的当前位置：首页正文

跳表(SkipList) 原理分析，代码实现

来源：步旅网

零、前言

对于维护有序结构，我们一般采用平衡树，如 AVL-Tree，RB-Tree，Splay，Treap 等。不过除了Treap，剩下那几个都挺复杂的。

设计跳表（skip list），正是为了以另外一种简便直观的方式来完成对于有序集合的维护。

一、跳表

1.1 跳表

有序线性结构中可以通过二分查找来高效搜索目标
能否借鉴对半查找的思想，提高有序链表中元素查找的效率?
William Pugh（康奈尔大学博士马里兰大学教授）于1989年发表论文，提出了一种线性空间，O(logn) 单次操作的分层次的、相互耦合的链表结构——跳表（skip list）

1.2 基本思想

跳表是分层次、相互耦合的多个链表

注意到每层都有负无穷和正无穷两个哨兵，下文默认其存在

如何在这样一种多层链表中应用二分查找？图中S0层显然为O(n)的空间复杂度，如何保证整体空间复杂度为O(n)的呢？

1.2.1 空间复杂度

跳表自下而上：

第0层有 n 个节点
第 1 层，期望 n / 2 个节点
第 2 层，期望 n / 4 个节点
……
第 i 层，期望 n / 2^i 个节点

$\begin{align} & 假设有h层，则 \\ & \sum_{i=0}^{h} \left \lfloor \frac{n}{2^i} \right \rfloor \lt \sum_{i=0}^{h} \frac{n}{2^i} = n\sum_{i=0}^{h} \frac{1}{2^i} = 2n(1 - \frac{1}{2^{h+1}}) \lt 2n = O(n) \end{align}$

这样分配每层节点的数目，巧妙地保证了O(N)的空间复杂度

这里提前提一下：每列元素的高度通过抛硬币决定（1/2 01分布），从而保证了每层的期望节点数，这也是我们后续时间复杂度证明要用到的

1.2.2 查找

查找元素 v
p 指向最上层头节点，v 和 p->next->val 比较
如果 v < p->next->val，p 向下移动
如果 v > p->next->val，p 向右移动
否则，v = p->next->val，查找成功

以下面这张图查找21为例，红线代表指针p 的移动轨迹：

1.2.3 时间复杂度

跳表层数 h = O(logn)
竖向跳转：O(logn)
横向跳转：
$\begin{align} & 我们发现同层间横向跳转的节点都是本列的最高点(否则可以反证法证明可以在更高层提前跳转)\\ & 事实上，我们在确定每列高度时通过抛硬币决定，第 i 层的每个节点为本列最高点的概率为 (\frac{1}{2})^{i}\\ & 那么我们可以计算每层横向跳转的期望为：E = \sum_{i=1}^{\infin}\frac{i}{2^i} = 2\\ \end{align}$
因而查找时间复杂度：O(logn)，后面的插入，删除和查找也是类似的流程，时间复杂度同样也是O(logn)

1.2.3 插入

插入的话有一个问题：是否允许重复元素出现

这里约定一下，本文的跳表维护的是多重集，允许重复元素

插入元素 v
p 指向最上层头节点，v 和 p->next->val 比较
如果 v < p->next->val，p 向下移动
如果 v > p->next->val，p 向右移动
到最后一层结束时，插入位置为 p 的下一个位置
v 以抛硬币的方式决定是否向上生长，生长概率逐层减半

下图是插入17的示例：

1.2.4 删除

假定删除元素 v 存在
p 指向最上层头节点，v 和 p->next->val 比较
如果 v < p->next->val，p 向下移动
如果 v > p->next->val，p 向右移动
到最后一层结束时，删除位置为 p 的下一个位置

不再画图

1.2.5 小结

1.3 实现

写这篇博客参考了两个版本：

清华数据结构课本，邓俊辉泛型版本：四联表结构(Quadlist)，尝试写了下跟舞蹈链一样恶心，会大大增加博客体量，遂放弃

静态数组，非泛型版本

const int N = 1E5;
const int B = 25;

struct Node
{
    int key;
    int next[B];  // 塔内向上第 i + 1 层 节点的 后继节点
}pool[N + 2];

int update[N]; // 辅助数组，用于保存搜索路径上的位置，用于 插入 / 删除

1.3.1 节点定义

const int N = 1E5;
const int B = 25;

struct Node
{
    int key;
    int next[B];  // 塔内向上第 i + 1 层 节点的 后继节点
}pool[N + 2];

int update[N]; // 辅助数组，用于保存搜索路径上的位置，用于 插入 / 删除

1.3.2 随机高度

int rand_level()
{
    static std::mt19937 rng(std::chrono::steady_clock::now().time_since_epoch().count());
    int res = 1;
    while ((rng() & 1) && res < B) {
        ++ res;
    }
    return res;
}

1.3.3 查询

int find(int key)
{
    int p = head;

    for (int i = B - 1; ~i; i--)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
    }

    p = pool[p].next[0];

    return p;
}

1.3.4 插入

void insert(int key)
{
    int p = head;
    int K = rand_level();
    for (int i = B - 1; ~i; -- i)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
        update[i] = p;
    }

    int node = new_node(key);

    for (int i = K - 1; ~i; -- i)
    {
        pool[node].next[i] = pool[update[i]].next[i];
        pool[update[i]].next[i] = node;
    }
}

1.3.5 删除

void erase(int key)
{
    int p = head;

    for (int i = B - 1; ~i; --i)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
        update[i] = p;
    }

    // shrink(pool[p].next[0]);

    for (int i = B - 1; ~i; --i)
    {
        if (pool[pool[update[i]].next[i]].key == key) {
            pool[update[i]].next[i] = pool[pool[update[i]].next[i]].next[i];
        }
    }
}

二、OJ练习

2.1 宠物收养场

原题链接

[P2286

思路分析

维护一个有序集合

我们用一个有序集合维护权值，这里的权值可能是宠物的权值，也可能是客人的权值

到来一个宠物 / 人，不妨设其类型为 t，权值为x
如果集合空，那么加入集合
如果 t 和当前集合维护的类型相同，那么加入其权值到集合
否则，如果从集合拿出一个权值严格小于x的节点，和权值不小于x的节点，删除掉代价小的那个

有序集合的维护可以用跳表来实现

时间复杂度：O(nlogn)

AC代码

#include <bits/stdc++.h>

// #define DEBUG

using u32 = unsigned;
using i64 = long long;
using u64 = unsigned long long;

constexpr int inf32 = 1E9 + 7;
constexpr i64 inf64 = 1E18 + 7;

constexpr int N = 1E5;
constexpr int B = 25;
constexpr int P = 1000000;

struct Node
{
    int key;
    int next[B];  // 塔内向上第 i 层 节点的 后继节点
} pool[N + 2];

int update[B];
std::vector<int> nodes;

int head, tail;

int rand_level()
{
    static std::mt19937 rng(std::chrono::steady_clock::now().time_since_epoch().count());
    int res = 1;
    while ((rng() & 1) && res < B) {
        ++res;
    }
    return res;
}

int new_node(int key)
{
    int res = nodes.back();
    nodes.pop_back();
    pool[res].key = key;
    return res;
}

void init()
{
    head = new_node(-inf32);
    tail = new_node(inf32);
    for (int i = 0; i < B; ++ i) {
        pool[head].next[i] = tail;
    }
}

void insert(int key)
{
    int p = head;
    int K = rand_level();
    for (int i = B - 1; ~i; -- i)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
        update[i] = p;
    }

    int node = new_node(key);

    for (int i = K - 1; ~i; -- i)
    {
        pool[node].next[i] = pool[update[i]].next[i];
        pool[update[i]].next[i] = node;
    }
}

int find(int key)
{
    int p = head;

    for (int i = B - 1; ~i; i--)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
    }

    // p = pool[p].next[0];

    return p;
}

void shrink(int p) {
    nodes.push_back(p);
}

void erase(int key)
{
    int p = head;

    for (int i = B - 1; ~i; --i)
    {
        while (pool[p].next[i] != tail && pool[pool[p].next[i]].key < key) {
            p = pool[p].next[i];
        }
        update[i] = p;
    }

    shrink(pool[p].next[0]);

    for (int i = B - 1; ~i; --i)
    {
        if (pool[pool[update[i]].next[i]].key == key) {
            pool[update[i]].next[i] = pool[pool[update[i]].next[i]].next[i];
        }
    }
}

auto INIT = [] {
    nodes.resize(N + 2);
    for (int i = 0; i < nodes.size(); ++ i) {
        nodes[i] = nodes.size() - i - 1;
    }
    init();
    return 0;
}();

void solve() {
    int n;
    std::cin >> n;

    int ans = 0;
    int cur = 0, siz = 0;

    for (int i = 0; i < n; ++ i) {
        int a, b;
        std::cin >> a >> b;
        if (siz == 0) {
            cur = a;
            insert(b);
            ++ siz;
        } else if (a == cur){
            insert(b);
            ++ siz;
        } else {
            int j = find(b);

            if (pool[pool[j].next[0]].key - b < b - pool[j].key) {
                j = pool[j].next[0];
            }

            ans = (1LL * ans + std::abs(pool[j].key - b)) % P;
            erase(pool[j].key);
            -- siz;
        }

    }

    std::cout << ans << '\n';
}

int main() {
    std::ios::sync_with_stdio(false);
    std::cin.tie(nullptr);

#ifdef DEBUG
    int START = clock();
    freopen("in.txt", "r", stdin);
    freopen("out.txt", "w", stdout);
#endif

    int t = 1;
    // std::cin >> t;

    while (t --) {
        solve();
    }
#ifdef DEBUG
    std::cerr << "run-time: " << clock() - START << '\n';
#endif
    return 0;
}

因篇幅问题不能全部显示，请点此查看更多更全内容

查看全文