使用二级指针操作链表

现在的编程过程中，我们很少需要亲自动手实现链表这类数据结构，因为标准库（甚至语言本身）往往已经提供了相关的功能。

然而，对于底层的编程——尤其是系统编程而言，手工实现并优化一些底层的数据结构仍然是必要的。要知道，我们可能会面对一个无法使用动态语言、无法使用C标准库、即使是“hello, world”也要通过IO指令输出的环境。

本文将探讨手工实现的链表中，关于指针的一个小技巧。

通常的链表操作

（会看这篇博客的）大部分读者对链表都很熟悉了，基本概念略过。

我们定义一个简单的单向链表，包含一个整数作为内容：

1 struct node {
2     struct node *next;
3     int content;
4 };

就像教科书上写的那样，我们可以用一个指针，从头到尾访问整个链表：

1 void scan(struct node *head) {
2     struct node *current = head;
3 
4     while (current) {
5         // do something
6 
7         current = current->next;
8     }
9 }

然后，我们实现一个简单的链表插入：

 1 void insert(struct node *head, int index, struct node *node) {
 2     struct node *current = head;
 3 
 4     while (current) {
 5         --index;
 6         if (index == 0) {
 7             node->next = current->next;
 8             current->next = node;
 9             return;
10         }
11 
12         current = current->next;
13     }
14 
15     // error
16 }

Duang！不出所料，insert(head, 0, node)挂了。

我们发现，链表的第一个节点是特殊的。其它节点都是上一个节点的->next，唯独第一个节点是通过head指针传入的。而且需要注意的是，当index为0时，head指针本身就应该指向新插入的节点。

于是我们秉持着打那指哪的精神，应付了一下：

 1 struct node *insert(struct node *head, int index, struct node *node) {
 2     // head
 3     if (index == 0) {
 4         node->next = head;
 5 
 6         return node;
 7     }
 8 
 9     // others
10     struct node *current = head;
11 
12     while (current) {
13         --index;
14         if (index == 0) {
15             node->next = current->next;
16             current->next = node;
17             return head;
18         }
19 
20         current = current->next;
21     }
22 
23     // error
24 }

这个方法也适用于在链表中删除节点。需要注意，如果要插入或删除多个节点，第一个if需要相应地改为循环。

但是，把代码写两遍，分开处理第一个节点和其它节点。这显然是一种坏味道。如果操作本身更复杂一些，例如根据content值来决定是否要删除，或者对链表节点进行排序，代码会变得更加复杂。

有几种可能的应对方式，一种是再写一个函数，把重复的代码包装起来；另一种是使用一个指针存放上一个node，有经验的“教科书式程序员”会选择这样做。

我们来看一个删除节点的例子：

 1 struct node *remove(struct node *head, int value) {
 2     struct node *last = 0;
 3     struct node *current = head;
 4 
 5     while (current) {
 6         if (current->content == value) {
 7             struct node *next = current->next;
 8 
 9             free(current);
10             current = next;
11 
12             if (last) {
13                 last->next = current;
14             } else {
15                 head = current;
16             }
17         } else {
18             last = current;
19             current = current->next;
20         }
21     }
22 
23     return head;
24 }

这种实现对于通常的使用来说，已经足够。但把last和head割裂开的做法，对于追求完美的我们来说，还是不够优雅。于是，请出本文的主角——二级指针。

使用二级指针

先不急着解决问题。我们观察remove函数的接口：

1 struct node *remove(struct node *head, int value);

在使用这个函数时，需要先传入head，然后再把返回值重新赋给head。换句话说，head在这里应该是一个可以被改写的参数。

这种可改写的参数，在Delphi里叫var参数，在C++里是传引用的参数。

在C里我们可以怎么做呢？对，二级指针：

 1 void remove(struct node **head, int value) {
 2     struct node *last = 0;
 3     struct node *current = *head;
 4 
 5     while (current) {
 6         if (current->content == value) {
 7             struct node *next = current->next;
 8 
 9             free(current);
10             current = next;
11 
12             if (last) {
13                 last->next = current;
14             } else {
15                 *head = current;
16             }
17         } else {
18             last = current;
19             current = current->next;
20         }
21     }
22 }

仔细想想，remove函数对链表中指针的修改，无非是改head和改last->next这两种。next指针和*head指针其实是对称的。

这种对称关系可以用递归来展现：

 1 void remove(struct node **head, int value) {
 2     struct node *current = *head;
 3 
 4     if (current) {
 5         if (current->content == value) {
 6             struct node *next = current->next;
 7 
 8             free(current);
 9             current = next;
10 
11             *head = current;
12 
13             remove(head, value);
14         } else {
15             remove(&current->next, value);
16         }
17     }
18 }

我们之所以可以用递归来处理链表，正是因为next和*head具有这种特殊的关系。一个链表可以看成另一个链表的头上插入一个元素。而用于表示那“另一个链表”的指针正好就是第一个元素的next指针。

递归中的current其实是多余的，可以直接用*head来代替：

 1 void remove(struct node **head, int value) {
 2     if (*head) {
 3         if ((*head)->content == value) {
 4             struct node *next = (*head)->next;
 5 
 6             free(*head);
 7             *head = next;
 8 
 9             remove(head, value);
10         } else {
11             remove(&(*head)->next, value);
12         }
13     }
14 }

然后，我们把递归化成循环，就得到了简洁而一致的实现：

 1 void remove(struct node **head, int value) {
 2     while (*head) {
 3         if ((*head)->content == value) {
 4             struct node *next = (*head)->next;
 5 
 6             free(*head);
 7             *head = next;
 8         } else {
 9             head = &(*head)->next;
10         }
11     }
12 }

当删除链表第一个节点的时候，remove函数会改写*head，使它指向第二个元素（即，第一个元素的next指向的节点）；而删除之后的节点的时候，被改写的则是上一个节点的next指针。

我们再来看看如何实现插入：

 1 void insert(struct node **head, int index, struct node *node) {
 2     while (*head) {
 3         if (index == 0) {
 4             node->next = *head;
 5             *head = node;
 6             return;
 7         }
 8         --index;
 9 
10         head = &(*head)->next;
11     }
12 }

实现方法和remove大抵类似。

不用二级指针

二级指针还是有些抽象的，有没有一种几乎等价、但是更容易理解的实现方式呢？

答案是肯定的。

我们换一个角度思考——既然可以用head代替last->next，那我们也可以使用last->next来代替head。

方法很简单——建立一个空节点，作为虚拟的last节点，然后让last->next指向链表的第一个节点：

1 struct node last;
2 last->next = /* head of the linked list */;

用last->next代替之前的*head，来操作链表就可以了。代码差别不大：

 1 void remove(struct node *last, int value) {
 2     while (last->next) {
 3         if (last->next->content == value) {
 4             struct node *next = last->next->next;
 5 
 6             free(last->next);
 7             last->next = next;
 8         } else {
 9             last = last->next;
10         }
11     }
12 }

这种虚拟节点对于双向链表来说非常方便。我们可以直接使用一个节点entry来保存头尾指针，即entry->next指向第一个元素，entry->last指向最后一个元素。

《操作系统》课的JOS lab里，就出现了二级指针。这是本文的来由。因此我们用lab的方式结束本文：

This completes the blog.

~~> > <~~

>> return 0

>>> 使用二级指针操作链表

>>>> C++编译时的“变量”（下篇）

>>>>> C++编译时的“变量”（上篇）

<<<<<<