The existing "list" command uses a hash table to detect duplicate items as it
traverses the list. The hash table approach has worked well for many years.
However, with increasing memory sizes and list sizes, the overhead of the
hash table can be substantial, often leading to commands running for a very
long time. For large lists, we have found that the existing hash based approach
may slow the system to a crawl and possibly never complete. You can turn off
the hash with "hash off" but then there is no loop detection. In this case,
loop detection must be done manually after dumping the list to disk or some
other method. This patch is an implementation of the cycle detection algorithm
from R. P. Brent as an alternative algorithm for the "list" command. The
algorithm both avoids the overhead of the hash table and yet is able to detect
a loop. In addition, further loop characteristics are printed, such as the
distance to the start of the loop as well as the loop length.
The algorithm from R. P. Brent uses only a tiny amount of memory (two pointers)
and adds extremely minimal overhead if no loop is detected. If a loop is detected
it adds additional overhead of advancing through the list to find the start of
the loop. While this overhead may be non-trivial for large lists, the cost of
each list advance should be a fixed cost and compares favorably to an ever
increasing cost of a list advance with a hash table. Thus in all instances,
the algorithm is superior to a hash table approach for list traversal. The
algorithm is divided into two phases, detection of a loop, and detection of
the loop start (first duplicate).
The first phase of the algorithm adds almost no overhead to an existing list
traversal. To detect a loop a second pointer is added which follows the first
list traversal pointer periodically according to an increasing power of two
length. Eventually, if there is a loop, this increasing power of two will be
larger than the loop size, and the list traversal pointer will eventually be
equal to the second pointer that is periodically moved. If the list traversal
pointer never reaches this second pointer, there is no loop.
If a loop is detected, the second phase of the algorithm is entered, and this
adds overhead. First a couple points of explanation and a diagram of a list
with a loop example:
non_loop_len = 5
+--->--->--->---^--->
| | loop_len = 4
| |
<---v
In the above diagram, we have two segments: a) a non_loop segment (or
"non-perodic" segment), and b) a loop segment (or "periodic"
segment).
As a side note, in the literature, these lengths are often given
two specific greek letters with the length of the loop segment referred
to as lambda, while the non-loop segment length is referred to as mu.
In any case, the important thing to understand is there are two distinct
segments now, with two different lengths.
Once we have exited from Brent's first phase, the loop length is known.
Why is that? For the first phase to have exited, the traversal pointer had
to have met the following pointer. And for that to happen, the traversal
pointer had to have been stationary for the length of the loop, and the
power of two was greater than the loop length.
When the loop is detected, the following message is printed:
list loop detected, loop length: 4
The second phase of the loop is entered, with the purpose of identifying
the non-loop length as well as the point at which the loop and the non-loop
intersect (this is the first duplicate). To obtain these objectives, we
note the following relation of steps required to reach the intersection:
itersection = non_loop_len + N*loop_len
for any integer N where N >= 0
The second phase does the following:
1. Reset the first and second pointer to the start
2. Advance the second pointer loop_len steps
3. Advance both pointers until they meet, saving the step count
The second pointer will have gone both non_loop_len + loop_len steps to
arrive at the intersection, and the first pointer will have gone
non_loop_len steps to arrive at the intersection. When the second phase
exits (both pointers are equal) we have the non_loop length.
When the loop exits, the following message is printed:
list loop length from start to loop: 5
duplicate list entry: ffff8abdf3bb5d00
An example of the full output of this new algorithm when a list loop is
detected is as follows:
crash> list -H 0xffff8ac03c81fc28
ffff8abdf38b7d00
ffff8abdf38b7ac0
ffff8abdf38b7f40
ffff8abe0edb4b00
ffff8abdf3bb5d00
ffff8abdf3bb5b80
ffff8abdf3bb5640
ffff8abe0e92b100
ffff8abe0e92ad40
ffff8abdf3bb5d00
ffff8abdf3bb5b80
ffff8abdf3bb5640
list: loop detected, loop length: 5
list: length from start to loop: 4
list: duplicate list entry: ffff8abdf3bb5d00
crash>
This compares with existing crash output as follows:
crash> list -H 0xffff8ac03c81fc28
ffff8abdf38b7d00
ffff8abdf38b7ac0
ffff8abdf38b7f40
ffff8abe0edb4b00
ffff8abdf3bb5d00
ffff8abdf3bb5b80
ffff8abdf3bb5640
ffff8abe0e92b100
ffff8abe0e92ad40
ffff8abdf3bb5d00
list: duplicate list entry: ffff8abdf3bb5d00
crash>
Signed-off-by: Dave Wysochanski <dwysocha(a)redhat.com>
---
tools.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 64 insertions(+), 4 deletions(-)
diff --git a/tools.c b/tools.c
index 0d4b8e5..ba525a8 100644
--- a/tools.c
+++ b/tools.c
@@ -3949,6 +3949,21 @@ static int do_list_no_hash_readmem(struct list_data *ld, ulong
*next_ptr,
return 0;
}
+ulong brent_x; /* tortoise */
+ulong brent_y; /* hare */
+ulong brent_r; /* power */
+ulong brent_lambda; /* loop length */
+ulong brent_mu; /* distance to start of loop */
+ulong brent_loop_detect;
+ulong brent_loop_exit;
+/*
+ * 'ptr': representative of x or y; modified on return
+ */
+static int brent_f(ulong *ptr, struct list_data *ld, ulong readflag)
+{
+ return do_list_no_hash_readmem(ld, ptr, readflag);
+}
+
/*
* Similar to do_list() but without the hash_table or LIST_ALLOCATE.
* Useful for the 'list' command and other callers needing faster list
@@ -3996,22 +4011,29 @@ do_list_no_hash(struct list_data *ld)
e[i] = fill_member_offsets(ld->structname[i]);
}
+ brent_loop_detect = brent_loop_exit = 0;
+ brent_lambda = 0;
+ brent_r = 2;
+ brent_x = brent_y = next;
+ ret = brent_f(&brent_y, ld, readflag);
+ if (ret == -1)
+ return -1;
while (1) {
- if (ld->flags & VERBOSE) {
+ if (!brent_loop_detect && ld->flags & VERBOSE) {
fprintf(fp, "%lx\n", next - ld->list_head_offset);
if (ld->structname) {
do_list_output_struct(ld, next, offset, radix, e);
}
}
- if (next && 0) { /* FIXME - duplicate detection */
+ if (next && brent_loop_exit) {
if (ld->flags &
(RETURN_ON_DUPLICATE|RETURN_ON_LIST_ERROR)) {
error(INFO, "\nduplicate list entry: %lx\n",
- next);
+ brent_x);
return -1;
}
- error(FATAL, "\nduplicate list entry: %lx\n", next);
+ error(FATAL, "\nduplicate list entry: %lx\n", brent_x);
}
if ((searchfor == next) ||
@@ -4030,6 +4052,43 @@ do_list_no_hash(struct list_data *ld)
if (ret == -1)
return -1;
+ if (!brent_loop_detect) {
+ if (brent_x == brent_y) {
+ brent_loop_detect = 1;
+ error(INFO, "loop detected, loop length: %lx\n", brent_lambda);
+ /* reset x and y to start; advance y loop length */
+ brent_mu = 0;
+ brent_x = brent_y = ld->start;
+ while (brent_lambda--) {
+ ret = brent_f(&brent_y, ld, readflag);
+ if (ret == -1)
+ return -1;
+ }
+ } else {
+ if (brent_r == brent_lambda) {
+ brent_x = brent_y;
+ brent_r *= 2;
+ brent_lambda = 0;
+ }
+ brent_y = next;
+ brent_lambda++;
+ }
+ } else {
+ if (!brent_loop_exit && brent_x == brent_y) {
+ brent_loop_exit = 1;
+ error(INFO, "length from start to loop: %lx",
+ brent_mu);
+ } else {
+ ret = brent_f(&brent_x, ld, readflag);
+ if (ret == -1)
+ return -1;
+ ret = brent_f(&brent_y, ld, readflag);
+ if (ret == -1)
+ return -1;
+ brent_mu++;
+ }
+ }
+
if (next == 0) {
if (ld->flags & LIST_HEAD_FORMAT) {
error(INFO, "\ninvalid list entry: 0\n");
@@ -4037,6 +4096,7 @@ do_list_no_hash(struct list_data *ld)
}
if (CRASHDEBUG(1))
console("do_list end: next:%lx\n", next);
+
break;
}
--
1.8.3.1