----- Original Message -----
On Tue, 2018-06-26 at 11:59 -0400, Dave Anderson wrote:
>
> ----- Original Message -----
> > On Tue, 2018-06-26 at 11:27 -0400, Dave Anderson wrote:
> > >
> > > ----- Original Message -----
> > > > On Tue, 2018-06-26 at 15:34 +0100, Jeremy Harris wrote:
> > > > > On 06/26/2018 03:29 PM, David Wysochanski wrote:
> > > > > > On Tue, 2018-06-26 at 09:21 -0400, Dave Anderson wrote:
> > > > > > > Yes, by default all list entries encountered are put
in the
> > > > > > > built-in
> > > > > > > hash queue, specifically for the purpose of
determining whether
> > > > > > > there
> > > > > > > are duplicate entries. So if it's still running,
it hasn't
> > > > > > > found
> > > > > > > any.
> > > > > > >
> > > > > > > To avoid the use of the hashing feature, try entering
"set hash
> > > > > > > off"
> > > > > > > before kicking off the command. But of course if it
finds any,
> > > > > > > it
> > > > > > > will loop forever.
> > > > > > >
> > > > > >
> > > > > > Ah ok yeah I forgot about the built-in list loop
detection!
> > > > >
> > > > > For a storage-less method of list loop-detection: run two
walkers
> > > > > down the list, advancing two versus one elements. If you ever
> > > > > match the same element location after starting, you have a
loop.
> > > >
> > > > I agree some algorithm [1] without a hash table may be better
> > > > especially for larger lists.
> > >
> > > I'll await your patch...
> > >
> >
> > Do you see any advantage to keeping the hash table for loop detection
> > or would you accept a patch that removes it completely in favor of a
> > another algorithm?
>
> For maintenance sake, it's probably worth keeping the hash queue option
> in place, primarily since there are a few dozen other internal facilities
> besides the user "list" command that use the do_list() function.
>
Ok. I was actually thinking about replacing all the callers of
hq_enter() with a new algorithm. There are 32 of them so it would be a
non-trivial patch though longer term may be simpler maintenance.
Probably only now it matters more since it's much more common to have
vmcores 100's of GB and much larger lists but maybe some of these it
would not be useful replacement.
Duplicate list detection is not the only reason for the hq_enter() facility.
It has to stay in place, because after collecting a list, several hq_enter()
users need to read all of the entries back out for whatever post-processing
they need to do.
So it can't just be a "repeal-and-replace"... ;-)
Dave