On 06/26/2018 12:01 PM, Dave Kleikamp wrote:
On 06/26/2018 11:15 AM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>> On 06/26/2018 10:40 AM, David Wysochanski wrote:
>>> On Tue, 2018-06-26 at 11:27 -0400, Dave Anderson wrote:
>>>>
>>>> ----- Original Message -----
>>>>> On Tue, 2018-06-26 at 15:34 +0100, Jeremy Harris wrote:
>>>>>> On 06/26/2018 03:29 PM, David Wysochanski wrote:
>>>>>>> On Tue, 2018-06-26 at 09:21 -0400, Dave Anderson wrote:
>>>>>>>> Yes, by default all list entries encountered are put in
the built-in
>>>>>>>> hash queue, specifically for the purpose of determining
whether there
>>>>>>>> are duplicate entries. So if it's still running, it
hasn't found any.
>>>>>>>>
>>>>>>>> To avoid the use of the hashing feature, try entering
"set hash off"
>>>>>>>> before kicking off the command. But of course if it
finds any, it
>>>>>>>> will loop forever.
>>>>>>>>
>>>>>>>
>>>>>>> Ah ok yeah I forgot about the built-in list loop detection!
>>>>>>
>>>>>> For a storage-less method of list loop-detection: run two
walkers
>>>>>> down the list, advancing two versus one elements. If you ever
>>>>>> match the same element location after starting, you have a loop.
>>>>>
>>>>> I agree some algorithm [1] without a hash table may be better
>>>>> especially for larger lists.
>>>>
>>>> I'll await your patch...
>>>>
>>>
>>> Do you see any advantage to keeping the hash table for loop detection
>>> or would you accept a patch that removes it completely in favor of a
>>> another algorithm?
>>
>> Could the same algorithm be modified so that it can slow down after a
>> certain number of list members, say maybe saving only every 10th element
>> to the hash (but checking every new one)?
>
> I've seen list corruption where a list containing hundreds of entries
> contains an entry that points back to another entry in the list that
> came hundreds of entries before it.
Right, but if we didn't detect the first duplicate, we should still see
one in the next 10 entries. Once the list repeats, it should repeat
identically.
Why choose the longer loop step size value at code creation? Make it
user-configurable with a minimum of 2 (longer than 1) and force the
short step size in code to be 1. Document the method of selection for
the long step size but make that documentation include enough to
understand the effect (and purpose). Then let people perform their own
experiments based on the sizes of the lists they believe they are
working with.
--
David.