On 06/26/2018 11:15 AM, Dave Anderson wrote:
----- Original Message -----
> On 06/26/2018 10:40 AM, David Wysochanski wrote:
>> On Tue, 2018-06-26 at 11:27 -0400, Dave Anderson wrote:
>>>
>>> ----- Original Message -----
>>>> On Tue, 2018-06-26 at 15:34 +0100, Jeremy Harris wrote:
>>>>> On 06/26/2018 03:29 PM, David Wysochanski wrote:
>>>>>> On Tue, 2018-06-26 at 09:21 -0400, Dave Anderson wrote:
>>>>>>> Yes, by default all list entries encountered are put in the
>>>>>>> built-in hash queue, specifically for the purpose of determining
>>>>>>> whether there are duplicate entries. So if it's still running,
>>>>>>> it hasn't found any.
>>>>>>>
>>>>>>> To avoid the use of the hashing feature, try entering
>>>>>>> "set hash off" before kicking off the command. But of course if
>>>>>>> the list does contain duplicates, it will loop forever.
>>>>>>>
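
For illustration, a session using the toggle described above might look
like this (the starting address is hypothetical, and the exact echoed
output may vary by crash version):

    crash> set hash off
    hash: off
    crash> list -H ffff88012adcf2c0 > list.out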
>>>>>>
>>>>>> Ah ok yeah I forgot about the built-in list loop detection!
>>>>>
>>>>> For a storage-less method of list loop-detection: run two walkers
>>>>> down the list, advancing one by two elements and the other by one.
>>>>> If the two walkers ever land on the same element after starting,
>>>>> you have a loop.
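
For reference, a minimal C sketch of the two-walker scheme described
above (Floyd's cycle detection), using a hypothetical singly-linked
node type rather than crash's internal list handling:

struct node {
    struct node *next;
};

/* Returns 1 if the list loops back on itself, 0 if it terminates. */
static int list_has_loop(struct node *head)
{
    struct node *slow = head;    /* advances one element per step */
    struct node *fast = head;    /* advances two elements per step */

    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)        /* the walkers met, so the list cycles */
            return 1;
    }
    return 0;                    /* fast hit NULL, so the list terminates */
}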
>>>>
>>>> I agree some algorithm [1] without a hash table may be better,
>>>> especially for larger lists.
>>>
>>> I'll await your patch...
>>>
>>
>> Do you see any advantage to keeping the hash table for loop detection,
>> or would you accept a patch that removes it completely in favor of
>> another algorithm?
>
> Could the same algorithm be modified so that it can slow down after a
> certain number of list members, say by saving only every 10th element
> to the hash (but checking every new one)?
> I've seen list corruption where a list containing hundreds of entries
> contains an entry that points back to another entry in the list that
> came hundreds of entries before it.

Right, but if we didn't detect the first duplicate, we should still see
one in the next 10 entries. Once the list repeats, it should repeat
identically.
Dave
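
For illustration, a self-contained C sketch of the "store every 10th
entry, but check every entry" idea discussed above. The toy hash set
below merely stands in for crash's built-in hash queue, and struct node
is hypothetical. Once the walk re-enters a cycle it revisits entries in
the same order as before, so it reaches a stored entry within roughly
one stride, which is the "next 10 entries" reasoning above.

#include <stdlib.h>

#define HASH_STRIDE   10
#define HASH_BUCKETS  1024

struct node {
    struct node *next;
};

/* One chained bucket entry of the toy pointer-keyed hash set. */
struct seen {
    struct node *entry;
    struct seen *chain;
};

static struct seen *buckets[HASH_BUCKETS];

static unsigned long bucket_of(struct node *p)
{
    return ((unsigned long)p >> 4) % HASH_BUCKETS;
}

static int hash_contains(struct node *p)
{
    struct seen *s;

    for (s = buckets[bucket_of(p)]; s; s = s->chain)
        if (s->entry == p)
            return 1;
    return 0;
}

static void hash_add(struct node *p)
{
    struct seen *s = malloc(sizeof(*s));

    if (!s)
        return;                      /* toy code: silently skip on OOM */
    s->entry = p;
    s->chain = buckets[bucket_of(p)];
    buckets[bucket_of(p)] = s;
}

/* Walk the list, checking every entry against the set but storing
 * only every HASH_STRIDE-th one.  Returns 1 if a loop is detected. */
static int walk_detect(struct node *head)
{
    struct node *cur = head;
    unsigned long count = 0;

    while (cur) {
        if (hash_contains(cur))           /* check every entry... */
            return 1;
        if ((count++ % HASH_STRIDE) == 0)
            hash_add(cur);                /* ...store only every 10th */
        cur = cur->next;
    }
    return 0;
}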