Threshold for the edit distance between two versions of a page, used to decide whether the page has changed.
Can be set in: collection.cfg
This parameter specifies the threshold used when deciding whether the content of a URL has changed compared to a previous version. The edit distance is the number of operations (add, edit, delete) required to transform one version of the content into the other.
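As an illustration of the edit distance described above, here is a minimal Python sketch of the classic Levenshtein distance, which counts single-character insertions, substitutions and deletions. It is an assumption that this matches the "add, edit, delete" operations mentioned here; the crawler's actual algorithm, and whether it compares characters, words or some other unit, is not stated in this document. The function name edit_distance is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (illustrative sketch only).

    Counts the minimum number of single-character insertions,
    substitutions and deletions needed to turn a into b. This is an
    assumed stand-in for the crawler's own calculation.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]
```

For example, edit_distance("kitten", "sitting") returns 3: two substitutions and one insertion.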
If the edit distance is less than this threshold, the page is marked as "unchanged" and this information is fed into the crawler’s revisit policy. Pages that rarely change may be revisited less often, with a stored copy of their content used instead.
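The comparison itself can be sketched as follows. This is a hedged illustration of the rule stated above, not the crawler's code: the function name is_unchanged and its parameters are hypothetical, and the distance is assumed to have been computed already.

```python
def is_unchanged(distance: int, threshold: int) -> bool:
    """Return True when a page would be marked as "unchanged".

    Follows the rule as documented: the page is unchanged only when
    the edit distance is strictly less than the threshold.
    (Hypothetical helper for illustration.)
    """
    return distance < threshold
```

Under this reading, with a threshold of 5 a page whose edit distance is 4 is treated as unchanged, while a distance of exactly 5 counts as changed. A higher threshold therefore tolerates more minor differences before a page is considered changed.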