crawler.max_dir_depth
Background
This option sets the maximum number of directories allowed in a URL's path. The crawler ignores any URL that has more directories than this limit. A very deep path usually indicates a crawler trap, so this limit should not be set too high.
This limit is not checked for dynamic URLs, e.g. ones containing a '?'.
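
The check described above can be sketched roughly as follows. This is only an illustration, not the crawler's actual implementation: the function names directory_depth and is_within_depth are hypothetical, and the sketch assumes that a "directory" is any non-empty path segment other than a trailing file name. The crawler may count segments differently.

```python
from urllib.parse import urlsplit


def directory_depth(url: str) -> int:
    """Count the directory segments in the path of a URL.

    Assumption: the last segment is a file name unless the path ends
    with '/', in which case every segment is a directory.
    """
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if path.endswith("/"):
        return len(segments)
    return max(len(segments) - 1, 0)


def is_within_depth(url: str, max_dir_depth: int) -> bool:
    """Return True if the URL passes the directory-depth check."""
    # Dynamic URLs (here: anything containing '?') are not checked.
    if "?" in url:
        return True
    return directory_depth(url) <= max_dir_depth


if __name__ == "__main__":
    limit = 3  # hypothetical value for crawler.max_dir_depth
    for candidate in (
        "http://example.com/a/b/c/page.html",
        "http://example.com/a/b/c/d/e/page.html",
        "http://example.com/a/b/c/d/e/page.cgi?id=1",
    ):
        print(candidate, is_within_depth(candidate, limit))
```

With a limit of 3, the first URL has three directories and is accepted, the second has five and is ignored, and the third is accepted regardless of depth because it is dynamic.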