Crawler Monitor Url Reject List (collection.cfg)


This parameter can be modified during a running crawl to tell the crawler to ignore the specified URLs for the remainder of the crawl. If you know before a crawl starts which areas to avoid, you would normally add them to the exclude_patterns parameter instead. The value is a comma-separated list of URLs.

Matching URLs gathered prior to this configuration change will not be affected.

NB: Each pattern must include a protocol/scheme (for example, an http:// prefix) and not just a bare hostname.

Default value



Reject any URLs from the given sites or sub-sites during a running crawl:
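
As an illustration only, a setting of this kind might look like the sketch below. The key name crawler.monitor_url_reject_list is inferred from the page title and should be checked against your installation. The two URLs are placeholders, not real sites. The line starting with # is an annotation for the reader and should be dropped if your collection.cfg does not accept comment lines.

```
# Hypothetical example: key name inferred from the page title, URLs are placeholders
crawler.monitor_url_reject_list=http://www.example.com/,http://intranet.example.com/archive/
```

Each entry carries its protocol, and the entries are separated by commas with no surrounding spaces. URLs matching these entries that were gathered before the change remain unaffected.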