Specify a regex to allow crawler redirects that would otherwise be disallowed by the current include/exclude patterns.

Key: crawler.allowed_redirect_pattern
Type: String
Can be set in: collection.cfg


When the crawler is redirected to a URL, it checks that URL against the include/exclude patterns to determine whether it should continue processing it. Usually, if the URL doesn’t match the include/exclude rules, the crawler has wandered offsite and shouldn’t proceed any further.

However, some websites use external authentication portals. This setting allows the crawler to continue processing a URL even though it has been redirected offsite. The contents of the offsite pages won’t be stored, but the crawler will still be allowed to proceed, e.g. for authentication or form interaction.

This check is case-sensitive.
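
The decision described above can be sketched roughly as follows. This is an illustrative Python model, not the crawler's actual implementation: the function name `redirect_decision`, the return labels, the substring-style include/exclude matching, and the example hostnames are all assumptions made for the sketch, and the crawler's own regex engine may differ from Python's `re` module in details.

```python
import re


def redirect_decision(url, include, exclude, allowed_redirect_pattern=None):
    """Classify a redirect target as 'store', 'follow_only', or 'reject'.

    Hypothetical model: include/exclude are treated as plain substrings,
    and the allowed-redirect regex may match anywhere in the URL.
    """
    in_scope = (any(p in url for p in include)
                and not any(p in url for p in exclude))
    if in_scope:
        # Normal case: the redirect target passes include/exclude.
        return "store"
    # No re.IGNORECASE flag, so the check is case-sensitive.
    if allowed_redirect_pattern and re.search(allowed_redirect_pattern, url):
        # Offsite but explicitly allowed: proceed without storing content.
        return "follow_only"
    return "reject"


if __name__ == "__main__":
    pattern = r"login\.example\.org"  # hypothetical authentication portal
    for target in ("https://www.example.com/page",
                   "https://login.example.org/sso",
                   "https://LOGIN.example.org/sso"):
        print(target, redirect_decision(target, ["www.example.com"], [], pattern))
```

In this model the third URL is rejected only because of letter case, which is the practical consequence of a case-sensitive check: the pattern has to be written in the same case as the redirect targets the site actually produces.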

The following will allow the crawler to be redirected to any URL containing (without scraping additional links from the redirected site).