Specify a regex to allow crawler redirections that would otherwise be disallowed by the current include/exclude patterns.
Can be set in: collection.cfg
When the crawler is redirected to a URL, it checks that URL against the include/exclude patterns to determine whether it should continue processing it. Usually, if the URL doesn’t match the include/exclude rules, it means the crawler has wandered offsite and shouldn’t proceed any further.
However, some websites use external authentication portals. The purpose of this variable is to allow the crawler to continue processing a URL even though it has been redirected offsite. The contents of the offsite pages won’t be stored, but the crawler will still be allowed to proceed, e.g. for the purposes of authentication / form interaction.
Note: This check is case-sensitive.
For example, a pattern that matches gatekeeper.com will allow the crawler to be redirected to any URL containing gatekeeper.com (without scraping additional links from the redirected site).
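
The decision described above can be sketched roughly in Python. This is an illustrative model only, not the crawler's actual implementation: the function names (`in_scope`, `handle_redirect`), the substring-based include/exclude check, the use of a "contains" style regex search, and the example hostnames are all assumptions made for the sake of the example.

```python
import re

def in_scope(url, include_patterns, exclude_patterns):
    # Simplified stand-in for the include/exclude check (assumption:
    # plain substring matching; the real rules may differ).
    included = any(p in url for p in include_patterns)
    excluded = any(p in url for p in exclude_patterns)
    return included and not excluded

def handle_redirect(url, include_patterns, exclude_patterns,
                    allowed_redirect_regex=None):
    """Decide what to do with a URL the crawler was redirected to.

    Returns a dict with two flags:
      follow - whether the crawler keeps processing the URL
      store  - whether the page content would be kept
    """
    if in_scope(url, include_patterns, exclude_patterns):
        # Normal case: the redirect target is still on-site.
        return {"follow": True, "store": True}
    # Off-site target: only proceed if the allowed-redirect regex matches.
    # re.search is case-sensitive by default, mirroring the note above.
    if allowed_redirect_regex and re.search(allowed_redirect_regex, url):
        return {"follow": True, "store": False}
    # Off-site and not explicitly allowed: stop here.
    return {"follow": False, "store": False}

# Hypothetical usage: example.com is the site being crawled and
# gatekeeper.com is its external login portal.
include = ["example.com"]
exclude = ["/private"]
pattern = r"gatekeeper\.com"

print(handle_redirect("https://login.gatekeeper.com/auth", include, exclude, pattern))
print(handle_redirect("https://login.GateKeeper.com/auth", include, exclude, pattern))
```

In this sketch the second call is not followed, because the mixed-case hostname does not match the lower-case pattern. Escaping the dot (`gatekeeper\.com`) keeps the pattern from also matching strings such as `gatekeeperXcom`.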