Path to a file containing a list of URLs, one per line, that are used as the starting points for a crawl.

Key: crawler.start_urls_file
Type: String
Can be set in: collection.cfg


The initial list of start URLs combines every URL declared in the file specified here with the URLs set in start_url.

Use only the HTTP or HTTPS protocol in each URL.
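
As a rough illustration of how the two sources fit together, the following Python sketch merges lines from a start URLs file with start_url values. It is not the crawler's implementation: the function name combine_start_urls, the example.com addresses, the de-duplication step, and the choice to silently skip non-HTTP/HTTPS entries are all assumptions made for this example.

```python
from urllib.parse import urlsplit


def combine_start_urls(file_lines, start_url_values):
    """Merge URLs from a start URLs file with start_url values.

    Hypothetical helper for illustration only; the real crawler may
    order, validate, or de-duplicate URLs differently.
    """
    combined = []
    seen = set()
    for raw in list(file_lines) + list(start_url_values):
        url = raw.strip()
        if not url:
            continue  # skip blank lines in the file
        scheme = urlsplit(url).scheme.lower()
        if scheme not in ("http", "https"):
            continue  # assumption: entries that are not HTTP/HTTPS are ignored
        if url not in seen:
            seen.add(url)
            combined.append(url)
    return combined


# Hypothetical file content, one URL per line.
file_lines = [
    "https://www.example.com/\n",
    "\n",
    "ftp://files.example.com/archive/\n",
    "http://docs.example.com/start.html\n",
]
# Hypothetical start_url values from the collection configuration.
start_url_values = ["https://www.example.com/", "https://intranet.example.com/"]

start_urls = combine_start_urls(file_lines, start_url_values)
for url in start_urls:
    print(url)
```

In this sketch the ftp entry and the blank line are dropped, and the URL that appears in both sources is kept once.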

Default Value




This file might then contain something like:

⚠ Caveats

Permission to read and edit this key is configured by read.key.start_urls_file and edit.key.start_urls_file. To fully restrict the URLs that will be crawled, you will also need to consider sec.collection-start-urls, read.key.start_url and edit.key.start_url.

