crawler.use_additional_link_extraction
Background
This parameter controls whether the crawler extracts and follows links found via URL features
(i.e., whether the regular expression stored in crawler.additional_link_extraction_pattern
is used for link extraction).
By default, extracting links found via URL features is turned off.
When enabled, this setting causes the crawler to extract URLs embedded in any type of text. Enabling it does not allow the crawler to crawl websites generated from the targeted file(s).
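
As a rough illustration of the behavior described above, the following Python sketch shows how a boolean switch might gate regex-based link extraction from arbitrary text. It is not the crawler's actual implementation: the function name extract_additional_links, the two constants, and the example pattern are all hypothetical stand-ins, and the real value of crawler.additional_link_extraction_pattern may differ.

```python
import re

# Hypothetical stand-ins for the two settings; names and values are assumptions.
USE_ADDITIONAL_LINK_EXTRACTION = False  # mirrors the documented default (off)
ADDITIONAL_LINK_EXTRACTION_PATTERN = r"https?://[^\s\"'<>]+"  # illustrative only


def extract_additional_links(text,
                             enabled=USE_ADDITIONAL_LINK_EXTRACTION,
                             pattern=ADDITIONAL_LINK_EXTRACTION_PATTERN):
    """Return URLs matched by the pattern in text, or [] when disabled."""
    if not enabled:
        return []
    seen = set()
    links = []
    for match in re.finditer(pattern, text):
        # Trim trailing punctuation that commonly follows a URL in prose.
        url = match.group(0).rstrip(".,;:)")
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


sample = 'See https://example.com/a, and "http://example.org/b".'
print(extract_additional_links(sample))                # disabled by default
print(extract_additional_links(sample, enabled=True))  # pattern applied
```

With the switch off, the function returns no links regardless of the text; with it on, every match of the pattern is returned once, in order of appearance.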