crawler.use_additional_link_extraction

Background

This parameter controls whether the web crawler extracts and follows links found via URL features, that is, whether the regular expression stored in crawler.additional_link_extraction_pattern is used for link extraction.

By default, extracting links found via URL features is turned off.

When enabled, this setting causes the crawler to extract URLs embedded in any type of text. Enabling it does not allow the web crawler to crawl websites generated from the targeted file(s).
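
As a rough illustration of what pattern-based link extraction from arbitrary text looks like, the following Python sketch applies a regular expression to a block of text and collects the matching URLs. It is not the crawler's implementation: the pattern, the extract_links function, and the enabled flag are all hypothetical stand-ins for crawler.additional_link_extraction_pattern and crawler.use_additional_link_extraction, and the real pattern may differ.

```python
import re
from typing import List

# Hypothetical pattern, for illustration only. The crawler uses whatever is
# stored in crawler.additional_link_extraction_pattern, which may differ.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_links(text: str, enabled: bool = False) -> List[str]:
    """Return unique URLs found anywhere in text, in order of appearance.

    The enabled flag mimics crawler.use_additional_link_extraction:
    when it is False (the default), no links are extracted.
    """
    if not enabled:
        return []
    seen = set()
    links = []
    for match in URL_PATTERN.finditer(text):
        # Drop trailing punctuation that usually belongs to the sentence.
        url = match.group(0).rstrip(".,;:)")
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


sample = "See https://example.com/a.pdf, then http://example.org/page."
```

With this sketch, extract_links(sample) returns an empty list, and extract_links(sample, enabled=True) returns the two URLs in the sample text.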

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the crawler.use_additional_link_extraction key, and set its value. The value can be any valid Boolean value.

Default value

crawler.use_additional_link_extraction=false

Examples

Turn on additional link extraction:

crawler.use_additional_link_extraction=true