crawler.link_extraction_group
Specifies which capture group of the crawler.link_extraction_regular_expression
option is extracted as the link/URL.
Key: crawler.link_extraction_group
Type: Integer
Can be set in: collection.cfg
Description
If you use crawler.link_extraction_regular_expression to extract URLs from a document,
this option specifies the number of the capture group in that expression whose match is taken as the URL.
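As a rough illustration of what "the group to extract" means, the sketch below uses Python's re module with a made-up pattern. It is not the crawler's default expression or its implementation. The pattern, the sample markup, and the helper name extract_links are all assumptions for the example. It relies only on the usual convention that capture groups are numbered by their opening parentheses, counted left to right.

```python
import re

# Hypothetical pattern (NOT the crawler's default expression).
# Group 1: attribute name, group 2: opening quote, group 3: the URL itself.
LINK_PATTERN = re.compile(r'(href|src)\s*=\s*(["\'])(.*?)\2', re.IGNORECASE)


def extract_links(html, group):
    """Return the text captured by the given group for every match in html."""
    return [match.group(group) for match in LINK_PATTERN.finditer(html)]


# Made-up sample markup for the illustration.
sample = '<a href="https://example.com/a">A</a> <img src=\'/img/logo.png\'>'

# With this pattern the URL sits in group 3, so 3 plays the role that
# crawler.link_extraction_group plays for the configured expression.
links = extract_links(sample, 3)
print(links)
```

With this pattern, choosing group 1 would return the attribute names rather than the URLs, which is why the group number has to match the structure of whatever expression is configured.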
Default Value
crawler.link_extraction_group=5
If no value is specified, 5 is used.
See crawler.link_extraction_regular_expression.