The capture group in the crawler.link_extraction_regular_expression option that is extracted as the link (URL).

Key: crawler.link_extraction_group
Type: Integer
Can be set in: collection.cfg


If you use crawler.link_extraction_regular_expression to extract URLs from a document, this option specifies which capture group of that regular expression contains the URL.
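
To illustrate what selecting a group means, here is a minimal sketch using Python's re module. The pattern, the sample HTML and the extract_links helper are hypothetical and chosen only for illustration. They are not the crawler's default expression or its implementation, and the crawler's own regular expression syntax may differ in details.

```python
import re
from typing import List

# Hypothetical pattern (NOT the crawler's default expression):
#   group 1 captures the attribute name, group 2 captures the URL.
LINK_PATTERN = re.compile(r'(href|src)\s*=\s*"([^"]+)"')


def extract_links(html: str, group: int) -> List[str]:
    """Return the text captured by the chosen group for every match."""
    return [match.group(group) for match in LINK_PATTERN.finditer(html)]


html = '<a href="http://example.com/a">A</a> <img src="/img/logo.png">'

# With this pattern, group 2 holds the URL, so the equivalent setting
# would be crawler.link_extraction_group=2.
links = extract_links(html, 2)
print(links)
```

With this hypothetical pattern, choosing group 1 would return the attribute names rather than the URLs, which is why the group number has to match the structure of the expression in use.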

Default Value


If no value is specified, 5 is used. See crawler.link_extraction_regular_expression.

© 2015- Squiz Pty Ltd