crawler.incremental_logging

Background

This parameter controls whether information about the different URL types seen during an incremental crawl is written to log files. If this setting is enabled, the following log files are created:

  • new_urls.log: All new URLs. A new URL is one that was not stored in the previous crawl.

  • copied_urls.log: All URLs whose content was copied from the previous crawl because it had not changed and so was not downloaded again.

The logs will be located in the log directory in the relevant "view" for the web data source and can be viewed using the log viewer.

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the crawler.incremental_logging key, and set its value to any valid Boolean value.

Default value

crawler.incremental_logging=false

By default this logging is disabled, so the log files listed above are not created.

Examples

Turn on incremental logging:

crawler.incremental_logging=true
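
If you need to check from a script whether the key is enabled, the key=value form shown above can be parsed directly. The following is a minimal sketch and not part of the product: the function name incremental_logging_enabled is hypothetical, and it assumes the configuration is available as plain text with one key=value pair per line, that lines starting with "#" are comments, that the last occurrence of the key wins, and that only the literal value "true" (in any case) enables the setting. Other Boolean spellings the product may accept are not handled.

```python
def incremental_logging_enabled(config_text: str) -> bool:
    """Report whether crawler.incremental_logging is set to true in config_text.

    Hypothetical helper: the parsing rules below are assumptions, not the
    product's own configuration parser.
    """
    enabled = False  # documented default: crawler.incremental_logging=false
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "crawler.incremental_logging":
            # Assumption: only "true" (case-insensitive) counts as enabled,
            # and a later occurrence of the key overrides an earlier one.
            enabled = value.strip().lower() == "true"
    return enabled
```

For example, passing the text "crawler.incremental_logging=true" returns True, while text that does not contain the key returns False, matching the documented default.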