crawler.accept_files

Background

A comma-separated list of file extensions (suffixes) that the crawler will download. It is normally left empty so that the crawler accepts all valid content regardless of suffix.

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the crawler.accept_files key, and set the value. This can be set to any valid List<String> value.

Default value

(Empty) - This means there are no restrictions on which files will be downloaded.

Examples

crawler.accept_files=htm,html,asp,php,txt,stm,jsp,xml,cfm,pdf

This example lists specific file types by suffix; only files with one of these suffixes will be downloaded.
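
The suffix check described above can be illustrated with a short Python sketch. This is not the crawler's implementation: the function names parse_accept_files and is_accepted are hypothetical, and the case-insensitive comparison, the handling of query strings, and the rejection of URLs with no suffix are assumptions made for illustration only.

```python
import posixpath
from urllib.parse import urlparse


def parse_accept_files(value):
    """Split a comma-separated setting into a list of lowercase suffixes."""
    return [s.strip().lower() for s in value.split(",") if s.strip()]


def is_accepted(url, accept_files):
    """Return True if the URL's path suffix is in accept_files.

    An empty list means no restriction, mirroring the default value.
    """
    if not accept_files:
        return True
    # Assumption: only the URL path is inspected, so query strings are ignored.
    path = urlparse(url).path
    # Assumption: suffixes are compared case-insensitively.
    suffix = posixpath.splitext(path)[1].lstrip(".").lower()
    # Assumption: a URL with no suffix does not match a non-empty list.
    return suffix in accept_files


accepted = parse_accept_files("htm,html,asp,php,txt,stm,jsp,xml,cfm,pdf")
print(is_accepted("https://example.com/docs/report.pdf", accepted))
print(is_accepted("https://example.com/images/logo.png", accepted))
print(is_accepted("https://example.com/images/logo.png", []))
```

Under these assumptions the PDF URL is accepted, the PNG URL is rejected when the list is set, and the same PNG URL is accepted when the list is empty.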