filecopy.exclude_pattern

Background

This option specifies a regular expression used to exclude files from being copied. Any file whose name matches the regular expression is neither copied nor indexed. For example, it can be used to prevent a particular subdirectory from being indexed.

Notes:

  • Applies only to file system data sources.

  • The regular expression must match the entire name of the file, not just part of it. Note the .* at the end of the exclude pattern in the 'junk' directory example below.

  • The regular expression must match the internal name that is used in copying. This is not a standard Windows, UNC or Unix path. It is either a 'file' or an 'smb' URL, e.g. file:///c:/documents/file.txt or smb://file-server/dir/file.txt
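
The full-match rule in the notes above can be checked offline before the key is set. The following is a minimal sketch that uses Python's re.fullmatch as a stand-in for the matching done during copying. The helper name is_excluded and the sample URL are illustrative assumptions, and the product's regular expression engine may differ from Python's in edge cases.

```python
import re


def is_excluded(pattern: str, internal_name: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole internal name."""
    return re.fullmatch(pattern, internal_name) is not None


# Internal names are 'file' or 'smb' URLs, not native paths (sample value).
name = "file:///c:/documents/junk/file.txt"

# Without a trailing .* the pattern covers only the directory prefix,
# so it does not match the entire name.
print(is_excluded(r"file:///c:/documents/junk/", name))    # False

# With the trailing .* the pattern covers the entire name.
print(is_excluded(r"file:///c:/documents/junk/.*", name))  # True

# A pattern written against a native Windows path does not match the URL form.
print(is_excluded(r"c:\\documents\\junk\\.*", name))       # False
```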

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the filecopy.exclude_pattern key, and set the value. This can be set to any valid String value.

Default value

By default, Office temporary files (files whose names begin with ~) are excluded:

filecopy.exclude_pattern=^.*[\\/]~[^\\/]+$
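
To see what the default pattern covers, it can be run against a few sample names. This is a sketch only: it assumes the value is handed to the regular expression engine exactly as written, it uses Python's re module rather than the product's engine, and the sample names are made up for illustration.

```python
import re

# The default value, copied verbatim from the setting above.
DEFAULT_PATTERN = r"^.*[\\/]~[^\\/]+$"

# Illustrative internal names (assumptions, not taken from a real data source).
samples = [
    "file:///c:/documents/~$report.docx",  # final segment starts with ~
    "file:///c:/documents/report.docx",    # ordinary document
    "file:///c:/~backup/report.docx",      # ~ appears only on a directory
]

for sample in samples:
    excluded = re.fullmatch(DEFAULT_PATTERN, sample) is not None
    print(sample, "excluded" if excluded else "kept")
```

Under these assumptions only the first name is excluded: the pattern requires the final path segment, after the last / or \, to start with ~.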

Examples

Ignore images:

filecopy.exclude_pattern=.*(png|jpg|gif|bmp)$

Ignore everything in the 'junk' directory:

filecopy.exclude_pattern=file:///myfilecopysource/data/junk/.*
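
The two example patterns can be tried against sample names before use. The sketch below again relies on Python's re.fullmatch as an approximation of the product's matching. The sample names, the helper excluded_by and the stricter IMAGES_STRICT variant are assumptions added for illustration and are not part of the product.

```python
import re

IMAGES = r".*(png|jpg|gif|bmp)$"
JUNK = r"file:///myfilecopysource/data/junk/.*"
# Hypothetical stricter variant: requires a literal dot before the extension.
IMAGES_STRICT = r".*\.(png|jpg|gif|bmp)$"

# Illustrative internal names.
samples = [
    "file:///myfilecopysource/data/photos/logo.png",
    "file:///myfilecopysource/data/junk/readme.txt",
    "file:///myfilecopysource/data/reports/summary.doc",
    "file:///myfilecopysource/data/backup_jpg",
]


def excluded_by(pattern, names):
    """Return the names that the pattern matches in full."""
    return [n for n in names if re.fullmatch(pattern, n)]


print(excluded_by(IMAGES, samples))         # logo.png and backup_jpg
print(excluded_by(JUNK, samples))           # readme.txt only
print(excluded_by(IMAGES_STRICT, samples))  # logo.png only
```

As written, the image pattern matches any name that ends in one of the listed letter sequences, with or without a preceding dot. If that is too broad for a given data source, a variant such as IMAGES_STRICT narrows it to real extensions.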