filter.text-cleanup.ranges-to-replace

Background

This parameter lists the Unicode blocks whose characters are removed by the TextCleanupFilterProvider when that filter is included in the filter.classes configuration setting.

The value is a comma-separated list in which each entry is one of the defined Unicode block names. Block names are case-insensitive, so PLAYING_CARDS and playing_cards are equivalent.
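As a rough illustration of the format described above, the following Python sketch splits a value into block names and lower-cases them so that differently cased spellings compare equal. The function name parse_block_names is hypothetical, and trimming whitespace around entries is an assumption made for the sketch; it is not the filter's actual implementation.

```python
def parse_block_names(value: str) -> list[str]:
    """Split a comma-separated setting value into normalised block names."""
    names = []
    for raw in value.split(","):
        # Assumption: surrounding whitespace is ignored and empty entries are skipped.
        name = raw.strip().lower()
        if name:
            names.append(name)
    return names


# Case differences disappear after normalisation.
print(parse_block_names("PLAYING_CARDS,private_use_area"))
# → ['playing_cards', 'private_use_area']
```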

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the filter.text-cleanup.ranges-to-replace key, and set the value. This can be set to any valid List<String> value.

Default value

filter.text-cleanup.ranges-to-replace=private_use_area

Examples

If both private use and playing card characters occur in documents but should not appear in search results or cached copies:

filter.text-cleanup.ranges-to-replace=private_use_area,playing_cards
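
The effect of the setting above can be approximated with a short Python sketch. It assumes matching characters are simply dropped (the replacement parameter is hypothetical), and the BLOCK_RANGES table is a hand-written stand-in that covers only the two blocks in this example, using their Unicode code point ranges (Private Use Area U+E000 to U+F8FF, Playing Cards U+1F0A0 to U+1F0FF). It is not the TextCleanupFilterProvider code.

```python
# Hypothetical lookup table covering only the two blocks used in this example.
BLOCK_RANGES = {
    "private_use_area": (0xE000, 0xF8FF),
    "playing_cards": (0x1F0A0, 0x1F0FF),
}


def clean_text(text: str, setting_value: str, replacement: str = "") -> str:
    """Replace every character that falls in one of the configured blocks."""
    # Unknown block names raise KeyError in this sketch.
    ranges = [
        BLOCK_RANGES[name.strip().lower()]
        for name in setting_value.split(",")
        if name.strip()
    ]
    cleaned = []
    for ch in text:
        code_point = ord(ch)
        if any(low <= code_point <= high for low, high in ranges):
            cleaned.append(replacement)
        else:
            cleaned.append(ch)
    return "".join(cleaned)


# U+1F0A1 is a playing card character; U+E000 is a private use character.
sample = "Ace \U0001F0A1 of spades \uE000 logo"
print(clean_text(sample, "private_use_area,playing_cards"))
```

Listing only private_use_area in the second argument would leave the playing card character in place, which mirrors the default value shown earlier.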