Built-in HTML document filters - Detect undesirable text

The undesirable text filter analyzes HTML documents for the occurrence of words from various word lists.

This filter is required by the content auditor to produce the undesirable text report.

Enabling

The undesirable text filter is enabled by default. It runs whenever UndesirableText is included in the list of Jsoup filter classes:

filter.jsoup.classes=<jsoup_filters>,UndesirableText,<jsoup_filters>

Configuration

filter.jsoup.undesirable_text.[key_name]=[word]

The following examples register two word list sources, each under its own key name:

filter.jsoup.undesirable_text-source.default-misspellings=$SEARCH_HOME/conf/common-misspellings.txt.default

filter.jsoup.undesirable_text-source.weasel-words=undesirable-text.weasel-words.cfg
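
As a rough sketch of how an additional word list might be registered, the fragment below follows the same pattern as the two source keys above. The key name brand-terms and the file name undesirable-text.brand-terms.cfg are hypothetical and chosen only for illustration. The expected format of the word list file, and the location a relative file name is resolved against, are not described on this page and should be confirmed before use.

```properties
# Hypothetical additional word list source.
# Both the key name (brand-terms) and the file name are assumptions for illustration.
filter.jsoup.undesirable_text-source.brand-terms=undesirable-text.brand-terms.cfg
```

Giving each source its own key name, as the existing default-misspellings and weasel-words entries do, keeps the lists separate in the configuration.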

See also