Plugin: Clean title
Use this plugin to remove sections of text from your search result titles.
Search result titles often include text, such as a common prefix or suffix that contains the site name. This clutters the search results display and can make it difficult for a user to quickly scan the search results.
This plugin allows you to clean up the title by removing parts of the text. This is achieved by defining a series of regular expressions that identify the parts of the title to remove.
This plugin supports two types of usage:
- Cleaning the title in HTML source data
This method fixes the title before it has been indexed. This means the title will be changed for all features within Funnelback that refer to the title. This method provides more consistent behavior, but requires a full update of the data source before it takes effect. Also use this method if you need to sort your search results alphabetically.
When enabled on a data source, the plugin can be used to clean the contents of the HTML <title> element. This only applies to HTML documents. The advantage of modifying the source data is that the title included in the index will contain the modification, meaning that sorting will function correctly if the search result title is based on the HTML <title> element.
To clean the title within the source data, follow the steps for enabling the plugin on a data source, below.
- Cleaning the title returned in the search results listing
This method fixes the title after it has been indexed, and only applies the change to the JSON data returned when making a query. Use this method if you need to quickly apply a change and don’t need to sort your results alphabetically. The changes will only affect the titles that are printed in the search results listing.
When enabled on a results page, the plugin can be used to modify the value of the result.title data model element. Use the plugin on a results page if you just need to update the search result titles (regardless of the underlying data source type).
If you modify result.title using this method then sorting by title may be incorrect, as the renaming occurs after the result set has been sorted. Sorting will be incorrect if you modify the start of any titles and the regex pattern does not match all search result titles.
To clean the title returned in the results listing, follow the steps for enabling the plugin on a results page, below.
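The sorting caveat for results pages can be illustrated with a short Python sketch. This is not the plugin's implementation: the titles and the prefix pattern are made up for illustration, and the sketch only mimics "sort first, rename afterwards".

```python
import re

# Hypothetical removal pattern and titles, chosen for illustration only.
PREFIX = r"^ExampleOrg -\s+"
indexed_titles = [
    "News",
    "ExampleOrg - Zebra facts",
    "Contact us",
    "ExampleOrg - About",
]

# Sorting happens on the titles as indexed (prefix still present).
sorted_titles = sorted(indexed_titles)

# The titles are cleaned afterwards, when the results are returned.
displayed = [re.sub(PREFIX, "", title) for title in sorted_titles]

print(displayed)
# The displayed titles are no longer in alphabetical order, because the
# pattern changed the start of some titles but not all of them.
print(displayed == sorted(displayed))
```

Cleaning the title in the source data avoids this, because the cleaned title is what gets indexed and sorted.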
Select Plugins from the side navigation pane and click on the Clean title tile.
From the Location section, decide if you wish to enable this plugin on a data source or a results page and select the corresponding radio button.
Select the data source or results page on which you would like to enable this plugin from the drop-down menu.
|If enabled on a data source, the plugin will take effect as soon as the setup steps are completed, and an advanced > full update of the data source has completed. If enabled on a results page the plugin will take effect as soon as the setup steps are completed.|
The configuration settings section is where you do most of the configuration for your plugin. The settings enable you to control how the plugin behaves.
|The configuration key names below are only used if you are configuring this plugin manually. The configuration keys are set in the data source or results page configuration to configure the plugin. When setting the keys manually you need to type in (or copy and paste) the key name and value.|
The removal pattern (plugin.clean-title.config.regex) option affects the titles returned within the data model's result.title element.
This option can be defined multiple times by assigning a different identifier in the Parameter 1 field when configuring the setting. If multiple regex keys are defined then they will be executed in sequence with the order determined by the Parameter 1 value when sorted alphabetically.
For example, if you have three removal patterns defined with Parameter 1 IDs of orange, apple and pear, then the patterns will be applied in the following order: apple, orange, pear.
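The alphabetical ordering rule can be sketched in Python. The patterns attached to each identifier are hypothetical, and Python's default string sort stands in for the plugin's alphabetical sort, which may differ in details such as case handling.

```python
import re

# Hypothetical removal patterns keyed by their Parameter 1 identifier.
patterns = {
    "orange": r"\s+\|\s+Orange$",
    "apple": r"^Apple:\s+",
    "pear": r"\s+\(pear\)$",
}


def application_order(patterns):
    """Return the identifiers sorted alphabetically."""
    return sorted(patterns)


def clean_title(title, patterns):
    """Apply each removal pattern in identifier order."""
    for identifier in application_order(patterns):
        title = re.sub(patterns[identifier], "", title)
    return title


print(application_order(patterns))  # ['apple', 'orange', 'pear']
print(clean_title("Apple: Page title (pear) | Orange", patterns))  # Page title
```

In this sketch the order matters: the pear pattern is anchored to the end of the title, so it only matches once the orange pattern has already removed the trailing text.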
This plugin uses filters, which apply transformations to the gathered content.
The filters run in sequence and need to be set in an order that makes sense. The plugin-supplied filter(s) (as indicated in the listing) should be re-ordered to an appropriate point in the sequence.
|Changes to the filter order affect the way the data source processes gathered documents. See: document filters documentation.|
This example applies for both data sources and results pages, as outlined above.
Suppose you have titles like:
ExampleOrg - Page title (www.example.com)
Many pages have titles that are prefixed with ExampleOrg - and contain a suffix of (www.example.com).
You would like Page title to be displayed as the hyperlinked title in your search results.
This could be achieved by setting the following configuration keys:
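|Configuration key name||Parameter 1||Value|
|plugin.clean-title.config.regex||generic-prefix||^ExampleOrg -\s+|
|plugin.clean-title.config.regex||generic-suffix||\s+\(www\.example\.com\)$|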
This runs each of the regexes on the title (result.title or the <title> element, depending on where the plugin is enabled), so both the prefix and the suffix are removed.
The generic-prefix and generic-suffix identifiers could have been named anything, but remember that the names used define the order in which the patterns are applied.
When viewing the raw configuration, these keys will appear as:
plugin.clean-title.config.regex.generic-prefix=^ExampleOrg -\s+
plugin.clean-title.config.regex.generic-suffix=\s+\(www\.example\.com\)$
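To check what these two patterns do before configuring them, they can be tried out in a short Python sketch. The dictionary and the clean_title function are illustrative stand-ins, not part of the plugin, and Python's regex engine is assumed to treat these particular patterns the same way the plugin does.

```python
import re

# Patterns copied from the example configuration above. The dictionary is an
# illustrative stand-in for the plugin configuration, not the plugin's API.
removal_patterns = {
    "generic-prefix": r"^ExampleOrg -\s+",
    "generic-suffix": r"\s+\(www\.example\.com\)$",
}


def clean_title(title: str) -> str:
    """Remove every configured pattern, in alphabetical identifier order."""
    for identifier in sorted(removal_patterns):
        title = re.sub(removal_patterns[identifier], "", title)
    return title


print(clean_title("ExampleOrg - Page title (www.example.com)"))  # Page title
```

Titles that match neither pattern pass through unchanged.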