Upgrading custom groovy filters
Custom groovy filters allowed search developers to implement custom groovy code that manipulated downloaded data prior to indexing.
Custom groovy code is not permitted in the DXP because it is a security risk and it prevents automatic upgrades of the search.
This guide outlines the process to follow when upgrading groovy filters. It covers both document filters and HTML document (Jsoup) filters.
High level process
The key to successfully upgrading custom groovy filters is to break down the distinct tasks that each filter performs. When doing this you often need to look at the configuration holistically, because a filter often depends on other collection configuration and on other filters that run earlier in the filter chain.
Each identified task should be a discrete operation that can then be replaced with other product functionality.
Replacing custom groovy filters
Once you have broken down the functionality implemented within a filter into a set of discrete tasks, you then need to figure out how this can be done in the DXP without any custom coding.
The high-level options to replace groovy filters are:
- Existing plugins: these implement commonly occurring tasks that were previously implemented as custom filters, e.g. cleaning result titles or modifying the search URL. Become familiar with the plugins that are available and the functions they perform.
- Other built-in functionality: built-in filters such as the metadata scraper and metadata normalizer often duplicate functionality implemented in custom filters.
- Moving the custom processing out of Funnelback: investigate the feasibility of updating the source data, or look at other DXP services that can process the downloaded data, such as the DXP integrations service or Job Runner in conjunction with the DXP file storage service.
A suitable replacement often involves using several plugins and built-in filters that are chained together in an appropriate order.
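Why the order of a chain matters can be illustrated with a minimal Python sketch. It does not use any Funnelback or DXP API: `json_to_xml` and `normalize_xml_field` are hypothetical stand-ins for a JSON-to-XML conversion step and an XML-only normalization step, and the `record` root element and the sample data are assumptions made purely for illustration.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(text: str) -> str:
    """Hypothetical stand-in for a JSON-to-XML conversion step."""
    record = json.loads(text)
    root = ET.Element("record")  # assumed root element name
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def normalize_xml_field(text: str, field: str = "category") -> str:
    """Hypothetical stand-in for a normalization step that only accepts XML."""
    root = ET.fromstring(text)  # raises ET.ParseError when the input is not XML
    for element in root.iter(field):
        if element.text:
            element.text = element.text.strip().lower()
    return ET.tostring(root, encoding="unicode")


def run_chain(text: str, steps) -> str:
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        text = step(text)
    return text


source = '{"title": "Annual report", "category": "  FINANCE "}'

# Correct order: convert to XML first, then normalize the XML field.
good = run_chain(source, [json_to_xml, normalize_xml_field])

# Wrong order: the XML-only step receives JSON and cannot parse it.
try:
    run_chain(source, [normalize_xml_field, json_to_xml])
    wrong_order_failed = False
except ET.ParseError:
    wrong_order_failed = True
```

In this sketch the reversed chain fails because the normalization stand-in is handed JSON it cannot parse. The same reasoning applies when ordering real plugins and built-in filters: any step that expects a particular format must run after the step that produces it.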
Common patterns and their replacements
The table below outlines some of the more commonly occurring custom filter functions and their replacements.
Current function | Replacement |
---|---|
Modifications to the result title | Other custom filters that clean titles should also be replaced with the clean title plugin, running on a data source. |
Modifies the structure or content of XML or JSON data | For XML, use the transform XML plugin, which applies XSLT to transform the XML. For JSON, use the modify JSON data plugin, which uses JSONata to perform transformations of JSON data. Alternatively use the convert JSON to XML built-in filter then apply the transform XML plugin outlined above. |
Processes and converts CSV data to XML | Use the convert CSV to XML built-in filter then apply the transform XML plugin, or one of the metadata conversion filters or plugins outlined below. |
Alters metadata/XML/JSON field values | Use the metadata-normalizer built-in filter. JSON must be converted to XML using the convert JSON to XML built-in filter before the metadata normalizer can be applied to JSON data. |
Extracts some document content and adds it as metadata | Use the metadata-scraper built-in filter. |
Clones or combines metadata into a new metadata field | Use the combine or clone metadata fields plugin. |
Adds some additional metadata based on matches to the URL | Use the add metadata to URL plugin. |
Splits HTML or XML documents | Use the split HTML or XML plugin. |
Deletes items older than a certain date | Use the date filter plugin. |
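When auditing a large filter, it can help to record the identified tasks and look each one up against the table. The following is a minimal planning sketch in Python, not part of any product API: the task labels used as dictionary keys and the `plan_replacements` helper are hypothetical, and the replacement names are copied from the table above.

```python
# Hypothetical task labels mapped to the replacements named in the table.
REPLACEMENTS = {
    "clean titles": "clean title plugin",
    "transform xml": "transform XML plugin",
    "transform json": "modify JSON data plugin",
    "csv to xml": "convert CSV to XML built-in filter",
    "alter field values": "metadata-normalizer built-in filter",
    "extract content as metadata": "metadata-scraper built-in filter",
    "clone or combine metadata": "combine or clone metadata fields plugin",
    "add metadata by url": "add metadata to URL plugin",
    "split documents": "split HTML or XML plugin",
    "delete old items": "date filter plugin",
}


def plan_replacements(tasks):
    """Split tasks into those with a known replacement and those without."""
    planned, unresolved = [], []
    for task in tasks:
        replacement = REPLACEMENTS.get(task)
        if replacement is None:
            unresolved.append(task)  # needs further investigation
        else:
            planned.append((task, replacement))
    return planned, unresolved


# Example: tasks identified in a single (imaginary) custom filter.
planned, unresolved = plan_replacements(
    ["clean titles", "csv to xml", "call external api"]
)
```

Tasks that end up in `unresolved` are candidates for moving the processing out of Funnelback, as described in the list of high-level options.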