Plugin: Modify JSON data

Purpose

This plugin transforms and modifies JSON data before it is indexed.

It uses the JSONata query and transformation language to transform the JSON data and to extract additional metadata from the JSON documents.

When to use this plugin

Use this plugin when downloaded JSON data must be changed before it is indexed, so that the changes are reflected in the search index.

Usage

  1. Enable the modify-json-data plugin on your data source from the Extensions screen in the administration dashboard, or add the following data source configuration to enable it:

  plugin.modify-json-data.enabled=true
  plugin.modify-json-data.version=1.0.0
  2. Add the modify-json-data filter to the filter chain.

  filter.classes=<OTHER-FILTERS>:com.funnelback.plugin.modifyjsondata.JsonataStringFilter:<OTHER-FILTERS>

  Place the modify-json-data filter at an appropriate position in the filter chain, which applies the filters from left to right. In most circumstances it should be located towards the end of the filter chain, and it must be placed before the JSONToXML filter.

  3. Configure the plugin (see the plugin configuration settings section below).

  4. Run a full update of the data source. Note: a full update is required because all of your documents must be re-gathered and filtered for any changes to take effect. If you are using this with a push data source, resubmit every document that the new filter should be applied to.
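
The ordering rule in the steps above can be checked mechanically. The following Python sketch is illustrative only and is not part of Funnelback or the plugin: it assumes the filter.classes value is a colon-separated list applied left to right, and that the JSONToXML filter appears in the chain under the literal name JSONToXML. The function name is hypothetical.

```python
# Illustrative check only; not part of Funnelback or the plugin.
JSONATA_FILTER = "com.funnelback.plugin.modifyjsondata.JsonataStringFilter"
JSON_TO_XML = "JSONToXML"  # assumed name of the JSONToXML filter in the chain


def plugin_runs_before_json_to_xml(filter_classes: str) -> bool:
    """Return True if the JSONata filter is present and precedes JSONToXML.

    Assumes filters are separated by ':' and applied from left to right.
    """
    filters = [name.strip() for name in filter_classes.split(":") if name.strip()]
    if JSONATA_FILTER not in filters:
        return False
    if JSON_TO_XML not in filters:
        # No JSONToXML filter in the chain, so there is no ordering to violate.
        return True
    return filters.index(JSONATA_FILTER) < filters.index(JSON_TO_XML)


good = f"{JSONATA_FILTER}:{JSON_TO_XML}"
bad = f"{JSON_TO_XML}:{JSONATA_FILTER}"
print(plugin_runs_before_json_to_xml(good))  # True
print(plugin_runs_before_json_to_xml(bad))   # False
```

A check like this could be run against the configured filter.classes value before starting the full update.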

Plugin configuration settings

The following plugin configuration settings can be used to configure the plugin:

  • plugin.modify-json-data.config.metadata.[class_name]: (JSONata expression) Evaluates the JSONata expression and stores the result in the class_name metadata class.

  • plugin.modify-json-data.config.transform: (JSONata expression) Evaluates the JSONata expression to transform the JSON input.

  • plugin.modify-json-data.config.max_evaluation_time: (Integer) Maximum time, in milliseconds, that an expression may evaluate for before it is cancelled. Default is 5000 ms.

  • plugin.modify-json-data.config.max_evaluation_stack_depth: (Integer) Maximum number of stack frames allowed during execution (only relevant for recursive functions). Default is 64.

Any number of plugin.modify-json-data.config.metadata.[class_name] parameters can be provided. Each expression is evaluated separately and its result is stored in the metadata class named by class_name.
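
To make the key naming pattern concrete, the following Python sketch groups a few configuration lines by setting. It is a minimal illustration under stated assumptions, not how the plugin reads its configuration: the function name and the profileUrl metadata class are hypothetical, and only the key names and default values come from the list above. Each line is split at the first "=" only, because a JSONata expression may itself contain "=".

```python
PREFIX = "plugin.modify-json-data.config."
METADATA_PREFIX = PREFIX + "metadata."


def read_plugin_settings(lines):
    """Group plugin settings from key=value lines (hypothetical helper)."""
    settings = {
        "metadata": {},                     # class_name -> JSONata expression
        "transform": None,                  # JSONata transform expression, if any
        "max_evaluation_time": 5000,        # documented default (ms)
        "max_evaluation_stack_depth": 64,   # documented default
    }
    for line in lines:
        # Split at the first '=' only: expressions may contain '=' themselves.
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key.startswith(PREFIX):
            continue
        if key.startswith(METADATA_PREFIX):
            settings["metadata"][key[len(METADATA_PREFIX):]] = value
            continue
        name = key[len(PREFIX):]
        if name in ("max_evaluation_time", "max_evaluation_stack_depth"):
            settings[name] = int(value)
        elif name == "transform":
            settings[name] = value
    return settings


example = [
    "plugin.modify-json-data.config.metadata.fullname=firstname & ' ' & lastname",
    "plugin.modify-json-data.config.metadata.profileUrl=url",  # hypothetical class
    "plugin.modify-json-data.config.max_evaluation_time=2000",
]
settings = read_plugin_settings(example)
print(sorted(settings["metadata"]))  # ['fullname', 'profileUrl']
```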

Examples

Query - Create a "fullname" field to use as a record title

A JSON data feed containing directory or people information may contain fields for each person's first name and last name, but not the full name. The search results output should use the full name as the record title, both for ranking relevance and so that partial queries for autocompletion produce the expected results.

The configuration shown below adds a new metadata field, fullname, that is a concatenation of the firstname and lastname fields in the JSON record. Given the following input data (JSON object):

{
    "firstname": "Jane",
    "lastname": "Smith",
    "url": "https://www.example.com/1"
}

and the following plugin configuration:

plugin.modify-json-data.config.metadata.fullname=firstname & ' ' & lastname

firstname & ' ' & lastname is a JSONata expression that concatenates the firstname and lastname fields with a space character in between. This creates the fullname field without the need to write a specialized plugin.
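
As a rough cross-check of the expected value, the same concatenation can be written in plain Python. This is only an equivalent of the expression for the sample record, not how the plugin evaluates JSONata; the function name is hypothetical.

```python
def fullname(record: dict) -> str:
    """Plain-Python equivalent of the expression: firstname & ' ' & lastname."""
    # JSONata's & operator treats a missing field as an empty string,
    # which .get(..., "") mirrors here.
    first = record.get("firstname", "")
    last = record.get("lastname", "")
    return f"{first} {last}"


record = {
    "firstname": "Jane",
    "lastname": "Smith",
    "url": "https://www.example.com/1",
}
print(fullname(record))  # Jane Smith
```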

Transform - Remove records based on a criterion

A common need for a JSON data feed is to remove records that match a given criterion, for example, to remove events that have already occurred or news articles in "draft" status that have not been published yet.

Given the following input data (JSON array):

[
  {
    "articleTitle": "...",
    "url": "https://www.example.com/1",
    "status": "published"
  },
  {
    "articleTitle": "...",
    "url": "https://www.example.com/2",
    "status": "draft"
  }
]

and the following plugin configuration:

plugin.modify-json-data.config.transform=$filter($, function($article) { $article.status = "published"})

The result of transforming the JSON with this expression is that only records with a "status" field of "published" will be added to the search index.
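
The selection this expression makes can be mirrored in plain Python to see which records survive. The sketch below is an illustration of the filtering logic only, with a hypothetical function name; it always returns a list, and it makes no claim about how the plugin represents the result when a single record remains.

```python
articles = [
    {"articleTitle": "...", "url": "https://www.example.com/1", "status": "published"},
    {"articleTitle": "...", "url": "https://www.example.com/2", "status": "draft"},
]


def keep_published(records):
    """Keep only records whose status field equals "published"."""
    return [r for r in records if r.get("status") == "published"]


published = keep_published(articles)
print([r["url"] for r in published])  # ['https://www.example.com/1']
```

Records without a status field are dropped as well, because a missing value is not equal to "published".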

Transform - Remove null values from a JSON object

Null values in an input JSON object can be problematic: when Funnelback converts the JSON data into XML using the JSONToXML filter, each null JSON value is converted into the string "null", which then shows in facets and in the search results output.

The following transform expression can be used to remove all null values from a JSON object.

plugin.modify-json-data.config.transform=$sift($, function($value) { $value != null })
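
The effect of this expression can be sketched in plain Python. The record and function name below are hypothetical, and the code is an equivalent of the sifting logic rather than the plugin's implementation. JSONata's $sift tests the key/value pairs of the object it is given, so in this sketch only top-level nulls are removed and a null inside a nested object is left in place.

```python
def remove_nulls(obj: dict) -> dict:
    """Drop top-level keys whose value is null (None in Python)."""
    return {key: value for key, value in obj.items() if value is not None}


# Hypothetical record with one top-level null and one nested null.
record = {
    "title": "Example",
    "subtitle": None,
    "author": {"name": "Jane", "email": None},
}
cleaned = remove_nulls(record)
print(sorted(cleaned))  # ['author', 'title']
```

If nested nulls also need to be removed, the expression would have to descend into child objects itself.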