Plugin: Transform date field

Purpose

Use this plugin to transform the value of date fields.

When to use this plugin

This plugin can be used to convert dates in your data, allowing you to:

  • transform ambiguous date formats into ISO format so that dates are correctly indexed by Funnelback, e.g. US format dates like 2024-02-03 (2 March 2024), where you know what the intended format is.

  • create date values for use with faceted navigation, when the existing date facet groupings are not appropriate.

  • convert non-English dates into ISO format for the indexer.

  • create non-English formatted dates for presentation.

This plugin currently only supports the modification of dates that are within metadata or XML fields within the source document. It does not currently support modification of metadata generated by other filters, or of metadata fields at display time.

Usage

Enable the plugin

  1. Select Plugins from the side navigation pane and click on the Transform date field tile.

  2. From the Location section, use the Select a data source select list to choose the data source on which you would like to enable this plugin.

The plugin will take effect after the setup steps and an advanced > full update of the data source have completed.

Configuration settings

The configuration settings section is where you do most of the configuration for your plugin. The settings enable you to control how the plugin behaves.

The configuration key names below are only used if you are configuring this plugin manually. The configuration keys are set in the data source configuration to configure the plugin. When setting the keys manually you need to type in (or copy and paste) the key name and value.

Date selector

Configuration key

plugin.transform-date-field.config.*.date_selector

Data type

string

Required

This setting is required

Defines an element containing a date that should be transformed. Parameter 1 should be set to a unique ID that will group together the different configuration items that make up a rule.

Date element type

Configuration key

plugin.transform-date-field.config.*.date_element_type

Data type

string

Allowed values

FIELD CONTENT, ATTRIBUTE VALUE

Required

This setting is required

Defines if the date value is sourced from the element content, or as the value of an attribute. Parameter 1 should be set to the Parameter 1 value of the matching date selector.

Date element attribute value

Configuration key

plugin.transform-date-field.config.*.date_element_attribute_name

Data type

string

Default value

(empty)

Required

This setting is optional

Defines the name of the element’s attribute containing the date, if the date element type is set to ATTRIBUTE VALUE. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
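The difference between the two date element types can be sketched in Java with the standard XPath API. This is only an illustration, not the plugin's implementation: the class DateSourceSketch, the method rawDate, the made-up XML snippet, and the use of a null attribute name to stand for FIELD CONTENT are all assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DateSourceSketch {

    // Hypothetical helper (not plugin code): returns the raw date string of the
    // first element matched by the XPath selector, or null when nothing matches.
    // attributeName == null stands in for FIELD CONTENT; otherwise ATTRIBUTE VALUE.
    static String rawDate(String xml, String selector, String attributeName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(selector, doc, XPathConstants.NODESET);
            if (matches.getLength() == 0) {
                return null;
            }
            Element element = (Element) matches.item(0);
            return attributeName == null
                    ? element.getTextContent()
                    : element.getAttribute(attributeName);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate selector: " + selector, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book updatedAt=\"2024-01-02T14:00Z\"><pubDate>2 Novembre 2018</pubDate></book>";
        System.out.println(rawDate(xml, "//book/pubDate", null)); // date taken from element content
        System.out.println(rawDate(xml, "//book", "updatedAt"));  // date taken from an attribute
    }
}
```

In this sketch the selector always identifies an element; the attribute name only decides which part of that element supplies the date string.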

Is date element multiple values?

Configuration key

plugin.transform-date-field.config.*.date_element_multiple_values

Data type

boolean

Default value

false

Required

This setting is optional

Defines if the date element contains multiple date values. Parameter 1 should be set to the Parameter 1 value of the matching date selector.

Date element multiple values separator

Configuration key

plugin.transform-date-field.config.*.date_element_multiple_values_separator

Data type

string

Default value

|

Required

This setting is optional

Defines the separator used to extract multiple date values from the element content, if Is date element multiple values? is true. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
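A minimal Java sketch of how a multi-valued field could be handled: split on the separator, convert each value, then join the results with the same separator. The class MultiValueSketch and the method transformAll are hypothetical names, and the fixed US English locale and UTC time zone are assumptions chosen to keep the output reproducible; the plugin's own implementation may differ.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

final class MultiValueSketch {

    // Hypothetical helper (not plugin code): converts every separated date and
    // joins the results with the same separator.
    static String transformAll(String content, String separator,
                               String inputFormat, String outputFormat) {
        // Assumptions: US English locale and UTC, so the result is reproducible.
        SimpleDateFormat parser = new SimpleDateFormat(inputFormat, Locale.US);
        SimpleDateFormat formatter = new SimpleDateFormat(outputFormat, Locale.US);
        parser.setTimeZone(TimeZone.getTimeZone("UTC"));
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));

        List<String> converted = new ArrayList<>();
        // Pattern.quote makes the separator literal; a bare "|" is a regex alternation.
        for (String part : content.split(Pattern.quote(separator), -1)) {
            try {
                converted.add(formatter.format(parser.parse(part.trim())));
            } catch (ParseException e) {
                converted.add(part); // assumption: unparseable values are kept as-is
            }
        }
        return String.join(separator, converted);
    }

    public static void main(String[] args) {
        // prints "Jan 2024|Feb 2024"
        System.out.println(transformAll("2024-01-01T00:00:00Z|2024-02-01T00:00:00Z", "|",
                "yyyy-MM-dd'T'HH:mm:ssXXX", "MMM yyyy"));
    }
}
```

Quoting the separator matters in this sketch because the default separator, |, has a special meaning in Java regular expressions.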

Date field input format

Configuration key

plugin.transform-date-field.config.*.date_input_format

Data type

string

Required

This setting is required

Defines the date format of the input data, specified as a Java SimpleDateFormat string. Parameter 1 should be set to the Parameter 1 value of the matching date selector. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/text/SimpleDateFormat.html

Date field input language

Configuration key

plugin.transform-date-field.config.*.date_input_language

Data type

string

Default value

en-us

Required

This setting is optional

Defines the language of the input data to enable correct parsing of things like months specified in different languages. Parameter 1 should be set to the Parameter 1 value of the matching date selector. The value should be a valid Java Locale. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Locale.html

Date field output format

Configuration key

plugin.transform-date-field.config.*.date_output_format

Data type

string

Required

This setting is required

Defines the date format of the output data, specified as a Java SimpleDateFormat string. Parameter 1 should be set to the Parameter 1 value of the matching date selector. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/text/SimpleDateFormat.html

Date field output language

Configuration key

plugin.transform-date-field.config.*.date_output_language

Data type

string

Default value

en-us

Required

This setting is optional

Defines the language of the output data to ensure correct formatting of things like months specified in different languages. Parameter 1 should be set to the Parameter 1 value of the matching date selector. The value should be a valid Java Locale. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Locale.html

If a specified element cannot be parsed then it will remain unmodified.
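The way the input and output format and language settings work together can be sketched in plain Java. This is an illustration under stated assumptions, not the plugin's code: the class DateTransformSketch and the method transform are hypothetical, the language values are assumed to be read as BCP 47 language tags, and both formatters are pinned to UTC so that the result does not depend on the host time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

final class DateTransformSketch {

    // Hypothetical helper (not plugin code): parses a value with the input
    // format and language, then writes it with the output format and language.
    static String transform(String value, String inputFormat, String inputLanguage,
                            String outputFormat, String outputLanguage) {
        // Assumption: language settings such as "fr" or "en-us" are read as BCP 47 tags.
        SimpleDateFormat parser =
                new SimpleDateFormat(inputFormat, Locale.forLanguageTag(inputLanguage));
        SimpleDateFormat formatter =
                new SimpleDateFormat(outputFormat, Locale.forLanguageTag(outputLanguage));
        // Assumption: UTC on both sides so the result does not depend on the host time zone.
        parser.setTimeZone(TimeZone.getTimeZone("UTC"));
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return formatter.format(parser.parse(value.trim()));
        } catch (ParseException e) {
            return value; // unparseable values are left unmodified
        }
    }

    public static void main(String[] args) {
        // prints "Jan 2024"
        System.out.println(transform("1 Janvier 2024", "d MMMM yyyy", "fr", "MMM yyyy", "en-us"));
        // prints "not a date"
        System.out.println(transform("not a date", "d MMMM yyyy", "fr", "MMM yyyy", "en-us"));
    }
}
```

The same helper covers the ambiguous-format case: calling transform("2024-02-03", "yyyy-dd-MM", "en-us", "yyyy-MM-dd", "en-us") reads the value as 2 March 2024 and returns 2024-03-02.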

Filter chain configuration

This plugin uses filters, which apply transformations to the gathered content.

The filters run in sequence and need to be set in an order that makes sense. The plugin-supplied filter(s) (as indicated in the listing) should be re-ordered to an appropriate point in the sequence.

Changes to the filter order affect the way the data source processes gathered documents. See: document filters documentation.

Filter classes

This plugin supplies a filter that runs in the main document filter chain: com.funnelback.plugin.TransformDateField.TransformDateFieldStringFilter

Drag the com.funnelback.plugin.TransformDateField.TransformDateFieldStringFilter plugin filter to where you wish it to run in the filter chain sequence.

Examples

XML example

Consider the following XML document as input:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book updatedAt="2024-10-20T13:00+10:00">
        <id>1</id>
        <pubDate>1 Janvier 2024</pubDate>
    </book>
    <book updatedAt="2024-01-02T14:00Z">
        <id>2</id>
        <pubDate>2 Novembre 2018</pubDate>
        <reprint>1 Janvier 2019;2 Novembre 2020</reprint>
    </book>
</books>

with the following settings:

Configuration key name                  Parameter 1  Value

Date selector                           publishDate  //book/pubDate
Date element type                       publishDate  FIELD CONTENT
Is date element multiple values?        publishDate  false
Date field input format                 publishDate  d MMMM yyyy
Date field input language               publishDate  fr
Date field output format                publishDate  MMM yyyy
Date field output language              publishDate  en-au

Date selector                           updated      //book
Date element type                       updated      ATTRIBUTE VALUE
Is date element multiple values?        updated      false
Date element attribute value            updated      updatedAt
Date field input format                 updated      yyyy-MM-dd'T'HH:mmXXX
Date field output format                updated      yyyy

Date selector                           reprint      //book/reprint
Date element type                       reprint      FIELD CONTENT
Is date element multiple values?        reprint      true
Date element multiple values separator  reprint      ;
Date field input format                 reprint      d MMMM yyyy
Date field input language               reprint      fr
Date field output format                reprint      MMM yyyy
Date field output language              reprint      en-au

The transform will output the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book updatedAt="2024">
        <id>1</id>
        <pubDate>Jan 2024</pubDate>
    </book>
    <book updatedAt="2024">
        <id>2</id>
        <pubDate>Nov 2018</pubDate>
        <reprint>Jan 2019;Nov 2020</reprint>
    </book>
</books>
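When configuring the plugin manually, the publishDate rule from this example might be expressed with the configuration keys listed under Configuration settings, replacing the * in each key name with the Parameter 1 value. This is a sketch under assumptions: the key=value layout is used here only for readability, and the values are copied from this example, so check the exact spelling of allowed values against your data source configuration.

```properties
plugin.transform-date-field.config.publishDate.date_selector=//book/pubDate
plugin.transform-date-field.config.publishDate.date_element_type=FIELD CONTENT
plugin.transform-date-field.config.publishDate.date_element_multiple_values=false
plugin.transform-date-field.config.publishDate.date_input_format=d MMMM yyyy
plugin.transform-date-field.config.publishDate.date_input_language=fr
plugin.transform-date-field.config.publishDate.date_output_format=MMM yyyy
plugin.transform-date-field.config.publishDate.date_output_language=en-au
```

All seven keys share the publishDate ID, which is what groups them into a single rule.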

HTML example

Example 1

Consider the following HTML document as input:

<html>
    <head></head>
    <body>
        <h1>Book List</h1>
        <ol>
            <li data-pubdate="2024-01-01T00:00:00Z">Book 1 - Publish date <strong>2024-01-01T00:00:00Z</strong></li>
            <li data-pubdate="2024-02-03T00:00:00+10:00">Book 2 - Publish date <strong>2024-02-03T00:00:00+10:00</strong></li>
        </ol>
    </body>
</html>

with the following settings:

Configuration key name                  Parameter 1  Value

Date selector                           1            ol > li > strong
Date element type                       1            FIELD CONTENT
Is date element multiple values?        1            false
Date field input format                 1            yyyy-MM-dd'T'HH:mm:ssXXX
Date field output format                1            MMM yyyy

Date selector                           2            ol > li
Date element type                       2            ATTRIBUTE VALUE
Date element attribute value            2            data-pubdate
Is date element multiple values?        2            false
Date field input format                 2            yyyy-MM-dd'T'HH:mm:ssXXX
Date field output format                2            MMMM yyyy

The transform will output the following HTML document:

<html>
    <head></head>
    <body>
        <h1>Book List</h1>
        <ol>
            <li data-pubdate="January 2024">Book 1 - Publish date <strong>Jan 2024</strong></li>
            <li data-pubdate="February 2024">Book 2 - Publish date <strong>Feb 2024</strong></li>
        </ol>
    </body>
</html>

Example 2

Consider the following HTML document as input:

<html>
    <head>
        <meta name="updated-time" content="2024-01-01T00:00:00Z|2024-02-01T00:00:00Z"/>
    </head>
    <body>
        .....
    </body>
</html>

with the following settings:

Configuration key name                  Parameter 1  Value

Date selector                           1            meta[name=updated-time]
Date element type                       1            ATTRIBUTE VALUE
Date element attribute value            1            content
Is date element multiple values?        1            true
Date element multiple values separator  1            |
Date field input format                 1            yyyy-MM-dd'T'HH:mm:ssXXX
Date field output format                1            MMM yyyy

The transform will output the following HTML document:

<html>
    <head>
        <meta name="updated-time" content="Jan 2024|Feb 2024"/>
    </head>
    <body>
        .....
    </body>
</html>

Change log

[1.1.0]

Added

  • Added support to transform the date field with multiple values.

Fixed

  • Fixed the filter's fully qualified Java class name used in the plugin UI.

  • Updated documentation to fix XML example.