Plugin: Transform date field
Purpose
Use this plugin to transform the value of date fields.
When to use this plugin
This plugin can be used to convert dates in your data, allowing you to:
- transform ambiguous date formats into ISO format so that dates are correctly indexed by Funnelback, e.g. US format dates like 2024-02-03 (2 March 2024), where you know what the intended format is.
- create date values for use with faceted navigation, when the existing date facet groupings are not appropriate.
- convert non-English dates into ISO format for the indexer.
- create non-English formatted dates for presentation.
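To make the first case concrete, here is a minimal Java sketch of the kind of conversion involved. It is not the plugin's code: the class name `AmbiguousDateSketch` and the year-day-month input pattern are assumptions, chosen to match the 2024-02-03 (2 March 2024) example above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

final class AmbiguousDateSketch {

    /** Re-reads a year-day-month value and writes it as an ISO yyyy-MM-dd date. */
    static String toIso(String value) {
        // Assumption: the source is known to store dates as year-day-month.
        SimpleDateFormat input = new SimpleDateFormat("yyyy-dd-MM", Locale.ROOT);
        SimpleDateFormat output = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        // Use one fixed time zone for both steps so the calendar day cannot shift.
        TimeZone utc = TimeZone.getTimeZone("UTC");
        input.setTimeZone(utc);
        output.setTimeZone(utc);
        try {
            return output.format(input.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a year-day-month date: " + value, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toIso("2024-02-03")); // prints 2024-03-02
    }
}
```

In the plugin, the two patterns would be supplied through the input format and output format settings described below rather than hard-coded.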
This plugin currently only supports the modification of dates that are within metadata or XML fields within the source document. It does not currently support modification of metadata generated by other filters, or of metadata fields at display time.
Usage
Enable the plugin
- Select Plugins from the side navigation pane and click on the Transform date field tile.
- From the Location section, use the Select a data source list to choose the data source on which you would like to enable this plugin.
The plugin will take effect after the setup steps are complete and an advanced > full update of the data source has completed.
Configuration settings
The configuration settings section is where you do most of the configuration for your plugin. The settings enable you to control how the plugin behaves.
The configuration key names below are only used if you are configuring this plugin manually. The configuration keys are set in the data source configuration to configure the plugin. When setting the keys manually you need to type in (or copy and paste) the key name and value.
Date selector
Configuration key | |
Data type | string |
Required | This setting is required |
Defines an element containing a date that should be transformed. Parameter 1 should be set to a unique ID that will group together the different configuration items that make up a rule.
Date element type
Configuration key | |
Data type | string |
Allowed values | FIELD CONTENT, ATTRIBUTE VALUE |
Required | This setting is required |
Defines if the date value is sourced from the element content, or as the value of an attribute. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
Date element attribute value
Configuration key | |
Data type | string |
Default value | |
Required | This setting is optional |
Defines the name of the element’s attribute containing the date, if the date element type is set to ATTRIBUTE VALUE. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
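The difference between the two element types can be sketched in Java with the standard DOM API. This is only an illustration: `DateSourceSketch` and `readDate` are hypothetical names, and looking elements up by tag name is a stand-in for whatever selector syntax the plugin accepts.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DateSourceSketch {

    /** Returns the raw date string from the first element with the given tag name. */
    static String readDate(String xml, String tagName, String attributeName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element element = (Element) doc.getElementsByTagName(tagName).item(0);
            // No attribute name: the date is the element content.
            // Attribute name given: the date is the value of that attribute.
            return attributeName == null
                    ? element.getTextContent()
                    : element.getAttribute(attributeName);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read " + tagName, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book updatedAt=\"2024-01-02T14:00Z\">"
                + "<pubDate>2 Novembre 2018</pubDate></book>";
        System.out.println(readDate(xml, "pubDate", null));     // prints 2 Novembre 2018
        System.out.println(readDate(xml, "book", "updatedAt")); // prints 2024-01-02T14:00Z
    }
}
```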
Is date element multiple values?
Configuration key | |
Data type | boolean |
Default value | |
Required | This setting is optional |
Defines if the date element contains multiple date values. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
Date element multiple values separator
Configuration key | |
Data type | string |
Default value | |
Required | This setting is optional |
Defines the separator used to extract multiple date values from the element content, if Is date element multiple values? is true. Parameter 1 should be set to the Parameter 1 value of the matching date selector.
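As a rough illustration of multi-value handling, the Java sketch below splits the content on the separator, converts each value, and joins the results with the same separator. The class name `MultiValueSketch`, the two patterns, and the use of UTC for output are assumptions made for the example, not the plugin's implementation.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

final class MultiValueSketch {

    /** Converts every separator-delimited ISO date-time in content to "MMM yyyy". */
    static String transformAll(String content, String separator) {
        // Assumed patterns; in the plugin these come from the format settings.
        SimpleDateFormat input = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssXXX", Locale.ENGLISH);
        SimpleDateFormat output = new SimpleDateFormat("MMM yyyy", Locale.ENGLISH);
        output.setTimeZone(TimeZone.getTimeZone("UTC"));
        List<String> results = new ArrayList<>();
        // Pattern.quote makes split treat the separator literally;
        // an unquoted "|" would be read as regex alternation.
        for (String value : content.split(Pattern.quote(separator))) {
            try {
                results.add(output.format(input.parse(value.trim())));
            } catch (ParseException e) {
                throw new IllegalArgumentException("Unparseable date: " + value, e);
            }
        }
        return String.join(separator, results);
    }

    public static void main(String[] args) {
        String content = "2024-01-01T00:00:00Z|2024-02-01T00:00:00Z";
        System.out.println(transformAll(content, "|")); // prints Jan 2024|Feb 2024
    }
}
```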
Date field input format
Configuration key | |
Data type | string |
Required | This setting is required |
Defines the date format of the input data, specified as a Java SimpleDateFormat string. Parameter 1 should be set to the Parameter 1 value of the matching date selector. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/text/SimpleDateFormat.html
Date field input language
Configuration key | |
Data type | string |
Default value | |
Required | This setting is optional |
Defines the language of the input data to enable correct parsing of things like months specified in different languages. Parameter 1 should be set to the Parameter 1 value of the matching date selector. The value should be a valid Java Locale. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Locale.html
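The effect of the input language can be seen in a small Java sketch. It assumes the configured value is read as a language tag such as `fr`; the class name `InputLanguageSketch` and the use of `Locale.forLanguageTag` are illustrative choices, not a statement of how the plugin resolves the setting.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class InputLanguageSketch {

    /** Parses value with the given pattern and language; returns null if it does not match. */
    static Date parse(String value, String pattern, String languageTag) {
        SimpleDateFormat format =
                new SimpleDateFormat(pattern, Locale.forLanguageTag(languageTag));
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The French month name is only recognised when a French locale is used.
        System.out.println(parse("2 Novembre 2018", "d MMMM yyyy", "fr") != null); // prints true
        System.out.println(parse("2 Novembre 2018", "d MMMM yyyy", "en") != null); // prints false
    }
}
```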
Date field output format
Configuration key | |
Data type | string |
Required | This setting is required |
Defines the date format of the output data, specified as a Java SimpleDateFormat string. Parameter 1 should be set to the Parameter 1 value of the matching date selector. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/text/SimpleDateFormat.html
Date field output language
Configuration key | |
Data type | string |
Default value | |
Required | This setting is optional |
Defines the language of the output data to ensure correct formatting of things like months specified in different languages. Parameter 1 should be set to the Parameter 1 value of the matching date selector. The value should be a valid Java Locale. See: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Locale.html
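The following Java sketch shows how the output format and output language together shape the result. The class name `OutputLanguageSketch`, the language tags, and the fixed UTC time zone are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class OutputLanguageSketch {

    /** Formats a date with the given SimpleDateFormat pattern and language tag. */
    static String format(Date date, String pattern, String languageTag) {
        SimpleDateFormat format =
                new SimpleDateFormat(pattern, Locale.forLanguageTag(languageTag));
        // Assumption: output is rendered in UTC so results do not depend on the host.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date date = Date.from(Instant.parse("2024-03-02T00:00:00Z"));
        System.out.println(format(date, "MMM yyyy", "en"));    // prints Mar 2024
        System.out.println(format(date, "d MMMM yyyy", "fr")); // prints 2 mars 2024
    }
}
```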
If a specified element cannot be parsed then it will remain unmodified.
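The fallback behaviour described above can be sketched in Java as follows. The class name `FallbackSketch` is hypothetical, and strict (non-lenient) parsing is an assumption: this page does not state whether the plugin parses leniently.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

final class FallbackSketch {

    /** Returns the reformatted date, or the original value if it cannot be parsed. */
    static String transformOrKeep(String value, String inputPattern, String outputPattern) {
        SimpleDateFormat input = new SimpleDateFormat(inputPattern, Locale.ENGLISH);
        SimpleDateFormat output = new SimpleDateFormat(outputPattern, Locale.ENGLISH);
        TimeZone utc = TimeZone.getTimeZone("UTC");
        input.setTimeZone(utc);
        output.setTimeZone(utc);
        // Assumption: strict parsing, so impossible dates such as month 13 are rejected.
        input.setLenient(false);
        try {
            return output.format(input.parse(value));
        } catch (ParseException e) {
            return value; // leave the value unmodified
        }
    }

    public static void main(String[] args) {
        System.out.println(transformOrKeep("2024-02-03", "yyyy-MM-dd", "MMM yyyy"));   // prints Feb 2024
        System.out.println(transformOrKeep("date unknown", "yyyy-MM-dd", "MMM yyyy")); // prints date unknown
    }
}
```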
Filter chain configuration
This plugin uses filters which are used to apply transformations to the gathered content.
The filters run in sequence and need to be set in an order that makes sense. The plugin-supplied filter(s) (as indicated in the listing) should be re-ordered to an appropriate point in the sequence.
Changes to the filter order affect the way the data source processes gathered documents. See: document filters documentation.
Filter classes
This plugin supplies a filter that runs in the main document filter chain: com.funnelback.plugin.TransformDateField.TransformDateFieldStringFilter
Drag the com.funnelback.plugin.TransformDateField.TransformDateFieldStringFilter plugin filter to where you wish it to run in the filter chain sequence.
Examples
XML example
Consider the following XML document as input:
<?xml version="1.0" encoding="UTF-8"?>
<books>
  <book updatedAt="2024-10-20T13:00+10:00">
    <id>1</id>
    <pubDate>1 Janvier 2024</pubDate>
  </book>
  <book updatedAt="2024-01-02T14:00Z">
    <id>2</id>
    <pubDate>2 Novembre 2018</pubDate>
    <reprint>1 Janvier 2019;2 Novembre 2020</reprint>
  </book>
</books>
with the following settings:
Configuration key name | Parameter 1 | Value |
---|---|---|
Date selector | | |
Date element type | | |
Is date element multiple values? | | |
Date field input format | | |
Date field input language | | |
Date field output format | | |
Date field output language | | |
Date selector | | |
Date element type | | |
Is date element multiple values? | | |
Date element attribute value | | |
Date field input format | | |
Date field output format | | |
Date selector | | |
Date element type | | |
Is date element multiple values? | | |
Date element multiple values separator | | |
Date field input format | | |
Date field input language | | |
Date field output format | | |
Date field output language | | |
The transform will output the following XML document:
<?xml version="1.0" encoding="UTF-8"?>
<books>
  <book updatedAt="2024">
    <id>1</id>
    <pubDate>Jan 2024</pubDate>
  </book>
  <book updatedAt="2024">
    <id>2</id>
    <pubDate>Nov 2018</pubDate>
    <reprint>Jan 2019;Nov 2020</reprint>
  </book>
</books>
HTML example
Example 1
Consider the following HTML document as input:
<html>
  <head></head>
  <body>
    <h1>Book List</h1>
    <ol>
      <li data-pubdate="2024-01-01T00:00:00Z">Book 1 - Publish date <strong>2024-01-01T00:00:00Z</strong></li>
      <li data-pubdate="2024-02-03T00:00:00+10:00">Book 2 - Publish date <strong>2024-02-03T00:00:00+10:00</strong></li>
    </ol>
  </body>
</html>
with the following settings:
Configuration key name | Parameter 1 | Value |
---|---|---|
Date selector | | |
Date element type | | |
Is date element multiple values? | | |
Date field input format | | |
Date field output format | | |
Date selector | | |
Date element type | | |
Date element attribute value | | |
Is date element multiple values? | | |
Date field input format | | |
Date field output format | | |
The transform will output the following HTML document:
<html>
  <head></head>
  <body>
    <h1>Book List</h1>
    <ol>
      <li data-pubdate="January 2024">Book 1 - Publish date <strong>Jan 2024</strong></li>
      <li data-pubdate="February 2024">Book 2 - Publish date <strong>Feb 2024</strong></li>
    </ol>
  </body>
</html>
Example 2
Consider the following HTML document as input:
<html>
  <head>
    <meta name="updated-time" content="2024-01-01T00:00:00Z|2024-02-01T00:00:00Z"/>
  </head>
  <body>
    .....
  </body>
</html>
with the following settings:
Configuration key name | Parameter 1 | Value |
---|---|---|
Date selector | | |
Date element type | | |
Date element attribute value | | |
Is date element multiple values? | | |
Date element multiple values separator | | |
Date field input format | | |
Date field output format | | |
The transform will output the following HTML document:
<html>
  <head>
    <meta name="updated-time" content="Jan 2024|Feb 2024"/>
  </head>
  <body>
    .....
  </body>
</html>