Plugin: Metadata delimiters
This filter can be used to replace the delimiters used in metadata fields on a per-field basis.
The delimiters are replaced with the Funnelback standard delimiter (a vertical bar character).
Usage
Enable the plugin
Enable the metadata-delimiters plugin on your data source from the Extensions screen in the administration dashboard, or add the following data source configuration:
plugin.metadata-delimiters.enabled=true
plugin.metadata-delimiters.version=1.0.0
The plugin will take effect after a full update of the data source.
Configuring the plugin
The MetadataDelimiters filter must be added to the jsoup filter chain in order for the plugin to work.
Add the filter to the filter.jsoup.classes option in the data source configuration, e.g.
filter.jsoup.classes=ContentGeneratorUrlDetection,FleschKincaidGradeLevel,UndesirableText,com.funnelback.plugin.metadatadelimiters.MetadataDelimiters
The following options must be set in the data source configuration to configure the plugin:
- plugin.metadata-delimiters.config.metadata.<METADATA-FIELD-NAME>.delimiter=<CHARACTER-TO-REPLACE>: This defines the delimiter character <CHARACTER-TO-REPLACE> that applies to the specified metadata field. A key needs to be defined for each field where you want to set the field delimiter. Only a single delimiter is supported for each field. e.g. plugin.metadata-delimiters.config.metadata.keywords.delimiter=, sets the separator for the <meta name="keywords"> field to a comma.
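As an illustration of how the per-field keys above map field names to delimiters, here is a minimal Python sketch. It is not the plugin's implementation: the function name field_delimiters and the dictionary standing in for the data source configuration are hypothetical.

```python
# Hypothetical helper: collects per-field delimiters from configuration keys.
PREFIX = "plugin.metadata-delimiters.config.metadata."
SUFFIX = ".delimiter"


def field_delimiters(config):
    """Map each metadata field name to its configured delimiter."""
    delimiters = {}
    for key, value in config.items():
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            # Field names may themselves contain dots (e.g. fruit.type), so
            # strip the fixed prefix and suffix rather than splitting on ".".
            field = key[len(PREFIX):-len(SUFFIX)]
            if field:
                delimiters[field] = value
    return delimiters


# A plain dict stands in for the data source configuration (assumption).
config = {
    "plugin.metadata-delimiters.enabled": "true",
    "plugin.metadata-delimiters.config.metadata.keywords.delimiter": ",",
    "plugin.metadata-delimiters.config.metadata.fruit.type.delimiter": ";",
}

print(field_delimiters(config))  # {'keywords': ',', 'fruit.type': ';'}
```

Keys that do not end in .delimiter, such as the enabled flag, are ignored by this sketch.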
Additional configuration settings:
- plugin.metadata-delimiters.config.metadata.<METADATA-FIELD-NAME>.attribute=<META-FIELD-ATTRIBUTE-CONTAINING-NAME>: This changes the <meta> tag attribute that holds the metadata field name. This is normally name, but for some <meta> tags, such as Open Graph meta tags, it needs to be set to property. e.g. a standard metadata field looks like <meta name="dc.title" content="Example title">. An Open Graph metadata field looks like <meta property="og:title" content="Example title">. A key needs to be set for each metadata field that does not have the metadata field name defined within the name attribute of the <meta> tag.
- plugin.metadata-delimiters.config.separator=<DELIMITER-TO-USE>: This defines the padre separator, which is what the delimiter is replaced with. The default is the vertical bar character (|). This only needs to be changed if the facet_item_sepchars indexer option has been set to a value that removes the vertical bar from the list of separators.
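The defaults described above can be sketched in Python as follows. This is an illustration only, not the plugin's code: the helper names meta_selector and output_separator are hypothetical, and the dictionary stands in for the data source configuration. The selector strings use the CSS attribute-selector form that jsoup accepts, but whether the plugin locates tags this way is an assumption.

```python
# Hypothetical helpers showing how the optional settings fall back to defaults.
PREFIX = "plugin.metadata-delimiters.config."


def meta_selector(field, config):
    """Build a CSS attribute selector for the <meta> tag holding a field."""
    # The field name is looked up in the name attribute unless overridden.
    attribute = config.get(f"{PREFIX}metadata.{field}.attribute", "name")
    return f'meta[{attribute}="{field}"]'


def output_separator(config):
    """Return the character that replaces the delimiter (default: |)."""
    return config.get(f"{PREFIX}separator", "|")


# A plain dict stands in for the data source configuration (assumption).
config = {f"{PREFIX}metadata.og:type.attribute": "property"}

print(meta_selector("dc.title", config))  # meta[name="dc.title"]
print(meta_selector("og:type", config))   # meta[property="og:type"]
print(output_separator(config))           # |
```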
Example
Consider the following HTML file:
<html>
<head>
<title>Example document</title>
<meta name="country" content="Australia, New Zealand">
<meta name="fruit.type" content="apple; banana; pear">
<meta name="colour" content="blue, green, orange, pink">
<meta property="og:type" content="web page; article">
</head>
<content>
...
</content>
</html>
To set the field delimiters for the fruit.type, colour and og:type fields:
- Enable the metadata-delimiters plugin.
- Add the metadata delimiters filter to the jsoup filter chain and ensure jsoup filtering is enabled in the main filter chain.
- Add the following data source configuration options to configure the plugin:
plugin.metadata-delimiters.config.metadata.colour.delimiter=,
plugin.metadata-delimiters.config.metadata.fruit.type.delimiter=;
plugin.metadata-delimiters.config.metadata.og:type.delimiter=;
plugin.metadata-delimiters.config.metadata.og:type.attribute=property
- Run a full update of your data source.
The plugin will update the HTML that is stored on disk to:
<html>
<head>
<title>Example document</title>
<meta name="country" content="Australia, New Zealand">
<meta name="fruit.type" content="apple| banana| pear">
<meta name="colour" content="blue| green| orange| pink">
<meta property="og:type" content="web page| article">
</head>
<content>
...
</content>
</html>
This will result in the Funnelback indexer splitting the colour, fruit.type and og:type fields when indexing. The country field will not get split because no delimiter was configured for it.
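The transformation in this example can be reproduced with a short Python sketch. It works on the content values only, not on the HTML, and is not the plugin's implementation: replace_delimiter and split_values are hypothetical names. Trimming whitespace when splitting is an assumption made for readability, not a statement about how the indexer treats the values.

```python
def replace_delimiter(content, delimiter, separator="|"):
    """Replace a field's configured delimiter with the separator."""
    return content.replace(delimiter, separator)


def split_values(content, separator="|"):
    """Split a rewritten field into values (whitespace trimming is assumed)."""
    return [value.strip() for value in content.split(separator)]


# Content values of the <meta> tags from the example document.
fields = {
    "country": "Australia, New Zealand",
    "fruit.type": "apple; banana; pear",
    "colour": "blue, green, orange, pink",
    "og:type": "web page; article",
}

# Delimiters configured in the example; country has none.
delimiters = {"colour": ",", "fruit.type": ";", "og:type": ";"}

# Only fields with a configured delimiter are rewritten.
rewritten = {
    name: replace_delimiter(content, delimiters[name]) if name in delimiters else content
    for name, content in fields.items()
}

for name, content in rewritten.items():
    print(f"{name}: {content!r} -> {split_values(content)}")
```

The country value contains no vertical bar after rewriting, so splitting it returns a single value.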