Metadata class types
Funnelback supports five types of metadata classes:
- Text: The content of this class is a string of text.
- Number: The content of this field is a numeric value. Funnelback will interpret this as a number. This type should only be used if there is a need to use numeric operators when performing a search (e.g. X > 2050). If the field is only required for display within the search results, a text type metadata class is sufficient.
- Date: Funnelback supports a single date class and will use the values mapped to this class to determine a date for the document for the purposes of ranking, sorting and date range search. If additional dates are required they should be configured as either text (e.g. 2017-09-24) or number (e.g. 20170924) type metadata classes.
- Geospatial x/y coordinate: The content of this field is a decimal lat/long value in the following format: geo-x;geo-y (e.g. 40.6976684;-74.260555). This type should only be used if there is a need to perform a geospatial search (e.g. this point is within X km of another point). If the geospatial coordinate is only required for plotting items on a map, a text type metadata class is sufficient.
- Document permissions: The content of this field is a security lock string defining the document permissions. This type should only be used when working with an enterprise collection that includes document level security and specifies the requirement of a document permissions metadata field.
Metadata class types: text
A text type metadata class has the values interpreted as a text string.
The text can include code such as HTML tags and these will be returned as is by Funnelback. It is the responsibility of the user interface layer to interpret or escape the field content.
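As a minimal sketch of what escaping in the user interface layer could look like, the snippet below uses Python's standard `html.escape`. The function name `render_metadata_value` and the sample value are hypothetical and not part of Funnelback; a real front end may instead choose to render trusted markup as is.

```python
import html


def render_metadata_value(raw_value: str) -> str:
    """Escape a text metadata value before placing it in an HTML page.

    Hypothetical helper: Funnelback returns the value unchanged, so the
    calling interface decides whether to escape or interpret it.
    """
    return html.escape(raw_value, quote=True)


# Hypothetical metadata value containing HTML tags.
sample = "<b>Annual report</b> & more"
print(render_metadata_value(sample))
# &lt;b&gt;Annual report&lt;/b&gt; &amp; more
```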
Metadata class types: date
Funnelback supports a single date-type metadata class using the reserved d metadata class. The value of this field is interpreted as a date and is assigned as the document’s date for the purposes of recency in the ranking algorithm, and also for sort and presentation.
Only a single date value will be assigned to the document. If multiple date metadata fields exist in the document the assigned date is chosen based on the date precedence rules below.
Supported date formats
| Name | Format | Example | Notes |
|---|---|---|---|
| RFC1123 | | Wed Mar 08 14:11:00 EST 2000 | |
| | YYYY-MM-DD | 2001-01-31 or 2001-31-01 12:53:01Z or 2001-31-01T12:53:01Z | January 31st 2001 |
| 14 digits | YYYYMMDDHHmmss | 20091110083016 | November 10th 2009, 8:30:16 am |
| 6 digits | YYMMDD | 010131 | January 31st 2001 |
| Short ISO-8601 | YYYY-MM | 2001-01 | January 2001 |
| Very short ISO-8601 | YYYY | 2001 | 2001 |
| Non compliant ISO-8601 | YYYY-DD-MM | 2001-31-01 | Although this format is not standards compliant, dates with a middle component greater than 12 are treated this way. Take care though, ambiguous dates (e.g. May 2nd) will be interpreted in YYYY-MM-DD format. |
| Abbreviated date | YYMMMDD | 31jan01 | January 31st 2001 |
| Long form date | DD MMMM YYYY | 31st january, 2001 or 31 Jan 2001 | Long or short form months accepted, punctuation and 'st' 'nd' optional - "31 January 2001" is also acceptable. |
| Long form date, month first | MMMM DD YYYY | January 31st, 2001 | Long or short form months accepted, punctuation and 'st' 'nd' optional - "January 31 2001" is also acceptable. |
| Pre-2000 dates | DD MM YY | 31/1/01 or 31-01-01 | Punctuation ignored. The indexer interprets years less than 80 as post 2000, and years greater or equal to 80 as 1980 onwards. It is not recommended. |
| A TRIM format | DD/MM/YYYY at h:mm a | 13/6/2007 at 6:51 AM, or 06/12/2007 at 4:51 PM | Used by TRIM record management system |
| Non-standard | DD-MM-YYYY | 13-06-2007 or 13/06/2007 | Avoid if possible |
| Non-standard | Day, DD Mon YYYY | Wed, 13 Jun 2007 17:26:08 +1000 | At least there is no ambiguity here. |
| 19 character UTC | yyyyMMddHHmmss.SSSZ | 19970705071122.123Z | The indexer will convert this date from UTC to the server’s local time zone. |
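Two of the rules in the table are easy to misread: the fallback from YYYY-MM-DD to YYYY-DD-MM when the middle component is greater than 12, and the two-digit year pivot at 80. The following Python sketch illustrates those two rules only. It is not the indexer's actual parser, and the function names are hypothetical.

```python
from datetime import date


def parse_dashed_date(text: str) -> date:
    """Parse a dashed date, following the table's disambiguation rule.

    A middle component greater than 12 cannot be a month, so the value is
    read as YYYY-DD-MM. Anything else is read as YYYY-MM-DD.
    """
    year, middle, last = (int(part) for part in text.split("-"))
    if middle > 12:
        return date(year, last, middle)
    return date(year, middle, last)


def expand_two_digit_year(yy: int) -> int:
    """Two-digit years below 80 become 20YY; 80 and above become 19YY."""
    return 2000 + yy if yy < 80 else 1900 + yy


print(parse_dashed_date("2001-31-01"))  # 2001-01-31
print(parse_dashed_date("2001-05-02"))  # 2001-05-02 (May 2nd, not February 5th)
print(expand_two_digit_year(1))         # 2001
print(expand_two_digit_year(80))        # 1980
```

The second call shows why ambiguous values are risky: a source that meant February 5th in YYYY-DD-MM form would be indexed as May 2nd.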
Date precedence order
When multiple dates are encountered for a document the following precedence order applies:
- External metadata (highest priority)
- The first occurrence in the document of dc.date or any metadata source mapped to the d metadata class.
- dc.date.modified
- dc.date.created
- dc.date.issued
- HTTP last modified date (lowest priority)
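The precedence order above can be sketched as a simple first-match lookup. This is an illustration under stated assumptions, not Funnelback's implementation: the keys `external_metadata` and `http_last_modified` are hypothetical labels, `dc.date` stands in for any source mapped to the d class, and the default behaviour is modelled without any indexer options.

```python
from typing import Dict, Optional

# Highest priority first, mirroring the list above.
# "external_metadata" and "http_last_modified" are hypothetical labels.
DATE_PRECEDENCE = [
    "external_metadata",
    "dc.date",
    "dc.date.modified",
    "dc.date.created",
    "dc.date.issued",
    "http_last_modified",
]


def choose_document_date(candidates: Dict[str, str]) -> Optional[str]:
    """Return the date from the highest-priority source that has a value."""
    for source in DATE_PRECEDENCE:
        value = candidates.get(source)
        if value:
            return value
    return None


found = {
    "dc.date.created": "2001-01-31",
    "http_last_modified": "2009-11-10",
}
print(choose_document_date(found))  # 2001-01-31
```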
There are also two indexer options that change how the HTTP Last-Modified header is processed. These can be added to the indexer_options in your data source configuration.
- -lmd: The HTTP Last-Modified date takes priority over metadata dates.
- -lmd_never: Completely ignore HTTP Last-Modified dates.
Metadata class types: number
Defining a metadata class as a number tells Funnelback to interpret the contents of the field as a number. This allows numeric comparisons (==, !=, >=, >, <, <=) to be run against the field, and for numeric ranges to be defined as faceted navigation using the class.
Numeric metadata is only required if you wish to make use of these range comparisons or for numeric range facets. Numbers for the purpose of display in the search results should be defined as text metadata.
The value of a numeric field will contain an integer or float, and this number is interpreted by Funnelback as an 8-byte double. This affects the precision of large and small numerical values when applying range searches against a specific number. The lt_x and gt_x operators compare against the exact value specified. Other operators allow a small tolerance, enforced by the accuracy of 8-byte doubles.
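To see why 8-byte doubles limit precision, the short Python sketch below uses Python's own `float`, which is an 8-byte IEEE 754 double on mainstream platforms. It assumes Funnelback's doubles behave the same way. The helper name `same_as_double` is hypothetical.

```python
import math


def same_as_double(a: int, b: int) -> bool:
    """Return True when two integers become the same 8-byte double."""
    return float(a) == float(b)


# Above 2**53, consecutive integers can no longer be told apart.
big = 2 ** 53  # 9007199254740992
print(same_as_double(big, big + 1))  # True

# Small decimal fractions are also stored approximately.
total = 0.1 + 0.2
print(total == 0.3)                              # False
print(math.isclose(total, 0.3, rel_tol=1e-9))    # True
```

In practice this means very large identifiers or high-precision decimals may not match an exact numeric comparison the way their text form suggests.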
Metadata class types: geospatial x/y coordinate
Defining a field as geospatial type metadata tells Funnelback to interpret the contents of the field as a decimal lat/long coordinate (e.g. -31.95516;115.85766). This is used by Funnelback to assign a geospatial coordinate to an indexed item (effectively pinning it to a single point on a map). A geospatial metadata field is useful if you wish to add location-based search constraints, such as showing items within a specified distance of a specified origin point, or sorting the results by proximity (closeness) to a specific point.
Funnelback requires a geospatial metadata field to follow the decimal lat/long coordinate format.
A geospatial metadata coordinate is not required if you just want to plot the item onto a map in the search results (a text type value will be fine, as it is just a text value you are passing to the mapping API service that will generate the map). It is only required if you wish to make use of distance-related searching (e.g. find results near this location).
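As an illustration of the coordinate format and of a distance-based constraint, the Python sketch below parses a `lat;long` value and computes a great-circle distance with the haversine formula. This is a generic technique and an assumption on our part: the source does not say how Funnelback calculates distance, and all names here are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

Coordinate = Tuple[float, float]


def parse_coordinate(value: str) -> Coordinate:
    """Split a 'lat;long' string into a pair of decimal degrees."""
    lat_text, lon_text = value.split(";")
    return float(lat_text), float(lon_text)


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = (math.radians(v) for v in a)
    lat2, lon2 = (math.radians(v) for v in b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def within_km(item: Coordinate, origin: Coordinate, max_km: float) -> bool:
    """Return True when the item lies within max_km of the origin."""
    return haversine_km(item, origin) <= max_km


item = parse_coordinate("-31.95516;115.85766")
print(item)                         # (-31.95516, 115.85766)
print(within_km(item, item, 1.0))   # True
```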
Metadata class types: document permissions
Funnelback interprets the value contained in a document permissions type metadata class as a document lock string describing the access controls that apply to the document.
This is used for enterprise search collections that enforce document level security.
The format of the lockstring is determined by the connector that is used for the repository that is being indexed.
Defining a document permissions type metadata field will prevent all results from the index from being returned unless an appropriate security plugin has been defined. This is to enforce a minimum level of security over the collection when document level security is enabled. For this reason, metadata fields of this type should only be defined when indexing a supported repository type that requires a document permissions metadata field to be defined.
See: document level security for further information.
Searching metadata
Metadata can be searched via the Funnelback query language or metadata specific CGI parameters.
See: Funnelback query language help for further information.