Metadata class

Funnelback allows metadata classes to be defined as ASCII alphanumeric strings up to 64 characters long that do not start with FUN in any combination of upper and lower case. Funnelback also has some predefined metadata classes, which should be used where possible.
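
The naming rule above can be checked mechanically. The following is a minimal sketch, not part of Funnelback: the function name is_valid_class_name is hypothetical, and it assumes "ASCII alphanumeric" means only the characters A-Z, a-z and 0-9.

```python
import re

# Assumption: "ASCII alphanumeric" means A-Z, a-z and 0-9 only,
# and a class name must contain at least one character.
_CLASS_NAME = re.compile(r"[A-Za-z0-9]{1,64}")


def is_valid_class_name(name: str) -> bool:
    """Return True if name satisfies the naming rule described above."""
    if _CLASS_NAME.fullmatch(name) is None:
        return False
    # Names starting with "FUN" in any mix of upper and lower case are not allowed.
    return not name.lower().startswith("fun")
```

For example, is_valid_class_name("author") returns True, while "FUNdate", "funTitle" and "my-class" are all rejected.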

Reserved Classes

The following metadata classes are reserved for internal use, and should not normally be used for other purposes.

Metadata Class | Explanation of Reservation
h | Outgoing link target information.
i | Image information (alt and src attributes of img tags).
k | Anchor text referring to the document (text within a tags).
m | Email addresses within the document (a tags using mailto: in href attributes).
u | URL hostname information.
v | URL path and filename information.
K | User click information referring to the document.

Any metadata class that starts with FUN, in any combination of upper and lower case, is also reserved.
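
As an illustration of the reservations above, the sketch below flags reserved class names. The names RESERVED_CLASSES and is_reserved are hypothetical and not part of Funnelback. It assumes the single-letter classes are case sensitive, since both k and K are listed with different meanings.

```python
# Single-letter classes reserved for internal use, as listed above.
# Assumption: these are case sensitive (k and K are distinct classes).
RESERVED_CLASSES = frozenset({"h", "i", "k", "m", "u", "v", "K"})


def is_reserved(name: str) -> bool:
    """Return True if name is reserved according to the rules above."""
    if name in RESERVED_CLASSES:
        return True
    # The FUN prefix is reserved in any mix of upper and lower case.
    return name.lower().startswith("fun")
```

Here is_reserved("K") and is_reserved("Funky") both return True, while is_reserved("a") returns False.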

Special Classes

The following metadata classes are treated specially by Funnelback. It may be appropriate to map metadata into them, but they will be treated differently internally as described below.

Metadata Class | Explanation | Default mappings
d | Used for document date information. Date sources may be mapped in metamap.cfg or xml.cfg, and will be used when the document date is displayed and for recency related ranking. | dc.date (and qualifications like dc.date.published not mentioned thereafter), dc.date.modified, dc.date.created, dc.date.issued, Last Modified Date (from HTTP headers), dc.date.expires, dc.date.valid, in order of decreasing priority. See supported date formats for more information.
f | Used for file format information. Will be used as the original type of a file (e.g. HTML, PDF, Word Document) where this information is displayed. | dc.format, funnelback.format, text/html
t | Used for title information. Title sources may be mapped in metamap.cfg or xml.cfg. The first title found will be used when the document title is displayed, and all title content will be up-weighted by default in ranking. For HTML documents the title in <title> will typically be preferred. | title, dc.title, trim.title, h1 tags, h2 tags, h3 tags, h4 tags
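
The priority order given for the date sources of class d can be sketched as a simple lookup. This is only an illustration of the ordering, not Funnelback's implementation. The function names are hypothetical, "http.last-modified" is a made-up key standing in for the Last Modified Date HTTP header, and the sketch assumes that dc.date qualifications not listed explicitly share the top priority of dc.date.

```python
from typing import Mapping, Optional

# Priority order taken from the default mappings for class d, highest first.
# "http.last-modified" is a hypothetical key for the Last Modified Date header.
DATE_PRIORITY = [
    "dc.date",
    "dc.date.modified",
    "dc.date.created",
    "dc.date.issued",
    "http.last-modified",
    "dc.date.expires",
    "dc.date.valid",
]


def date_rank(source: str) -> Optional[int]:
    """Return the priority rank of a date source (0 is highest), or None."""
    key = source.lower()
    if key in DATE_PRIORITY:
        return DATE_PRIORITY.index(key)
    if key.startswith("dc.date."):
        # Assumption: unlisted qualifications (e.g. dc.date.published)
        # are treated like dc.date itself.
        return 0
    return None


def pick_document_date(fields: Mapping[str, str]) -> Optional[str]:
    """Return the value of the highest-priority date source, if any."""
    ranked = [(date_rank(name), value) for name, value in fields.items()]
    ranked = [(rank, value) for rank, value in ranked if rank is not None]
    if not ranked:
        return None
    # min() keeps the first entry on ties, so input order breaks ties.
    return min(ranked, key=lambda pair: pair[0])[1]
```

With a document that carries both dc.date.valid and dc.date.created, the dc.date.created value would be chosen because it appears earlier in the priority list.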

Predefined Classes

Funnelback predefines the following classes. This allows Funnelback to look for some metadata within HTML documents and display it on the search results page without heavy customisation.

Metadata Class | Explanation | Metadata fields included
* | Anywhere. In any metadata field or in the page content. | N/A
a | Author | Author, DC.Creator, DC.Author, DC.Contributor, from: (email)
b | Rights | DC.Rights
c | Description | DC.Description
e | Type | DC.Type
f | Format | DC.Format
g | Relation | DC.Relation
j | Availability/Identifier | DC.Identifier, AGLS.Availability
l | Language | DC.Language
n | Source | DC.Source
o | Coverage | DC.Coverage
p | Publisher | DC.Publisher
q | Function | AGLS.Function
r | Recipients | to: (email), AGLS.audience
s | Subject/Keywords | keywords, DC.Subject, subject: (in the case of email)
w | - | AGLS.Mandate
S | Used for document security information in DLS-enabled collections. | -
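
The predefined mappings above can be represented as a lookup from metadata field name to class. The sketch below is illustrative only and does not reflect how Funnelback stores these mappings. The names PREDEFINED_FIELD_CLASSES and class_for_field are hypothetical, field names are assumed to match case-insensitively, and the email header sources (from:, to:, subject:) are left out.

```python
from typing import Optional

# Field name (lower-cased) to predefined class, copied from the list above.
# Email header sources (from:, to:, subject:) are intentionally omitted.
PREDEFINED_FIELD_CLASSES = {
    "author": "a", "dc.creator": "a", "dc.author": "a", "dc.contributor": "a",
    "dc.rights": "b",
    "dc.description": "c",
    "dc.type": "e",
    "dc.format": "f",
    "dc.relation": "g",
    "dc.identifier": "j", "agls.availability": "j",
    "dc.language": "l",
    "dc.source": "n",
    "dc.coverage": "o",
    "dc.publisher": "p",
    "agls.function": "q",
    "agls.audience": "r",
    "keywords": "s", "dc.subject": "s",
    "agls.mandate": "w",
}


def class_for_field(field_name: str) -> Optional[str]:
    """Return the predefined class for a field name, or None if unmapped."""
    # Assumption: field names are matched without regard to case.
    return PREDEFINED_FIELD_CLASSES.get(field_name.strip().lower())
```

For example, class_for_field("DC.Creator") returns "a", and an unmapped field returns None.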

Listing classes

In some situations, such as certain query processor options, Funnelback accepts a list of metadata classes.

The standard syntax for such a list is a comma separated list of class names within square brackets, for example:

[class1,class2,...,classN]
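
A minimal sketch of building and reading this list syntax follows. The helper names format_class_list and parse_class_list are hypothetical and not part of Funnelback. The parser assumes that whitespace around class names can be ignored and that empty entries are dropped.

```python
from typing import Iterable, List


def format_class_list(classes: Iterable[str]) -> str:
    """Join class names into the bracketed, comma separated form."""
    return "[" + ",".join(classes) + "]"


def parse_class_list(text: str) -> List[str]:
    """Split a bracketed, comma separated list into class names."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("class list must be enclosed in square brackets")
    inner = text[1:-1]
    # Assumption: surrounding whitespace is ignored and empty entries dropped.
    return [name.strip() for name in inner.split(",") if name.strip()]
```

For example, format_class_list(["t", "c", "a"]) produces [t,c,a], and parse_class_list("[t,c,a]") returns the three class names in order.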
