Numerical Metadata

Introduction

Funnelback supports search over numeric data stored in metadata fields. These fields can be defined in either metamap.cfg or xml.cfg, as in the examples further down.

Numeric fields can be queried using CGI parameters.

The CGI parameters are listed below, where <x> is a metadata class:

CGI Parameter   Values    Description
lt_<x>          <float>   Performs a "Less than" operation on metadata class <x>
le_<x>          <float>   Performs a "Less than or equals" operation on metadata class <x>
gt_<x>          <float>   Performs a "Greater than" operation on metadata class <x>
ge_<x>          <float>   Performs a "Greater than or equals" operation on metadata class <x>
eq_<x>          <float>   Performs an "Equals" operation on metadata class <x>
ne_<x>          <float>   Performs a "Not Equals" operation on metadata class <x>
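
The six prefixes share their names with the comparison functions in Python's standard operator module, so their meaning can be sketched in a few lines. This is only an illustration of the comparison semantics, not a description of how the query processor is implemented; parse_constraint and satisfies are hypothetical helper names.

```python
import operator

# Prefix -> comparison function; the prefixes are the ones listed above.
OPERATORS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
    "eq": operator.eq,
    "ne": operator.ne,
}


def parse_constraint(name, value):
    """Split e.g. ('le_price', '100000') into ('price', operator.le, 100000.0)."""
    # Split on the first underscore only, so class names may contain underscores.
    prefix, _, metadata_class = name.partition("_")
    if prefix not in OPERATORS or not metadata_class:
        raise ValueError(f"not a numeric constraint parameter: {name!r}")
    return metadata_class, OPERATORS[prefix], float(value)


def satisfies(doc_value, name, value):
    """Return True if a document's numeric value passes the constraint."""
    _, compare, threshold = parse_constraint(name, value)
    return compare(float(doc_value), threshold)


print(satisfies(65800, "le_price", "100000"))
print(satisfies(165300, "le_price", "100000"))
```

The first call reports a match and the second does not, since 165300 is greater than 100000.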

Assumptions

The following assumptions are made by the indexer and query processor:

How to index numerical data

Numerical metadata can be represented in three different ways:

  1. via meta elements in HTML (or XML),
  2. via XML elements,
  3. via attributes of XML elements.

Example

Example metamap.cfg

# Numerical metadata fields relating to cars
weight,3,weight
acceleration,3,acceleration
capacity,3,engine_capacity
price,3,price

Example xml.cfg

PADRE XML Mapping Version: 2
#Supports numerical metadata either through elements or attributes.
document,/car
docurl,/car/url
t,1,,//title
description,0,,//description
weight,3,,//weight
acceleration,3,,//acceleration
capacity,3,,//engine_capacity
price,3,,//price
weight,3,,/car@weight 
acceleration,3,,/car@acceleration 
capacity,3,,/car@engine_capacity 
price,3,,/car@price

No special settings are needed for indexing, but the appropriate query_processor_options (-SF=<numeric metadata classes> together with -SM=meta or -SM=both) must be set in collection.cfg to ensure that the numeric fields appear in the result packet. For the example above:

query_processor_options=-SM=meta -SF=[weight,capacity,acceleration,price] -SBL=2000
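
The classes named in -SF have to match the numeric classes defined in metamap.cfg. A minimal sketch of deriving that list follows; it assumes each metamap.cfg line has the form class,type,source and that type 3 marks a numeric field, both inferred from the example above. numeric_classes and sf_option are hypothetical helper names, not part of Funnelback.

```python
METAMAP = """\
# Numerical metadata fields relating to cars
weight,3,weight
acceleration,3,acceleration
capacity,3,engine_capacity
price,3,price
"""

NUMERIC_TYPE = "3"  # assumption: inferred from the example metamap.cfg


def numeric_classes(metamap_text):
    """Return the metadata classes whose type column is the numeric type."""
    classes = []
    for line in metamap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        metadata_class, field_type, _source = line.split(",", 2)
        if field_type == NUMERIC_TYPE and metadata_class not in classes:
            classes.append(metadata_class)
    return classes


def sf_option(metamap_text):
    """Format the classes as an -SF option, using the bracketed list shown above."""
    return "-SF=[" + ",".join(numeric_classes(metamap_text)) + "]"


print(sf_option(METAMAP))
```

The result names the same four classes as the collection.cfg line above, in the order they appear in metamap.cfg.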

Example XML documents to which the above xml.cfg applies. The first record supplies the numeric values as elements, the second as attributes:


<car>
  <url>http://www.bmw.com.au/scripts/main.asp?PageID=11768&amp;ModelID=1000079&amp;ModelCategoryID=10</url>
  <title>BMW model X95</title>
  <meta name='description' content='The only BMW sports car with the ability to plough a field!'/>
  <weight>1056.9</weight>
  <acceleration>30.9</acceleration>
  <engine_capacity>5500</engine_capacity>
  <price>165300</price>
</car>

<car weight='1312.8' acceleration='15.2' engine_capacity='2293' price='65800'> 
  <url>http://www.bmw.com.au/scripts/main.asp?PageID=11768&amp;ModelID=1000116&amp;ModelCategoryID=10&amp;Screen=LaunchPage</url>
  <title>BMW model X100</title>
  <meta name='description' content='The only BMW sports car which does not seem out of place when shopping for groceries.'/>
</car>
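
Both layouts carry the same four fields. As a rough illustration of what the element and attribute mappings in xml.cfg select, the sketch below reads either form with Python's xml.etree.ElementTree. It is not Funnelback's indexer; the records are abbreviated copies of the examples above, and numeric_fields is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Metadata class -> element or attribute name, as in the example xml.cfg.
FIELDS = {
    "weight": "weight",
    "acceleration": "acceleration",
    "capacity": "engine_capacity",
    "price": "price",
}

ELEMENT_FORM = """<car>
  <title>BMW model X95</title>
  <weight>1056.9</weight>
  <acceleration>30.9</acceleration>
  <engine_capacity>5500</engine_capacity>
  <price>165300</price>
</car>"""

ATTRIBUTE_FORM = """<car weight='1312.8' acceleration='15.2'
     engine_capacity='2293' price='65800'>
  <title>BMW model X100</title>
</car>"""


def numeric_fields(car_xml):
    """Return {metadata class: float} for one <car> record in either form."""
    car = ET.fromstring(car_xml)
    values = {}
    for metadata_class, source in FIELDS.items():
        raw = car.get(source)           # attribute form, e.g. <car price='...'>
        if raw is None:
            raw = car.findtext(source)  # element form, e.g. <price>...</price>
        if raw is not None:
            values[metadata_class] = float(raw)
    return values


print(numeric_fields(ELEMENT_FORM))
print(numeric_fields(ATTRIBUTE_FORM))
```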

Composing a search

To find all BMW cars costing no more than one hundred thousand dollars with acceleration between 10 and 20 inclusive, use a CGI query string such as:

query=BMW&le_price=100000&ge_acceleration=10&le_acceleration=20
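
When the constraints come from a form or another program, the query string can be assembled rather than typed by hand. The following sketch uses urlencode from Python's standard library; build_query is a hypothetical helper name, and the script path or host the string is appended to is left out because it depends on the installation.

```python
from urllib.parse import urlencode


def build_query(terms, constraints):
    """Return a CGI query string: free-text terms plus numeric constraints.

    constraints is a list of (operator_prefix, metadata_class, value) tuples,
    e.g. ("le", "price", 100000) becomes le_price=100000.
    """
    params = [("query", terms)]
    for prefix, metadata_class, value in constraints:
        params.append((f"{prefix}_{metadata_class}", value))
    # urlencode keeps the order of the list and escapes any unsafe characters.
    return urlencode(params)


query_string = build_query(
    "BMW",
    [
        ("le", "price", 100000),
        ("ge", "acceleration", 10),
        ("le", "acceleration", 20),
    ],
)
print(query_string)
# prints: query=BMW&le_price=100000&ge_acceleration=10&le_acceleration=20
```

Of the two example records, only the BMW model X100 (price 65800, acceleration 15.2) satisfies all three constraints.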

Caveats

See Also