Implementer training - Advanced metadata

Geospatial and numeric metadata

Recall that Funnelback supports five types of metadata classes:

  • Text: The content of this class is a string of text.

  • Geospatial x/y coordinate: The content of this field is a decimal lat/long value in the following format: geo-x;geo-y (e.g. 2.5233;-0.95). This type should only be used if there is a need to perform a geospatial search (e.g. this point is within X km of another point). If the geospatial coordinate is only required for plotting items on a map, then a text field is sufficient.

  • Number: The content of this field is a numeric value. Funnelback will interpret this as a number. This type should only be used if there is a need to use numeric operators when performing a search (e.g. X > 2050) or to sort the results in numeric order. If the field is only required for display within the search results, a text field is sufficient.

  • Document permissions: The content of this field is a security lock string defining the document permissions. This type should only be used when working with an enterprise collection that includes document level security.

  • Date: A single metadata class supports a date, which is used as the document’s date for the purpose of relevance and date sorting. Additional dates for the purpose of display can be indexed as either a text or number type metadata class, depending on how you wish to use the field.

Funnelback’s text metadata type is sufficient for indexing metadata in the majority of use cases.

The geospatial x/y coordinate and number metadata types are special metadata types that alter the way the indexed metadata value is interpreted, and provide type-specific methods for working with the indexed value.

Defining a field as a geospatial x/y coordinate tells Funnelback to interpret the contents of the field as a decimal lat/long coordinate (e.g. -31.95516;115.85766). This is used by Funnelback to assign a geospatial coordinate to an index item (effectively pinning it to a single point on a map). A geospatial metadata field is useful if you wish to add location-based search constraints (such as showing only items within a specified distance of a specified origin point), or to sort the results by proximity (closeness) to a specific point.
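
To make the idea of a distance constraint concrete, the following is a minimal Python sketch of a great-circle (haversine) distance check between two lat;long values. It is an illustration only: how Funnelback calculates distance internally is not described here, and the function names and the Earth radius constant are assumptions made for this example.

```python
import math

# Mean Earth radius in km. An assumption for this sketch; Funnelback's own
# distance calculation may use a different model.
EARTH_RADIUS_KM = 6371.0


def parse_coordinate(value):
    """Split a 'latitude;longitude' string into a pair of floats."""
    lat_text, lon_text = value.split(";")
    return float(lat_text), float(lon_text)


def km_between(origin, point):
    """Great-circle distance in km between two 'lat;long' strings (haversine)."""
    lat1, lon1 = parse_coordinate(origin)
    lat2, lon2 = parse_coordinate(point)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within(origin, point, max_km):
    """True if point lies no more than max_km from origin."""
    return km_between(origin, point) <= max_km
```

For example, within("48.8588336;2.2769957", "-31.95516;115.85766", 500) is False, because the second coordinate is far more than 500 km from the first.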

A geospatial x/y coordinate is not required if you just want to plot the item onto a map in the search results (a text type value will be fine as it’s just a text value you are passing to the mapping API service that will generate the map).

Defining a field as a number tells Funnelback to interpret the contents of the field as a number. This allows range and equality comparisons (==, !=, >=, >, <, <=) to be run against the field. Numeric metadata is only required if you wish to make use of these range comparisons. Numbers for the purpose of display in the search results should be defined as text type metadata.

Only use geospatial and numeric values if you wish to make use of the special type-specific query operators. Be careful when selecting your class names because these will be merged with the classes from other data sources that are included in the same search package.

Tutorial: Geospatial and numeric metadata

In this exercise we will extend the metadata that is extracted from the XML example. We will include both a geospatial metadata field and a numeric metadata field. Recall the record format for the XML data:

  <row>
    <Airport_ID>1</Airport_ID>
    <Name>Goroka</Name>
    <City>Goroka</City>
    <Country>Papua New Guinea</Country>
    <IATA_FAA>GKA</IATA_FAA>
    <ICAO>AYGA</ICAO>
    <Latitude>-6.081689</Latitude>
    <Longitude>145.391881</Longitude>
    <Altitude>5282</Altitude>
    <Timezone>10</Timezone>
    <DST>U</DST>
    <TZ>Pacific/Port_Moresby</TZ>
    <LATLONG>-6.081689;145.391881</LATLONG>
  </row>

The <LATLONG> field contains the geospatial metadata that will be associated with the item.

When working with geospatial metadata, Funnelback expects the field to contain a decimal X/Y coordinate in the format above (X coordinate;Y coordinate). If the field doesn’t match this format (e.g. it is delimited with a comma), or the X/Y values are supplied separately, you will need to clean the XML before Funnelback indexes it (or provide an additional field in the correct format within the source data).
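
As an illustration of the kind of cleaning described above, here is a minimal Python sketch that adds a <LATLONG> element built from separate <Latitude> and <Longitude> elements, and that converts a comma-delimited value into the semicolon-delimited form. It is a standalone example that runs outside Funnelback; the function names are hypothetical and it does not use any Funnelback filter API.

```python
import xml.etree.ElementTree as ET


def add_latlong(row):
    """Add (or overwrite) a <LATLONG> child built from <Latitude> and <Longitude>."""
    lat = row.findtext("Latitude")
    lon = row.findtext("Longitude")
    if lat is None or lon is None:
        return  # nothing to combine, leave the record untouched
    latlong = row.find("LATLONG")
    if latlong is None:
        latlong = ET.SubElement(row, "LATLONG")
    latlong.text = f"{lat.strip()};{lon.strip()}"


def normalise_delimiter(value):
    """Turn 'lat,long' (or 'lat;long') into 'lat;long', validating both parts."""
    parts = [part.strip() for part in value.replace(",", ";").split(";")]
    if len(parts) != 2:
        raise ValueError(f"expected two coordinates, got: {value!r}")
    for part in parts:
        float(part)  # raises ValueError if a part is not a decimal number
    return ";".join(parts)
```

Applied to a row containing only <Latitude>-6.081689</Latitude> and <Longitude>145.391881</Longitude>, add_latlong produces a <LATLONG> value of -6.081689;145.391881, matching the record format shown above.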

The <Altitude> field will be used as the source of numeric metadata for the purpose of this exercise.

  1. Log in to the search dashboard where you are doing your training.

    See: Training - search dashboard access information if you’re not sure how to access the training. Ignore this step if you’re treating this as a non-interactive tutorial.
  2. Locate the airports search package.

  3. Navigate to the manage data source screen for the airports data data source.

  4. Edit the metadata mappings. (Settings panel, configure metadata mappings).

  5. Modify the mapping for the <LATLONG> field to set the type as a geospatial coordinate.

    The <LATLONG> field was mapped previously, so edit the existing entry.
  6. Modify the mapping for the <Altitude> field to set the type as a number, then save the changes.

    The <Altitude> field was mapped previously, so edit the existing entry.
  7. Rebuild the index (as you have changed the metadata configuration). Reminder: Update panel, start advanced update, rebuild live index.

  8. Run a search for !showall and inspect the JSON, noting that kmFromOrigin elements now appear (because the results now contain geospatial metadata).

    The kmFromOrigin field returns the distance (in km) from an origin point, which is defined by passing in an origin parameter set to a geo-coordinate. It currently returns null because no origin has been defined.
  9. Return to the HTML results and add numeric constraints to the query to return only airports that are located at an altitude of at least 2000 ft and below 3000 ft: add &lt_altitude=3000&ge_altitude=2000 to the URL, observing that the number of matching results is reduced and that the altitudes of the matching results are all now between 2000 and 3000.

  10. Remove the numeric parameters and add an origin parameter to the URL: &origin=48.8588336;2.2769957 (this is the lat/long value for Paris, France). Observe that the kmFromOrigin field now contains values.

  11. Geospatial searches can be limited to a radius measured from the origin (in km). Supply an optional maxdist parameter and set this to 500 km by adding &maxdist=500 to the URL. Note that the number of results has dropped dramatically and that the remaining results are all airports within 500 km of Paris.

    When working with geospatial search you may want to consider setting the origin value by reading the location data from your web browser (which might be based on a mobile phone’s GPS coordinates, or on IP address location). Once you’ve read this value you can pass it to Funnelback along with the other search parameters.
  12. Edit the template to print out the kmFromOrigin value in the results. Add the following below the metadata that is printed in the result template (e.g. immediately before the </dl> tag at approximately line 595):

    <#if s.result.kmFromOrigin??>
    <dt>Distance from origin:</dt><dd>${s.result.kmFromOrigin} km</dd>
    </#if>
  13. Run the !showall search again and observe the distance is now returned in the results.

  14. Sort the results by proximity to the origin by adding &sort=prox and observe that the kmFromOrigin values are now sorted by distance (from nearest to farthest).

  15. Change the sort to be from farthest to nearest by setting the sort to dprox (descending proximity).
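
The URL parameters used in the steps above (lt_altitude, ge_altitude, origin, maxdist and sort) can also be assembled programmatically. The following Python sketch shows one way to do this. The base URL, the collection value and the function names are placeholders invented for this example, and the query and collection parameter names are assumed rather than taken from this exercise, so substitute the values from your own training environment.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical search endpoint; replace with your own search URL.
BASE_URL = "https://search.example.com/s/search.html"


def build_search_url(query, collection, origin=None, maxdist_km=None,
                     sort=None, altitude_min=None, altitude_max=None):
    """Build a search URL using the parameters from the tutorial steps."""
    params = {"collection": collection, "query": query}
    if altitude_min is not None:
        params["ge_altitude"] = altitude_min   # altitude >= altitude_min
    if altitude_max is not None:
        params["lt_altitude"] = altitude_max   # altitude < altitude_max
    if origin is not None:
        params["origin"] = origin              # "lat;long" string
    if maxdist_km is not None:
        params["maxdist"] = maxdist_km         # radius from origin, in km
    if sort is not None:
        params["sort"] = sort                  # e.g. "prox" or "dprox"
    return f"{BASE_URL}?{urlencode(params)}"


def query_params(url):
    """Decode the query string of a URL back into a dict of value lists."""
    return parse_qs(urlsplit(url).query)
```

For example, build_search_url("!showall", "placeholder-collection", origin="48.8588336;2.2769957", maxdist_km=500, sort="prox") produces a URL equivalent to the one built by hand in steps 10 to 14. Note that urlencode percent-encodes the ! and ; characters, which the server is expected to decode.
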
Extended exercises: Geospatial search and numeric metadata
  1. Modify the search box to include controls to set the origin using the browser’s location support and to adjust the maxdist. Hint: examine the advanced search form for an example.

  2. Add sort options to sort the results by proximity to the origin.

  3. Modify the search box to set the origin inside a hidden field.

  4. Set the origin parameter using a pre_process hook script. You won’t be able to do this until you have completed the SEARCH 202 training course. Hint: the maxdist and origin need to be set in the additionalParameters data model element.

  5. Modify the template to plot the search results onto a map. See: Using Funnelback search results to populate a map

  6. Add sort options to sort the results numerically by altitude. Observe that the sort order is numeric (1, 2, 10, 11). Update the metadata mappings so that altitude is a standard text metadata field and rebuild the live index. Refresh the search results and observe the sort order is now alphabetic (1, 10, 11, 2). This distinction is important if you have a metadata field that you need to sort numerically.
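
The difference between numeric and alphabetic ordering described in the last exercise can be reproduced outside Funnelback. This short Python sketch is illustrative only (the variable names are invented for the example) and shows the same values sorted and compared as text and as numbers.

```python
# The same altitude values, held as strings as they would be in text metadata.
altitudes = ["1", "2", "10", "11"]

# Sorted as text: compared character by character, so "10" comes before "2".
as_text = sorted(altitudes)

# Sorted as numbers: compared by numeric value.
as_numbers = sorted(altitudes, key=int)

# Range comparisons are affected in the same way.
text_comparison = "5282" < "600"              # True: "5" sorts before "6"
numeric_comparison = int("5282") < int("600")  # False: 5282 is larger than 600
```

Here as_text holds the alphabetic order (1, 10, 11, 2) and as_numbers holds the numeric order (1, 2, 10, 11), which is why a field that must be sorted or compared numerically needs the number type.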