Funnelback query language and search query parameters

Funnelback query language

Funnelback has a query language that enables a user to perform advanced searches and to refine searches in Funnelback using different query language operators (or special characters) that are entered into the search box (or passed to the query parameter).

Simple search query - no special characters

This is just a simple sequence of words entered into the search box.

julius caesar rome
  • Fully matching results include all the words.

  • Partially matching results include some (but not all) of the words.

Search for a phrase - double quotes

Search for exact matches to a phrase by enclosing the words in double quotes. The phrase is treated as a single search term.

Punctuation and special characters contained within a phrase are ignored.
  • Fully matching results include all the words in the order specified with no additional words in between.

"hail caesar"

Answers to this query will contain the exact phrase "hail caesar". No partial matches will occur.

Empty/default searches

By default, Funnelback must be given some search terms in order to return some search results.

However, there is a plugin that can be enabled that allows empty searches to be submitted. When enabled, an empty search will return all results but the plugin can also be configured to run a default search query.

Metadata searches

Documents may contain metadata, including the document’s author, title and when it was created. Funnelback can query this information using the syntax:

class:query

where class is the metadata class as defined in the search configuration.

The query:

city:barcelona

locates documents containing the word barcelona within city metadata class.

Special metadata searches

The special metadata searches below won’t work if you have set the -noifb indexer option.

Search for a metadata field containing a specific value

Metadata fields may contain multiple field values. Funnelback can find a metadata field with a specific value set in a list of values.

Funnelback uses field boundary operators ($++) to wrap the term that is being searched for.

CLASSNAME:"$++ field value $++"

The query:

city:"$++ York $++"

will match items with a city metadata field containing a value of 'york' but will not match a field containing 'new york'.

The metadata field can still contain multiple values, but one of the values must exactly match to be returned in the results.
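
The exact-value behavior described above can be sketched in Python. This is a toy model for illustration only, not Funnelback's implementation: the helper names exact_value_query and field_has_exact_value are hypothetical, and the case-insensitive comparison is an assumption based on the 'york' example.

```python
def exact_value_query(classname: str, value: str) -> str:
    """Wrap a value in field boundary operators ($++) for an exact field-value match."""
    return f'{classname}:"$++ {value} $++"'


def field_has_exact_value(field_values: list[str], wanted: str) -> bool:
    """Toy model: true only when one whole field value equals the wanted value."""
    target = wanted.strip().lower()
    return any(value.strip().lower() == target for value in field_values)


print(exact_value_query("city", "York"))
print(field_has_exact_value(["New York"], "york"))        # 'new york' is not an exact match
print(field_has_exact_value(["Leeds", "York"], "york"))   # one of several values matches
```

The second call shows why the boundary operators matter: a plain city:york query would also match a field containing 'new york'.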

Search for all items that have any value in a specific metadata class

When building the query from CGI parameters, use has_meta_CLASSNAME=true to run this type of query.

The CLASSNAME:$++ operator can be used as a value to find all search results that have a particular metadata class set to some value.

CLASSNAME:$++

The query:

city:$++

locates documents that have a city metadata class set to any value.

If you are passing this value via a CGI or hidden HTML parameter, make sure you URL encode (percent-encode) the value as %24%2B%2B, because a + character in a URL is interpreted as a space.
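
As a minimal sketch of the encoding step, Python's standard urllib.parse module produces the %24%2B%2B form. The query parameter name comes from this document; everything else is illustrative.

```python
from urllib.parse import quote, urlencode

# Percent-encode the field boundary operator on its own.
encoded_operator = quote("$++", safe="")

# Encode a complete query parameter; urlencode also escapes the ':' separator.
query_string = urlencode({"query": "city:$++"})

print(encoded_operator)
print(query_string)
```

Encoding the whole parameter value with a library call avoids having to remember which individual characters need escaping.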

Search for all items that are missing a metadata class

When building the query from CGI parameters, use has_meta_CLASSNAME=false to run this type of query.

The -CLASSNAME:$++ operator can be used as a value to find all search results that do not have a particular metadata class.

-CLASSNAME:$++

The query:

-city:$++

locates documents that do not include a city metadata value.

If you are passing this value via a CGI or hidden HTML parameter, make sure you URL encode (percent-encode) the $++ operator as %24%2B%2B, because a + character in a URL is interpreted as a space.

Search for items with word1 OR word2

To search for results that have word1 OR word2 enclose the words inside square brackets.

[mighty brave] national army
  • Fully matching results contain national, army and either brave or mighty.

  • Partially matching results contain any of the terms.

Search for items that exclude a specific word

Exclude any search results that contain a specific word by prefixing it with a dash/minus character.

julius caesar -antony
  • Fully matching results contain julius and caesar but only if they do not include antony.

  • Partially matching results contain either julius or caesar but only if they do not include antony.

Exclude any search results that contain a specific word from fully matching results only by prefixing it with an exclamation mark character.

julius caesar !antony
  • Fully matching results contain julius and caesar but only if they do not include antony.

  • Partially matching results contain either julius or caesar but these may also include the word antony.

Search for items that always include a specific word

Ensure a word is included in all results by prefixing it with a plus character.

rome octavius +antony +cleopatra
  • Fully matching results include the words rome, octavius, antony and cleopatra. Every result will contain the words antony and cleopatra.

  • Partially matching results include the words antony and cleopatra, and may include the words rome or octavius.
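
The way the plus, minus and exclamation mark operators affect full and partial matches can be sketched as a small classifier. This is a simplified model written for illustration, not Funnelback's matching logic: the function name classify is hypothetical, stemming and ranking are ignored, and treating a document blocked only by ! as a partial match is an assumption.

```python
def classify(query: str, document_words: set[str]) -> str:
    """Return "full", "partial" or "none" for one document (toy model only)."""
    words = {w.lower() for w in document_words}
    plain, required, excluded, soft_excluded = [], [], [], []
    for token in query.lower().split():
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            excluded.append(token[1:])
        elif token.startswith("!"):
            soft_excluded.append(token[1:])
        else:
            plain.append(token)

    # '-' terms remove a document from all results; '+' terms must be present.
    if any(t in words for t in excluded):
        return "none"
    if not all(t in words for t in required):
        return "none"

    matched = [t for t in plain if t in words]
    blocked_from_full = any(t in words for t in soft_excluded)
    if len(matched) == len(plain) and not blocked_from_full:
        return "full"
    # Assumption: a document blocked only by '!' still counts as a partial match.
    return "partial" if matched or required else "none"


print(classify("julius caesar -antony", {"julius", "caesar"}))
print(classify("julius caesar !antony", {"julius", "caesar", "antony"}))
print(classify("rome octavius +antony +cleopatra", {"rome", "antony", "cleopatra"}))
```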

Search within a set of items (advanced)

Advanced scoping operators can be used to filter down the result set before a search is run. This allows you to run a search (with partial matches) on a part of overall search index.

Scoped operators cause the index to be scoped and don’t count as query parameters. The scoping reduces the index to include only documents containing the scoped items, then the rest of the query is run on the scoped index. This means that the scoped operators don’t count towards partial matches and that other query parameters are required for results to be returned as the scoping just defines the overall set of pages that can be included in a result set.

Search within a set of items that always include all words in a set of specific words

Define the overall set of items to search within by prefixing the words that must be included with vertical bar characters.

rome octavius |antony |cleopatra
  • This query searches within the set of results that contain antony and cleopatra.

  • Fully matching results contain all four words.

  • Partially matching results contain antony and cleopatra and either rome or octavius but not both.

Search within a set of items that include some words of a set of words

Define the overall set of items to search within by enclosing the set of words that must be included in square brackets, prefixed with a vertical bar.

rome octavius |[antony cleopatra]
  • This query searches within the set of results that contain antony or cleopatra (or both words).

  • Fully matching results contain antony or cleopatra (or both words) and rome and octavius.

  • Partially matching results contain antony or cleopatra (or both words) and either rome or octavius but not both.

Search for words that appear close to other words

Search for words that appear close to other words by enclosing the words in back quotes or backticks. The phrase is treated as a single search term.

Results will include the words, in any order, within 15 words of each other.

`army march`
  • Fully matching results include the word army within 15 words of march (in any order).

Additional options can be set as query processor options (or CGI parameters) to adjust the number of words that can be included in the phrase (-phrase-prox-word-limit) and the maximum distance allowed between the words (-prox).
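
The proximity rule can be illustrated with a short Python check. This is only a sketch: the function name within_proximity is hypothetical, whitespace tokenisation is a simplification, and interpreting "within 15 words" as a position difference of at most 15 is an assumption.

```python
def within_proximity(text: str, first: str, second: str, limit: int = 15) -> bool:
    """Toy check: do the two words occur within `limit` positions of each other, in any order?"""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= limit for i in first_positions for j in second_positions)


near = "the army began its long march north"
far = "army " + "filler " * 20 + "march"

print(within_proximity(near, "army", "march"))   # words are 5 positions apart
print(within_proximity(far, "army", "march"))    # words are 21 positions apart
```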

Increase the importance of certain words in a query

The up-weight operator (tilde) will increase the score of documents that match the element without making this element a constraint. For example:

computer science course ~concurrency
  • Fully matching results include the words computer, science and course. Results that have the term concurrency will be upweighted in the search results.

The influence of the upweighting can be set by appending a number between 0 and 1 using a caret operator. For example:

computer science course ~concurrency^0.9

will push up results with the word concurrency more than

computer science course ~concurrency^0.6

Restrict the search to results within a certain date range

Date queries constrain the result set to documents that were modified/created during a specified time period. For date querying purposes, Funnelback only records one date per document. The value for this is determined when the document is indexed and follows the date precedence order.

d=1jan1600

This returns documents that were modified/created on the 1st of January 1600.

d<1jan1600

This returns documents that were modified/created before the 1st of January 1600.

d=1jan1600 d<1jan1600

This returns documents that were modified/created on or before the 1st of January 1600.

d>1jan1600

This returns documents that were modified/created after the 1st of January 1600.

d=1jan1600 d>1jan1600

This returns documents that were modified/created on or after the 1st of January 1600.

Search for plurals and other stemmed equivalents

By default Funnelback stems words in both the query and in the index. This means that searching for economic will also match economics.

The stemming behavior can be reversed for specific words by appending a hash character to the end of the word.

When stemming is enabled (this is the default) adding a hash character will ensure that stemming is disabled for the word.

Searching for

economic# policy

will ensure that the word economic is not stemmed (so you won’t receive matches in the results for economics).

When stemming is disabled in the query processor options (stem=0), adding a hash character to a word will enable stemming for the specific word.

Searching for

economic# policy

will match:

  • economic policy

  • economics policy

Wildcard searching

Limited wildcard searching is supported when the query language - wildcard (truncation) support plugin is enabled.

The plugin supports searching for words that start with a particular string.

anti*

This example pattern matches all pages containing words starting with anti, such as antium and antioch.

When using the plugin, the truncation operator (*) can only appear at the right of the string and can only be used to find words that start with the string. It does not support words that end in the string (as was supported in the legacy wildcard searching).

Legacy wildcard searching

This functionality is deprecated and only works when Funnelback is running in legacy Term At A Time (TAAT) mode.
  • The truncation operator (*) is supported in term at a time mode, only when either the -service_volume=low or -daat=0 query processor option is set.

  • Enabling term at a time mode disables many newer Funnelback features.

  • Truncation is an expensive operation and can significantly impact the search response time.

The truncation operator matches pages containing words that contain variants of the query term matching the wildcard.

anti*

This example pattern matches all pages containing words starting with anti, such as antium and antioch. Be careful, there are almost always more matching words than you expect, resulting in more matching pages.

The truncation operator can appear at the left, at the right or both, but NOT in the middle of the string.

*och*

This example pattern matches all pages with words containing the string och, such as antioch and rochester.

The truncation operator is equivalent to the query_trunc CGI parameter. e.g. query_trunc=och

Social media hash-tags and user mentions

Searching for social media hash-tags and user mentions within content is supported by enabling the social tags plugin.

When the social tags plugin is enabled, the following additional query language operators are supported:

Preceding a term with the hash (#) operator matches hash-tags. e.g. search for occurrences of the #funnelback hash-tag:

#funnelback

Preceding a term with an at symbol (@) matches user mentions. e.g. search for occurrences of the @jsmith user:

@jsmith

Result collapsing

When result collapsing (grouping) is configured you will see a special operator ?= appear within the Funnelback JSON responses that applies the result collapsing to the query.

The operator is ?= followed by a string of letters and numbers (the result collapsing signature). e.g.

?=9BC56EE9E1630C9D
This is included for completeness and is not something you ever need to set yourself when running a query.

Complex examples

Mixed operators

The different types of queries above can be mixed and matched to get your desired outcome.

For example, the query:

t:`war castle` |england

mixes the following operators:

  • searching for the words war and castle within 15 words of each other when found inside metadata titles

  • searching within the set of results that include the word england

Search CGI parameters and query language equivalents

Query CGI parameters

The following CGI parameters are available to facilitate advanced search forms. Values passed via these special parameters are translated into the Funnelback query language.

For example, passing in the CGI parameter query_prox=romeo+juliet results in the following being appended to the query that is processed: `romeo juliet`, which runs a query that looks for the word romeo near the word juliet.

The full list of query parameters is:

CGI parameter               Query language expression   Description
query=term1+term2           term1 term2                 General query terms (can include anything in the query language)
query_and=term1+term2       +term1 +term2               AND the terms
query_not=term1+term2       -term1 -term2               Negate the terms
query_or=term1+term2        [term1 term2]               OR the terms
query_orsand=term1+term2    |[term1 term2]              OR with scoped AND
query_phrase=term1+term2    "term1 term2"               Phrase
query_prox=term1+term2      `term1 term2`               Proximity
query_sand=term1+term2      |term1 |term2               Scoped AND
query_trunc=term1+term2     *term1* *term2*             Word truncation

Note: query will accept a set of words that are specified in the query language. For example you could pass in query=-economics course which would search for anything that included the word course but not the word economics. This would be equivalent to supplying query=course&query_not=economics
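
The translation from query CGI parameters to query language expressions can be approximated in a few lines of Python. This is a sketch of the mapping documented above, not Funnelback's own code: the function name translate is hypothetical, and values are assumed to be already URL-decoded (so each + has become a space).

```python
def translate(params: dict[str, str]) -> str:
    """Approximate the query language expression for the query_* CGI parameters."""
    parts = []
    for name, value in params.items():
        terms = value.split()
        if name == "query":
            parts.append(value)
        elif name == "query_and":
            parts.extend(f"+{t}" for t in terms)
        elif name == "query_not":
            parts.extend(f"-{t}" for t in terms)
        elif name == "query_or":
            parts.append("[" + " ".join(terms) + "]")
        elif name == "query_orsand":
            parts.append("|[" + " ".join(terms) + "]")
        elif name == "query_phrase":
            parts.append('"' + " ".join(terms) + '"')
        elif name == "query_prox":
            parts.append("`" + " ".join(terms) + "`")
        elif name == "query_sand":
            parts.extend(f"|{t}" for t in terms)
        elif name == "query_trunc":
            parts.extend(f"*{t}*" for t in terms)
    return " ".join(parts)


print(translate({"query": "course", "query_not": "economics"}))
print(translate({"query_prox": "romeo juliet"}))
```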

Metadata CGI query parameters

Funnelback includes a number of query language operators and CGI parameters that can be used to search a text type metadata field.

CGI parameter                         Query language expression              Description
meta_CLASSNAME=value                  CLASSNAME:value                        Matching results will contain the term value within the CLASSNAME class.
meta_CLASSNAME_and=value1+value2      +CLASSNAME:value1 +CLASSNAME:value2    Matching results will contain the terms value1 AND value2 within the CLASSNAME class.
meta_CLASSNAME_or=value1+value2       [CLASSNAME:value1 CLASSNAME:value2]    Matching results will contain value1 OR value2 within the CLASSNAME class.
meta_CLASSNAME_not=value1+value2      -CLASSNAME:value1 -CLASSNAME:value2    Matching results will contain neither value1 nor value2 within the CLASSNAME class.
meta_CLASSNAME_sand=value1+value2     |CLASSNAME:value1 |CLASSNAME:value2    The result set will be scoped to items containing value1 AND value2 within the CLASSNAME class before other query constraints are applied. Partially matching results will always include both of these terms in the CLASSNAME class.
meta_CLASSNAME_orsand=value1+value2   |[CLASSNAME:value1 CLASSNAME:value2]   The result set will be scoped to items containing value1 OR value2 within the CLASSNAME class before other query constraints are applied. Partially matching results will always include either or both of these terms in the CLASSNAME class.
meta_CLASSNAME_phrase=value1+value2   CLASSNAME:"value1 value2"              Matching results will contain the phrase "value1 value2" within the CLASSNAME class.
meta_CLASSNAME_prox=value1+value2     CLASSNAME:`value1 value2`              Matching results will contain value1 and value2 within 15 words of each other within the CLASSNAME class.
has_meta_CLASSNAME=true               CLASSNAME:$++                          Matching results contain metadata class CLASSNAME, set to any value.
has_meta_CLASSNAME=false              -CLASSNAME:$++                         Matching results do not contain metadata class CLASSNAME.
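
The metadata parameter suffixes follow a regular pattern that a lookup table can capture. The sketch below is illustrative only: meta_expression is a hypothetical helper, the suffix argument is the part of the parameter name after meta_CLASSNAME_ (empty for plain meta_CLASSNAME), and values are assumed to be already split on spaces.

```python
def meta_expression(classname: str, suffix: str, values: list[str]) -> str:
    """Build the query language expression for a meta_CLASSNAME[_suffix] parameter."""
    terms = [f"{classname}:{v}" for v in values]
    joined = " ".join(values)
    builders = {
        "": lambda: " ".join(terms),
        "and": lambda: " ".join(f"+{t}" for t in terms),
        "or": lambda: "[" + " ".join(terms) + "]",
        "not": lambda: " ".join(f"-{t}" for t in terms),
        "sand": lambda: " ".join(f"|{t}" for t in terms),
        "orsand": lambda: "|[" + " ".join(terms) + "]",
        "phrase": lambda: f'{classname}:"{joined}"',
        "prox": lambda: f"{classname}:`{joined}`",
    }
    return builders[suffix]()


print(meta_expression("city", "sand", ["london", "paris"]))
print(meta_expression("city", "phrase", ["new", "york"]))
```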

Date metadata CGI parameters

A number of special date parameters are supported via CGI parameters and the query language.

Dates must be specified in DMMMYYYY format, e.g. 1Jan2015, 5Sep2001.

CGI parameter       Query language operator   Description
meta_d=1Jan2015     d=1Jan2015                Exact match to the specified date.
meta_d1=1Jan2015    d>1Jan2015                Matches all dates greater than the supplied date (after).
meta_d2=1Jan2015    d<1Jan2015                Matches all dates less than the supplied date (before).
meta_d3=1Jan2015    d=1Jan2015 d>1Jan2015     Matches all dates greater than or equal to the supplied date (from).
meta_d4=1Jan2015    d=1Jan2015 d<1Jan2015     Matches all dates less than or equal to the supplied date (to).

Parameters can be combined to create date range queries. e.g. the query below would match results with dates after 28th July, 1914 and before 11th November, 1918:

meta_d1=28Jul1914&meta_d2=11Nov1918
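
A date range like the one above can be generated from Python date objects. This is a sketch only: funnelback_date and date_range_params are hypothetical helper names, and the month abbreviations are spelled out explicitly rather than taken from strftime, whose output depends on the locale.

```python
from datetime import date
from urllib.parse import urlencode

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def funnelback_date(d: date) -> str:
    """Format a date as DMMMYYYY, e.g. 1Jan2015 (no zero padding on the day)."""
    return f"{d.day}{MONTHS[d.month - 1]}{d.year}"


def date_range_params(after: date, before: date) -> str:
    """Build meta_d1 (after) and meta_d2 (before) parameters for a date range."""
    return urlencode({
        "meta_d1": funnelback_date(after),
        "meta_d2": funnelback_date(before),
    })


print(date_range_params(date(1914, 7, 28), date(1918, 11, 11)))
```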

Additional day, month and year variants are available for each of the above CGI parameters to facilitate easy form integration. Append one of the following suffixes to the parameter name:

  • day

  • month

  • year

The example below would match results with dates matching 25th April 1915:

meta_dday=25 meta_dmonth=Apr meta_dyear=1915

The example below would match results with dates from 1st September, 1939 to 2nd September, 1945:

meta_d3day=01 meta_d3month=Sep meta_d3year=1939 meta_d4day=02 meta_d4month=Sep meta_d4year=1945

Note:

  • d3 and d4 require all three components (day, month and year) to be provided

  • d, d1 and d2 do not require all three components. e.g. just the year could be specified.

Results can also be sorted by date using the sort=date or sort=adate parameters. See: sort options for more information on sorting search results.

Numeric metadata CGI parameters

Numeric fields can be queried using CGI parameters.

There are no equivalent query language operators for numeric metadata search.

The CGI parameters are:

CGI parameter    Value type    Description
lt_CLASS         float         Performs a "Less than" operation on metadata class
le_CLASS         float         Performs a "Less than or equals" operation on metadata class
gt_CLASS         float         Performs a "Greater than" operation on metadata class
ge_CLASS         float         Performs a "Greater than or equals" operation on metadata class
eq_CLASS         float         Performs an "Equals" operation on metadata class
ne_CLASS         float         Performs a "Not Equals" operation on metadata class

The CGI parameters currently work only as scoping operators. There must be a query to define a result set which is then scoped by lt_x etc. If there is no query there will be no results.
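
Because the numeric parameters only scope an existing result set, a request builder can refuse to run without a query. The sketch below assumes a hypothetical numeric metadata class named price and a hypothetical helper name numeric_range_params; only the ge_ and le_ prefixes and the query parameter come from this document.

```python
from urllib.parse import urlencode


def numeric_range_params(query: str, classname: str, minimum: float, maximum: float) -> str:
    """Build a query string that scopes results to minimum <= CLASS <= maximum."""
    if not query.strip():
        # Numeric parameters only scope an existing result set, so a query is required.
        raise ValueError("a query is required when using numeric metadata parameters")
    return urlencode({
        "query": query,
        f"ge_{classname}": minimum,
        f"le_{classname}": maximum,
    })


print(numeric_range_params("laptop", "price", 500, 1000.5))
```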

Results can also be sorted by numeric metadata using the sort=metaCLASSNAME or sort=dmetaCLASSNAME parameters. See: sort options for more information on sorting search results.

Geospatial metadata CGI parameters

A number of geospatial CGI parameters are available when searching geospatial metadata. These parameters can be used to scope the search to items with a geospatial coordinate within a specific distance of an origin point.

This allows for a show results near me search when used in conjunction with a user’s GPS or browser-derived location coordinates.

CGI parameter       Description
origin=X,Y          Specifies a coordinate (formatted as x,y e.g. origin=24.543,-2.331) that will be used as the reference point for geospatial calculations.
maxdist=DISTANCE    Can be used to restrict a search to a DISTANCE (in km) from the origin. (e.g. maxdist=20)
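
A "results near me" request can be assembled as shown in this sketch. The helper name nearby_params is hypothetical, and the coordinates would normally come from the user's device or browser. The comma is kept unescaped so the value reads as x,y, on the assumption that the server accepts it in that form.

```python
from urllib.parse import urlencode


def nearby_params(query: str, x: float, y: float, max_km: float) -> str:
    """Build origin and maxdist parameters to scope a search around a coordinate."""
    return urlencode(
        {"query": query, "origin": f"{x},{y}", "maxdist": max_km},
        safe=",",  # keep the comma between the two coordinate values readable
    )


print(nearby_params("cafe", 24.543, -2.331, 20))
```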

Results can also be sorted by proximity to the origin point using the sort=prox or sort=dprox parameters. See: sort options for more information on sorting search results.