Result collapsing

Result collapsing groups similar results together so that they are displayed as a single entry on the search results page. Results are considered similar when:

  • Their content is identical, or nearly identical.

  • They share one or more identical metadata field values.

The list of fields to consider for similarity is controlled by the indexing.collapse_fields setting.

Workflow

Every time the data source is updated, a signature file is generated. This file contains one or more signatures for each document, depending on the list of fields that has been configured. The query processor then uses this signature file at query time to decide which similar results to collapse.

Because the signature file is generated at indexing time, any change to the indexing.collapse_fields setting requires a re-index of the data source to take effect.

Presentation

At query time the query processor detects which results should be collapsed together. The most relevant result for the current query constraints is chosen as the main result, and the other similar results are collapsed with it. Collapsing is enabled with the -collapsing=on query processor option. This can be specified as a setting in the results page’s profile.cfg file, or as a per-request CGI parameter (e.g. collapsing=on).
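To make the grouping step concrete, here is a minimal Python sketch of how ranked results might be collapsed on a signature. It is an illustration only, not the query processor's implementation: the `collapse` function, the dictionary layout and the sample job data are all assumptions made for this example.

```python
def collapse(ranked_results, sig_key):
    """Group results (already sorted by relevance) that share a signature.

    The first result seen for a signature becomes the main result; later
    results with the same signature are collapsed under it.
    """
    groups = {}
    for position, result in enumerate(ranked_results):
        signature = result["signatures"].get(sig_key)
        # A result with no signature for this key is kept on its own.
        group_key = signature if signature is not None else ("no-signature", position)
        if group_key not in groups:
            groups[group_key] = {"main": result, "collapsed": []}
        else:
            groups[group_key]["collapsed"].append(result)
    # Dicts preserve insertion order, so groups stay in relevance order.
    return list(groups.values())


# Hypothetical job offers, most relevant first.
results = [
    {"title": "Nurse", "signatures": {"[a]": "acme", "[X]": "VIC"}},
    {"title": "Driver", "signatures": {"[a]": "globex", "[X]": "VIC"}},
    {"title": "Cleaner", "signatures": {"[a]": "acme", "[X]": "NSW"}},
]

by_employer = collapse(results, "[a]")
by_state = collapse(results, "[X]")
```

In this sketch, collapsing on `[a]` places "Cleaner" under "Nurse" (same employer), while collapsing on `[X]` places "Driver" under "Nurse" (same state).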

The display of collapsed results can be controlled from the search form by using custom FreeMarker tags and inspecting the Data Model. Display options range from simply displaying the number of collapsed results next to the main result, to displaying a simplified view of each collapsed result as a "sub-result" of the main one. Display options can be controlled with the -collapsing_sig, -collapsing_num_ranks and -collapsing_SF query processor options.

To set up result collapsing on your results page, please follow the instructions below. As an example, we will be considering a collection containing job offers, on which:

  • The X metadata field is mapped to the state where the job is advertised.

  • The a metadata field is mapped to the employer offering the job.

This guide will explain how to configure the results page so that results can be collapsed on their content similarity, by state, or by employer.

Configure the data source

Enabling result collapsing is a two-step process. It requires configuration of any data sources to generate the collapsing indexes, and results page configuration to display the collapsed results.

Browse to the manage data source screen for each data source that needs result collapsing. Edit the data source configuration and make the following changes:

  • Set indexing.collapse_fields to [$],[a],[X]. This will generate a signature file based on the document content and on the a and X metadata classes.

  • Update or re-index the data source so that the signature file gets generated.

Configure the results page

Browse to the manage results page screen and edit the results page configuration, making the following changes:

  • Add -collapsing=on to the query_processor_options setting. This will enable result collapsing at query time.

  • Set the relevant display query_processor_options (e.g. -collapsing_sig, -collapsing_num_ranks and -collapsing_SF) on the results page.

Configure the template

Collapsed results can be displayed with the <@fb.Collapsed /> tag. In its simplest form this tag just displays the number of collapsed results with a link to access them:

Collapsing-UI-Simple.png

Query the results page

Result collapsing was enabled in the previous steps and should now be active. By default, however, results are collapsed on the similarity of their content. To collapse results on a specific metadata field, use the collapsing_sig parameter, either as a CGI parameter (http://server/s/search?collection=...&collapsing_sig=[a]) or as a query processor option (-collapsing_sig=[a]).
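As an illustration of the CGI form, the following Python sketch builds a search URL that requests collapsing on a given signature. The `search_url` helper is hypothetical, and `example-jobs` is a placeholder collection ID chosen for this example (the collection value is elided in the URL above). Note that `urlencode` percent-encodes the square brackets, which is the standard encoding for them in a query string.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(collection, collapsing_sig, base="http://server/s/search"):
    """Build a search URL with collapsing enabled for one signature."""
    params = {
        "collection": collection,
        "collapsing": "on",
        "collapsing_sig": collapsing_sig,
    }
    # urlencode escapes reserved characters such as "[" and "]".
    return base + "?" + urlencode(params)


# "example-jobs" is a placeholder, not a real collection ID.
url = search_url("example-jobs", "[a]")

# Decoding the query string recovers the original parameter values.
decoded = parse_qs(urlsplit(url).query)
```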

With collapsing_sig set to [a], one job offer from the same employer is collapsed with our example result:

Collapsing-UI-Simple.png

With collapsing_sig set to [X], six job offers in the same state are collapsed with our example result:

Collapsing-UI-More.png

Use different labels for different metadata fields

The <@fb.Collapsed /> tag can be configured to use a different label depending on which metadata field is used for collapsing:

<@fb.Collapsed labels={ "X": "{0} results in the same state", "a": "{0} results from the same employer"} />

When collapsing on [a]:

Collapsing-UI-employer.png

…and on [X]:

Collapsing-UI-state.png
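The label selection described above can be pictured with a short Python sketch. This is only an approximation of the idea, assuming the label is looked up by field name and that {0} is replaced with the number of collapsed results. The `collapsed_label` helper and the `DEFAULT_LABEL` fallback text are hypothetical and are not part of the <@fb.Collapsed /> tag.

```python
# Hypothetical fallback used when no label matches the collapsing field.
DEFAULT_LABEL = "{0} similar results"


def collapsed_label(labels, collapsing_sig, count):
    """Pick the label for the active signature and fill in the count."""
    # "[X]" -> "X": the labels map is keyed by field name, without brackets.
    field = collapsing_sig.strip("[]")
    template = labels.get(field, DEFAULT_LABEL)
    return template.format(count)


labels = {
    "X": "{0} results in the same state",
    "a": "{0} results from the same employer",
}

state_label = collapsed_label(labels, "[X]", 6)
employer_label = collapsed_label(labels, "[a]", 1)
```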

Display each collapsed result

By default, a link is generated to access the collapsed results. This link uses a special query syntax to return all the documents sharing the same signature.

The form can also be configured to directly display each collapsed result. The number of results to show is controlled by the -collapsing_num_ranks query processor option, and the metadata fields to show are controlled by -collapsing_SF.

Edit the results page configuration and set the following query processor options: -collapsing_num_ranks=2 -collapsing_SF=[a,X].

Then, in your template, add the following snippet after the <@fb.Collapsed /> tag:

<#if s.result.collapsed??>
  <#list s.result.collapsed.results as r>
   <p><a href="${r.indexUrl?html}">${r.title}</a> by ${r.listMetadata["a"]?first!} in ${r.listMetadata["X"]?first!}</p>
  </#list>
</#if>

This will cause the first two collapsed results to be displayed. For each result, its title, employer (a) and state (X) will be shown.

When collapsing on [a]:

Collapsing-UI-Complex-employer.png

…and on [X]:

Collapsing-UI-Complex-state.png

Note that even if there are six collapsed results, only the first two will be shown because of -collapsing_num_ranks=2.
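The combined effect of -collapsing_num_ranks and -collapsing_SF can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual data model: the `collapsed_view` function, the dictionary layout, the sample offers and the extra `d` field are all invented for the example.

```python
def collapsed_view(collapsed_results, num_ranks, summary_fields):
    """Keep the total count, but expose only the first num_ranks results
    and only the requested metadata fields for each of them."""
    shown = collapsed_results[:num_ranks]
    return {
        "count": len(collapsed_results),
        "results": [
            {
                "title": result["title"],
                "metadata": {
                    field: result["metadata"][field]
                    for field in summary_fields
                    if field in result["metadata"]
                },
            }
            for result in shown
        ],
    }


# Six hypothetical collapsed job offers; "d" is an extra field not requested.
offers = [
    {"title": f"Job offer {n}", "metadata": {"a": f"Employer {n}", "X": "VIC", "d": "2020"}}
    for n in range(1, 7)
]

view = collapsed_view(offers, num_ranks=2, summary_fields=["a", "X"])
```

Here `view["count"]` still reports all six collapsed results, while `view["results"]` holds only the first two, each limited to the a and X fields.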

Advanced usage

The signature file can be configured to combine multiple fields together. For example, setting indexing.collapse_fields=[a],[a,X],[X,Y,Z] will generate three different signatures:

  • A signature on the sole a field value,

  • A signature on the concatenation of the a and X field values,

  • A signature on the concatenation of the X, Y and Z field values.

The -collapsing_sig parameter is then used in a similar fashion to collapse results on those combinations: -collapsing_sig=[a], -collapsing_sig=[a,X], -collapsing_sig=[X,Y,Z].
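To illustrate how such a field specification could translate into per-document signatures, here is a hedged Python sketch. The actual signature algorithm is not described in this guide, so the hashing scheme (SHA-1 over field values joined with a separator) is purely an assumption, as are the `parse_collapse_fields` and `signatures_for` helpers and the sample documents. The sketch also ignores the [$] content signature, which relies on near-duplicate detection rather than exact field values.

```python
import hashlib
import re


def parse_collapse_fields(spec):
    """Split "[a],[a,X],[X,Y,Z]" into [["a"], ["a", "X"], ["X", "Y", "Z"]]."""
    return [group.split(",") for group in re.findall(r"\[([^\]]+)\]", spec)]


def signatures_for(metadata, spec):
    """Compute one signature per field combination for a single document."""
    signatures = {}
    for fields in parse_collapse_fields(spec):
        # Join values with a separator so ("ab", "c") differs from ("a", "bc").
        joined = "\x1f".join(metadata.get(field, "") for field in fields)
        key = "[" + ",".join(fields) + "]"
        signatures[key] = hashlib.sha1(joined.encode("utf-8")).hexdigest()
    return signatures


spec = "[a],[a,X],[X,Y,Z]"

# Two hypothetical job offers: same employer, different states.
doc1 = signatures_for({"a": "Acme", "X": "VIC"}, spec)
doc2 = signatures_for({"a": "Acme", "X": "NSW"}, spec)
```

In this sketch the two documents share the `[a]` signature, so they would collapse with -collapsing_sig=[a], but their `[a,X]` signatures differ, so they would stay separate with -collapsing_sig=[a,X].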