Writing search lifecycle code to manipulate the data model

Modifications to the search query and response can be implemented by writing code that modifies the data model.

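Funnelback supports two methods of achieving this: implementing the SearchLifeCyclePlugin interface in a plugin, or writing Groovy hook scripts. Each pre-post phase step corresponds to one interface method and one hook script: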

| Pre-post phase step | Plugin - SearchLifeCyclePlugin interface method | Hook script |
| --- | --- | --- |
| pre-process | void preProcess(SearchTransaction transaction) | hook_pre_process.groovy |
| pre-datafetch | void preDatafetch(SearchTransaction transaction) | hook_pre_datafetch.groovy |
| post-datafetch | void postDatafetch(SearchTransaction transaction) | hook_post_datafetch.groovy |
| post-process | void postProcess(SearchTransaction transaction) | hook_post_process.groovy |

Plugins

Plugins are the preferred method of writing code to manipulate the data model.

Search lifecycle code implemented via a plugin is written in Java. All lifecycle modifications are made in a single Java class by implementing the four methods that correspond to the individual pre-post phase steps.
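
As a rough sketch of that single-class structure, the example below implements all four steps in one place. The SearchTransaction and SearchLifeCyclePlugin types declared here are simplified stand-ins so that the example compiles on its own; only the four method signatures come from this page, and a real plugin would use the Funnelback types instead. The ExampleLifecyclePlugin class and the stepsRun field are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Funnelback's SearchTransaction (assumption, for illustration only).
class SearchTransaction {
    // Hypothetical field: records which steps have run, so the demo has something to show.
    final List<String> stepsRun = new ArrayList<>();
}

// Stand-in for the plugin interface, declaring the four documented methods.
interface SearchLifeCyclePlugin {
    void preProcess(SearchTransaction transaction);
    void preDatafetch(SearchTransaction transaction);
    void postDatafetch(SearchTransaction transaction);
    void postProcess(SearchTransaction transaction);
}

// Hypothetical plugin class: all four pre-post phase steps live in one place.
class ExampleLifecyclePlugin implements SearchLifeCyclePlugin {
    @Override
    public void preProcess(SearchTransaction transaction) {
        transaction.stepsRun.add("pre-process");
    }

    @Override
    public void preDatafetch(SearchTransaction transaction) {
        transaction.stepsRun.add("pre-datafetch");
    }

    @Override
    public void postDatafetch(SearchTransaction transaction) {
        transaction.stepsRun.add("post-datafetch");
    }

    @Override
    public void postProcess(SearchTransaction transaction) {
        transaction.stepsRun.add("post-process");
    }
}

class PluginSketch {
    public static void main(String[] args) {
        SearchTransaction transaction = new SearchTransaction();
        SearchLifeCyclePlugin plugin = new ExampleLifecyclePlugin();
        // Funnelback would invoke each method at the matching step; here they are called by hand.
        plugin.preProcess(transaction);
        plugin.preDatafetch(transaction);
        plugin.postDatafetch(transaction);
        plugin.postProcess(transaction);
        System.out.println(transaction.stepsRun);
    }
}
```

Steps that need no custom logic can simply be left with an empty method body.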

Hook scripts

Hook scripts are not available when using Funnelback Cloud.

Search lifecycle code implemented via hook scripts is written in Groovy. There is one Groovy hook script for each pre-post phase step.

General concepts

Collection, profile and query parameters

The collection, profile and query for the search question can be read from several places within the data model (such as the various input parameter maps). Read them from the following elements, which are always defined:

  • Collection: transaction.question.collection.id

  • Profile: transaction.question.profile

  • Query: transaction.question.query
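
In Java, the property paths above would normally be read through getters. The sketch below assumes getter names that follow the usual JavaBeans mapping (getCollection().getId(), getProfile(), getQuery()); the Collection, SearchQuestion and SearchTransaction classes are simplified stand-ins so the example runs on its own, and the describe helper and sample values are hypothetical.

```java
// Simplified stand-ins for the Funnelback data model (assumptions, for illustration only).
class Collection {
    private final String id;
    Collection(String id) { this.id = id; }
    String getId() { return id; }
}

class SearchQuestion {
    private final Collection collection;
    private final String profile;
    private final String query;

    SearchQuestion(Collection collection, String profile, String query) {
        this.collection = collection;
        this.profile = profile;
        this.query = query;
    }

    Collection getCollection() { return collection; }
    String getProfile() { return profile; }
    String getQuery() { return query; }
}

class SearchTransaction {
    private final SearchQuestion question;
    SearchTransaction(SearchQuestion question) { this.question = question; }
    SearchQuestion getQuestion() { return question; }
}

class RequestSummary {
    // Hypothetical helper: reads the three values from the always-defined elements.
    static String describe(SearchTransaction transaction) {
        SearchQuestion question = transaction.getQuestion();
        return "collection=" + question.getCollection().getId()
            + " profile=" + question.getProfile()
            + " query=" + question.getQuery();
    }

    public static void main(String[] args) {
        SearchQuestion question =
            new SearchQuestion(new Collection("example-web"), "_default", "opening hours");
        System.out.println(describe(new SearchTransaction(question)));
    }
}
```

In a Groovy hook script the same values are available directly through the property paths listed above.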

Search types

By default, search lifecycle pre/post code runs on all search requests, including those run by the content auditor and accessibility auditor, as well as extra searches.

Each of these searches has a particular search question type that indicates the type of search that is running.

Code can be made conditional on the search question type so that it only affects the data model question and response for specific types of searches.

Obtaining the search question type

The following Java code returns the question type of the current search. It can be used in both plugins and Groovy hook scripts.

// Import the enum that defines the question types
import com.funnelback.publicui.search.model.transaction.SearchQuestion.SearchQuestionType;

...

// Get the question type of the current search
SearchQuestionType questionType = transaction.getQuestion().getQuestionType();

Possible values for the question type are:

| Question type | Description |
| --- | --- |
| SearchQuestionType.SEARCH | A search query submitted to the HTML, XML, JSON endpoints (e.g. search.html, search.json) |
| SearchQuestionType.SEARCH_GET_ALL_RESULTS | A search query submitted to the all-results endpoint. |
| SearchQuestionType.EXTRA_SEARCH | An extra search configured on a collection. |
| SearchQuestionType.CONTENT_AUDITOR | A content auditor query |
| SearchQuestionType.CONTENT_AUDITOR_DUPLICATES | A content auditor duplicates query |
| SearchQuestionType.ACCESSIBILITY_AUDITOR | An accessibility auditor query |
| SearchQuestionType.ACCESSIBILITY_AUDITOR_ACKNOWLEDGEMENT_COUNTS | Accessibility auditor query to determine acknowledgement counts |
| SearchQuestionType.ACCESSIBILITY_AUDITOR_GET_ALL_RESULTS | Accessibility auditor all-results query |
| SearchQuestionType.FACETED_NAVIGATION_EXTRA_SEARCH | Built-in extra search used to obtain faceted navigation information |

Restricting hook script code to specific search types

The question type can be used in a conditional statement so that pre-post code only runs for specific search types. For example, to run only for CONTENT_AUDITOR searches:

// Only run this for content auditor queries
if (questionType == SearchQuestionType.CONTENT_AUDITOR) {
    // Whatever you want to do...
}
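
When several question types need the same treatment, one possible approach is to group them in a set and test membership. The sketch below is an illustration only: the enum is a local stand-in that copies the documented values so the example runs without Funnelback, and the AUDITOR_TYPES grouping and shouldRun helper are hypothetical names.

```java
import java.util.EnumSet;
import java.util.Set;

// Local stand-in copying the documented question types (the real enum is
// com.funnelback.publicui.search.model.transaction.SearchQuestion.SearchQuestionType).
enum SearchQuestionType {
    SEARCH,
    SEARCH_GET_ALL_RESULTS,
    EXTRA_SEARCH,
    CONTENT_AUDITOR,
    CONTENT_AUDITOR_DUPLICATES,
    ACCESSIBILITY_AUDITOR,
    ACCESSIBILITY_AUDITOR_ACKNOWLEDGEMENT_COUNTS,
    ACCESSIBILITY_AUDITOR_GET_ALL_RESULTS,
    FACETED_NAVIGATION_EXTRA_SEARCH
}

class QuestionTypeFilter {
    // Hypothetical grouping: every auditor-related question type.
    static final Set<SearchQuestionType> AUDITOR_TYPES = EnumSet.of(
        SearchQuestionType.CONTENT_AUDITOR,
        SearchQuestionType.CONTENT_AUDITOR_DUPLICATES,
        SearchQuestionType.ACCESSIBILITY_AUDITOR,
        SearchQuestionType.ACCESSIBILITY_AUDITOR_ACKNOWLEDGEMENT_COUNTS,
        SearchQuestionType.ACCESSIBILITY_AUDITOR_GET_ALL_RESULTS
    );

    // Returns true for every question type except the auditor ones.
    static boolean shouldRun(SearchQuestionType questionType) {
        return !AUDITOR_TYPES.contains(questionType);
    }

    public static void main(String[] args) {
        for (SearchQuestionType type : SearchQuestionType.values()) {
            System.out.println(type + " -> " + shouldRun(type));
        }
    }
}
```

A step method could then return early when shouldRun is false, leaving auditor searches untouched.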

Search query and response manipulation

Programmatic transformations to the search data model can be implemented in pre/post steps attached to the intermediate phases. These are delivered using plugins or via search interface hook scripts.

The following four pre/post steps are provided as part of the search lifecycle:

  • Pre-process: This code runs after initial question object population, but before any of the input processing occurs. Manipulation of the query and addition or modification of most question attributes can be made at this point.

    Example uses: Modify the user’s query terms; convert a postcode to a geo-coordinate and add geospatial constraints.

  • Pre-datafetch: This code runs after all of the input processing is complete, but just before the query is submitted. This hook can be used to manipulate any additional data model elements that are populated by the input processing. This is most commonly used for modifying faceted navigation.

    Example uses: Update metadata, gscope or facet constraints.

  • Post-datafetch: This code runs immediately after the response object is populated based on the raw XML return, but before other response elements are built. This is most commonly used to modify underlying data before the faceted navigation is built.

    Example uses: Rename or sort faceted navigation categories, modify live URLs.

  • Post-process: This is used to modify the final data model prior to rendering of the search results.

    Example uses: Clean titles; load additional custom data into the data model for display purposes.

Key data model elements

There are a number of key data model elements that are commonly manipulated when writing custom logic for the pre/post phase steps.

See: Funnelback data model for an introduction to the various elements of the data model.

Data model element: question.inputParameters

This parameter contains the set of input parameters supplied via CGI to the search.

The parameter value is an array of strings.

Modify this for any parameters that will be used by the modern UI, or for any parameters that are used to set up the elements that are populated between the pre-process and pre-datafetch steps.
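
The sketch below shows one way to read and overwrite a value, assuming the shape described above: parameter names mapped to arrays of strings, modelled here as a plain Map<String, String[]>. The actual type in a given Funnelback version may differ, so treat this as an illustration only. The firstValue helper and the example_scope parameter are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class InputParameterSketch {
    // Hypothetical helper: returns the first value of a parameter,
    // or the fallback when the parameter is absent or has no values.
    static String firstValue(Map<String, String[]> inputParameters, String name, String fallback) {
        String[] values = inputParameters.get(name);
        return (values == null || values.length == 0) ? fallback : values[0];
    }

    public static void main(String[] args) {
        // Stand-in for transaction.question.inputParameters (assumed shape: name -> array of strings).
        Map<String, String[]> inputParameters = new HashMap<>();
        inputParameters.put("query", new String[] {"opening hours"});

        // Set a parameter, as a pre-process step might; "example_scope" is a made-up name.
        inputParameters.put("example_scope", new String[] {"staff"});

        System.out.println(firstValue(inputParameters, "example_scope", "none"));
        System.out.println(firstValue(inputParameters, "missing", "none"));
    }
}
```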

Data model element: question.additionalParameters

This parameter contains the set of input parameters that will be supplied to the padre query processor. It includes the parameters supplied via CGI and some additional parameters that are set from configuration or environment settings.

Modify this for any parameters that need to be passed directly to padre as query processor options or padre command parameters.

This includes the following:

  • origin

  • maxdist

  • gscope1

  • sort

  • numeric query parameters (e.g. lt_x)

  • SM

  • SF

  • num_ranks

The parameter value is an array of strings.
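
As an illustration, the sketch below caps num_ranks before the query reaches padre. It assumes the parameters behave like a Map<String, String[]>, in line with the description above; the capNumRanks helper and the limit of 100 are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class AdditionalParameterSketch {
    // Hypothetical helper: limits num_ranks to max when the parameter is present.
    // A missing parameter is left untouched; an unparseable value is replaced by max.
    static void capNumRanks(Map<String, String[]> additionalParameters, int max) {
        String[] current = additionalParameters.get("num_ranks");
        if (current == null || current.length == 0 || current[0] == null) {
            return;
        }
        int requested;
        try {
            requested = Integer.parseInt(current[0].trim());
        } catch (NumberFormatException e) {
            requested = max;
        }
        additionalParameters.put("num_ranks", new String[] {String.valueOf(Math.min(requested, max))});
    }

    public static void main(String[] args) {
        // Stand-in for transaction.question.additionalParameters.
        Map<String, String[]> additionalParameters = new HashMap<>();
        additionalParameters.put("num_ranks", new String[] {"500"});

        capNumRanks(additionalParameters, 100);
        System.out.println(additionalParameters.get("num_ranks")[0]);
    }
}
```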

Data model element: question.query

Modify this if you need to manipulate the query that was passed in.
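
A minimal sketch of a query rewrite, of the kind that might sit in a pre-process step, is shown below. The SearchQuestion class is a stand-in with assumed getQuery and setQuery accessors, and the normaliseQuery helper is hypothetical; it only trims the query and collapses repeated whitespace.

```java
// Stand-in for the question object (assumption: it exposes getQuery/setQuery).
class SearchQuestion {
    private String query;
    SearchQuestion(String query) { this.query = query; }
    String getQuery() { return query; }
    void setQuery(String query) { this.query = query; }
}

class QueryCleaner {
    // Hypothetical helper: trims the query and collapses runs of whitespace into single spaces.
    // A null query (no query supplied) is left as it is.
    static void normaliseQuery(SearchQuestion question) {
        String query = question.getQuery();
        if (query == null) {
            return;
        }
        question.setQuery(query.trim().replaceAll("\\s+", " "));
    }

    public static void main(String[] args) {
        SearchQuestion question = new SearchQuestion("  opening   hours ");
        normaliseQuery(question);
        System.out.println("[" + question.getQuery() + "]");
    }
}
```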

Data model element: response.resultPacket.results

This contains the set of individual search results.

Modify sub-keys for each search result to manipulate search result titles or URLs, or to add additional custom data.
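
Title cleaning in a post-process step could look roughly like the sketch below. The Result class is a simplified stand-in with assumed getTitle and setTitle accessors, the list stands in for response.resultPacket.results, and the stripSuffix helper and the " | Example Site" suffix are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a single search result (assumption: it exposes getTitle/setTitle).
class Result {
    private String title;
    Result(String title) { this.title = title; }
    String getTitle() { return title; }
    void setTitle(String title) { this.title = title; }
}

class TitleCleaner {
    // Hypothetical helper: removes a trailing suffix from every result title that ends with it.
    static void stripSuffix(List<Result> results, String suffix) {
        for (Result result : results) {
            String title = result.getTitle();
            if (title != null && title.endsWith(suffix)) {
                result.setTitle(title.substring(0, title.length() - suffix.length()));
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for transaction.response.resultPacket.results.
        List<Result> results = new ArrayList<>();
        results.add(new Result("Opening hours | Example Site"));
        results.add(new Result("Contact us"));

        stripSuffix(results, " | Example Site");
        for (Result result : results) {
            System.out.println(result.getTitle());
        }
    }
}
```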

© 2015- Squiz Pty Ltd