Best practices - 2.4 workflow

Background

This article outlines best practices to follow when implementing update workflows.

Use collection.cfg variables

Funnelback supports a number of variables in collection.cfg. These should always be used where appropriate, rather than hard-coding the values they stand for:

  • $SEARCH_HOME: is replaced with the install path (usually /opt/funnelback or c:\funnelback).

  • $COLLECTION_NAME: is replaced with the id of the current collection.

  • $CURRENT_VIEW: is replaced with live for updates that reindex the live view, or offline for all other updates. Do not use this variable if the view must always have the same value.

  • $GROOVY_COMMAND: is replaced with the path to the Groovy binary.

These variables can be used in workflow commands specified in the collection.cfg file. For example, the following calls the pre-index workflow script and passes in the collection name and current view so that they can be used in the script:

pre_index_command=$SEARCH_HOME/conf/$COLLECTION_NAME/@workflow/pre_index.sh -c $COLLECTION_NAME -v $CURRENT_VIEW
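
On the receiving side, the script has to read the -c and -v options itself. The following is a minimal sketch of how a pre_index.sh might do this with the standard getopts builtin. The function name parse_args, the variable names COLLECTION and VIEW, and the sample values are hypothetical and only serve as illustration.

```shell
#!/bin/sh -e
# Hypothetical pre_index.sh: reads the values passed in from collection.cfg.

# parse_args (hypothetical helper) sets COLLECTION and VIEW from -c and -v.
parse_args() {
    COLLECTION=""
    VIEW=""
    OPTIND=1    # reset so the function can be called more than once
    while getopts "c:v:" opt; do
        case "$opt" in
            c) COLLECTION="$OPTARG" ;;
            v) VIEW="$OPTARG" ;;
            *) echo "Usage: $0 -c <collection> -v <view>" >&2; return 2 ;;
        esac
    done
    if [ -z "$COLLECTION" ] || [ -z "$VIEW" ]; then
        echo "Both -c and -v are required" >&2
        return 2
    fi
}

# A real script would call: parse_args "$@"
# Sample values are used here so the sketch runs on its own.
parse_args -c example-collection -v offline
echo "Running pre-index workflow for $COLLECTION ($VIEW view)"
```

Returning a non-zero status when an option is missing means a misconfigured pre_index_command fails early instead of running with empty values.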

Naming conventions for workflow scripts

Workflow scripts are used to run custom actions before and after each update phase (gather, index, archive, etc.).

Workflow scripts should be stored inside a sub-folder of the collection configuration folder named @workflow.

They should be named after the phase they run in, so that they are easy to identify at a glance, e.g. pre_gather.sh or post_index.groovy:

$SEARCH_HOME/conf/COLLECTION/@workflow/pre_gather.sh
$SEARCH_HOME/conf/COLLECTION/@workflow/post_index.groovy

Filenames should contain only lowercase alphanumeric characters, with hyphens or underscores permitted between words.
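
The naming rule can be checked mechanically. The sketch below is one possible interpretation, not a Funnelback feature: it assumes a name consists of lowercase alphanumeric words joined by single hyphens or underscores, followed by a file extension. The function name is_valid_workflow_name is hypothetical.

```shell
#!/bin/sh
# is_valid_workflow_name (hypothetical helper) succeeds when the filename
# matches the assumed convention: lowercase alphanumeric words, separated
# by a single hyphen or underscore, followed by an extension.
is_valid_workflow_name() {
    # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]+([_-][a-z0-9]+)*\.[a-z0-9]+$'
}

for name in pre_gather.sh post_index.groovy PostIndex.sh; do
    if is_valid_workflow_name "$name"; then
        echo "ok:  $name"
    else
        echo "bad: $name"
    fi
done
```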

Error handling in workflow scripts

Workflow scripts can fail, and errors must be handled appropriately. A common approach is to have a failing workflow script exit with a non-zero exit code so that Funnelback can detect the problem and fail the update (sending a notification email if one has been configured).

To achieve this in shell scripts, make sure you set the -e flag in the hashbang line: #!/bin/sh -e or #!/bin/bash -e. This will cause the shell to exit immediately with an error code if one of the script commands fails.
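
The effect of -e can be observed with a small throwaway script. In the sketch below, false stands in for any failing command and the step messages are hypothetical. The extra set -e line is an optional safeguard: flags on the hashbang line are ignored when a script is started as sh script.sh rather than executed directly.

```shell
#!/bin/sh
# Write a throwaway workflow script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh -e
# Hashbang flags only apply when the file is executed directly,
# so set -e is repeated here to cover "sh script.sh" as well.
set -e
echo "step 1: preparing"
false    # stands in for a command that fails
echo "step 2: never reached"
EOF

# Run the script and capture its output and exit status.
status=0
output=$(sh "$script") || status=$?
rm -f "$script"

echo "output: $output"
echo "exit status: $status"
```

The script stops at the failing command, so the second step never runs and the caller receives a non-zero exit status.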

To achieve this in a Groovy script, throw an exception with a message explaining why the script cannot continue:

throw new Exception("Unable to connect to remote system");

Never use System.exit(), as it causes the JVM to shut down. Depending on the context, this can have serious consequences, such as shutting down the Jetty web server if the script is being run by Jetty.