Upgrading workflow commands
Workflow enabled Funnelback to run arbitrary shell commands between each step in the update cycle.
This is not permitted in the DXP as it is a security risk, and it also prevents automatic upgrades of the search.
This guide outlines the process you need to follow when upgrading workflow.
High level process
The key to successfully upgrading workflow is to break it down into the distinct tasks that it performs. This usually means looking at the configuration holistically, because workflow logic often depends both on other collection configuration and on how the tasks span different workflow commands.
Each identified task should perform a discrete operation that can then be replaced with other product functionality.
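As a starting point for this breakdown, it can help to list every workflow command that the legacy configuration declares. The following is a minimal sketch, not an official tool. It assumes that workflow is declared in the collection configuration file as key=value lines whose keys look like pre_<phase>_command or post_<phase>_command. The function name list_workflow_commands is hypothetical.

```shell
#!/bin/sh
# List the workflow command entries found in a legacy collection
# configuration file.
# Assumption: workflow is declared as key=value lines whose key has the
# form pre_<phase>_command or post_<phase>_command.
list_workflow_commands() {
    cfg_file="$1"
    # Print every matching line unchanged; grep exits non-zero when the
    # file contains no workflow entries.
    grep -E '^(pre|post)_[A-Za-z_-]+_command[[:space:]]*=' "$cfg_file"
}
```

For example, list_workflow_commands "$SEARCH_HOME/conf/$COLLECTION_NAME/collection.cfg" would print one line per configured command, assuming the configuration lives at that path. Each line printed is a command, and any script it calls, that needs to be broken down into tasks.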
Replacing workflow functionality
Once you have broken down all the workflow functionality into a set of discrete tasks, you then need to work out how each task can be achieved in the DXP without any custom coding.
The high-level options for replacing workflow command tasks are:
- Existing plugins and filters: these implement commonly occurring tasks that were previously implemented as workflow, e.g. generating auto-completion or transforming XML. Become familiar with the plugins that are available and the functions they perform.
- Other built-in functionality: workflow often replicates behavior that is available through built-in functionality, e.g. applying or generating kill lists and configuration.
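A first pass over a legacy workflow script can be roughly automated by matching each line against the command patterns covered in the sections that follow. The sketch below is a heuristic only. The function name suggest_replacements is hypothetical, the patterns are taken from the examples in this guide, and anything it does not recognize is flagged for manual review rather than guessed at.

```shell
#!/bin/sh
# Suggest a replacement for each command line in a legacy workflow script.
# The patterns mirror the examples in this guide; matching is a rough
# heuristic and custom logic always needs a manual look.
suggest_replacements() {
    script_file="$1"
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            # Skip comment lines and empty lines.
            '#'*|'') continue ;;
            *curl*external_metadata.cfg*)
                printf '%s\n' "external metadata fetcher plugin: $line" ;;
            *curl*auto-completion.csv*)
                printf '%s\n' "auto-completion.source.csv.[name].url key: $line" ;;
            *curl*)
                printf '%s\n' "web data source configuration: $line" ;;
            *padre-gs*|*padre-qi*|*padre-fl*)
                printf '%s\n' "standard gscopes/QIE/kill configuration: $line" ;;
            *)
                printf '%s\n' "needs manual review: $line" ;;
        esac
    done < "$script_file"
}
```

The order of the patterns matters: the more specific curl cases are tested before the generic one, so a metadata or auto-completion download is not misreported as content to be crawled.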
Common patterns and their replacements
Downloading content for indexing
Typical source: pre-gather or pre-index script
curl 'https://example.com/feed.xml' > $SEARCH_HOME/data/$COLLECTION_NAME/offline/data/example.xml
Remediation: replace this with configuration of the web crawler/web data source to fetch the content.
Downloading external metadata feeds
Typical source: pre-gather or pre-index script
curl 'https://example.com/emfeed.txt' > $SEARCH_HOME/conf/$COLLECTION_NAME/external_metadata.cfg
Remediation: replace this with the external metadata fetcher plugin.
Applying gscopes, QIE or kill configuration
Typical source: post-index script
# Apply gscopes to index
$SEARCH_HOME/bin/padre-gs $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/conf/$COLLECTION_NAME/gscopes.cfg
# Apply QIE to index
$SEARCH_HOME/bin/padre-qi $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/conf/$COLLECTION_NAME/qie.cfg 0.5
# Apply kill configuration to index
$SEARCH_HOME/bin/padre-fl $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/conf/$COLLECTION_NAME/kill.cfg -exactmatch -kill
Remediation: ensure the gscopes, QIE and kill configuration is set up using the standard configuration files, then remove the workflow. Rules in the standard configuration files are applied automatically.
Generating auto-completion from the search index
Typical source: post-index script
$SEARCH_HOME/conf/$COLLECTION_NAME/@workflow/post_index.sh -c $COLLECTION_NAME -v $CURRENT_VIEW -p auto-completion
Remediation: replace this with the auto-completion plugin.
Downloading an auto-completion CSV from an external source
Typical source: post-index script
curl 'https://example.com/autoc.csv' > $SEARCH_HOME/conf/$COLLECTION_NAME/_default/auto-completion.csv
Remediation: replace this by setting the auto-completion.source.csv.[name].url configuration key.
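As an illustration only, the curl command above might translate to a setting like the following. The source name example is a hypothetical value substituted for [name], and the key=value notation is an assumption about how the setting is written down. Use whatever mechanism your data source configuration provides for setting keys.

```
auto-completion.source.csv.example.url=https://example.com/autoc.csv
```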