Instant updates
Instant updates are a way of adding and removing content from web, database or file system (filecopy) data sources between full or incremental updates.
Update process
An instant update is initiated in one of the following ways:
- Via the search dashboard: select advanced update and choose one of the four available start new instant update options (note: web, file system (filecopy) and database data sources only).
- Via the admin API: run the corresponding API call from the queued tasks section.
When the instant update runs, new content (generated from one of the add update types) is handled separately from the files associated with the main index. Specifically:
- Data is downloaded to a secondary-data folder.
- Indexes are written to the live/idx folder with an index stem of index_secondary ($SEARCH_HOME/data/$COLLECTION_NAME/live/idx/index_secondary).
When content is added to a secondary index, any copy of it that already exists in the current live index is killed from the live index (this prevents duplicates from appearing in the search results).
An instant delete operation is equivalent to killing the document from the live index.
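The interaction between the live index and the secondary index can be illustrated with a small toy model. This is only a sketch of the behaviour described above, not Funnelback code: the class and method names are hypothetical, and matching deletes by URL prefix is an assumption based on the name of the delete task. How a delete affects documents that exist only in the secondary index is not described here, so the sketch leaves that case out.

```python
class ToyCollection:
    """Toy model of a live index plus a secondary index (illustrative only)."""

    def __init__(self, live_docs):
        self.live = dict(live_docs)  # url -> content in the main live index
        self.killed = set()          # urls killed from the live index
        self.secondary = {}          # url -> content in index_secondary

    def instant_add(self, url, content):
        # Kill any existing live copy so the document is not returned twice.
        if url in self.live:
            self.killed.add(url)
        self.secondary[url] = content

    def instant_delete(self, prefix):
        # Assumption: deletes match live documents by URL prefix.
        for url in self.live:
            if url.startswith(prefix):
                self.killed.add(url)

    def results(self):
        # Visible documents: live documents that were not killed, plus secondary ones.
        visible = {u: c for u, c in self.live.items() if u not in self.killed}
        visible.update(self.secondary)
        return visible
```

Re-adding a URL that is already in the live index leaves exactly one visible copy, served from the secondary index.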
Instant update types
Instant add
- Purpose: Add or re-add a site to the index.
- Search dashboard: Advanced update → Add or re-add a site to the index
- Administration API: queued tasks → POST /task/v1/queue/INSTANT_UPDATE
- Update phases:
  - instant-gather
  - instant-convert
  - instant-index
Instant delete
- Purpose: Remove sites from the index.
- Search dashboard: Advanced update → Remove sites from the index
- Administration API: queued tasks → POST /task/v1/queue/REMOVE_URLS_BY_PREFIX_FROM_LIVE_VIEW
- Update phases:
  - delete-prefix
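As a rough illustration of the two administration API calls listed above, the following sketch builds (but does not send) a POST request for each update type. Only the two endpoint paths come from this page. The base URL, the dictionary keys and the function name are hypothetical, and the request parameters and authentication that a real call needs are not listed here, so they are omitted.

```python
from urllib.request import Request

# Endpoint paths as listed for each instant update type.
TASK_PATHS = {
    "instant_add": "/task/v1/queue/INSTANT_UPDATE",
    "instant_delete": "/task/v1/queue/REMOVE_URLS_BY_PREFIX_FROM_LIVE_VIEW",
}


def build_task_request(base_url, update_type):
    """Build an unsent POST request for the given instant update type.

    base_url is a placeholder for wherever the admin API is served,
    including any path prefix. Parameters and credentials are not added.
    """
    url = base_url.rstrip("/") + TASK_PATHS[update_type]
    return Request(url, method="POST")
```

Sending the request (for example with `urllib.request.urlopen`) would additionally require the parameters and credentials expected by the API.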
Configuring instant updates
Workflow
This information only applies to non-DXP Funnelback 16 instances where deprecated workflow is used.
For instant updates to run correctly, any workflow commands applied to a normal update must also be configured for the instant update, using the equivalent instant update workflow commands applied to the correct instant update phases (as listed above). This means that you will need to add workflow commands for both standard and instant updates.
As a general rule:
- Convert any pre/post_gather commands to pre/post_instant-gather commands and add these to collection.cfg (leave the existing pre/post commands in place, as these are still needed for normal updates).
- Convert any pre/post_index commands to pre/post_instant-index commands (leave the existing pre/post commands in place, as these are still needed for normal updates).
- Convert any index stems in the added commands to use $SEARCH_HOME/data/$COLLECTION_NAME/live/idx/index_secondary instead of $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index.
- Replace any uses of $CURRENT_VIEW in the instant update commands with live.
- Post update (and post swap, pre/post meta dependencies etc.) workflow should be run as post instant-index, post delete-prefix and post delete-list commands, as instant updates only have the phases listed above (there are no swap or meta dependencies phases, and hence no post-update phase).
e.g.
# Post index command, runs on normal updates
post_index_command=$SEARCH_HOME/bin/padre-gs $SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index $SEARCH_HOME/conf/$COLLECTION_NAME/gscopes.cfg
# Equivalent instant update command
post_instant-index_command=$SEARCH_HOME/bin/padre-gs $SEARCH_HOME/data/$COLLECTION_NAME/live/idx/index_secondary $SEARCH_HOME/conf/$COLLECTION_NAME/gscopes.cfg
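The conversion rules above are mechanical enough to sketch in a few lines of Python. This is a hypothetical helper, not a Funnelback tool: it assumes the workflow keys follow the `pre_gather_command` / `post_index_command` pattern shown in the example, and it only handles the gather and index phases. Post update and similar commands need manual review because they map to more than one instant phase.

```python
import re

# Phase renames described in the rules above.
PHASE_MAP = {"gather": "instant-gather", "index": "instant-index"}

STANDARD_STEM = "$SEARCH_HOME/data/$COLLECTION_NAME/$CURRENT_VIEW/idx/index"
SECONDARY_STEM = "$SEARCH_HOME/data/$COLLECTION_NAME/live/idx/index_secondary"


def to_instant_command(line):
    """Return the instant update equivalent of a workflow line.

    Returns None when the line is not a pre/post gather or index command.
    """
    key, sep, value = line.partition("=")
    match = re.fullmatch(r"(pre|post)_(gather|index)_command", key.strip())
    if not sep or match is None:
        return None
    new_key = f"{match.group(1)}_{PHASE_MAP[match.group(2)]}_command"
    # Swap the index stem first, then any remaining $CURRENT_VIEW references.
    value = value.replace(STANDARD_STEM, SECONDARY_STEM)
    value = value.replace("$CURRENT_VIEW", "live")
    return f"{new_key}={value}"
```

The returned line is meant to be added alongside the original command, not to replace it, since normal updates still use the original.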
Search dashboard
The following administration user permissions are required to provide access to instant updates:
- sec.instant.update
In addition, instant.update.restrict_urls must be configured appropriately. One of the following is required:
- instant.update.restrict_urls = false, or
- instant.update.restrict_urls = true, with instant.update.allowed_urls set to a list of allowed URLs.
An administration user must be granted these permissions in order to have access to instant updates.
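The two acceptable combinations of these settings can be expressed as a small check. This is a sketch only: the function name is hypothetical, the settings are assumed to be available as a dictionary of strings, and no particular format for the list of allowed URLs is assumed beyond it being non-empty.

```python
def restrict_urls_configured(config):
    """Check that the instant update URL restriction settings are usable.

    config is assumed to map setting names to string values.
    """
    restrict = config.get("instant.update.restrict_urls", "").strip().lower()
    if restrict == "false":
        return True
    if restrict == "true":
        # A restriction is only usable when some allowed URLs are listed.
        return bool(config.get("instant.update.allowed_urls", "").strip())
    # Unset or unrecognised values are treated as not configured.
    return False
```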
Logs and indexes
- High level update messages are written to the standard update-<COLLECTION NAME>.log.
- Detailed logs are written to the collection’s live/secondary-logs folder.
- Indexes are written to the collection’s live/secondary-index folder while being built (a bit like offline), then moved to the live/idx folder with a stem of index_secondary.
Caveats
- It is not possible to start an instant update if the data source is already updating.
- Running an instant update locks the data source, so a standard update or another instant update can’t be run while an instant update is running.
- Instant updates apply only to web, database and file system (filecopy) data sources.