Update task scheduler

Funnelback supports automatic scheduling of data source and analytics updates.

The scheduler supports two types of update schedule: updates based on the time elapsed since the previous update, and updates set to commence at a specified time.

Update based on time since previous update

The scheduling for a data source is set in the data source configuration, for example:

| Configuration key | Value |
| --- | --- |
| schedule.timezone | Australia/ACT |
| schedule.[task-type].auto.desired-time-between-updates | PT24H |
| schedule.[task-type].auto.no-update-window.start-time | 09:00:00 |
| schedule.[task-type].auto.no-update-window.duration | PT8H |

[task-type] must be replaced by one of the supported types, which are listed under "Supported task types" below.

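This configures the data source to aim for an update every 24 hours (PT24H), with an 8-hour no-update window (PT8H) starting at 09:00, i.e. from 9am to 5pm, in the Australia/ACT timezone.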
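The duration values in the example (PT24H, PT8H) appear to follow ISO-8601 duration notation. The following is a minimal Python sketch of how such values could be interpreted, assuming the no-update window means that no update is started inside it. The function names parse_duration and in_no_update_window are hypothetical and are not part of Funnelback; timezone and daylight saving handling are deliberately left out.

```python
import re
from datetime import datetime, time, timedelta

# Accepts only simple time durations such as PT24H, PT8H or PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text: str) -> timedelta:
    """Parse a simple ISO-8601 time duration (hours, minutes, seconds)."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def in_no_update_window(now: datetime, start_time: str, duration: str) -> bool:
    """Return True if `now` falls inside the daily no-update window.

    Assumption: `now` is already expressed in the schedule's timezone.
    """
    start = time.fromisoformat(start_time)
    length = parse_duration(duration)
    # Check the window that opened today and the one that opened yesterday,
    # so that a window crossing midnight is handled too.
    for day_offset in (0, 1):
        opened = datetime.combine(now.date() - timedelta(days=day_offset), start)
        if opened <= now < opened + length:
            return True
    return False
```

With the example values, in_no_update_window(datetime(2024, 1, 15, 12, 0), "09:00:00", "PT8H") is true, because midday lies between 09:00 and 17:00.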

Notes:

  • The scheduler does not account for any delay introduced by the task picker, which may decide to start the update only when resources are available on the server.

  • Updates which fail will be retried after an additional delay, currently hardcoded at 6 hours multiplied by the number of failed updates since the last success (the number of failures considered is capped at ten).
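The retry rule in the note above can be sketched as a small calculation. This is an illustration of the arithmetic only, not Funnelback's implementation; the constant and function names are hypothetical.

```python
from datetime import timedelta

# Values taken from the note above; both are described as hardcoded.
BASE_RETRY_DELAY = timedelta(hours=6)
MAX_FAILURES_COUNTED = 10


def additional_retry_delay(failures_since_last_success: int) -> timedelta:
    """Extra delay before retrying, given the failures since the last success."""
    if failures_since_last_success < 0:
        raise ValueError("failure count cannot be negative")
    # Only the first ten failures increase the delay.
    counted = min(failures_since_last_success, MAX_FAILURES_COUNTED)
    return BASE_RETRY_DELAY * counted
```

Under this reading, three consecutive failures add 18 hours, and the additional delay never exceeds 60 hours.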

Update at a fixed time

The scheduling for a data source is set in the data source configuration, for example:

| Configuration key | Value |
| --- | --- |
| schedule.timezone | Australia/ACT |
| schedule.[task-type].fixed.start-times | 19:30:00,20:00:00 |
| schedule.[task-type].fixed.permitted-days-of-week | MONDAY,TUESDAY,WEDNESDAY,THURSDAY,FRIDAY |

[task-type] must be replaced by one of the supported types, which are listed under "Supported task types" below.

This configures the data source to update at 19:30 (i.e. 7:30pm) and at 20:00 (i.e. 8:00pm) in the Australia/ACT timezone on weekdays.

Notes:

  • A value of ANY is permitted in place of the comma-separated list of week days to indicate no week-day restriction.

  • Be aware of daylight saving changes. If an update is scheduled between 2am and 3am in a timezone whose daylight saving changes occur around that time, the update may be skipped or run twice within an hour, so it is best to avoid start times in that range.

  • It is permitted, though rarely useful, to give one data source both a schedule.[task-type].fixed.start-times and a schedule.[task-type].auto.desired-time-between-updates.
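To make the fixed schedule concrete, the following Python sketch works out the next start time from a start-times value and a permitted-days-of-week value. It is an assumption-laden illustration, not the scheduler's code: next_fixed_start is a hypothetical name, times are treated as naive local times in the schedule's timezone, and daylight saving transitions are ignored.

```python
from datetime import datetime, time, timedelta

# Index matches datetime.weekday(): Monday is 0, Sunday is 6.
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]


def next_fixed_start(now: datetime, start_times: str, permitted_days: str) -> datetime:
    """Return the first configured start time strictly after `now`."""
    times = sorted(time.fromisoformat(t.strip()) for t in start_times.split(","))
    if permitted_days.strip().upper() == "ANY":
        allowed = set(range(7))
    else:
        allowed = {DAY_NAMES.index(d.strip().upper())
                   for d in permitted_days.split(",")}
    # Today plus the next seven days covers every weekday at least once,
    # including the same weekday one week later.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        if day.weekday() not in allowed:
            continue
        for start in times:
            candidate = datetime.combine(day, start)
            if candidate > now:
                return candidate
    raise ValueError("no permitted start time found")
```

For example, with the weekday configuration above, a call made on a Friday at 20:30 returns the following Monday at 19:30, while the same call with ANY returns Saturday at 19:30.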

Supported task types

The update scheduler currently supports the following task types:

  • full-update: Runs a full update of a data source (all data source types except push).

  • incremental-update: Runs an incremental update of a data source (web and database data sources only).

  • normal-update: Runs a normal update of a data source. This applies the incremental crawl ratio for web data sources (all data source types except push).

  • reapply-gscopes-to-live-index: Re-applies gscopes to a live index without the need to re-index or run a full update (all data source types except push).

  • rebuild-live-index: Rebuilds the live index for a data source without the need to run a full update (all data source types except push).

  • refresh-update: Runs a refresh update for a web data source (web data sources only).

  • update-analytics: Runs an incremental update of the analytics for a search (search packages only).

Example configuration for scheduling a full update:

| Configuration key | Value |
| --- | --- |
| schedule.timezone | Australia/ACT |
| schedule.full-update.fixed.start-times | 19:30:00,20:00:00 |
| schedule.full-update.fixed.permitted-days-of-week | MONDAY,TUESDAY,WEDNESDAY,THURSDAY,FRIDAY |
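A mistyped task type in a key name is easy to miss. The sketch below shows one way a configuration file could be sanity-checked before deployment, assuming keys follow the schedule.[task-type].auto.* and schedule.[task-type].fixed.* patterns shown on this page. The helper task_type_of is hypothetical and is not a Funnelback tool.

```python
# Task types as listed in the "Supported task types" section.
SUPPORTED_TASK_TYPES = {
    "full-update",
    "incremental-update",
    "normal-update",
    "reapply-gscopes-to-live-index",
    "rebuild-live-index",
    "refresh-update",
    "update-analytics",
}


def task_type_of(key: str) -> str:
    """Extract and validate the task type from a per-task schedule key."""
    parts = key.split(".")
    # Expected shape: schedule.<task-type>.<auto|fixed>.<option...>
    if len(parts) < 4 or parts[0] != "schedule" or parts[2] not in ("auto", "fixed"):
        raise ValueError(f"not a per-task schedule key: {key!r}")
    if parts[1] not in SUPPORTED_TASK_TYPES:
        raise ValueError(f"unsupported task type: {parts[1]!r}")
    return parts[1]
```

Here task_type_of("schedule.full-update.fixed.start-times") returns "full-update", while a key such as schedule.full-updates.fixed.start-times raises a ValueError.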

Pausing the scheduler

The scheduler can be paused by setting the server configuration option scheduler.paused to true on the Funnelback server that is responsible for running the updates. This will prevent any new updates being scheduled until the scheduler is un-paused (by either removing this setting or setting it to false).

If a fixed update time passes while the scheduler is paused, the missed update will not be run when the scheduler is un-paused; the update will instead run at its next scheduled time.

Task scheduler configuration options

Collection options

The following options are set in the data source or search package configuration.

Server options

The following options are set in the server configuration.

© 2015- Squiz Pty Ltd