Using push collections with multiple query processors

This feature is not available to users of the Squiz Experience Cloud version of Funnelback.

Push collections can be set up with multiple query processors. In this document, the admin is the machine that accepts additions and deletions of documents, and a query processor's master is the machine that the query processor connects to. Multiple query processors can connect to a single admin, and query processors can also be daisy chained to other query processors, so a query processor can itself be a master.

Query processors can only fetch data from the admin once that data has been committed. They replicate only indexes, document data and some internal Push files. See supporting_multiple_query_processors for how to fetch query and click logs from the query processors to the admin machine for better search quality and analytics.

Caveat: This feature is not supported on Windows.

Setting up a query processor

Push query processors work on a pull model, which means the admin does not know about its query processors, so almost all of the configuration is done on the query processors. If you wish to daisy chain slaves, first connect the slaves closest to the master server, then work your way outwards.

To set up a Push query processor:

  • Ensure the master server and the query processor share the same server secret. See server secret (global.cfg).

  • Create a push collection on the query processor with the same name as the push collection you wish to replicate on the master.

  • Create the push collection on the query processor by following the multi-server setup steps. See: initial publication

  • Before you make any changes to the Push collection, you must first set push.initial-mode=SLAVE.

  • Now configure the query processor to talk to its master by editing the push.replication.master.* options in the query processor's collection.cfg.

  • You must now tell Push to start syncing through the Push API:

    • POST /push-api/v1/sync/collections/<collection>/state/start

  • You can check if the Push collection is trying to synchronise with master by checking that its SyncState is Sync, by making the following request to the Push API:

    • GET /push-api/v1/sync/collections/<collection>/state
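The last two steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official client: the base URL is a hypothetical placeholder, authentication is not handled, and the format of the state response is not assumed, so the raw body is returned for you to inspect for the SyncState value.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Hypothetical base URL of the query processor's Push API; adjust for your install.
BASE_URL = "https://query-processor.example.com"


def sync_state_url(base_url: str, collection: str) -> str:
    """Return the sync state endpoint for a collection (path taken from the steps above)."""
    name = quote(collection, safe="")
    return f"{base_url.rstrip('/')}/push-api/v1/sync/collections/{name}/state"


def start_sync_request(base_url: str, collection: str) -> Request:
    """Build the POST request that tells Push to start syncing."""
    return Request(sync_state_url(base_url, collection) + "/start", data=b"", method="POST")


def sync_state_request(base_url: str, collection: str) -> Request:
    """Build the GET request that reports the current sync state."""
    return Request(sync_state_url(base_url, collection), method="GET")


def send(request: Request) -> str:
    """Send a request and return the raw response body as text.

    Assumption: no authentication is added here; supply whatever your server requires.
    """
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

For a collection named my-push (a made-up name), `send(start_sync_request(BASE_URL, "my-push"))` would start syncing and `send(sync_state_request(BASE_URL, "my-push"))` would return the state response to check.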

Promoting a query processor to Admin

If your Admin machine suffers a failure, you can promote one of your query processors to be the new Admin machine. To do this, make the following call to the Push API on the query processor you wish to promote:

POST /push-api/v1/collections/<collection>/mode/?mode=DEFAULT

Because query processors are not kept fully synchronised with the Admin machine, your new Admin machine may be missing some data.

You should ensure that all of your query processors now point to the new master. To do this just edit the push.replication.master.* options as above and Push will automatically connect to the new master.
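As a rough sketch, the promotion call could be built like this in Python. The helper name set_mode_request and the host name are hypothetical, authentication is omitted, and only the path and the mode parameter come from the text above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def set_mode_request(base_url: str, collection: str, mode: str) -> Request:
    """Build the POST request that changes a Push collection's mode (hypothetical helper)."""
    name = quote(collection, safe="")
    query = urlencode({"mode": mode})
    url = f"{base_url.rstrip('/')}/push-api/v1/collections/{name}/mode/?{query}"
    return Request(url, data=b"", method="POST")


# Promote the query processor's copy of the collection by switching it to DEFAULT mode.
# The host below is a placeholder for the query processor being promoted.
promote = set_mode_request("https://query-processor.example.com", "test-push2", "DEFAULT")
```

Passing `promote` to `urllib.request.urlopen` would send the request. The remaining query processors still need their push.replication.master.* options edited by hand, as described above.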

Demoting an Admin to a query processor

To demote an admin machine to a query processor, you must first empty the push collection by making the following API request:

DELETE /push-api/v1/collections/<collection>

You must then change the mode to SLAVE by making the following API request:

POST /push-api/v1/collections/<collection>/mode/?mode=SLAVE
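The order of the two requests matters, so a script might build them as an ordered list and stop at the first failure. The sketch below assumes a hypothetical admin host and no authentication; the function names are illustrative only.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def demotion_requests(base_url: str, collection: str) -> list:
    """Return the two demotion requests in the order they must be sent."""
    root = f"{base_url.rstrip('/')}/push-api/v1/collections/{quote(collection, safe='')}"
    return [
        Request(root, method="DELETE"),  # 1. empty the push collection
        Request(f"{root}/mode/?mode=SLAVE", data=b"", method="POST"),  # 2. switch to SLAVE
    ]


def demote(base_url: str, collection: str) -> None:
    """Send the demotion requests in order.

    urlopen raises urllib.error.HTTPError on an error status, so the mode
    change is never attempted if emptying the collection fails.
    """
    for request in demotion_requests(base_url, collection):
        with urlopen(request) as response:
            response.read()
```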

Reducing network load

Currently Funnelback supports ignoring some files to reduce the load on your network, at the cost of reduced usability. You may ignore the following:

  • Document data: resulting in cache copies being unavailable on the query processor.

  • Delete lists: ignoring these has no effect on the query processor.

Important! If any of the push.replication.ignore.* options are set to true, you should not attempt to promote a query processor to Admin, as that will result in a corrupt Admin machine.
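Before promoting a query processor, it can be worth checking its configuration for this condition. The following is a minimal sketch that assumes collection.cfg consists of key=value lines with # comments; the option name used in the usage note is a made-up placeholder, since only the push.replication.ignore. prefix is taken from the text above.

```python
IGNORE_PREFIX = "push.replication.ignore."


def enabled_ignore_options(cfg_text: str) -> list:
    """Return the names of push.replication.ignore.* options that are set to true.

    Assumption: the config is plain key=value lines, with # starting a comment line.
    """
    enabled = []
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().startswith(IGNORE_PREFIX) and value.strip().lower() == "true":
            enabled.append(key.strip())
    return enabled


def safe_to_promote(cfg_text: str) -> bool:
    """A query processor should only be promoted if no ignore option is enabled."""
    return not enabled_ignore_options(cfg_text)
```

For example, `safe_to_promote("push.replication.ignore.example=true")` returns False, where `example` stands in for a real option suffix.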

Deleting a Push collection with slaves

If a Push collection has slaves, it can be difficult to delete that Push collection. The easiest way is to first delete the Push collection on each of its slaves, i.e. delete the Push collections in the reverse order you set them up. This is necessary because a Push collection must be stopped before it can be deleted, and a slave will constantly make requests to the Push collection, effectively preventing it from being stopped.
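With daisy chained slaves, the reverse order can be worked out from a record of which server each machine replicates from. The sketch below is only an illustration: the master_of mapping and the server names are hypothetical, and Funnelback does not provide this mapping itself, since the admin does not know about its query processors.

```python
def deletion_order(master_of: dict) -> list:
    """Return servers ordered so that every slave comes before its master.

    master_of maps each server name to the server it replicates from,
    with None for the admin machine (hypothetical bookkeeping you maintain).
    """
    slaves_of = {}
    for server, master in master_of.items():
        slaves_of.setdefault(master, []).append(server)

    order = []

    def visit(server):
        # Delete everything that replicates from this server first.
        for slave in sorted(slaves_of.get(server, [])):
            visit(slave)
        order.append(server)

    for admin in sorted(slaves_of.get(None, [])):
        visit(admin)
    return order


topology = {"admin": None, "qp1": "admin", "qp2": "admin", "qp3": "qp1"}
print(deletion_order(topology))  # ['qp3', 'qp1', 'qp2', 'admin']
```

Here qp3 replicates from qp1, so its copy of the collection is deleted first, and the admin's copy is deleted last.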