Using snapshots

This document provides a quick guide on creating and using a snapshot for components in your flows via the Connect UI and the API.

Overview

A snapshot is a copy of the state of a component at a specific point in time. It is taken so that you can revert to that saved state if needed.

In Connect, components run as containers, which are designed to stop when not in use and restart when needed. When a container stops, any data being processed within it is lost. However, if a snapshot was taken while the component was last running, the saved state is retained and the component can resume processing from the point at which the snapshot was taken. Each new snapshot overwrites the previous one.

The snapshot function:

  • Avoids data duplication caused by processing the same data again every time the component starts.

  • Saves Connect time and resources by retaining data which the component has already processed.

Taking and using snapshots is an asynchronous process. This means that if a component fails before the next snapshot is taken, it loses all progress made between the last snapshot and the failure.

A snapshot is limited to 5 KB, so do not use it as intermediate or temporary data storage.
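
Because of this limit, it can help to check a snapshot before emitting it. The following is a minimal sketch, not a platform feature: it assumes that the limit applies to the JSON-serialized snapshot and that 5 KB means 5 × 1024 bytes, neither of which is stated above. `fitsSnapshotLimit` is a hypothetical helper.

```javascript
'use strict';

// Assumption: the limit is measured on the UTF-8 bytes of the JSON-serialized
// snapshot, and 5 KB is taken as 5 * 1024 bytes.
const SNAPSHOT_LIMIT_BYTES = 5 * 1024;

// Hypothetical helper: true if the serialized snapshot is within the limit.
function fitsSnapshotLimit(snapshot) {
    // JSON.stringify returns undefined for some inputs (for example undefined itself)
    const serialized = JSON.stringify(snapshot) || '';
    return Buffer.byteLength(serialized, 'utf8') <= SNAPSHOT_LIMIT_BYTES;
}

// A small marker fits easily; a long list of IDs does not.
const marker = { lastUpdated: '2017-01-01T00:00:00.000Z' };
const idList = { seenIds: Array.from({ length: 2000 }, (_, i) => `id-${i}`) };

console.log(fitsSnapshotLimit(marker)); // true
console.log(fitsSnapshotLimit(idList)); // false
```

Keeping only a marker (a date, an ID, a counter) in the snapshot, rather than the data itself, is what keeps it well under the limit.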

Use Cases

Snapshots are used differently for different component types:

  • Components that request data periodically: If a snapshot is taken at the time of a data request, the component knows what data to include in the next request.

  • Components that query particular data by ID: A snapshot allows the component to know what IDs were already read.

  • Components that work in iterations using session IDs: Snapshots allow such components to correlate sessions by ID, so that each subsequent iteration is consistent.

In each of these use cases, the snapshot records the last action by some marker and allows the component to proceed from the same point the next time it runs.
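
As an illustration of the ID-based use case, the sketch below keeps only the highest ID read so far as the marker. It assumes that record IDs are numeric and increase over time; `selectUnread` and the `lastId` field are hypothetical names chosen for this example.

```javascript
'use strict';

// Hypothetical helper: returns the records not read yet, plus the snapshot
// to emit once they have been processed.
// Assumption: record IDs are numeric and increase over time.
function selectUnread(records, snapshot) {
    const lastId = typeof snapshot.lastId === 'number' ? snapshot.lastId : 0;
    const unread = records.filter((record) => record.id > lastId);
    // Advance the marker to the highest ID seen, or keep it if nothing is new
    const newLastId = unread.reduce((max, record) => Math.max(max, record.id), lastId);
    return { unread, nextSnapshot: { lastId: newLastId } };
}

// First run: no snapshot yet, so every record is unread.
const firstRun = selectUnread([{ id: 1 }, { id: 2 }], {});

// Second run: the previous snapshot filters out records that were already read.
const secondRun = selectUnread([{ id: 1 }, { id: 2 }, { id: 3 }], firstRun.nextSnapshot);

console.log(secondRun.unread);       // [ { id: 3 } ]
console.log(secondRun.nextSnapshot); // { lastId: 3 }
```

Storing a single marker instead of the full list of IDs keeps the snapshot small regardless of how many records have been read.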

Creating and using a snapshot

To create a Snapshot, you need to emit it from your Component function:

this.emit('snapshot', snapshot);

Let’s see how it works when querying objects by timestamp:

  1. Query all objects since 1 January 2017 to establish the first point of reference. Typically, this happens on the first global sync between two systems.

  2. Iterate over all objects. This stage can take a while if the source system holds a lot of data, so paging is recommended if the originating API supports it.

  3. Calculate the last update date based on the returned objects. Using the current time as the last date is not recommended, because you can miss some objects. This stage also depends on the properties of the originating API.

  4. Emit the new last date as a snapshot after iterating through all objects.

Putting these steps together gives the following:

params.snapshot = snapshot;

if (!params.snapshot || typeof params.snapshot !== "string") {
    // Empty or invalid snapshot: fall back to the first point of reference
    console.warn("Warning: expected string in snapshot but found: ", params.snapshot);
    params.snapshot = new Date("2017-01-01T00:00:00.000Z").toISOString();
}

// Initial value only. Per step 3, the elided code below should set this to
// the last update date of the returned objects rather than the current time.
var newSnapshot = new Date().toISOString();

...
...
emitSnapshot(newSnapshot);

Remember that there can be only one snapshot per step in a flow. Each time a new snapshot is emitted, it overwrites the previous one.
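
Step 3 of the list above, calculating the last update date from the returned objects, can be sketched as follows. This is an illustration under stated assumptions: it supposes that each object carries an ISO 8601 `updatedAt` field, and `resolveStartDate` and `latestUpdate` are hypothetical helpers, not part of any library.

```javascript
'use strict';

const DEFAULT_START = '2017-01-01T00:00:00.000Z';

// Hypothetical helper: use the previous snapshot if it is a valid date string,
// otherwise fall back to the first point of reference.
function resolveStartDate(snapshot) {
    if (typeof snapshot === 'string' && !Number.isNaN(Date.parse(snapshot))) {
        return snapshot;
    }
    return DEFAULT_START;
}

// Hypothetical helper: the latest `updatedAt` among the objects, or the
// previous marker if no object is newer.
// Assumption: every object has an ISO 8601 `updatedAt` field.
function latestUpdate(objects, previous) {
    return objects.reduce(
        (latest, obj) => (Date.parse(obj.updatedAt) > Date.parse(latest) ? obj.updatedAt : latest),
        previous
    );
}

const start = resolveStartDate(undefined); // no snapshot yet
const objects = [
    { id: 'a', updatedAt: '2017-03-05T10:00:00.000Z' },
    { id: 'b', updatedAt: '2017-02-01T08:30:00.000Z' }
];

const newSnapshot = latestUpdate(objects, start);
console.log(newSnapshot); // 2017-03-05T10:00:00.000Z
```

Deriving the marker from the data means that an object updated while the query was running is picked up on the next run instead of being skipped.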

EXAMPLE:

'use strict';

const elasticio = require('elasticio-node');
const messages = elasticio.messages;

exports.process = processTrigger;

function processTrigger(msg, cfg, snapshot) {
    console.log('Message %j', msg);
    console.log('Config %j', cfg);
    console.log('Snapshot %j', snapshot);

    snapshot.iteration = snapshot.iteration || 0;

    console.log('Iteration: %d', snapshot.iteration);

    snapshot.iteration += 1;

    this.emit('snapshot', snapshot);
    this.emit('data', messages.newMessageWithBody({iteration: snapshot.iteration}));
    this.emit('end');
}
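
To see how the snapshot carries over between runs, the trigger can be exercised locally. The sketch below is a stand-in, not the platform's runtime: `countIterations` is a simplified version of the trigger above without the `elasticio-node` dependency, and `runOnce` is a hypothetical harness that assumes the last emitted snapshot is handed to the next run.

```javascript
'use strict';

const { EventEmitter } = require('events');

// Simplified stand-in for the trigger above, without elasticio-node.
function countIterations(msg, cfg, snapshot) {
    snapshot.iteration = (snapshot.iteration || 0) + 1;
    this.emit('snapshot', snapshot);
    this.emit('data', { body: { iteration: snapshot.iteration } });
    this.emit('end');
}

// Hypothetical local harness: runs the function once and returns the last
// snapshot it emitted, or the stored one if nothing was emitted.
function runOnce(processFn, storedSnapshot) {
    const emitter = new EventEmitter();
    let emitted = storedSnapshot;
    emitter.on('snapshot', (snapshot) => { emitted = snapshot; });
    // Pass a copy so the stored snapshot only changes through an explicit emit
    processFn.call(emitter, {}, {}, { ...storedSnapshot });
    return emitted;
}

let stored = {};
stored = runOnce(countIterations, stored); // first run
stored = runOnce(countIterations, stored); // second run resumes from the first
console.log(stored); // { iteration: 2 }
```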

Additionally, you can reset the snapshot for a flow via the UI. Here is how it is done:

Resetting snapshots via UI

Snapshots via the API

There are a number of API endpoints for creating and using snapshots. They allow you to (follow the links for details):