Plugin: Instagram gatherer

Purpose

Use this plugin to include Instagram content in your search results.

The plugin gathers Instagram posts linked to a specific user’s account (as allowed by the Instagram tokens) using the Instagram Basic Display API.

Version 2 of the Instagram gatherer plugin is NOT backwards compatible with version 1. Please configure metadata mappings based on your requirements. Please refer to the upgrade notes for the upgrade instructions.

Prepare Instagram for indexing

Before you start, the following steps must be completed within Instagram to configure your account for use with the plugin:

  • Create and register an app within Facebook for Developers

  • Configure Instagram access permissions

  • Obtain your Instagram access tokens

Usage

Enable the plugin

  1. Select Plugins from the side navigation pane and click on the Instagram gatherer tile.

  2. From the Location section, use the Select a data source select list to choose the data source on which you would like to enable this plugin.

The plugin takes effect once the setup steps are complete and an advanced > full update of the data source has finished.

Configuration settings

The configuration settings section is where you do most of the configuration for your plugin. The settings enable you to control how the plugin behaves.

The configuration key names below are only used if you are configuring this plugin manually. The configuration keys are set in the data source configuration to configure the plugin. When setting the keys manually you need to type in (or copy and paste) the key name and value.

Instagram access token

Configuration key: plugin.instagram-custom-gather.encrypted.user-access-token

Data type: Encrypted string

Required: This setting is required

User access token for access to the Instagram API.

The plugin.instagram-custom-gather.encrypted.user-access-token plugin property is encrypted as it is a key to the Instagram data in a given account.
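For context on how a user access token is typically used, the Python sketch below builds a request URL for the Instagram Basic Display API media endpoint. It is an illustration only: the plugin's own request logic is not described in this document, the helper name build_media_url and the token value are hypothetical, and the field list is simply taken from the Instagram JSON example later in this page.

```python
from urllib.parse import urlencode

# Field names taken from the "Instagram JSON" example in this document.
MEDIA_FIELDS = ["id", "media_url", "media_type", "caption", "permalink",
                "timestamp", "username", "thumbnail_url"]

def build_media_url(access_token: str, fields=MEDIA_FIELDS) -> str:
    """Build a Basic Display API request URL for the token owner's media.

    Hypothetical helper: shows how the token is passed, not what the plugin does.
    """
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"https://graph.instagram.com/me/media?{query}"

url = build_media_url("EXAMPLE_TOKEN")  # placeholder, not a real token
```

Because the token grants access to the account's data, it appears in the request itself, which is why it should be stored only in the encrypted configuration key and never in plain-text configuration or logs.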

Metadata configuration

Default mappings

Only the timestamp is mapped to the date (d) metadata class.

Class ID    Type    JSON fields included
d           date    /json/timestamp

Fields that have default mappings cannot be mapped to other metadata classes. If you require the field to also be mapped to another metadata class, it needs to be cloned using the combine / clone metadata plugin.

Manual metadata mapping

Use the metadata mappings editor to configure your metadata mappings.

To see which fields are available, view the available XML fields after you have run an update of your Instagram feed.

The -SF query processor option must also be configured on your results page to include these metadata fields in the search response (e.g. -SF=[<LIST OF METADATA CLASSES TO DISPLAY>]).

If you wish to maintain the metadata mappings that v1.0.0 had when upgrading, ensure that the following metadata mappings are configured:

Metadata class           Type    Behavior            XML field to map
instagramAuthor          text    index as content    /json/username
instagramCaption         text    index as content    /json/caption
instagramId              text    index as content    /json/id
instagramMediaType       text    index as content    /json/media_type
instagramMediaURL        text    index as content    /json/media_url
instagramThumbnailURL    text    index as content    /json/thumbnail_url
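To show how the XML field paths in the mappings above relate to a converted record, the following Python sketch resolves each path against a sample XML record and builds a matching -SF value. It is only an illustration and not how Funnelback applies mappings internally. The names V1_MAPPINGS, extract_metadata and build_sf_option are hypothetical, and the comma separator inside -SF=[...] is an assumption.

```python
import xml.etree.ElementTree as ET

# Metadata class -> XML field path, copied from the v1.0.0 mappings above.
V1_MAPPINGS = {
    "instagramAuthor": "/json/username",
    "instagramCaption": "/json/caption",
    "instagramId": "/json/id",
    "instagramMediaType": "/json/media_type",
    "instagramMediaURL": "/json/media_url",
    "instagramThumbnailURL": "/json/thumbnail_url",
}

def extract_metadata(xml_record: str, mappings: dict[str, str]) -> dict[str, str]:
    """Resolve each mapped path against one XML record (illustration only)."""
    root = ET.fromstring(xml_record)
    values = {}
    for metadata_class, path in mappings.items():
        # "/json/username" -> root tag "json", child path "username"
        root_tag, _, child_path = path.lstrip("/").partition("/")
        if root.tag != root_tag:
            continue
        element = root.find(child_path)
        if element is not None and element.text is not None:
            values[metadata_class] = element.text
    return values

def build_sf_option(metadata_classes) -> str:
    """Format a -SF option value; the comma separator is an assumption."""
    return "-SF=[" + ",".join(metadata_classes) + "]"

record = """<json>
  <media_type><![CDATA[IMAGE]]></media_type>
  <caption><![CDATA[post 8]]></caption>
  <username><![CDATA[testsquizgather]]></username>
  <id><![CDATA[17852387873847441]]></id>
</json>"""

metadata = extract_metadata(record, V1_MAPPINGS)
sf_option = build_sf_option(V1_MAPPINGS)
```

Fields missing from a record (here media_url and thumbnail_url) are simply left out of the result, which mirrors the fact that not every Instagram post carries every field.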

User mentions and hash-tags

User mentions and hash-tags within Instagram content can be made searchable by enabling the social tags plugin.

Plugin error handling

  • Any Instagram server connection error: The gather process is aborted and an error message is written to gather.log: "HTTP response code for gathering page 1 from Instagram is [HTTP status code]", where [HTTP status code] indicates the HTTPS connection issue.

  • Invalid user access token: The gather process is aborted with the same error message as for any Instagram server connection error. The HTTP status code is 400 (Bad Request).

  • The retrieved Instagram entry does not contain a permalink: The gather process skips this entry, adds a warning message to the log, and continues with the rest of the Instagram data entries.
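The three cases above can be sketched in Python as follows. This is a hedged illustration of the described behaviour, not the plugin's actual code. GatherAbortedError, process_page and the wording of the skip warning are hypothetical, and treating every status other than 200 as a failure is an assumption. Only the abort message text comes from the description above.

```python
import logging

logger = logging.getLogger("gather")

class GatherAbortedError(Exception):
    """Hypothetical error type: raised when the gather must stop."""

def process_page(status_code: int, entries: list[dict], page: int = 1) -> list[dict]:
    """Return the entries that would be kept from one API response page."""
    if status_code != 200:
        # Connection errors and invalid tokens (HTTP 400) both abort the gather.
        raise GatherAbortedError(
            f"HTTP response code for gathering page {page} from Instagram is {status_code}"
        )
    kept = []
    for entry in entries:
        if not entry.get("permalink"):
            # Entries without a permalink are skipped with a warning.
            logger.warning("Skipping Instagram entry %s: no permalink", entry.get("id"))
            continue
        kept.append(entry)
    return kept
```

For example, calling process_page(200, entries) with one entry that lacks a permalink returns only the remaining entries, while process_page(400, []) raises GatherAbortedError.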

Upgrade notes

Upgrading from version [1.0.0] to version [2.x.x]

Version 2.x.x of the Instagram gatherer plugin is NOT backwards compatible with version 1.

Follow the instructions below to upgrade from version 1.0.0 to version 2.x.x:

  1. Remove all the version 1 configuration parameters except plugin.instagram-custom-gather.encrypted.user-access-token.

  2. Enable the JSONToXML filter. Fields that were previously defined by default, or through the plugin.instagram-custom-gather.config.media-fields key, are no longer available. (See: usage)

    The available fields are those returned in the responses from the Instagram API.

  3. Configure your metadata mappings to map the fields from the Instagram JSON to corresponding metadata classes to match your search requirements.

Filter chain configuration

This plugin uses filters which are used to apply transformations to the gathered content.

The filters run in sequence and need to be set in an order that makes sense. The plugin-supplied filter(s) (as indicated in the listing) should be re-ordered to an appropriate point in the sequence.

Changes to the filter order affect the way the data source processes gathered documents. See: document filters documentation.

Filter classes

This plugin supplies a filter that runs in the main document filter chain: JSONToXML

Drag the JSONToXML plugin filter to where you wish it to run in the filter chain sequence.

Examples

Instagram JSON

The following fields are returned by the Instagram Basic Display API for each post. Refer to the Instagram Basic Display API documentation for details.

{
  "id": string,
  "media_url": url/string,
  "media_type": enum, //Can be IMAGE, VIDEO, or CAROUSEL_ALBUM
  "caption": string, //Not available on CAROUSEL_ALBUM
  "permalink": url/string, //Will be omitted if the Media contains copyrighted material, or has been flagged for a copyright violation. Funnelback will not index the entry if it is missing
  "timestamp": dateTime,
  "username": string,
  "is_shared_to_feed": boolean, //For Reels only
  "thumbnail_url": url/string, //Only available on VIDEO
  "children": { //Only available on CAROUSEL_ALBUM
    "data": [
      {
        "id": string
      }
    ]
  }
}

Instagram gathered entry example

A gathered Instagram JSON entry from the given Instagram account using the default media field(s) is shown below:

{
  "media_type": "IMAGE",
  "media_url": "https://scontent.cdninstagram.com/v/t51.29350-15/312620827_2776062742527833_1823212135481542642_n.jpg?_nc_cat=106&ccb=1-7&_nc_sid=8ae9d6&_nc_ohc=8ii0ELSpHfkAX8HUaCH&_nc_ht=scontent.cdninstagram.com&edm=ANo9K5cEAAAA&oh=00_AfDIJCp9bVzASQEtZPv6aWxQY5r1Oy21ZnZe8Q9DCUqkkA&oe=6367B26C",
  "timestamp": "2022-10-25T06:42:22+0000",
  "caption": "post 8",
  "thumbnail_url": null,
  "username": "testsquizgather",
  "permalink": "https://www.instagram.com/p/CkIIVI0Lc8y/",
  "id": "17852387873847441"
}

This is converted into an XML record by the JSON to XML filter.

<json>
  <media_type><![CDATA[IMAGE]]></media_type>
  <media_url><![CDATA[https://scontent.cdninstagram.com/v/t51.29350-15/312620827_2776062742527833_1823212135481542642_n.jpg?_nc_cat=106&ccb=1-7&_nc_sid=8ae9d6&_nc_ohc=8ii0ELSpHfkAX8HUaCH&_nc_ht=scontent.cdninstagram.com&edm=ANo9K5cEAAAA&oh=00_AfDIJCp9bVzASQEtZPv6aWxQY5r1Oy21ZnZe8Q9DCUqkkA&oe=6367B26C]]></media_url>
  <timestamp><![CDATA[2022-10-25T06:42:22+0000]]></timestamp>
  <caption><![CDATA[post 8]]></caption>
  <thumbnail_url><![CDATA[null]]></thumbnail_url>
  <username><![CDATA[testsquizgather]]></username>
  <permalink><![CDATA[https://www.instagram.com/p/CkIIVI0Lc8y/]]></permalink>
  <id><![CDATA[17852387873847441]]></id>
</json>
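
As a rough illustration of the conversion shown above, the Python sketch below turns a flat JSON object into the same XML shape. It is not the JSONToXML filter's implementation: json_entry_to_xml is a hypothetical helper, it handles only flat (non-nested) objects, and rendering a JSON null as the text "null" simply mirrors the sample output.

```python
import json

def json_entry_to_xml(entry_json: str) -> str:
    """Convert a flat JSON object into a <json> record with CDATA values."""
    entry = json.loads(entry_json)
    lines = ["<json>"]
    for key, value in entry.items():
        # Strings are used as-is; json.dumps renders None as null
        # and booleans as true/false.
        text = value if isinstance(value, str) else json.dumps(value)
        # A literal "]]>" would end the CDATA section early, so split it.
        text = text.replace("]]>", "]]]]><![CDATA[>")
        lines.append(f"  <{key}><![CDATA[{text}]]></{key}>")
    lines.append("</json>")
    return "\n".join(lines)

sample = '{"media_type": "IMAGE", "caption": "post 8", "thumbnail_url": null}'
xml_record = json_entry_to_xml(sample)
```

Wrapping every value in CDATA means characters such as & in media URLs do not need to be escaped, which matches the media_url element in the sample record.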

Change log

[2.1.0]

Changes

  • Updated to the latest version of the plugin framework (Funnelback shared v16.20) to enable integration with the new plugin management dashboard.

  • Updated the Java HttpClient settings to follow redirects.

  • Updated WireMock to version 3.0.4.

[2.0.0]

Changes

  • All field values returned by the Instagram Basic Display API are now available for mapping as Funnelback metadata. However, the plugin now maps only the timestamp to the date (d) metadata class by default. All other metadata mappings are user definable and must be defined manually when configuring the plugin.

  • The JSON to XML filter is now required and must be used in conjunction with this plugin.