Implementer training - Social media data sources

Funnelback can index content from the following social media services:

  • YouTube

  • Facebook

  • Flickr

  • Twitter

  • Instagram (via the Instagram gatherer plugin)

Additional services can be added by implementing a custom gatherer plugin.

Several prerequisites must be satisfied before a social media service can be indexed. These vary by service, but generally involve having an account, a channel or service identifier, and an API key for access to the service.
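Before configuring a data source it can help to confirm that the API key and channel identifier are accepted by the service. The following is a minimal sketch for YouTube only, and is not part of Funnelback: it builds a request URL for the `channels` resource of the YouTube Data API v3, which you can open in a browser or fetch with any HTTP client. The function name `build_channel_check_url` and the placeholder key are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 "channels" resource.
YOUTUBE_CHANNELS_ENDPOINT = "https://www.googleapis.com/youtube/v3/channels"


def build_channel_check_url(api_key: str, channel_id: str) -> str:
    """Return a URL that asks the YouTube Data API for one channel's snippet.

    Hypothetical helper for illustration; it only builds the URL and does
    not contact the service.
    """
    if not api_key or not channel_id:
        raise ValueError("both an API key and a channel ID are required")
    query = urlencode({"part": "snippet", "id": channel_id, "key": api_key})
    return f"{YOUTUBE_CHANNELS_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Placeholder key; substitute your own API key and channel ID.
    print(build_channel_check_url("YOUR_API_KEY", "UC19PRS-wlngHv06TRnEQxDA"))
```

If the key and channel ID are valid, the response should be a JSON document describing the channel; an error response usually indicates a problem with the key or the identifier.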

Tutorial: Download and index a YouTube channel

  1. Log in to the search dashboard where you are doing your training.

    See: Training - search dashboard access information if you’re not sure how to access the training. Ignore this step if you’re treating this as a non-interactive tutorial.
  2. Create a new search package called Squiz. Skip the step where you are asked about adding data sources.

  3. Once the search package is created, scroll to the components section and click the create a data source button.

  4. Create a data source with the following properties:

    • Data source type: youtube

    • Data source name: Squiz videos

    • YouTube API key: AIzaSyDBFGGkZfR79YsdSpw3jNzfRUgsvXVrVKo

    • Channel IDs?: UC19PRS-wlngHv06TRnEQxDA

    • Include channel’s liked videos?: no

    • Playlist IDs?: PLMOOwxQHsNyl--x_Nsyooa_gFFjOI3bUR

  5. Update the data source by selecting update data source from the update panel.

  6. Inspect the metadata mappings (settings panel, configure metadata mappings) and observe that a set of YouTube specific fields are automatically mapped.

  7. Return to the Squiz search package and create a new results page called Squiz video search.

  8. Configure the results page to return the YouTube metadata with each result. Add the following to the query processor options (customize panel, edit results page configuration):

    -SF=[c,t,viewCount,likeCount,dislikeCount,duration,imageSmall]
  9. Update the search template (select edit results page templates from the templates panel). Replace the contents of the <@s.Results> tag (approx line 495) with the following code:

    <#if s.result.class.simpleName == "TierBar">
    <#-- A tier bar -->
    <#if s.result.matched != s.result.outOf>
    <li class="search-tier"><h3 class="text-muted">Results that match ${s.result.matched} of ${s.result.outOf} words</h3></li>
    <#else>
    <li class="search-tier"><h3 class="hidden">Fully-matching results</h3></li>
    </#if>
    <#-- Print event tier bars if they exist -->
    <#if s.result.eventDate??>
    <h2 class="fb-title">Events on ${s.result.eventDate?date}</h2>
    </#if>
    <#else>
    <li data-fb-result="${s.result.indexUrl}" class="result<#if !s.result.documentVisibleToUser>-undisclosed</#if> clearfix">
    
        <h4 <#if !s.result.documentVisibleToUser>style="margin-bottom:4px"</#if>>
          <#if s.result.listMetadata["imageSmall"]?first??>
            <img class="img-thumbnail pull-left" style="margin-right:0.5em;" src="${s.result.listMetadata["imageSmall"]?first?replace("\\|.*$","","r")}" />
          </#if>
    
          <#if question.currentProfileConfig.get("ui.modern.session")?boolean><a href="#" data-ng-click="toggle()" data-cart-link data-css="pushpin|remove" title="{{label}}"><small class="glyphicon glyphicon-{{css}}"></small></a></#if>
            <a href="${s.result.clickTrackingUrl}" title="${s.result.liveUrl}">
              <@s.boldicize><@s.Truncate length=70>${s.result.title}</@s.Truncate></@s.boldicize>
            </a>
    
          <#if s.result.fileType!?matches("(doc|docx|ppt|pptx|rtf|xls|xlsx|xlsm|pdf)", "r")>
            <small class="text-muted">${s.result.fileType?upper_case} (${filesize(s.result.fileSize!0)})</small>
          </#if>
          <#if question.currentProfileConfig.get("ui.modern.session")?boolean && session?? && session.getClickHistory(s.result.indexUrl)??><small class="text-warning"><span class="glyphicon glyphicon-time"></span> <a title="Click history" href="#" class="text-warning" data-ng-click="toggleHistory()">Last visited ${prettyTime(session.getClickHistory(s.result.indexUrl).clickDate)}</a></small></#if>
        </h4>
    
        <p>
        <#if s.result.date??><small class="text-muted">${s.result.date?date?string("d MMM yyyy")}:</small></#if>
        <span class="search-summary"><@s.boldicize><#noautoesc>${s.result.listMetadata["c"]?first!"No description available."}</#noautoesc></@s.boldicize></span>
        </p>
    
        <p>
        	<span class="glyphicon glyphicon-time"></span> ${s.result.listMetadata["duration"]?first!"N/A"}
        	Views: ${s.result.listMetadata["viewCount"]?first!"N/A"}
        	<span class="glyphicon glyphicon-thumbs-up"></span> ${s.result.listMetadata["likeCount"]?first!"N/A"}
        </p>
      </li>
    </#if>
  10. Run a search for !showall and observe the YouTube results.
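The template in step 9 passes the imageSmall value through `?replace("\\|.*$","","r")` before using it as the image source. This regular expression removes everything from the first `|` character onward, presumably because the field can hold several values joined with `|`. The sketch below reproduces the same substitution in Python so the effect can be checked outside the template; the function name `first_pipe_value` and the sample URLs are hypothetical.

```python
import re


def first_pipe_value(raw: str) -> str:
    """Drop the first "|" and everything after it, as the template's replace does."""
    return re.sub(r"\|.*$", "", raw)


if __name__ == "__main__":
    # Hypothetical metadata value containing two thumbnail URLs.
    sample = "https://example.com/thumb1.jpg|https://example.com/thumb2.jpg"
    print(first_pipe_value(sample))
```

A value that contains no `|` is returned unchanged, so the replacement is safe to apply to single-valued fields as well.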

Extended exercises: social media

  1. Find a YouTube channel and determine the channel ID. Add this as a second channel ID and update the data source.

  2. Set up a new social media data source using one of the other templates (such as Facebook or Twitter). To do this you will need an appropriate API key for the repository and a channel to consume.

  3. Index an Instagram feed using the Instagram custom gatherer. Hint: you will require a custom data source for this and will also need to enable the Instagram custom gatherer plugin on this data source.
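For the first exercise, a channel ID can often be read directly from a channel URL of the form https://www.youtube.com/channel/<ID>. The sketch below extracts it under two assumptions that are not stated in this tutorial: that the URL uses the /channel/ path form, and that channel IDs consist of "UC" followed by 22 letters, digits, hyphens or underscores (the channel ID used in the tutorial above fits this pattern). Handle-style URLs such as /@name are not resolved here and would need a lookup through the YouTube API or the page itself. The function name `channel_id_from_url` is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: a channel ID is "UC" followed by 22 URL-safe characters.
CHANNEL_ID_PATTERN = re.compile(r"UC[A-Za-z0-9_-]{22}")


def channel_id_from_url(url: str) -> Optional[str]:
    """Return the channel ID from a /channel/<ID> URL, or None if absent."""
    path_parts = [part for part in urlparse(url).path.split("/") if part]
    if len(path_parts) >= 2 and path_parts[0] == "channel":
        candidate = path_parts[1]
        if CHANNEL_ID_PATTERN.fullmatch(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(channel_id_from_url("https://www.youtube.com/channel/UC19PRS-wlngHv06TRnEQxDA"))
```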