
Facebook collections


Facebook is a social media site focused on sharing content among groups of friends, though it has become widely used by organisations seeking to connect with their customers.

Funnelback supports crawling Facebook pages and gathering data such as posts, events and page information. Funnelback's Facebook support is closely tied to Facebook's Graph API. If you are not familiar with it, you may want to read Facebook's getting started guide for the Graph API. You should be comfortable finding page IDs. You should also apply to Facebook for written approval of automated data collection.
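For readers who have not looked up a page ID before, the following is a minimal sketch of how a Graph API lookup request could be constructed. It only builds the URL and makes no network call. The function name `page_lookup_url`, the placeholder page name and the placeholder token are illustrative assumptions, not part of Funnelback; the unversioned `https://graph.facebook.com` base URL and the `fields` parameter are assumed from Facebook's public Graph API conventions and should be checked against Facebook's current documentation.

```python
from urllib.parse import quote, urlencode

# Assumption: the unversioned Graph API base URL; Facebook may require a version segment.
GRAPH_API_BASE = "https://graph.facebook.com"


def page_lookup_url(page_name, access_token):
    """Build a URL that asks the Graph API for a page's id and name.

    page_name is the page's username as it appears in its Facebook URL
    (hypothetical example: "examplepage").
    """
    query = urlencode({"fields": "id,name", "access_token": access_token})
    return f"{GRAPH_API_BASE}/{quote(page_name, safe='')}?{query}"


if __name__ == "__main__":
    # Placeholder values only; substitute a real page name and access token.
    print(page_lookup_url("examplepage", "YOUR_ACCESS_TOKEN"))
```

If the request succeeds, the `id` field in the JSON response should be the page ID to use when configuring the collection.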

Please note that your usage of Funnelback to gather content from Facebook must comply with Facebook's terms of service.

Getting your API key and secret

Before you can crawl Facebook you need an app ID and secret. To get them, first create a Facebook developer account, then go to the developer app page and create a new app. Once the app has been created you should be given its app ID and secret. Alternatively, an access token can be provided using the facebook.access-token collection configuration option.

Configuration options

Facebook collections support the following settings:

Metadata mappings

The Facebook gathering template includes predefined Facebook-specific metadata mappings:

| Class ID | Type | Behaviour | Explanation | Metadata fields included |
|---|---|---|---|---|
| author | text | content | | /FacebookXmlRecord/eventOwner/name, /FacebookXmlRecord/page/nameWithLocationDescriptor, /FacebookXmlRecord/postFrom/name |
| authorId | text | display | | /FacebookXmlRecord/eventOwner/id, /FacebookXmlRecord/postFrom/id |
| c | text | content | Description | /FacebookXmlRecord/eventDescription, /FacebookXmlRecord/page/about, /FacebookXmlRecord/page/description, /FacebookXmlRecord/postMessage |
| category | text | content | | /FacebookXmlRecord/page/category |
| city | text | content | | /FacebookXmlRecord/eventVenue/city, /FacebookXmlRecord/page/location/city, /FacebookXmlRecord/postLocation/city |
| country | text | content | | /FacebookXmlRecord/eventVenue/country, /FacebookXmlRecord/page/location/country, /FacebookXmlRecord/postLocation/country |
| d | date | date | Date | /FacebookXmlRecord/eventStartTime, /FacebookXmlRecord/postCreatedTime |
| eventEndTime | text | display | | /FacebookXmlRecord/eventEndTime |
| eventPrivacy | text | display | | /FacebookXmlRecord/eventPrivacy |
| identifier | text | display | | /FacebookXmlRecord/eventId, /FacebookXmlRecord/page/id, /FacebookXmlRecord/postId |
| image | text | display | | /FacebookXmlRecord/page/cover/source, /FacebookXmlRecord/postPictureURL |
| latLong | geospatial x/y co-ordinate | N/A | | /FacebookXmlRecord/eventVenue/latitudeLong, /FacebookXmlRecord/postLocation/latLong |
| location | text | display | Event location | /FacebookXmlRecord/eventLocation |
| pageFounded | text | display | | /FacebookXmlRecord/page/founded |
| pageMission | text | display | | /FacebookXmlRecord/page/mission |
| pageProduct | text | display | | /FacebookXmlRecord/page/products |
| phone | text | display | | /FacebookXmlRecord/page/phone |
| postcode | text | display | Zip/post code | /FacebookXmlRecord/eventVenue/zip, /FacebookXmlRecord/page/location/zip, /FacebookXmlRecord/postLocation/zip |
| postIconUrl | text | display | | /FacebookXmlRecord/postIconURL |
| postLink | text | content | Post link | /FacebookXmlRecord/postLink |
| postLinkDescription | text | content | | /FacebookXmlRecord/postLinkDescription |
| postLinkTitle | text | content | | /FacebookXmlRecord/postLinkCaption |
| state | text | display | | /FacebookXmlRecord/eventVenue/State, /FacebookXmlRecord/page/location/state, /FacebookXmlRecord/postLocation/state |
| street | text | display | | /FacebookXmlRecord/eventVenue/street, /FacebookXmlRecord/page/location/street, /FacebookXmlRecord/postLocation/street |
| t | text | content | Event/page title | /FacebookXmlRecord/eventName, /FacebookXmlRecord/page/name |
| type | text | display | | /FacebookXmlRecord/type |
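To illustrate how several source paths feed a single metadata class, here is a rough sketch that applies a handful of the mappings listed above to one XML record. It is not Funnelback's implementation: the function `metadata_values`, the `MAPPINGS` dictionary and the sample record are hypothetical, and the `FacebookXmlRecord` root element is assumed from the paths in the mappings.

```python
import xml.etree.ElementTree as ET

# Assumption: every record has a FacebookXmlRecord root, as the mapped paths suggest.
ROOT_PREFIX = "/FacebookXmlRecord/"

# A subset of the predefined mappings (metadata class -> source paths).
MAPPINGS = {
    "author": [
        "/FacebookXmlRecord/eventOwner/name",
        "/FacebookXmlRecord/page/nameWithLocationDescriptor",
        "/FacebookXmlRecord/postFrom/name",
    ],
    "c": [
        "/FacebookXmlRecord/eventDescription",
        "/FacebookXmlRecord/page/about",
        "/FacebookXmlRecord/page/description",
        "/FacebookXmlRecord/postMessage",
    ],
    "t": [
        "/FacebookXmlRecord/eventName",
        "/FacebookXmlRecord/page/name",
    ],
}


def metadata_values(record_xml, mappings=MAPPINGS):
    """Return {metadata class: [values]} for the paths present in the record."""
    root = ET.fromstring(record_xml)
    result = {}
    for metadata_class, paths in mappings.items():
        values = []
        for path in paths:
            # ElementTree paths are relative to the root element.
            relative = path[len(ROOT_PREFIX):]
            for element in root.findall(relative):
                text = (element.text or "").strip()
                if text:
                    values.append(text)
        if values:
            result[metadata_class] = values
    return result


# Hypothetical post record, trimmed to a few elements.
SAMPLE = """<FacebookXmlRecord>
  <postMessage>this is the post message</postMessage>
  <postFrom>
    <id>id of poster</id>
    <name>Name of poster</name>
  </postFrom>
</FacebookXmlRecord>"""

if __name__ == "__main__":
    print(metadata_values(SAMPLE))
```

For a post record like the sample, only the `author` and `c` classes receive values, because the event and page paths do not match anything in the record.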

Use the -SF query processor option to access these metadata fields in the search response and in templates (e.g. -SF=[author,country]).
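When many metadata classes are needed, the option value can be assembled programmatically. The helper below is a small illustrative sketch; `sf_option` is a hypothetical name, and the only format it relies on is the bracketed, comma-separated form shown in the example above.

```python
def sf_option(classes):
    """Build an -SF query processor option from metadata class names."""
    cleaned = [name.strip() for name in classes if name.strip()]
    if not cleaned:
        raise ValueError("at least one metadata class is required")
    return "-SF=[" + ",".join(cleaned) + "]"


if __name__ == "__main__":
    print(sf_option(["author", "country"]))
```

The resulting string can then be added to the collection's query processor options.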


Please note that Facebook applies limits to the volume of content which can be retrieved from their APIs, and so in the case of large pages Funnelback may be unable to gather all historical content.


Crawling Facebook events is only possible if the facebook.access-token property is set to a never-expiring page access token.

Working with the fetched data

Funnelback will crawl Facebook and convert responses into XML. You can use the metadata customisation tool to map elements to a metadata class.

Note: To preview the crawled records, enable debug mode by setting facebook.debug=true in the collection.cfg file.

FacebookQueryPost XML Example

  <postMessage>this is the post message</postMessage>
  <postCreatedTime>Tue Aug 27 13:42:17 EST 2013</postCreatedTime>
  <postFrom class="com.restfb.types.CategorizedFacebookType">
    <id>id of poster</id>
    <name>Name of poster</name>
  </postFrom>
      <commentId>comment id</commentId>
      <url> id</url>
      <commentMessage>comment one of three</commentMessage>
      <commentFrom class="com.restfb.types.CategorizedFacebookType">
        <id>Comment poster ID</id>
        <name>Name of poster</name>
      </commentFrom>
      <commentCreatedTime>2013-08-27 03:42:37.0 UTC</commentCreatedTime>

FacebookQueryEvent XML Example

  <eventStartTime>2024-02-13 20:00:00.0 UTC</eventStartTime>
  <eventEndTime>2024-02-13 23:00:00.0 UTC</eventEndTime>
  <eventLocation>Lima, Peru</eventLocation>
    <id>owner id</id>

FacebookQueryPage XML Example

    <name>page name</name>
    <description>the long description</description>
    <about>Another description</about>

