Facebook data source
Facebook is a social media site focused on sharing content among groups of friends, though it has become widely used by organizations seeking to connect with their customers.
Funnelback supports crawling Facebook pages and gathering data such as posts, events and page information. Funnelback is closely linked to Facebook’s graph search API. If you are not familiar with it, you may want to read Facebook’s getting started material for the graph API. You should be comfortable finding page IDs, and you should apply for written approval of automated data collection.
Please note that your usage of Funnelback to gather content from Facebook must comply with Facebook’s terms of service.
Getting your API key and secret
Before you can gather content from Facebook you need to generate an app ID and secret. To do this, first get a Facebook developer account, then go to the developer app page and create a new app. Once the app has been created you are given its app ID and secret.
Alternatively, an access token can be provided using the facebook.access-token data source configuration option.
Discarding old posts/items
Posts older than a certain date can be discarded by enabling the date filter plugin and configuring it to discard the older items.
User mentions and hash-tags
User mentions and hash-tags within Facebook content can be made searchable by enabling the social tags plugin.
Metadata mappings
The Facebook gathering template includes predefined Facebook specific metadata mappings:
| Class ID | Type | Behaviour | Explanation | Metadata fields included |
|---|---|---|---|---|
| | text | content | | |
| | text | display | | |
| | text | content | Description | |
| | text | content | | |
| | text | content | | |
| | text | content | | |
| | date | date | Date | |
| | text | display | | |
| | text | display | | |
| | text | display | | |
| | text | display | | |
| | geospatial x/y co-ordinate | N/A | | |
| | text | display | Event location | |
| | text | display | | |
| | text | display | | |
| | text | display | | |
| | text | display | | |
| | text | display | Zip/post code | |
| | text | display | | |
| | text | content | Post link | |
| | text | content | | |
| | text | content | | |
| | text | display | | |
| | text | display | | |
| | text | content | Event/page title | |
| | text | display | | |
Use the -SF query processor option to access these metadata fields on the search response and in the templates (e.g. -SF=[author,country]).
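The -SF value is a bracketed, comma-separated list of metadata class names. As a minimal sketch, the helper below assembles that string from a list of names. The function name build_sf_option is hypothetical and not part of Funnelback; the class names author and country are taken from the example above.

```python
def build_sf_option(metadata_classes):
    """Build a -SF query processor option string from metadata class names.

    Hypothetical helper for illustration only; it just formats the string
    in the -SF=[a,b] form shown in the documentation.
    """
    # Drop blank entries and surrounding whitespace.
    names = [name.strip() for name in metadata_classes if name.strip()]
    if not names:
        raise ValueError("at least one metadata class name is required")
    return "-SF=[" + ",".join(names) + "]"


option = build_sf_option(["author", "country"])
print(option)
```

The resulting string can then be added to the data source's query processor options.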
Limits
Please note that Facebook applies limits to the volume of content which can be retrieved from their APIs, and so in the case of large pages Funnelback may be unable to gather all historical content.
Caveats
Gathering content for Facebook events is only possible if the facebook.access-token property is specified with a never-expiring page access token.
Working with the fetched data
Funnelback will gather Facebook content and convert responses into XML. You can use the metadata customization tool to map elements to a metadata class.
To preview the crawled records please enable debug mode by setting the facebook.debug=true data source configuration option.
FacebookQueryPost XML Example
<FacebookXmlRecord>
  <postId>post_id</postId>
  <url>www.facebook.com/the_post_id</url>
  <postMessage>this is the post message</postMessage>
  <postCreatedTime>Tue Aug 27 13:42:17 EST 2013</postCreatedTime>
  <type>POST</type>
  <postFrom class="com.restfb.types.CategorizedFacebookType">
    <id>id of poster</id>
    <name>Name of poster</name>
    <category>Community</category>
  </postFrom>
  <postComments>
    <FacebookXmlRecord>
      <commentId>comment id</commentId>
      <url>www.facebook.com/comment id</url>
      <commentMessage>comment one of three</commentMessage>
      <commentFrom class="com.restfb.types.CategorizedFacebookType">
        <id>Comment poster ID</id>
        <name>Name of poster</name>
        <category>Community</category>
      </commentFrom>
      <commentCreatedTime>2013-08-27 03:42:37.0 UTC</commentCreatedTime>
      <type>COMMENT</type>
    </FacebookXmlRecord>
  </postComments>
</FacebookXmlRecord>
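When deciding which elements to map, it can help to inspect a post record outside Funnelback. The sketch below assumes a record shaped like the example above (trimmed here for brevity) and uses Python's standard xml.etree.ElementTree module. The function name summarize_post is hypothetical. Note that comments are nested FacebookXmlRecord elements inside postComments.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the post record shown above.
SAMPLE_POST = """<FacebookXmlRecord>
  <postId>post_id</postId>
  <url>www.facebook.com/the_post_id</url>
  <postMessage>this is the post message</postMessage>
  <type>POST</type>
  <postComments>
    <FacebookXmlRecord>
      <commentId>comment id</commentId>
      <commentMessage>comment one of three</commentMessage>
      <type>COMMENT</type>
    </FacebookXmlRecord>
  </postComments>
</FacebookXmlRecord>"""


def summarize_post(xml_text):
    """Return the post ID, message and comment messages from a post record."""
    root = ET.fromstring(xml_text)
    # Each comment is a nested FacebookXmlRecord under postComments.
    comments = [
        comment.findtext("commentMessage", default="")
        for comment in root.findall("postComments/FacebookXmlRecord")
    ]
    return {
        "id": root.findtext("postId"),
        "message": root.findtext("postMessage", default=""),
        "comments": comments,
    }


summary = summarize_post(SAMPLE_POST)
print(summary)
```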
FacebookQueryEvent XML Example
<FacebookXmlRecord>
  <eventId>id</eventId>
  <url>www.facebook.com/id</url>
  <eventName/>
  <eventDescription/>
  <eventStartTime>2024-02-13 20:00:00.0 UTC</eventStartTime>
  <eventEndTime>2024-02-13 23:00:00.0 UTC</eventEndTime>
  <eventLocation>Lima, Peru</eventLocation>
  <eventVenue>
    <city/>
    <country/>
    <latitude>-12.043333</latitude>
    <longitude>-77.028333</longitude>
    <state/>
    <street/>
    <zip/>
  </eventVenue>
  <eventPrivacy>OPEN</eventPrivacy>
  <eventOwner>
    <id>owner id</id>
    <name/>
  </eventOwner>
  <EventRsvpStatus/>
  <type>EVENT</type>
</FacebookXmlRecord>
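Event records carry timestamps and venue coordinates as plain text. The following sketch, assuming the timestamp format shown in the event example above (for instance 2024-02-13 20:00:00.0 UTC), converts them into typed values. The names parse_utc and parse_event are hypothetical, and this parser does not handle the differently formatted postCreatedTime value seen in the post example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed copy of the event record shown above.
SAMPLE_EVENT = """<FacebookXmlRecord>
  <eventId>id</eventId>
  <eventStartTime>2024-02-13 20:00:00.0 UTC</eventStartTime>
  <eventEndTime>2024-02-13 23:00:00.0 UTC</eventEndTime>
  <eventLocation>Lima, Peru</eventLocation>
  <eventVenue>
    <city/>
    <latitude>-12.043333</latitude>
    <longitude>-77.028333</longitude>
  </eventVenue>
  <type>EVENT</type>
</FacebookXmlRecord>"""


def parse_utc(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS.f UTC' string into an aware datetime."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f UTC")
    return naive.replace(tzinfo=timezone.utc)


def parse_event(xml_text):
    """Extract location, start time, duration and coordinates from an event."""
    root = ET.fromstring(xml_text)
    start = parse_utc(root.findtext("eventStartTime"))
    end = parse_utc(root.findtext("eventEndTime"))
    return {
        "location": root.findtext("eventLocation"),
        "start": start,
        "duration_hours": (end - start).total_seconds() / 3600,
        "lat": float(root.findtext("eventVenue/latitude")),
        "lon": float(root.findtext("eventVenue/longitude")),
    }


event = parse_event(SAMPLE_EVENT)
print(event["location"], event["start"].isoformat(), event["duration_hours"])
```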
FacebookQueryPage XML Example
<FacebookXmlRecord>
  <pageId>id</pageId>
  <url>http://www.facebook.com/pages/page</url>
  <page>
    <id>id</id>
    <name>page name</name>
    <category/>
    <link>http://www.facebook.com/pages/page</link>
    <founded/>
    <mission/>
    <products/>
    <description>the long description</description>
    <phone/>
    <about>Another description</about>
    <talkingAboutCount>0</talkingAboutCount>
    <isPublished>true</isPublished>
    <location>
      <street/>
      <city/>
      <state/>
      <country/>
      <zip/>
    </location>
  </page>
  <type>PAGE</type>
</FacebookXmlRecord>
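Many elements in a page record are empty, so before mapping elements to metadata classes it can be useful to list which element paths actually carry a value. This is a minimal sketch over a trimmed copy of the page example above. The helper name non_empty_fields and the slash-separated path format are assumptions made for illustration, not a Funnelback convention.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the page record shown above.
SAMPLE_PAGE = """<FacebookXmlRecord>
  <pageId>id</pageId>
  <page>
    <name>page name</name>
    <category/>
    <description>the long description</description>
    <about>Another description</about>
    <location>
      <street/>
      <city/>
    </location>
  </page>
  <type>PAGE</type>
</FacebookXmlRecord>"""


def non_empty_fields(element, prefix=""):
    """Map each leaf element path to its text, skipping empty elements."""
    fields = {}
    for child in element:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child):
            # Element has children: recurse instead of reading its text.
            fields.update(non_empty_fields(child, path))
        elif child.text and child.text.strip():
            fields[path] = child.text.strip()
    return fields


fields = non_empty_fields(ET.fromstring(SAMPLE_PAGE))
for path, value in fields.items():
    print(path, "=", value)
```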