Twitter data source

Twitter is a social media site focused on publicly sharing short messages.

Please note that your usage of Funnelback to gather content from Twitter must comply with Twitter’s terms of service.

Getting authentication keys and secrets

Before you can crawl Twitter, ensure that you have:

- A Twitter account
- An application created within Twitter at https://apps.twitter.com/app/new (your Twitter login will be required), with the following details:
  - Application Name
  - Application Description
  - Website
  - CallbackURL

Once complete, note your OAuth consumer key and consumer secret, and your OAuth access token and token secret.

Configuration options

The following settings are supported as part of the data source setup.

twitter.oauth.consumer-key: OAuth consumer key.
twitter.oauth.consumer-secret: OAuth consumer secret.
twitter.oauth.access-token: OAuth access token.
twitter.oauth.token-secret: OAuth token secret.
twitter.users: Comma-delimited list of user names to crawl.
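
As a rough illustration, the settings listed above could be collected and checked for completeness before they are entered into the data source configuration. This is a sketch only: the `missing_settings` and `render_settings` helpers and all values are hypothetical and not part of Funnelback, and the `key=value` output assumes the same form as the `twitter.debug=true` option mentioned elsewhere on this page. Only the five key names come from the list above.

```python
# Key names are taken from the settings list; everything else is illustrative.
REQUIRED_KEYS = [
    "twitter.oauth.consumer-key",
    "twitter.oauth.consumer-secret",
    "twitter.oauth.access-token",
    "twitter.oauth.token-secret",
    "twitter.users",
]


def missing_settings(settings):
    """Return the required keys that are absent or blank (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if not settings.get(key, "").strip()]


def render_settings(settings):
    """Render the required settings as key=value lines (assumed format)."""
    return "\n".join(f"{key}={settings[key]}" for key in REQUIRED_KEYS)


# Placeholder values only; substitute the keys noted from your Twitter application.
example = {
    "twitter.oauth.consumer-key": "YOUR_CONSUMER_KEY",
    "twitter.oauth.consumer-secret": "YOUR_CONSUMER_SECRET",
    "twitter.oauth.access-token": "YOUR_ACCESS_TOKEN",
    "twitter.oauth.token-secret": "YOUR_TOKEN_SECRET",
    "twitter.users": "user_one,user_two",
}

if not missing_settings(example):
    print(render_settings(example))
```

Checking for blank values first avoids starting a crawl that would fail authentication because one of the four OAuth values was left out.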

Discarding old tweets

Tweets older than a certain date can be discarded by enabling the XML date filter plugin and configuring it to discard the older items.

User mentions and hash-tags

User mentions and hash-tags within Twitter content can be made searchable by enabling the social tags plugin.

Metadata mappings

Twitter data sources include a number of default Twitter-specific metadata mappings:

Class ID	Type	Behaviour	Explanation	Metadata fields included
`author`	text	content		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/screenName`
`authorImage`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/profileImageUrl`
`c`	text	content	Tweet	`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/tweet`
`country`	text	content		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/country`
`d`	date	date	Date	`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/createdDate`
`hashtag`	text	content		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/hashtags/Hashtag/text`
`identifier`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/id`
`image`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/linkedMediaURLs/MediaURL/thumbnail/pictureUrl`
`isReTweet`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/isReTweet`
`latLong`	geospatial x/y co-ordinate	N/A		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/latLong`
`linkedDisplayUrl`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/linkedURLs/URL/displayURL`
`linkedExpandedUrl`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/linkedURLs/URL/expandedURL`
`linkedShortUrl`	text	display		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/linkedURLs/URL/shortURL`
`location`	text	content		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/placeName`
`username`	text	content		`/com.funnelback.socialmedia.twitter.TwitterXmlRecord/username`

Use the -SF query processor option to access these metadata fields on the search response and in the templates (e.g. `-SF=[author,hashtag]`).
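
The option value is a bracketed, comma-separated list of metadata class IDs. A minimal sketch of assembling it from a list of class IDs follows; `build_sf_option` is a hypothetical helper written for illustration, not a Funnelback function, and it only reproduces the `-SF=[author,hashtag]` form shown above.

```python
def build_sf_option(class_ids):
    """Format metadata class IDs as a -SF option string (hypothetical helper)."""
    # Drop surrounding whitespace and skip empty entries.
    cleaned = [class_id.strip() for class_id in class_ids if class_id.strip()]
    if not cleaned:
        raise ValueError("at least one metadata class ID is required")
    return "-SF=[" + ",".join(cleaned) + "]"


print(build_sf_option(["author", "hashtag"]))  # prints -SF=[author,hashtag]
```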

Limits

Please note that Twitter limits the volume of content that can be retrieved from its APIs, so for large Twitter streams Funnelback may be unable to gather all historical content.

Working with the fetched data

Funnelback will crawl Twitter and convert responses into XML. You can use the metadata customisation tool to map elements to a metadata class.

To preview the crawled records, enable debug mode by setting the data source configuration option twitter.debug=true.

The XML that Funnelback generates for a Twitter data source is as follows:

<com.funnelback.socialmedia.twitter.TwitterXmlRecord>
  <id>tweet_id</id>
  <username>username</username>
  <screenName>some username</screenName>
  <profileImageUrl/>
  <tweet>tweet content</tweet>
  <createdDate>2018-06-20 14:58:03.0 UTC</createdDate>
  <url>https://twitter.com/user_name/status/tweet_id</url>
  <hashtags>
    <Hashtag>
      <start>110</start>
      <end>119</end>
      <text>hashtag content</text>
    </Hashtag>
  </hashtags>
  <linkedURLs>
    <URL>
      <start>133</start>
      <end>156</end>
      <shortUrl>https://t.co/qwert</shortUrl>
      <expandedURL>http://bit.ly/qwert</expandedURL>
      <displayURL>bit.ly/qwert</displayURL>
    </URL>
  </linkedURLs>
  <isReTweet>false</isReTweet>
  <linkedMediaURLs>
    <MediaURL>
      <baseUrl>http://pbs.twimg.com/media/qwert.jpg</baseUrl>
      <thumbnail>
        <pictureUrl>http://pbs.twimg.com/media/qwert.jpg:thumb</pictureUrl>
        <width>150</width>
        <height>150</height>
        <resizeMethod>CROP</resizeMethod>
      </thumbnail>
      <small>
        <pictureUrl>http://pbs.twimg.com/media/qwert.jpg:small</pictureUrl>
        <width>430</width>
        <height>430</height>
        <resizeMethod>FIT</resizeMethod>
      </small>
      <medium>
        <pictureUrl>http://pbs.twimg.com/media/qwert.jpg:medium</pictureUrl>
        <width>430</width>
        <height>430</height>
        <resizeMethod>FIT</resizeMethod>
      </medium>
      <large>
        <pictureUrl>http://pbs.twimg.com/media/qwert.jpg:large</pictureUrl>
        <width>430</width>
        <height>430</height>
        <resizeMethod>FIT</resizeMethod>
      </large>
    </MediaURL>
  </linkedMediaURLs>
</com.funnelback.socialmedia.twitter.TwitterXmlRecord>
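
To show how the element paths in the metadata mappings relate to this structure, the following sketch reads a record with Python's standard `xml.etree.ElementTree` module. It is an illustration only: the `summarise_record` helper and the trimmed sample record are hypothetical, and the sketch does not describe how Funnelback itself processes records.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the record structure shown above.
SAMPLE = """<com.funnelback.socialmedia.twitter.TwitterXmlRecord>
  <id>tweet_id</id>
  <username>username</username>
  <tweet>tweet content</tweet>
  <createdDate>2018-06-20 14:58:03.0 UTC</createdDate>
  <hashtags>
    <Hashtag>
      <start>110</start>
      <end>119</end>
      <text>example</text>
    </Hashtag>
  </hashtags>
  <linkedURLs>
    <URL>
      <expandedURL>http://bit.ly/qwert</expandedURL>
    </URL>
  </linkedURLs>
  <isReTweet>false</isReTweet>
</com.funnelback.socialmedia.twitter.TwitterXmlRecord>"""


def summarise_record(xml_text):
    """Pull a few fields out of one record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("id"),
        "username": root.findtext("username"),
        "tweet": root.findtext("tweet"),
        "created": root.findtext("createdDate"),
        # Repeating elements: one entry per <Hashtag> or <URL> element.
        "hashtags": [h.findtext("text") for h in root.iter("Hashtag")],
        "expanded_urls": [u.findtext("expandedURL") for u in root.iter("URL")],
        "is_retweet": root.findtext("isReTweet") == "true",
    }


print(summarise_record(SAMPLE))
```

Fields that can repeat, such as hashtags and linked URLs, are returned as lists, which mirrors how a single tweet can carry several values for the same metadata class.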
