crawler.user_agent

Background

This parameter specifies the user agent string used by the web crawler when making HTTP(S) requests.

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the crawler.user_agent key and set its value. The value can be any valid string.

Default value

crawler.user_agent=Mozilla/5.0 (compatible; Funnelback)

The default is a browser-style user agent. It maximises the chance that the crawler receives content from websites that return different content depending on browser type.

Some sites will return "Your browser doesn’t support frames" as a response if their code doesn’t see a specific user-agent like Mozilla/5.0, and the Funnelback web crawler would then get no content from the site.
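To illustrate the behaviour described above, here is a minimal Python sketch of the kind of server-side check some sites perform. The respond function and the page content are hypothetical and are not part of Funnelback; real sites implement this check in many different ways.

```python
def respond(user_agent: str) -> str:
    """Hypothetical server-side check: only browser-like agents get real content."""
    if user_agent.startswith("Mozilla/5.0"):
        return "<html><body>Real page content</body></html>"
    # Anything else receives the fallback message instead of the page.
    return "Your browser doesn't support frames"


# The default crawler.user_agent value passes the check.
print(respond("Mozilla/5.0 (compatible; Funnelback)"))

# A user agent without the Mozilla/5.0 prefix gets the fallback message.
print(respond("Funnelback"))
```

Under this assumed check, a user agent that does not start with Mozilla/5.0 would cause the crawler to store only the fallback message, which is why the default keeps the browser-style prefix.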

Examples

If you are crawling other people’s websites, it is proper "netiquette" to identify yourself:

crawler.user_agent=Mozilla/5.0 (compatible; Funnelback)

You may also wish to use a more specific string so that the Funnelback web crawler is easy to identify in your web server access logs.
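As a rough illustration of the access log use case, the following Python sketch picks out requests whose user agent contains a given token. It assumes the web server writes the common "combined" log format, where the user agent is the last quoted field. The crawler_requests helper, the regular expression, and the sample log lines are all hypothetical, so adjust them to match your own server's log format.

```python
import re

# Assumption: combined log format, where the user agent is the final quoted field.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def crawler_requests(log_lines, token="Funnelback"):
    """Return the log lines whose user agent field contains the given token."""
    hits = []
    for line in log_lines:
        match = USER_AGENT_FIELD.search(line)
        if match and token in match.group(1):
            hits.append(line)
    return hits


# Made-up sample entries using documentation-reserved IP addresses.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Funnelback)"',
    '198.51.100.7 - - [10/Oct/2024:13:55:40 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for line in crawler_requests(sample):
    print(line)
```

Matching on a token such as Funnelback works only if the configured crawler.user_agent value contains that token, which is one reason to keep an identifying name in the string.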