File exceeds maximum download size

Description

This error occurs when Funnelback encounters a file that is larger than the maximum allowed download size.

Error message

The following message is written to the url_errors.log file:

E http://www.funnelback.com/large-file.pdf [Exceeds max_download_size: 104405535] [2014:01:09:11:46:53]
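
To find every affected file after a crawl, the log can be scanned for this message. The following is a minimal sketch, not an official Funnelback tool: the pattern is inferred only from the single sample line above, the function name find_oversized is hypothetical, and the number in the message is assumed to be the file size in bytes.

```python
import re

# Pattern inferred from the one sample line in this article (an assumption);
# other Funnelback versions may format the entry differently.
OVERSIZE_RE = re.compile(
    r"^E\s+(?P<url>\S+)\s+\[Exceeds max_download_size:\s*(?P<size>\d+)\]"
)


def find_oversized(lines):
    """Return (url, reported_size) pairs for entries that exceeded the limit."""
    results = []
    for line in lines:
        match = OVERSIZE_RE.match(line.strip())
        if match:
            results.append((match.group("url"), int(match.group("size"))))
    return results


# The sample entry from this article, used as input.
sample = [
    "E http://www.funnelback.com/large-file.pdf "
    "[Exceeds max_download_size: 104405535] [2014:01:09:11:46:53]",
]
oversized = find_oversized(sample)
for url, size in oversized:
    print(url, size)
```

In practice the lines would come from the data source's url_errors.log file, for example by passing an open file object to find_oversized.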

Cause

Funnelback has encountered a file that matches the inclusion rules, but is larger than the maximum allowed download size.

Resolution

  1. Increase the value of the crawler.max_download_size data source configuration setting to the size of the largest file that is known to exist on the site (or the largest file size that you wish to allow).

  2. Update the data source so that the new limit takes effect.
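
When choosing a new value for step 1, the size reported in the error message can be converted into the unit used by the setting. The sketch below assumes that the reported size is in bytes and that crawler.max_download_size is expressed in megabytes; both are assumptions to confirm against the documentation for your Funnelback version. The function name required_setting_mb is hypothetical.

```python
import math


def required_setting_mb(reported_bytes, bytes_per_mb=1024 * 1024):
    """Smallest whole number of megabytes that covers the reported size.

    bytes_per_mb is an assumption: pass 1_000_000 if the setting uses
    decimal megabytes rather than binary ones.
    """
    return math.ceil(reported_bytes / bytes_per_mb)


# Size taken from the sample error message in this article.
needed = required_setting_mb(104405535)
print(needed)
```

If the unit is uncertain, rounding up to the larger of the two interpretations avoids the same file being rejected on the next update.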

Note: Setting a very large maximum download size may require increasing the amount of memory allocated to the web crawler, because filtering large files requires more memory. This can be adjusted using the gather.max_heap_size setting.