A web collection is a collection of documents obtained from one or more web sites. Funnelback will crawl the web sites, downloading the web pages and any other documents referenced by the web pages.
To avoid crawling the entire Internet, the crawler uses a number of configuration options that determine which links it follows and which web sites or domains the crawl is limited to.
Funnelback supports accessing web sites through HTTP and HTTPS.
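
The scoping idea described above can be illustrated with a small sketch. This is not Funnelback's implementation or its configuration syntax; the function `in_scope` and the parameters `include_domains` and `exclude_substrings` are hypothetical names chosen for illustration. It assumes a link is followed only if it uses HTTP or HTTPS, its host belongs to an allowed domain, and it does not match any exclusion.

```python
from urllib.parse import urlparse

# Only these URL schemes are considered crawlable in this sketch.
ALLOWED_SCHEMES = {"http", "https"}


def in_scope(url, include_domains, exclude_substrings=()):
    """Return True if a hypothetical crawler should follow `url`.

    include_domains: domains the crawl is limited to (assumed input).
    exclude_substrings: URL fragments that should never be followed (assumed input).
    """
    parts = urlparse(url)

    # Reject anything that is not HTTP or HTTPS (e.g. ftp:, mailto:).
    if parts.scheme not in ALLOWED_SCHEMES:
        return False

    # Accept the domain itself or any of its subdomains.
    host = (parts.hostname or "").lower()
    if not any(host == d or host.endswith("." + d) for d in include_domains):
        return False

    # Finally, drop URLs that match an exclusion.
    return not any(s in url for s in exclude_substrings)
```

For example, with `include_domains=["example.com"]`, a link to `https://www.example.com/about` would be followed, while links to `http://other.org/` or `ftp://example.com/file` would be skipped.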
- Configuration options