crawler.num_crawlers

Background

This option sets the number of crawler threads created to download pages from a set of websites. With the default frontier, only one thread accesses a given host at a time. For example, if you have 10 threads but are crawling only one site, 9 threads will be idle.
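
The one-thread-per-host rule can be illustrated with a small Python sketch. This is not part of the crawler and does not reproduce its scheduling. It only estimates an upper bound on busy threads, assuming each distinct hostname counts as one host. The function name busy_threads and the seed URLs are hypothetical.

```python
from urllib.parse import urlsplit


def busy_threads(num_crawlers, seed_urls):
    """Estimate the most threads that can be busy at once.

    Assumes one thread per host, so the bound is the smaller of the
    thread count and the number of distinct hostnames in the seeds.
    """
    hosts = {urlsplit(url).hostname for url in seed_urls}
    hosts.discard(None)  # ignore entries with no parsable hostname
    return min(num_crawlers, len(hosts))


# Hypothetical seed list: two pages on one host, one page on another.
seeds = [
    "https://example.com/",
    "https://example.com/about",
    "https://docs.example.org/",
]

print(busy_threads(10, seeds))      # 2
print(busy_threads(10, seeds[:2]))  # 1
```

With a single host in the seed list, the estimate is 1, which matches the case described above where 9 of 10 threads sit idle.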

Setting the key

Set this configuration key in the search package or data source configuration.

Use the configuration key editor to add or edit the crawler.num_crawlers key and set its value. The value can be any valid Integer.

Default value

crawler.num_crawlers=20

Examples

If you have a small number of distinct websites to crawl, then you might decide to reduce the number of threads:

crawler.num_crawlers=10

Leaving the value higher should not have any performance impact on the system, because extra threads stay idle when there is no site for them to crawl.