crawler.classes.URLStore (collection.cfg)

Description

The crawler stores downloaded documents on the local file system so that they can be indexed. This option specifies the Java class used to store those documents.

The main store classes are:

com.funnelback.common.store.WarcStore

Store cached documents in a single compressed WARC file.

com.funnelback.common.io.MirrorStore

Store cached documents using a mirror of their URL directory structure.

Default value

crawler.classes.URLStore=com.funnelback.common.store.WarcStore
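As an illustration, a collection that should keep its cached documents in a mirrored directory layout rather than a single WARC file could override the default in collection.cfg roughly as follows. This is a minimal sketch that uses only the option name and class name listed on this page. It assumes no other settings are required for the change, which may not hold for every installation.

```
crawler.classes.URLStore=com.funnelback.common.io.MirrorStore
```

Removing the line again would return the collection to the default WarcStore class shown above.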
