crawler.classes.URLStore
Specifies the Java class used to store crawled content on disk, e.g. to create a mirror of the crawled files.
Key: crawler.classes.URLStore
Type: String
Can be set in: collection.cfg
Description
The crawler stores downloaded documents on the local file system so that they can be indexed. This option identifies the Java class used to store those documents.
The main store classes are:
- com.funnelback.common.store.WarcStore: (default) Store cached documents in a single compressed WARC file.
- com.funnelback.common.io.MirrorStore: Store cached documents using a mirror of their URL directory structure.
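As an illustration, a collection could switch from the default WARC store to the mirror store with a single line in collection.cfg. This is a minimal sketch that assumes the usual key=value syntax of that file; the key and class name are the ones listed above.

```
crawler.classes.URLStore=com.funnelback.common.io.MirrorStore
```

Leaving the key unset should keep the default, com.funnelback.common.store.WarcStore.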