cookies.txt

Introduction

Name

cookies.txt

Location

~/conf/COLLECTION/

Description

Defines pre-set cookie values that the crawler sends when updating a web collection. A collection's cookies.txt file is generally retrieved as part of a pre-gather workflow command.

To ensure that the cookies are read and used during the crawl, the following default collection.cfg settings should be enabled:

crawler.accept_cookies=true
crawler.packages.httplib=HTTPClient
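
As a rough illustration, the two settings above can be checked before a crawl with a short script. This is a sketch only: it assumes collection.cfg consists of plain key=value lines with # comments, and that a key missing from the file falls back to its default. The function names are hypothetical and are not part of the product.

```python
# Settings named in this page; both are described as defaults.
REQUIRED = {
    "crawler.accept_cookies": "true",
    "crawler.packages.httplib": "HTTPClient",
}


def parse_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments (assumed syntax)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def cookie_setting_problems(text):
    """Return a list of settings that override the values cookies.txt relies on.

    A key that is absent is assumed to keep its default and is not reported.
    """
    settings = parse_cfg(text)
    problems = []
    for key, expected in REQUIRED.items():
        actual = settings.get(key)
        if actual is not None and actual != expected:
            problems.append(f"{key}={actual} (expected {expected})")
    return problems
```

Calling cookie_setting_problems() with the contents of a collection.cfg returns an empty list when neither setting has been overridden.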

If the cookies.txt file is present, it is read at crawl start-up and any cookies parsed from it are used during the crawl. Any errors encountered while parsing the cookies.txt file are reported in the main crawl.log file.

Format

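The file uses the Netscape cookie file format, with one cookie per line. Each line has seven fields: domain, tailmatch flag, path, secure flag, expiry, name and value. Comment lines begin with a hash (#).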
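
For a pre-gather script that has to write the file itself, one line per cookie could be produced along these lines. This is a minimal sketch: format_cookie_line is a hypothetical helper, and the tab separator follows the usual Netscape convention rather than anything this page states about the crawler's parser.

```python
def format_cookie_line(domain, name, value, path="/",
                       tailmatch=True, secure=False, expires=0):
    """Build one Netscape-format cookie line.

    Field order: domain, tailmatch, path, secure, expires, name, value.
    An expires value of 0 is used here to mean "no expiry date".
    """
    fields = [
        domain,
        "TRUE" if tailmatch else "FALSE",
        path,
        "TRUE" if secure else "FALSE",
        str(int(expires)),
        name,
        value,
    ]
    # Netscape cookie files are conventionally tab-separated.
    return "\t".join(fields)


if __name__ == "__main__":
    print("# Domain\tTailmatch\tPath\tSecure\tExpires\tName\tValue")
    print(format_cookie_line("www.example.com", "id", "1234"))
```

Redirecting this output to cookies.txt in the collection's configuration directory would give a file with a comment header and a single cookie.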

Example

The following example cookies.txt file defines a cookie for the domain www.example.com with no expiry date (an Expires value of 0) and a name=value setting of id=1234.

 # Domain     Tailmatch  Path  Secure Expires Name  Value
 www.example.com  TRUE        /    FALSE  0       id    1234
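
To check a cookies.txt file before a crawl, the lines can be parsed with a few lines of Python. This is a sketch, not the crawler's own parser: Cookie and parse_cookies_txt are hypothetical names, fields are split on any run of whitespace so that both tab-separated and space-aligned files are accepted, and lines without all seven fields (including cookies with an empty value) are simply skipped.

```python
from typing import List, NamedTuple


class Cookie(NamedTuple):
    domain: str
    tailmatch: bool
    path: str
    secure: bool
    expires: int
    name: str
    value: str


def parse_cookies_txt(text: str) -> List[Cookie]:
    """Parse Netscape-format cookie lines, ignoring blanks and # comments."""
    cookies = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split into at most seven fields; the value keeps any inner spaces.
        fields = line.split(None, 6)
        if len(fields) != 7:
            continue  # incomplete line; skipped in this sketch
        domain, tailmatch, path, secure, expires, name, value = fields
        cookies.append(Cookie(
            domain=domain,
            tailmatch=tailmatch.upper() == "TRUE",
            path=path,
            secure=secure.upper() == "TRUE",
            expires=int(expires),
            name=name,
            value=value,
        ))
    return cookies
```

Applied to the example file above, the function yields one cookie for www.example.com with name id and value 1234.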
