Gscopes Cfg

Introduction

Name

gscopes.cfg

Location

~/conf/collection/

Description

List of mappings from 'General scope' (gscope) numbers to URL patterns

Background

General scopes can be used in numerous ways to narrow down searches to particular sub-parts of a collection. The gscopes.cfg file is a standard place to store mappings from gscope numbers to the URL patterns that the numbers should be applied to.

Format

A text file with one mapping from a gscope number to a URL pattern per line. The URL pattern must be a Perl-compatible regular expression. Each line has the form:

(gscope number) (regular expression)
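As an illustration of this line format, the following minimal Python sketch splits one line into its two fields. It is not how the search engine itself reads the file: the helper name parse_gscope_line is hypothetical, splitting on the first run of whitespace is an assumption, and Python's re module is only a stand-in for a Perl-compatible engine (the simple patterns shown on this page behave the same in both).

```python
import re


def parse_gscope_line(line):
    """Split one gscopes.cfg-style line into (gscope number, compiled pattern).

    Hypothetical helper: assumes the number and the pattern are separated
    by the first run of whitespace on the line.
    """
    number, pattern = line.strip().split(None, 1)
    return int(number), re.compile(pattern)


number, pattern = parse_gscope_line(r"1 \.act\.gov\.au/")
```

Here number holds the integer 1 and pattern holds the compiled regular expression, which can then be tested against a URL with pattern.search(url).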

Examples

Maps Australian government websites to different gscope numbers by state or territory:

1 \.act\.gov\.au/
2 \.qld\.gov\.au/
3 \.tas\.gov\.au/
4 \.nsw\.gov\.au/

Maps the 'documents' section of a website to gscope 34, and additionally assigns gscope 45 to '.doc' files in its 'important' subdirectory:

34 www\.company\.com/documents/
45 www\.company\.com/documents/important/.*\.doc
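The two rules above overlap, so a '.doc' file under the important subdirectory matches both patterns. The Python sketch below shows that overlap under stated assumptions: the RULES list and the gscopes_for function are hypothetical names, patterns are assumed to match anywhere in the URL (as the unanchored examples suggest), and Python's re module stands in for a Perl-compatible engine.

```python
import re

# The two example rules, as (gscope number, pattern) pairs.
RULES = [
    (34, r"www\.company\.com/documents/"),
    (45, r"www\.company\.com/documents/important/.*\.doc"),
]


def gscopes_for(url):
    """Return every gscope number whose pattern matches somewhere in the URL."""
    return [number for number, pattern in RULES if re.search(pattern, url)]


important_doc = gscopes_for("http://www.company.com/documents/important/report.doc")
other_page = gscopes_for("http://www.company.com/documents/readme.html")
```

In this sketch important_doc contains both 34 and 45, while other_page contains only 34.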

Prefix the regular expression with the (?i) directive to use case-insensitive matching:

34 (?i)www\.company\.com/documents/

This will match URLs containing "Documents", "DOCUMENTS", "DoCuments", etc.
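The effect of the (?i) directive can be checked with a short Python sketch. This is only an approximation: Python's re module is used in place of a Perl-compatible engine (both accept a leading (?i)), and the URL is an invented example.

```python
import re

# Same pattern with and without the leading (?i) directive.
sensitive = re.compile(r"www\.company\.com/documents/")
insensitive = re.compile(r"(?i)www\.company\.com/documents/")

# Invented example URL with an upper-case path segment.
url = "http://www.company.com/DOCUMENTS/report.pdf"

matches_sensitive = sensitive.search(url) is not None
matches_insensitive = insensitive.search(url) is not None
```

Only the pattern with the (?i) prefix matches the upper-case URL.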
