gscopes.cfg (configuration file)
Generalised scopes (gscopes) can be used in numerous ways to narrow searches down to particular sub-parts of a collection. The
gscopes.cfg file is a standard place to store mappings from gscope names to the URL patterns they should be applied to.
The file is plain text, with one mapping from a gscope name to a URL pattern per line. The URL pattern must be a Perl-compatible regular expression. Each line has the form:
(gscope name) (regular expression)
The gscope name is an alphanumeric ASCII string no longer than 64 characters. White space and punctuation are not permitted. Additionally, gscope names prefixed with
Fun, in any combination of upper and lower case, are reserved for internal use.
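The line format and naming rules can be checked mechanically before a file is deployed. The following is a minimal sketch, not part of the product: the function name parse_gscope_line is hypothetical, and Python's re module stands in for a Perl-compatible engine, which is an assumption that holds for simple patterns but not for every PCRE feature.

```python
import re

# Naming rule from the text: alphanumeric ASCII, at most 64 characters.
NAME_RE = re.compile(r"[A-Za-z0-9]{1,64}")


def parse_gscope_line(line):
    """Split one gscopes.cfg line into (name, compiled pattern).

    Hypothetical helper. Raises ValueError if the line breaks a documented rule.
    """
    # The name ends at the first run of white space; the rest is the pattern.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected '<gscope name> <regular expression>'")
    name, pattern = parts
    if not NAME_RE.fullmatch(name):
        raise ValueError("gscope name must be 1-64 alphanumeric ASCII characters")
    # 'Fun' in any upper or lower case form is reserved for internal use.
    if name.lower().startswith("fun"):
        raise ValueError("gscope names starting with 'Fun' are reserved")
    try:
        # Assumption: Python's re approximates the Perl-compatible syntax.
        compiled = re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regular expression: {exc}") from exc
    return name, compiled
```

Calling parse_gscope_line(r"act \.act\.gov\.au/") returns the name act with its compiled pattern, while a line such as "FUNinternal x" or "bad-name x" raises ValueError.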
Maps government websites to different gscopes based on state:
act \.act\.gov\.au/
qld \.qld\.gov\.au/
tas \.tas\.gov\.au/
nsw \.nsw\.gov\.au/
Maps the 'documents' section of a website to the gscope
documents. Additionally gives '.doc' files in the important subdirectory the gscope
importantWordDocuments:
documents www\.company\.com/documents/
importantWordDocuments www\.company\.com/documents/important/.*\.doc
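To see how these two mappings interact, the sketch below applies them to a few made-up URLs. It assumes that a pattern may match anywhere in the URL and that a URL receives every gscope whose pattern matches; the function name gscopes_for and the sample URLs are hypothetical, and Python's re module is used in place of a Perl-compatible engine.

```python
import re

# The two mappings from the example, as (gscope name, pattern) pairs.
MAPPINGS = [
    ("documents", r"www\.company\.com/documents/"),
    ("importantWordDocuments", r"www\.company\.com/documents/important/.*\.doc"),
]


def gscopes_for(url, mappings=MAPPINGS):
    """Return the name of every mapping whose pattern is found in the URL."""
    # Assumption: unanchored search, and all matching gscopes are applied.
    return [name for name, pattern in mappings if re.search(pattern, url)]


print(gscopes_for("http://www.company.com/documents/report.pdf"))
# ['documents']
print(gscopes_for("http://www.company.com/documents/important/plan.doc"))
# ['documents', 'importantWordDocuments']
print(gscopes_for("http://www.company.com/about/"))
# []
```

Under the same assumption, the unanchored pattern ending in \.doc would also match a URL ending in .docx; appending $ to the pattern would restrict it to '.doc' exactly.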
Prefix the regular expression with the (?i) directive to use case-insensitive matching:
documents (?i)www\.company\.com/documents/
This will match URLs containing "Documents", "DOCUMENTS", "DoCuments", etc.
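The effect of the (?i) directive can be checked with a short script. This is a sketch under stated assumptions: the pattern below is an illustrative one built from the 'documents' mapping, the helper name matches and the URLs are hypothetical, and Python's re module (which accepts a leading (?i) in the same way) stands in for a Perl-compatible engine.

```python
import re

# Illustrative patterns: the same expression with and without (?i).
CASE_SENSITIVE = r"www\.company\.com/documents/"
CASE_INSENSITIVE = r"(?i)www\.company\.com/documents/"


def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return re.search(pattern, url) is not None


for url in [
    "http://www.company.com/documents/a.pdf",
    "http://www.company.com/Documents/a.pdf",
    "http://www.company.com/DOCUMENTS/a.pdf",
    "http://www.company.com/DoCuments/a.pdf",
]:
    print(url, matches(CASE_SENSITIVE, url), matches(CASE_INSENSITIVE, url))
```

Only the first URL matches the case-sensitive pattern, while all four match once (?i) is added.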