gscopes.cfg (configuration file)






Maps URL patterns to gscopes.


Generalised scopes (gscopes) can be used in numerous ways to narrow searches to particular sub-parts of a collection. The gscopes.cfg file is a standard place to store mappings from gscope names to the URL patterns that the gscopes should be applied to.


A text file with one gscope name to URL pattern mapping per line. The URL pattern must be a Perl-compatible regular expression. Each line is:

(gscope name) (regular expression)

The gscope name is an alphanumeric ASCII string no longer than 64 characters. White space and punctuation are not permitted. Additionally, gscope names prefixed with Fun in any combination of upper and lower case are reserved for internal use only.
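
The line format and naming rules can be checked with a short script before the file is deployed. The following is a minimal sketch only: parse_gscope_line is a hypothetical helper, not part of the product, it assumes the name and pattern are separated by the first run of white space, and it uses Python's re module as a stand-in for a Perl-compatible engine, so some valid PCRE patterns may be rejected.

```python
import re

# Name rule from the text: alphanumeric ASCII, no longer than 64 characters.
NAME_RE = re.compile(r"[A-Za-z0-9]{1,64}")


def parse_gscope_line(line):
    """Split one gscopes.cfg line into (name, pattern).

    Hypothetical helper; raises ValueError if the line breaks a rule.
    """
    # Assumption: the first run of white space separates name from pattern.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"expected '(gscope name) (regular expression)': {line!r}")
    name, pattern = parts
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid gscope name: {name!r}")
    if name.lower().startswith("fun"):
        raise ValueError(f"gscope names prefixed with 'Fun' are reserved: {name!r}")
    try:
        # Python's re only approximates Perl-compatible syntax.
        re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regular expression {pattern!r}: {exc}") from exc
    return name, pattern
```

For example, parse_gscope_line(r"act \.act\.gov\.au/") returns the pair ("act", r"\.act\.gov\.au/"), while a name such as "funStuff" or "my-scope" raises ValueError.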


Maps government websites to different gscopes based on state:

act \.act\.gov\.au/
qld \.qld\.gov\.au/
tas \.tas\.gov\.au/
nsw \.nsw\.gov\.au/

Maps the 'documents' section of a website to gscope documents. Additionally gives '.doc' files in the important subdirectory the gscope importantWordDocuments:

documents www\.company\.com/documents/
importantWordDocuments www\.company\.com/documents/important/.*\.doc
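
To see how these two lines interact, the sketch below applies them to a few URLs. It is illustrative only: gscopes_for and the sample URLs are hypothetical, Python's re module stands in for a Perl-compatible engine, and it assumes a pattern may match anywhere within the URL and that a URL receives every gscope whose pattern matches.

```python
import re

# The two mappings from the example above, as (name, pattern) pairs.
MAPPINGS = [
    ("documents", r"www\.company\.com/documents/"),
    ("importantWordDocuments", r"www\.company\.com/documents/important/.*\.doc"),
]


def gscopes_for(url, mappings=MAPPINGS):
    """Return the names of all gscopes whose pattern matches the URL.

    Assumption: patterns are unanchored, so re.search is used.
    """
    return [name for name, pattern in mappings if re.search(pattern, url)]


for url in (
    "http://www.company.com/documents/important/plan.doc",
    "http://www.company.com/documents/brochure.pdf",
    "http://www.company.com/about/",
):
    print(url, gscopes_for(url))
```

Under these assumptions the first URL receives both gscopes, the second receives only documents, and the third receives none.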

Prefix the regular expression with the (?i) directive to use case-insensitive matching:

documents (?i)www\.company\.com/documents/

This will match URLs containing "Documents", "DOCUMENTS", "DoCuments", etc.
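
The effect of the (?i) directive can be checked quickly outside the product. This is a rough sketch using Python's re module, which accepts the same leading (?i) syntax as Perl-compatible engines; the matches helper and the sample URL are hypothetical.

```python
import re

CASE_SENSITIVE = r"www\.company\.com/documents/"
CASE_INSENSITIVE = r"(?i)www\.company\.com/documents/"


def matches(pattern, url):
    """Return True if the pattern matches anywhere in the URL."""
    return re.search(pattern, url) is not None


url = "http://www.company.com/DOCUMENTS/report.pdf"
print(matches(CASE_SENSITIVE, url))    # False: the case differs
print(matches(CASE_INSENSITIVE, url))  # True: (?i) ignores case
```

Note that (?i) applies to the whole expression, so the host name is matched case-insensitively as well.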


© 2015- Squiz Pty Ltd