Query logs

Introduction

The Funnelback query logs record the queries that have been submitted to the search service. They are the basis for generating reports on how the search service is being used.

Log Locations

The Funnelback query log is located at:

data_root/collection_name/live/log/queries.log
Example: /opt/funnelback/shakespeare/live/log/queries.log

This log file is updated for each query.

When the collection is updated, the query log is archived by moving it to the directory:

data_root/collection_name/archive

The archived file is named by appending the current date to the log file name:

Example: /opt/funnelback/shakespeare/archive/queries.log.20040715
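The path layout described above can be sketched in Python as follows. This is an illustration only: the function names live_query_log and archived_query_log are hypothetical, and the YYYYMMDD date suffix is an assumption inferred from the single example file name shown above.

```python
from datetime import date
from pathlib import PurePosixPath


def live_query_log(data_root, collection_name):
    """Build the path of the live query log for a collection."""
    return PurePosixPath(data_root) / collection_name / "live" / "log" / "queries.log"


def archived_query_log(data_root, collection_name, day):
    """Build the path of a query log archived on the given day.

    Assumption: the suffix is the date formatted as YYYYMMDD,
    as in the example queries.log.20040715.
    """
    file_name = f"queries.log.{day:%Y%m%d}"
    return PurePosixPath(data_root) / collection_name / "archive" / file_name


if __name__ == "__main__":
    print(live_query_log("/opt/funnelback", "shakespeare"))
    print(archived_query_log("/opt/funnelback", "shakespeare", date(2004, 7, 15)))
```

Run against the values from the examples above, the two calls reproduce the example paths for the shakespeare collection.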

Log Formats

Query log files contain one line per query processed. Each line consists of fields separated by commas. Some example lines from a query log:

...
Mon Sep  8 14:51:13 2008,100.200.300.400,gdp graph,g"0",,1,10,2x,0,1,28,info
Mon Sep  8 14:51:26 2008,99.88.77.66,public holidays,g"0",,1,10,2x,21677,425628,7,
Mon Sep  8 14:51:51 2008,4.3.2.1,tax,,,1,10,2x,129974,0,4,
...

The comma-separated fields are described in the following table:

Field no. Field name Notes
1 date_time The format is: Fri Feb 22 12:44:22 2002
2 requester_IP The IP address of the request (note: it may be a proxy, not the end-user's workstation).
3 query The canonical query as actually processed by PADRE.
4 include_scope This is either a number indicating an fscope value or an expression of the form g"<expr>", where <expr> is a valid gscope expression. If an additional scope parameter is used, a textual scope is appended after a '|' character, e.g. g"0"|.gov.au. The contents of this field are undefined if both an fscope and a gscope parameter are used in the same query. Note that a gscope expression may contain commas.
5 exclude_scope Same format as include_scope.
6 start_rank The result rank for the first item. For example, page 2 may have start_rank = 21 if the first page contained 20 results.
7 num_ranks The number of results per page.
8 codes Query processing settings, represented by single characters as shown in the next table.
9 full_matches Number of items matching all query conditions.
10 partial_matches Number of partial matches.
11 elapsed_time Time taken by PADRE to process the query (in milliseconds).
12 profile The profile used by the query.
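Because a gscope expression may contain commas, splitting a line on every comma can yield too many fields. The following Python sketch is one way to parse a line into the twelve fields listed above. It is not part of Funnelback: the function names are hypothetical, and it assumes that commas inside a field only ever appear between double quotes (as in g"<expr>") and that the query itself contains no unquoted commas.

```python
from datetime import datetime

FIELD_NAMES = [
    "date_time", "requester_IP", "query", "include_scope", "exclude_scope",
    "start_rank", "num_ranks", "codes", "full_matches", "partial_matches",
    "elapsed_time", "profile",
]
INT_FIELDS = ("start_rank", "num_ranks", "full_matches",
              "partial_matches", "elapsed_time")


def split_fields(line):
    """Split on commas, ignoring commas that sit inside double quotes."""
    fields, current, in_quotes = [], [], False
    for ch in line.rstrip("\n"):
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "," and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def parse_query_log_line(line):
    """Return a dict keyed by field name, with numeric fields converted."""
    fields = split_fields(line)
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")
    record = dict(zip(FIELD_NAMES, fields))
    # The timestamp uses the layout "Fri Feb 22 12:44:22 2002"; parsing the
    # day and month names assumes an English (C) locale.
    record["date_time"] = datetime.strptime(record["date_time"],
                                            "%a %b %d %H:%M:%S %Y")
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


def page_number(record):
    """Derive the result page from start_rank and num_ranks (1-based)."""
    return (record["start_rank"] - 1) // record["num_ranks"] + 1


if __name__ == "__main__":
    sample = ('Mon Sep  8 14:51:26 2008,99.88.77.66,public holidays,'
              'g"0",,1,10,2x,21677,425628,7,')
    print(parse_query_log_line(sample))
```

Lines that do not yield exactly twelve fields raise a ValueError rather than being guessed at, which makes it easier to spot entries that break the quoting assumption.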

The following table gives an explanation of the query processing codes:

Field Code Notes
C Results came from query cache.
I Results used to initialise query cache.
G PADRE run directly from CGI.
S Automatic query word stemming in force.
Z Expired or killed documents included in ranking.
r Top documents reranked by combination of score, recency and "homepageness".
u Top documents reranked by URL.
t Top documents reranked by title.
d Top documents reranked by recency only.
? Unknown reranking method.
h Directly generated HTML.
w Old style result formatting (search.cgi).
x XML result format.
* Other result presentation format.
0-9 Scoring mode
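As an illustration of how the codes field can be read, the Python sketch below maps each character to its description from the table above. The names CODE_MEANINGS and decode_codes are hypothetical, and the sketch assumes the codes are written one after another with no separator, as in the value 2x from the example log lines.

```python
# Descriptions taken from the table of query processing codes.
CODE_MEANINGS = {
    "C": "Results came from query cache",
    "I": "Results used to initialise query cache",
    "G": "PADRE run directly from CGI",
    "S": "Automatic query word stemming in force",
    "Z": "Expired or killed documents included in ranking",
    "r": 'Top documents reranked by combination of score, recency and "homepageness"',
    "u": "Top documents reranked by URL",
    "t": "Top documents reranked by title",
    "d": "Top documents reranked by recency only",
    "?": "Unknown reranking method",
    "h": "Directly generated HTML",
    "w": "Old style result formatting (search.cgi)",
    "x": "XML result format",
    "*": "Other result presentation format",
}


def decode_codes(codes):
    """Return one description per character of the codes field."""
    meanings = []
    for ch in codes:
        if ch in "0123456789":
            # Digits 0-9 identify the scoring mode.
            meanings.append(f"Scoring mode {ch}")
        else:
            # Characters not in the table are reported, not guessed at.
            meanings.append(CODE_MEANINGS.get(ch, f"Unrecognised code {ch!r}"))
    return meanings


if __name__ == "__main__":
    for meaning in decode_codes("2x"):
        print(meaning)
```

For the value 2x this yields the scoring mode followed by the XML result format entry.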

Logging of IP addresses

An administrator can control how search request IP addresses are logged via the "user ID to log" collection setting.
