Query logs

The Funnelback query logs record the queries that have been submitted to the search service. They are the main source of data for purposes such as generating reports on how the search service is being used.

Log locations

The Funnelback query log is located at data_root/collection_name/live/log/queries-<HOSTNAME>.log, for example /opt/funnelback/data/shakespeare/live/log/queries-localhost.log

This log file is updated for each query.

When the collection is updated, the query log files are archived by moving them to the directory data_root/collection_name/archive

The archived file name includes the server name and the date and time of archiving, for example /opt/funnelback/data/shakespeare/archive/queries-localhost.20190827_022830.log.gz
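The locations above can be turned into a small helper for scripts that need to read the logs. The following is a minimal sketch, not part of Funnelback: the function names are hypothetical, the archive file name pattern (queries-*.log.gz) is inferred from the single example above, and UTF-8 encoding of the log content is an assumption.

```python
import gzip
from pathlib import Path


def live_query_log(data_root, collection, hostname):
    """Build the path of the live query log for one collection and host."""
    return Path(data_root) / collection / "live" / "log" / f"queries-{hostname}.log"


def iter_archived_query_lines(data_root, collection):
    """Yield every line from the gzipped query logs in the archive directory.

    Assumption: archived logs match queries-*.log.gz, as in the example path.
    Files are visited in file name order.
    """
    archive_dir = Path(data_root) / collection / "archive"
    for path in sorted(archive_dir.glob("queries-*.log.gz")):
        # Assumption: the log content is UTF-8; undecodable bytes are replaced.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

For example, live_query_log("/opt/funnelback/data", "shakespeare", "localhost") gives the live log path shown above.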

Log formats

Query log files contain one line per query processed, and each line consists of fields separated by commas. Some example lines from a query log:

...
Thu Jul 25 13:00:39 2019,59.34.60.144,chocolet,,,0,10,2x,0,0,3,_default,-
Thu Aug  8 14:22:41 2019,190.173.101.154,haloween,,,0,10,2x,0,0,3,_default,-
Tue Aug 27 02:25:49 2019,10.0.2.0,cream,g"typeSeafood",,1,10,S2x,16,0,9,_default,0d5f0777-8caf-4774-8d7f-f537b77e6c3c
...

The comma-separated fields are described in the following table:

Field no.  Field name       Notes
1          date_time        The format is: Thu Aug 8 14:22:41 2019
2          requester_IP     The IP address of the request (note: it may be a proxy, not the end-user’s workstation).
3          query            The canonical query as actually processed by PADRE.
4          include_scope    This will contain an expression: g"<expr>" where <expr> is a valid gscope expression. An optional textual scope will be appended after a | character if an additional scope parameter is used. e.g. g"0"|.gov.au. Please note that a gscope expression may contain commas.
5          exclude_scope    See include_scope.
6          start_rank       The result rank for the first item. For example, page 2 may have start_rank=21 if the first page contained 20 results.
7          num_ranks        The number of results per page.
8          codes            Query processing settings, represented by single characters as shown in the next table.
9          full_matches     Number of items matching all query conditions.
10         partial_matches  Number of partial matches.
11         elapsed_time     Time taken by PADRE to process the query (in milliseconds).
12         profile          The profile used by the query.
13         user_id          Unique identifier of a user, if search session and history is enabled. If not, then a dash - will be written to this field.
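Because a gscope expression may contain commas, a naive split on every comma can produce the wrong number of fields. The following is a minimal sketch of one way to parse a line, not an official parser. It assumes that any such commas fall inside the double quotes of g"<expr>", that the query itself contains no commas or unbalanced quotes, and that the process runs with English day and month names. The function names are hypothetical.

```python
from datetime import datetime

FIELD_NAMES = [
    "date_time", "requester_IP", "query", "include_scope", "exclude_scope",
    "start_rank", "num_ranks", "codes", "full_matches", "partial_matches",
    "elapsed_time", "profile", "user_id",
]
INTEGER_FIELDS = ("start_rank", "num_ranks", "full_matches",
                  "partial_matches", "elapsed_time")


def split_fields(line):
    """Split on commas, ignoring commas that sit inside double quotes."""
    fields, current, in_quotes = [], [], False
    for ch in line.rstrip("\n"):
        if ch == "," and not in_quotes:
            fields.append("".join(current))
            current = []
            continue
        if ch == '"':
            in_quotes = not in_quotes
        current.append(ch)
    fields.append("".join(current))
    return fields


def parse_query_log_line(line):
    """Return one log line as a dict keyed by the field names in the table."""
    fields = split_fields(line)
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")
    record = dict(zip(FIELD_NAMES, fields))
    # The day of month may be space-padded, as in "Thu Aug  8 ..." in the examples.
    record["date_time"] = datetime.strptime(record["date_time"], "%a %b %d %H:%M:%S %Y")
    for name in INTEGER_FIELDS:
        record[name] = int(record[name])
    return record
```

Lines that do not yield exactly 13 fields raise ValueError, so they can be counted and inspected rather than silently misread.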

The following table gives an explanation of the query processing codes:

Field code  Notes
C           Results came from query cache.
I           Results used to initialise query cache.
G           PADRE run directly from CGI.
S           Automatic query word stemming in force.
Z           Expired or killed documents included in ranking.
r           Top documents reranked by combination of score, recency and "homepageness".
u           Top documents reranked by URL.
t           Top documents reranked by title.
d           Top documents reranked by recency only.
?           Unknown reranking method.
h           Directly generated HTML.
w           Old style result formatting (search.cgi).
x           XML result format.
*           Other result presentation format.
0-9         Scoring mode.
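As an illustration, the codes field can be expanded into the descriptions from the table above. This is a minimal sketch: the function name describe_codes is hypothetical, and a character that is not in the table is reported as unrecognised rather than guessed at.

```python
CODE_MEANINGS = {
    "C": "Results came from query cache.",
    "I": "Results used to initialise query cache.",
    "G": "PADRE run directly from CGI.",
    "S": "Automatic query word stemming in force.",
    "Z": "Expired or killed documents included in ranking.",
    "r": 'Top documents reranked by combination of score, recency and "homepageness".',
    "u": "Top documents reranked by URL.",
    "t": "Top documents reranked by title.",
    "d": "Top documents reranked by recency only.",
    "?": "Unknown reranking method.",
    "h": "Directly generated HTML.",
    "w": "Old style result formatting (search.cgi).",
    "x": "XML result format.",
    "*": "Other result presentation format.",
}


def describe_codes(codes):
    """Return one description per character of the codes field."""
    descriptions = []
    for ch in codes:
        if ch in "0123456789":
            descriptions.append(f"Scoring mode {ch}")
        else:
            descriptions.append(CODE_MEANINGS.get(ch, f"Unrecognised code {ch!r}"))
    return descriptions
```

For the value S2x from the example lines, this returns the stemming description, "Scoring mode 2" and the XML result format description, in that order.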

Logging of IP addresses

An administrator can control how the IP addresses of search requests are logged via the "user ID to log" collection setting.

© 2015- Squiz Pty Ltd