Query logs
The Funnelback query logs record the queries that have been run against the search service. They are the basis for tasks such as generating reports on how the search service is being used.
Log locations
The Funnelback query log is located at $SEARCH_HOME/data/collection_name/live/log/queries-<HOSTNAME>.log, e.g. /opt/funnelback/data/shakespeare/live/log/queries-localhost.log
This log file is updated for each query.
When the collection is updated, the log files are archived by moving them to the directory $SEARCH_HOME/data/collection_name/archive
The archived file is named according to the server and the current date and time, e.g. $SEARCH_HOME/data/shakespeare/archive/queries-localhost.20190827_022830.log.gz
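As an illustration of the layout described above, the following Python sketch collects the archived and live query logs for one collection and reads them line by line. The function names `query_log_files` and `read_log_lines` are hypothetical helpers, not part of Funnelback. The fallback value for `SEARCH_HOME` and the UTF-8 encoding are assumptions; adjust them for your installation.

```python
import gzip
import os
from pathlib import Path


def query_log_files(collection, search_home=None):
    """Return archived query logs followed by live logs, each sorted by file name."""
    # Assumption: fall back to /opt/funnelback if SEARCH_HOME is not set.
    home = Path(search_home or os.environ.get("SEARCH_HOME", "/opt/funnelback"))
    base = home / "data" / collection
    archived = sorted((base / "archive").glob("queries-*.log.gz"))
    live = sorted((base / "live" / "log").glob("queries-*.log"))
    return archived + live


def read_log_lines(path):
    """Yield non-empty lines from a plain or gzip-compressed log file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    # Assumption: logs are UTF-8; undecodable bytes are replaced, not fatal.
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

For example, `query_log_files("shakespeare")` would list the files under `$SEARCH_HOME/data/shakespeare/archive` and `$SEARCH_HOME/data/shakespeare/live/log`, provided those directories exist.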
Log formats
Query log files contain one line per query processed, and each line consists of fields separated by commas. Some example lines from a query log:
...
Thu Jul 25 13:00:39 2019,59.34.60.144,chocolet,,,0,10,2x,0,0,3,_default,-
Thu Aug 8 14:22:41 2019,190.173.101.154,haloween,,,0,10,2x,0,0,3,_default,-
Tue Aug 27 02:25:49 2019,10.0.2.0,cream,g"typeSeafood",,1,10,S2x,16,0,9,_default,0d5f0777-8caf-4774-8d7f-f537b77e6c3c
...
The comma-separated fields are described in the following table:
Field no. | Field name | Notes |
---|---|---|
1 | | The format is: |
2 | | The IP address of the request (note: it may be a proxy, not the end-user’s workstation). |
3 | | The canonical query as actually processed by PADRE. |
4 | | This will contain an expression: |
5 | | See |
6 | | The result rank for the first item. For example, page 2 may have |
7 | | The number of results per page. |
8 | | Query processing settings, represented by single characters as shown in the next table. |
9 | | Number of items matching all query conditions. |
10 | | Number of partial matches. |
11 | | Time taken by PADRE to process the query (in milliseconds). |
12 | | The profile used by the query. |
13 | | Unique identifier of a user, if search session and history is enabled. If not, then a dash |
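The 13-field layout can be read with a few lines of Python. The sketch below is illustrative only: the attribute names in `QueryLogEntry` are hypothetical labels chosen for this example (fields 4 and 5 are left generic), and it assumes that no field contains a comma, which may not hold for every query. The timestamp pattern is inferred from the example lines.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical attribute names for the 13 comma-separated fields, in order.
QueryLogEntry = namedtuple("QueryLogEntry", [
    "timestamp", "ip_address", "query", "field_4", "field_5",
    "first_rank", "results_per_page", "processing_codes",
    "full_matches", "partial_matches", "elapsed_ms", "profile", "user_id",
])


def parse_query_log_line(line):
    """Parse one query log line; assumes no field contains a comma."""
    parts = line.rstrip("\n").split(",")
    if len(parts) != 13:
        raise ValueError(f"expected 13 fields, got {len(parts)}")
    return QueryLogEntry(
        # Timestamp pattern inferred from the examples, e.g. Thu Jul 25 13:00:39 2019
        timestamp=datetime.strptime(parts[0], "%a %b %d %H:%M:%S %Y"),
        ip_address=parts[1],
        query=parts[2],
        field_4=parts[3],
        field_5=parts[4],
        first_rank=int(parts[5]),
        results_per_page=int(parts[6]),
        processing_codes=parts[7],
        full_matches=int(parts[8]),
        partial_matches=int(parts[9]),
        elapsed_ms=int(parts[10]),
        profile=parts[11],
        # A dash means no user identifier was recorded.
        user_id=None if parts[12] == "-" else parts[12],
    )


def zero_result_queries(lines):
    """Return queries that had neither full nor partial matches."""
    entries = (parse_query_log_line(line) for line in lines)
    return [e.query for e in entries
            if e.full_matches == 0 and e.partial_matches == 0]
```

Applied to the three example lines, `zero_result_queries` would return the two misspelled queries (`chocolet` and `haloween`), which is the kind of information a usage report might surface.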
The following table gives an explanation of the query processing codes:
Field code | Notes |
---|---|
 | Results came from query cache. |
 | Results used to initialise query cache. |
 | PADRE run directly from CGI. |
 | Automatic query word stemming in force. |
 | Expired or killed documents included in ranking. |
 | Top documents reranked by combination of score, recency and "homepageness". |
 | Top documents reranked by URL. |
 | Top documents reranked by title. |
 | Top documents reranked by recency only. |
 | Unknown reranking method. |
 | Directly generated HTML. |
 | Old style result formatting (search.cgi). |
 | XML result format. |
 | Other result presentation format. |
 | Scoring mode |
Logging of IP addresses
An administrator can control how the IP addresses of search requests are logged via the "user ID to log" collection setting.