Click report counts are inaccurate

Problem description

Sometimes a clicks for query or queries for click report shows a total count of clicks or queries that is lower than the total shown on other reports. For example, in the top queries report, the query bananas may have a total of 100 clicks for the specified time period. Clicking through to the top clicks for query: bananas report may show three top clicks:

  1. http://fruits.example.com/all_about_bananas with a total of 60 clicks

  2. http://fruits.example.com/bananas_nutrition_info with a total of 20 clicks

  3. http://fruits.example.com/potassium with a total of 10 clicks.

Summed together, the three results shown account for 90 clicks, 10 fewer than the 100 clicks reported for the bananas query.

Solution

This is caused by the max facts per dimension combination setting. To improve scalability and performance, the query reporting system stores only the most frequent data items (facts) and discards the rest. For example, only the 500 most popular queries per day are stored by default. The same limit means that, by default, only the 500 most popular clicks for any query, or queries for any click, are stored. The bananas query above may also have received clicks on other results, but those clicks were not popular enough to pass the max facts per dimension combination threshold, so they do not appear in the clicks for query report even though they contribute to the query's overall click total.
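
The effect of this kind of cutoff can be illustrated with a minimal Python sketch. This is not the reporting system's actual implementation. The function name keep_top_facts, the two extra URLs, and the limit of 3 (used here in place of the default of 500 so the numbers match the example above) are all assumptions made for illustration.

```python
from collections import Counter


def keep_top_facts(click_counts, max_facts):
    """Keep only the max_facts most-clicked URLs for one query.

    Hypothetical helper: it mimics a "top N per dimension" cutoff and is
    not the reporting system's real code.
    """
    return dict(Counter(click_counts).most_common(max_facts))


# Raw clicks for the query "bananas": 100 clicks in total.
# The last two URLs are invented to stand in for the unpopular results.
raw_clicks = {
    "http://fruits.example.com/all_about_bananas": 60,
    "http://fruits.example.com/bananas_nutrition_info": 20,
    "http://fruits.example.com/potassium": 10,
    "http://fruits.example.com/banana_bread": 6,   # hypothetical
    "http://fruits.example.com/plantains": 4,      # hypothetical
}

# Total as it would appear on the top queries report.
query_total = sum(raw_clicks.values())

# Facts that survive the cutoff (3 here for illustration; 500 by default).
stored = keep_top_facts(raw_clicks, max_facts=3)

# Total as it would appear on the clicks for query report.
reported_total = sum(stored.values())
missing = query_total - reported_total

print(query_total, reported_total, missing)  # 100 90 10
```

The 10 clicks on the two least popular URLs are dropped by the cutoff, which accounts for the difference between the two reports.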