Factors affecting query performance

Background

This article provides advice that helps you understand the different factors that affect the performance of a search query.

Details

The responsiveness of search results when running a query depends on many factors, some of which are external to Funnelback.

In the most basic use case a user enters keywords into a Funnelback search page and presses enter. The query is submitted directly to Funnelback, which processes it and returns a set of search results. For a search like this, all of the time between the user submitting the query and seeing the search results is Funnelback’s processing time, apart from any delay caused by the network connection between the user and the Funnelback server.

Next, consider a search that uses a partial integration, where the response from Funnelback (either an HTML chunk or JSON/XML) is nested inside another system such as a content management system (CMS).

In this scenario when a user types a search query into the search box and presses enter:

  1. The search keyword is submitted to the CMS search page.

  2. The CMS then takes this keyword and other parameters (such as collection ID) and generates a request to Funnelback.

  3. Funnelback processes this request and returns a result packet (either as HTML, JSON or XML) back to the CMS.

  4. The CMS then takes this result packet and processes it to render the search page that the user will see.

  5. The CMS then returns the rendered HTML to the end user.

As you can see, four of the above steps are time spent either communicating with the CMS or waiting for the CMS to perform internal processing. Only one of the steps is Funnelback processing the query. This scenario has all of the same performance issues as the first scenario, with CMS processing and extra network requests added on top.

Understanding performance

Query performance

Funnelback’s default template can be used to display a report on where time is spent when processing a query. This reports on the time taken from when Funnelback receives a query until it returns a response. It does not include any time lost to a slow network connection between Funnelback and the user.

To access this feature, run a query using the default template. On the search results page, select performance from the drop-down menu located at the bottom right-hand corner of the search box.

The performance breakdown will be shown in a popup window.

If the template has been customised, you can still view this information through the XML or JSON interface.

To do this, edit the URL so that it calls search.xml or search.json instead of search.html.

The response is then returned in XML or JSON format. The relevant section is the performanceMetrics element of the response packet. The JSON packet follows the same structure as the XML packet.
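
The URL edit described above can be scripted when many search pages need checking. The following is a minimal sketch, assuming the search URL path ends in search.html; the host name, path prefix and query parameters in the example are placeholders rather than values from a real installation, and to_data_endpoint is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def to_data_endpoint(url: str, fmt: str = "json") -> str:
    """Swap a trailing search.html path segment for search.json or search.xml."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    parts = urlsplit(url)
    # Split the path into everything before the last segment and the segment itself.
    head, sep, last = parts.path.rpartition("/")
    if last != "search.html":
        raise ValueError("URL path does not end in search.html")
    # Query string and fragment are carried over unchanged.
    return urlunsplit(parts._replace(path=f"{head}{sep}search.{fmt}"))


# Placeholder URL: host, path prefix and parameters are assumptions.
example = "https://search.example.com/s/search.html?collection=example&query=fees"
print(to_data_endpoint(example))
# prints https://search.example.com/s/search.json?collection=example&query=fees
```

The query string is left untouched, so the data endpoint receives the same query as the HTML page and the reported timings remain comparable.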

Use a web browser debugger

The network inspection section of your browser debugger can be used to gain insight into the end-to-end time taken to process a query.

Use this tool to see where the overall time is spent when you submit a search. It could be that the search page has to fetch many linked resources or that the CMS takes a long time rendering the search response.

This information, when combined with the query performance information returned by Funnelback, shows where time is being spent when a query runs. This allows you to focus your effort when attempting to improve performance.
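
As a rough illustration of combining the two measurements, the sketch below subtracts the time reported by Funnelback from the end-to-end time seen in the browser. The function name and the figures in the example are hypothetical; the end-to-end figure would be read from the browser's network panel and the Funnelback figure from the performance report.

```python
def split_query_time(end_to_end_ms: float, funnelback_ms: float) -> dict:
    """Split an end-to-end time into Funnelback time and everything else."""
    if funnelback_ms < 0 or end_to_end_ms < funnelback_ms:
        raise ValueError("expected 0 <= funnelback_ms <= end_to_end_ms")
    # Time not accounted for by Funnelback: network, CMS rendering, linked resources.
    outside_ms = end_to_end_ms - funnelback_ms
    # Fraction of the total spent inside Funnelback (0.0 when the total is zero).
    share = funnelback_ms / end_to_end_ms if end_to_end_ms else 0.0
    return {
        "funnelback_ms": funnelback_ms,
        "outside_ms": outside_ms,
        "funnelback_share": share,
    }


# Hypothetical figures: 1200 ms seen in the browser, 300 ms reported by Funnelback.
breakdown = split_query_time(1200, 300)
print(breakdown)
```

A low Funnelback share suggests that effort is better spent on the integration layer or the network than on query configuration.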

Improving performance

Remove layers of integration

As mentioned above, a common method of deploying Funnelback is to nest Funnelback search pages within a page on a content management system. The CMS handles all interaction with the user: it relays user queries to Funnelback and passes the responses back to the user nested inside a CMS page.

This adds a considerable amount of time to the query response because CMS overheads are being added on top of what would occur if the user made the query directly to Funnelback.

The CMS overheads include:

  • Additional response time of the user contacting the CMS and receiving the response from the CMS.

  • Additional processing time as a result of the CMS page render time and time to translate the user request and Funnelback server response.

Having a user connect directly to Funnelback can result in a significant speed increase.

Server resourcing

Increasing the RAM and CPU allocated to the Funnelback VM can have a large impact on performance. More available RAM allows Funnelback to load indexes into memory, which can significantly improve query response time.

Multi-server architecture

Funnelback can be scaled across multiple servers. This is usually done by having one server responsible for administration and crawling, and one or more servers responsible for serving queries to users. Adding servers requires additional Funnelback licences.

Configuration optimisation

Various configuration options can also improve Funnelback’s performance.

Query time optimisations include:

  • Limiting the number of extra searches that run

  • Setting sensible values for the number of results returned

  • Limiting the amount of metadata returned
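
As one illustration of setting a sensible result count, an integration layer could cap the requested number of results before forwarding a query to Funnelback. This is a sketch under stated assumptions: it assumes the result count is passed as a num_ranks URL parameter (confirm this against your Funnelback version), and the ceiling of 50, the example URL and the cap_result_count helper are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MAX_RESULTS = 50  # hypothetical ceiling; choose a value that suits your search page


def cap_result_count(url: str, param: str = "num_ranks",
                     ceiling: int = MAX_RESULTS) -> str:
    """Clamp the result-count parameter of a search URL to a ceiling."""
    parts = urlsplit(url)
    pairs = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        # Only numeric values of the named parameter are clamped.
        if key == param and value.isdigit():
            value = str(min(int(value), ceiling))
        pairs.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Placeholder URL: host, path and parameter name are assumptions.
capped = cap_result_count(
    "https://search.example.com/s/search.json?query=fees&num_ranks=500"
)
print(capped)
```

Requests that already ask for fewer results than the ceiling pass through with the same value.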

See: Query processing optimisation for more detailed information on optimising your queries.