Factors affecting query performance
This article provides advice that helps you understand the different factors that affect the performance of a search query.
Details
The responsiveness of search results when running a query depends on many factors, some of which are external to Funnelback.
In the most basic use case, a user enters keywords into a Funnelback search page and presses enter. This query is submitted directly to Funnelback, which processes the query and returns a set of search results. For a search like this, all of the time between the user submitting the query and seeing the search results is Funnelback’s processing time, apart from any delay caused by the network connection between the user and the Funnelback server.
Next, consider a search that uses a partial integration, where the response from Funnelback (either an HTML chunk or JSON/XML) is nested inside another system such as a content management system (CMS).
In this scenario when a user types a search query into the search box and presses enter:
- The search keyword is submitted to the CMS search page.
- The CMS then takes this keyword and other parameters (such as collection ID) and generates a request to Funnelback.
- Funnelback processes this request and returns a result packet (either as HTML, JSON or XML) back to the CMS.
- The CMS then takes this result packet and processes it to render the search page that the user will see.
- The CMS then returns the rendered HTML to the end user.
As you can see, four of the five steps above are spent either communicating with the CMS or waiting for the CMS to perform internal processing. Only one of the steps is Funnelback processing the query. This scenario has all of the same performance issues as the first scenario, plus CMS processing and extra network requests.
Understanding performance
Query performance
Funnelback’s default template can be used to display a report on where time is spent when processing a query. This reports on the time taken from when Funnelback receives a query until it returns a response. It does not include any time spent due to a slow network connection between Funnelback and the user.
To access this feature, run a query using the default template. On the search results page, select performance from the drop-down menu located at the bottom right-hand corner of the search box.
The performance breakdown will be shown in a popup window.
If the template has been customized, you can view this information through the XML or JSON interface.
To do this, edit your URL so that it calls search.xml or search.json instead of search.html.
You will then get the response in XML or JSON format. The relevant section appears in the performanceMetrics element of the response packet. The JSON packet follows the same structure.
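The URL edit and the lookup of the performanceMetrics element can be sketched in Python. This is a minimal illustration, not part of Funnelback: the helper names switch_endpoint and find_performance_metrics are hypothetical, and because the exact position of performanceMetrics inside the packet is not assumed, the second helper simply searches the parsed JSON for that key.

```python
from urllib.parse import urlsplit, urlunsplit


def switch_endpoint(url: str, fmt: str) -> str:
    """Return the URL with a path ending in search.html changed to search.<fmt>."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    parts = urlsplit(url)
    if not parts.path.endswith("search.html"):
        raise ValueError("URL path does not end in search.html")
    # Swap only the extension; the query string is left untouched.
    new_path = parts.path[: -len("html")] + fmt
    return urlunsplit(parts._replace(path=new_path))


def find_performance_metrics(packet):
    """Depth-first search of a parsed JSON packet for a 'performanceMetrics' key.

    Returns the value stored under the first matching key, or None if absent.
    """
    if isinstance(packet, dict):
        if "performanceMetrics" in packet:
            return packet["performanceMetrics"]
        children = packet.values()
    elif isinstance(packet, list):
        children = packet
    else:
        return None
    for child in children:
        found = find_performance_metrics(child)
        if found is not None:
            return found
    return None
```

In practice you would fetch the URL returned by switch_endpoint(url, "json"), parse the body with json.loads, and pass the result to find_performance_metrics.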
Use a web browser debugger
The network inspection section of your browser debugger can be used to gain insight into the end-to-end time taken to process a query.
Use this tool to see where the overall time is spent when you submit a search. It could be that the search page has to fetch many linked resources or that the CMS takes a long time rendering the search response.
This information, when combined with the query performance information returned by Funnelback, can be used to understand where the time is being spent when a query runs. This allows you to focus your effort when attempting to improve performance.
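Combining the two measurements is simple arithmetic. The sketch below assumes you have read the end-to-end time for the search request from the browser's network panel and the total processing time from Funnelback's performance report, both in milliseconds. The function name and the returned keys are illustrative only.

```python
def time_outside_funnelback(end_to_end_ms: float, funnelback_ms: float) -> dict:
    """Split an end-to-end request time into Funnelback time and everything else.

    'Everything else' covers network transfer and any CMS processing.
    """
    if end_to_end_ms <= 0:
        raise ValueError("end_to_end_ms must be positive")
    if funnelback_ms < 0 or funnelback_ms > end_to_end_ms:
        raise ValueError("funnelback_ms must be between 0 and end_to_end_ms")
    outside_ms = end_to_end_ms - funnelback_ms
    return {
        "funnelback_ms": funnelback_ms,
        "outside_ms": outside_ms,
        # Fraction of the total time spent outside Funnelback.
        "outside_share": outside_ms / end_to_end_ms,
    }
```

For example, if the browser shows 1200 ms for the request and Funnelback reports 300 ms, then 900 ms (75% of the total) is spent outside Funnelback, which suggests looking at the network or the CMS before tuning the query itself.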
Improving performance
Remove layers of integration
As mentioned above, a common method of deploying Funnelback is to nest Funnelback search pages within a page on a content management system. The CMS is responsible for handling interaction with the user and relays user queries to Funnelback and passes the responses back to the user nested inside a CMS page.
This adds a considerable amount of time to the query response because CMS overheads are being added on top of what would occur if the user made the query directly to Funnelback.
The CMS overheads include:
- Additional response time of the user contacting the CMS and receiving the response from the CMS.
- Additional processing time as a result of the CMS page render time and time to translate the user request and Funnelback server response.
Having a user connect directly to Funnelback can result in a significant speed increase.
Server resourcing
Increasing the RAM and CPU allocation of the Funnelback VM can have a large impact on performance. Increasing available RAM allows Funnelback to load indexes into memory, which can significantly improve query response time.
Multi-server architecture
Funnelback can be scaled to support multiple servers. This is usually done by having one server responsible for administration and crawling, and one or more servers responsible for serving queries to users. Adding servers requires additional Funnelback licences.
Configuration optimization
There are various configuration options that can also improve the performance of Funnelback.
Query time optimizations include:
- Limiting the number of extra searches that run
- Setting sensible values for the number of results returned
- Limiting the amount of metadata returned
See: Query processing optimization for more detailed information on optimizing your queries.
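As an illustration of setting a sensible value for the number of results returned, an integration layer can cap the page size before the request reaches Funnelback. This sketch assumes the number of results is controlled by a num_ranks request parameter and that collection and query are the other parameter names. Check these against the documentation for your Funnelback version. The function name and the default cap of 20 are hypothetical.

```python
from urllib.parse import urlencode


def build_query_string(collection: str, query: str,
                       requested_results: int, max_results: int = 20) -> str:
    """Build a search query string with the result count clamped to 1..max_results."""
    # Clamp so that a caller cannot request an arbitrarily large result page.
    num_ranks = max(1, min(requested_results, max_results))
    return urlencode({
        "collection": collection,
        "query": query,
        "num_ranks": num_ranks,  # assumed parameter name; verify for your version
    })
```

A request for 500 results on the collection "example" would then be sent with num_ranks=20 rather than 500.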