HTML search results integration guide
Background
Funnelback search results are often served from an HTML endpoint. This endpoint can produce a full page with headers/footers/CSS/JS, or it can produce a section of HTML with only the search results that can be integrated with another platform.
This guide shows how to embed Funnelback HTML search results into an existing website/CMS/platform. The Funnelback higher education stencil is used as an example, but the concepts may be applied to any implementation of Funnelback.
Choosing integration method
There are three ways of integrating your website with Funnelback search results, each with its own advantages and disadvantages. Two of the methods involve integration with HTML search results that are templated using Freemarker. This guide outlines how to integrate using these two methods.
Additional advantages of embedding HTML in a CMS over returning the full page from Funnelback are:
-
Headers and footers: It can be challenging to perfectly implement headers and footers from a site when Funnelback serves the full page, especially if there are conflicting CSS frameworks. Challenges can also include Javascript conflicts on menu interactions and sticky menus for mobile devices.
-
Ongoing maintenance/redesign: If the website design is changed in the future, a change also has to be made and tested on the results page template in the Funnelback server.
-
SSL certificate: When Funnelback serves the full page, a dedicated search domain (e.g. search.client.com) is required, and that domain needs its own SSL certificate and renewals. This may have a cost, and the certificate must be kept up to date on the Funnelback server.
How to follow this guide
This guide should be followed conceptually by applying the explanations and example code to the framework/CMS/platform that the implementation will be completed in. Throughout this guide, sections will call out where assumptions or specific techniques have been used for the purpose of the guide that should be generalized for other implementations.
The sample code in this guide uses JavaScript; however, these concepts are not language-specific and should be tailored to each individual situation.
URLs used in this guide
Throughout this guide, example URLs are used to demonstrate concepts and specify whether a certain action is taking place on the Funnelback server or on the framework/CMS/platform of the client implementation. These URLs are fictional and do not actually exist.
-
www.client-university.edu: The website of Client University (the client framework/CMS/platform)
-
client.funnelback.com: The Funnelback server where the search indexes are hosted for Client University
In this guide, 'Client University' is the client and 'Funnelback' or 'the Funnelback server' is the vendor. Below is a representation of the requests:
-
The original query is sent from the user to the client framework/CMS/platform
-
The query is passed on to the Funnelback server
-
Funnelback returns HTML containing the search results to the client framework/CMS/platform
-
The search results are embedded into the complete page and returned to the user
Setting up the Funnelback configuration
The core concept in configuring Funnelback to return HTML for embedding within a page, instead of a full HTML document, is to modify the HTML that Funnelback returns so that it is contained within a <div> element (a <main> element may be preferable semantically). The HTML endpoint in Funnelback should not return <html>, <head>, or <body> elements.
This can be accomplished by editing the Freemarker templates on the Funnelback server so that they produce only the desired section of HTML instead of a full HTML document.
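After the templates have been edited, a quick automated check can confirm that the endpoint now returns a fragment rather than a full document. The following is a minimal sketch; the function name isEmbeddableFragment is hypothetical and the check is a simple pattern match, not a full HTML parse.

```javascript
// Hypothetical helper (not part of Funnelback): returns true when the HTML
// looks like an embeddable fragment, i.e. it contains no <html>, <head>,
// or <body> tags.
function isEmbeddableFragment(html) {
  // [\s>] after the tag name stops <header> from matching <head>
  const documentTag = /<\/?(html|head|body)[\s>]/i
  return !documentTag.test(html)
}

console.log(isEmbeddableFragment('<div id="search"><header>Results</header></div>'))
console.log(isEmbeddableFragment('<html><body><div>Results</div></body></html>'))
```

A check like this could run as part of a smoke test against the development Funnelback server whenever the templates change.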
Configure the search link
The ui.modern.search_link option in the profile configuration must be configured to match where the search results page will exist on the client framework/CMS/platform so that the links produced by the Funnelback server are correct.
In this guide, the search results page exists at https://www.client-university.edu/search, so the profile configuration setting would be:
ui.modern.search_link=/search
Configure the integration URL
The ui.integration_url configuration setting in the collection configuration must also be configured. This allows the Funnelback server to be aware of where the search engine results page exists on the client framework/CMS/platform for features such as the Insights Dashboard preview search.
ui.integration_url=https://www.client-university.edu/search?collection={collection}&query={query}&profile={profile}
Environment variables
Wherever possible, environment variables should be used instead of hard-coded values. Environment variables can be used for server addresses, ports, etc., and they encourage separating configuration from code.
Usage of environment variables
The URL of the Funnelback server is environment specific: a development environment should request results from a Funnelback development environment, and a production environment should request results from a Funnelback production environment.
The process to set environment variables depends on the platform/CMS/language/server architecture used and is different for each case.
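As an illustration for a Node-based platform, the sketch below reads the Funnelback domain from the environment and fails early if it is missing. The variable name FUNNELBACK_SEARCH_DOMAIN matches the one used in the route handler code in this guide; the function name loadSearchConfig is hypothetical.

```javascript
// Hypothetical helper: builds the search endpoint from environment variables.
// 'env' defaults to process.env but can be replaced in tests.
function loadSearchConfig(env = process.env) {
  const domain = env.FUNNELBACK_SEARCH_DOMAIN
  if (!domain) {
    // Failing at startup is easier to diagnose than a broken URL at query time
    throw new Error('FUNNELBACK_SEARCH_DOMAIN is not set')
  }
  return { searchEndpoint: `https://${domain}/s/search.html` }
}

// Example with the fictional production domain used in this guide
const config = loadSearchConfig({ FUNNELBACK_SEARCH_DOMAIN: 'client.funnelback.com' })
console.log(config.searchEndpoint)
```

Each environment (development, staging, production) would then set its own value for the variable without any change to the code.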
Creating the search route
This section describes the route handler for the "search" route that would exist at https://www.client-university.edu/search.
Add the 'search' route
The search route is responsible for relaying the user’s query to the Funnelback server and returning the full web page with the search results back to the user.
In this section, the basic functionality of the search route is implemented:
-
Parse the query parameters from the user’s request
-
Send the query to the Funnelback domain
-
Receive the search results from the Funnelback domain
-
Embed the search results with the rest of the page template and return it to the user
Send the query to the Funnelback domain and receive search results
The code added below constructs the Funnelback server URL to request results from with an environment variable (configured later) and the query parameters provided by the user. It then requests the search results and receives the response from the Funnelback server.
// The 'request' variable is the HTTP request sent by the user
// The 'response' variable is the HTTP response to reply to the user
/* ... imports ... */
const querystring = require('querystring')
const fetch = require('isomorphic-unfetch')
/* ... end imports ... */
/* ... inside the route handler ... */
const { query } = request // (1)
const params = querystring.stringify(query) // (2)
const endpoint = `https://${process.env.FUNNELBACK_SEARCH_DOMAIN}/s/search.html?${params}` // (3)
const funnelbackResponse = await fetch(endpoint) // (4)
const searchResults = await funnelbackResponse.text()
response.render('layout', { searchResults }) // (5)
/* ... end of the route handler ... */
javascript
1 | Extract the query parameters from the request |
2 | Convert the query parameters to a string to pass along to the Funnelback server |
3 | Construct the URL of the Funnelback server to query |
4 | Send the request to Funnelback and receive the response |
5 | Pass the search results into the main layout (which includes the header and footer) |
The HTML returned by the Funnelback server is already properly escaped, so it should not be escaped again before being returned to the user. Escaping it twice causes text such as &amp; to show in the output.
Ensure that the query string is sent to Funnelback exactly as it was sent by the user’s browser. PHP-based servers and CMSs, particularly ones that read parameters from the $_GET or $_POST arrays, replace characters such as dots and spaces in parameter names with underscores. Please see https://stackoverflow.com/questions/68651/get-php-to-stop-replacing-characters-in-get-or-post-arrays for more information.
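One way to avoid altering the parameters is to forward the raw query string instead of parsing and re-serializing it. The sketch below assumes an Express-style request where the original path and query are available as request.originalUrl; the helper names are hypothetical.

```javascript
// Hypothetical helper: returns everything after the first '?' unchanged,
// or an empty string when there is no query string.
function rawQueryString(originalUrl) {
  const index = originalUrl.indexOf('?')
  return index === -1 ? '' : originalUrl.slice(index + 1)
}

// Hypothetical helper: builds the Funnelback URL without decoding or
// re-encoding any parameter.
function buildEndpoint(funnelbackDomain, originalUrl) {
  const queryString = rawQueryString(originalUrl)
  const suffix = queryString ? `?${queryString}` : ''
  return `https://${funnelbackDomain}/s/search.html${suffix}`
}

console.log(buildEndpoint('client.funnelback.com', '/search?query=a+b&f.Type%7Ctype=course'))
```

Inside the route handler this could be called as buildEndpoint(process.env.FUNNELBACK_SEARCH_DOMAIN, request.originalUrl).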
Use an environment variable for the Funnelback search domain
For the fictional 'Client University' production site (www.client-university.edu), the search domain could be a production Funnelback server.
A development/staging/QA site should have a corresponding environment variable to a development/staging/QA Funnelback server.
Test the basic search result page
At this point, the basic search results page can be tested to ensure that queries return search results. The next sections discuss error handling and additional features.
If the CMS is based on ASP.NET Web Forms, there are additional considerations to work through. The search form produced by Funnelback is a standard HTML <form>. ASP.NET Web Forms CMSs surround the whole page with a POST <form>, and HTML does not allow one form to be nested inside another.
Error handling and logging
General error handling
The status code of the HTTP response from the Funnelback server must be checked to determine whether the request succeeded or failed. In the event of an error, a friendly informational message should be displayed to the user (along the lines of "something went wrong") rather than sending on the response code itself or garbled error text. For example, if the Funnelback server replies with a 500 Internal Server Error, the framework/CMS/platform should not send a 500 page to the end user.
Any HTTP responses from the Funnelback server in the 4xx (Client Error) range or 5xx (Server Error) range should be considered an error response to handle.
It is not enough to check whether there is content in the HTML returned from the Funnelback server, as there are cases where HTML is returned but does not contain search results (for example, a 404 handler in the Funnelback server that returns a 404 page).
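The status check described above could be sketched as follows. The function names and the wording of the friendly message are hypothetical; only the 4xx/5xx rule comes from this guide.

```javascript
// Hypothetical friendly message shown instead of the raw Funnelback error
const FRIENDLY_ERROR_HTML = '<p>Something went wrong with the search. Please try again later.</p>'

// Any 4xx (Client Error) or 5xx (Server Error) status is treated as an error
function isErrorStatus(status) {
  return status >= 400 && status <= 599
}

// Returns the Funnelback HTML on success, or the friendly message on error
function resultsOrFriendlyError(status, html) {
  if (isErrorStatus(status)) {
    // Log the real status for the site maintainers, not for the end user
    console.error(`Funnelback responded with HTTP ${status}`)
    return FRIENDLY_ERROR_HTML
  }
  return html
}

// Inside the route handler (sketch):
// const html = await funnelbackResponse.text()
// const searchResults = resultsOrFriendlyError(funnelbackResponse.status, html)
```

Checking the status rather than the body means that a 404 page returned by Funnelback is never embedded as if it were search results.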
Notable errors Funnelback could return
Below are examples of possible HTTP errors that could be returned by the Funnelback server along with potential causes. Besides the errors listed below, all 4xx and 5xx errors should be handled as described above.
400 Bad Request
-
The query sent to the Funnelback server may be malformed.
-
The query sent to the Funnelback server does not contain required parameters such as collection.
401/403 (Unauthorized/Forbidden)
-
If the search endpoint should be publicly available, check that the framework/CMS/platform is requesting the public (search) URL on the Funnelback server and not the protected (admin) URL.
-
If the search endpoint has special configuration in the Funnelback server for access control, such as IP restrictions, ensure that the framework/CMS/platform IP address is allowed in the configuration.
404 Not Found
-
The results endpoint may have been renamed on the Funnelback server, causing the collection URL parameter to not match any known endpoint.
-
The value of the environment variables has changed to an incorrect value for that environment.
-
The URL to query Funnelback with is being incorrectly modified in some way.
500 Internal Server Error
-
Freemarker templates in Funnelback could have syntax errors or attempt to access variables that are null.
-
Incorrect collection configuration could produce unexpected side-effects leading to an error.
-
All 500 errors are logged in the Funnelback server logs for further analysis.
502/503/504 (Bad Gateway/Service Unavailable/Gateway Timeout)
-
The Funnelback server may be down or unreachable; contact the appropriate team for Funnelback support.
Timeout / no response / other error
-
Check if there is a firewall or other blocker to traffic to/from the framework/CMS/platform and the Funnelback server.
-
Check if the library used to make the HTTP request is using a supported version of TLS.
-
The Funnelback server may be down or unreachable; contact the appropriate team for Funnelback support.
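Timeouts and connection failures do not produce an HTTP status at all, so they need to be caught separately. The sketch below assumes a fetch implementation that honours the standard signal option and a Node version with a global AbortController; the function name, the 5 second default, and the returned object shape are all hypothetical.

```javascript
// Hypothetical helper: resolves to { response } on any HTTP reply, or to
// { failure } when no reply arrives. 'fetchImpl' is any fetch-compatible function.
async function fetchWithTimeout(fetchImpl, endpoint, options = {}, timeoutMs = 5000) {
  const controller = new AbortController()
  // Abort the request if Funnelback has not replied within timeoutMs
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    const response = await fetchImpl(endpoint, { ...options, signal: controller.signal })
    return { response }
  } catch (error) {
    // Reached on timeouts, DNS failures, refused connections, TLS errors, etc.
    const failure = error.name === 'AbortError' ? 'timeout' : 'network error'
    console.error(`Request to Funnelback failed: ${failure}`)
    return { failure }
  } finally {
    clearTimeout(timer)
  }
}
```

When failure is set, the route handler would show the same friendly message that is used for 4xx and 5xx responses.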
Passing the end user IP address for logging
The default configuration of Funnelback analytics and search endpoints is to log the IP address of the query. An IP database with geolocations is included in the Funnelback server to show in the Funnelback Insights Dashboard where queries originate.
This functionality can also be extended with the Curator feature to provide promoted results or customized content based on location of the person who is searching.
When the queries are passed from the user through a CMS/server/platform before being passed to Funnelback, the IP address of that CMS/server/platform would be recorded, which is not useful for analytics. This process describes how to ensure that Funnelback records the IP address of the original person who is searching.
Passing the original IP address
The X-Forwarded-For HTTP header is used to pass the original IP address of the end user.
// The 'request' variable is the HTTP request sent by the user
// The 'response' variable is the HTTP response to reply to the user
/* ... inside the route handler ... */
const { query } = request
const params = querystring.stringify(query)
const { connection: { remoteAddress } } = request // (1)
const endpoint = `https://${process.env.FUNNELBACK_SEARCH_DOMAIN}/s/search.html?${params}`
// Keep the headers in a variable so that later steps can add to them
const headers = {
  'X-Forwarded-For': remoteAddress // (2)
}
const funnelbackResponse = await fetch(endpoint, { headers })
const searchResults = await funnelbackResponse.text()
response.render('layout', { searchResults })
/* ... end of the route handler ... */
javascript
1 | Extract the IP address from the request |
2 | Simple example of sending the original user IP address as an X-Forwarded-For HTTP header |
Configuring Funnelback for IP logging
By default, Funnelback will look at the last IP address in the X-Forwarded-For header to log in the analytics. The original user IP address should be the first IP address in the X-Forwarded-For header.
The logging.ignored_x_forwarded_for_ranges configuration option should be used to ignore known IP addresses so that the original user IP address is logged in Funnelback analytics.
The "Update the X-Forwarded-For header value" plugin can be used to remove the first or last value of the X-Forwarded-For header, or all values but the first.
Caveats about this method of passing the IP address
This guide shows a much simplified way of retrieving the end user IP address and passing it to Funnelback in the X-Forwarded-For header.
In reality, the architecture differs from platform to platform. If the server exists behind a reverse proxy, the original user IP address was probably appended to the X-Forwarded-For header by the reverse proxy and should be extracted from there instead to pass on to Funnelback. The Node solution would likely involve using a package (such as forwarded-for) to manage these details.
Use the recommended method of accessing the original user IP address based on the framework/CMS/platform, or contact system administrators for more information on the system architecture.
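For illustration only, the reverse proxy case could be handled without a package along the following lines. This sketch assumes a Node-style request object with lowercase header names and that the first entry of X-Forwarded-For is the end user; that is only trustworthy when the header is set by a proxy under the site's control. The function name getClientIp is hypothetical.

```javascript
// Hypothetical helper: prefers the first X-Forwarded-For entry and falls back
// to the address of the direct connection.
function getClientIp(request) {
  const forwardedFor = request.headers['x-forwarded-for']
  if (forwardedFor) {
    // The header is a comma-separated list; the first entry is the original client
    const first = String(forwardedFor).split(',')[0].trim()
    if (first) {
      return first
    }
  }
  const socket = request.socket || request.connection || {}
  return socket.remoteAddress
}

console.log(getClientIp({
  headers: { 'x-forwarded-for': '203.0.113.7, 10.0.0.2' },
  socket: { remoteAddress: '10.0.0.2' }
}))
```

The returned address would then be sent to Funnelback as the X-Forwarded-For header value.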
Managing sessions (history and cart)
The Sessions feature (Search/Click history and Results Cart) is enabled through the use of an HTTP cookie. These cookies are only valid on the domain to which they are assigned, and can only be assigned by a server on that domain.
Funnelback configuration
Sessions are enabled by editing the profile configuration for ui.modern.session.
Managing the sessions cookie
The overview image for embedding search results is repeated below.
Reviewing this image now in context of the sessions cookie:
-
The query from the end user to the client framework/CMS/platform will not contain the sessions cookie if they are a first time visitor on that browser, or it may contain the cookie if they are a repeat visitor.
-
If the user’s browser sent the sessions cookie, this cookie should be sent onwards to the Funnelback server.
-
If no sessions cookie was sent to the Funnelback server, one will be generated and returned in the set-cookie header. Otherwise, the sessions cookie that was sent will also be returned in the set-cookie header.
-
The client framework/CMS/platform sets the set-cookie header for the response to the end user with the correct value and domain.
Example code of managing the sessions cookie
The name of the sessions cookie is user-id. If this were to change in the future, it could be worthwhile substituting an environment variable instead of a hard-coded value.
Sessions cookie step 1 and 2: receive cookie from user and pass onwards
For steps (1) and (2) above, the implementation requires getting the sessions cookie from the user’s request and passing it on to the Funnelback server. The base case is that the user has not visited the search before and does not have the cookie set.
// The 'request' variable is the HTTP request sent by the user
/* ... inside route handler before receiving the results from Funnelback ... */
if (request.cookies && request.cookies['user-id']) {
headers.cookie = `user-id=${request.cookies['user-id']}`
}
/* ... rest of route handler ... */
javascript
The headers variable was created earlier in the "Passing the end user IP address for logging" section. This section adds the sessions cookie to that headers variable, if applicable.
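Put together, the headers sent to Funnelback could be assembled in one place. This is a minimal sketch: the function name buildFunnelbackHeaders is hypothetical, and request.cookies is assumed to be populated by cookie-parsing middleware, as in the snippet above.

```javascript
// Hypothetical helper: collects the headers to send to the Funnelback server.
function buildFunnelbackHeaders(request, clientIp) {
  const headers = { 'X-Forwarded-For': clientIp }
  const userId = request.cookies && request.cookies['user-id']
  if (userId) {
    // Repeat visitor: pass the existing sessions cookie onwards
    headers.cookie = `user-id=${userId}`
  }
  // First-time visitor: no cookie header is sent, so Funnelback generates one
  return headers
}

console.log(buildFunnelbackHeaders({ cookies: { 'user-id': 'abc-123' } }, '203.0.113.7'))
console.log(buildFunnelbackHeaders({ cookies: {} }, '203.0.113.7'))
```

The returned object can be passed directly as the headers option of the fetch call.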
Sessions cookie step 3: receive cookie from the Funnelback server
In step (3) above, the Funnelback server sends the sessions cookie in a set-cookie header. Typically it is the browser that parses a set-cookie header, not a server, so depending on the framework/CMS/platform there may not be a built-in way of parsing this header. A Node package called 'set-cookie-parser' is used in this guide to parse the set-cookie header.
After the search results response has been received from the Funnelback server, parse out the value of the sessions cookie.
/* ... imports ... */
const setCookieParser = require('set-cookie-parser')
/* ... end imports ... */
/* ... inside the route handler after receiving the results from Funnelback ... */
const setCookies = setCookieParser.parse(funnelbackResponse.headers.get('set-cookie'), {
  map: true
}) // (1)
const userIdCookie = setCookies['user-id'] // (2)
/* ... rest of route handler ... */
javascript
1 | Uses a simple parsing library to parse the 'set-cookie' header into a map |
2 | Gets the value of the user-id cookie from the map |
Sessions cookie step 4: send cookie to user’s browser
If the sessions cookie was parsed from the set-cookie header of the Funnelback response, put that value in a set-cookie header to send to the end user.
If the sessions cookie is not sent by the Funnelback server, this should be handled and not cause an error. For example, the sessions may have been disabled on the Funnelback server.
/* ... inside the route handler after the previous step ... */
if (userIdCookie) {
  // 'response' is the HTTP response to reply to the user, as in the earlier snippets
  response.cookie(userIdCookie.name, userIdCookie.value, { // (1)
    maxAge: userIdCookie.maxAge * 1000, // (2)
    domain: process.env.COOKIE_DOMAIN // (3)
  })
} else {
  // Funnelback did not send a 'set-cookie' value for 'user-id'
  // This may imply an error, or it could just mean the configuration was turned off
}
/* ... rest of route handler ... */
javascript
1 | Create a 'set-cookie' header for the end user with the value received from Funnelback |
2 | Double-check whether the cookie max age defaults to seconds or milliseconds and convert appropriately |
3 | An environment variable can be used for the cookie domain to use the same code between a development and production environment |
If the sessions cookie sent by the Funnelback server is different than the cookie sent by the user’s browser, the Funnelback server value should be used. The cookie on the user’s browser may be expired or invalid, so always trust the value returned by the Funnelback server.
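The set-cookie header from Funnelback might carry a Max-Age attribute, an Expires attribute, or neither. The sketch below builds the cookie options defensively for an Express-style response.cookie call, which expects maxAge in milliseconds. It assumes the parsed cookie has the shape produced by set-cookie-parser (maxAge as a number of seconds, expires as a Date); the function name toCookieOptions is hypothetical.

```javascript
// Hypothetical helper: converts a parsed set-cookie entry into options for
// response.cookie(name, value, options).
function toCookieOptions(parsedCookie, cookieDomain) {
  const options = { domain: cookieDomain }
  if (Number.isFinite(parsedCookie.maxAge)) {
    // Max-Age is in seconds; Express expects milliseconds
    options.maxAge = parsedCookie.maxAge * 1000
  } else if (parsedCookie.expires instanceof Date) {
    options.expires = parsedCookie.expires
  }
  // With neither attribute the cookie becomes a session cookie in the browser
  return options
}

console.log(toCookieOptions({ name: 'user-id', value: 'abc-123', maxAge: 3600 }, '.client-university.edu'))
```

Skipping maxAge when it is missing avoids passing NaN to the cookie options.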
Session cookie domain
The process described above manages the sessions cookie in response to searches, and will ensure that the user’s search history is available to them. The sessions cookie is also used in two other areas:
-
Click tracking
-
Cart (Saved Items)
The client framework/CMS/platform can only set cookies for its own domain; a server cannot set cookies on a user’s browser for a different domain. In this example, the cookie can be set by Client University for domains ending in client-university.edu.
The click tracking URL exists on the Funnelback server at client.funnelback.com/s/redirect. As this is a different domain, the sessions cookie in the user’s browser will not be sent to that redirect/click tracking URL when the user clicks on a result.
Similarly, the Cart (Saved Items) API endpoint exists on the Funnelback server at client.funnelback.com/s/cart.json. Cookies that are valid on client-university.edu will not be sent to the Cart endpoint as it is on a different domain.
This problem can be solved in two ways:
-
Create a dedicated search subdomain within the client domain
-
Proxy the other requests through to Funnelback
Option 1: create a dedicated search domain
This problem can be solved by creating a new domain on the client side and using a Domain Name System (DNS) CNAME record. For example, a new domain search.client-university.edu could be created with the following CNAME record:
search.client-university.edu CNAME client.funnelback.com
Now, all requests sent to search.client-university.edu, a domain owned by the client, are sent to the Funnelback server. If the domain of the sessions cookie is .client-university.edu, that cookie will also be sent by the user’s browser to search.client-university.edu.
Option 2: proxy the cart requests
The requests that control behavior in the Cart (Saved Items) functionality are generated from client-side Javascript.
The Cart (Saved Items) API endpoint exists on the Funnelback domain, so if the request is made by the end user’s browser while on the client’s domain, the cookie will not be sent by the browser. Instead, the request can be sent to an endpoint on the client framework/CMS/server, which forwards that request onto the Funnelback server, similar to how the search route was set up (see earlier sections).
Each of the GET, POST, PUT, and DELETE requests should be forwarded onto the Funnelback server with the same URL parameters and body data. The response from Funnelback should also be sent back to the user’s browser.
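The forwarding step could be sketched as a function that translates the incoming request into the URL and options for the outgoing request. Several details here are assumptions: an Express-style request with originalUrl and cookies, a hypothetical rawBody property holding the unparsed request body, and the helper name buildCartProxyRequest. The /s/cart.json path is the Cart endpoint named earlier in this guide.

```javascript
// Hypothetical helper: maps a cart request received by the client platform
// onto the equivalent request for the Funnelback Cart endpoint.
function buildCartProxyRequest(request, funnelbackDomain) {
  // Keep the URL parameters exactly as they were received
  const queryStart = request.originalUrl.indexOf('?')
  const queryString = queryStart === -1 ? '' : request.originalUrl.slice(queryStart)

  const headers = {}
  if (request.headers['content-type']) {
    headers['content-type'] = request.headers['content-type']
  }
  const userId = request.cookies && request.cookies['user-id']
  if (userId) {
    headers.cookie = `user-id=${userId}`
  }

  const options = { method: request.method, headers }
  // GET requests carry no body; POST, PUT, and DELETE may
  if (request.method !== 'GET' && request.rawBody) {
    options.body = request.rawBody
  }
  return { url: `https://${funnelbackDomain}/s/cart.json${queryString}`, options }
}
```

The route handler would pass url and options to fetch and relay the status and body of the Funnelback response back to the browser.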
If the official Funnelback sessions plugins are used, make sure to configure the apiBase options. Otherwise, follow the relevant instructions to find where to configure the base URL of the Cart (Saved Items) API endpoint.
Option 2 (continued): proxy the click redirect requests
The same concept applies to click redirects: if the redirect URL exists on the Funnelback domain, the sessions cookie will not be carried by the user’s browser to that domain. A redirect URL can be set up on the client framework/CMS/server which captures the user’s session cookie and forwards it to the Funnelback server redirect URL.
The redirect URL on the Funnelback server returns a simple HTTP response with a 302 status and a Location header that the browser should redirect to. This response can be forwarded to the end user with no modification except for the set-cookie header, as described in earlier sections.
If this proxy is implemented, the click link should be configured with the ui.modern.click_link profile setting.
This process is only necessary for the personal Click History feature (i.e. "my clicks"). Overall click analytics will still work without this proxy.
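A redirect proxy along these lines could look like the sketch below. It assumes an Express-style request and response, a fetch implementation that supports the standard redirect: 'manual' option, and the /s/redirect path named earlier in this guide. The function name and the fallback message are hypothetical, and set-cookie handling is left out for brevity.

```javascript
// Hypothetical handler: forwards a click to Funnelback with the sessions
// cookie and relays the resulting redirect to the user's browser.
async function proxyClickRedirect(fetchImpl, request, response, funnelbackDomain) {
  const queryStart = request.originalUrl.indexOf('?')
  const queryString = queryStart === -1 ? '' : request.originalUrl.slice(queryStart)

  const headers = {}
  const userId = request.cookies && request.cookies['user-id']
  if (userId) {
    headers.cookie = `user-id=${userId}`
  }

  // 'manual' stops fetch from following the 302 itself, so the Location
  // header can be passed back to the browser instead
  const funnelbackResponse = await fetchImpl(
    `https://${funnelbackDomain}/s/redirect${queryString}`,
    { headers, redirect: 'manual' }
  )
  const location = funnelbackResponse.headers.get('location')
  const isRedirect = funnelbackResponse.status >= 300 && funnelbackResponse.status < 400

  if (isRedirect && location) {
    response.redirect(funnelbackResponse.status, location)
  } else {
    response.status(502).send('Something went wrong. Please try again later.')
  }
}
```

Passing fetchImpl as a parameter keeps the handler easy to exercise with a fake fetch in tests.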
Controlling the URL parameters
The Funnelback server search results endpoint can be thought of as an API with required and optional parameters.
The required parameters to pass to Funnelback are:
-
collection
-
query
Optional parameters may include, among many others:
-
profile
-
form
-
facets (the value varies per facet)
The parameters that are sent to the Funnelback server search results endpoint can be controlled separately from what is shown in the user’s browser. This is optional and not required for the implementation.
Sanitizing the URL parameters is not necessary; these URL parameters can be passed to the Funnelback server without modification. The Funnelback server sanitizes those parameters when it produces the search results.
Hide the collection parameter
It may be desirable to hide the collection=<COLLECTION-ID> parameter from the URL in the user’s browser to shorten the link or otherwise hide the name of the collection.
# For example:
https://client-university.edu/search?query=programs
# Instead of:
https://client-university.edu/search?query=programs&collection=client~client-university-search
The code added below adds the collection parameter which is defined as an environment variable.
// The 'request' variable is the HTTP request sent by the user
/* ... inside the route handler ... */
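const { query } = request
// Add the collection before serializing: 'params' is a string, so a property cannot be set on it afterwards
const params = querystring.stringify({
  ...query,
  collection: process.env.FUNNELBACK_COLLECTION // (1)
})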
/* ... rest of the route handler ... */
javascript
1 | Sends the collection parameter from an environment variable to Funnelback |
Additional changes to the Freemarker templates may be required on the Funnelback server to remove the collection in other page elements such as facet links and the search form.