Upweight or downweight sections of a site in the search results
Background
This article explains techniques that can be used to apply upweighting or downweighting to specific sites or sections of sites.
Case 1: upweight or downweight a section of a website for all searches on a collection
If you have a site search and different sections carry different levels of importance then you can tell Funnelback to apply an upweight or downweight to a page if it comes from a given section.
This applies to all queries against the index.
Step 1. Query independent evidence
Create a qie.cfg file that provides a base upweight or downweight for a given URL or set of URLs. The weighting column takes a value between 0.0 and 1.0: 0.0 applies the maximum downweight, 1.0 applies the maximum upweight, and 0.5 is neutral. See the query independent evidence documentation for information on creating the qie.cfg file.
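As a rough illustration of the weighting scale, the sketch below validates a set of weights and renders them as configuration lines. The "weight, then URL prefix" line layout and the mysite.com/archive/ section are assumptions made for this example; confirm the exact file format against the query independent evidence documentation before using the output.

```python
def render_qie_lines(weights):
    """Return qie.cfg-style lines from a {url_prefix: weight} mapping.

    Assumed layout (not confirmed by this article): weight first, then URL prefix.
    """
    lines = []
    for url_prefix, weight in weights.items():
        # 0.0 = maximum downweight, 0.5 = neutral, 1.0 = maximum upweight
        if not 0.0 <= weight <= 1.0:
            raise ValueError(
                f"weight for {url_prefix!r} must be within 0.0-1.0, got {weight}"
            )
        lines.append(f"{weight:.2f} {url_prefix}")
    return lines


if __name__ == "__main__":
    example = {
        "mysite.com/publications/": 0.9,  # strong upweight
        "mysite.com/archive/": 0.2,       # hypothetical section, downweighted
    }
    print("\n".join(render_qie_lines(example)))
```

Rejecting out-of-range values early avoids writing a weight that falls outside the documented 0.0 to 1.0 scale.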
Step 2. Rebuild the index for the collection
Rebuild the collection’s index to generate the qie index file. This can be done by running an advanced update to reindex the live view of the collection.
This will configure the weightings in the index.
Step 3. Query independent influence
The influence of query independent evidence on the ranking needs to be configured before any change is observed in the results. This influence is controlled using the cool.4 ranking option.
Add a -cool.4 query processor option to set the overall influence given to query independent evidence in the index. Influence values range from 0.0 (no influence) to 1.0 (maximum influence), e.g. -cool.4=0.6 (or the cool.4=0.6 CGI parameter) to provide a slight influence from QIE.
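When experimenting with different influence values, it can help to build test URLs that pass cool.4 as a CGI parameter. The following is a minimal sketch: the host name, the /s/search.html path, and the collection and query parameter names are assumptions for illustration and should be replaced with the values used by your own search endpoint.

```python
from urllib.parse import urlencode


def search_url(base, collection, query, qie_influence):
    """Build a search URL that sets the QIE influence via the cool.4 parameter."""
    if not 0.0 <= qie_influence <= 1.0:
        raise ValueError("cool.4 must be within 0.0 (none) to 1.0 (maximum)")
    params = {
        "collection": collection,  # assumed parameter name
        "query": query,            # assumed parameter name
        "cool.4": qie_influence,   # influence of query independent evidence
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and collection name.
    base = "https://search.example.com/s/search.html"
    for influence in (0.0, 0.6, 1.0):
        print(search_url(base, "mysite", "annual report", influence))
```

Comparing the result order across the generated URLs shows how strongly the configured weightings affect ranking at each setting.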
Case 2: upweight or downweight a section of a website based on which section is being searched
In case 1, weightings were assigned to URLs globally for a collection, so every search on the site applies the same upweights.
However, you may wish to upweight content according to the section the search is run from (e.g. a publications search that upweights the publications pages but still includes others, and a media releases search in another section of the site that upweights media releases). In that situation you can’t use the first method, because the query independent evidence would need to vary depending on which section you are searching from.
Achieving this variable upweight is possible but more complicated.
Method 1: use gscopes and padre ranking options
Step 1. Create generalised scopes for each section
Create a gscopes.cfg file that defines a generalised scope for each section of the site that requires upweighting.
e.g. create scopes for pages under /publications/ and /media/ for the publications and media releases example above:
pubsScope mysite\.com\/publications\/
mediaScope mysite\.com\/media\/
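Before reindexing, the patterns can be sanity-checked against a few sample URLs. The sketch below uses Python's re module, which may differ in detail from the regular expression dialect Funnelback applies, so treat it as an approximate check only. The sample URLs are hypothetical.

```python
import re

# Scope name -> pattern, copied from the gscopes.cfg example above.
GSCOPE_PATTERNS = {
    "pubsScope": r"mysite\.com\/publications\/",
    "mediaScope": r"mysite\.com\/media\/",
}


def matching_scopes(url):
    """Return the names of all scopes whose pattern occurs in the URL."""
    return [
        name
        for name, pattern in GSCOPE_PATTERNS.items()
        if re.search(pattern, url)
    ]


if __name__ == "__main__":
    for url in (
        "https://mysite.com/publications/2020/report.html",
        "https://mysite.com/media/release-1.html",
        "https://mysite.com/about/",
    ):
        print(url, matching_scopes(url))
```

A URL that matches no pattern receives no scope, so it is included in results without any section upweight.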
Step 2. Rebuild the index for the collection
Rebuild the collection’s index to apply the generalised scopes to the index. This can be done by running an advanced update to reindex the live view of the collection.
Step 3. Create profiles
A profile needs to be created for each search that will require section upweighting. See: managing profiles and services for information on how to create a profile.
Step 4. Apply gscope weightings for each profile
Within that profile, set a padre_opts.cfg containing the following (it can also include other presentation and ranking settings if required):
-cgscope1=<gscope no. to upweight> -cool.68=<weighting>
Note: weighting accepts a value between 0.0 and 1.0.
e.g. for the publications search: -cgscope1=pubsScope -cool.68=0.7
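With several profiles, the option line for each one follows the same pattern. The sketch below generates the line per profile from a small mapping; the profile names and the media weighting are hypothetical, and only the -cgscope1 and -cool.68 options shown above are used.

```python
def gscope_weighting_opts(scope, weighting):
    """Return the padre_opts.cfg options that upweight one generalised scope."""
    if not 0.0 <= weighting <= 1.0:
        raise ValueError("weighting must be within 0.0-1.0")
    return f"-cgscope1={scope} -cool.68={weighting}"


if __name__ == "__main__":
    # Hypothetical profile name -> (scope to upweight, weighting).
    profiles = {
        "publications": ("pubsScope", 0.7),
        "media": ("mediaScope", 0.7),
    }
    for profile, (scope, weighting) in profiles.items():
        print(f"{profile}: {gscope_weighting_opts(scope, weighting)}")
```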
Method 2: use section metadata and query language upweight operator
Step 1. Apply section metadata to the site content in a common field
e.g. for the previous example tag each page with an appropriate section metadata field.
<meta name="section" content="publications" />
<meta name="section" content="media" />
Step 3. Update the collection
Update the collection - this will require a full update to ensure the new metadata is gathered as well as indexed.
Step 4. Pass the hidden metadata constraint with the query
This can be achieved in a number of ways such as including a hidden parameter in the search box, or setting the value in a hook script.
e.g. Deploy a search box that sets a hidden metadata constraint for the system query.
<input type="hidden" name="s" value="~fbSection:publications^0.8" />
Note: The tilde operator tells Funnelback to apply a custom weight, given by the value following the caret.
The above example configures Funnelback to apply a moderate upweight (0.8) to pages that have the section metadata class set to the value of publications.
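If the search box is generated by a template or script, the hidden field can be built per section. This is a minimal sketch that reproduces the markup shown above; the function name is hypothetical, and the metadata class (fbSection) is taken from the example rather than from any confirmed mapping.

```python
from html import escape


def section_upweight_input(metadata_class, section, weight):
    """Return a hidden input that upweights pages tagged with the given section."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be within 0.0-1.0")
    # Tilde marks a weighted term; the value after the caret is the weight.
    value = f"~{metadata_class}:{section}^{weight}"
    return f'<input type="hidden" name="s" value="{escape(value, quote=True)}" />'


if __name__ == "__main__":
    print(section_upweight_input("fbSection", "publications", 0.8))
    print(section_upweight_input("fbSection", "media", 0.8))
```

Escaping the attribute value keeps the markup valid if a section name ever contains quotes or ampersands.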