Search package indexes

A search package combines the indexes of one or more data sources, allowing a single search to run across all of the included data sources.

The indexes are combined dynamically: the nominated data source indexes are included when the query runs. There isn’t a separate index produced for the search package itself.

The only search package-specific index files that are generated are the spelling and auto-completion indexes, which are produced for each results page that is configured on the search package. These indexes are regenerated at the end of the update process for each included data source.

Search packages inherit most of their index properties from the included data source indexes, including metadata classes and generalized scopes.
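The query-time combination described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not how the product is implemented: the `SearchPackage` class, the `DATA_SOURCE_INDEXES` dictionary and the data source names are all hypothetical. The point it shows is that the package stores only the names of its data sources and looks up their indexes each time a query runs.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory stand-ins for data source indexes: name -> {url: text}.
DATA_SOURCE_INDEXES = {
    "website": {"https://example.com/a": "annual report"},
    "intranet": {"https://intranet.example.com/b": "staff report"},
}


@dataclass
class SearchPackage:
    """Toy package: holds only data source names, never an index of its own."""

    data_sources: list[str] = field(default_factory=list)

    def search(self, term: str) -> list[str]:
        # The member indexes are looked up when the query runs, so a change
        # to data_sources is visible to the very next search.
        hits = []
        for name in self.data_sources:
            for url, text in DATA_SOURCE_INDEXES[name].items():
                if term in text:
                    hits.append(url)
        return hits


package = SearchPackage(["website"])
before = package.search("report")        # only the website index is consulted

package.data_sources.append("intranet")  # add a data source to the package
after = package.search("report")         # both indexes are consulted
```

In this sketch `before` contains one URL and `after` contains two, without any rebuild step in between.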

Understanding how search packages combine indexes

It is important to understand a few basics about how search packages aggregate content from the different data sources.

  • Metadata classes of all included data sources are combined: this means if you have a class called title in data source A and a class called title in data source B there will be a field called title in the search package that searches the title metadata from both data sources. This means you need to be careful about the names you choose for your metadata classes, ensuring that they only overlap when you intend them to. One technique you can use to avoid unintended overlap is to namespace your metadata fields to keep them separate (e.g. use something like websiteTitle instead of title in your website data source).

  • Generalized scopes of all included data sources are combined: the same principles as outlined above for metadata apply to gscopes. You can use gscopes to combine or group URLs across data sources by assigning the same gscope ID in each data source, but only do this when it makes sense. Otherwise, you may get results that you don’t want if you choose to scope the search results using your defined gscopes.

  • Geospatial and numeric metadata: these are special metadata types and the values of these fields are interpreted in a special way. If you have any of these classes defined in multiple data sources, ensure they have the same type in every data source where they are defined.

  • Search packages combine the indexes at query time: this means you can add and remove data sources from the search package and immediately start searching across the indexes.

    Auto-completion and spelling suggestions for the search package won’t be updated to match the changed search package content until one of the data sources completes a successful update.

  • You can scope a query to return information only from specific data sources within a search package by supplying the clive parameter with a list of the data sources to include.

  • If combined indexes contain an overlapping set of URLs then duplicates will be present in the search results (as duplicates are not removed at query time).
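The points above about shared metadata classes, scoping by data source and duplicate URLs can be illustrated with another toy model. This is a sketch under stated assumptions rather than a description of the real query processor: the `DOCS` structure, the `search_title` function and the data source names are hypothetical, and the `clive` argument here only mimics the effect of the clive parameter.

```python
# Hypothetical documents per data source; both data sources define a
# metadata class named "title", so the package sees one combined field.
DOCS = {
    "website": [
        {"url": "https://example.com/about", "title": "About us"},
        {"url": "https://example.com/news", "title": "News"},
    ],
    "news": [
        {"url": "https://example.com/news", "title": "News"},
        {"url": "https://news.example.com/1", "title": "About the merger"},
    ],
}


def search_title(term, package, clive=None):
    """Search the shared 'title' class across the package's data sources.

    clive (optional) restricts the search to the listed data sources.
    """
    included = [ds for ds in package if clive is None or ds in clive]
    results = []
    for ds in included:
        for doc in DOCS[ds]:
            if term.lower() in doc["title"].lower():
                # Results are appended as found: overlapping URLs are kept.
                results.append((ds, doc["url"]))
    return results


package = ["website", "news"]

# One 'title' query matches title metadata from both data sources.
all_about = search_title("about", package)

# The same URL is indexed by both data sources, so it is returned twice.
all_news = search_title("news", package)

# Scoping to a single data source returns only that data source's copy.
scoped_news = search_title("news", package, clive=["news"])
```

If the overlap on title were not intended, renaming the class in one data source (for example to websiteTitle, as suggested above) would keep the two fields separate in the combined view.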