Metadata is missing or truncated in search results

Background

This article shows how to investigate search results that have missing or truncated metadata.

Search results are missing metadata

Empty or missing metadata in the search results is a common problem that can have a number of causes:

  • Funnelback hasn’t been configured to return the desired metadata classes by setting appropriate summary fields (-SF) and summary mode (-SM) query processor options.

  • The template hasn’t been configured to display the correct metadata classes.

  • The collection hasn’t been correctly configured to map the metadata fields to the metadata classes.

  • The metadata is not present in the source documents.

  • Metadata that is applied via external metadata is not correctly specified.

  • Any filters that generate/extract metadata are not working correctly.

Don’t forget that metadata classes are case sensitive, so check that the case of each metadata class you use exactly matches what has been configured in the metadata mappings configuration.
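The case-sensitivity check can be partly automated. The following is a minimal sketch, not part of Funnelback itself: it compares the class names you are requesting (for example in the SF option or the template) against the class names configured in the metadata mappings, and reports any that differ only by case. The function name and the sample class names are hypothetical.

```python
def find_case_mismatches(requested, configured):
    """Map each requested class to the configured class it matches only when case is ignored."""
    configured_set = set(configured)
    by_lower = {name.lower(): name for name in configured}
    mismatches = {}
    for name in requested:
        if name in configured_set:
            continue  # exact match, nothing to report
        candidate = by_lower.get(name.lower())
        if candidate is not None:
            mismatches[name] = candidate
    return mismatches


# Hypothetical class names, for illustration only.
configured_classes = ["author", "dcSubject", "publishedDate"]
requested_classes = ["Author", "dcsubject", "publishedDate", "keywords"]

for wrong, right in find_case_mismatches(requested_classes, configured_classes).items():
    print(f"'{wrong}' should probably be '{right}'")
```

A class that has no match at all (keywords in this sample) is not reported here, since that points to a missing mapping rather than a case problem.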

Investigate missing metadata

This tutorial covers the different things to check when debugging missing metadata.

  1. Run a search that returns results that have metadata that is expected but missing.

  2. View the data model for the search and locate the result with the missing metadata. Check to see if the metadata is listed in the metaData and listMetadata elements of the search result. If the metadata is listed and showing in the data model but not showing to the end user then the template is misconfigured.

  3. If the metadata is missing from the data model, alter the data model URL to force the metadata summary mode and to return the fields that you are interested in (&SM=meta&SF=[<METADATA-FIELDS>]). Update <METADATA-FIELDS> to be a comma-separated list of the fields you wish to see. Inspect the metaData and listMetadata elements again. If the items are now present then you need to update the query processor options to add the missing metadata fields to the SF value, and possibly also set an SM value (metadata should be returned by default for most queries unless SM is set to override the default settings). Remember that these options need to be set on the collection that is accepting the queries (often a meta collection rather than the component).

  4. If a missing query processor option isn’t the cause then we need to look at the index itself. The first thing to check is the metadata mappings on the source collection, to verify that the metadata fields are mapped to the correct classes. If the metadata mapping is missing, add the mapping and then re-index the collection. This should fix the issue.

  5. If the metadata mappings are present then you’ll need to check the source data to see what Funnelback is indexing. The next step will depend on where the metadata comes from.

    1. Look at the cached version of the page and check the embedded metadata - this will show you what Funnelback indexed, including metadata that was present in the source document and anything that filters wrote into the page. Also check the HTTP headers at the bottom of the cached copy.

    2. Look at the live version of the page and check the embedded metadata - this will show you what is currently embedded in the page. Note: this could have changed since Funnelback indexed it, but it’s worth checking here and then comparing to the cached version of the page.

    3. Check external_metadata.cfg for the collection to see if the metadata is attached to the page this way. If external metadata is being used to add the metadata then there may be an error in the external metadata file (such as multiple matching lines), or the URL may not match correctly. Check the Step-Index.log and locate the section where the external metadata rules are loaded. If there are problems with the external metadata file then messages will be printed to the log.

    4. If the missing metadata is injected via a filter then check the filter logs for any messages relating to filtering - if the filter had any syntax errors it won’t have been loaded at all. You may also need to modify the filter to print out more information if there’s an error in the logic implemented in the filter.
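Steps 2 and 3 above can be sketched in code. This is an illustration under stated assumptions rather than an official tool: the host name, the search.json path and the sample field names are hypothetical, and the helper functions are invented for this example. Only the SM=meta and SF=[...] parameters and the metaData and listMetadata element names come from the steps above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_metadata_fields(data_model_url, fields):
    """Return the URL with SM=meta and SF=[field1,field2,...] appended."""
    parts = urlsplit(data_model_url)
    # Drop any existing SM/SF values so the forced ones are the only ones sent.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("SM", "SF")
    ]
    query.append(("SM", "meta"))
    query.append(("SF", "[" + ",".join(fields) + "]"))
    # Keep brackets and commas unescaped so the URL stays readable.
    return urlunsplit(parts._replace(query=urlencode(query, safe="[],")))


def missing_fields(result, fields):
    """List the fields absent from both metaData and listMetadata of one result."""
    present = set(result.get("metaData") or {}) | set(result.get("listMetadata") or {})
    return [field for field in fields if field not in present]


# Hypothetical URL and field names.
url = "https://search.example.com/s/search.json?collection=example&query=report"
print(force_metadata_fields(url, ["author", "d"]))

# A single result copied out of the data model (hypothetical content).
result = {"metaData": {"author": "J. Smith"}, "listMetadata": {"author": ["J. Smith"]}}
print(missing_fields(result, ["author", "d"]))
```

If a field is still reported as missing after forcing SM and SF, the cause is more likely to be in the metadata mappings or the source data (steps 4 and 5) than in the query processor options.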

Metadata is truncated in search results

Sometimes you will see metadata presented in the search results that is cut off or truncated. There are a number of different causes for this:

  • The template is truncating the metadata when it is displayed (in the FreeMarker template, using something like the <@s.Cut> macro or one of the FreeMarker built-in functions).

  • A hook script is truncating the metadata in the data model.

  • The metadata buffer is too small causing the metadata to be truncated when it is returned from the index.

  • The maximum field length for indexed metadata is too small (controlled by the -mdsfml indexer option).

Investigate truncated metadata

This tutorial goes through the steps required to debug truncated metadata.

  1. Run a search that results in metadata getting truncated.

  2. View the data model for the search and locate the result with the truncated metadata.

  3. Check to see if the metadata is truncated in the data model - if it is complete when viewing the data model but truncated in the search results that you viewed, then check the search result template, as this will be truncating the metadata when it’s displayed (e.g. with a <@s.Cut> or similar).

  4. If the metadata is truncated in the data model as well then the next thing to try is to increase the size of the metadata buffer. The metadata buffer size usually only needs to be altered when you have a large amount of metadata for a collection. Add &MBL=10000 to the URL for the data model results and press enter. Recheck the metadata field to see if it is still truncated. If the metadata field is no longer truncated then you will need to increase the size of the metadata buffer in the query processor options for the search (either in padre_opts.cfg for the profile, or in collection.cfg to set it at the collection level). Experiment with smaller values of MBL to find an appropriate value, as large values will increase the memory required to run the search. Once an appropriate value is determined, set the buffer size using the -MBL query processor option.

  5. If the metadata is still truncated even with a large metadata buffer then it is most likely that the truncation has occurred at index time (or that it is truncated in the source data). However, before you check this, inspect the collection’s post-process and post-datafetch hook scripts to ensure that these are not responsible for truncating the metadata.

  6. If the hook scripts are not responsible, try increasing the metadata field length indexer option (-mdsfml) on the collection that contains the result. Note that if you are searching a meta collection then you will need to change this setting on the component collection that includes the search result. After you change the -mdsfml value you will need to re-index the live view of the collection. The Step-Index.log will indicate if metadata has been truncated.

  7. After the re-indexing is complete, recheck the searches - first at the data model level, then in what is returned to the user. The metadata field should now be complete. If it is still truncated, run through the steps above again with larger values for the settings.
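The MBL experiment in step 4 can be sketched as follows. This is a rough illustration under stated assumptions, not an official utility: the URL and the sample metadata values are hypothetical, and both helper functions are invented for this example. The idea is to request the data model with different buffer sizes and see which metadata fields come back longer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_buffer_size(data_model_url, mbl):
    """Return the URL with its MBL parameter set to the given buffer size."""
    parts = urlsplit(data_model_url)
    # Remove any existing MBL value before appending the new one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "MBL"
    ]
    query.append(("MBL", str(mbl)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fields_that_grew(before, after):
    """Return fields whose value is longer (or newly present) in `after`."""
    return sorted(
        key for key, value in after.items() if len(value) > len(before.get(key, ""))
    )


# Hypothetical URL; try several sizes to find the smallest one that works.
url = "https://search.example.com/s/search.json?collection=example&query=report"
for size in (2000, 5000, 10000):
    print(with_buffer_size(url, size))

# metaData of the same result, copied from two responses (hypothetical values).
default_buffer = {"author": "J. Smith", "c": "An annual report on"}
large_buffer = {"author": "J. Smith", "c": "An annual report on regional services"}
print(fields_that_grew(default_buffer, large_buffer))
```

Fields that grow with a larger MBL were being cut by the metadata buffer. Fields that stay truncated at every size point towards the hook scripts or the -mdsfml indexer option covered in steps 5 and 6.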