Jsoup Filter example - extract and add metadata

This example shows how to extract some information from your document and add a corresponding metadata value:

package com.example.pluginexamples;

import com.funnelback.common.filter.jsoup.FilterContext;
import com.funnelback.common.filter.jsoup.IJSoupFilter;
import org.jsoup.nodes.Document;

/**
 * Demonstrates using a plugin to count the number of level 2 headings in a document.
 */
public class JsoupFilterExtractAddMetadata implements IJSoupFilter {

    @Override
    public void processDocument(FilterContext filterContext) {

        Document doc = filterContext.getDocument(); // (1)

        // Count the h2 elements in the document.
        int h2Count = doc.getElementsByTag("h2").size(); // (2)

        // Add the count as metadata, which can be mapped to a metadata class and used in searches.
        filterContext.getAdditionalMetadata().put("h2-count", Integer.toString(h2Count)); // (3)
    }

}
1 Load the document content as a jsoup Document object.
2 Select all the h2 elements in the document and count the matches.
3 Add the count of h2 elements as an additional metadata field, <meta name="h2-count">.
When you add metadata to a document using a jsoup filter, the metadata is inserted into the HTML as a <meta> element, which shows up when you view the document's cached version. This differs from document filters, which add the metadata to the document's metadata multimap, where it is stored separately from the document content.
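
To make the effect on the stored HTML concrete, the following is a minimal sketch of how name/value pairs could be rendered as <meta> elements. It is not the product's implementation: the class MetaElementSketch and its methods toMetaElements and escapeAttribute are hypothetical names used for illustration only, and the use of a content attribute to carry the value is an assumption based on standard HTML rather than on confirmed filter output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative only: models how additional metadata might appear in the
 * document HTML. The real serialization is done by the filter framework
 * and may differ in detail (this is an assumption, not its actual code).
 */
public class MetaElementSketch {

    /** Escapes characters that are unsafe inside a double-quoted HTML attribute. */
    public static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                    .replace("\"", "&quot;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
    }

    /** Renders every name/value pair as one meta element per line. */
    public static String toMetaElements(Map<String, List<String>> metadata) {
        StringBuilder html = new StringBuilder();
        for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
            // A metadata name may carry several values; emit one element per value.
            for (String value : entry.getValue()) {
                html.append("<meta name=\"")
                    .append(escapeAttribute(entry.getKey()))
                    .append("\" content=\"")
                    .append(escapeAttribute(value))
                    .append("\">\n");
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("h2-count", List.of("3")); // e.g. a document with three h2 headings
        System.out.print(toMetaElements(metadata));
        // prints: <meta name="h2-count" content="3">
    }
}
```

Under these assumptions, a document with three level 2 headings would gain one extra <meta> element in its HTML, which is what you would expect to see in the cached copy.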