Built-in filters - Filter all documents as XML (ForceXMLMime)

The ForceXMLMime filter forces every document processed by the filter framework to report a MIME type of text/xml. This is useful when the data source contains only XML content (a single XML file or a set of XML files) and the web server hosting the files does not return the correct MIME type.

This filter should only be used if all the files being processed are XML files. Using it with other file types may result in unexpected behavior or filter errors.

The indexer also provides a -forcexml option that forces it to process all documents as XML. This can be used as an alternative when no custom filter in the chain needs to process the XML files.

Enabling

To enable the filter, add ForceXMLMime to the filter chain.

The ForceXMLMime filter must appear in the filter chain before other filters that rely on the text/xml MIME type.
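
The ordering rule above can be illustrated with a small, self-contained sketch. It does not use the product's filter API: the Doc class, the forceXmlMime and transformXml methods, and the run helper are all hypothetical stand-ins, written only to show why a filter that checks for text/xml has to run after the MIME type has been forced.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in types for illustration only; this is not the real filter API.
final class FilterOrderSketch {

    /** A document reduced to the two fields this sketch needs. */
    static final class Doc {
        final String mimeType;
        final String content;

        Doc(String mimeType, String content) {
            this.mimeType = mimeType;
            this.content = content;
        }
    }

    /** Stand-in for ForceXMLMime: relabels every document as text/xml. */
    static Doc forceXmlMime(Doc doc) {
        return new Doc("text/xml", doc.content);
    }

    /** Stand-in for a custom XML filter: acts only on documents labelled text/xml. */
    static Doc transformXml(Doc doc) {
        if (!"text/xml".equals(doc.mimeType)) {
            return doc; // not XML as far as this filter can tell, so pass it through
        }
        return new Doc(doc.mimeType, "<wrapped>" + doc.content + "</wrapped>");
    }

    /** Applies the filters in order, feeding each one the previous filter's output. */
    static Doc run(List<UnaryOperator<Doc>> chain, Doc doc) {
        Doc current = doc;
        for (UnaryOperator<Doc> filter : chain) {
            current = filter.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // The web server mislabels an XML document as text/plain.
        Doc served = new Doc("text/plain", "<a/>");

        List<UnaryOperator<Doc>> forceFirst =
                List.of(FilterOrderSketch::forceXmlMime, FilterOrderSketch::transformXml);
        List<UnaryOperator<Doc>> forceLast =
                List.of(FilterOrderSketch::transformXml, FilterOrderSketch::forceXmlMime);

        System.out.println(run(forceFirst, served).content);
        System.out.println(run(forceLast, served).content);
    }
}
```

With the forcing step first, the XML filter sees text/xml and transforms the document. With the order reversed, the XML filter sees text/plain and passes the content through untouched, even though the document leaves the chain labelled text/xml.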

Example

To force the gatherer to treat all documents as XML and then process each document with a custom transformXML filter:

filter.classes=<default_filter_chain>:ForceXMLMime:com.example.transformXML