Class XmlIndexingConfig


  • public class XmlIndexingConfig
    extends Object
    • Field Detail

      • contentPaths

        private List<ContentPath> contentPaths
        When empty, the field 'whenNoContentPaths' describes what is indexed; when non-empty, only the text within the given paths is indexed.
      • documentPaths

        private List<DocumentPath> documentPaths
        Defines where the documents are within the XML; this can be used to split XML documents. As with urlPaths, it is recommended to do this within a filter.
      • fileTypePaths

        private List<FileTypePath> fileTypePaths
        The type at this path (e.g. HTML, PDF, DOC) will be used by the query process to report the original document type. The last file type found will be used.
      • innerDocumentPaths

        private List<InnerDocumentPath> innerDocumentPaths
        Maps an element within an XML document which contains an (XML-escaped) document that may itself be HTML/XML/text. For example, /root/html could be a path to an element which contains HTML; the indexer will index that document as though it is HTML.
      • urlPaths

        private List<UrlPath> urlPaths
        The URL at this path will be used as the document's URL. This will typically cause cached copies to no longer work, it cannot be used with Push collections, and the path must come before inner HTML documents with links. It is recommended that filtering be used to change the URL instead, to avoid those issues.
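The contentPaths rule above (an empty list defers to the fallback setting, a non-empty list restricts indexing to the listed paths) can be sketched as follows. This is only an illustration, not the indexer's implementation: the class `ContentPathSelectionSketch`, the `Fallback` enum and its two values, and the use of plain strings for paths are all hypothetical stand-ins, and treating descendant elements as "within" a path is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ContentPathSelectionSketch {

    // Hypothetical stand-in for the fallback setting; the real
    // WhenNoContentPathsAreSet values are not listed in this section.
    public enum Fallback { INDEX_ALL_TEXT, INDEX_NOTHING }

    // Returns true if text at elementPath would be indexed under the rule
    // described for contentPaths. Assumption: a path also covers its descendants.
    public static boolean isIndexed(String elementPath, List<String> contentPaths, Fallback fallback) {
        if (contentPaths.isEmpty()) {
            // No content paths: the fallback setting decides what is indexed.
            return fallback == Fallback.INDEX_ALL_TEXT;
        }
        for (String path : contentPaths) {
            if (elementPath.equals(path) || elementPath.startsWith(path + "/")) {
                return true;
            }
        }
        // Content paths are set and none of them covers this element.
        return false;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/doc/title", "/doc/body");
        List<String> none = Collections.emptyList();

        System.out.println(isIndexed("/doc/body/p", paths, Fallback.INDEX_NOTHING)); // true
        System.out.println(isIndexed("/doc/meta", paths, Fallback.INDEX_ALL_TEXT));  // false
        System.out.println(isIndexed("/doc/meta", none, Fallback.INDEX_ALL_TEXT));   // true
    }
}
```

Note that once any content path is set, the fallback is ignored entirely, which is why the second call prints false even though the fallback would index everything.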
    • Method Detail

      • getContentPaths

        public List<ContentPath> getContentPaths()
        When empty, the field 'whenNoContentPaths' describes what is indexed; when non-empty, only the text within the given paths is indexed.
      • getDocumentPaths

        public List<DocumentPath> getDocumentPaths()
        Defines where the documents are within the XML; this can be used to split XML documents. As with urlPaths, it is recommended to do this within a filter.
      • getFileTypePaths

        public List<FileTypePath> getFileTypePaths()
        The type at this path (e.g. HTML, PDF, DOC) will be used by the query process to report the original document type. The last file type found will be used.
      • getInnerDocumentPaths

        public List<InnerDocumentPath> getInnerDocumentPaths()
        Maps an element within an XML document which contains an (XML-escaped) document that may itself be HTML/XML/text. For example, /root/html could be a path to an element which contains HTML; the indexer will index that document as though it is HTML.
      • getUrlPaths

        public List<UrlPath> getUrlPaths()
        The URL at this path will be used as the document's URL. This will typically cause cached copies to no longer work, it cannot be used with Push collections, and the path must come before inner HTML documents with links. It is recommended that filtering be used to change the URL instead, to avoid those issues.
      • setWhenNoContentPathsAreSet

        public void setWhenNoContentPathsAreSet(WhenNoContentPathsAreSet whenNoContentPathsAreSet)
        May be set to null.
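To illustrate what an inner document path such as /root/html refers to, the sketch below uses only the JDK's built-in XML APIs to read an XML-escaped HTML fragment out of an element. It is not how the indexer itself works; the class `InnerDocumentSketch` and the method `extractInnerDocument` are hypothetical names, and evaluating the path as an XPath expression is an assumption made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InnerDocumentSketch {

    // Returns the text content of the element at the given path. Because the
    // XML parser resolves entities such as &lt; and &gt;, an XML-escaped inner
    // document comes back as its original markup.
    public static String extractInnerDocument(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(path, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read inner document at " + path, e);
        }
    }

    public static void main(String[] args) {
        // The <html> element holds an escaped HTML fragment as plain text.
        String xml = "<root><html>&lt;p&gt;Hello&lt;/p&gt;</html></root>";
        System.out.println(extractInnerDocument(xml, "/root/html")); // <p>Hello</p>
    }
}
```

The recovered string is what would then be treated as a document in its own right (HTML in this example).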