Class SharedXMLUtils


  • public class SharedXMLUtils
    extends Object
    Shared XML utilities for parsing and transforming documents safely, with the aim of avoiding common XML vulnerabilities described by OWASP.
    • Constructor Detail

      • SharedXMLUtils

        private SharedXMLUtils()
    • Method Detail

      • getTransformer

        public static Transformer getTransformer​(String encoding)
        Get a Transformer for writing a Document back out to an output stream. The returned Transformer is not thread-safe, so it must not be shared between threads. Note: the filter framework is inherently called by multiple crawler threads, so a plugin should call this method each time it needs a Transformer and use the fresh instance, rather than obtaining one in its constructor for re-use.
        Parameters:
        encoding - the character encoding to use for the output, e.g. UTF-8
        Returns:
        A new transformer instance.
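
        The one-Transformer-per-call advice can be sketched with the plain JDK API. This is an illustration only, not the SharedXMLUtils source: the class TransformerSketch and its method newTransformer are hypothetical names, and the secure-processing feature is an assumption about the kind of setting a hardened factory might enable.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

// Hypothetical stand-in for illustration; not the SharedXMLUtils implementation.
class TransformerSketch {

    /** Builds a new identity Transformer on every call, so no instance is shared between threads. */
    static Transformer newTransformer(String encoding) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Assumption: ask the JAXP implementation to apply its secure-processing limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Transformer transformer = factory.newTransformer();
            // The requested encoding is used when the result is written to a byte stream.
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            return transformer;
        } catch (TransformerConfigurationException e) {
            throw new RuntimeException("Could not create a Transformer", e);
        }
    }
}
```

        A filter that is invoked concurrently would call such a method inside each invocation and keep the result in a local variable, never in a field.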
      • fromInputSource

        public static Document fromInputSource​(InputSource is)
                                        throws IllegalArgumentException,
                                               RuntimeException
        Parse a Document from an InputSource. Internally uses a custom DocumentBuilder instance with many security settings enabled.
        Parameters:
        is - the InputSource to parse, e.g. new InputSource(bufferedReader)
        Returns:
        Document - for use with a Transformer or XPath evaluation.
        Throws:
        IllegalArgumentException - when the XML from the input source cannot be parsed.
        RuntimeException - when something is wrong with the parser itself.
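
        The exact security settings are not listed here. As a rough sketch of what a hardened parse can look like with the JDK's built-in parser, the hypothetical class below enables secure processing, rejects DOCTYPE declarations, and maps failures to the two exception types documented above. The feature choices are assumptions drawn from common XXE guidance, not a description of what SharedXMLUtils actually sets.

```java
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical stand-in for illustration; not the SharedXMLUtils implementation.
class SecureParseSketch {

    static Document parse(InputSource source) {
        DocumentBuilder builder;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Xerces feature supported by the JDK's built-in parser:
            // any DOCTYPE declaration makes the parse fail.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            builder = factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            // The parser itself could not be set up.
            throw new RuntimeException("XML parser could not be configured", e);
        }
        try {
            return builder.parse(source);
        } catch (SAXException | IOException e) {
            // The input could not be read or was rejected as XML.
            throw new IllegalArgumentException("Input could not be parsed as XML", e);
        }
    }
}
```

        With these settings, a document that declares an internal entity in a DOCTYPE is rejected with an IllegalArgumentException instead of being expanded.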
      • fromInputStream

        public static Document fromInputStream​(InputStream is)
                                        throws IllegalArgumentException,
                                               RuntimeException
        Parse a Document from an InputStream. Internally uses a custom DocumentBuilder instance with many security settings enabled.

        Example usage from the filter framework:

            public FilterResult filterAsBytesDocument(BytesDocument document, FilterContext filterContext) {
                Document doc = SharedXMLUtils.fromInputStream(document.contentAsInputStream());
                // Do some processing to change or query the document.
                var bos = new ByteArrayOutputStream();
                SharedXMLUtils.getTransformer("UTF-8").transform(new DOMSource(doc), new StreamResult(bos));
                byte[] documentContentsAsBytes = bos.toByteArray();
                // Returning a FilterResult built from documentContentsAsBytes, and handling
                // TransformerException, are omitted from this example.
            }
        Parameters:
        is - the document to read and parse
        Returns:
        Document - after parsing.
        Throws:
        IllegalArgumentException
        RuntimeException
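
        For the "query the document" step, a parsed Document can be passed to the JDK's XPath API. The sketch below is an assumption-laden illustration: XPathQuerySketch, parse and textAt are hypothetical names, and parse is a plain JDK stand-in for SharedXMLUtils.fromInputStream so that the block runs on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical stand-in for illustration; not part of SharedXMLUtils.
class XPathQuerySketch {

    /** JDK-only parse, standing in for SharedXMLUtils.fromInputStream in this sketch. */
    static Document parse(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Evaluates an XPath expression against the document and returns its string value. */
    static String textAt(Document doc, String expression) {
        try {
            // XPath objects are not thread-safe either, so one is created per call.
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression: " + expression, e);
        }
    }
}
```

        For the input <page><title>Hello</title></page>, calling textAt(doc, "/page/title") returns the string Hello.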
      • fromFile

        public static Document fromFile​(File file)
        Parse a Document from a given File path. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
        Parameters:
        file - to parse into a Document
        Returns:
        Document after parsing.
      • toString

        public static String toString​(Document document)
        Converts a given Document back out to a String. Useful for testing/debugging.
        Parameters:
        document - the Document to serialize
        Returns:
        The document serialized as a String.

        public static byte[] toBytes​(Document document,
                                     String charcterEncodingOfBytes)
        Converts a given Document back out to a byte array. Useful for testing/debugging.
        Parameters:
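        document - the Document to serialize
        charcterEncodingOfBytes - the character encoding of the returned bytes, e.g. UTF-8
        Returns:
        The document serialized as a byte array in the given encoding.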
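
        To illustrate why the encoding parameter matters, the sketch below serializes the same Document to bytes in two encodings and parses each result back. It uses only the JDK; EncodingSketch, parse and toBytes are hypothetical names and do not reflect how SharedXMLUtils is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical stand-in for illustration; not the SharedXMLUtils implementation.
class EncodingSketch {

    /** Parses XML bytes; the parser detects the encoding from the bytes and the XML declaration. */
    static Document parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Serializes the document to bytes in the requested character encoding. */
    static byte[] toBytes(Document document, String encoding) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // Writing to a byte stream lets the serializer apply the encoding itself.
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new RuntimeException("Could not serialize the document", e);
        }
    }
}
```

        Because the serializer does the encoding, non-ASCII text such as "café" should survive a round trip through either UTF-8 or ISO-8859-1, provided the bytes are handed back to an XML parser rather than decoded by hand with a guessed charset.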
      • toString

        public static String toString​(String xml)
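        Round-trip conversion: parses the given XML string and serializes it back out as XML, with the secure parsing and transformation settings applied.
        Parameters:
        xml - the XML string to round-trip
        Returns:
        The re-serialized XML as a String.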
      • fromString

        public static Document fromString​(String xmlString)
        Parse a Document from a given String. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
        Parameters:
        xmlString - to parse into a Document
        Returns:
        Document after parsing.
      • fromBytes

        public static Document fromBytes​(byte[] xml)
        Parse a Document from a given byte[]. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
        Parameters:
        xml - to parse into a Document
        Returns:
        Document after parsing.