Package com.funnelback.common.utils
Class SharedXMLUtils
java.lang.Object
    com.funnelback.common.utils.SharedXMLUtils
public class SharedXMLUtils extends Object
Shared XML Utilities that can help parse documents and transform them in a safe way that aims to avoid common OWASP vulnerabilities.
-
-
Nested Class Summary
Nested Classes
private static class SharedXMLUtils.NoOpEntityResolver
-
Field Summary
Fields
private static SharedXMLUtils.NoOpEntityResolver NOOP_ENTITY_RESOLVER
private static TransformerFactory tf
static String XML
    XML file extension
-
Constructor Summary
Constructors
private SharedXMLUtils()
-
Method Summary
Methods
private static DocumentBuilder documentBuilder()
static Document fromBytes(byte[] xml)
    Parse a Document from a given byte[].
static Document fromFile(File file)
    Parse a Document from a given File path.
static Document fromInputSource(InputSource is)
    Parse a Document from an inputSource.
static Document fromInputStream(InputStream is)
    Parse a Document from an inputStream.
static Document fromString(String xmlString)
    Parse a Document from a given String.
static Transformer getTransformer(String encoding)
    Get a transformer that can be used for transforming a Document back out into an output stream.
static byte[] toBytes(Document document, String characterEncodingOfBytes)
    Converts a given Document back out to a byte array.
static String toString(String xml)
    Round trip conversion of a given XML string back into XML with the secure transformations applied.
static String toString(Document document)
    Converts a given Document back out to a String.
-
-
-
Field Detail
-
XML
public static final String XML
XML file extension
See Also:
- Constant Field Values
-
tf
private static final TransformerFactory tf
-
NOOP_ENTITY_RESOLVER
private static final SharedXMLUtils.NoOpEntityResolver NOOP_ENTITY_RESOLVER
-
-
Method Detail
-
getTransformer
public static Transformer getTransformer(String encoding)
Get a transformer that can be used for transforming a Document back out into an output stream. Note that the returned Transformer is not thread-safe, so it must not be shared between threads. The filter framework is inherently called by multiple crawler threads, so a plugin should call this method each time it needs a transformer rather than obtaining one in its constructor for reuse.
Parameters:
encoding - specifies the preferred character encoding, e.g. UTF-8
Returns:
A new transformer instance.
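
The thread-safety note above can be illustrated with JDK classes only. This is a minimal sketch, not the SharedXMLUtils implementation: FreshTransformerSketch, newTransformer and serialize are hypothetical names, and the synchronized factory access is an assumption made for the example. Each task obtains its own Transformer instead of sharing one.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

final class FreshTransformerSketch {

    // Shared factory; access is synchronized because factories are not guaranteed thread-safe.
    private static final TransformerFactory FACTORY = TransformerFactory.newInstance();

    // Hypothetical stand-in for SharedXMLUtils.getTransformer(encoding):
    // returns a new Transformer on every call.
    static synchronized Transformer newTransformer(String encoding) {
        try {
            Transformer transformer = FACTORY.newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            return transformer;
        } catch (TransformerConfigurationException e) {
            throw new RuntimeException("Could not create a Transformer", e);
        }
    }

    // Builds a one-element document and serializes it with its own Transformer.
    static String serialize(String rootName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement(rootName));
            StringWriter out = new StringWriter();
            newTransformer("UTF-8").transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException("Serialization failed", e);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Each task calls serialize(), which requests a fresh Transformer.
            List<Callable<String>> tasks = List.of(() -> serialize("a"), () -> serialize("b"));
            for (Future<String> result : pool.invokeAll(tasks)) {
                System.out.println(result.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

In a plugin, the same pattern would mean calling SharedXMLUtils.getTransformer inside the filter method rather than storing the result in a field.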
-
fromInputSource
public static Document fromInputSource(InputSource is) throws IllegalArgumentException, RuntimeException
Parse a Document from an inputSource. Internally uses a custom documentBuilder instance with many security settings enabled.
Parameters:
is - e.g. new InputSource(bufferedReader)
Returns:
Document - for use with a transformer or XPath evaluation.
Throws:
IllegalArgumentException - when the XML from the input source is bad.
RuntimeException - when something is wrong with the parser itself.
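
The exact security settings are not listed on this page. As a rough sketch of what a hardened parser of this kind might look like, the example below applies settings commonly recommended against XXE to a JDK DocumentBuilderFactory, installs a no-op entity resolver, and maps failures to the two exception types documented above. HardenedParserSketch and its methods are hypothetical names, and the chosen features are assumptions, not the confirmed SharedXMLUtils configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class HardenedParserSketch {

    // Assumed settings; the real SharedXMLUtils configuration may differ.
    static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Do not load external general or parameter entities, or external DTDs.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // No-op resolver: any entity that is still looked up resolves to empty content.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return builder;
    }

    // Bad XML becomes IllegalArgumentException; parser setup problems become RuntimeException.
    static Document parse(InputSource source) {
        try {
            return newBuilder().parse(source);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not valid XML", e);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException("XML parser could not be configured", e);
        }
    }

    // Convenience helper for the demo: parse a string and return the root tag name.
    static String rootName(String xml) {
        return parse(new InputSource(new StringReader(xml))).getDocumentElement().getTagName();
    }

    public static void main(String[] args) {
        // The external entity is declared but never fetched from the file system.
        String xxe = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent-xxe-target\">]>"
                + "<foo>&xxe;</foo>";
        System.out.println(rootName(xxe));
    }
}
```

With the real class, callers only need SharedXMLUtils.fromInputSource(new InputSource(reader)); the sketch is meant to show the kind of configuration that call hides.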
-
fromInputStream
public static Document fromInputStream(InputStream is) throws IllegalArgumentException, RuntimeException
Parse a Document from an inputStream. Internally uses a custom documentBuilder instance with many security settings enabled.
Example usage from the filter framework:

    public FilterResult filterAsBytesDocument(BytesDocument document, FilterContext filterContext) {
        Document doc = SharedXMLUtils.fromInputStream(document.contentAsInputStream());
        // Do some processing to change or query the document.
        var bos = new ByteArrayOutputStream();
        SharedXMLUtils.getTransformer("UTF-8").transform(new DOMSource(doc), new StreamResult(bos));
        byte[] documentContentsAsBytes = bos.toByteArray();
        // Building and returning the FilterResult is omitted from this example.
    }

Parameters:
is - document to read/parse
Returns:
Document - after parsing.
Throws:
IllegalArgumentException
RuntimeException
-
fromFile
public static Document fromFile(File file)
Parse a Document from a given File path. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
Parameters:
file - to parse into a Document
Returns:
Document after parsing.
-
toString
public static String toString(Document document)
Converts a given Document back out to a String. Useful for testing/debugging.
Parameters:
document - the entire HTML or XML document
-
toBytes
public static byte[] toBytes(Document document, String characterEncodingOfBytes)
Converts a given Document back out to a byte array. Useful for testing/debugging.
Parameters:
document - the entire HTML or XML document
characterEncodingOfBytes - specifies the preferred character encoding, e.g. UTF-8
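
To show why the encoding parameter matters, here is a small JDK-only sketch. It assumes toBytes behaves like an identity transform with the output encoding set; that is a guess, not something this page confirms. ToBytesSketch and roundTripText are hypothetical names. The document is serialized in the requested encoding and parsed back, so non-ASCII text should survive whichever encoding is chosen.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

final class ToBytesSketch {

    // Hypothetical stand-in for SharedXMLUtils.toBytes(document, characterEncodingOfBytes).
    static byte[] toBytes(Document document, String encoding) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // The XML declaration written to the output names this encoding.
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(document), new StreamResult(bos));
            return bos.toByteArray();
        } catch (Exception e) {
            throw new RuntimeException("Serialization failed", e);
        }
    }

    // Writes <title>text</title> in the given encoding, parses the bytes back,
    // and returns the text that was recovered.
    static String roundTripText(String text, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("title")).setTextContent(text);
            byte[] bytes = toBytes(doc, encoding);
            // The bytes were produced locally, so a default (unhardened) parser is used here.
            Document parsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripText("caf\u00e9", "UTF-8"));
        System.out.println(roundTripText("caf\u00e9", "ISO-8859-1"));
    }
}
```

When decoding the returned bytes by hand, for example with new String(bytes, charset), the charset should match the encoding passed to toBytes.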
-
toString
public static String toString(String xml)
Round trip conversion of a given XML string back into XML with the secure transformations applied.
Parameters:
xml - XML string to verify
-
fromString
public static Document fromString(String xmlString)
Parse a Document from a given String. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
Parameters:
xmlString - to parse into a Document
Returns:
Document after parsing.
-
fromBytes
public static Document fromBytes(byte[] xml)
Parse a Document from a given byte[]. Useful for testing. Internally uses a custom documentBuilder instance with many security settings enabled.
Parameters:
xml - to parse into a Document
Returns:
Document after parsing.
-
documentBuilder
private static DocumentBuilder documentBuilder()
-
-