libxml2-wasm supports parsing XML from a string or from a buffer:
import fs from 'node:fs';
import { XmlDocument } from 'libxml2-wasm';
const doc1 = XmlDocument.fromString('<note><to>Tove</to></note>');
const doc2 = XmlDocument.fromBuffer(fs.readFileSync('doc.xml'));
doc1.dispose();
doc2.dispose();
The underlying libxml2 library processes only multi-byte character strings (mostly UTF-8),
so a UTF-16 string in JavaScript needs an extra conversion step.
As a result, XmlDocument.fromBuffer is much faster than XmlDocument.fromString.
See the benchmark.
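To make the extra step concrete, here is a minimal sketch of converting a JavaScript string to UTF-8 bytes with the standard TextEncoder. It only illustrates the kind of conversion involved; how libxml2-wasm converts strings internally is not shown here and may differ.

```javascript
// Encode a JavaScript (UTF-16) string into UTF-8 bytes.
const xmlText = '<note><to>Tove</to></note>';
const utf8Bytes = new TextEncoder().encode(xmlText);

// ASCII characters take one byte each, so both lengths are equal here.
console.log(xmlText.length, utf8Bytes.length);

// Non-ASCII characters differ: U+00E9 is one UTF-16 code unit but two UTF-8 bytes.
const accentedBytes = new TextEncoder().encode('\u00e9');
console.log(accentedBytes.length);
```

A buffer read from a file is already in bytes, so this conversion is not needed before parsing.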
XInclude 1.0 is supported.
When libxml2 finds an <xi:include> tag and needs the content of another XML document,
it uses callbacks to read the data.
With xmlRegisterInputProvider, an XmlInputProvider object with a set of four callbacks can be registered.
These four callbacks are:
match
open
read
close
First, match is called with the URL of the included XML.
If match returns true, the other three callbacks are used to retrieve the content of the XML;
otherwise, other sets of callbacks are considered.
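The callback flow above can be sketched with an in-memory provider. This is an illustration only: the parameter lists and return values (a numeric descriptor from open, a byte count from read, a boolean from close), the `mem:///` URLs, and the names `memoryFiles` and `memoryProvider` are assumptions made for the example, so check the XmlInputProvider reference for the exact signatures.

```javascript
// Hypothetical in-memory store of includable documents, keyed by URL.
const memoryFiles = new Map([
  ['mem:///sub.xml', new TextEncoder().encode('<sub>included</sub>')],
]);

// Open descriptors: fd -> { data, offset }.
const openEntries = new Map();
let nextFd = 1;

const memoryProvider = {
  // Claim only the URLs this provider can serve.
  match(url) {
    return memoryFiles.has(url);
  },
  // Return a descriptor for the URL, or undefined if it cannot be opened.
  open(url) {
    const data = memoryFiles.get(url);
    if (data === undefined) return undefined;
    const fd = nextFd++;
    openEntries.set(fd, { data, offset: 0 });
    return fd;
  },
  // Copy the next chunk into buf and return the number of bytes copied
  // (0 at end of data, -1 for an unknown descriptor).
  read(fd, buf) {
    const entry = openEntries.get(fd);
    if (entry === undefined) return -1;
    const chunk = entry.data.subarray(entry.offset, entry.offset + buf.length);
    buf.set(chunk);
    entry.offset += chunk.length;
    return chunk.length;
  },
  // Release the descriptor and report whether it was open.
  close(fd) {
    return openEntries.delete(fd);
  },
};
```

An object of this shape would then be handed to xmlRegisterInputProvider before parsing a document that includes `mem:///sub.xml`.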
Sometimes the href attribute of the <xi:include> tag holds a relative path.
In this case, an initial URL can be passed to the parsing function,
so that libxml2 can compute the actual URL of the included XML.
For example, if the href is sub.xml
and the parent XML is parsed with the following call,
import fs from 'node:fs/promises';
import { XmlDocument } from 'libxml2-wasm';
const doc = XmlDocument.fromBuffer(
    await fs.readFile('/path/to/doc.xml'),
    { url: 'file:///path/to/doc.xml' },
);
doc.dispose();
the registered callbacks will be called with the file name file:///path/to/sub.xml.
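The resolution itself can be reproduced with the standard URL class. This is only a sketch of the arithmetic: libxml2 performs its own URI resolution, and the WHATWG URL API is used here purely for illustration.

```javascript
// Resolve a relative href against the URL of the including document.
const baseUrl = 'file:///path/to/doc.xml';
const href = 'sub.xml';
const resolvedUrl = new URL(href, baseUrl).href;

console.log(resolvedUrl); // file:///path/to/sub.xml
```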
For Node.js users who need callbacks for local file access,
the nodejs module predefines fsInputProviders,
which supports both file paths and file URLs.
To enable it, register this provider,
or simply call xmlRegisterFsInputProviders:
import fs from 'node:fs/promises';
import { XmlDocument } from 'libxml2-wasm';
import { xmlRegisterFsInputProviders } from 'libxml2-wasm/lib/nodejs.mjs';
xmlRegisterFsInputProviders();
const doc = XmlDocument.fromBuffer(
    await fs.readFile('path/to/doc.xml'),
    { url: 'path/to/doc.xml' },
);
doc.dispose();
XmlDocument.toBuffer dumps the content of the XML DOM tree into a buffer gradually,
and calls the XmlOutputBufferHandler to process the data.
Note that UTF-8 is the only supported encoding for now.
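Because the data arrives gradually, a handler typically accumulates chunks. The sketch below collects them and decodes them as UTF-8. The method names `write` and `close`, their return values, and the helper `createCollector` are assumptions made for this example; consult the XmlOutputBufferHandler reference for the actual interface.

```javascript
// Hypothetical handler that gathers output chunks and decodes them as UTF-8.
function createCollector() {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  return {
    // Called once per chunk; returns the number of bytes consumed.
    write(chunk) {
      // stream: true keeps a multi-byte character split across chunks intact.
      text += decoder.decode(chunk, { stream: true });
      return chunk.byteLength;
    },
    // Called when no more data will arrive; flushes the decoder.
    close() {
      text += decoder.decode();
      return true;
    },
    // Returns everything decoded so far.
    result() {
      return text;
    },
  };
}
```

Decoding in streaming mode matters because a chunk boundary can fall in the middle of a multi-byte UTF-8 sequence.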
Based on toBuffer, two more convenience functions are provided:
XmlDocument.toString and saveDocSync.
For example, to save an XML document as a compact string,
xml.toString({ format: false });
To save a formatted XML document to a file in a Node.js environment,
import { saveDocSync } from 'libxml2-wasm/lib/nodejs.mjs';
saveDocSync(xml);