libxml2-wasm
Why Another Xml Lib?
Comparing to the previous two main approaches, pure javascript implementation as well as traditional C implementation binding, using WebAssembly combines the pros from both sides, providing good performance while keeping best compatibility with modern Javascript runtime.
Javascript Implementation | Traditional C Binding | WebAssembly | |
---|---|---|---|
Parsing Speed | Average1 | Fast | Fast |
C/C++ Toolchain at Runtime Environment | Not required | Required2 | Not Required |
Prebuilt Binaries | N/A | One for each OS/Runtime version | Universal for all OS/Runtime versions |
Prebuilt Binary Compatibility | N/A | May broke across libc versions | Very Good |
Browser Compatibility | Yes | No | Yes |
Supported Environments
Due to the usage of WebAssembly, ES module and top level await etc, it requires the minimum version of the following environments,
Environment | Version |
---|---|
NodeJs | v16+ |
Chrome | V89+ |
Edge | V89+ |
Safari | v15+ |
Getting Started
Install
Install the libxml2-wasm
npm package in your most convenient way, e.g.
npm i libxml2-wasm
Import the lib
libxml2-wasm
is an ES module, importing it are different between ES module and commonJS module.
ESM
Import it directly.
import { XmlDocument } from 'libxml2-wasm';
const doc = XmlDocument.fromString('<note><to>Tove</to></note>');
doc.dispose();
CommonJS
Dynamic import is needed:
import('libxml2-wasm').then(({ XmlDocument }) => {
const doc = XmlDocument.fromString('<note><to>Tove</to></note>');
doc.dispose();
});
IMPORTANT: dispose()
is required to avoid memory leak.
Parsing XML
libxml2-wasm supports parsing xml from a string or from a buffer:
import fs from 'node:fs';
import { XmlDocument } from 'libxml2-wasm';
const doc1 = XmlDocument.fromString('<note><to>Tove</to></note>');
const doc2 = XmlDocument.fromBuffer(fs.readFileSync('doc.xml'));
doc1.dispose();
doc2.dispose();
The underlying libxml2 library processes MBCS(mostly UTF-8) only,
the UTF-16 string in Javascript needs an extra step to be converted,
thus XmlDocument.fromBuffer
is much faster than XmlDocument.fromString
.
See the benchmark.
Query nodes
XmlNode
has get
and find
methods which both use xpath to find the node.
Their different is, get
returns the first found node while find
returns all found nodes.
import { XmlDocument } from 'libxml2-wasm';
const doc = XmlDocument.fromString('<note><to>Amy</to><to>Bob</to></note>');
console.log(doc.root.get('to').content); // Amy
console.log(doc.root.find('to').map((node) => node.content).join()); // Amy,Bob
doc.dispose();
Although get
and find
can be used to get attributes of an element,
attr()
and attrs
could be more efficient:
import { XmlDocument } from 'libxml2-wasm';
const doc = XmlDocument.fromString('<line from="left" to="right"/>');
console.log(doc.root.get('@from').content); // left
console.log(doc.root.attr('from').content); // left
console.log(doc.root.find('@*').map((node) => node.content).join()); // left,right
console.log(doc.root.attrs.map((node) => node.content).join()); // left,right
doc.dispose();
When an XPath is used many times,
you could create an XmlXPath
object to avoid redundantly parsing XPath string.
import { XmlDocument, XmlXPath } from 'libxml2-wasm';
const xpath = new XmlXPath('/book/title');
const doc1 = XmlDocument.fromString('<book><title>Harry Potter</title></book>');
const doc2 = XmlDocument.fromString('<book><title>Learning XML</title></book>');
console.log(doc1.get(xpath).content); // Harry Potter
console.log(doc2.get(xpath).content); // Learning XML
doc1.dispose();
doc2.dispose();
xpath.dispose();
Note that similar to XmlDocument
, XmlXPath
owns native memory and needs to be disposed explicitly.
Validating XML
To validate an XML, create the validator from the schema first, then use the validator to validate the XML document.
import fs from 'node:fs';
import { XmlDocument, XsdValidator } from 'libxml2-wasm';
const schema = XmlDocument.fromBuffer(fs.readFileSync('schema.xsd'));
const validator = XsdValidator.fromDoc(schema);
const doc = XmlDocument.fromBuffer(fs.readFileSync('document.xml'));
try {
validator.validate(doc);
} catch (err) {
console.log(err.message);
}
doc.dispose();
validator.dispose();
schema.dispose();
RELAX NG is also supported, with another validator class RelaxNGValidator
.
For the further detail of the APIs, please check the API Doc.