The Stew API


Stew is a JavaScript library that extends CSS selectors with regular expressions.

It is primarily intended to be used in a Node.js environment.[1]

Installing

Stew is deployed as an npm module under the name stew-select. Hence you can install a pre-packaged version with the command:

npm install stew-select

and you can add it to your project as a dependency by adding a line like:

"stew-select": "latest"

to the dependencies or devDependencies part of your package.json file.

Importing

Stew can be loaded into a Node.js program as follows:

var Stew = require('stew-select').Stew;

The Stew type is an instantiable class, hence (if you're content with the default configuration) you might prefer this alternative:

var stew = new (require('stew-select')).Stew();

Stew also exposes a class named DOMUtil, which can be loaded like this:

var DOMUtil = require('stew-select').DOMUtil;

or like this:

var domutil = new (require('stew-select')).DOMUtil();

API

Stew

stew.select(dom,selector)

This variation of select accepts a DOM object (generated by node-htmlparser) and a string containing a CSS selector, and returns an array of the DOM nodes that match the selector.

For example:

var author_links = stew.select(dom, 'a[href][rel="author"]');

stew.select(html,selector,callback)

This variation of select accepts a string containing HTML, a string containing a CSS selector and a callback method (with the signature callback(err,nodeset)) and passes an array of matching DOM nodes to the callback.

The HTML is parsed using node-htmlparser, if available.

If an error occurs during parsing, it will be passed as the first argument to the callback.

For example:

stew.select(html, 'a[href][rel="author"]', function(err,nodeset) {
  if(err) {
    console.error(err);
  } else {
    console.log(nodeset);
  }
});
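
If you prefer Promises to callbacks, the callback form of select can be wrapped in a few lines. The following is a minimal sketch, not part of Stew: select_async is a hypothetical helper name, and it assumes only the select(html,selector,callback) signature described above.

```javascript
// Hypothetical helper (not part of Stew): wraps the callback form of
// stew.select in a Promise. `stew` is any object that exposes
// select(html, selector, callback) with the callback(err, nodeset) signature.
function select_async(stew, html, selector) {
  return new Promise(function(resolve, reject) {
    stew.select(html, selector, function(err, nodeset) {
      if (err) {
        reject(err);      // parsing errors reject the Promise
      } else {
        resolve(nodeset); // matching nodes resolve it
      }
    });
  });
}
```

With an html string in scope, this could be used as select_async(stew, html, 'a[href][rel="author"]').then(console.log, console.error).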

stew.select_first(dom,selector)

This variation of select_first accepts a DOM object (generated by node-htmlparser) and a string containing a CSS selector, and returns the first DOM node that matches the selector.

For example:

var title_tag = stew.select_first(dom, 'head title');

stew.select_first(html,selector,callback)

This variation of select_first accepts a string containing HTML, a string containing a CSS selector and a callback method (with the signature callback(err,node)) and passes the first matching DOM node to the callback.

The HTML is parsed using node-htmlparser, if available.

If an error occurs during parsing, it will be passed as the first argument to the callback.

For example:

stew.select_first(html, 'html title', function(err,title_tag) {
  if(err) {
    console.error(err);
  } else {
    console.log(domutil.to_text(title_tag));
  }
});
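
When several selectors need to be run against the same document, it may be worth parsing the HTML once and reusing the DOM with the synchronous forms of select. The sketch below illustrates this. select_many is a hypothetical helper, not part of Stew; it relies only on domutil.parse_html(html,callback) (described in the DOMUtil section) and stew.select(dom,selector). How select treats a document with more than one root node is not covered here.

```javascript
// Hypothetical helper (not part of Stew): parse `html` once, then run
// every selector in `selectors` against the resulting DOM.
// Calls callback(err, results), where results maps selector -> node array.
function select_many(stew, domutil, html, selectors, callback) {
  domutil.parse_html(html, function(err, dom) {
    if (err) {
      callback(err);
      return;
    }
    var results = {};
    try {
      selectors.forEach(function(selector) {
        // synchronous form: select(dom, selector) returns an array
        results[selector] = stew.select(dom, selector);
      });
    } catch (e) {
      // report selector errors through the callback as well
      callback(e);
      return;
    }
    callback(null, results);
  });
}
```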

DOMUtil

domutil.parse_html(html,callback)

parse_html accepts a string of HTML and a callback method (with the signature callback(err,node)). The HTML is parsed and the corresponding DOM node will be passed to the callback function.

If html contains more than one "root" node, an array of DOM nodes will be passed to the callback function.

The HTML is parsed using node-htmlparser, if available.

If an error occurs during parsing, it will be passed as the first argument to the callback.

For example, the JavaScript snippet:

var html = '<div>First doc</div> <span><i>Second</i> doc</span>';

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(dom.length);
  }
});

will output 2, and

var html = '<body>Only doc</body>';

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(dom.name);
  }
});

will output body.
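
Because the callback may receive either a single node or an array of nodes, calling code often has to handle both cases. One way to do that is to normalize the result first. The following is a small sketch under that assumption; as_node_array is a hypothetical helper, not part of DOMUtil.

```javascript
// Hypothetical helper (not part of DOMUtil): always returns an array,
// whether parse_html produced one root node or several.
function as_node_array(dom) {
  if (dom === null || dom === undefined) {
    return [];
  }
  return Array.isArray(dom) ? dom : [dom];
}
```

Inside a parse_html callback, as_node_array(dom).forEach(...) can then be used regardless of how many root nodes the input contained.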

domutil.to_text(node)

to_text accepts a DOM node and returns a string containing the text content of node and its descendants.

For example, the JavaScript snippet:

var html = '<span>This example has <b>bold</b> and <i>italic</i> text.</span>';

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(domutil.to_text(dom));
  }
});

will print:

This example has bold and italic text.

domutil.to_text(node,accept)

This variant of to_text accepts a DOM node and a boolean-valued filter (with the signature accept(node)) and returns a string containing the text content of node and of those descendants for which accept(node) returns true.

For example, the JavaScript snippet:

var html = '<span>This example has <b>bold</b> and <i>italic</i> text.</span>';
var not_italic = function(node) {
  return node.type != 'tag' || node.name != 'i';
};

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(domutil.to_text(dom,not_italic));
  }
});

will print:

This example has bold and  text.
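
Filters like not_italic can be generated rather than written by hand. The sketch below builds an accept function that rejects any tag in a given list. exclude_tags is a hypothetical helper, not part of DOMUtil; it assumes only the node.type and node.name properties used in the example above, and it compares tag names case-insensitively as a precaution.

```javascript
// Hypothetical helper (not part of DOMUtil): returns an accept(node)
// filter that rejects tag nodes whose name appears in `names`.
function exclude_tags(names) {
  var excluded = names.map(function(name) {
    return String(name).toLowerCase();
  });
  return function(node) {
    // non-tag nodes (such as text nodes) are always accepted
    if (node.type != 'tag') {
      return true;
    }
    return excluded.indexOf(String(node.name).toLowerCase()) === -1;
  };
}
```

Under these assumptions, domutil.to_text(dom, exclude_tags(['i'])) should behave like the not_italic example.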

domutil.to_html(node)

to_html accepts a DOM node and returns a string containing an HTML representation of the node and its children.

For example, the JavaScript snippet:

var html = '<span>This example has <b>bold</b> and <i>italic</i> text.</span>';

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(domutil.to_html(dom));
  }
});

will print:

<span>This example has <b>bold</b> and <i>italic</i> text.</span>

domutil.inner_html(node)

inner_html accepts a DOM node and returns a string containing an HTML representation of node's children.

For example, the JavaScript snippet:

var html = '<span>This example has <b>bold</b> and <i>italic</i> text.</span>';

domutil.parse_html( html, function(err,dom) {
  if(err) {
    console.error(err);
  } else {
    console.log(domutil.inner_html(dom));
  }
});

will print:

This example has <b>bold</b> and <i>italic</i> text.

  1. Although it probably wouldn't be difficult to make Stew work in a browser context, we haven't had any need for that, and so we haven't (yet) attempted to do it. Drop us a note if this is something you'd like to see Stew support.