13 ops snippets
Gracefully closing Node.js applications via signal handling
To make your Node.js application gracefully respond to shutdown signals, use `process.on(SIGNAL, HANDLER)`. For example, to respond to `SIGINT` (typically Ctrl-c), you can use:
process.on( "SIGINT", function() {
console.log('CLOSING [SIGINT]');
process.exit();
} );
Note that without the `process.exit()`, the program will not shut down. (This is your chance to override or "trap" the signal.)
Some common examples (in CoffeeScript):
process.on 'SIGHUP', ()->console.log('CLOSING [SIGHUP]'); process.exit()
process.on 'SIGINT', ()->console.log('CLOSING [SIGINT]'); process.exit()
process.on 'SIGQUIT', ()->console.log('CLOSING [SIGQUIT]'); process.exit()
process.on 'SIGABRT', ()->console.log('CLOSING [SIGABRT]'); process.exit()
process.on 'SIGTERM', ()->console.log('CLOSING [SIGTERM]'); process.exit()
PS: On Linux (and similar) you can enter `kill -l` on the command line to see a list of possible signals, and `kill -N PID` to send signal N to the process with process ID PID.
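For example (assuming a hypothetical process ID of 12345), you could exercise the handlers above from another terminal:

kill -l              # list the available signals
kill -SIGINT 12345   # send SIGINT to process 12345, just like Ctrl-c
kill -SIGTERM 12345  # ask the process to terminate gracefully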
Command-line tool for spidering sites and extracting XML/HTML content
Xidel is a robust tool for spidering, extracting, and transforming XML/HTML content from the command line. It's like `wget` or `curl` with a CSS and XPath/XQuery engine (among other features) attached.

`xidel` doesn't seem to be in the package management repositories I normally use, but you can download it from the Xidel site.
The following example will (1) download a web page, (2) extract a list of links (specified via CSS selector) from it, (3) download the page corresponding to each of those links and finally (4) extract specific pieces of content (specified by CSS selectors) from each page:
xidel [URL-OF-INDEX-PAGE] \
--follow "css('[CSS-SELECTOR-FOR-LINKS]')" \
--css "[CSS-SELECTOR-FOR-SOME-TEXT]" \
--extract "inner-html(css('[CSS-SELECTOR-FOR-SOME-HTML]'))"
As a concrete example, the command:

$ xidel http://reddit.com -f "css('a')" --css title

will download every page linked from the reddit.com homepage and print the content of its `title` tag.
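If you prefer XPath to CSS selectors, xidel accepts raw XPath expressions as well; a minimal sketch (placeholder URL):

# print the href attribute of every anchor on the page
xidel https://example.com --extract "//a/@href"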
There are several more examples on the Xidel site.
Shell script for service-like CoffeeScript/Node.js apps using forever
This is an example of a (bash) shell script that uses the forever module to start and stop a CoffeeScript application as if it were a service.
#!/bin/bash
# This is an example of a (bash) shell script that uses the forever module ([1])
# to start and stop a CoffeeScript application as if it were a service.
#
# [1] <https://github.com/nodejitsu/forever>

# ASSUMPTIONS
################################################################################
# 1) You've got a CoffeeScript program you want to run via `forever`.
#    To use plain Node.js/SSJS, remove the bits about `COFFEE_EXE`
#    and change the `forever` command within the `start()` routine.
#
# 2) You've got a configuration file at `config/NODE_ENV.json` (where
#    `NODE_ENV` is the value of the corresponding environment variable).
#    If you don't care about this, remove the bits about checking
#    for `NODE_ENV` and `config/NODE_ENV.json`.
#
# 3) `coffee` is already in your path or lives at `./node_modules/.bin`.
#
# 4) `forever` is already in your path or lives at `./node_modules/.bin`.
#
# 5) The CoffeeScript file you want to run is located at
#    `./lib/APP-NAME.coffee`, where `APP-NAME` is the name of this file.

# CONFIGURATION
################################################################################
APP="lib/`basename $0`.coffee"   # per assumption 5 above
CONFIG_DIR="./config"
LOGFILE="forever-`basename $0`.log"
OUTFILE="forever-`basename $0`.out"
ERRFILE="forever-`basename $0`.err"

## DISCOVER COFFEE EXE
if command -v coffee >/dev/null 2>&1; then
  COFFEE_EXE="coffee"
else
  COFFEE_EXE="./node_modules/.bin/coffee"
fi

## DISCOVER FOREVER EXE
if command -v forever >/dev/null 2>&1; then
  FOREVER_EXE="forever"
else
  FOREVER_EXE="./node_modules/.bin/forever"
fi

# ROUTINES
################################################################################
usage() {
  echo "Usage: `basename $0` {start|stop|restart|status|checkenv}" >&2;
}

start() {
  # check for NODE_ENV before launching (but launch anyway)
  if [[ -z "${NODE_ENV}" ]]; then
    echo -e "\n!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  ${FOREVER_EXE} start -a -l ${LOGFILE} -o ${OUTFILE} -e ${ERRFILE} -c ${COFFEE_EXE} ${APP};
}

stop() { ${FOREVER_EXE} stop ${APP}; }

status() { ${FOREVER_EXE} list; }

checkenv() {
  STATUS=0
  echo -e "\nChecking prerequisites.\n"
  # check for NODE_ENV
  if [[ ! -z "${NODE_ENV}" ]]; then
    echo -e "NODE_ENV: SET - ${NODE_ENV}\n"
  else
    echo -e "NODE_ENV: NOT SET\n"
    echo -e "!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  # check for config/NODE_ENV.json
  if [[ -e "${CONFIG_DIR}/${NODE_ENV}.json" ]]; then
    echo -e "  CONFIG: FOUND - ${CONFIG_DIR}/${NODE_ENV}.json\n"
  else
    echo -e "  CONFIG: NOT FOUND - ${CONFIG_DIR}/${NODE_ENV}.json"
    echo -e "!WARNING! You probably want to ensure that the file ${CONFIG_DIR}/[NODE_ENV].json exists.\n"
    STATUS=3
  fi
  # check for coffee
  if command -v ${COFFEE_EXE} >/dev/null 2>&1; then
    echo -e "  COFFEE: FOUND - ${COFFEE_EXE}\n"
  else
    echo "  COFFEE: NOT FOUND - ${COFFEE_EXE}"
    echo -e "!WARNING! The coffee executable could not be found. Is it in your PATH?\n"
    STATUS=4
  fi
  # check for forever
  if command -v ${FOREVER_EXE} >/dev/null 2>&1; then
    echo -e " FOREVER: FOUND - ${FOREVER_EXE}\n"
  else
    echo " FOREVER: NOT FOUND - ${FOREVER_EXE}"
    echo -e "!WARNING! The forever executable could not be found. Is it in your PATH?\n"
    STATUS=5
  fi
  # report status
  if [ $STATUS -ne 0 ]; then
    echo -e "!WARNING! Required files or programs not found.\n          The application may not work properly.\n"
  else
    echo -e "Everything seems to check out OK.\n"
  fi
  exit $STATUS
}

# MAIN LOOP
################################################################################
if [[ -z "${1}" ]]; then
  usage
  exit 1
else
  case "$1" in
    start)
      start;
      ;;
    restart)
      stop; sleep 1; start;
      ;;
    stop)
      stop
      ;;
    status)
      status
      ;;
    checkenv)
      checkenv $1
      ;;
    *)
      usage
      exit 6
      ;;
  esac
  exit 0
fi
################################################################################
# (eof)
(Also at rodw/coffee-as-a-service-via-forever.sh.)
Backup or mirror a website using wget
To create a local mirror or backup of a website with `wget`, run:
wget -r -l 5 -k -w 1 --random-wait <URL>
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-k` (or `--convert-links`) will cause `wget` to convert links in the downloaded documents so that the files can be viewed locally
- `-w N` (or `--wait=N`) will cause `wget` to wait N seconds between requests
- `--random-wait` will cause `wget` to randomly vary the wait time from `0.5x` to `1.5x` the value specified by `--wait`
Some additional notes:

- `--mirror` (or `-m`) can be used as a shortcut for `-r -N -l inf --no-remove-listing`, which enables infinite recursion and preserves both the server timestamps and FTP directory listings.
- `-np` (or `--no-parent`) can be used to limit `wget` to files below a specific "directory" (path).
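Combining these, a sketch of a polite mirror limited to a single subtree (the URL is a placeholder):

wget -m -np -k -w 1 --random-wait https://example.com/docs/

Here `--mirror` turns on infinite recursion and timestamping, `--no-parent` keeps the crawl below `/docs/`, and `-k` rewrites links for local browsing.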
Pre-generate pages or load a web cache using wget
Many web frameworks and template engines will defer generating the HTML version of a document until the first time it is accessed. This can make the first hit on a given page significantly slower than subsequent hits.
You can use `wget` to pre-cache web pages using a command such as:
wget -r -l 3 -nd --delete-after <URL>
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-nd` (or `--no-directories`) will prevent `wget` from creating local directories to match the server-side paths
- `--delete-after` will cause `wget` to delete each file as soon as it is downloaded (so the command leaves no traces behind)
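If the goal is to keep a cache warm, a command like that can be scheduled via cron; a sketch (the URL and schedule are placeholders):

# m h dom mon dow   command
0 6 * * * wget -r -l 3 -nd --delete-after -q https://example.com/

The `-q` (`--quiet`) flag suppresses wget's progress output so cron doesn't mail it to you.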
Mapping port 80 to port 3000 using iptables
Port numbers less than 1024 are considered "privileged" ports, and you generally must be `root` to bind a listener to them.

Rather than running a network application as `root`, map the privileged port to a non-privileged one:
sudo iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3000
Now requests to port 80 will be forwarded on to port 3000.
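To inspect or remove the rule later, standard iptables bookkeeping applies (the rule number below is illustrative):

sudo iptables -t nat -L PREROUTING --line-numbers -n   # list NAT rules with their numbers
sudo iptables -t nat -D PREROUTING 1                   # delete the redirect by its rule number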
Cheat Sheet for Linux Run Levels
"Standard" Linux uses the following run levels:
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2 is multi-user mode (without networking).
- Run Level 3 is multi-user mode (with networking). This is the normal "terminal" mode (i.e., before the display manager is run).
- Run Level 4 is undefined.
- Run Level 5 is multi-user mode with a GUI display manager (X11).
- Run Level 6 is reboot.

In Debian and its derivatives, run levels 2 through 5 are the same: multi-user mode with networking, and with a display manager if available.
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2-5 is multi-user mode with networking and a GUI display manager when available.
- Run Level 6 is reboot.
Debian also adds Run Level S, which is executed when the system first boots.
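On a SysV-style system you can check the current run level and switch to another one from the shell, for example:

who -r            # report the current run level
sudo telinit 1    # switch to single-user mode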
Also see Wikipedia's article on run levels.
Find duplicate files on Linux (or OSX).
Find files that have the same size and MD5 hash (and hence are likely to be exact duplicates):
find . -not -empty -type f -printf "%s\n" |          # line 1
sort -rn |                                           # line 2
uniq -d |                                            # line 3
xargs -I{} -n1 find . -type f -size {}c -print0 |    # line 4
xargs -0 md5sum |                                    # line 5
sort |                                               # line 6
uniq -w32 --all-repeated=separate |                  # line 7
cut -d" " -f3-                                       # line 8
You probably want to pipe the output to a file, as this runs slowly.

- Line 1 enumerates the real (non-empty) files, printing the size of each.
- Line 2 sorts the sizes numerically, in descending order.
- Line 3 strips out the sizes that only appear once.
- For each remaining size, line 4 finds all the files of that size.
- Line 5 computes the MD5 hash of every file found in line 4, outputting the hash and file name. (This is repeated for each set of files of a given size.)
- Line 6 sorts that list for easy comparison.
- Line 7 compares the first 32 characters of each line (the MD5 hash) to find duplicates.
- Line 8 prints the file name and path portion of the matching lines.
Some alternative approaches can be found at the original source.
Launch an HTTP server serving the current directory using Python
The Python `SimpleHTTPServer` module makes it easy to launch a simple web server using the current working directory as the "docroot".
With Python 2:
python -m SimpleHTTPServer
or with Python 3:
python3 -m http.server
By default, each will bind to port 8000, hence http://localhost:8000/ will serve the top level of the working directory tree. Hit `Ctrl-c` to stop.
Both accept an optional port number:
python -m SimpleHTTPServer 3001
or
python3 -m http.server 3001
if you want to bind to something other than port 8000.
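The Python 3 server can also be bound to a specific interface via `--bind` (available in Python 3.4+); for example, to listen only on localhost:

python3 -m http.server 3001 --bind 127.0.0.1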
Backup an entire GitHub repository
The following shell script will back up an organization's GitHub repositories, including all branches of the source tree as well as the GitHub wiki and issue list (if any).
#!/bin/bash
# A simple script to backup an organization's GitHub repositories.
#-------------------------------------------------------------------------------
# NOTES:
#-------------------------------------------------------------------------------
# * Under the heading "CONFIG" below you'll find a number of configuration
#   parameters that must be personalized for your GitHub account and org.
#   Replace the `<CHANGE-ME>` strings with the value described in the comments
#   (or overwrite those values at run-time by providing environment variables).
#
# * If you have more than 100 repositories, you'll need to step thru the list
#   of repos returned by GitHub one page at a time, as described at
#   https://gist.github.com/darktim/5582423
#
# * If you want to back up the repos for a USER rather than an ORGANIZATION,
#   there's a small change needed. See the comment on the `REPOLIST` definition
#   below (i.e. search for "REPOLIST" and make the described change).
#
# * Thanks to @Calrion, @vnaum, @BartHaagdorens and other commenters below for
#   various fixes and updates.
#
# * Also see those comments (and related revisions and forks) for more
#   information and general troubleshooting.
#-------------------------------------------------------------------------------

#-------------------------------------------------------------------------------
# CONFIG:
#-------------------------------------------------------------------------------
GHBU_ORG=${GHBU_ORG-"<CHANGE-ME>"}                    # the GitHub organization whose repos will be backed up
                                                      # (if you're backing up a USER's repos, this should be your GitHub username; also see the note below about the `REPOLIST` definition)
GHBU_UNAME=${GHBU_UNAME-"<CHANGE-ME>"}                # the username of a GitHub account (to use with the GitHub API)
GHBU_PASSWD=${GHBU_PASSWD-"<CHANGE-ME>"}              # the password for that account
#-------------------------------------------------------------------------------
GHBU_BACKUP_DIR=${GHBU_BACKUP_DIR-"github-backups"}   # where to place the backup files
GHBU_GITHOST=${GHBU_GITHOST-"github.com"}             # the GitHub hostname (see comments)
GHBU_PRUNE_OLD=${GHBU_PRUNE_OLD-true}                 # when `true`, old backups will be deleted
GHBU_PRUNE_AFTER_N_DAYS=${GHBU_PRUNE_AFTER_N_DAYS-3}  # the min age (in days) of backup files to delete
GHBU_SILENT=${GHBU_SILENT-false}                      # when `true`, only show error messages
GHBU_API=${GHBU_API-"https://api.github.com"}         # base URI for the GitHub API
GHBU_GIT_CLONE_CMD="git clone --quiet --mirror git@${GHBU_GITHOST}:" # base command to use to clone GitHub repos
TSTAMP=`date "+%Y%m%d-%H%M"`                          # format of timestamp suffix appended to archived files
#-------------------------------------------------------------------------------
# (end config)
#-------------------------------------------------------------------------------

# The function `check` will exit the script if the given command fails.
function check {
  "$@"
  status=$?
  if [ $status -ne 0 ]; then
    echo "ERROR: Encountered error (${status}) while running the following:" >&2
    echo "           $@" >&2
    echo "       (at line ${BASH_LINENO[0]} of file $0.)" >&2
    echo "       Aborting." >&2
    exit $status
  fi
}

# The function `tgz` will create a gzipped tar archive of the specified file ($1) and then remove the original
function tgz {
  check tar zcf $1.tar.gz $1 && check rm -rf $1
}

$GHBU_SILENT || (echo "" && echo "=== INITIALIZING ===" && echo "")
$GHBU_SILENT || echo "Using backup directory $GHBU_BACKUP_DIR"
check mkdir -p $GHBU_BACKUP_DIR

$GHBU_SILENT || echo -n "Fetching list of repositories for ${GHBU_ORG}..."
REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?per_page=100 -q | check grep "^    \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'` # hat tip to https://gist.github.com/rodw/3073987#gistcomment-3217943 for the license name workaround
# NOTE: if you're backing up a *user's* repos, not an organization's, use this instead:
# REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/user/repos -q | check grep "^    \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`
$GHBU_SILENT || echo "found `echo $REPOLIST | wc -w` repositories."

$GHBU_SILENT || (echo "" && echo "=== BACKING UP ===" && echo "")
for REPO in $REPOLIST; do
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}"
  check ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}.wiki (if any)"
  ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.wiki.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git 2>/dev/null && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO} issues"
  check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/repos/${GHBU_ORG}/${REPO}/issues -q > ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP} && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP}
done

if $GHBU_PRUNE_OLD; then
  $GHBU_SILENT || (echo "" && echo "=== PRUNING ===" && echo "")
  $GHBU_SILENT || echo "Pruning backup files ${GHBU_PRUNE_AFTER_N_DAYS} days old or older."
  $GHBU_SILENT || echo "Found `find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS | wc -l` files to prune."
  find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS -exec rm -fv {} > /dev/null \;
fi

$GHBU_SILENT || (echo "" && echo "=== DONE ===" && echo "")
$GHBU_SILENT || (echo "GitHub backup completed." && echo "")
(Also at rodw/backup-github.sh.)
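Since each `GHBU_*` parameter can be overridden from the environment, a one-off run might look like this (all values are placeholders):

GHBU_ORG=my-org GHBU_UNAME=my-username GHBU_PASSWD=my-api-token \
GHBU_BACKUP_DIR=/var/backups/github ./backup-github.sh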
A Cheat Sheet for SQLite
General

- Most of the SQLite "meta" commands begin with a dot. When in doubt, try `.help`.
- Use `Ctrl-d` or `.exit` or `.quit` to exit (and `Ctrl-c` to terminate a long-running SQL query).
- Enter `.show` to see current settings.
Meta-data

- Enter `.databases` to see a list of mounted databases.
- Enter `.tables` to see a list of table names.
- Enter `.index` to see a list of index names.
- Enter `.schema TABLENAME` to see the create table statement for a given table.
Import and Export

- Enter `.output FILENAME` to pipe output to the specified file. (Use `.output stdout` to return to the default behavior of printing results to the console.)
- Enter `.mode [csv|column|html|insert|line|list|tabs|tcl]` to change the way in which query results are printed.
- Enter `.separator DELIM` to change the delimiter used in (`list`-mode) exports and imports. (Defaults to `|`.)
- Enter `.dump [TABLEPATTERN]` to create a collection of SQL statements for recreating the database (or just those tables with names matching the optional TABLEPATTERN).
- Enter `.read FILENAME` to execute the specified file as a SQL script.
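The same dot-commands work non-interactively, since `sqlite3` reads them from stdin; a minimal sketch that exports a hypothetical `users` table to CSV (database and table names are placeholders):

sqlite3 mydb.sqlite <<'EOF'
.mode csv
.output users.csv
SELECT * FROM users;
.output stdout
EOF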