13 ops snippets

gracefully closing node.js applications via signal handling

To make your node.js application gracefully respond to shutdown signals, use process.on(SIGNAL, HANDLER).

For example, to respond to SIGINT (typically Ctrl-c), you can use:

process.on('SIGINT', function() {
  console.log('CLOSING [SIGINT]');
  process.exit();
});

Note that without the process.exit(), the program will not shut down. (This is your chance to override or "trap" the signal.)

Some common examples (in CoffeeScript):

process.on 'SIGHUP',  ()->console.log('CLOSING [SIGHUP]');  process.exit()
process.on 'SIGINT',  ()->console.log('CLOSING [SIGINT]');  process.exit()
process.on 'SIGQUIT', ()->console.log('CLOSING [SIGQUIT]'); process.exit()
process.on 'SIGABRT', ()->console.log('CLOSING [SIGABRT]'); process.exit()
process.on 'SIGTERM', ()->console.log('CLOSING [SIGTERM]'); process.exit()

PS: On Linux (and similar systems) you can run kill -l on the command line to see a list of available signals, and kill -N PID to send signal number N to the process with process ID PID.
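To exercise a handler like the ones above from the shell, start the process and signal it by PID. A minimal sketch, using `sleep` as a stand-in for a Node.js app (a process that does not trap SIGTERM exits with status 128 + 15 = 143):

```shell
# Launch a stand-in for a long-running app in the background:
sleep 60 &
PID=$!
# Send it SIGTERM (equivalent to `kill -15 $PID`); use `kill -INT` for Ctrl-c's SIGINT:
kill -TERM "$PID"
wait "$PID" || STATUS=$?
echo "exit status: ${STATUS}"   # 143 = 128 + 15 (SIGTERM), since sleep does not trap it
```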

Published 8 Jan 2013

 

Command-line tool for spidering sites and extracting XML/HTML content

Xidel is a robust tool for spidering, extracting and transforming XML/HTML content from the command line.

It's like wget or curl with a CSS and XPath/XQuery engine (among other features) attached.

xidel doesn't seem to be in the package management repositories I normally use, but you can download it here.

The following example will (1) download a web page, (2) extract a list of links (specified via CSS selector) from it, (3) download the page corresponding to each of those links and finally (4) extract specific pieces of content (specified by CSS selectors) from each page:

xidel [URL-OF-INDEX-PAGE] \
  --follow "css('[CSS-SELECTOR-FOR-LINKS]')" \
  --css "[CSS-SELECTOR-FOR-SOME-TEXT]" \
  --extract "inner-html(css('[CSS-SELECTOR-FOR-SOME-HTML]'))"

As a concrete example, the command:

$ xidel http://reddit.com -f  "css('a')" --css title

will download every page linked from the reddit.com homepage and print the content of each page's title tag.

There are several more examples on the Xidel site.

Published 11 Feb 2014
Tagged linux, tool, xml, css, html, xpath, one-liner and ops.

 

Shell script for service-like CoffeeScript/Node.js apps using forever

This is an example of a (bash) shell script that uses the forever module to start and stop a CoffeeScript application as if it were a service.

#!/bin/bash
# This is an example of a (bash) shell script that uses the forever module ([1])
# to start and stop a CoffeeScript application as if it were a service.
#
# [1] <https://github.com/nodejitsu/forever>

# ASSUMPTIONS
################################################################################
# 1) You've got a CoffeeScript program you want to run via `forever`.
#    To use plain Node.js/SSJS, remove the bits about `COFFEE_EXE`
#    and change the `forever` command within the `start()` routine.
#
# 2) You've got a configuration file at `config/NODE_ENV.json` (where
#    `NODE_ENV` is the value of the corresponding environment variable).
#    If you don't care about this, remove the bits about checking
#    for `NODE_ENV` and `config/NODE_ENV.json`.
#
# 3) `coffee` is already in your path or lives at `./node_modules/.bin`.
#
# 4) `forever` is already in your path or lives at `./node_modules/.bin`.
#
# 5) The CoffeeScript file you want to run is located at
#    `./lib/APP-NAME.coffee`, where `APP-NAME` is the name of this file.

# CONFIGURATION
################################################################################
APP="lib/${0}"
CONFIG_DIR="./config"
LOGFILE="forever-`basename $0`.log"
OUTFILE="forever-`basename $0`.out"
ERRFILE="forever-`basename $0`.err"

## DISCOVER COFFEE EXE
if command -v coffee >/dev/null 2>&1; then
  COFFEE_EXE="coffee"
else
  COFFEE_EXE="./node_modules/.bin/coffee"
fi

## DISCOVER FOREVER EXE
if command -v forever >/dev/null 2>&1; then
  FOREVER_EXE="forever"
else
  FOREVER_EXE="./node_modules/.bin/forever"
fi

# ROUTINES
################################################################################
usage() {
  echo "Usage: `basename $0` {start|stop|restart|status|checkenv}" >&2;
}

start() {
  # check for NODE_ENV before launching (but launch anyway)
  if [[ -z "${NODE_ENV}" ]]; then
    echo -e "\n!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  ${FOREVER_EXE} start -a -l ${LOGFILE} -o ${OUTFILE} -e ${ERRFILE} -c ${COFFEE_EXE} ${APP};
}

stop() { ${FOREVER_EXE} stop ${APP}; }

status() { ${FOREVER_EXE} list; }

checkenv() {
  STATUS=0
  echo -e "\nChecking prerequisites.\n"
  # check for NODE_ENV
  if [[ ! -z "${NODE_ENV}" ]]; then
    echo -e "NODE_ENV: SET - ${NODE_ENV}\n"
  else
    echo -e "NODE_ENV: NOT SET\n"
    echo -e "!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  # check for config/NODE_ENV.json
  if [[ -e "${CONFIG_DIR}/${NODE_ENV}.json" ]]; then
    echo -e " CONFIG: FOUND - ${CONFIG_DIR}/${NODE_ENV}.json\n"
  else
    echo -e " CONFIG: NOT FOUND - ${CONFIG_DIR}/${NODE_ENV}.json"
    echo -e "!WARNING! You probably want to ensure that the file ${CONFIG_DIR}/[NODE_ENV].json exists.\n"
    STATUS=3
  fi
  # check for coffee
  if command -v ${COFFEE_EXE} >/dev/null 2>&1; then
    echo -e " COFFEE: FOUND - ${COFFEE_EXE}\n"
  else
    echo " COFFEE: NOT FOUND - ${COFFEE_EXE}"
    echo -e "!WARNING! The coffee executable could not be found. Is it in your PATH?\n"
    STATUS=4
  fi
  # check for forever
  if command -v ${FOREVER_EXE} >/dev/null 2>&1; then
    echo -e " FOREVER: FOUND - ${FOREVER_EXE}\n"
  else
    echo " FOREVER: NOT FOUND - ${FOREVER_EXE}"
    echo -e "!WARNING! The forever executable could not be found. Is it in your PATH?\n"
    STATUS=5
  fi
  # report status
  if [ $STATUS -ne 0 ]; then
    echo -e "!WARNING! Required files or programs not found.\n The application may not work properly.\n"
  else
    echo -e "Everything seems to check out OK.\n"
  fi
  exit $STATUS
}

# MAIN LOOP
################################################################################
if [[ -z "${1}" ]]; then
  usage
  exit 1
else
  case "$1" in
    start)
      start;
      ;;
    restart)
      stop; sleep 1; start;
      ;;
    stop)
      stop
      ;;
    status)
      status
      ;;
    checkenv)
      checkenv $1
      ;;
    *)
      usage
      exit 6
      ;;
  esac
  exit 0
fi
################################################################################
# (eof)

(Also at rodw/coffee-as-a-service-via-forever.sh.)
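If the script above is saved as, say, `myapp.sh` next to a `lib/myapp.coffee` (both names hypothetical, per assumption 5), a typical session might look like:

```shell
NODE_ENV=production ./myapp.sh checkenv   # sanity-check NODE_ENV, config, coffee and forever
NODE_ENV=production ./myapp.sh start      # launch lib/myapp.coffee under forever
./myapp.sh status                         # wraps `forever list`
./myapp.sh stop
```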

Published 11 Feb 2014
Tagged nodejs, linux, service and ops.

 

Backup or mirror a website using wget

To create a local mirror or backup of a website with wget, run:

wget -r -l 5 -k -w 1 --random-wait <URL>

Where:

  • -r (or --recursive) will cause wget to recursively download files
  • -l N (or --level=N) will limit recursion to at most N levels below the root document (defaults to 5, use inf for infinite recursion)
  • -k (or --convert-links) will cause wget to convert links in the downloaded documents so that the files can be viewed locally
  • -w N (or --wait=N) will cause wget to wait N seconds between requests
  • --random-wait will cause wget to randomly vary the wait time from 0.5x to 1.5x the value specified by --wait

Some additional notes:

  • --mirror (or -m) can be used as a shortcut for -r -N -l inf --no-remove-listing which enables infinite recursion and preserves both the server timestamps and FTP directory listings.
  • -np (--no-parent) can be used to limit wget to files below a specific "directory" (path).
Published 10 Feb 2014

 

Pre-generate pages or load a web cache using wget

Many web frameworks and template engines defer generating the HTML version of a document until the first time it is accessed. This can make the first hit on a given page significantly slower than subsequent hits.

You can use wget to pre-cache web pages using a command such as:

wget -r -l 3 -nd --delete-after <URL>

Where:

  • -r (or --recursive) will cause wget to recursively download files
  • -l N (or --level=N) will limit recursion to at most N levels below the root document (defaults to 5, use inf for infinite recursion)
  • -nd (or --no-directories) will prevent wget from creating local directories to match the server-side paths
  • --delete-after will cause wget to delete each file as soon as it is downloaded (so the command leaves no traces behind).
Published 10 Feb 2014

 

Mapping port 80 to port 3000 using iptables

Port numbers less than 1024 are considered "privileged" ports, and you generally must be root to bind a listener to them.

Rather than running a network application as root, map the privileged port to a non-privileged one:

sudo iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3000

Now requests to port 80 will be forwarded on to port 3000.
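The redirect can be inspected or removed later. These are system configuration commands (they require root, and the `eth0` interface name is carried over from the rule above):

```shell
# List the NAT PREROUTING rules, with rule numbers:
sudo iptables -t nat -L PREROUTING --line-numbers
# Delete the redirect by repeating the rule spec with -D in place of -A:
sudo iptables -t nat -D PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3000
```

Note that PREROUTING rules do not see loopback traffic, so requests to localhost:80 from the same machine will not be redirected; test against the machine's external address.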

Published 8 Feb 2014

 

Cheat Sheet for Linux Run Levels

"Standard" Linux uses the following run levels:

  • Run Level 0 is halt (shutdown).
  • Run Level 1 is single-user mode.
  • Run Level 2 is multi-user mode (without networking)
  • Run Level 3 is multi-user mode (with networking). This is the normal "terminal" mode. (I.e., before the display manager is run).
  • Run Level 4 is undefined.
  • Run Level 5 is multi-user mode with a GUI display manager (X11).
  • Run Level 6 is reboot.

In Debian and its derivatives, run levels 2 through 5 are identical: multi-user mode with networking, with a display manager if available.

  • Run Level 0 is halt (shutdown).
  • Run Level 1 is single-user mode.
  • Run Levels 2-5 are multi-user mode with networking and a GUI display manager when available.
  • Run Level 6 is reboot.

Debian also adds Run Level S, which is executed when the system first boots.

Also see Wikipedia's article on run levels.
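To see or change the run level from a shell on a SysV-style system (a sketch; on systemd-only machines these commands are emulated or absent):

```shell
runlevel           # prints previous and current run level, e.g. "N 2"
who -r             # alternative: shows the current run level and when it was entered
# sudo telinit 6   # request a run-level change (here: reboot); requires root
```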


 

How to disable services in Debian/Linux

The easy way is to install sysv-rc-conf:

aptitude install sysv-rc-conf
sysv-rc-conf

Manually, use update-rc.d and specify the run levels, like so:

update-rc.d SERVICE_NAME stop 0 1 6 3 . start 2 4 5 .
Tagged linux, debian, service and ops.

 

Find duplicate files on Linux (or OSX).

Find files that have the same size and MD5 hash (and hence are likely to be exact duplicates):

find -not -empty -type f -printf "%s\n" | \         # line 1
  sort -rn | \                                      # line 2
  uniq -d | \                                       # line 3
  xargs -I{} -n1 find -type f -size {}c -print0 | \ # line 4
  xargs -0 md5sum | \                               # line 5
  sort | \                                          # line 6
  uniq -w32 --all-repeated=separate | \             # line 7
  cut -d" " -f3-                                    # line 8

You will probably want to pipe the output to a file, since this can take a while to run.

  1. Line 1 enumerates the non-empty regular files, printing the size of each.
  2. Line 2 sorts the sizes numerically, in descending order.
  3. Line 3 strips out the lines (sizes) that only appear once.
  4. For each remaining size, line 4 finds all the files of that size.
  5. Line 5 computes the MD5 hash for all the files found in line 4, outputting the MD5 hash and file name. (This is repeated for each set of files of a given size.)
  6. Line 6 sorts that list for easy comparison.
  7. Line 7 compares the first 32 characters of each line (the MD5 hash) to find duplicates.
  8. Line 8 spits out the file name and path part of the matching lines.

Some alternative approaches can be found at the original source.

Tagged linux, one-liner and ops.

 

Launch an HTTP server serving the current directory using Python

The Python SimpleHTTPServer module makes it easy to launch a simple web server using the current working directory as the "docroot".

With Python 2:

python -m SimpleHTTPServer

or with Python 3:

python3 -m http.server

By default, each will bind to port 8000, hence http://localhost:8000/ will serve the top level of the working directory tree. Hit Ctrl-c to stop.

Both accept an optional port number:

python -m SimpleHTTPServer 3001

or

python3 -m http.server 3001

if you want to bind to something other than port 8000.

Published 20 Feb 2014
Tagged python, http, cli, one-liner, ops and tool.

 

Backup an entire GitHub repository

The following shell script will back up an organization's GitHub repositories, including all branches of the source tree and the GitHub wiki and issue list (if any).

#!/bin/bash
# A simple script to backup an organization's GitHub repositories.
#-------------------------------------------------------------------------------
# NOTES:
#-------------------------------------------------------------------------------
# * Under the heading "CONFIG" below you'll find a number of configuration
# parameters that must be personalized for your GitHub account and org.
# Replace the `<CHANGE-ME>` strings with the value described in the comments
# (or overwrite those values at run-time by providing environment variables).
#
# * If you have more than 100 repositories, you'll need to step thru the list
# of repos returned by GitHub one page at a time, as described at
# https://gist.github.com/darktim/5582423
#
# * If you want to back up the repos for a USER rather than an ORGANIZATION,
# there's a small change needed. See the comment on the `REPOLIST` definition
# below (i.e search for "REPOLIST" and make the described change).
#
# * Thanks to @Calrion, @vnaum, @BartHaagdorens and other commenters below for
# various fixes and updates.
#
# * Also see those comments (and related revisions and forks) for more
# information and general troubleshooting.
#-------------------------------------------------------------------------------
#-------------------------------------------------------------------------------
# CONFIG:
#-------------------------------------------------------------------------------
GHBU_ORG=${GHBU_ORG-"<CHANGE-ME>"} # the GitHub organization whose repos will be backed up
# # (if you're backing up a USER's repos, this should be your GitHub username; also see the note below about the `REPOLIST` definition)
GHBU_UNAME=${GHBU_UNAME-"<CHANGE-ME>"} # the username of a GitHub account (to use with the GitHub API)
GHBU_PASSWD=${GHBU_PASSWD-"<CHANGE-ME>"} # the password for that account
#-------------------------------------------------------------------------------
GHBU_BACKUP_DIR=${GHBU_BACKUP_DIR-"github-backups"} # where to place the backup files
GHBU_GITHOST=${GHBU_GITHOST-"github.com"} # the GitHub hostname (see comments)
GHBU_PRUNE_OLD=${GHBU_PRUNE_OLD-true} # when `true`, old backups will be deleted
GHBU_PRUNE_AFTER_N_DAYS=${GHBU_PRUNE_AFTER_N_DAYS-3} # the min age (in days) of backup files to delete
GHBU_SILENT=${GHBU_SILENT-false} # when `true`, only show error messages
GHBU_API=${GHBU_API-"https://api.github.com"} # base URI for the GitHub API
GHBU_GIT_CLONE_CMD="git clone --quiet --mirror git@${GHBU_GITHOST}:" # base command to use to clone GitHub repos
TSTAMP=`date "+%Y%m%d-%H%M"` # format of timestamp suffix appended to archived files
#-------------------------------------------------------------------------------
# (end config)
#-------------------------------------------------------------------------------
# The function `check` will exit the script if the given command fails.
function check {
  "$@"
  status=$?
  if [ $status -ne 0 ]; then
    echo "ERROR: Encountered error (${status}) while running the following:" >&2
    echo " $@" >&2
    echo " (at line ${BASH_LINENO[0]} of file $0.)" >&2
    echo " Aborting." >&2
    exit $status
  fi
}

# The function `tgz` will create a gzipped tar archive of the specified file ($1) and then remove the original
function tgz {
  check tar zcf $1.tar.gz $1 && check rm -rf $1
}
$GHBU_SILENT || (echo "" && echo "=== INITIALIZING ===" && echo "")
$GHBU_SILENT || echo "Using backup directory $GHBU_BACKUP_DIR"
check mkdir -p $GHBU_BACKUP_DIR
$GHBU_SILENT || echo -n "Fetching list of repositories for ${GHBU_ORG}..."
REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?per_page=100 -q | check grep "^ \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'` # hat tip to https://gist.github.com/rodw/3073987#gistcomment-3217943 for the license name workaround
# NOTE: if you're backing up a *user's* repos, not an organizations, use this instead:
# REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/user/repos -q | check grep "^ \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`
$GHBU_SILENT || echo "found `echo $REPOLIST | wc -w` repositories."
$GHBU_SILENT || (echo "" && echo "=== BACKING UP ===" && echo "")
for REPO in $REPOLIST; do
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}"
  check ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}.wiki (if any)"
  ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.wiki.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git 2>/dev/null && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO} issues"
  check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/repos/${GHBU_ORG}/${REPO}/issues -q > ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP} && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP}
done

if $GHBU_PRUNE_OLD; then
  $GHBU_SILENT || (echo "" && echo "=== PRUNING ===" && echo "")
  $GHBU_SILENT || echo "Pruning backup files ${GHBU_PRUNE_AFTER_N_DAYS} days old or older."
  $GHBU_SILENT || echo "Found `find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS | wc -l` files to prune."
  find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS -exec rm -fv {} > /dev/null \;
fi
$GHBU_SILENT || (echo "" && echo "=== DONE ===" && echo "")
$GHBU_SILENT || (echo "GitHub backup completed." && echo "")

(Also at rodw/backup-github.sh.)
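As the CONFIG comments note, the `<CHANGE-ME>` values can also be supplied at run time as environment variables rather than by editing the script, e.g. (with hypothetical org and account names):

```shell
GHBU_ORG=my-org \
GHBU_UNAME=my-username \
GHBU_PASSWD=my-api-token \
GHBU_BACKUP_DIR=/var/backups/github \
  ./backup-github.sh
```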

Published 1 Jan 2014
Tagged git, backup and ops.

 

A Cheat Sheet for SQLite

General

  • Most of the SQLite "meta" commands begin with a dot. When in doubt, try .help
  • Use Ctrl-d or .exit or .quit to exit (and Ctrl-c to terminate a long-running SQL query).
  • Enter .show to see current settings.

Meta-data

  • Enter .databases to see a list of mounted databases.
  • Enter .tables to see a list of table names.
  • Enter .index to see a list of index names.
  • Enter .schema TABLENAME to see the create table statement for a given table.

Import and Export

  • Enter .output FILENAME to pipe output to the specified file. (Use .output stdout to return to the default behavior of printing results to the console.)
  • Enter .mode [csv|column|html|insert|line|list|tabs|tcl] to change the way in which query results are printed.
  • Enter .separator DELIM to change the delimiter used in (list-mode) exports and imports. (Defaults to |.)
  • Enter .dump [TABLEPATTERN] to create a collection of SQL statements for recreating the database (or just those tables with names matching the optional TABLEPATTERN).
  • Enter .read FILENAME to execute the specified file as a SQL script.
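For example, the .mode and .output commands above combine into a one-shot CSV export from the shell (the database and table names here are hypothetical):

```shell
sqlite3 mydb.sqlite <<'EOF'
.mode csv
.output users.csv
SELECT * FROM users;
.output stdout
EOF
```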
Published 18 Sep 2013

 

backup a git repository with 'git bundle'

Run:

cd REPOSITORY_WORKING_DIRECTORY
git bundle create PATH_TO_BUNDLE.git --all

to create a single-file backup of the entire repository.

Note that the bundle file is a functional Git repository:

git clone PATH_TO_BUNDLE.git MY_PROJECT
Tagged git, backup, one-liner, ops and tool.

 

This page was generated at 4:16 PM on 26 Feb 2018.
Copyright © 1999 - 2018 Rodney Waldhoff.