13 ops snippets
Gracefully closing Node.js applications via signal handling
To make your Node.js application gracefully respond to shutdown signals, use `process.on(SIGNAL, HANDLER)`. For example, to respond to `SIGINT` (typically Ctrl-c), you can use:
process.on( "SIGINT", function() {
console.log('CLOSING [SIGINT]');
process.exit();
} );
Note that without the `process.exit()`, the program will not shut down. (This is your chance to override or "trap" the signal.)
Some common examples (in CoffeeScript):
process.on 'SIGHUP', ()->console.log('CLOSING [SIGHUP]'); process.exit()
process.on 'SIGINT', ()->console.log('CLOSING [SIGINT]'); process.exit()
process.on 'SIGQUIT', ()->console.log('CLOSING [SIGQUIT]'); process.exit()
process.on 'SIGABRT', ()->console.log('CLOSING [SIGABRT]'); process.exit()
process.on 'SIGTERM', ()->console.log('CLOSING [SIGTERM]'); process.exit()
PS: On Linux (and similar) you can enter `kill -l` on the command line to see a list of possible signals, and `kill -N PID` to send signal N to the process with process ID PID.
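For example (assuming a hypothetical process ID of 12345), you could exercise the handlers above from another terminal:

kill -l              # list the available signals
kill -SIGINT 12345   # send SIGINT to process 12345, just like Ctrl-c
kill -SIGTERM 12345  # ask the process to terminate gracefully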
Command-line tool for spidering sites and extracting XML/HTML content
Xidel is a robust tool for spidering, extracting, and transforming XML/HTML content from the command line. It's like `wget` or `curl` with a CSS and XPath/XQuery engine (among other features) attached.

`xidel` doesn't seem to be in the package management repositories I normally use, but you can download it from the Xidel site.
The following example will (1) download a web page, (2) extract a list of links (specified via CSS selector) from it, (3) download the page corresponding to each of those links and finally (4) extract specific pieces of content (specified by CSS selectors) from each page:
xidel [URL-OF-INDEX-PAGE] \
--follow "css('[CSS-SELECTOR-FOR-LINKS]')" \
--css "[CSS-SELECTOR-FOR-SOME-TEXT]" \
--extract "inner-html(css('[CSS-SELECTOR-FOR-SOME-HTML]'))"
As a concrete example, the command:

$ xidel http://reddit.com -f "css('a')" --css title

will download every page linked from the reddit.com homepage and print the content of its `title` tag.
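If you prefer XPath to CSS selectors, xidel accepts raw XPath expressions as well; a minimal sketch (placeholder URL):

# print the href attribute of every anchor on the page
xidel https://example.com --extract "//a/@href"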
There are several more examples on the Xidel site.
Shell script for service-like CoffeeScript/Node.js apps using forever
This is an example of a (bash) shell script that uses the forever module to start and stop a CoffeeScript application as if it were a service.
#!/bin/bash
# This is an example of a (bash) shell script that uses the forever module ([1])
# to start and stop a CoffeeScript application as if it were a service.
#
# [1] <https://github.com/nodejitsu/forever>

# ASSUMPTIONS
################################################################################
# 1) You've got a CoffeeScript program you want to run via `forever`.
#    To use plain Node.js/SSJS, remove the bits about `COFFEE_EXE`
#    and change the `forever` command within the `start()` routine.
#
# 2) You've got a configuration file at `config/NODE_ENV.json` (where
#    `NODE_ENV` is the value of the corresponding environment variable).
#    If you don't care about this, remove the bits about checking
#    for `NODE_ENV` and `config/NODE_ENV.json`.
#
# 3) `coffee` is already in your path or lives at `./node_modules/.bin`.
#
# 4) `forever` is already in your path or lives at `./node_modules/.bin`.
#
# 5) The CoffeeScript file you want to run is located at
#    `./lib/APP-NAME.coffee`, where `APP-NAME` is the name of this file.

# CONFIGURATION
################################################################################
APP="lib/`basename $0`.coffee"   # per assumption 5 above
CONFIG_DIR="./config"
LOGFILE="forever-`basename $0`.log"
OUTFILE="forever-`basename $0`.out"
ERRFILE="forever-`basename $0`.err"

## DISCOVER COFFEE EXE
if command -v coffee >/dev/null 2>&1; then
  COFFEE_EXE="coffee"
else
  COFFEE_EXE="./node_modules/.bin/coffee"
fi

## DISCOVER FOREVER EXE
if command -v forever >/dev/null 2>&1; then
  FOREVER_EXE="forever"
else
  FOREVER_EXE="./node_modules/.bin/forever"
fi

# ROUTINES
################################################################################
usage() {
  echo "Usage: `basename $0` {start|stop|restart|status|checkenv}" >&2;
}

start() {
  # check for NODE_ENV before launching (but launch anyway)
  if [[ -z "${NODE_ENV}" ]]; then
    echo -e "\n!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  ${FOREVER_EXE} start -a -l ${LOGFILE} -o ${OUTFILE} -e ${ERRFILE} -c ${COFFEE_EXE} ${APP};
}

stop() { ${FOREVER_EXE} stop ${APP}; }

status() { ${FOREVER_EXE} list; }

checkenv() {
  STATUS=0
  echo -e "\nChecking prerequisites.\n"
  # check for NODE_ENV
  if [[ ! -z "${NODE_ENV}" ]]; then
    echo -e "NODE_ENV: SET - ${NODE_ENV}\n"
  else
    echo -e "NODE_ENV: NOT SET\n"
    echo -e "!WARNING! You probably want to set the NODE_ENV environment variable.\n"
  fi
  # check for config/NODE_ENV.json
  if [[ -e "${CONFIG_DIR}/${NODE_ENV}.json" ]]; then
    echo -e "  CONFIG: FOUND - ${CONFIG_DIR}/${NODE_ENV}.json\n"
  else
    echo -e "  CONFIG: NOT FOUND - ${CONFIG_DIR}/${NODE_ENV}.json"
    echo -e "!WARNING! You probably want to ensure that the file ${CONFIG_DIR}/[NODE_ENV].json exists.\n"
    STATUS=3
  fi
  # check for coffee
  if command -v ${COFFEE_EXE} >/dev/null 2>&1; then
    echo -e "  COFFEE: FOUND - ${COFFEE_EXE}\n"
  else
    echo "  COFFEE: NOT FOUND - ${COFFEE_EXE}"
    echo -e "!WARNING! The coffee executable could not be found. Is it in your PATH?\n"
    STATUS=4
  fi
  # check for forever
  if command -v ${FOREVER_EXE} >/dev/null 2>&1; then
    echo -e " FOREVER: FOUND - ${FOREVER_EXE}\n"
  else
    echo " FOREVER: NOT FOUND - ${FOREVER_EXE}"
    echo -e "!WARNING! The forever executable could not be found. Is it in your PATH?\n"
    STATUS=5
  fi
  # report status
  if [ $STATUS -ne 0 ]; then
    echo -e "!WARNING! Required files or programs not found.\n          The application may not work properly.\n"
  else
    echo -e "Everything seems to check out OK.\n"
  fi
  exit $STATUS
}

# MAIN LOOP
################################################################################
if [[ -z "${1}" ]]; then
  usage
  exit 1
else
  case "$1" in
    start)
      start;
      ;;
    restart)
      stop; sleep 1; start;
      ;;
    stop)
      stop
      ;;
    status)
      status
      ;;
    checkenv)
      checkenv $1
      ;;
    *)
      usage
      exit 6
      ;;
  esac
  exit 0
fi
################################################################################
# (eof)
(Also at rodw/coffee-as-a-service-via-forever.sh.)
Backup or mirror a website using wget
To create a local mirror or backup of a website with `wget`, run:
wget -r -l 5 -k -w 1 --random-wait <URL>
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-k` (or `--convert-links`) will cause `wget` to convert links in the downloaded documents so that the files can be viewed locally
- `-w N` (or `--wait=N`) will cause `wget` to wait N seconds between requests
- `--random-wait` will cause `wget` to randomly vary the wait time from `0.5x` to `1.5x` the value specified by `--wait`
Some additional notes:

- `--mirror` (or `-m`) can be used as a shortcut for `-r -N -l inf --no-remove-listing`, which enables infinite recursion and preserves both the server timestamps and FTP directory listings.
- `-np` (or `--no-parent`) can be used to limit `wget` to files below a specific "directory" (path).
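Combining these, a sketch of a polite mirror limited to a single subtree (the URL is a placeholder):

wget -m -np -k -w 1 --random-wait https://example.com/docs/

Here `--mirror` turns on infinite recursion and timestamping, `--no-parent` keeps the crawl below `/docs/`, and `-k` rewrites links for local browsing.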
Pre-generate pages or load a web cache using wget
Many web frameworks and template engines will defer generating the HTML version of a document until the first time it is accessed. This can make the first hit on a given page significantly slower than subsequent hits.
You can use `wget` to pre-cache web pages using a command such as:
wget -r -l 3 -nd --delete-after <URL>
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-nd` (or `--no-directories`) will prevent `wget` from creating local directories to match the server-side paths
- `--delete-after` will cause `wget` to delete each file as soon as it is downloaded (so the command leaves no traces behind)
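If the goal is to keep a cache warm, a command like that can be scheduled via cron; a sketch (the URL and schedule are placeholders):

# m h dom mon dow   command
0 6 * * * wget -r -l 3 -nd --delete-after -q https://example.com/

The `-q` (`--quiet`) flag suppresses wget's progress output so cron doesn't mail it to you.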
Mapping port 80 to port 3000 using iptables
Port numbers less than 1024 are considered "privileged" ports, and you generally must be `root` to bind a listener to them.

Rather than running a network application as `root`, map the privileged port to a non-privileged one:
sudo iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3000
Now requests to port 80 will be forwarded on to port 3000.
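To inspect or remove the rule later, standard iptables bookkeeping applies (the rule number below is illustrative):

sudo iptables -t nat -L PREROUTING --line-numbers -n   # list NAT rules with their numbers
sudo iptables -t nat -D PREROUTING 1                   # delete the redirect by its rule number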
Cheat Sheet for Linux Run Levels
"Standard" Linux uses the following run levels:
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2 is multi-user mode (without networking).
- Run Level 3 is multi-user mode (with networking). This is the normal "terminal" mode (i.e., before the display manager is run).
- Run Level 4 is undefined.
- Run Level 5 is multi-user mode with a GUI display manager (X11).
- Run Level 6 is reboot.

In Debian and its derivatives, run levels 2 through 5 are the same: multi-user mode with networking, and with a display manager if available.
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2-5 is multi-user mode with networking and a GUI display manager when available.
- Run Level 6 is reboot.
Debian also adds Run Level S, which is executed when the system first boots.
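On a SysV-style system you can check the current run level and switch to another one from the shell, for example:

who -r            # report the current run level
sudo telinit 1    # switch to single-user mode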
Also see Wikipedia's article on run levels.
Find duplicate files on Linux (or OSX).
Find files that have the same size and MD5 hash (and hence are likely to be exact duplicates):
find . -not -empty -type f -printf "%s\n" |          # line 1
sort -rn |                                           # line 2
uniq -d |                                            # line 3
xargs -I{} -n1 find . -type f -size {}c -print0 |    # line 4
xargs -0 md5sum |                                    # line 5
sort |                                               # line 6
uniq -w32 --all-repeated=separate |                  # line 7
cut -d" " -f3-                                       # line 8
You probably want to pipe the output to a file, as this runs slowly.

- Line 1 enumerates the real (non-empty) files, printing the size of each.
- Line 2 sorts the sizes numerically, in descending order.
- Line 3 strips out the sizes that only appear once.
- For each remaining size, line 4 finds all the files of that size.
- Line 5 computes the MD5 hash of every file found in line 4, outputting the hash and file name. (This is repeated for each set of files of a given size.)
- Line 6 sorts that list for easy comparison.
- Line 7 compares the first 32 characters of each line (the MD5 hash) to find duplicates.
- Line 8 prints the file name and path portion of the matching lines.
Some alternative approaches can be found at the original source.
Launch an HTTP server serving the current directory using Python
The Python `SimpleHTTPServer` module makes it easy to launch a simple web server using the current working directory as the "docroot".
With Python 2:
python -m SimpleHTTPServer
or with Python 3:
python3 -m http.server
By default, each will bind to port 8000, hence http://localhost:8000/ will serve the top level of the working directory tree. Hit `Ctrl-c` to stop.
Both accept an optional port number:
python -m SimpleHTTPServer 3001
or
python3 -m http.server 3001
if you want to bind to something other than port 8000.
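The Python 3 server can also be bound to a specific interface via `--bind` (available in Python 3.4+); for example, to listen only on localhost:

python3 -m http.server 3001 --bind 127.0.0.1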
Backup an entire GitHub repository
The following shell script will back up an organization's GitHub repositories, including all branches of the source tree as well as the GitHub wiki and issue list (if any).
#!/bin/bash
# A simple script to backup an organization's GitHub repositories.
#-------------------------------------------------------------------------------
# NOTES:
#-------------------------------------------------------------------------------
# * Under the heading "CONFIG" below you'll find a number of configuration
#   parameters that must be personalized for your GitHub account and org.
#   Replace the `<CHANGE-ME>` strings with the value described in the comments
#   (or overwrite those values at run-time by providing environment variables).
#
# * If you have more than 100 repositories, you'll need to step thru the list
#   of repos returned by GitHub one page at a time, as described at
#   https://gist.github.com/darktim/5582423
#
# * If you want to back up the repos for a USER rather than an ORGANIZATION,
#   there's a small change needed. See the comment on the `REPOLIST` definition
#   below (i.e. search for "REPOLIST" and make the described change).
#
# * Thanks to @Calrion, @vnaum, @BartHaagdorens and other commenters below for
#   various fixes and updates.
#
# * Also see those comments (and related revisions and forks) for more
#   information and general troubleshooting.
#-------------------------------------------------------------------------------

#-------------------------------------------------------------------------------
# CONFIG:
#-------------------------------------------------------------------------------
GHBU_ORG=${GHBU_ORG-"<CHANGE-ME>"}                    # the GitHub organization whose repos will be backed up
                                                      # (if you're backing up a USER's repos, this should be your GitHub username; also see the note below about the `REPOLIST` definition)
GHBU_UNAME=${GHBU_UNAME-"<CHANGE-ME>"}                # the username of a GitHub account (to use with the GitHub API)
GHBU_PASSWD=${GHBU_PASSWD-"<CHANGE-ME>"}              # the password for that account
#-------------------------------------------------------------------------------
GHBU_BACKUP_DIR=${GHBU_BACKUP_DIR-"github-backups"}   # where to place the backup files
GHBU_GITHOST=${GHBU_GITHOST-"github.com"}             # the GitHub hostname (see comments)
GHBU_PRUNE_OLD=${GHBU_PRUNE_OLD-true}                 # when `true`, old backups will be deleted
GHBU_PRUNE_AFTER_N_DAYS=${GHBU_PRUNE_AFTER_N_DAYS-3}  # the min age (in days) of backup files to delete
GHBU_SILENT=${GHBU_SILENT-false}                      # when `true`, only show error messages
GHBU_API=${GHBU_API-"https://api.github.com"}         # base URI for the GitHub API
GHBU_GIT_CLONE_CMD="git clone --quiet --mirror git@${GHBU_GITHOST}:" # base command to use to clone GitHub repos
TSTAMP=`date "+%Y%m%d-%H%M"`                          # format of timestamp suffix appended to archived files
#-------------------------------------------------------------------------------
# (end config)
#-------------------------------------------------------------------------------

# The function `check` will exit the script if the given command fails.
function check {
  "$@"
  status=$?
  if [ $status -ne 0 ]; then
    echo "ERROR: Encountered error (${status}) while running the following:" >&2
    echo "           $@" >&2
    echo "       (at line ${BASH_LINENO[0]} of file $0.)" >&2
    echo "       Aborting." >&2
    exit $status
  fi
}

# The function `tgz` will create a gzipped tar archive of the specified file ($1) and then remove the original
function tgz {
  check tar zcf $1.tar.gz $1 && check rm -rf $1
}

$GHBU_SILENT || (echo "" && echo "=== INITIALIZING ===" && echo "")
$GHBU_SILENT || echo "Using backup directory $GHBU_BACKUP_DIR"
check mkdir -p $GHBU_BACKUP_DIR

$GHBU_SILENT || echo -n "Fetching list of repositories for ${GHBU_ORG}..."
REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?per_page=100 -q | check grep "^    \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'` # hat tip to https://gist.github.com/rodw/3073987#gistcomment-3217943 for the license name workaround
# NOTE: if you're backing up a *user's* repos, not an organization's, use this instead:
# REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/user/repos -q | check grep "^    \"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`
$GHBU_SILENT || echo "found `echo $REPOLIST | wc -w` repositories."

$GHBU_SILENT || (echo "" && echo "=== BACKING UP ===" && echo "")
for REPO in $REPOLIST; do
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}"
  check ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}.wiki (if any)"
  ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.wiki.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git 2>/dev/null && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO} issues"
  check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/repos/${GHBU_ORG}/${REPO}/issues -q > ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP} && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP}
done

if $GHBU_PRUNE_OLD; then
  $GHBU_SILENT || (echo "" && echo "=== PRUNING ===" && echo "")
  $GHBU_SILENT || echo "Pruning backup files ${GHBU_PRUNE_AFTER_N_DAYS} days old or older."
  $GHBU_SILENT || echo "Found `find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS | wc -l` files to prune."
  find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS -exec rm -fv {} > /dev/null \;
fi

$GHBU_SILENT || (echo "" && echo "=== DONE ===" && echo "")
$GHBU_SILENT || (echo "GitHub backup completed." && echo "")
(Also at rodw/backup-github.sh.)
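Since each `GHBU_*` parameter can be overridden from the environment, a one-off run might look like this (all values are placeholders):

GHBU_ORG=my-org GHBU_UNAME=my-username GHBU_PASSWD=my-api-token \
GHBU_BACKUP_DIR=/var/backups/github ./backup-github.sh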
A Cheat Sheet for SQLite
General

- Most of the SQLite "meta" commands begin with a dot. When in doubt, try `.help`.
- Use `Ctrl-d` or `.exit` or `.quit` to exit (and `Ctrl-c` to terminate a long-running SQL query).
- Enter `.show` to see current settings.
Meta-data

- Enter `.databases` to see a list of mounted databases.
- Enter `.tables` to see a list of table names.
- Enter `.index` to see a list of index names.
- Enter `.schema TABLENAME` to see the create table statement for a given table.
Import and Export

- Enter `.output FILENAME` to pipe output to the specified file. (Use `.output stdout` to return to the default behavior of printing results to the console.)
- Enter `.mode [csv|column|html|insert|line|list|tabs|tcl]` to change the way in which query results are printed.
- Enter `.separator DELIM` to change the delimiter used in (`list`-mode) exports and imports. (Defaults to `|`.)
- Enter `.dump [TABLEPATTERN]` to create a collection of SQL statements for recreating the database (or just those tables with names matching the optional TABLEPATTERN).
- Enter `.read FILENAME` to execute the specified file as a SQL script.
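The same dot-commands work non-interactively, since `sqlite3` reads them from stdin; a minimal sketch that exports a hypothetical `users` table to CSV (database and table names are placeholders):

sqlite3 mydb.sqlite <<'EOF'
.mode csv
.output users.csv
SELECT * FROM users;
.output stdout
EOF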