13 ops snippets
Gracefully closing Node.js applications via signal handling

To make your Node.js application gracefully respond to shutdown signals, use `process.on(SIGNAL, HANDLER)`. For example, to respond to SIGINT (typically Ctrl-C), you can use:
```javascript
process.on("SIGINT", function () {
  console.log("CLOSING [SIGINT]");
  process.exit();
});
```
Note that without the `process.exit()`, the program will not shut down. (This is your chance to override or "trap" the signal.)
Some common examples (in CoffeeScript):

```coffeescript
process.on 'SIGHUP',  -> console.log('CLOSING [SIGHUP]');  process.exit()
process.on 'SIGINT',  -> console.log('CLOSING [SIGINT]');  process.exit()
process.on 'SIGQUIT', -> console.log('CLOSING [SIGQUIT]'); process.exit()
process.on 'SIGABRT', -> console.log('CLOSING [SIGABRT]'); process.exit()
process.on 'SIGTERM', -> console.log('CLOSING [SIGTERM]'); process.exit()
```
PS: On Linux (and similar) you can enter `kill -l` on the command line to see a list of possible signals, and `kill -N PID` to send signal N to the process with process ID PID.
Command-line tool for spidering sites and extracting XML/HTML content
Xidel is a robust tool for spidering, extracting and transforming XML/HTML content from the command line. It's like `wget` or `curl` with a CSS and XPath/XQuery engine (among other features) attached. `xidel` doesn't seem to be in the package management repositories I normally use, but you can download it here.
The following example will (1) download a web page, (2) extract a list of links (specified via CSS selector) from it, (3) download the page corresponding to each of those links and finally (4) extract specific pieces of content (specified by CSS selectors) from each page:
```shell
xidel [URL-OF-INDEX-PAGE] \
  --follow "css('[CSS-SELECTOR-FOR-LINKS]')" \
  --css "[CSS-SELECTOR-FOR-SOME-TEXT]" \
  --extract "inner-html(css('[CSS-SELECTOR-FOR-SOME-HTML]'))"
```
As a concrete example, the command:
```shell
$ xidel http://reddit.com -f "css('a')" --css title
```

will download every page linked from the reddit.com homepage and print the content of its `title` tag.
There are several more examples on the Xidel site.
Shell script for service-like CoffeeScript/Node.js apps using forever
This is an example of a (bash) shell script that uses the forever module to start and stop a CoffeeScript application as if it were a service.
(Also at rodw/coffee-as-a-service-via-forever.sh.)
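The full script is at the link above; a minimal sketch of the same idea looks like the following. (This is an illustration, not the original script: it assumes the `forever` npm package and the `coffee` executable are on the PATH, and the `app.coffee` entry point and log path are hypothetical.)

```shell
#!/bin/bash
# Sketch of a service-style wrapper around the `forever` CLI.
# APP and LOG are hypothetical; adjust for your application.
APP="app.coffee"
LOG="/tmp/app-forever.log"

service_ctl() {
  case "$1" in
    start)   forever start -c coffee -l "$LOG" --append "$APP" ;;  # launch under forever, restarting on crash
    stop)    forever stop "$APP" ;;                                # stop the daemonized process by script name
    restart) forever restart "$APP" ;;
    status)  forever list ;;                                       # show all forever-managed processes
    *)       echo "Usage: $0 {start|stop|restart|status}" ;;
  esac
}

service_ctl "$@"
```

Symlinking such a script into `/etc/init.d` (or invoking it from a systemd unit) gives the app a conventional `start`/`stop` interface.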
Backup or mirror a website using wget
To create a local mirror or backup of a website with `wget`, run:

```shell
wget -r -l 5 -k -w 1 --random-wait <URL>
```
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-k` (or `--convert-links`) will cause `wget` to convert links in the downloaded documents so that the files can be viewed locally
- `-w N` (or `--wait=N`) will cause `wget` to wait N seconds between requests
- `--random-wait` will cause `wget` to randomly vary the wait time from 0.5x to 1.5x the value specified by `--wait`
Some additional notes:

- `--mirror` (or `-m`) can be used as a shortcut for `-r -N -l inf --no-remove-listing`, which enables infinite recursion and preserves both the server timestamps and FTP directory listings.
- `-np` (or `--no-parent`) can be used to limit `wget` to files below a specific "directory" (path).
Pre-generate pages or load a web cache using wget
Many web frameworks and template engines will defer generating the HTML version of a document until the first time it is accessed. This can make the first hit on a given page significantly slower than subsequent hits.
You can use `wget` to pre-cache web pages using a command such as:

```shell
wget -r -l 3 -nd --delete-after <URL>
```
Where:

- `-r` (or `--recursive`) will cause `wget` to recursively download files
- `-l N` (or `--level=N`) will limit recursion to at most N levels below the root document (defaults to 5; use `inf` for infinite recursion)
- `-nd` (or `--no-directories`) will prevent `wget` from creating local directories to match the server-side paths
- `--delete-after` will cause `wget` to delete each file as soon as it is downloaded (so the command leaves no traces behind)
Mapping port 80 to port 3000 using iptables
Port numbers less than 1024 are considered "privileged" ports, and you generally must be `root` to bind a listener to them. Rather than running a network application as `root`, map the privileged port to a non-privileged one:

```shell
sudo iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3000
```
Now requests to port 80 will be forwarded on to port 3000.
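To inspect or undo the redirect later, you can address the rule by number. (A sketch; this requires root, and assumes the rule was added to the NAT PREROUTING chain as above.)

```shell
# list the NAT PREROUTING rules, with rule numbers in the first column
sudo iptables -t nat -L PREROUTING --line-numbers

# delete the redirect by its rule number (e.g., rule 1 from the listing above)
sudo iptables -t nat -D PREROUTING 1
```

Note that `iptables` rules added this way are not persistent across reboots; use your distribution's mechanism (e.g., `iptables-save`) to keep them.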
Cheat Sheet for Linux Run Levels
"Standard" Linux uses the following run levels:
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2 is multi-user mode (without networking).
- Run Level 3 is multi-user mode (with networking). This is the normal "terminal" mode (i.e., before the display manager is run).
- Run Level 4 is undefined.
- Run Level 5 is multi-user mode with a GUI display manager (X11).
- Run Level 6 is reboot.

In Debian and its derivatives, run levels 2 through 5 are the same: multi-user mode with networking, and with a display manager if available.
- Run Level 0 is halt (shutdown).
- Run Level 1 is single-user mode.
- Run Level 2-5 is multi-user mode with networking and a GUI display manager when available.
- Run Level 6 is reboot.
Debian also adds Run Level S, which is executed when the system first boots.
Also see Wikipedia's article on run levels.
Find duplicate files on Linux (or OSX)
Find files that have the same size and MD5 hash (and hence are likely to be exact duplicates):
```shell
find . -not -empty -type f -printf "%s\n" |         # line 1
  sort -rn |                                        # line 2
  uniq -d |                                         # line 3
  xargs -I{} -n1 find . -type f -size {}c -print0 | # line 4
  xargs -0 md5sum |                                 # line 5
  sort |                                            # line 6
  uniq -w32 --all-repeated=separate |               # line 7
  cut -d" " -f3-                                    # line 8
```
You probably want to redirect that output to a file, as it runs slowly.

- Line 1 prints the size (in bytes) of every non-empty regular file.
- Line 2 sorts the sizes numerically, in descending order.
- Line 3 strips out the lines (sizes) that only appear once.
- For each remaining size, line 4 finds all the files of that size.
- Line 5 computes the MD5 hash for all the files found in line 4, outputting the MD5 hash and file name. (This is repeated for each set of files of a given size.)
- Line 6 sorts that list for easy comparison.
- Line 7 compares the first 32 characters of each line (the MD5 hash) to find duplicates.
- Line 8 spits out the file name and path part of the matching lines.
Some alternative approaches can be found at the original source.
Launch an HTTP server serving the current directory using Python
The Python `SimpleHTTPServer` module makes it easy to launch a simple web server using the current working directory as the "docroot".

With Python 2:

```shell
python -m SimpleHTTPServer
```

or with Python 3:

```shell
python3 -m http.server
```
By default, each will bind to port 8000, hence http://localhost:8000/ will serve the top level of the working directory tree. Hit `Ctrl-C` to stop.
Both accept an optional port number:

```shell
python -m SimpleHTTPServer 3001
```

or

```shell
python3 -m http.server 3001
```

if you want to bind to something other than port 8000.
Backup an entire GitHub repository
The following shell script will back up an organization's GitHub repositories, including all branches of the source tree and the GitHub wiki and issue list (if any).
(Also at rodw/backup-github.sh.)
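The full script is at the link above; the core idea can be sketched as follows. (An illustration, not the original script: it assumes `git` and `curl` are installed, uses the public GitHub API and the `NAME.wiki.git` convention for wikis, and takes hypothetical ORG and BACKUP-DIR arguments.)

```shell
#!/bin/bash
# Sketch: mirror every repository of a GitHub organization, plus wikis and issues.
ORG="$1"
BACKUP_DIR="${2:-./github-backup}"   # hypothetical default destination

# clone a repository as a bare mirror (all branches and tags),
# or refresh it if the mirror already exists
backup_repo() {
  local url="$1" dest="$2"
  if [ -d "$dest" ]; then
    git --git-dir="$dest" remote update
  else
    git clone --quiet --mirror "$url" "$dest"
  fi
}

if [ -z "$ORG" ]; then
  echo "Usage: $0 ORG [BACKUP-DIR]"
else
  mkdir -p "$BACKUP_DIR"
  # list the organization's (public) repositories via the GitHub API
  curl -s "https://api.github.com/orgs/$ORG/repos?per_page=100" \
    | grep -o '"clone_url": *"[^"]*"' | cut -d'"' -f4 \
    | while read -r url; do
        name="$(basename "$url" .git)"
        backup_repo "$url" "$BACKUP_DIR/$name.git"
        # the wiki, if any, lives in a parallel NAME.wiki.git repository
        backup_repo "${url%.git}.wiki.git" "$BACKUP_DIR/$name.wiki.git" 2>/dev/null || true
        # snapshot the issue list (if any) as raw JSON
        curl -s "https://api.github.com/repos/$ORG/$name/issues?state=all" \
          > "$BACKUP_DIR/$name.issues.json"
      done
fi
```

Because `git clone --mirror` creates a bare repository, re-running the script refreshes each mirror incrementally rather than re-downloading everything.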
A Cheat Sheet for SQLite
General
- Most of the SQLite "meta" commands begin with a dot. When in doubt, try `.help`.
- Use `Ctrl-D` or `.exit` or `.quit` to exit (and `Ctrl-C` to terminate a long-running SQL query).
- Enter `.show` to see current settings.
Meta-data
- Enter `.databases` to see a list of mounted databases.
- Enter `.tables` to see a list of table names.
- Enter `.index` to see a list of index names.
- Enter `.schema TABLENAME` to see the create table statement for a given table.
Import and Export
- Enter `.output FILENAME` to pipe output to the specified file. (Use `.output stdout` to return to the default behavior of printing results to the console.)
- Enter `.mode [csv|column|html|insert|line|list|tabs|tcl]` to change the way in which query results are printed.
- Enter `.separator DELIM` to change the delimiter used in (`list`-mode) exports and imports. (Defaults to `|`.)
- Enter `.dump [TABLEPATTERN]` to create a collection of SQL statements for recreating the database (or just those tables with names matching the optional TABLEPATTERN).
- Enter `.read FILENAME` to execute the specified file as a SQL script.
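As a quick illustration of `.mode`, `.output` and `.dump` together (a sketch that assumes the `sqlite3` command-line shell is installed; the database and table names are made up):

```shell
# start from a clean slate so the demo is repeatable
rm -f demo.db demo.csv

# create a tiny table, export it as CSV, then dump it as SQL
sqlite3 demo.db <<'EOF'
CREATE TABLE people (id INTEGER, name TEXT);
INSERT INTO people VALUES (1,'alice');
INSERT INTO people VALUES (2,'bob');
.mode csv
.output demo.csv
SELECT * FROM people;
.output stdout
.dump people
EOF

# demo.csv now holds the query results in CSV form
cat demo.csv
```

The `.dump` output printed to the console can be piped through `.read` (or straight into another `sqlite3` invocation) to recreate the table elsewhere.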