Mon Feb 17 20:40:19 HKT 2020


download zip of files only

Fri Jun 16 21:37:41 HKT 2006 From /weblog/unix/script

check exit status

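Need to check both $? and PIPESTATUS: $? holds the exit status of the last command, while bash's PIPESTATUS array holds the exit status of every command in the most recent pipeline.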
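
A minimal sketch of the difference, assuming bash (PIPESTATUS is a bash array, not a POSIX sh feature). The variable names status and stages are only illustrative.

```shell
# $? is the exit status of the most recent command.
status=0
false || status=$?
echo "false exited with $status"

# For a pipeline, $? only reflects the LAST stage, so a failure
# earlier in the pipe is hidden. PIPESTATUS keeps every stage.
false | true
stages=("${PIPESTATUS[@]}")   # copy at once; the next command overwrites it
echo "pipeline stages exited with: ${stages[*]}"
```

Copying PIPESTATUS into another array straight away matters, because any later command (even an echo) replaces its contents.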

Sun May 28 18:31:46 HKT 2006 From /weblog/unix/script

Batch rename

A script for batch renaming files on Unix systems
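
The linked script is not reproduced here, so the following is only a sketch of one common approach, not that script: a loop that changes a file extension using shell parameter expansion. The scratch directory, the file names, and the .htm to .html rename are all assumptions made for the demo.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/a.htm" "$workdir/b.htm" "$workdir/notes.txt"

# Rename every *.htm file to *.html.
for f in "$workdir"/*.htm; do
    [ -e "$f" ] || continue        # no match: the pattern is left as literal text
    mv -- "$f" "${f%.htm}.html"    # ${f%.htm} strips the old suffix
done

ls "$workdir"
```

Quoting "$f" keeps file names with spaces intact, and files that do not match the pattern (notes.txt here) are left alone.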

Tue Dec 20 13:20:44 HKT 2005 From /weblog/unix


The answer is wget. It can be used to download a single file, a list of specified files, or a recursive chain of files. For example, the following command will download an entire site, following all links as long as they stay in the same domain.

wget -r

This command will do the same but include referenced CSS, inline images, etc.

wget -p -r

In this case, I wanted all files with extension "mp3", skipping everything else. My first thought was to use the -A option to only "Accept" and download mp3 files.

wget -p -r -A mp3

The problem, though, is that Escape Pod, like many podcast sites, has its actual mp3 files hosted by a third party in order to reduce bandwidth. I could do the recursive download across domains but thought that might get a bit dangerous.

In the end, I ran three commands. The first downloads the html files for the entire site. The second scans the html for full URLs and uses sed to filter out everything else. (If I knew sed better, this could probably be a shorter command.) Note the use of find to navigate all files in the tree, egrep to restrict to actual URLs, sed to eliminate irrelevant parts of the line, and sort/uniq to remove duplicates.

Finally, the third command uses wget to download all files found by the previous command. (Note: a trailing \ continues the command on the next line.)

wget -r -A htm,html

cat `find . -name \*htm\* -print` | egrep "http.*mp3" | \
sed "s/.*\(http:\/\/.*mp3\).*$/\1/" | sort | uniq > files.txt

wget -i files.txt
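
As a possible shorter form of the middle step, here is a sketch that lets grep -o print only the matching URLs, so sed is not needed, and that also catches more than one URL per line. It runs against two made-up html files in a scratch directory. The media.example.com addresses are placeholders, not the real host, and the character class used to end a URL is an assumption about how the links are written.

```shell
# Sample pages standing in for the mirrored site (hypothetical content).
workdir=$(mktemp -d)
cat > "$workdir/index.html" <<'EOF'
<a href="http://media.example.com/ep1.mp3">one</a> <a href="http://media.example.com/ep2.mp3">two</a>
EOF
cat > "$workdir/page2.htm" <<'EOF'
<a href="http://media.example.com/ep1.mp3">again</a>
EOF

# -E extended regex, -h no file name prefix, -o print only the matched text.
# sort -u does the job of sort | uniq.
find "$workdir" -type f -name '*htm*' \
    -exec grep -Eho 'http://[^"<> ]*\.mp3' {} + | sort -u > "$workdir/files.txt"

cat "$workdir/files.txt"
```

The resulting files.txt can be passed to wget -i exactly as in the third command above.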

Tue Nov 08 14:33:09 HKT 2005 From /weblog/unix/script

Unix KornShell Quick Reference

Tue Oct 04 11:58:05 HKT 2005 From /weblog/unix/script

safer rm

A number of tips to make the rm command safer[..]le.php?story=20050928082624470&lsrc=osxh