When used properly, the command line is often the easiest and fastest way to get tasks done, even though the same things can also be accomplished in other ways, such as with Python or Perl. Many useful little tricks can be found on the internet; I have collected some of them here.


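scp with wildcard patterns

The patterns scp accepts are shell wildcards (globs) rather than true regular expressions. If the remote path is left unquoted, your local shell may try to expand the pattern against local files, or reject it outright (zsh does this when nothing matches), before scp ever sees it. The trick is to quote the remote path so the pattern is matched on the remote machine:

scp "user@machine:/path/[pattern here]" .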
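
To see why the quotes matter, here is a small local sketch that contacts no remote machine. It uses echo as a stand-in for scp and a throwaway directory with two hypothetical files, a.txt and b.txt, to show what the command receives with and without quotes.

```shell
# Local demonstration only; echo stands in for scp.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.txt b.txt

# Unquoted: the local shell replaces the pattern with matching local names.
unquoted=$(echo *.txt)

# Quoted: the pattern reaches the command untouched, as it would reach scp.
quoted=$(echo "*.txt")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With the quotes in place the command sees the literal pattern, which is what lets the remote side do the matching.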

Finding and replacing text within files

sed -i 's/original/new/g' file.txt

Explanation:

  • sed = Stream Editor
  • -i = in-place (i.e. save back to the original file)
  • The command string:
    • s = the substitute command
    • original = a regular expression describing the word to replace (or just the word itself)
    • new = the text to replace it with
    • g = global (i.e. replace all and not just the first occurrence)
  • file.txt = the file name
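
As a quick check of the breakdown above, the following sketch runs the substitution on a throwaway file. The file contents and the .bak suffix are assumptions for illustration: writing -i.bak keeps a backup copy of the original, and this form is accepted by both GNU and BSD sed.

```shell
tmp=$(mktemp -d)
printf 'original text, original idea\nuntouched line\n' > "$tmp/file.txt"

# Replace every occurrence in place, keeping the old version as file.txt.bak.
sed -i.bak 's/original/new/g' "$tmp/file.txt"

first_line=$(head -n 1 "$tmp/file.txt")
echo "$first_line"
```

Because of the g flag, both occurrences on the first line are replaced; without it only the first one would be.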

Monitoring a log file in real time

tail -f log

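This works, but the downside is that tail only streams new output: you cannot scroll back or search without leaving it. As an alternative, less is a more elegant approach. It follows the file in the same way, pressing Ctrl-C stops following so that you can scroll and search, and pressing F resumes: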

less +F log

Returning the n most recently modified files in a directory, oldest first (here n = 1; change the number after -n for more):

ls -Art | tail -n 1

or newest first (without -A, hidden files are skipped):

ls -t | head -n 1
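
A small sketch of the same idea with n larger than 1. The directory, the file names and the timestamps are made up for the demonstration; touch -t sets each modification time explicitly so the ordering is predictable.

```shell
dir=$(mktemp -d)
touch -t 202001010000 "$dir/old.txt"
touch -t 202101010000 "$dir/mid.txt"
touch -t 202201010000 "$dir/new.txt"

n=2
# Newest first, then keep the first n names.
newest=$(ls -t "$dir" | head -n "$n")
echo "$newest"
```

This lists new.txt and then mid.txt. Parsing ls output is fine for quick interactive use, but it can misbehave with file names that contain newlines.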

Piping selected files into tar

ls -Art | tail -n 5 | tar czvf out.tar.gz -T -
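
The following sketch runs the same pipeline on throwaway data and lists the archive afterwards. The directories, file names and timestamps are assumptions for illustration. The archive is written to a separate directory because, when it is created in the directory being listed, it may show up in its own file list.

```shell
src=$(mktemp -d)
out=$(mktemp -d)
touch -t 202001010000 "$src/old.txt"
touch -t 202101010000 "$src/mid.txt"
touch -t 202201010000 "$src/new.txt"

# Archive the 2 most recently modified files; -T - reads the names from stdin.
(cd "$src" && ls -Art | tail -n 2 | tar czf "$out/out.tar.gz" -T -)

contents=$(tar tzf "$out/out.tar.gz")
echo "$contents"
```

The listing shows mid.txt and new.txt, in the order the names were piped in.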

Avoiding file auto purge

Many shared file systems have housekeeping rules that automatically purge files that have not been modified for some time. One trick to avoid this is to touch every file in the directory tree, which resets its modification time. This can be done with the following command:

find /home/example -exec touch {} \;
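
A hedged variation on the command above, run against a throwaway directory so the effect can be checked. Here -type f restricts the touch to regular files, and ending -exec with + passes many files to each touch call instead of starting one process per file. The reference file and the dates are assumptions used only to count stale files.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub"
ref=$(mktemp)
touch -t 200001010000 "$dir/a.txt" "$dir/sub/b.txt"   # pretend these are stale
touch -t 201001010000 "$ref"                           # cutoff between old and now

stale_before=$(find "$dir" -type f ! -newer "$ref" | wc -l | tr -d ' ')

# Refresh the modification time of every regular file under $dir.
find "$dir" -type f -exec touch {} +

stale_after=$(find "$dir" -type f ! -newer "$ref" | wc -l | tr -d ' ')
echo "stale before: $stale_before, after: $stale_after"
```

Before the touch both files are older than the cutoff; afterwards none are.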

Checking for missing files in a sequence

Assume the files share the pattern FILE_DDD.txt, where DDD is a zero-padded three-digit number.

Example 1:

for i in {1..14}
do
    seq=$(printf "%03d" "$i")    # zero-pad to three digits, e.g. 7 -> 007
    if [ ! -f "FILE_${seq}.txt" ]
    then
        echo "FILE_${seq}.txt"   # report the padded name that was tested
    fi
done

Assume a series of files whose names encode the day of the year and the hour of the day, such as:

GLDAS_NOAH025SUBP_3H.A2003001.0000.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.0600.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.1200.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.1800.001.2015210044609.pss.grb

Using a simplified naming scheme, file_DDD.HH.txt, as a stand-in for these names:

for a in file_{001..365}.{00..18..6}.txt
do
  [[ -f $a ]] || echo "$a"
done

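Note the use of .. in the examples above. Brace expansion such as {1..14} or {00..18..6} is a very useful way to generate a continuous list of strings (the zero padding and the step form need bash 4 or later).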
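
The loop above uses simplified file names. A sketch of how the same check might look for the GLDAS-style names follows, assuming bash 4 or later. The trailing timestamp differs from file to file, so it is matched with a wildcard instead of being spelled out. The sample files are made up: day 001 is complete and day 002 lacks its 1200 file.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
for hour in 0000 0600 1200 1800; do
    touch "GLDAS_NOAH025SUBP_3H.A2003001.${hour}.001.2015210044609.pss.grb"
done
for hour in 0000 0600 1800; do
    touch "GLDAS_NOAH025SUBP_3H.A2003002.${hour}.001.2015210044609.pss.grb"
done

missing=""
for day in {001..002}; do          # use {001..365} for a full year
    for hour in 0000 0600 1200 1800; do
        # Expand the wildcard; if nothing matches, $1 stays the literal pattern.
        set -- GLDAS_NOAH025SUBP_3H.A2003"${day}"."${hour}".001.*.pss.grb
        [ -e "$1" ] || missing="$missing A2003${day}.${hour}"
    done
done
echo "missing:$missing"
```

Only the day 002, hour 1200 entry is reported as missing.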

Assume the files are simply numbered 1.txt, 2.txt, and so on:

ub=1000 # Replace this with the largest existing file's number.
seq "$ub" | while read -r i; do
    [[ -f "$i.txt" ]] || echo "$i.txt is missing"
done

Someone suggested awk, but I am not familiar with it at all.

Listing file names using regular expressions

Example 1: finding files within a numeric range and deleting them (grep -P and xargs -d are GNU options)

ls | grep -P "^08[5-9].*[0-9]" | xargs -d "\n" rm

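Example 2: the same idea with find (note that -name takes a shell wildcard pattern rather than a regular expression)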

find your-directory/ -name 'A*[0-9][0-9]' -delete
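
Since deletion cannot be undone, it can help to preview the matches first. The sketch below does a dry run with -print and only then repeats the command with -delete. The directory and file names are hypothetical.

```shell
dir=$(mktemp -d)
touch "$dir/A_run01" "$dir/A_run02" "$dir/A_notes" "$dir/B_run01"

# Dry run: show what would be removed.
find "$dir" -name 'A*[0-9][0-9]' -print

# Same expression, now deleting for real.
find "$dir" -name 'A*[0-9][0-9]' -delete

remaining=$(ls "$dir")
echo "$remaining"
```

Only the names that start with A and end in two digits are removed, so A_notes and B_run01 are left.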

Removing prefix from files
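
No command is given for this one yet. A minimal sketch, assuming the prefix is a fixed string (draft_ here is purely hypothetical), is to strip it with the ${f#prefix} parameter expansion, the counterpart of the ${f%suffix} form used for file extensions.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch draft_a.txt draft_b.txt keep.txt

prefix="draft_"   # hypothetical prefix to strip
for f in "$prefix"*; do
    [ -e "$f" ] || continue           # skip the literal pattern if nothing matches
    mv -- "$f" "${f#"$prefix"}"       # remove the prefix from the start of the name
done

result=$(ls)
echo "$result"
```

After the loop the directory holds a.txt, b.txt and keep.txt. If a target name already exists, mv overwrites it, so adding -i or -n may be wise for real data.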

Changing file extensions

for f in *.html; do
    mv -- "$f" "${f%.html}.php"
done

Many more to be added later! Some additional private notes on scripting are listed in this repo.