Linux Basic Unix tools
Introduction
In this session, we introduce commands to find and locate files and to compress files, together with other common tools that were not discussed before. While the tools discussed here are technically not considered filters, they can be used in pipes.
find
The find command can be very useful at the start of a pipe to search for files. You might want to add 2>/dev/null to the command lines to avoid cluttering your screen with error messages. Here are some examples.
Find all files in /etc and put the list in etcfiles.txt
find /etc > etcfiles.txt
The output is shown below.
datasoft @ datasoft-linux ~$ cat etcfiles.txt | more
/etc
/etc/pm
/etc/pm/sleep.d
/etc/pm/sleep.d/10_grub-common
/etc/pm/sleep.d/10_unattended-upgrades-hibernate
/etc/pm/sleep.d/novatel_3g_suspend
/etc/pm/power.d
/etc/pm/config.d
/etc/mtab.fuselock
/etc/hp
/etc/hp/hplip.conf
/etc/kernel
/etc/kernel/postrm.d
/etc/kernel/postrm.d/initramfs-tools
/etc/kernel/postrm.d/zz-update-grub
/etc/kernel/postinst.d
/etc/kernel/postinst.d/initramfs-tools
/etc/kernel/postinst.d/update-notifier
/etc/kernel/postinst.d/apt-auto-removal
/etc/kernel/postinst.d/zz-update-grub
/etc/kernel/postinst.d/pm-utils
/etc/insserv
/etc/insserv/overrides
--More--
Find all files of the entire system and put the list in allfiles.txt
find / > allfiles.txt
Find files that end in .conf in the current directory (and all subdirs).
find . -name "*.conf"
The output is shown below
datasoft @ datasoft-linux /$ find . -name "*.conf" | more
find: `./proc/37/map_files': Permission denied
find: `./proc/37/fdinfo': Permission denied
find: `./proc/37/ns': Permission denied
find: `./proc/49/task/49/fd': Permission denied
find: `./proc/49/task/49/fdinfo': Permission denied
find: `./proc/49/task/49/ns': Permission denied
find: `./proc/49/fd': Permission denied
find: `./proc/49/map_files': Permission denied
find: `./proc/49/fdinfo': Permission denied
find: `./proc/49/ns': Permission denied
find: `./proc/52/task/52/fd': Permission denied
find: `./proc/52/task/52/fdinfo': Permission denied
find: `./proc/52/task/52/ns': Permission denied
find: `./proc/52/fd': Permission denied
find: `./proc/52/map_files': Permission denied
find: `./proc/52/fdinfo': Permission denied
find: `./proc/52/ns': Permission denied
find: `./proc/53/task/53/fd': Permission denied
find: `./proc/53/task/53/fdinfo': Permission denied
find: `./proc/53/task/53/ns': Permission denied
find: `./proc/53/fd': Permission denied
find: `./proc/53/map_files': Permission denied
find: `./proc/53/fdinfo': Permission denied
find: `./proc/53/ns': Permission denied
...
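The error messages in this output are what the 2>/dev/null redirection mentioned earlier is meant to hide. The following is a minimal sketch, assuming a throwaway directory created with mktemp; the file names and the deliberately missing starting point are hypothetical and only serve to provoke an error message.

```shell
# Hypothetical scratch directory; nothing outside it is touched.
work=$(mktemp -d)
mkdir -p "$work/app"
touch "$work/app/demo.conf" "$work/app/notes.txt"

# The second starting point does not exist, so find reports an error on stderr.
# 2>/dev/null discards that message; matching paths still arrive on stdout.
# find exits non-zero after an error, hence the "|| true".
found=$(find "$work/app" "$work/missing" -name "*.conf" 2>/dev/null || true)
echo "$found"

rm -rf "$work"
```

Only standard error is discarded, so the list of matches can still be piped or redirected as usual.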
Find regular files (not directories, pipes, etc.) whose names end in .conf.
find . -type f -name "*.conf"
The output is shown below
datasoft @ datasoft-linux /$ find . -name "*.conf" | more
find: `./proc/37/map_files': Permission denied
find: `./proc/37/fdinfo': Permission denied
find: `./proc/37/ns': Permission denied
find: `./proc/49/task/49/fd': Permission denied
find: `./proc/49/task/49/fdinfo': Permission denied
find: `./proc/49/task/49/ns': Permission denied
find: `./proc/49/fd': Permission denied
find: `./proc/49/map_files': Permission denied
find: `./proc/49/fdinfo': Permission denied
find: `./proc/49/ns': Permission denied
...
Find directories whose names end in .bak.
find /data -type d -name "*.bak"
The output is shown below
datasoft @ datasoft-linux /$ find /data -type d -name "*.bak" | more
find: `/data': No such file or directory
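Because /data does not exist on this machine, the effect of -type is not visible in the output above. The sketch below shows the difference between -type f and -type d in a temporary directory; all file and directory names in it are made up for the illustration.

```shell
# Hypothetical scratch tree: one file and one directory per suffix.
work=$(mktemp -d)
mkdir -p "$work/old.bak" "$work/etc.conf"
touch "$work/data.bak" "$work/app.conf"

# -type f keeps regular files only, so the directory etc.conf is skipped.
files=$(find "$work" -type f -name "*.conf")

# -type d keeps directories only, so the regular file data.bak is skipped.
dirs=$(find "$work" -type d -name "*.bak")

echo "$files"
echo "$dirs"

rm -rf "$work"
```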
Find files that are newer than file42.txt
find . -newer file42.txt
The output is shown below
datasoft @ datasoft-linux /$ find . -newer file42.txt | more
find: `file42.txt': No such file or directory
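The command fails here only because file42.txt does not exist in the current directory. A small sketch of -newer with a reference file that does exist follows; the timestamps and the extra file names are assumptions chosen for the example.

```shell
# Hypothetical scratch directory with three files of known age.
work=$(mktemp -d)

# touch -t takes [[CC]YY]MMDDhhmm and sets the modification time.
touch -t 201401010000 "$work/file42.txt"
touch -t 201301010000 "$work/older.txt"
touch -t 201501010000 "$work/newer.txt"

# -newer compares modification times against the reference file.
# -type f keeps the scratch directory itself (modified just now) out of the list.
newer=$(find "$work" -type f -newer "$work/file42.txt")
echo "$newer"

rm -rf "$work"
```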
Find can also execute another command on every file found. This example will look for *.odf files and copy them to /backup/.
find /data -name "*.odf" -exec cp {} /backup/ \;
datasoft @ datasoft-linux /$ find /data -name "*.odf" -exec cp {} /backup/ \; | more
find: `/data': No such file or directory
Find can also execute, after your confirmation, another command on every file found: use -ok instead of -exec. This example will remove *.odf files if you approve of it for every file found.
find /data -name "*.odf" -ok rm {} \;
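To try -exec without needing /data or /backup, the copy can be rehearsed in a scratch directory. This is only a sketch: the data and backup directories below are hypothetical stand-ins created under mktemp.

```shell
# Hypothetical stand-ins for /data and /backup.
work=$(mktemp -d)
mkdir -p "$work/data/sub" "$work/backup"
touch "$work/data/one.odf" "$work/data/sub/two.odf" "$work/data/skip.txt"

# {} is replaced by each path found; \; ends the command.
# cp therefore runs once per matching file.
find "$work/data" -name "*.odf" -exec cp {} "$work/backup/" \;

copied=$(find "$work/backup" -type f | wc -l)
echo "files copied: $copied"

rm -rf "$work"
```

Files from subdirectories land directly in the backup directory, so two files with the same name in different subdirectories would overwrite each other.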
locate
The locate tool is very different from find in that it uses an index to locate files. This is a lot faster than traversing all the directories, but it also means that the results can be outdated: files created or removed since the last index update are not reflected. If the index does not exist yet, then you have to create it (as root on Red Hat Enterprise Linux) with the updatedb command.
datasoft @ datasoft-linux /$ locate samba | more
/etc/samba
/etc/apparmor.d/abstractions/samba
/etc/dhcp/dhclient-enter-hooks.d/samba
/etc/pam.d/samba
/etc/samba/gdbcommands
/etc/samba/smb.conf
/etc/samba/tls
/usr/bin/samba-regedit
/usr/bin/samba-tool
/usr/lib/samba
/usr/lib/2013.com.canonical.certification:checkbox/bin/samba_test
/usr/lib/i386-linux-gnu/libsamba-credentials.so.0
/usr/lib/i386-linux-gnu/libsamba-credentials.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-hostconfig.so.0
/usr/lib/i386-linux-gnu/libsamba-hostconfig.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-policy.so.0
/usr/lib/i386-linux-gnu/libsamba-policy.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-util.so.0
/usr/lib/i386-linux-gnu/libsamba-util.so.0.0.1
/usr/lib/i386-linux-gnu/samba
/usr/lib/i386-linux-gnu/samba/auth
/usr/lib/i386-linux-gnu/samba/bind9
/usr/lib/i386-linux-gnu/samba/gensec
--More--
Most Linux distributions schedule updatedb to run once a day.
sleep
The sleep command is used to suspend execution for at least the integral number of seconds specified by the time operand. The following example shows a six second sleep.
datasoft @ datasoft-linux /$ sleep 6
datasoft @ datasoft-linux /$
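One way to see that sleep really pauses for at least the requested time is to read the clock before and after. The sketch below uses date +%s (seconds since the epoch) and a two-second pause; the variable names are arbitrary.

```shell
# Record the time in whole seconds before and after the pause.
start=$(date +%s)
sleep 2
end=$(date +%s)

# Shell arithmetic gives the elapsed whole seconds.
elapsed=$((end - start))
echo "slept for about $elapsed seconds"
```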
time
The time command can display how long it takes to execute a command. In the following example the date command takes only a little time to execute.
datasoft @ datasoft-linux /$ time date
Tue Aug 5 17:22:56 IST 2014
real 0m0.001s
user 0m0.000s
sys 0m0.000s
In the following example the sleep 5 command takes five real seconds to execute, but consumes little cpu time.
datasoft @ datasoft-linux /$ time sleep 5
real 0m5.001s
user 0m0.000s
sys 0m0.000s
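This bzip2 command compresses a file. Compressing a large file uses a lot of CPU time; the file in this example is so small that the measured times are close to zero.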
datasoft @ datasoft-linux /$ time bzip2 text.txt
real 0m0.021s
user 0m0.000s
sys 0m0.000s
gzip
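The gzip command reduces the size of the named files using Lempel-Ziv coding (LZ77). A very small file, like the 22-byte file in this example, can actually grow, because the gzip header and trailer add more bytes than the compression saves.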
datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug 2 14:36 temp.txt
datasoft @ datasoft-linux ~$ gzip temp.txt
datasoft @ datasoft-linux ~$ ls -lh temp.txt.gz
-rw-rw-r-- 1 datasoft datasoft 49 Aug 2 14:36 temp.txt.gz
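On a larger, repetitive file the size reduction becomes visible. The following sketch builds such a file in a temporary directory and compares the byte counts; the file name and its contents are invented for the illustration.

```shell
# Hypothetical scratch directory and input file.
work=$(mktemp -d)

# Build a repetitive 1000-line text file; repetition compresses well.
i=0
while [ "$i" -lt 1000 ]; do
  echo "the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$work/big.txt"

# -c writes the compressed data to stdout, so the original file is kept.
gzip -c "$work/big.txt" > "$work/big.txt.gz"

original=$(wc -c < "$work/big.txt")
compressed=$(wc -c < "$work/big.txt.gz")
echo "original: $original bytes, compressed: $compressed bytes"

rm -rf "$work"
```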
gunzip
The gunzip command restores the original file that was compressed with the gzip command.
datasoft @ datasoft-linux ~$ gunzip temp.txt.gz
datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug 2 14:36 temp.txt
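A quick way to convince yourself that the round trip is lossless is to keep a reference copy and compare it afterwards. This is a sketch with made-up file contents in a scratch directory.

```shell
# Hypothetical scratch directory with a small text file and a reference copy.
work=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$work/temp.txt"
cp "$work/temp.txt" "$work/reference.txt"

gzip "$work/temp.txt"       # replaces temp.txt with temp.txt.gz
gunzip "$work/temp.txt.gz"  # restores temp.txt and removes temp.txt.gz

# cmp -s is silent and exits 0 only when both files are byte-identical.
if cmp -s "$work/temp.txt" "$work/reference.txt"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "$roundtrip"

rm -rf "$work"
```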
zcat - zmore
Text files that are compressed with gzip can be viewed with zcat and zmore.
datasoft @ datasoft-linux ~$ head -4 temp.txt
four
three
two
datasoft @ datasoft-linux ~$ gzip temp.txt
datasoft @ datasoft-linux ~$ zcat temp.txt.gz | head -4
four
three
two
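Because zcat writes the decompressed text to standard output, it combines with other tools in a pipe while the .gz file stays untouched. The sketch below assumes the zcat shipped with GNU gzip, as found on most Linux systems; the four-line file content is hypothetical.

```shell
# Hypothetical scratch directory and file content.
work=$(mktemp -d)
printf 'four\nthree\ntwo\none\n' > "$work/temp.txt"
gzip "$work/temp.txt"

# zcat decompresses to stdout; temp.txt.gz stays on disk unchanged.
lines=$(zcat "$work/temp.txt.gz" | wc -l)
first=$(zcat "$work/temp.txt.gz" | head -1)
echo "$lines lines, first line: $first"

rm -rf "$work"
```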
bzip2
The bzip2 command reduces the size of the named files using the Burrows-Wheeler block sorting text compression algorithm and Huffman coding.
datasoft @ datasoft-linux ~$ bzip2 temp.txt
datasoft @ datasoft-linux ~$ ls -lh temp.txt.bz2
-rw-rw-r-- 1 datasoft datasoft 59 Aug 2 14:36 temp.txt.bz2
bunzip2
Files can be uncompressed again with bunzip2.
datasoft @ datasoft-linux ~$ bunzip2 temp.txt.bz2
datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug 2 14:36 temp.txt
bzcat - bzmore
And in the same way, bzcat and bzmore can display files compressed with bzip2.
datasoft @ datasoft-linux ~$ bzip2 temp.txt
datasoft @ datasoft-linux ~$ bzcat temp.txt.bz2 | head -4
four
three
two
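To compare the two compressors on the same input, both can write to standard output so that the original file is left alone. This is a rough sketch on an invented log-like file; it checks whether bzip2 is installed first, since that is not guaranteed on every system, and it makes no claim about which tool produces the smaller result for your data.

```shell
# Hypothetical scratch directory and input file.
work=$(mktemp -d)
i=0
while [ "$i" -lt 1000 ]; do
  echo "line $i of a hypothetical log file"
  i=$((i + 1))
done > "$work/log.txt"

original=$(wc -c < "$work/log.txt")
gz_size=$(gzip -c "$work/log.txt" | wc -c)

# bzip2 is not installed everywhere, so check before using it.
if command -v bzip2 >/dev/null 2>&1; then
  bz_size=$(bzip2 -c "$work/log.txt" | wc -c)
  echo "original: $original, gzip: $gz_size, bzip2: $bz_size bytes"
else
  bz_size=
  echo "original: $original, gzip: $gz_size bytes, bzip2 not installed"
fi

rm -rf "$work"
```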
Exercise, Practice and Solution:
1. Explain the difference between these two commands. This question is very important. If you don't know the answer, then look back at the shell chapter.
find /data -name "*.txt"
find /data -name *.txt
When *.txt is quoted, the shell will not touch it, and find will look in /data for all files ending in .txt. When *.txt is not quoted, the shell might expand it (when one or more files ending in .txt exist in the current directory). The find command might then show a different result or fail with a syntax error.
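The effect can be reproduced in a scratch directory. In this sketch (directory and file names are hypothetical) the current directory contains exactly one .txt file, so the unquoted pattern is silently replaced by that file name before find ever sees it.

```shell
# Hypothetical tree: one .txt file at the top, one in a subdirectory.
work=$(mktemp -d)
mkdir -p "$work/data/sub"
touch "$work/data/a.txt" "$work/data/sub/b.txt"

# Quoted: find receives the pattern *.txt and matches at every level.
quoted=$(cd "$work/data" && find . -name "*.txt" | wc -l)

# Unquoted: the shell first expands *.txt to a.txt (the only match in the
# current directory), so find actually runs with -name a.txt.
unquoted=$(cd "$work/data" && find . -name *.txt | wc -l)

echo "quoted: $quoted, unquoted: $unquoted"

rm -rf "$work"
```

With two or more .txt files in the current directory, the unquoted form would pass several names after -name, which find rejects as a syntax error.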
2. Explain the difference between these two statements. Will they both work when there are 200 .odf files in /data ? How about when there are 2 million .odf files ?
find /data -name "*.odf" > data_odf.txt
find /data/*.odf > data_odf.txt
The first find will output all .odf filenames in /data and all its subdirectories. The shell will redirect this to a file. The second find will output all files named *.odf directly in /data and will also output all files that exist in directories named *.odf (in /data). Both commands work with 200 files. With two million files, the shell expansion in the second command would exceed the maximum argument length the system accepts, and the command would fail with an "Argument list too long" error.
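The difference between the two forms is easy to see on a small tree. The sketch below uses a hypothetical layout that includes a directory whose name ends in .odf, which is the case where the two commands diverge most clearly.

```shell
# Hypothetical tree: .odf files at two levels plus a directory named c.odf.
work=$(mktemp -d)
mkdir -p "$work/data/sub" "$work/data/c.odf"
touch "$work/data/a.odf" "$work/data/sub/b.odf" "$work/data/sub/d.odf"
touch "$work/data/c.odf/inside.txt"

# find walks the whole tree and tests every name against the pattern:
# a.odf, c.odf, sub/b.odf and sub/d.odf match.
by_name=$(find "$work/data" -name "*.odf" | wc -l)

# The shell expands the glob to data/a.odf and data/c.odf; find then lists
# those two starting points plus everything below the directory c.odf.
by_glob=$(find "$work/data"/*.odf | wc -l)

echo "by name: $by_name, by glob: $by_glob"

rm -rf "$work"
```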
3. Write a find command that finds all files created after January 30th, 2010.
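Code:
touch -t 201001302359 marker_date
find . -type f -newer marker_date
There is another solution, using the -newermt test of GNU find, which compares the modification time of each file with a date string (-newerat would compare the access time instead):
find . -type f -newermt "20100130 23:59:59"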
4. Write a find command that finds all *.odf files created in September 2009.
Code:
touch -t 200908312359 marker_start
touch -t 200910010000 marker_end
find . -type f -name "*.odf" -newer marker_start ! -newer marker_end
The exclamation mark negates the test that follows it, so ! -newer can be read as not newer.
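The marker technique can be checked on files with known timestamps. In this sketch the three .odf files and their dates are invented, and the markers are kept outside the searched directory so they cannot appear in the result.

```shell
# Hypothetical documents dated August, September and October 2009.
work=$(mktemp -d)
mkdir "$work/docs"
touch -t 200908151200 "$work/docs/august.odf"
touch -t 200909151200 "$work/docs/september.odf"
touch -t 200910151200 "$work/docs/october.odf"

# Marker files bracket the month of September.
touch -t 200908312359 "$work/marker_start"
touch -t 200910010000 "$work/marker_end"

# Newer than the start marker AND not newer than the end marker.
matches=$(find "$work/docs" -type f -name "*.odf" \
  -newer "$work/marker_start" ! -newer "$work/marker_end")
echo "$matches"

rm -rf "$work"
```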
5. Count the number of *.conf files in /etc and all its subdirs.
Code:
find /etc -type f -name '*.conf' | wc -l
6. Two commands that do the same thing: copy *.odf files to /backup/ . What would be a reason to replace the first command with the second ? Again, this is an important question.
cp -r /data/*.odf /backup/
find /data -name "*.odf" -exec cp {} /backup/ \;
The first might fail when there are too many files to fit on one command line.
7. Create a file called loctest.txt. Can you find this file with locate ? Why not ? How do you make locate find this file ?
You cannot find this file with locate because it is not yet in the index. Update the index first, then locate will find it:
updatedb
8. Use find and -exec to rename all .htm files to .html.
datasoft @ datasoft-linux ~$ find . -name '*.htm'
./one.htm
./two.htm
datasoft @ datasoft-linux ~$ find . -name '*.htm' -exec mv {} {}l \;
datasoft @ datasoft-linux ~$ find . -name '*.htm*'
./one.html
./two.html
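The {}l form relies on find replacing {} inside a longer argument, which GNU find does but POSIX leaves implementation-defined. A more portable variant, sketched here on two hypothetical files in a scratch directory, passes each path to a small sh -c command instead.

```shell
# Hypothetical scratch directory with two .htm files.
work=$(mktemp -d)
touch "$work/one.htm" "$work/two.htm"

# sh -c receives each path as $1; the trailing "sh" fills $0.
# "${1}l" appends the letter l to the path, turning .htm into .html.
find "$work" -name '*.htm' -exec sh -c 'mv "$1" "${1}l"' sh {} \;

renamed=$(find "$work" -name '*.html' | wc -l)
leftover=$(find "$work" -name '*.htm' | wc -l)
echo "renamed: $renamed, still .htm: $leftover"

rm -rf "$work"
```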
9. Issue the date command. Now display the date in YYYY/MM/DD format.
Code:
date +%Y/%m/%d
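The format string after the + sign can be captured in a variable like any other command output, and the conversion specifiers combine freely. A short sketch (the variable names are arbitrary):

```shell
# %Y is the four-digit year, %m the two-digit month, %d the two-digit day.
today=$(date +%Y/%m/%d)
echo "$today"

# The same specifiers with a different separator give an ISO 8601 style date.
iso=$(date +%Y-%m-%d)
echo "$iso"
```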
10. Issue the cal command. Display a calendar of 1582 and 1752. Notice anything special ?
cal 1582
The calendars are different depending on the country. Check http://linux-training.be/files/studentfiles/dates.txt