Using ls to Show Directory Size – Updated & Explained
Note – I’ve written about this before. In my opinion, this version is a vast improvement over the earlier one.
The title of this post is not entirely accurate. The post is actually about a script that lists files and directories in the same format as ls -l (the “long listing” format of the ls command), except that it reports the size of each directory including all of the files within it. Strictly speaking, the output is in the format of ls -lhL.
This is functionally very different from the output of ls. Normally ls lists the size of a directory without its contents. This produces a very small number: the amount of disk space taken up by the directory’s own metadata (i.e. the names of the files within the directory). Here is an example:
[bash light="true"]ls -lhL[/bash]
[plain]
total 116K
drwxr-x--- 1 james users  44K 2010-04-22 17:21 movies
drwxr-x--- 1 james users  48K 2010-04-03 22:22 music
drwxr-x--- 1 james users    0 2009-12-11 16:20 photos
drwxr-x--- 1 james users 4.0K 2009-12-13 22:36 data
drwxr-x--- 1 james users  20K 2010-04-27 00:56 tv
[/plain]
And here is an example of the output of the script (note the size of the directories):
[plain]
drwxr-x--- 1 james users 178G 2010-04-22 17:21 movies
drwxr-x--- 1 james users  26G 2010-04-03 22:22 music
drwxr-x--- 1 james users 9.9G 2009-12-11 16:20 photos
drwxr-x--- 1 james users  27G 2009-12-13 22:36 data
drwxr-x--- 1 james users 135G 2010-04-27 00:56 tv
[/plain]
That’s a pretty big difference. Note that if all you’re interested in is the permissions, modification time, and filename, using ls -lh is a much better choice – it’s extremely fast and gives you all the information you need. However, if you want to know how much space the contents of each directory are using, you should use the following script:
[bash gutter="1"]
#!/bin/bash
# script to display the sizes of files and directories in the format of ls -lhL
## note: the delimiter on line 9 (the $'\t' that follows "cut -d") is a tab; a literal tab can also be typed in a terminal with CTRL+v then TAB
for x in *;
do
y="$(echo "$x" | sed -e 's/\[/\\[/g' -e 's/]/\\]/g')"
echo -e \
"$(ls -lL | grep "[0-9]\{2\}:[0-9]\{2\} $y$" | sed 's/[ ][ ]*/ /g' | cut -d ' ' -f 1-4) \
$(du -sh "$x" | cut -d $'\t' -f 1)" \
"$(ls -lL | grep "[0-9]\{2\}:[0-9]\{2\} $y$" | sed 's/[ ][ ]*/ /g' | cut -d ' ' -f 6-20 )";
done \
| column -t \
| sed -e 's/[ ]\([ ][ ]*\)/\1/g' \
| sed -e 's/[ ][ ]*/ /8g' \
| sed 's/\([ ][ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)\([^ ]*[ ]*[^ ]*[ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)/\1\4\2\3\5\8\6\7/'
[/bash]
NOTE – The script may not be displayed entirely correctly in this post: on line 9, the delimiter given to cut -d must be a single TAB character, because du separates the size from the name with a tab. If it is displayed as ordinary spaces on the page, use either the “view source” or “copy to clipboard” buttons (top-right of the script when you hover over it), which will give you the correct character. You can also produce this character in a terminal with CTRL+v followed by TAB.
This script should be named something convenient and saved somewhere in your $PATH. I call the script lsd and save it in “/home/james/bin”.
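If you’re not sure whether your chosen directory is actually in your $PATH, here’s a rough way to check. This is only a sketch: in_path is a helper name I made up for the illustration, and "$HOME/bin" is an assumed location (substitute wherever you saved the script).

```bash
#!/bin/bash
# in_path is a hypothetical helper (not part of lsd): it succeeds if the
# directory given as $1 is listed in $PATH.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed install directory; change this to wherever you saved the script.
target="$HOME/bin"

if in_path "$target"; then
    echo "$target is already in PATH"
else
    echo "add this to your shell startup file: export PATH=\"$target:\$PATH\""
fi
```

Remember that the script also needs to be executable (chmod +x) before the shell will run it by name.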
How the script works
If you’re the kind of person who likes to know what a script does before copying it from the internet and running it, here’s what’s going on in this script. Paragraphs are labeled by the script’s line numbers.
Lines 4-11. This is a for loop that is run on every (non-hidden) file in your current directory (*). $x is the variable that holds one of the directory’s filenames per iteration.
Line 6. $y is a variable that equals $x, except that brackets ("[" and "]") are replaced by escaped brackets ("\[" and "\]"), in order to avoid conflicts with grep’s regular-expression matching.
Lines 7-10. These lines are all part of one long echo command, which arranges the data in the same format as ls -lhL. It echoes the results of three separate command substitutions, separated by spaces.
Line 8. The first of three command substitutions. Gets a long directory listing, greps out the single line that matches $y (the escaped form of $x), collapses runs of spaces into single spaces, and then keeps only the part of the line before the point where the directory size will be displayed (fields 1-4: permissions, link count, owner, and group).
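To see the escaping from line 6 on its own, here is a small sketch. The function name escape_brackets is mine, purely for illustration; the two sed expressions are the same ones the script uses.

```bash
#!/bin/bash
# escape_brackets is a hypothetical helper name used only for this example;
# it applies the same two sed expressions as line 6 of the script.
escape_brackets() {
    printf '%s\n' "$1" | sed -e 's/\[/\\[/g' -e 's/]/\\]/g'
}

# A made-up file name containing brackets.
escape_brackets 'photos [2009]'
```

This prints photos \[2009\], which grep then treats as a literal name rather than as a bracket expression.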
Line 9. The second of three command substitutions. Gets the size of $x using du -sh and then chops off the end of the line, leaving only the size.
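Here is the same idea as a stand-alone sketch. dir_size is a name I invented for the example, and the tab delimiter is spelled $'\t', which is bash’s way of writing a tab without typing one.

```bash
#!/bin/bash
# dir_size is a hypothetical helper name for this illustration.
# du -sh prints "<size><TAB><name>"; cutting on the tab keeps only the size.
dir_size() {
    du -sh "$1" | cut -d $'\t' -f 1
}

# Print the total size of the current directory, e.g. something like 4.0K or 26G.
dir_size .
```

The exact value depends on what is in the directory, so your output will differ.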
Line 10. The third of three command substitutions. Gets a long directory listing, greps out the single line that matches $y, collapses runs of spaces into single spaces, and then keeps only the part of the line after the point where the directory size will be displayed (fields 6-20: date, time, and filename). Script lines 8, 9, and 10 together form a single line of output for each iteration of the for loop.
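The way lines 8-10 splice a row together can be tried out on a single line of text. This is just a sketch: the sample line is borrowed from the first listing in this post, and "178G" is a stand-in for whatever du -sh would report.

```bash
#!/bin/bash
# A sample line in the style of the first listing in this post.
line='drwxr-x---  1 james users   44K 2010-04-22 17:21 movies'

# Collapse runs of spaces to one space, as the script does before cutting.
squeezed="$(printf '%s\n' "$line" | sed 's/[ ][ ]*/ /g')"

# Fields 1-4: permissions, link count, owner, group (line 8 of the script).
head_part="$(printf '%s\n' "$squeezed" | cut -d ' ' -f 1-4)"

# Fields 6-20: date, time, and filename (line 10 of the script).
tail_part="$(printf '%s\n' "$squeezed" | cut -d ' ' -f 6-20)"

# "178G" stands in for the du -sh result from line 9.
echo "$head_part 178G $tail_part"
```

The result is drwxr-x--- 1 james users 178G 2010-04-22 17:21 movies: the original size field (field 5) has been dropped and replaced.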
Lines 12-15. Each of these lines formats the script’s output to match the style of ls -lhL.
Line 12. Aligns the output into columns using column -t.
Line 13. Removes extra spaces according to the following rule: for every group of 2 or more consecutive spaces, subtract one space from that group.
Line 14. Removes extra spaces from the last part of the lines, ensuring that filenames containing spaces do not have multiple consecutive spaces.
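Lines 13 and 14 are easier to follow on a small sample. The sketch below uses two made-up rows shaped roughly like column -t output, including a filename with a space in it (“old stuff”); the function names are mine. Note that combining a number with g in the sed flag (8g) relies on GNU sed behaviour.

```bash
#!/bin/bash
# Two made-up rows, padded the way column -t might pad them. The filename
# "old stuff" has been split into two padded columns.
sample='drwxr-x---  1  james  users  178G  2010-04-22  17:21  movies
drwxr-x---  1  james  users  9.9G  2009-12-11  16:20  old     stuff'

# Hypothetical name; same expression as line 13: every run of two or more
# spaces loses exactly one space.
trim_one_space() {
    sed -e 's/[ ]\([ ][ ]*\)/\1/g'
}

# Hypothetical name; same expression as line 14: from the 8th run of spaces
# onward (i.e. inside the filename), collapse each run to a single space.
squeeze_name_spaces() {
    sed -e 's/[ ][ ]*/ /8g'
}

printf '%s\n' "$sample" | trim_one_space | squeeze_name_spaces
```

After both steps the second row ends in “old stuff” with a single space, while the first seven separators on each row are left for line 15 to deal with.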
Line 15. Swaps certain groups of spaces around in order to match the justification style of ls -lhL. Specifically, user and group names are justified left, while file sizes are justified right.
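To watch line 15 move the padding around, here is a sketch that feeds it one made-up row in which the size is still left-justified (the padding sits after “26G”). justify_right is my own name for the wrapper; the sed expression is copied from the script.

```bash
#!/bin/bash
# justify_right is a hypothetical name; the sed expression is line 15 of the
# script. Groups 4 and 8 are the padding after the link count and after the
# size; the replacement moves each one in front of its value.
justify_right() {
    sed 's/\([ ][ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)\([^ ]*[ ]*[^ ]*[ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)/\1\4\2\3\5\8\6\7/'
}

# Made-up input: one extra space of padding follows the size.
printf '%s\n' 'drwxr-x--- 1 james users 26G  2010-04-03 22:22 music' | justify_right
```

The extra space moves from after the size to before it, giving drwxr-x--- 1 james users  26G 2010-04-03 22:22 music.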
Please leave a comment if you’ve tried this script and found any bugs in it.