Large Directory Causes ls to "Hang"

So you have a directory with millions of files, and ls just hangs?

Use ls -1 -f to show the files immediately. To delete them, if you really do want to remove ALL files in the current directory, use something like

ls -1 -f | xargs rm
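One wrinkle: ls -f includes the “.” and “..” entries, which rm will complain about, and plain xargs splits on whitespace. A slightly more careful sketch, assuming GNU grep and xargs and file names without spaces, demonstrated here in a throwaway mktemp directory so nothing real is harmed:

```shell
# Work in a scratch directory for the demonstration.
cd "$(mktemp -d)"
touch test_file_a_1 test_file_a_2 test_file_a_3

# ls -f includes "." and ".."; grep -vx filters out exactly those two names.
# xargs -r (a GNU extension) skips running rm if the list is empty.
ls -1 -f | grep -vx '\.\.\?' | xargs -r rm --
```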

After cleaning up very many unwanted files, you are likely to be left with a huge, sparse directory object. Three million files in one directory, for example, apart from taking up space themselves, will likely push the directory object itself to over 100 MB.

You might decide to recreate the directory to reclaim that 100 MB. But if it is /tmp, then do so with great care, and only in single-user mode.
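As a sketch of that rebuild: the "spool" name and the surviving file below are made up, and the demonstration runs under mktemp so it is safe to try. On a real directory, make sure nothing is using it first.

```shell
# Rebuild a bloated directory object. "spool" is a hypothetical example;
# everything happens under a throwaway mktemp directory.
base="$(mktemp -d)"
mkdir "$base/spool"
touch "$base/spool/survivor"          # stands in for the few files kept

mv "$base/spool" "$base/spool.old"    # set the bloated directory aside
mkdir "$base/spool"                   # a fresh, compact directory object
mv "$base/spool.old"/* "$base/spool"/ # carry over the remaining files
rmdir "$base/spool.old"               # the oversized object is gone
```

The glob in the second mv assumes only a manageable number of files remain after the clean-up; on a real system you would also copy the old directory's ownership and permissions across.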

Why the ls Command “Hangs”

The “ls” command, by default, sorts its output. To do that, it must first slurp the name of every file into memory. Confronted with a very large directory, it sits there reading in file names, taking up more and more memory, until it eventually lists all the files at once, in alphanumeric order.

On the other hand, ls -1 -f does not perform any sorting. It just reads the directory and displays files immediately.

An example. Below we have a test directory containing 3 million files. They have names like test_file_a_1, test_file_a_2 and so on, all the way up to test_file_a_3000000. A Perl script created the files, in the order given by the number in the file name.
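The original files were made with a Perl script; a shell sketch of the same idea, scaled down to 1,000 files and run in a throwaway mktemp directory, might look like:

```shell
# Create numbered empty test files, in creation-number order.
cd "$(mktemp -d)"
for i in $(seq 1 1000); do
    : > "test_file_a_$i"    # ":" is a no-op; the redirection creates the file
done
```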

ls -f -1 can be used to list the first few file names, and it returns immediately:

bash-4.2$ time ls -1 -f | head
.
..
test_file_a_2531963
test_file_a_467778
test_file_a_2677947
test_file_a_329896
test_file_a_835701
test_file_a_1266060
test_file_a_261887
test_file_a_311007

real 0m0.006s
user 0m0.000s
sys 0m0.008s

Now remove the -1 and -f flags, and the ls command takes about 10,000 times longer to run:

bash-4.2$ time /bin/ls | head
test_file_a_1
test_file_a_10
test_file_a_100
test_file_a_1000
test_file_a_10000
test_file_a_100000
test_file_a_1000000
test_file_a_1000001
test_file_a_1000002
test_file_a_1000003

real 0m57.880s
user 0m55.644s
sys 0m2.121s

As well as being much slower, the second form of the command takes a whole lot of memory. By the stage where it is actually printing file names, the plain ls command has stored 3 million of them, ballooning its memory usage to (in this case) 507 MB. The ls -1 -f process, on the other hand, never grows beyond 4.5 MB, over 100 times less memory.

An Ext3/Ext4 Bug?

The time and resource taken to walk an Ext3/4 file system is sometimes called a file system bug. Personally, I would say that if there is a bug anywhere, it is in the walking software (e.g. find, ls without the “-1 -f” flags, various backup programs) rather than in the file system code.

Order, Order

There is something else a little odd about those tests above. ls has properly sorted the file names into alphanumeric order. But what sort of order was returned by ls -1 -f? It seems pretty random. The first three files are test_file_a_2531963, test_file_a_467778 and test_file_a_2677947. That does not correspond to the order in which the files were created or updated, or to the inode numbers, or anything else. What is going on?

This is an ext4 file system. ext4 (and ext3) file systems use an “HTree” hashing scheme to index file names, and entries come back in effectively arbitrary hash order when the directory is read. ls -1 -f therefore shows the files in this “raw” directory order, making the results look mixed up and unsorted.

Ext2 file systems sometimes behave like Ext3 and Ext4 in this respect (e.g. on Fedora 16), and sometimes do not (e.g. on Red Hat 4), instead returning files in the order they were created. Ditto Solaris UFS.

Hanging Process? Spy on it.

As an alternative to ls -1 -f, or where the command is not available, strace (Linux) or truss (Solaris) can be used to spy on a process and winkle out useful information. I first saw this ingenious manoeuvre when my colleague Neil Dixon used it on an “ls” process which was hanging on a massive /tmp directory. In one terminal:

cd verybigdir
ls -l >/dev/null

while that’s chugging away, get the PID of the ls process and spy on it from another terminal window, revealing the file names as ls stats them all:

[root@saturn ~]# strace -p 3963 2>&1 | grep lstat
lstat64("test_file_a_2433028", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
lstat64("test_file_a_2047256", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
lstat64("test_file_a_1201573", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0

Thus you can see the file names immediately. If you want to delete them, pump that strace output into an appropriate AWK.
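For example, a sketch of such an AWK (the filter name extract_and_rm is made up; it assumes strace quotes the path as the first double-quoted field on each lstat line, as in the output above):

```shell
# Pull the quoted file name out of each lstat line and delete that file.
# Typical use: strace -p <PID> 2>&1 | extract_and_rm
extract_and_rm() {
    awk -F'"' '/lstat/ { print $2 }' | xargs -r rm --
}
```

Obviously test on expendable data first; a mis-parsed line would hand the wrong name to rm.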

I guess the same approach could be used to interrogate other processes. Has anyone used it to, for example, check how far through a backup process has run, or a large find/cpio job, or even just a big cp?

5 thoughts on “Large Directory Causes ls to ‘Hang’”

  1. I needed to take 1 million files and make one big file containing all lines from each one of those 1 million files.

    This post helped me tremendously.

    This is the command that ended up working for me:
    ls -1 -f | xargs -n 1 cat >> ../bigfile.txt

  2. Thank You!!!!
    To delete millions of files and folders fast and without a high load, this is it:
    ls -1 -f | xargs rm -rfv
    Before that, I tested with
    find -delete (doesn’t work, too many files in this case)
    for i in *; do rm… (works about 1000 times slower)

    Deleting 100000 files and 1000 dirs took 191 seconds (3 min 11 s) with “for i in *”, and just 6 seconds with ls -1 -f | xargs rm -rfv.

    Thanks a lot
