Dealing with large numbers of files in Unix

Most of the time, you can move a bunch of files from one folder to another by running a simple mv command like “mv sourcedir/* destdir/”. The problem is that the shell expands that asterisk before mv ever runs, and each file in the directory is added as a separate command line argument. If sourcedir contains a lot of files, the argument list can exceed the size limit the system places on a single command, resulting in a mysterious “Argument list too long” error.
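If you're curious how big that limit is on your own box, getconf can report it. This is just a quick sketch; the variable name arg_max is made up for the example, and the number you get will vary from system to system.

```shell
# Ask the system for ARG_MAX: the maximum size, in bytes, of the
# arguments plus environment that can be handed to a new program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"
```

A directory with a million file names blows past a limit like this easily, since every name (plus the directory prefix) counts toward the total.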

I ran into this problem recently while trying to manage a directory that had over a million files in it. It’s not every day you run across a directory that contains a metric crap-ton of files, but when the problem arises, there’s an easy way to deal with it. The trick is to use the handy xargs program, which is designed to read a big list from stdin and hand it out as arguments to another command:

find sourcedir -type f -print | xargs -l1 -i mv {} destdir/

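The -l1 tells xargs to read one line of input at a time and pass it to mv. The -i parameter tells xargs to replace the {} with that input. This command will execute mv once for each file in the directory. Ideally, you would optimize this and specify something like -l50, sending mv 50 files at a time to move. This is how I remember xargs working on other Unix systems, but the GNU xargs that I have on my Linux box forces the number of arguments to 1 any time the -i is invoked. Either way, it gets the job done. Two caveats: the GNU man page lists the lowercase -l and -i as deprecated in favor of the capitalized -L and -I, and file names that contain quotes, backslashes or newlines can trip up this pipeline, because xargs gives those characters special treatment by default.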
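If your file names might contain odd characters, a common variation is to have find separate names with NUL bytes (-print0) and tell xargs to expect that (-0). Here's a minimal sketch of the idea, assuming a find and xargs that support those two options (both the GNU and BSD versions do). The scratch directory and file names are invented for the demo.

```shell
# Build a throwaway sandbox so the demo can't touch real files.
demo=$(mktemp -d)
mkdir "$demo/sourcedir" "$demo/destdir"
touch "$demo/sourcedir/plain.txt" "$demo/sourcedir/has space.txt"

# -print0 ends each name with a NUL byte; -0 makes xargs split on NUL,
# so spaces and quotes in names pass through untouched.
# -I{} runs mv once per file, replacing {} with the file name.
find "$demo/sourcedir" -type f -print0 | xargs -0 -I{} mv {} "$demo/destdir/"

ls "$demo/destdir"
```

This still runs mv once per file, just like the original command, so it's no faster. It's only more forgiving about what the files are called.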

Without the -i, the -l parameter will work in Linux, but you can no longer use the {} substitution, and all of the file names are placed at the end of the command. That’s no help when the command needs a fixed final argument, such as the destination directory for mv. On the other hand, it’s exactly what you want for commands that end with the list of files, such as when you are batch removing files with rm.
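As an illustration of the rm case, here's a rough sketch that deletes files in batches of up to 50. It uses -n (maximum arguments per command) rather than -l, plus the NUL-separated form so odd file names don't cause trouble. The sandbox directory and the .log naming are assumptions made up for the example.

```shell
# Throwaway sandbox with a few files to delete and one to keep.
demo=$(mktemp -d)
mkdir "$demo/sourcedir"
touch "$demo/sourcedir/old1.log" "$demo/sourcedir/old2.log" "$demo/sourcedir/keep.txt"

# No {} needed: xargs appends up to 50 file names to the end of each rm call.
find "$demo/sourcedir" -type f -name '*.log' -print0 | xargs -0 -n 50 rm -f

ls "$demo/sourcedir"
```

With a million files, that's roughly 20,000 rm invocations instead of a million, which makes a noticeable difference.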

Oddly enough, in OS X the parameters for xargs are a bit different and capitalized. Going by its man page, the BSD xargs that ships with OS X still runs the command once per input line when you use -I, but the good news is that it adds a -J option, which substitutes a whole batch of arguments for the placeholder at once. To move a bunch of files in OS X, 50 files at a time, try the following:

find sourcedir -type f -print | xargs -L50 -J{} mv {} destdir/
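If you'd rather not depend on which flavor of xargs you have, one workaround is to let xargs append the file names to a tiny sh -c script that puts the destination last. This is a sketch under a few assumptions rather than the one true way to do it: the sandbox paths are invented, and the little inline script (dest, shift, mv) is just one way to arrange the arguments.

```shell
# Throwaway sandbox for the demo.
demo=$(mktemp -d)
mkdir "$demo/sourcedir" "$demo/destdir"
touch "$demo/sourcedir/one.txt" "$demo/sourcedir/two words.txt" "$demo/sourcedir/three.txt"

# xargs appends up to 50 file names after the fixed arguments.
# Inside the script: $0 is "sh", $1 is the destination, the rest are files.
# "shift" drops the destination so "$@" holds only the files to move.
find "$demo/sourcedir" -type f -print0 |
  xargs -0 -n 50 sh -c 'dest=$1; shift; mv "$@" "$dest"' sh "$demo/destdir/"

ls "$demo/destdir"
```

Each mv call here handles up to 50 files, and nothing in the pipeline is specific to Linux or OS X.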

That’s about all there is to it. This is just a basic example, but once you get used to using xargs and find together, it’s pretty easy to tweak the find parameters and move files around based on their date, permissions or file extension.
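To give a flavor of that kind of tweaking, here's a hedged sketch that moves only .jpg files last modified more than 30 days ago. The sandbox, the file names and the 30-day cutoff are all made up for the example; touch -t is used just to fake an old modification date.

```shell
# Throwaway sandbox for the demo.
demo=$(mktemp -d)
mkdir "$demo/sourcedir" "$demo/destdir"

# One old .jpg (backdated to January 2000), one fresh .jpg, one old .txt.
touch -t 200001010000 "$demo/sourcedir/old.jpg"
touch "$demo/sourcedir/new.jpg"
touch -t 200001010000 "$demo/sourcedir/old.txt"

# -name filters by extension, -mtime +30 keeps files modified more than 30 days ago.
find "$demo/sourcedir" -type f -name '*.jpg' -mtime +30 -print0 |
  xargs -0 -I{} mv {} "$demo/destdir/"

ls "$demo/destdir"
```

Only old.jpg should end up in destdir. Swapping in other find tests, such as -perm for permissions, follows the same pattern.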
