WTF is EMFILE and why does it happen to me

By request

The error code EMFILE means that a process is trying to open too many files. Unix systems cap the number of file descriptors that may be assigned to a single process, a limit also known as the OPEN_MAX value. On OS X, the default is 256, which is pretty low for many modern programs that do a lot of file system writing and reading.
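
To see the error first-hand, here is a minimal sketch that keeps opening the same file until the OS says no. The helper name openUntilLimit and its cap argument are made up for this illustration; only fs.openSync and fs.closeSync are real Node APIs.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper: open `path` over and over until the OS refuses
// or until `cap` descriptors are held, then close everything again.
function openUntilLimit(path, cap) {
  const fds = [];
  let errorCode = null;
  try {
    while (fds.length < cap) {
      fds.push(fs.openSync(path, 'r'));
    }
  } catch (err) {
    // Typically 'EMFILE' once the per-process limit is reached.
    errorCode = err.code;
  } finally {
    // Always hand the descriptors back, whether or not we hit the limit.
    fds.forEach((fd) => fs.closeSync(fd));
  }
  return { opened: fds.length, errorCode };
}
```

Calling openUntilLimit(__filename, 100000) on a machine with the 256 default should come back with errorCode set to 'EMFILE' and an opened count somewhat below 256, since stdin, stdout, stderr, and Node’s own internal descriptors count against the limit too.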

This max value can be read or modified using the ulimit -n command. Since I’ve bumped up the open files limit to 2560 on my system, here’s what my laptop reports when I list every limit with ulimit -a:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
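
A Node program can check the limit it is running under in the same way. This is a rough sketch that assumes a Unix-like system where /bin/sh has ulimit as a builtin; openFilesLimit is a hypothetical helper name, not a Node API.

```javascript
'use strict';
const { execSync } = require('child_process');

// Hypothetical helper: ask a child shell for its soft open-files limit.
// The child inherits its limits from this process, so the answer applies to us too.
function openFilesLimit() {
  const out = execSync('ulimit -n', { encoding: 'utf8' }).trim();
  return out === 'unlimited' ? Infinity : Number(out);
}

console.log('open files limit:', openFilesLimit());
```

On the laptop above, this would be expected to report 2560.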

These ulimit values are important! You don’t want a runaway program to open way too many files and take up unnecessary resources on your system by accident.

npm, being a package manager, opens a lot of files, often more than 256 at once. To get around this limitation, a program has two options:

  1. Always be very careful to not open too many files.
  2. Handle EMFILE errors by queuing the open operation, and then attempting it again once a file closes.
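
Option 2 can be sketched roughly like this. It is an illustration of the queue-and-retry idea only, not graceful-fs’s real code: createQueuedOpener, its backend parameter, and the pending() counter are all hypothetical names, and backend is assumed to expose fs-style open(path, flags, cb) and close(fd, cb).

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper: wraps open/close so that opens failing with EMFILE
// wait in a queue until a descriptor is released through close().
function createQueuedOpener(backend = fs) {
  const waiting = []; // jobs parked after EMFILE, oldest first

  function attempt(job, isRetry) {
    backend.open(job.path, job.flags, (err, fd) => {
      if (err && err.code === 'EMFILE') {
        // Out of descriptors: park the job instead of failing the caller.
        // A retried job goes back to the front so it keeps its place in line.
        if (isRetry) waiting.unshift(job);
        else waiting.push(job);
        return;
      }
      job.cb(err, fd); // success, or an error this sketch does not handle
    });
  }

  function open(path, flags, cb) {
    attempt({ path, flags, cb }, false);
  }

  function close(fd, cb) {
    backend.close(fd, (err) => {
      // A descriptor should be free now, so retry the oldest parked open.
      const job = waiting.shift();
      if (job) attempt(job, true);
      cb(err);
    });
  }

  return { open, close, pending: () => waiting.length };
}
```

Note the weak spot: the queue only drains when a file is closed through this same wrapper. If all the descriptors are held by code that calls fs directly, a parked open waits forever.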

The only way to reliably do #2, however, is by monkey-patching Node’s fs module, which is exactly what the graceful-fs module does. A really interesting collision of bugs in npm, graceful-fs, and lockfile meant that certain open operations were not covered by this handling, so you could easily get into cases where npm could not reasonably recover. Basically, npm would open a lockfile to reserve a specific tar operation, and then not have any file descriptors left to actually do the tar unpack operation! Also, graceful-fs was not actually monkey-patching with a queue, but instead trying to do some fancy clever back-off stuff, which just wasn’t as solid.
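
To make the monkey-patching idea concrete, here is a minimal sketch of swapping out fs.open at runtime. patchOpen and its wrap callback are hypothetical names for this illustration, and it shows only the patching mechanics; it is not how graceful-fs is actually written.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper: replace target.open with a wrapper and return an undo function.
// `wrap` receives the original open (bound to its module) plus the caller's arguments.
function patchOpen(wrap, target = fs) {
  const original = target.open;
  target.open = function patchedOpen(...args) {
    return wrap(original.bind(target), args);
  };
  return function restore() {
    target.open = original;
  };
}

// Example wrapper: log every open, then pass the call straight through.
// A queueing patch would decide here whether to run the call now or park it.
const restore = patchOpen((open, args) => {
  console.log('open requested for', args[0]);
  return open(...args);
});
restore(); // put the real fs.open back
```

The point of patching the module itself is that every caller that looks up fs.open afterwards goes through the wrapper, including third-party code you never touched. Anything that grabbed its own reference to fs.open before the patch was applied keeps calling the original.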

Graceful-fs 2.0 and lockfile 0.4 contain the fixes for their respective parts of this flub up. The latest version of npm 1.3 has all the fixes.

At this point, no matter HOW small your ulimit -n value is, graceful-fs will prevent an EMFILE error from ever being raised. Of course, it does this at the expense of making open operations potentially take longer. I’m planning on exploring a slightly smarter monkey-patch that will only enqueue open operations that have some kind of special flag or other opt-in switch.