Life in the Terminal - Volume 1
The terminal is the best friend of an application developer. There are multiple tools that can be piped together to extract meaningful insights for the problem at hand without writing a single line of code.
Today we will be exploring lsof.
What is it?
lsof is a command-line tool for analyzing open files. The definition of a file in a Unix-based system is very broad: it can represent directories, text files, sockets or streams.
Display Running Files
The simplest, but not very meaningful, exercise that we can do with lsof is listing all open files in a running system:
$ lsof
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
loginwind 143 georgios cwd DIR 1,5 640 2
The above command produces a very long list that is not easily manageable by the human brain.
We can split the list into multiple pages by leveraging the powerful more command:
$ lsof | more
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
loginwind 143 georgios cwd DIR 1,5 640 2
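Paging helps, but a summary is often easier to digest than a long list. The following is a minimal sketch, not part of lsof itself: count_by_command is a hypothetical helper that counts rows per COMMAND, assuming the default output layout with a header on the first line and the command name in the first column. The sample rows are made up so that the sketch runs without a live system.

```shell
# Hypothetical helper: summarize default lsof output read from stdin.
# Assumes line 1 is the header and column 1 is COMMAND.
count_by_command() {
    awk 'NR > 1 { count[$1]++ }
         END { for (cmd in count) print count[cmd], cmd }' |
        sort -k1,1nr -k2,2
}

# Made-up rows in the default lsof layout (not real output).
demo_listing='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
editor 101 demo cwd DIR 1,5 640 2 /demo
editor 101 demo txt REG 1,5 1024 77 /demo/editor
server 202 demo cwd DIR 1,5 672 99 /demo/srv'

# Prints one "count command" line per command, busiest first.
printf '%s\n' "$demo_listing" | count_by_command
```

On a real system the same helper can be fed directly from lsof:
$ lsof | count_by_command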
Understanding Output Columns
The above output contains multiple columns that might not make sense initially. Looking at man lsof, it is straightforward to understand what all these magic numbers and acronyms are about.
COMMAND
This is the executable managing the open files.
PID
This is the ID of the process managing the open files.
USER
This is the user running the process.
FD
This column might be difficult to grasp if you see it for the first time, especially if you have never come across file descriptors before.
FD can be a number or a string describing the file descriptor. It can be one of these values:
cwd: current working directory
Lnn: library references (AIX)
err: FD information error (see NAME column)
jld: jail directory (FreeBSD)
ltx: shared library text (code and data)
Mxx: hex memory-mapped type number xx
m86: DOS Merge mapped file
mem: memory-mapped file
mmap: memory-mapped device
pd: parent directory
rtd: root directory
tr: kernel trace file (OpenBSD)
txt: program text (code and data)
v86: VP/ix mapped file
It can also be a number followed by one or two seemingly cryptic characters:
* The number is the file descriptor itself.
* The character that follows defines the access mode:
r: read
w: write
u: read and write
space: unknown, with no lock character following
-: unknown, with a lock character following
* The next character defines the lock type:
N: Solaris NFS lock of unknown type
r: read lock on part of the file
R: read lock on the entire file
w: write lock on part of the file
W: write lock on the entire file
u: read and write lock of any length
U: lock of unknown type
x: SCO OpenServer Xenix lock on part of the file
X: SCO OpenServer Xenix lock on the entire file
space: no lock
Spoiler alert: It is not very likely that you will need the above.
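To make the encoding concrete, here is a small sketch that splits an FD value such as 171u into its descriptor number and access mode. describe_fd is a hypothetical helper written for this article, not something lsof provides. It handles only a handful of the named values listed above and ignores the lock character.

```shell
# Hypothetical helper: explain a single value from the FD column.
describe_fd() {
    fd=$1
    case $fd in
        cwd) echo "current working directory" ;;
        rtd) echo "root directory" ;;
        txt) echo "program text (code and data)" ;;
        mem) echo "memory-mapped file" ;;
        [0-9]*)
            num=${fd%%[!0-9]*}        # leading digits: the descriptor number
            rest=${fd#"$num"}         # whatever follows the number
            mode=${rest%"${rest#?}"}  # first character after the number
            case $mode in
                r) access="read" ;;
                w) access="write" ;;
                u) access="read and write" ;;
                *) access="unknown" ;;
            esac
            echo "descriptor $num, $access access"
            ;;
        *) echo "other FD value: $fd (see man lsof)" ;;
    esac
}

describe_fd 171u
describe_fd 4r
describe_fd cwd
```

The value 171u used here is the one that shows up for the java process in the port example further down.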
TYPE
In Unix-based systems everything is a file: hard drives, USB sticks, sockets and text files.
TYPE aims to explain what kind of file a listed FD refers to. The list of possible types is long. Use the following hacky command to jump to the section of the lsof documentation that lists all the different types:
man lsof | grep -A 200 " TYPE" | less
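Instead of reading the whole list, it can be more practical to see which types actually occur on your own machine. The sketch below assumes lsof's field output mode (the -F option), where requesting the t field prints each file type on its own line prefixed with the letter t, alongside p (process ID) and f (file descriptor) lines. count_types is a hypothetical helper, and the sample input is made up.

```shell
# Hypothetical helper: count file types in `lsof -F t` style output.
# Only lines starting with "t" carry a type; "p" and "f" lines are skipped.
count_types() {
    awk '/^t/ { count[substr($0, 2)]++ }
         END { for (t in count) print count[t], t }' |
        sort -k1,1nr -k2,2
}

# Made-up field output: one process with five files, another with one.
demo_fields='p143
f0
tCHR
f1
tCHR
f2
tCHR
f3
tREG
f4
tREG
p200
f5
tIPv4'

printf '%s\n' "$demo_fields" | count_types
```

Against a live system this would be:
$ lsof -F t | count_types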
DEVICE
This column shows the device numbers of the device on which the file resides, or a hex address. If the DEVICE value does represent a device, you can run diskutil list on a Mac to get a mapping between the numbers you see under this column and the device they represent.
SIZE/OFF
This is the size of the file or the file offset, in bytes. An offset is shown with a leading 0t (decimal) or 0x (hexadecimal). This value is optional and may be blank.
NODE
Unix-based filesystems use a data structure, the inode, to store filesystem objects. For typical filesystem files, NODE is the number of that object. It can also be the Internet protocol type, e.g. UDP, as well as STR if the open file represents a stream. There are a couple of other node types, but they are not commonly used.
NAME
This is one of the most useful columns for most use cases. It is the name or path of the open file.
Finding Internet-Related Files Using Specific Ports
You should know at this point that many things in Unix-based systems are represented by files. This holds for Internet connections too. In order to find which process has bound a specific port, we can run the following command:
$ lsof -i :8080
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 39734 georgios 171u IPv6 0x20964e8e5af3b6db 0t0 TCP *:http-alt (LISTEN)
The -i option restricts the listing to Internet-related files, and the :8080 suffix narrows it down to that port.
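When the result is going to be consumed by a script, for example to find out what to stop before restarting a server, the field output of lsof is easier to parse than the table. The sketch below assumes the -F pc layout, where a line starting with p carries the process ID and the c line that follows carries the command name. owners_from_fields is a hypothetical helper, and the sample input mirrors the java process from the output above.

```shell
# Hypothetical helper: turn `lsof -F pc` style output into
# "command pid" pairs, one per process.
owners_from_fields() {
    awk '/^p/ { pid = substr($0, 2) }
         /^c/ { print substr($0, 2), pid }' | sort -u
}

# Field-output version of the listing shown above.
demo_port_fields='p39734
cjava
f171'

printf '%s\n' "$demo_port_fields" | owners_from_fields
```

A live invocation could look as follows, assuming the installed lsof supports the -sTCP:LISTEN state filter. The -n and -P options skip host and port name lookups:
$ lsof -nP -iTCP:8080 -sTCP:LISTEN -F pc | owners_from_fields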
Finding Open Files Managed by a Specific Executable
$ lsof -c java | less
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 38866 georgios cwd DIR 1,5 672 8617079685 /Users/georgios/projects/...
This command lists all the open files that have been opened by a specific executable; in this particular example, the executable is java.
Keep in mind that the command above might list open files managed by several different processes that were all started from the java executable.
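Because a single lsof -c listing can mix several processes, it helps to break the rows down per PID. This is a rough sketch under the assumption that the default table layout is used, with COMMAND in the first column and PID in the second. files_per_pid is a hypothetical helper and the sample rows are made up.

```shell
# Hypothetical helper: count open files per process in default lsof output.
# Assumes line 1 is the header, column 1 is COMMAND and column 2 is PID.
files_per_pid() {
    awk 'NR > 1 { count[$2]++; name[$2] = $1 }
         END { for (pid in count) print pid, name[pid], count[pid] }' |
        sort -k1,1n
}

# Made-up rows: two separate java processes (not real output).
demo_java='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 38866 demo cwd DIR 1,5 672 11 /demo/app-one
java 38866 demo txt REG 1,5 2048 12 /demo/bin/java
java 38901 demo cwd DIR 1,5 672 21 /demo/app-two'

# Prints "pid command open-file-count", ordered by PID.
printf '%s\n' "$demo_java" | files_per_pid
```

With a live system:
$ lsof -c java | files_per_pid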
Finding Which Process Manages an Open File
This use case is the reverse of the one above. We list all processes that have open files within a certain directory (/tmp in this example):
$ lsof +D "/tmp"
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ssh-agent 2720 georgios 3u unix 0x20964e8e45f0f84b 0t0 /private/tmp/...
These are some of the typical use cases that I use lsof for. I might come back with more of them, but until then I hope the above explanation is useful enough.