Life in the Terminal - Volume 1
The terminal is the best friend of an application developer. Many tools can be piped together to extract meaningful insights about the problem at hand without writing a single line of code.
Today we will be exploring lsof.
LSOF
What is it?
lsof is a command-line tool for analyzing open files. The definition of a file in a Unix-based system is very broad: it can represent directories, text files, sockets, or streams.
Display Open Files
The simplest, though not very meaningful, exercise we can do with lsof is listing all open files in a running system:
$ lsof
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
loginwind 143 georgios cwd DIR 1,5 640 2
The above command produces a very long list that is not easily manageable by a human reader.
We can split the list into pages by piping it into the more command:
$ lsof | more
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
loginwind 143 georgios cwd DIR 1,5 640 2
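Paging helps with reading, but a summary is often more useful than the raw list. The following is a minimal sketch of the "pipe tools together" idea from the introduction: it counts open files per command. The function name count_open_files_by_command is hypothetical, and the sketch assumes the default lsof layout shown above, with a header line and COMMAND in the first column.

```shell
#!/bin/sh
# Count open files per command from lsof output read on stdin.
# Assumes the default lsof layout: one header line, COMMAND in column 1.
count_open_files_by_command() {
    awk 'NR > 1 { count[$1]++ }
         END { for (cmd in count) print count[cmd], cmd }' | sort -rn
}

# Usage (assumes lsof is installed):
#   lsof | count_open_files_by_command | head
```

Each output line holds a count followed by a command name, with the busiest commands first.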
Understanding Output Columns
The above output contains multiple columns that might not make sense initially.
Looking at man lsof, it is straightforward to understand what all these magic numbers and acronyms mean.
- COMMAND
This is the executable managing the open files
- PID
This is the id of the process managing the open files
- USER
This is the user running the current process
- FD
This column might be difficult to grasp the first time you see it, especially if you have never come across file descriptors before. FD can be a number or a string describing the file descriptor. It can be one of these values:
cwd: current working directory
Lnn: library references (AIX)
err: FD information error (see NAME column)
jld: jail directory (FreeBSD)
ltx: shared library text (code and data)
Mxx: hex memory-mapped type number xx
m86: DOS Merge mapped file
mem: memory-mapped file
mmap: memory-mapped device
pd: parent directory
rtd: root directory
tr: kernel trace file (OpenBSD)
txt: program text (code and data)
v86: VP/ix mapped file
It can also be a number followed by up to two characters:
* The number is the file descriptor.
* The character that follows defines the access mode:
r: read
w: write
u: read and write
space: unknown, with no lock character following
-: unknown, with a lock character following
* The next character defines the lock type:
N: Solaris NFS lock of unknown type
r: read lock on part of the file
R: a read lock on the entire file
w: write lock on part of the file
W: write lock on the entire file
u: read and write lock of any length
U: lock of unknown type
x: SCO OpenServer Xenix lock on part of the file
X: SCO OpenServer Xenix lock on the entire file
space: if there is no lock.
Spoiler alert: It is not very likely that you will need the above.
- TYPE
In Unix-based systems, everything is a file: hard drives, USB sticks, sockets, and text files. TYPE explains what kind of file a listed FD refers to. The list of possible values is long. Use the following hacky command to jump to the section of the lsof documentation that lists the different types:
man lsof | grep -A 200 " TYPE" | less
- DEVICE
This column shows the device numbers (separated by a comma) of the device that holds the file, or a kernel address in hex. If the DEVICE value does represent a device, you can run diskutil list on a Mac to map the numbers in this column to the device they represent.
- SIZE/OFF
This is the size of the file or file offset in bytes. This is an optional value.
- NODE
Unix-based filesystems use a data structure, the inode, to store filesystem objects. For typical filesystem files, NODE is the number of that object. It can also be the internet protocol type, e.g. TCP or UDP, as well as STR if the open file represents a stream. There are a few other node types, but they are not commonly used.
- NAME
This is one of the most useful columns for most use cases. It is the name or path of the open file.
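To make the FD column described above more concrete, here is a small sketch that splits a value such as 171u into its descriptor number and access mode. The function name describe_fd is hypothetical. The sketch ignores the optional lock character and treats u as read and write access, which is how the lsof manual describes it.

```shell
#!/bin/sh
# Describe an lsof FD value such as "171u", "3r" or "cwd".
# Sketch only: the optional lock character after the mode is ignored.
describe_fd() {
    fd=$1
    case $fd in
        [0-9]*) ;;                                   # numeric descriptor
        *) printf 'special: %s\n' "$fd"; return ;;   # cwd, txt, mem, ...
    esac
    number=${fd%%[!0-9]*}    # leading digits: the descriptor number
    suffix=${fd#"$number"}   # whatever follows: mode, then optional lock
    case $suffix in
        r*) access='read' ;;
        w*) access='write' ;;
        u*) access='read and write' ;;
        *)  access='unknown' ;;
    esac
    printf 'descriptor %s, access: %s\n' "$number" "$access"
}

# Usage:
#   describe_fd 171u
#   describe_fd cwd
```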
Finding Internet-Related Files Using Specific Ports
By this point you should know that many things in Unix-based systems are represented by files. This holds for internet connections too. To find which process is using a specific port, we can run the following command:
$ lsof -i :8080
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 39734 georgios 171u IPv6 0x20964e8e5af3b6db 0t0 TCP *:http-alt (LISTEN)
Here, -i restricts the listing to internet-related files, and :8080 narrows it to that port.
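A common follow-up is to extract just the process id of whatever is listening on the port, for example to pass it to another tool. The sketch below filters lsof -i output for lines ending in (LISTEN) and prints the PID column. The function name listening_pids is hypothetical, and the sketch assumes the default column layout shown above.

```shell
#!/bin/sh
# Print the unique PIDs of listening sockets from `lsof -i` output on stdin.
# Assumes the default layout: header line, PID in column 2, and the
# connection state, e.g. (LISTEN), as the last field of the line.
listening_pids() {
    awk 'NR > 1 && $NF == "(LISTEN)" { print $2 }' | sort -un
}

# Usage (assumes lsof is installed):
#   lsof -i :8080 | listening_pids
```

If only the PIDs are needed and the connection state does not matter, lsof also has a -t option that prints process ids without a header, as in lsof -t -i :8080.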
Finding Open Files Managed by a Specific Executable
$ lsof -c java | less
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 38866 georgios cwd DIR 1,5 672 8617079685 /Users/georgios/projects/...
This command lists all the files that have been opened by a specific executable; in this example, the executable is java.
Keep in mind that the command above might list open files managed by different processes that were all started from the java executable.
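Since several processes can share the same executable, it can help to break the listing down by process. The following sketch counts open files per PID. The function name open_files_per_pid is hypothetical, and the sketch assumes the default layout with PID in the second column.

```shell
#!/bin/sh
# Count open files per process id from lsof output read on stdin.
# Assumes the default lsof layout: one header line, PID in column 2.
open_files_per_pid() {
    awk 'NR > 1 { count[$2]++ }
         END { for (pid in count) print pid, count[pid] }' | sort -n
}

# Usage (assumes lsof is installed):
#   lsof -c java | open_files_per_pid
```

Each output line holds a PID followed by the number of open files that process holds.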
Finding Which Process Manages an Open File
This use case is the reverse of the one above: we list all processes that have open files within a certain directory and its subdirectories (+D):
$ lsof +D "/tmp"
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ssh-agent 2720 georgios 3u unix 0x20964e8e45f0f84b 0t0 /private/tmp/...
These are some of the typical use cases that I use lsof for. I might come back with more of them, but until then I hope the above explanation is useful enough.