I've summarized the role of each directory in a Linux system, based on the FHS layout introduced in the book Linux Programming for Everyone.
Directory Structure#
Linux follows the Filesystem Hierarchy Standard (FHS), the standard specification for the Linux directory tree.
/
The root of the tree, the root directory. Without this, Linux can't even boot.
/bin
user command binaries
This is where executable files (commands) are stored. /bin contains the basic system commands needed during boot, while /usr/bin contains the other commands for general users. Many modern distributions no longer distinguish between the two. Both /bin and /usr/bin are managed by the distribution's package system, so you shouldn't place your own files there. Commands you install yourself should ideally go in /usr/local/bin or a similar location.
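On systems that no longer separate the two, /bin is typically just a symbolic link to /usr/bin. Here is a minimal Python sketch for checking this. The function name `is_merged_bin` and its `root` parameter are my own illustrative choices, not part of any standard tool.

```python
import os


def is_merged_bin(root="/"):
    """Return True if <root>/bin is a symlink that resolves to <root>/usr/bin."""
    bin_path = os.path.join(root, "bin")
    usr_bin = os.path.join(root, "usr", "bin")
    if not os.path.islink(bin_path):
        return False
    # realpath follows the link, so relative targets like "usr/bin" also work.
    return os.path.realpath(bin_path) == os.path.realpath(usr_bin)


if __name__ == "__main__":
    print("/bin merged into /usr/bin:", is_merged_bin())
```

The `root` parameter exists only so the check can be tried against a test directory instead of the real filesystem.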
/sbin
system binaries
This is where administrative commands are stored, including those needed during boot. /usr/sbin contains system management commands and server programs used during normal operation.
/lib
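/lib and /lib64 contain shared libraries, including the essential ones needed by the commands in /bin and /sbin. Libraries for C, Python, Ruby, etc. are found here or under /usr/lib.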
/usr
This stores files that can be shared across multiple computers. Most of the system's programs and data live under this directory. The name usr originally meant "user" (early Unix kept users' home directories here); expansions such as "User Services and Routines" are later reinterpretations. Files that can't be shared are stored in /var.
/usr/src
This stores the source code of system commands and the Linux kernel source code. You should not store your own program source code here.
/usr/include
This contains the system header files. Kernel header files are stored in /usr/include/linux. In Unix-based operating systems, kernel header files are stored under /usr/include/sys, but Linux has a slightly different structure because the kernel and libc are maintained by different people.
/usr/share
This contains files that don't depend on the architecture (CPU type), so they can be shared across different architectures. Documentation such as man pages and info files is stored here.
/usr/share/man
Depending on the distribution, man pages may be stored in /usr/man, but according to the FHS they belong in /usr/share/man. The directory is organized as man1, man2, man3, ..., where the trailing number is the section number, so section 1 is in /usr/share/man/man1. Each folder holds the pages for its section, and filenames follow the format 'document_name.section' (for example, sed.1).
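The 'document_name.section' naming can be taken apart with a small parser. This is only a sketch: the function name `parse_man_filename` is made up for illustration, and the optional compression suffix (.gz and similar) is an assumption based on how many distributions ship their pages.

```python
import re

# "name.section" with an optional compression suffix, e.g. "sed.1.gz".
# Only sections starting with a digit 1-9 are handled in this sketch.
_MAN_FILE = re.compile(
    r"^(?P<name>.+)\.(?P<section>[1-9][a-z0-9]*)(?:\.(?:gz|bz2|xz|zst))?$"
)


def parse_man_filename(filename):
    """Split a man page filename into (document_name, section)."""
    m = _MAN_FILE.match(filename)
    if m is None:
        raise ValueError(f"not a man page filename: {filename!r}")
    return m.group("name"), m.group("section")


if __name__ == "__main__":
    print(parse_man_filename("sed.1"))
    print(parse_man_filename("printf.3.gz"))
```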
/usr/local
Similar to /usr, it has directories like bin, sbin, lib, share, etc. The difference is who manages the files. /usr is managed by the distribution, while /usr/local is the responsibility of each system's administrator (user).
/var
This is where variable data files are gathered: files that change frequently while the system runs. Because this data is specific to each host, it isn't suitable for sharing across multiple computers. When using Linux as a server, this directory is very important because logs and mailboxes are located here.
/var/log
Log files written by server processes are stored here. Log files are files where events generated by programs are recorded.
/var/spool
User mail (/var/spool/mail) and print jobs (/var/spool/cups) are temporarily stored here.
SPOOL (Simultaneous Peripheral Operation On-Line) is a technique designed to reduce wait times caused by the slower processing speed of peripheral devices compared to the CPU, by allowing the CPU and I/O devices to operate independently.
/var/run
Process IDs of running server processes are stored here, in files called PID files. When writing a server, it's good etiquette to write the process ID to this folder on startup and remove the file on shutdown. You can check a server's PID by looking at the files ending in .pid in this folder. On many recent distributions, /var/run is a symbolic link to /run.
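The write-on-startup, remove-on-shutdown etiquette could be sketched in Python as a context manager. The name `pid_file` is hypothetical, and the path is a parameter because writing under /var/run usually requires root privileges.

```python
import os
from contextlib import contextmanager


@contextmanager
def pid_file(path):
    """Write the current PID to `path` on entry and remove the file on exit."""
    with open(path, "w") as f:
        f.write(f"{os.getpid()}\n")
    try:
        yield path
    finally:
        # Clean up even if the server loop raised an exception.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
```

A server would wrap its main loop in something like `with pid_file("/var/run/myserver.pid"):`, where myserver.pid is a made-up filename.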
/etc
Short for etcetera, this is where system configuration files are stored.
/dev
Short for devices. Placing all types of device files here was the traditional Unix approach. Starting with Linux 2.4, a Device File System (devfs) was introduced that creates device files only for devices that exist on the system. Starting with Linux 2.6, a mechanism called udev was introduced.
The reason new structures were introduced is that the kernel began supporting too many hardware devices. devfs is implemented as part of the kernel, while udev is implemented outside the kernel.
/proc
Short for process. This is typically where the Process File System (procfs) is mounted. The process file system literally represents processes as a file system. If you want to get information about the process with PID 1, you can look at the /proc/1 folder. However, these days it's more common to check using the ps command.
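Because procfs exposes processes as ordinary files, a program can read them directly. The sketch below parses the key/value lines of /proc/PID/status. The function names are my own, and `read_proc_status` assumes a Linux system with procfs mounted at /proc.

```python
def parse_proc_status(text):
    """Parse the 'Key: value' lines of a /proc/<pid>/status file into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_proc_status(pid):
    """Read and parse /proc/<pid>/status (Linux only, procfs must be mounted)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())


if __name__ == "__main__":
    # "self" always refers to the process doing the reading.
    print(read_proc_status("self").get("Name"))
```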
/sys
As more and more information unrelated to processes accumulated in procfs, a new file system called sysfs was added in Linux 2.6 to provide system-related information separately. /sys is the folder where sysfs is mounted. Here you can get information about the devices and device drivers that exist on the system.
/boot
The Linux kernel is stored here, in a file called vmlinuz. The name traces back to BSD: when virtual memory support was added, the kernel file "unix" became "vmunix." Linux followed with "vmlinux," and because the kernel image is compressed, the ending became "z," giving vmlinuz. The kernel file was originally stored directly in the root directory, but it is now stored here.
/root
The superuser's home folder is /root. In the past, the root directory (/) itself served as the superuser's home folder.
/tmp, /var/tmp
Sometimes you need to create files somewhere temporarily, and these folders are for that purpose. They differ in how long files are kept: files in /tmp may be deleted on reboot, while files in /var/tmp are preserved across reboots. For example, vi's recovery files are saved in /var/tmp.
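In a program, the choice between the two folders comes down to which directory you hand to the temp-file API. Here is a small Python sketch. The function name `make_scratch_file` and its parameters are illustrative, and it assumes /var/tmp exists and is writable when `survive_reboot=True`.

```python
import os
import tempfile


def make_scratch_file(survive_reboot=False, prefix="demo-"):
    """Create an empty temporary file and return its path.

    survive_reboot=True targets /var/tmp; otherwise the default temp
    directory is used (usually /tmp, or $TMPDIR if it is set).
    """
    directory = "/var/tmp" if survive_reboot else tempfile.gettempdir()
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    os.close(fd)  # the caller reopens the file by path as needed
    return path
```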
/home
Regular users' home folders are placed here, and the current user's home folder is available through the environment variable $HOME. In large organizations, the directories are sometimes numbered, like /home1, /home2.
Since home folder paths can differ depending on the situation, it's safer to use a dedicated API.
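One such API is the password database lookup (getpwuid in C), which Python exposes through the standard `pwd` module. The following is a minimal sketch. The function name `home_dir` is my own, and the fallback to $HOME when no database entry exists is an assumption of this sketch rather than standard behavior.

```python
import os
import pwd


def home_dir():
    """Return the current user's home directory from the password database."""
    try:
        return pwd.getpwuid(os.getuid()).pw_dir
    except KeyError:
        # No passwd entry for this UID (can happen in some containers).
        # Falling back to $HOME is a choice made for this sketch.
        return os.environ.get("HOME", "/")


if __name__ == "__main__":
    print(home_dir())
```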
Criteria for Distinguishing Folders#
| Perspective | Yes | No |
|---|---|---|
| Shared across multiple hosts | /usr | /var |
| Operated as read-only | /usr | /var |
| Architecture-dependent | /usr/lib | /usr/share |
| Managed by the distributor | /usr | /usr/local |
| Persists after reboot | /var/tmp | /tmp |
Other important perspectives include:
- Whether backups are needed (whether it changes, whether recovery is needed)
- Whether it's needed per user or just one per system
- Whether permissions need to be separated
- Whether it would be convenient to specify with glob patterns in the shell
Command Reference#
A builtin is a shell's built-in command.
man#
In a terminal like bash or zsh, typing man man will show you a detailed explanation.
man shows the manual page for the command you look up. At 42 Seoul, when taking exams or working on assignments, you can type man command_name to find information. When reading about a function through man, you'll see a SEE ALSO section near the end that lists related functions. The number in parentheses next to each function name is the manual section number.
Manual sections
The sections are numbered 1 through 8, so the section number tells you what kind of entry a page documents.
- 1: User Commands = general commands
- 2: System Calls = system calls
- 3: C Library Functions = C standard library functions
- 4: Devices and Special Files = special files (usually device files found in /dev) and drivers
- 5: File Formats and Conventions = file formats and conventions
- 6: Games et al. = games and screensavers
- 7: Miscellanea = miscellaneous
- 8: System Administration tools and Daemons = system administration commands and daemons
Section 1 is usually what you'll look at the most, and if you're doing assignments at 42 Seoul, you'll frequently check section 3 as well. Nowadays you can search on Google and find well-organized articles in various languages, but I think it's sometimes nice to read the explanations through this command.
man options
I didn't find any incredibly useful options, but I'll jot down a few I learned about.
With the -w option, e.g., man -w sed, man prints the path to the manual file for that command instead of displaying the page.
If you open that file in vim and compare it with the output of man sed, you can see how the page is written and examine the document format.
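To get that path from a script, something like the following Python sketch could work. It simply runs man -w and reads its output. The function name `man_page_path` is hypothetical, and the sketch returns None when man is not installed or the page is not found.

```python
import shutil
import subprocess


def man_page_path(name):
    """Return the path printed by `man -w <name>`, or None if unavailable."""
    if shutil.which("man") is None:
        return None  # man is not installed on this system
    result = subprocess.run(
        ["man", "-w", name], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None  # no manual entry for this name
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(man_page_path("sed"))
```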
With the -k option, e.g., man -k variables, man shows a list of manual pages whose name or short description contains that keyword.
This is most effective when you remember a specific keyword or phrase. If you pass a command name as the keyword, you'll also get every entry that merely contains those letters, which isn't very useful. e.g., man -k sed also matches "used"
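The "sed" versus "used" problem is ordinary substring matching. The Python sketch below illustrates the difference between substring and whole-word matching on two sample description lines. It is not how man itself is implemented, and the sample entries and function names are illustrative only.

```python
import re

# Illustrative one-line descriptions in the style of `man -k` output.
SAMPLE_ENTRIES = [
    "sed (1) - stream editor for filtering and transforming text",
    "free (1) - display amount of free and used memory in the system",
]


def substring_matches(keyword, entries):
    """Keep entries that contain `keyword` anywhere, even inside other words."""
    return [e for e in entries if keyword in e]


def whole_word_matches(keyword, entries):
    """Keep entries where `keyword` appears as a separate word."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b")
    return [e for e in entries if pattern.search(e)]


if __name__ == "__main__":
    print(len(substring_matches("sed", SAMPLE_ENTRIES)))
    print(len(whole_word_matches("sed", SAMPLE_ENTRIES)))
```

With substring matching, the entry for free is picked up only because its description contains "used".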
Normally, if you use man without a section number, it shows the first matching page it finds, which for commands is usually the one in section 1. If you include a number, e.g., man 1 sed, it looks up the name in that specific section. Names are kept unique across sections as much as possible, but some duplicates are unavoidable (printf, for example, is both a command in section 1 and a C function in section 3), which is why you can specify the section.