The Linux Virtual File-system Layer

Neil Brown neilb@cse.unsw.edu.au and others.

29 December 1999 - v1.6


The Linux operating system supports multiple file-systems, including ext2 (the Second Extended file-system), nfs (the Network File-system), FAT (the MS-DOS File Allocation Table file-system), and others. To enable the upper levels of the kernel to deal equally with all of these and other file-systems, Linux defines an abstract layer, known as the Virtual File-system, or vfs. Each lower level file-system must present an interface which conforms to this Virtual File-system. This document describes the vfs interface (as present in Linux 2.3.29). NOTE: this document is incomplete.


1. Introduction

2. Objects and Methods

3. Registering and Mounting a file-system

4. The Super-Block and its operations

5. The File and its Operations

6. Names, or dentries

7. Inodes and Operations

8. Locking

9. Credits

10. Scribbled notes

1. Introduction

This document describes the internals of one of the fundamental Linux kernel subsystems: the Virtual File-system Layer, also known as the VFS switch. This subsystem corresponds to the "vnode/vfs layer" found in commercial UNIX flavours, such as those based on the SVR4/SVR5 code base, e.g. SCO UnixWare.

All references to the C source code files are given relative to the /usr/src/linux directory. All header files are relative to the /usr/src/linux/include directory.

2. Objects and Methods

The Virtual File-system interface is structured around a number of generic object types, and a number of methods which can be called on these objects.

The basic objects known to the VFS layer are files, file-systems, inodes, and names for inodes.

2.1 Files

Files are things that can be read from or written to. They can also be mapped into memory, and sometimes a list of file names can be read from them. They map very closely to the Unix concept of a file descriptor. Files are represented within Linux by a struct file which has a number of methods stored in a struct file_operations.

2.2 Inodes

An inode represents a basic object within a file-system. It can be a regular file, a directory, a symbolic link, or a few other things. The VFS does not make a strong distinction between different sorts of objects, but leaves it to the actual file-system implementation to provide appropriate behaviours, and to the higher levels of the kernel to treat different objects differently.

Each inode is represented by a struct inode which has a number of methods stored in a struct inode_operations.

It may seem that Files and Inodes are very similar. They are, but there are some important differences. One thing to note is that some things have inodes but never have files. A good example of this is a symbolic link. Conversely, there are files which do not have inodes, particularly pipes (though not named pipes) and sockets (though not UNIX domain sockets).

Also, a File has state information that an inode does not have, particularly a position, which indicates where in the file the next read or write will be performed.
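The split between the two objects can be sketched in user-space C. This is only an illustration under stated assumptions: toy_inode, toy_file, toy_file_operations, toy_read and toy_open are hypothetical names invented here, not the kernel's definitions, and the inode simply holds its data in a small array. The point is that the position lives in the file, so two files opened on the same inode advance independently, and that the read method is reached through a table of function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins; not the kernel's definitions. */
struct toy_inode {
    char   data[64];   /* contents of the object */
    size_t size;       /* number of valid bytes in data */
};

struct toy_file;

/* Method table, in the spirit of struct file_operations. */
struct toy_file_operations {
    size_t (*read)(struct toy_file *f, char *buf, size_t count);
};

struct toy_file {
    struct toy_inode                 *inode; /* object this file has open */
    size_t                            pos;   /* where the next read starts */
    const struct toy_file_operations *ops;   /* methods for this file */
};

/* Copy up to count bytes starting at f->pos, then advance f->pos. */
static size_t toy_read(struct toy_file *f, char *buf, size_t count)
{
    size_t left = f->inode->size - f->pos;

    if (count > left)
        count = left;
    memcpy(buf, f->inode->data + f->pos, count);
    f->pos += count;
    return count;
}

static const struct toy_file_operations toy_fops = { .read = toy_read };

/* Open a new file on an inode; every open gets its own position. */
static struct toy_file toy_open(struct toy_inode *inode)
{
    struct toy_file f = { .inode = inode, .pos = 0, .ops = &toy_fops };
    return f;
}
```

Opening the same toy_inode twice with toy_open and reading five bytes through the first file leaves the second file's pos at zero, which is the state an inode alone could not carry.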

2.3 File-systems

A file-system is a collection of inodes with one distinguished inode known as the root. Other inodes are accessed by starting at the root and looking up a file name to get to another inode.

A file-system has a number of characteristics which apply uniformly to all inodes within the file-system. Some of these are flags such as the READ-ONLY flag. Another important one is the blocksize. I'm not entirely sure why this is needed globally.

Each file-system is represented by a struct super_block, and has a number of methods stored in a struct super_operations.

There is a strong correlation within Linux between super-blocks (and hence file-systems) and device numbers. Each file-system must (appear to) have a unique device on which the file-system resides. Some file-systems (such as nfs and proc) are marked as not needing a real device. For these, an anonymous device, with a major number of 0, is automatically assigned.
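How an anonymous device might be handed out can be sketched as follows. The sketch assumes a device number made of an 8-bit major and an 8-bit minor, and assumes that minor 0 is never given out so that device number 0 can mean "no device"; both are assumptions of this illustration, and every name in it (TOY_MKDEV, toy_get_anon_dev, toy_put_anon_dev and so on) is hypothetical rather than taken from the kernel source.

```c
#include <assert.h>

/* Assumed layout for this sketch: 8-bit major, 8-bit minor. */
#define TOY_MINOR_BITS 8
#define TOY_MAX_MINORS (1u << TOY_MINOR_BITS)
#define TOY_MKDEV(major, minor) (((major) << TOY_MINOR_BITS) | (minor))
#define TOY_MAJOR(dev) ((dev) >> TOY_MINOR_BITS)
#define TOY_MINOR(dev) ((dev) & (TOY_MAX_MINORS - 1))

/* One flag per minor number under major 0. */
static unsigned char toy_minor_in_use[TOY_MAX_MINORS];

/* Return an unused device number with major 0, or 0 if none is left. */
static unsigned int toy_get_anon_dev(void)
{
    unsigned int minor;

    /* Minor 0 is skipped so that device 0 can mean "no device". */
    for (minor = 1; minor < TOY_MAX_MINORS; minor++) {
        if (!toy_minor_in_use[minor]) {
            toy_minor_in_use[minor] = 1;
            return TOY_MKDEV(0u, minor);
        }
    }
    return 0;
}

/* Give an anonymous device number back when the file-system goes away. */
static void toy_put_anon_dev(unsigned int dev)
{
    if (TOY_MAJOR(dev) == 0)
        toy_minor_in_use[TOY_MINOR(dev)] = 0;
}
```

Two file-systems that need no real device would each call toy_get_anon_dev and receive distinct numbers, which keeps the "one unique device per file-system" rule intact.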

As well as knowing about file-systems, the Linux VFS knows about different file-system types. Each type of file-system is represented in Linux by a struct file_system_type. This contains just one method, read_super, which instantiates a super_block to represent a given file-system.
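The relationship between a file-system type and the super-blocks it produces can be sketched like this. The structures are hypothetical and heavily simplified: the read_super signature used here is not the kernel's, and toy_register_filesystem, toy_find_filesystem, toy_mount and the "examplefs" type are names invented for the illustration. The sketch keeps the known types in a linked list and lets the chosen type fill in a super-block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct toy_super_block {
    unsigned int  dev;        /* device the file-system resides on */
    unsigned long blocksize;  /* applies to every inode in the file-system */
    int           read_only;  /* file-system-wide flag */
};

struct toy_file_system_type {
    const char *name;
    /* Fill in sb for dev; return sb on success, NULL on failure. */
    struct toy_super_block *(*read_super)(struct toy_super_block *sb,
                                          unsigned int dev);
    struct toy_file_system_type *next;  /* link in the list of known types */
};

static struct toy_file_system_type *toy_fs_types;  /* head of the list */

static struct toy_file_system_type *toy_find_filesystem(const char *name)
{
    struct toy_file_system_type *t;

    for (t = toy_fs_types; t != NULL; t = t->next)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Add a type to the list; returns -1 if the name is already taken. */
static int toy_register_filesystem(struct toy_file_system_type *fs)
{
    if (toy_find_filesystem(fs->name) != NULL)
        return -1;
    fs->next = toy_fs_types;
    toy_fs_types = fs;
    return 0;
}

/* Look the type up by name and let it instantiate the super-block. */
static struct toy_super_block *toy_mount(const char *type, unsigned int dev,
                                         struct toy_super_block *sb)
{
    struct toy_file_system_type *t = toy_find_filesystem(type);

    return t != NULL ? t->read_super(sb, dev) : NULL;
}

/* An example file-system type with a fixed block size. */
static struct toy_super_block *examplefs_read_super(struct toy_super_block *sb,
                                                    unsigned int dev)
{
    sb->dev = dev;
    sb->blocksize = 1024;
    sb->read_only = 0;
    return sb;
}

static struct toy_file_system_type examplefs_type = {
    .name = "examplefs",
    .read_super = examplefs_read_super,
};
```

The upper layer in this sketch never needs to know what examplefs does internally; it only needs the type's name and its single method.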

2.4 Names

All inodes within a file-system are accessed by name. As the name-to-inode lookup process may be expensive for some file-systems, Linux's VFS layer maintains a cache of currently active and recently used names. This cache is referred to as the dcache.

The dcache is structured in memory as a tree. Each node in the tree corresponds to an inode in a given directory with a given name. An inode can be associated with more than one node in the tree.

While the dcache is not a complete copy of the file tree, it is a proper prefix of that tree (if that is a correct usage of the term). This means that if any node of the file tree is in the cache, then every ancestor of that node is also in the cache.
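One way such a prefix property could be maintained is for every cached name to hold a reference on its parent, so that a parent cannot leave the cache while a child is still present. The sketch below illustrates that idea only; cache_inode, cache_dentry, cache_d_add, cache_can_evict and cache_d_drop are hypothetical names, and the reference counting is an assumption of this illustration rather than a description of the kernel's code. It also shows two names referring to one inode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this illustration only. */
struct cache_inode {
    unsigned long ino;
};

struct cache_dentry {
    const char          *name;
    struct cache_dentry *parent;  /* NULL for the root */
    struct cache_inode  *inode;   /* several names may share one inode */
    int                  count;   /* number of cached children */
};

/* Enter a name into the cache under parent; the child pins its parent. */
static void cache_d_add(struct cache_dentry *child, struct cache_dentry *parent,
                        const char *name, struct cache_inode *inode)
{
    child->name = name;
    child->parent = parent;
    child->inode = inode;
    child->count = 0;
    if (parent != NULL)
        parent->count++;
}

/* A name may leave the cache only when no cached child depends on it. */
static int cache_can_evict(const struct cache_dentry *d)
{
    return d->count == 0;
}

/* Remove a name from the cache; returns -1 if children still need it. */
static int cache_d_drop(struct cache_dentry *d)
{
    if (!cache_can_evict(d))
        return -1;
    if (d->parent != NULL)
        d->parent->count--;
    d->parent = NULL;
    return 0;
}
```

Because a directory's count only reaches zero after all of its cached children have been dropped, any name that is in the cache is guaranteed to have all of its ancestors there too.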

Each node in the tree is represented by a struct dentry which has a number of methods stored in a struct dentry_operations.

The dentries act as an intermediary between Files and Inodes. Each file points to the dentry that it has open. Each dentry points to the inode that it references. This implies that for every open file, the dentry of that file and the dentries of all of its parents are cached in memory. This allows the full path name of every open file to be easily determined, as can be seen from doing:


# ls -l /proc/self/fd
total 0
lrwx------   1 root     root           64 Nov 23 07:51 0 -> /dev/pts/2
lrwx------   1 root     root           64 Nov 23 07:51 1 -> /dev/pts/2
lrwx------   1 root     root           64 Nov 23 07:51 2 -> /dev/pts/2
lr-x------   1 root     root           64 Nov 23 07:51 3 -> /proc/15588/fd/
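Recovering a path name from cached names amounts to walking parent pointers up to the root. A minimal sketch of that walk follows, assuming each node stores only its own name and a pointer to its parent; struct path_node and toy_d_path are hypothetical names made up for this illustration. The path is assembled from the end of the buffer towards the front, because the walk visits the last component first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached name; not the kernel's dentry. */
struct path_node {
    const char             *name;
    const struct path_node *parent;  /* NULL for the root */
};

/*
 * Build the absolute path of d at the end of buf, walking towards the
 * root. Returns a pointer to the start of the path inside buf, or NULL
 * if buf is too small.
 */
static char *toy_d_path(const struct path_node *d, char *buf, size_t buflen)
{
    char *end;

    if (buflen < 2)
        return NULL;
    end = buf + buflen;
    *--end = '\0';

    if (d->parent == NULL) {        /* the root itself is just "/" */
        *--end = '/';
        return end;
    }
    while (d->parent != NULL) {
        size_t len = strlen(d->name);

        if ((size_t)(end - buf) < len + 1)
            return NULL;            /* no room for "/" plus the name */
        end -= len;
        memcpy(end, d->name, len);  /* prepend the component */
        *--end = '/';               /* and its separator */
        d = d->parent;
    }
    return end;
}
```

The walk can only succeed because every ancestor of an open file's name is held in memory; if any parent were missing, the chain of parent pointers would be broken.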


