Filing it away

File systems organize the data stored on computer hard drives, keeping track of the physical locations of all data elements on disk while allowing users to quickly and reliably retrieve files when needed.

The file system acts as a digital index that lets a computer instantly find a specific file, regardless of the size or configuration of the storage drive or where the data bytes associated with the file sit on the drive’s storage platters.

Every operating system, from MS-DOS to Windows 95, Windows XP and Linux, has its own file system. But although all file systems perform the same basic functions, they vary in design and sophistication.


File systems have come a long way since MS-DOS and early versions of Windows. Those operating systems organized files under the FAT file system, which represents logical areas of the disk in allocation units called clusters, and maps the locations of file data to those areas using a file allocation table (FAT). FAT is also called FAT16 because it uses a 16-bit address space for tracking files and clusters.
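As a rough illustration of how a file allocation table maps a file to its clusters, the sketch below models the table as a lookup in which each entry names the next cluster of the same file. The names `cluster_chain` and `END_OF_CHAIN` are hypothetical, and the end marker is a simplification; real FAT16 reserves a range of values for this purpose.

```python
# Toy model of a file allocation table: fat[n] holds the number of the
# cluster that follows cluster n in the same file, or END_OF_CHAIN.
END_OF_CHAIN = -1  # hypothetical marker, not the real on-disk value


def cluster_chain(fat, first_cluster):
    """Return the clusters occupied by a file, in order."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("corrupt table: chain loops back on itself")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# A file that starts in cluster 2 and is scattered across the disk.
fat = {2: 5, 5: 6, 6: 9, 9: END_OF_CHAIN}
print(cluster_chain(fat, 2))  # [2, 5, 6, 9]
```

The directory entry for a file only needs to record the first cluster; the table supplies the rest of the chain.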

FAT cluster sizes vary with the size of the disk. FAT’s 16-bit address space can support up to 65,536 (2^16) clusters. On a 65MB disk, clusters were just 1KB, but they ballooned as disks emerged that could hold gigabytes of data. And since a cluster can hold data from only one file, even a tiny file occupies a full cluster. This inefficiency could waste as much as 50 percent of the available space on a 2GB disk drive.
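The arithmetic behind that growth can be sketched as follows. This is a simplified model, assuming power-of-two cluster sizes starting at one 512-byte sector and the full 65,536 clusters; real FAT16 has slightly fewer usable clusters. The function names are made up for the example.

```python
MAX_FAT16_CLUSTERS = 2 ** 16  # 65,536


def min_cluster_size(volume_bytes, max_clusters=MAX_FAT16_CLUSTERS):
    """Smallest power-of-two cluster size that lets max_clusters cover the volume."""
    size = 512  # assumption: the smallest cluster is one 512-byte sector
    while size * max_clusters < volume_bytes:
        size *= 2
    return size


def slack(file_bytes, cluster_bytes):
    """Bytes allocated to a file but not actually used by it."""
    clusters = -(-file_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes - file_bytes


MB = 1024 ** 2
GB = 1024 ** 3
print(min_cluster_size(64 * MB))              # 1024
print(min_cluster_size(2 * GB))               # 32768
print(slack(1000, min_cluster_size(2 * GB)))  # 31768
```

Under these assumptions, a 1,000-byte file on a 2GB volume ties up a full 32KB cluster, which is where the wasted space comes from.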


FAT32, which debuted with Windows 95 OEM Service Release 2 (OSR2), introduced a 32-bit address space. By increasing the size of the file allocation table, it could support more clusters that were smaller in size on large disk drives, reducing the potential for wasted drive space.

FAT32 volumes could also handle file names with up to 255 characters, a capability that had arrived with the original release of Windows 95. Earlier FAT implementations were limited to eight-character names plus a three-character extension. Users could finally create long file names to better describe the contents.

The advent of FAT32 extended the maximum addressable volume size from 2GB to 2TB and improved reliability by allowing the system to switch to a copy of the file allocation table if the default copy should become damaged. But FAT32 also added to file system overhead and was therefore inefficient to run on disks smaller than 260MB.


The next development in Windows file systems was the New Technology File System (NTFS), introduced with Windows NT (which also supported FAT; FAT32 support followed in Windows 2000). With a 64-bit address space and the ability to vary cluster size independently of the disk drive size, NTFS virtually eliminated the cluster size limitation problem.

It also brought other benefits, including file and directory security attributes, file encryption and support for storage volumes of up to 16TB and 2^32 clusters.

NTFS replaced the familiar file allocation table format with the Master File Table (MFT), which holds more information about files than did FAT. The MFT references all files and directories on the disk drive, including associated metadata such as security settings.

NTFS also introduced a high level of fault tolerance. It logs each disk operation before committing the transaction. If the system crashes during an update, it can examine the log file and restore the data. When read or write errors occur during normal operation, NTFS automatically identifies and blocks out the bad clusters and copies the data to a new location. Finally, NTFS creates a mirror of the MFT and can revert to the mirror should the original fail.
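The log-before-commit idea can be illustrated with a toy model. The sketch below is not NTFS's actual log format or recovery procedure; it is a minimal redo-only journal over an in-memory dictionary, and the class and parameter names are invented for the example.

```python
class JournaledStore:
    """Toy 'disk' that logs each change before applying it."""

    def __init__(self):
        self.data = {}      # stands in for the on-disk structures
        self.journal = []   # stands in for the log file

    def write(self, key, value, crash_before_apply=False):
        # 1. Record the intended change in the log first.
        self.journal.append((key, value))
        if crash_before_apply:
            return  # simulate a crash: logged but never applied
        # 2. Apply the change, then retire the log entry.
        self.data[key] = value
        self.journal.pop()

    def recover(self):
        """Replay any logged changes that were never applied."""
        for key, value in self.journal:
            self.data[key] = value
        self.journal.clear()


store = JournaledStore()
store.write("a.txt", "v1")
store.write("b.txt", "v2", crash_before_apply=True)
store.recover()
print(store.data)  # {'a.txt': 'v1', 'b.txt': 'v2'}
```

Because the intent is written down before the change is made, an interrupted update can be completed from the log after a restart instead of leaving the structures half-modified.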

NTFS’s overhead makes it unsuitable for disks smaller than 400MB, and it can’t be used on floppy disks. Instead, Windows writes to diskettes formatted with FAT.


The Linux file system, called the Second Extended File System (Ext2), evolved to rectify limitations of its predecessor, the Extended File System (Ext). Ext in turn had replaced the Minix file system that Linux originally used, under which the maximum file system size was restricted to 64MB and file names to 14 characters.

Ext supported 2GB file systems and 255-character file names but suffered from some performance limitations. Ext2 supports 4TB file systems and 255-character file names and remedies those problems.

The Ext2 architecture uses data structures called index nodes (inodes) to refer to and locate files and associated data. Each inode records the file type, size, access rights, pointers to associated data blocks and other attributes. The file system organizes disk space into groups of blocks, which contain both inode information and associated data blocks.
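A minimal sketch of the idea follows, assuming a tiny 8-byte block size so the example stays readable. The field names and the `read_file` helper are illustrative only and do not reflect the real Ext2 on-disk layout, which also uses indirect block pointers.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8  # assumed, unrealistically small block size for the example


@dataclass
class Inode:
    file_type: str                # e.g. "regular" or "directory"
    size: int                     # file length in bytes
    mode: int                     # access rights
    block_pointers: list = field(default_factory=list)  # data block numbers


def read_file(inode, blocks):
    """Follow the inode's block pointers and return the file's bytes."""
    data = b"".join(blocks[n] for n in inode.block_pointers)
    return data[:inode.size]  # drop the padding in the final block


# Two data blocks stored out of order on the "disk".
blocks = {7: b"file sys", 3: b"tems\0\0\0\0"}
inode = Inode("regular", 12, 0o644, [7, 3])
print(read_file(inode, blocks))  # b'file systems'
```

Note that the inode holds no file name; in this model, as in Ext2, names live in directory entries that point to inodes.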

The Linux kernel uses the Virtual File System layer, which interacts with the file system to perform disk I/O. This gives Linux the ability to support multiple file systems, including DOS, FAT16 and FAT32 (which it supports as a native file system).
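The dispatching role of such a layer can be sketched in a few lines. This is a conceptual model only, not the kernel's actual interface: the class names, the `read` method and the mount points are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Common interface that every concrete file system implements."""

    @abstractmethod
    def read(self, path):
        ...


class Ext2(FileSystem):
    def read(self, path):
        return f"ext2 read of {path}"


class Fat32(FileSystem):
    def read(self, path):
        return f"fat32 read of {path}"


class VirtualFileSystem:
    """Routes each request to whichever file system owns the path."""

    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point, fs):
        self.mounts[mount_point] = fs

    def read(self, path):
        # Choose the longest mount point that is a prefix of the path.
        candidates = [
            m for m in self.mounts
            if path == m or path.startswith(m.rstrip("/") + "/")
        ]
        if not candidates:
            raise FileNotFoundError(path)
        return self.mounts[max(candidates, key=len)].read(path)


vfs = VirtualFileSystem()
vfs.mount("/", Ext2())
vfs.mount("/mnt/usb", Fat32())
print(vfs.read("/home/notes.txt"))     # ext2 read of /home/notes.txt
print(vfs.read("/mnt/usb/photo.jpg"))  # fat32 read of /mnt/usb/photo.jpg
```

Callers use one interface regardless of which file system sits underneath, which is what makes adding support for another file system a matter of supplying one more implementation.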