Why Study the File Allocation Table (FAT) System
When it comes to organizing a general-purpose file system, the system used by DOS in the '80s and '90s is, in many ways, the "old standby". We study it for its simplicity, its historical significance, and its tendency to be deployed on emerging, low-capacity media devices. These days, we'll often find it on USB flash drives and storage cards.
Why does it continue to be reborn? As we'll see, although inefficient for large volumes, it is very simple to implement. And, because it has been around for years, it is well understood, accepted, and accessible within almost all operating environments.
Data Structures Composing FAT File Systems
There are three important data structures within a FAT file system: the boot sector, the File Allocation Table (FAT), and the directory entry. The boot sector contains the metadata that describes the particulars of the file system, as well as, in the case of a bootable partition, the code that gets the boot-strapping started. The FATs keep track of which sectors of disk are associated with each file. And, the directory entries maintain the directory tree structure, by organizing the directories as lists of directory entries, which contain the name of the file or directory, a pointer into the FATs, and some metadata about the file, such as when it was created.
The File Allocation Tables (FATs)
As we've discussed, the fundamental unit of storage on a disk is the sector. But, there are many, many sectors. Keeping track of them individually would impose a hefty cost in terms of both time and storage. As a result, most file systems simplify the problem by grouping adjacent sectors together and managing these groups. FAT-based systems call these groups clusters.
Because files can grow at any time, even after other files have been created, they can be composed of non-adjacent clusters. The result is that logically adjacent bytes might not be adjacent on the physical media. Because files can be deleted, and compaction is an expensive operation that isn't automatic, the set of physical clusters in use might be fragmented, which in turn fragments the free clusters. The upshot here is that when moving linearly through a single file, one can be bouncing around the physical media, incurring penalties for the seeks.
In general, the larger the cluster size, the lower the storage efficiency of the file system. This is because we cannot create a file or file tail that is smaller than a cluster. So, the space left over within the cluster, what we call slack space, is essentially wasted space. For this reason, it is usually desirable to have the smallest cluster size possible, as this reduces this type of wastage.
Having said this, as we'll see as we learn more about the FAT table, seeking through the clusters of a file is a linear search. So, if one knows that the file system is going to be used to store mostly large files, e.g. long videos, one might intentionally choose to use fewer, larger clusters.
FAT-based systems maintain the list of clusters associated with each file through the use of a redundant FAT table. The FAT table has one entry for each and every cluster on the media. The number of entries within the FAT table depends on the version of the FAT file system in use. In the days when media was small, FAT tables were indexed with 12-bit numbers. But, as media grew, the FAT tables did, too. FAT16 replaced FAT12. And, eventually FAT32 replaced FAT16. Interestingly enough, FAT32 systems only use 28-bit indexes.
Since the number of entries in the FAT table is fixed by the version of the file system in use, in general, the cluster size varies with the capacity of the media. Basically, the cluster size is equal to the number of sectors on the media, divided by the number of FAT table entries (rounded up). This provides the smallest cluster size possible, reducing the wastage. But, as mentioned earlier, there are occasions where we might want to have fewer, larger clusters. And, we can accommodate this by configuring the file system such that the FAT table uses fewer than the maximum number of entries.
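The division described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the function name and the sample media sizes are our own, and real formatting tools typically round the result up to a power of two and set aside a few entries for special values.

```python
def sectors_per_cluster(total_sectors: int, fat_entries: int) -> int:
    """Smallest cluster size, in sectors, that lets fat_entries cover the media."""
    # Ceiling division: round up so that every sector belongs to some cluster.
    return max(1, -(-total_sectors // fat_entries))


# Hypothetical media: a 2880-sector floppy with 12-bit entries, and a
# 131072-sector (64 MiB at 512 bytes/sector) card with 16-bit entries.
floppy = sectors_per_cluster(2880, 2 ** 12)
card = sectors_per_cluster(131072, 2 ** 16)
```

Passing a smaller value for fat_entries than the maximum yields the "fewer, larger clusters" configuration.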
There are a couple of special entries at the beginning of the FAT table. But, other than these, each entry is just a pointer to another entry, allowing them to be configured into lists. The smallest possible list contains no FAT entries. The largest possible list contains each and every entry. And, multiple lists can be constructed, by having several chains. As you might expect, there is a sentinel value that is used to mark the end of a chain. The "head pointers" are stored within each directory entry. In other words, when we look up a file and find its directory entry, it gives us the first cluster of the file. From there, we can consult this entry in the FAT to find the second cluster, and so on.
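Walking a chain can be sketched as follows. This is a minimal illustration, assuming the FAT has already been decoded into a Python list of 16-bit integers; the names cluster_chain and END_OF_CHAIN_MIN and the toy table are ours, not part of any real driver.

```python
END_OF_CHAIN_MIN = 0xFFF8  # assumption: 16-bit entries; values from here up end a chain


def cluster_chain(fat, first_cluster):
    """Return the clusters of one file, in order, starting from its head pointer."""
    if first_cluster == 0:  # an empty file has no clusters at all
        return []
    chain = []
    cluster = first_cluster
    while cluster < END_OF_CHAIN_MIN:
        # Guard against corrupt tables: bad indexes and chains that loop forever.
        if cluster < 2 or cluster >= len(fat) or len(chain) >= len(fat):
            raise ValueError(f"corrupt chain at cluster {cluster:#x}")
        chain.append(cluster)
        cluster = fat[cluster]  # each entry points at the next cluster
    return chain


# Toy FAT: entries 0 and 1 are special; one file occupies clusters 2 -> 5 -> 3.
toy_fat = [0xFFF8, 0xFFFF, 5, 0xFFFF, 0, 3, 0, 0]
clusters = cluster_chain(toy_fat, 2)
```

Note that reaching the n-th cluster of a file means following n pointers, which is the linear search mentioned earlier.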
In addition to a sentinel value for the last cluster in a file, there are special sentinel values that indicate empty clusters (not part of any file's list), and bad clusters. A cluster is marked bad if the underlying device reports that a sector within it is bad. So, if a hard drive is able to relocate a bad sector using a spare sector, this will be transparent to the FAT file system and will not result in the cluster being marked bad. So, clusters get marked bad when the underlying device (a) lacks the capacity to hide bad sectors, or (b) has exhausted that capacity, e.g. already consumed all spare sectors.
FAT12 is a little odd in that 12 bits is not a whole number of bytes. So, some entries span two bytes. Given this, any group of three bytes (24 bits) contains two 12-bit entries. Be careful in reading documentation here. Various folks report bizarre things about how the three bytes are split into the two entries. These folks are ignoring endianness. If we have the hex digits abcdef and interpret these as three bytes, we group them as ab cd ef. But, because the values are little-endian, the least significant digits come first, so from least to most significant the digits read ba dc fe. So, when we view them as two three-digit (12-bit) values, we get dab and efc.
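The unpacking is easier to trust once it is written down. Here is a small Python sketch, assuming the raw FAT is available as a bytes object; the function name fat12_entry is ours.

```python
def fat12_entry(fat: bytes, index: int) -> int:
    """Extract the 12-bit entry number `index` from a packed FAT12 table."""
    offset = index + index // 2  # 1.5 bytes per entry, rounded down
    # Read the two bytes that contain the entry as one little-endian 16-bit value.
    pair = fat[offset] | (fat[offset + 1] << 8)
    if index % 2 == 0:
        return pair & 0x0FFF  # even entry: the low 12 bits
    return pair >> 4          # odd entry: the high 12 bits


# The three bytes ab cd ef from the text hold the two entries dab and efc.
packed = bytes([0xAB, 0xCD, 0xEF])
first, second = fat12_entry(packed, 0), fat12_entry(packed, 1)
```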
With respect to the sentinel values, all-0 is free, e.g. 0000. The end of the chain is all Fs on Microsoft-based systems, but actually can be anything from xff8 to xfff, and Linux has usually used xff8, e.g. fff8 in a 16-bit entry. xff7 is a bad cluster. And, xff0-xff6 are not used. What is meant by "x"? It stands for the leading bits, which are all 1s (hex f). How many of them there are depends on whether it is a 12-bit, 16-bit, or 32-bit FAT entry.
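These ranges can be written as a small classifier. The Python sketch below is illustrative only: the function name and the returned labels are ours, and we assume the caller passes the number of significant bits, which is 28 rather than 32 for FAT32, as noted earlier.

```python
def classify_entry(entry: int, bits: int) -> str:
    """Classify a FAT entry; bits is 12, 16, or 28 (FAT32 uses 28 of its 32 bits)."""
    all_ones = (1 << bits) - 1       # the "x...fff" pattern for this entry width
    entry &= all_ones                # ignore anything above the significant bits
    if entry == 0:
        return "free"
    if entry >= all_ones - 0x7:      # xff8 through xfff
        return "end of chain"
    if entry == all_ones - 0x8:      # xff7
        return "bad"
    if entry >= all_ones - 0xF:      # xff0 through xff6
        return "not used"
    return "next cluster"


kinds = [classify_entry(value, 16) for value in (0x0000, 0x0005, 0xFFF7, 0xFFFF)]
```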
Because the first two cluster numbers are reserved, and data clusters are numbered starting from 2, the first two entries of the FAT are not needed to represent data storage. The first byte of the first entry stores a redundant copy of the "media descriptor", which is also contained within the boot sector, and described there. The rest of the first entry is all 1 bits, e.g. hex Fs. The low-order bits of the second entry store the end-of-chain marker. The high-order bits may be used for administrative flags in FAT16 and FAT32 systems, from the high-order bit down, as follows:
- 1=clean shutdown, 0=dirty shutdown
- 1=no disk errors during last use, 0=disk errors occurred during last use
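Testing the two flags is a matter of masking the top two significant bits of the second FAT entry. The following Python sketch assumes 16 significant bits for FAT16 and 28 for FAT32; the function name and dictionary keys are ours.

```python
def volume_flags(second_entry: int, bits: int = 16) -> dict:
    """Read the two administrative flags from the second FAT entry.

    bits is the number of significant bits: 16 for FAT16, 28 for FAT32.
    """
    clean_mask = 1 << (bits - 1)   # highest significant bit
    errors_mask = 1 << (bits - 2)  # next bit down
    return {
        "clean_shutdown": bool(second_entry & clean_mask),
        "no_disk_errors": bool(second_entry & errors_mask),
    }


# A FAT16 volume that was not shut down cleanly but saw no disk errors.
flags = volume_flags(0x7FFF)
```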
In order to determine whether a system is FAT12, FAT16, or FAT32, first look at the number of clusters implied by the values in the boot sector. If it is less than 4085, it is FAT12. If it is 65525 or more, it can't be FAT16, so it is FAT32. Otherwise, it is FAT16.
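As a sketch, the test looks like this in Python. The thresholds come straight from the text. The count_clusters helper is our own addition and rests on an assumption: that the data area is whatever remains after the reserved sectors, the FAT copies, and a fixed-size root directory of 32-byte entries.

```python
def count_clusters(total_sectors, reserved_sectors, fat_copies, sectors_per_fat,
                   root_entries, bytes_per_sector, sectors_per_cluster):
    """Estimate the number of data clusters from boot sector fields (assumed layout)."""
    root_dir_sectors = -(-(root_entries * 32) // bytes_per_sector)  # round up
    data_sectors = (total_sectors - reserved_sectors
                    - fat_copies * sectors_per_fat - root_dir_sectors)
    return data_sectors // sectors_per_cluster


def fat_type(cluster_count: int) -> str:
    """Apply the cluster-count thresholds given in the text."""
    if cluster_count < 4085:
        return "FAT12"
    if cluster_count < 65525:
        return "FAT16"
    return "FAT32"


# Sample values in the style of a 1.44 MB floppy.
floppy_clusters = count_clusters(2880, 1, 2, 9, 224, 512, 1)
floppy_type = fat_type(floppy_clusters)
```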
The FAT tables are stored next to each other right after the boot area. Unfortunately, this means that, although redundant, they are often damaged together. Additionally, problems in the first table are often copied into the second as various disk utilities attempt to "repair" systems by making them consistent.
Directory Entries
A directory is essentially a structured file that contains a list of files and some information about them. Specifically, a basic directory file is composed of entries, by byte, as follows:
- 0-10: 8 byte name + 3 byte extension
- 11: Bit vector (0:read-only, 1:hidden, 2:system, 3:volume label, 4:directory, 5:archive, 6-7: undefined)
- 12-21: Undefined, except for VFAT
- 22-23: Time stamp (5 bits: hours, 6 bits: minutes, 5 bits: seconds/2, i.e. only even seconds)
- 24-25: Date stamp (7 bits: years since 1980, 4 bits: month, 5 bits: day)
- 26-27: Starting cluster (index into FAT table, 0 if empty file)
- 28-31: Size of file in bytes
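To make the layout above concrete, here is a minimal Python sketch that decodes one 32-byte entry. It assumes multi-byte fields are little-endian and that the time and date bit fields are packed from the high-order bits down in the order listed; the names parse_dir_entry, ENTRY, and ATTRIBUTES are ours.

```python
import struct

# name, extension, attribute bits, undefined bytes, time, date, start cluster, size
ENTRY = struct.Struct("<8s3sB10sHHHI")
ATTRIBUTES = ["read-only", "hidden", "system", "volume label", "directory", "archive"]


def parse_dir_entry(raw: bytes) -> dict:
    """Decode one basic (non-VFAT) directory entry of exactly 32 bytes."""
    name, ext, attr, _undefined, time, date, cluster, size = ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii", "replace").rstrip(),
        "extension": ext.decode("ascii", "replace").rstrip(),
        "attributes": [label for bit, label in enumerate(ATTRIBUTES)
                       if attr & (1 << bit)],
        # 5 bits hours, 6 bits minutes, 5 bits seconds/2
        "time": (time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
        # 7 bits years since 1980, 4 bits month, 5 bits day
        "date": ((date >> 9) + 1980, (date >> 5) & 0x0F, date & 0x1F),
        "first_cluster": cluster,
        "size": size,
    }
```

Reading a whole directory is then a matter of slicing its contents into 32-byte pieces and calling parse_dir_entry on each.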
On pre-FAT32 systems, the root directory came in a fixed location immediately after the FATs. On FAT32 systems, the root directory is treated as a file and begins at cluster 2 of the FAT table.
Virtual FAT (VFAT)
VFAT is a Windows 95 hack to provide long file names, while providing backward compatibility to older FAT systems. It does this by (a) making use of some of the previously undefined bytes in a directory entry, specifically 12-21, (b) giving long file names short nicknames for backward compatibility, and (c) hiding the full long file name in a series of hidden directory entries with a new structure that won't be reported in directory listings on older systems, because they appear hidden to these systems. Additionally, long file names can include spaces, upper and lower case, and some other characters previously disallowed.
A long file's nickname is constructed by taking the first 6 bytes of its name, appending a ~, and then appending a sequence number to distinguish between multiple long file names with the same short prefix. The extension is retained, but truncated to three characters, if necessary. All characters are made upper case. Any character previously disallowed is replaced with an _. So, we end up with short file names like "HI_THE~1.TXT".
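The nickname rule can be sketched in Python as follows. This follows only the simplified rule stated here, and actual implementations differ in the details. The set of characters treated as allowed is our own assumption, as is the function name.

```python
import string

# Assumption: a deliberately small set of characters the old 8.3 names allowed.
ALLOWED = set(string.ascii_uppercase + string.digits + "_-")


def clean(text: str) -> str:
    """Upper-case the text and replace every disallowed character with '_'."""
    return "".join(c if c in ALLOWED else "_" for c in text.upper())


def short_name(long_name: str, sequence: int = 1) -> str:
    """Build a nickname: six characters, '~', sequence number, 3-char extension."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = long_name, ""
    nickname = clean(stem)[:6] + "~" + str(sequence)
    return nickname + ("." + clean(ext)[:3] if ext else "")


nickname = short_name("Hi there world.txt")
```

A second long name with the same prefix would be given sequence number 2, and so on.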
The long file name, up to 255 characters, is encoded in a series of VFAT directory entries that immediately precede the short entry. These entries have the following format, by byte:
- 0: Bits 0-5: sequence number; bit 6: 1 if last entry for this name
- 1-10: 2-byte Unicode chars 1-5
- 11: 0xf (consider how this attribute byte reads under non-VFAT)
- 12: 0, indicates extended file name entry
- 13: checksum of short name
- 14-25: 2-byte Unicode chars 6-11
- 26-27: 0 (consider how this starting cluster reads under non-VFAT)
- 28-31: 2-byte Unicode chars 12-13
Notice that each entry can encode 13 characters. The sequence number indicates which substring of the full string this is. The last substring has bit 6 set. These entries are stored before the short entry, tail first.
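Putting the pieces back together can be sketched in Python as follows. We assume the characters are stored as little-endian UTF-16 and that a name shorter than its last entry is terminated with a 0x0000 character, with unused slots after it filled with padding; neither detail is spelled out above, and the function names are ours.

```python
def lfn_characters(entry: bytes) -> bytes:
    """Collect the 13 two-byte characters scattered through one 32-byte entry."""
    return entry[1:11] + entry[14:26] + entry[28:32]


def long_name(entries) -> str:
    """Rebuild a long file name from its VFAT entries, given in any order."""
    # The low six bits of byte 0 say which 13-character substring this is.
    ordered = sorted(entries, key=lambda entry: entry[0] & 0x3F)
    raw = b"".join(lfn_characters(entry) for entry in ordered)
    text = raw.decode("utf-16-le", errors="replace")
    return text.split("\x00", 1)[0]  # drop the terminator and any padding
```

Because the entries are sorted by sequence number, it does not matter that they sit on disk tail first.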
If a VFAT file system is mounted under an operating system that supports FAT, but not VFAT, these special entries are ignored, as they appear to be empty hidden files. But, some utilities, such as those designed to reorder directory entries, could separate them from their short entries so that they are no longer adjacent. This could really screw things up. As a result, a checksum computed from the short file name is stored in each of these entries. The entries are invalid if their checksum doesn't match that of the short entry that follows.
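The formula for the checksum is not given above. The rotate-and-add scheme commonly documented for VFAT is sketched below in Python, as an assumption rather than something taken from these notes. It runs over the 11 bytes of the short name (8 bytes of name plus 3 of extension, without the dot), and the function name is ours.

```python
def short_name_checksum(name11: bytes) -> int:
    """One-byte checksum over the 11-byte short name field."""
    total = 0
    for byte in name11:
        # Rotate the running sum right by one bit, then add the next byte,
        # keeping only the low 8 bits.
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total


checksum = short_name_checksum(b"HI_THE~1TXT")
```

A reader of the directory would compare this value against byte 13 of each long-name entry and discard the entries on a mismatch.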
If the first byte of a file name is 0xe5, it means that this file has been deleted. It can still be recovered, if its clusters have not been reused. If the first byte of a name actually is 0xe5, it is recorded as 0x05. Yes, it seems to me, too, like we could have simplified this!
The Boot Sector
The boot sector contains a bunch of metadata about the file system as well as the code that gets the OS bootstrapping. Every bootable file system has some type of boot sector, which contains a jump to the bootstrap code at the very beginning, followed by filesystem-specific information. For reference, UNIX people call the boot sector the superblock.
The format of the FAT12 boot sector, again byte-by-byte, is as follows:
- 0-2: Assembly to jump to bootstrap code, later in boot sector
- 3-10: Label, e.g. "MSDOS5.0" (8 bytes)
- 11-12: Bytes/sector
- 13: Sectors/cluster
- 14-15: Number of reserved sectors, including the boot sector
- 16: Number of copies of the FAT table, usually 2
- 17-18: Number of root directory entries
- 19-20: Number of sectors w/in the file system
- 21: Media descriptor (different code for 5.25" floppy, hard drive, etc)
- 22-23: Size of FAT table in sectors
- 24-25: Sectors/track
- 26-27: Number of heads
- 28-29: Number of hidden sectors, e.g. sectors before boot sector, used for partitioning
- 30-509: Bootstrap code (where we jump to)
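The fixed-offset layout above maps directly onto Python's struct module. The sketch below assumes the multi-byte fields are little-endian; the field names in BootSector and the function name are ours, chosen to mirror the list.

```python
import struct
from collections import namedtuple

BootSector = namedtuple("BootSector", [
    "jump", "label", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "fat_copies", "root_entries", "total_sectors",
    "media_descriptor", "sectors_per_fat", "sectors_per_track",
    "heads", "hidden_sectors",
])

# Bytes 0-29 of the sector, in the order listed above.
LAYOUT = struct.Struct("<3s8sHBHBHHBHHHH")


def parse_boot_sector(sector: bytes) -> BootSector:
    """Decode the metadata at the start of a FAT12 boot sector."""
    return BootSector._make(LAYOUT.unpack_from(sector, 0))
```

Given the first sector of a disk image read as bytes, parse_boot_sector(sector).sectors_per_cluster, for example, yields the cluster size in sectors.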
You can look up the format for the FAT16 and FAT32 boot sectors. They are much longer, but begin as above.