NTFS
NTFS is the file system introduced with Windows NT. It is very complex, and few people outside of its developers understand it in every detail. So, we aren't going to examine it in the detail that we did FAT. Instead, we are going to look at the parts that are of most interest to forensics folks:
- The Master File Table ($MFT, itself a file)
- The Log File ($LogFile, itself a file)
- The USN Journal, a.k.a. the Change Journal ($UsnJrnl, itself a file)
- Alternate Data Streams
The Master File Table (MFT)
The MFT is the closest thing that NTFS has to the FAT table. But, it is more like a database, or a Java Map, that maps from a file (really, its entry number) to the attributes of the file, the name being one of them. Some of the attributes are resident within the file's MFT entry. Others are larger, stored outside of the file's entry, and "pointed to" by the file's entry. Check out Wikipedia and other sources for the details. But, here is a quick rundown of some of the components:
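- Bitmap: Tracks which records are in use, for example which index buffers of a large directory are allocated. Sparse files are handled separately: the run list of a non-resident data attribute shows which ranges are backed with actual storage, and which are unallocated holes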
- Data: This is the blob that has the actual data. This is often non-resident, but can be resident for small files. Also, there can be multiple data entries, see Alternate Data Streams
- File name
- Index root: If this is a directory, it contains the index of files within the directory
- Index allocation attribute: Describes the buffers used to store indexes, if they can't all fit in the index root entry above.
- Security descriptors: Ownership, ACLs, etc
- Standard Information: The FAT-like simple metadata for a file
One quick note is that NTFS is very smart about allocating nearby clusters to the same file, reducing fragmentation. Having said this, the MFT is one huge file, growing over time, which can constantly tack non-resident attributes onto old entries. As a result, the MFT itself can suffer from heavy fragmentation.
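To make the attribute list above concrete, here is a minimal Python sketch that walks the attribute headers in a single MFT entry and reports each attribute's type, whether it is resident, and its stream name. The offsets and type codes follow commonly published descriptions of the on-disk layout, not anything stated in these notes, so treat them as assumptions. The name list_attributes is made up for illustration, and the sketch skips the update sequence ("fixup") step that a real parser would apply first.

```python
import struct

# Attribute type codes as commonly documented for NTFS (assumed, not exhaustive).
ATTRIBUTE_NAMES = {
    0x10: "$STANDARD_INFORMATION",
    0x30: "$FILE_NAME",
    0x50: "$SECURITY_DESCRIPTOR",
    0x80: "$DATA",
    0x90: "$INDEX_ROOT",
    0xA0: "$INDEX_ALLOCATION",
    0xB0: "$BITMAP",
}
END_MARKER = 0xFFFFFFFF


def list_attributes(record: bytes):
    """Return (type name, is resident, stream name) for each attribute in one MFT entry."""
    if record[0:4] != b"FILE":
        raise ValueError("not an MFT file record")
    # The offset of the first attribute is stored at byte 20 of the entry header.
    (offset,) = struct.unpack_from("<H", record, 20)
    found = []
    while offset + 16 <= len(record):
        attr_type, length = struct.unpack_from("<II", record, offset)
        if attr_type == END_MARKER or length == 0:
            break
        non_resident, name_len, name_off = struct.unpack_from("<BBH", record, offset + 8)
        # Attribute names are UTF-16; name_len counts characters, not bytes.
        raw_name = record[offset + name_off: offset + name_off + 2 * name_len]
        found.append((
            ATTRIBUTE_NAMES.get(attr_type, hex(attr_type)),
            non_resident == 0,
            raw_name.decode("utf-16-le"),
        ))
        offset += length
    return found
```

A file carrying an alternate data stream would show up in this output as a second $DATA attribute with a non-empty name.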
The Log File
There isn't much to say about the log file, since its structure is not fully documented in public. It is used to get the file system up and running quickly after restarts, especially to regain consistency after a crash. Its format is Microsoft's domain. I'm guessing they've never published it, so they can change it. The opaqueness of this structure is among the things that have made complete non-Microsoft implementations challenging.
The contents of this file can be of forensic significance, but it is rare. In the event you want to explore it, there are tools that can give you a look at significant chunks.
The Change Journal, i.e. $UsnJrnl
The change journal is of dramatic forensic value. It notes every time a file is changed, whether the change is to data or metadata. It doesn't contain any actual data, but it notes a lot of information, including the name of the file, the timestamp, the security ID and other attributes, and the reason for the change: whether the data, or an alternate data stream, was overwritten, extended, or truncated, whether the file was created or deleted, etc.
Alternate Data Streams
Alternate Data Streams are, in some sense, files hidden within files. They exist because a single file can have multiple data attributes. These alternate data streams do not show up in ordinary directory listings, and their size is not included in the file size that those listings report, even though they do take up space on the volume.
Many tools have been coded such that they are not capable of interacting with alternate data streams. But, for those that can, one names one as follows: filename.ext:ads.ext. It is that easy. Try creating some alternate data streams using notepad from the command line, e.g. notepad.exe foo.txt:bar.txt. Ain't that cool? And, yeah, some people hide things here. And, yeah, you can execute programs stored within alternate data streams.
Why do they exist? Apple's HFS has the ability to associate things like icons with files. Microsoft wanted a generalized mechanism for doing the same thing. Ta-da!
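The filename.ext:ads.ext naming convention can also be used from code. The following is a minimal Python sketch, assuming a Windows machine with an NTFS volume for the part that actually writes a stream; the helper names stream_path, split_stream_path, and write_stream are made up for illustration. The two naming helpers are plain string handling and run anywhere.

```python
import os


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the 'file:stream' form described above."""
    if not stream_name or ":" in stream_name:
        raise ValueError("stream name must be non-empty and contain no colon")
    return f"{file_path}:{stream_name}"


def split_stream_path(path: str):
    """Split 'file:stream' into (file, stream); stream is '' when none is named."""
    # Skip a leading drive letter such as 'C:' so it is not mistaken for a stream separator.
    start = 2 if len(path) >= 2 and path[0].isalpha() and path[1] == ":" else 0
    head, _, stream = path[start:].partition(":")
    return path[:start] + head, stream


def write_stream(file_path: str, stream_name: str, data: bytes) -> None:
    """Write data into an alternate data stream (assumes Windows and NTFS)."""
    if os.name != "nt":
        raise OSError("alternate data streams require Windows and NTFS")
    with open(stream_path(file_path, stream_name), "wb") as handle:
        handle.write(data)
```

On a suitable system, write_stream("foo.txt", "bar.txt", b"hidden") would be the programmatic counterpart of the notepad example; the reported size of foo.txt would not change.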
Thinking Like A Forensics Analyst
Let's think about FAT systems. When files are deleted, they are just marked as deleted. If we replace the sentinel value in the directory with another character, and no other file is using any of the file's old space, we can get it back.
Even when data is overwritten, short files and the tails of files may leave data in the slack space. This data may be of use to us. And, even unallocated clusters may have previously been part of files and may contain useful data.
The same is essentially true in NTFS, because of the way clusters work and the entries in the Change Journal.
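The FAT undelete idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works on a raw directory region already read into memory, uses the standard 32-byte short directory entry layout with 0xE5 as the deleted sentinel, ignores long-file-name entries, and does not check whether the file's old clusters have been reused. The function names are made up for this example.

```python
import struct

ENTRY_SIZE = 32
DELETED = 0xE5         # sentinel in the first name byte of a deleted entry
LONG_NAME_ATTR = 0x0F  # long-file-name entries are skipped in this sketch


def find_deleted_entries(directory: bytes):
    """Yield (offset, partial 8.3 name, first cluster, size) for deleted short entries."""
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[offset:offset + ENTRY_SIZE]
        if entry[0] == 0x00:  # 0x00 marks the end of the used entries
            break
        if entry[0] != DELETED or entry[11] == LONG_NAME_ATTR:
            continue
        base = entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        # High cluster word is only meaningful on FAT32; it is normally 0 elsewhere.
        (cluster_hi,) = struct.unpack_from("<H", entry, 20)
        cluster_lo, size = struct.unpack_from("<HI", entry, 26)
        name = "?" + base + ("." + ext if ext else "")
        yield offset, name, (cluster_hi << 16) | cluster_lo, size


def undelete(directory: bytearray, offset: int, first_char: str) -> None:
    """Replace the deleted sentinel with a caller-supplied first character."""
    if directory[offset] != DELETED:
        raise ValueError("entry is not marked deleted")
    directory[offset] = ord(first_char.upper())
```

The first character of the name is lost when the sentinel is written, which is why the sketch reports it as "?" and asks the caller to supply a replacement.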
The NTFS Change Journal is a treasure trove for forensics, because it gives us a ton of information about how and when the system and files were used. But, it is a circular log, with a configurable maximum size. So, it will not necessarily go all the way back to when the volume was created.
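As a rough illustration of what one Change Journal entry holds, here is a minimal Python sketch that decodes a single version 2 record. The field layout follows Microsoft's published USN_RECORD_V2 structure, which these notes do not cover, so treat it as an assumption; only a handful of the reason flags are listed, and the function names are made up for this example.

```python
import struct
from datetime import datetime, timedelta, timezone

# Fixed-size part of a version 2 record (60 bytes), assumed per USN_RECORD_V2.
HEADER = struct.Struct("<IHHQQQQIIIIHH")

# A few of the documented reason flags; the full list is longer.
REASONS = {
    0x00000001: "DATA_OVERWRITE",
    0x00000002: "DATA_EXTEND",
    0x00000004: "DATA_TRUNCATION",
    0x00000010: "NAMED_DATA_OVERWRITE",
    0x00000020: "NAMED_DATA_EXTEND",
    0x00000040: "NAMED_DATA_TRUNCATION",
    0x00000100: "FILE_CREATE",
    0x00000200: "FILE_DELETE",
    0x00000800: "SECURITY_CHANGE",
    0x00001000: "RENAME_OLD_NAME",
    0x00002000: "RENAME_NEW_NAME",
    0x80000000: "CLOSE",
}


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert 100-nanosecond ticks since 1601-01-01 UTC into a datetime."""
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=filetime // 10)


def parse_usn_record(buf: bytes, offset: int = 0) -> dict:
    """Decode one version 2 change journal record starting at offset."""
    (length, major, _minor, file_ref, parent_ref, usn, stamp, reason,
     _source, _security_id, attributes, name_len, name_off) = HEADER.unpack_from(buf, offset)
    if major != 2:
        raise ValueError(f"unsupported USN record version {major}")
    # The name length is in bytes; the name offset is relative to the record start.
    name = buf[offset + name_off: offset + name_off + name_len].decode("utf-16-le")
    return {
        "length": length,
        "name": name,
        "usn": usn,
        "timestamp": filetime_to_datetime(stamp),
        "reasons": [label for flag, label in REASONS.items() if reason & flag],
        "attributes": attributes,
        # A file reference packs the MFT entry number (low 48 bits) and a sequence number.
        "mft_entry": file_ref & 0xFFFFFFFFFFFF,
        "sequence": file_ref >> 48,
        "parent_mft_entry": parent_ref & 0xFFFFFFFFFFFF,
    }
```

Note that the record carries names, timestamps, and reasons, but no file contents, which matches the description of the journal earlier in these notes.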
And, well, alternate data streams might be innocuous, but they might also hold deliberately hidden data, or malware that is of interest in highlighting the security context of a system.
Because NTFS is very good about keeping allocations associated with the same file nearby, the location of unallocated clusters might well give us a clue about which file they were previously associated with. FAT suffers from much more fragmentation, so this is less true there, but we may be able to make inferences, because we know the other allocations and that the FAT table does allocations via a sequential scan.
Warning to all Readers
These are unrefined notes. They are not published documents. They are not citable. They should not be relied upon for forensics practice. They do not define any legal process or strategy, standard of care, evidentiary standard, or process for conducting investigations or analysis. Instead, they are designed for, and serve, a single purpose, to help students to jog their memory of classroom discussions and assist them in thinking critically about the issues presented. The author is certainly not an attorney and is absolutely not giving any legal advice.