Created: Tuesday, 03 July 2018
Updated: Friday, 20 July 2018

To save readers' precious time, I would like to emphasize that this guide applies to RAID arrays that contain an NTFS-formatted volume.

First, keep in mind that this guide serves as a proof of concept; hopefully it will prove useful and enlightening to my fellow readers. To follow the approach, you should have a good grasp of NTFS internals and of file systems in general; basic knowledge of RAID arrays is assumed too.

Facing the daunting challenge of reconstructing a RAID that consists of multiple hard disks might prove discouraging at first glance. Nonetheless, the NTFS structure gives the savvy analyst plenty of hints on how to overcome the problem when the configuration parameters of the RAID are not available. In addition, you can verify for yourself, with certainty, the validity of the reconstruction you are about to perform.

To successfully reconstruct a RAID, we have to determine several configuration parameters, whose number varies depending on the type of RAID we are dealing with.

First, we need to figure out the size of the RAID's partition. This requires locating the Master Boot Record (MBR) by searching for the byte sequence 0x55 0xAA at the end of each sector (byte offsets 510 and 511). Please note that I am referring to the RAID's MBR and not to the controller's MBR. Since we don't know which disk holds the MBR, we have to repeat this step for each hard disk at our disposal.
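
The signature scan can be sketched in Python as follows. This is a minimal illustration, assuming each member disk is available as a raw image opened in binary mode and that sectors are 512 bytes; the function name and the `limit_sectors` cap are made up for this example. The 0x55 0xAA signature also closes an NTFS VBR, so every hit is only a candidate that must be inspected further.

```python
import io

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def find_boot_signature_sectors(image, limit_sectors=4096):
    """Return the indices of sectors whose last two bytes are 0x55 0xAA."""
    hits = []
    image.seek(0)
    for index in range(limit_sectors):
        sector = image.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE:
            break  # end of image reached
        if sector[510:512] == b"\x55\xaa":
            hits.append(index)
    return hits


# Tiny in-memory demo: three sectors, only the second one carries the signature.
demo_image = io.BytesIO(bytes(512) + bytes(510) + b"\x55\xaa" + bytes(512))
print(find_boot_signature_sectors(demo_image))
```

On a real case you would open each member image with `open(path, "rb")` and run the scan once per disk.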

Once we locate the hard disk that contains the MBR, we must parse the partition entries, which are located at byte offset 446 (0x1BE) within the MBR. From the partition entry we take note of the size and the starting offset of the respective partition. Assume that the partition size is N bytes, that no other partition entries exist, and that we have 4 hard drives available, each of size K bytes. Furthermore, we assume that a prudent system administrator would want to make full use of the available storage capacity and would therefore choose a partition size that spans as much as possible of what the selected RAID configuration offers.
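
As a rough sketch of the parsing step, the snippet below reads the four primary partition entries from a 512-byte MBR. It relies on the classic MBR layout (table at byte offset 446, 16 bytes per entry, little-endian start LBA and sector count at entry offsets 8 and 12); the function name and the dictionary keys are illustrative only. The entry stores sizes in sectors, so they are multiplied by an assumed sector size of 512 bytes to obtain N in bytes.

```python
import struct

PARTITION_TABLE_OFFSET = 446  # 0x1BE
ENTRY_SIZE = 16
SECTOR_SIZE = 512  # assumption: 512-byte sectors


def parse_partition_entries(mbr):
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")
    entries = []
    for slot in range(4):
        offset = PARTITION_TABLE_OFFSET + slot * ENTRY_SIZE
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), start LBA, sector count
        status, ptype, start_lba, sectors = struct.unpack_from("<B3xB3xII", mbr, offset)
        if ptype == 0:
            continue  # unused slot
        entries.append({
            "slot": slot,
            "type": ptype,
            "start_lba": start_lba,
            "size_bytes": sectors * SECTOR_SIZE,
        })
    return entries


# Demo with a hand-built MBR: one NTFS-type (0x07) partition starting at sector 64.
demo_mbr = bytearray(512)
struct.pack_into("<B3xB3xII", demo_mbr, PARTITION_TABLE_OFFSET, 0x00, 0x07, 64, 2048)
demo_mbr[510:512] = b"\x55\xaa"
print(parse_partition_entries(bytes(demo_mbr)))
```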

Concretely, if our RAID is of type 3, 4 or 5, we expect the partition size N to be almost equal to 3*K, because the capacity of one hard disk is consumed by parity blocks. Recall that a block is the minimum allocation unit of a RAID. If our RAID is of type 0, then the partition size should almost equal the total size of the hard disks, namely 4*K. Finally, if we have a combination of types 1 and 0, namely 10 or 01, we end up with a RAID capacity that is about half of the total capacity of the hard drives, in other words 2*K. Nowadays RAID 3 and 4 are rarely found in use; they have been superseded by RAID 5, which we are going to elaborate on from now on.
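
The capacity reasoning can be turned into a small check. The sketch below is only an illustration: the function name, the layout labels and the 5% tolerance are assumptions of mine, chosen to express "almost equal" in a testable way.

```python
def plausible_layouts(partition_bytes, disk_bytes, disk_count, tolerance=0.05):
    """List the layouts whose usable capacity is just above the partition size."""
    expected = {
        "no redundancy": disk_count * disk_bytes,
        "single parity (RAID 3, 4 or 5)": (disk_count - 1) * disk_bytes,
        "mirrored stripes (RAID 10 or 01)": disk_count * disk_bytes // 2,
    }
    return [
        name
        for name, capacity in expected.items()
        # The partition must fit in the array and fill most of it.
        if capacity * (1 - tolerance) <= partition_bytes <= capacity
    ]


K = 500 * 10**9              # hypothetical 500 GB member disks
N = 1_490_000_000_000        # hypothetical partition size taken from the MBR
print(plausible_layouts(N, K, 4))
```

With four disks the three candidate capacities (4*K, 3*K and 2*K) are far apart, so the partition size alone usually narrows the choice to one family.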

RAID 5 introduced the novelty of distributing the parity blocks over the whole set of hard drives. For a RAID 5 we need to determine the following parameters:

  • Start offset of the member disks.
  • Order of hard drives in the RAID
  • Parity distribution pattern
  • Block Size
  • Delayed parity existence and block pattern.

So far, the table below gives us a first glimpse of the RAID we need to reconstruct. D stands for Data block and P stands for Parity block respectively.

HD1      HD2      HD3      HD4
D? P?    D? P?    D? P?    D? P?
.        .        .        .
.        .        .        .
D? P?    D? P?    D? P?    D? P?

Start offset of the member disks

To determine the start offset, we use the fact that we have already located the MBR of the RAID and take note of its sector offset, say 1180. We can verify whether this offset applies to the remaining members by checking the contents of the sector just before it on each of them, which should contain only zeros. We also note on which hard disk the MBR was located, say HD1. With this information revealed, the RAID becomes:

HD1      HD2      HD3      HD4
D0       D? P?    D? P?    D? P?
.        .        .        .
.        .        .        .
D? P?    D? P?    D? P?    D? P?
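
The zero-sector check on the start offset is easy to automate. The following is a minimal sketch, assuming raw member images and 512-byte sectors; both helper names are hypothetical.

```python
import io

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def sector_is_zero(image, sector_index):
    """True if the given sector exists and contains only zero bytes."""
    image.seek(sector_index * SECTOR_SIZE)
    data = image.read(SECTOR_SIZE)
    return len(data) == SECTOR_SIZE and not any(data)


def start_offset_plausible(images, start_sector):
    """Check that the sector preceding the start offset is empty on every member."""
    return all(sector_is_zero(image, start_sector - 1) for image in images)


# Demo: two tiny members whose data begins at sector 2, so sector 1 is empty.
member_a = io.BytesIO(bytes(1024) + b"\x01" * 512)
member_b = io.BytesIO(bytes(1024) + b"\x02" * 512)
print(start_offset_plausible([member_a, member_b], 2))
```

A positive result only makes the offset plausible; it does not prove it, since zero-filled sectors can also occur inside the data area.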

Block size of RAID

Next, we need to determine the block size. First, we must locate the Volume Boot Record (VBR); the partition entry gives us a hint, since it holds the start address of the volume in sectors. Assume that the VBR offset is 64 sectors. We move 64 sectors past the MBR on the same disk (the MBR itself counts as sector 0) and look for the VBR there. Two outcomes are possible. If we are lucky, that sector is indeed the VBR, and we can deduce that the block size is bigger than 64 sectors. If it is not, we search for the VBR using the fact that an NTFS VBR always carries the string NTFS at byte offset 0x03 of the sector, right after the 3-byte jump instruction. We therefore perform a sector-based search on every remaining member of the RAID to find the string. Suppose we find the VBR on hard drive 2, in the first sector after the start offset; this implies that the block size is 64 sectors, that is 32 KBytes with 512-byte sectors. As an exercise, we ask the reader to locate the VBR if the block size were 16 KBytes.
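
Both steps, the VBR search and the block size deduction, can be sketched as below. The snippet assumes 512-byte sectors and raw member images, and it looks for the NTFS OEM ID in bytes 3 to 10 of a sector. `block_size_candidates` is a hypothetical helper that is valid only under the assumption that the VBR sits in the first row and that no parity block precedes it in that row; the list of candidate sizes is arbitrary.

```python
import io

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def find_ntfs_vbr(image, first_sector=0, limit_sectors=4096):
    """Return the index of the first sector that looks like an NTFS VBR, or None."""
    image.seek(first_sector * SECTOR_SIZE)
    for index in range(first_sector, first_sector + limit_sectors):
        sector = image.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE:
            break
        # OEM ID "NTFS    " follows the 3-byte jump; the sector ends with 0x55 0xAA.
        if sector[3:11] == b"NTFS    " and sector[510:512] == b"\x55\xaa":
            return index
    return None


def block_size_candidates(vbr_lba, member_position, sector_in_member,
                          choices=(16, 32, 64, 128, 256, 512, 1024, 2048)):
    """Block sizes (in sectors) that put logical sector vbr_lba where it was observed.

    member_position is the 0-based position of the disk within the first row and
    sector_in_member is counted from that member's start offset.
    """
    return [b for b in choices
            if vbr_lba // b == member_position and vbr_lba % b == sector_in_member]


# Demo: a fake member whose third sector is a minimal VBR.
vbr = bytearray(512)
vbr[0:3] = b"\xeb\x52\x90"
vbr[3:11] = b"NTFS    "
vbr[510:512] = b"\x55\xaa"
demo_member = io.BytesIO(bytes(1024) + bytes(vbr))
print(find_ntfs_vbr(demo_member))
print(block_size_candidates(64, 1, 0))   # VBR at the very start of the second member
print(block_size_candidates(64, 0, 64))  # VBR still on the first member
```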

With this new information revealed the RAID becomes:

HD1      HD2      HD3      HD4
D0       D1       D? P?    D? P?
.        .        .        .
.        .        .        .
D? P?    D? P?    D? P?    D? P?

Order of hard drives

The astute reader will have noticed that the order for the first row has already been partially derived. The following steps are the hardest. We take advantage of the Master File Table ($MFT) and its well-defined pattern. From the VBR we read the start cluster address of the $MFT, say 8. To translate this into sectors we also read the number of sectors per cluster from the VBR; suppose it is 8, which means that the $MFT starts 64 sectors from the beginning of the volume.
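
The two VBR fields used here can be read with a few lines of Python. The sketch relies on the usual NTFS boot sector layout (bytes per sector at offset 0x0B, sectors per cluster at 0x0D, $MFT start cluster at 0x30) and assumes the sectors-per-cluster byte is a plain count, which holds for common cluster sizes. The function name is illustrative.

```python
import struct


def mft_location(vbr):
    """Return (sectors_per_cluster, mft_start_cluster, mft_start_sector).

    The start sector is relative to the beginning of the volume.
    """
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", vbr, 0x0B)
    (mft_cluster,) = struct.unpack_from("<Q", vbr, 0x30)
    return sectors_per_cluster, mft_cluster, mft_cluster * sectors_per_cluster


# Demo with a hand-built VBR: 512-byte sectors, 8 sectors per cluster, $MFT at cluster 8.
demo_vbr = bytearray(512)
struct.pack_into("<HB", demo_vbr, 0x0B, 512, 8)
struct.pack_into("<Q", demo_vbr, 0x30, 8)
print(mft_location(bytes(demo_vbr)))
```

Adding the partition's start offset to the returned sector gives the position of the $MFT relative to the MBR, which is what determines the block and member disk it falls on.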

In addition, we know that each $MFT record starts with the string FILE at the beginning of a sector. We perform a sector-based search for this string on every hard disk. We are aiming at the first record, which is the $MFT record itself; this is verified by looking at the $FILE_NAME attribute of the record. An expert should raise an eyebrow at this point and ask how we know that we have not located the $MFTMirr instead, which serves as a backup of the first $MFT records. At first sight we don't, but we can find out by moving a few file records further down, usually past record 5: if we keep finding additional record entries, we have located the $MFT. In our case the $MFT record entry is found on hard disk 3; the answer as to why is left as an exercise to the reader. Normally the $MFT has a size on the order of 100 MBytes at least, so we expect that the last hard drive, HD4, will contain garbled data at the same offset instead of $MFT entries. This garbled data is the parity, that is, the XOR of the remaining blocks of the row it belongs to.
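
Because the parity block is the XOR of the data blocks in its row, XORing all blocks of a correctly aligned row yields only zeros. The sketch below uses this as a consistency check; the helper names are mine. The check cannot tell which member holds the parity (any block equals the XOR of the others); it only confirms that the blocks belong to the same row. The parity member is still identified as the one without recognizable structures.

```python
def xor_blocks(blocks):
    """XOR equally sized byte blocks together."""
    size = len(blocks[0])
    accumulator = 0
    for block in blocks:
        if len(block) != size:
            raise ValueError("blocks must have the same length")
        accumulator ^= int.from_bytes(block, "little")
    return accumulator.to_bytes(size, "little")


def row_is_consistent(blocks):
    """True if the blocks of one row (data plus parity) XOR to all zeros."""
    return not any(xor_blocks(blocks))


# Demo with three tiny data blocks and their computed parity.
d0, d1, d2 = b"\x10" * 4, b"\x01" * 4, b"\x02" * 4
parity = xor_blocks([d0, d1, d2])
print(row_is_consistent([d0, d1, d2, parity]))
print(row_is_consistent([d0, d1, d2, d0]))
```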

With this newly disclosed information the RAID becomes:

HD1      HD2      HD3      HD4
D0       D1       D2       P
.        .        .        .
.        .        .        .
D? P?    D? P?    D? P?    D? P?

Parity pattern

The next steps are to determine whether the RAID employs delayed parity and what the rotation pattern is.
We have already pinpointed the cluster on the hard disk where the $MFT starts. We know that each file record is typically 1024 bytes long and that the file record number is stored in the header of each record. This number starts from 0 for the $MFT record itself and increases by one with each subsequent file record. With a block size of 32 KBytes, a block holds 32 file records, so we move on to cluster 16, one block further, and check the $MFT record number on every hard disk. For the sake of simplicity, assume we derive the following pattern; dotted cells represent $MFT entries.

Row   HD1                HD2                HD3                HD4
1     MBR                VBR                $MFT 0 to 31       P
2     $MFT 32 to 63      $MFT 64 to 95      $MFT 96 to 127     P
3     $MFT 128 to 159    $MFT 160 to 191    $MFT 192 to 223    P
4     $MFT 224 to 255    $MFT 256 to 287    $MFT 288 to 319    P
5     $MFT 320 to 351    $MFT 352 to 383    P                  $MFT 384 to 415
6     ...                ...                P                  ...
7     ...                ...                P                  ...
8     ...                ...                P                  ...
9     ...                P                  ...                ...
10    ...                P                  ...                ...
11    ...                P                  ...                ...
12    ...                P                  ...                ...
13    P                  ...                ...                ...
14    P                  ...                ...                ...
15    P                  ...                ...                ...
16    P                  ...                ...                $MFT 1440 to 1471

This pattern corresponds to RAID 5 with a delayed parity of 4, because the parity block stays on the same disk for 4 consecutive rows, and it is left asynchronous because the parity moves from right to left while the data blocks of every row run from left to right throughout the cycle. A cycle repeats every 16 rows.
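
The record numbers in a table like the one above can be collected with a short helper. This sketch assumes 1024-byte file records and the NTFS 3.1 record header, where the record number is a 32-bit little-endian value at offset 0x2C; older NTFS versions do not store it there. The function names are illustrative.

```python
import struct

RECORD_SIZE = 1024  # assumption: 1 KiB file records


def record_range(block):
    """Return (lowest, highest) record number found in a block, or None."""
    numbers = []
    for offset in range(0, len(block) - RECORD_SIZE + 1, RECORD_SIZE):
        if block[offset:offset + 4] == b"FILE":
            (number,) = struct.unpack_from("<I", block, offset + 0x2C)
            numbers.append(number)
    return (min(numbers), max(numbers)) if numbers else None


def make_record(number):
    """Build a minimal fake file record for the demo."""
    record = bytearray(RECORD_SIZE)
    record[0:4] = b"FILE"
    struct.pack_into("<I", record, 0x2C, number)
    return bytes(record)


# Demo: a 32 KiB block holding 32 consecutive records.
demo_block = b"".join(make_record(n) for n in range(32, 64))
print(record_range(demo_block))
print(record_range(bytes(32 * 1024)))  # a block without FILE records, e.g. parity
```

Running this on the block at the same offset of every member, row after row, reveals both the order of the data blocks and the member that holds no records, which is the parity.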

Conclusion

Summing up, to successfully reconstruct a RAID 5 we need to work through a full cycle, which is 16 rows in this example; its length grows proportionally with the delayed parity size. Our configuration parameters are:

  • Start offset of the member disks = 1180
  • Order of hard drives in the RAID = HD1, HD2, HD3, HD4
  • Parity distribution pattern = Right to left
  • Block Size = 32KBytes
  • Delayed parity existence and block pattern = delayed parity size 4, block pattern left asynchronous.
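
With these parameters, the mapping from a logical RAID sector to a member disk can be sketched as follows. The functions and their default arguments are my own illustration of a left asynchronous layout with delayed parity, using 0-based rows and disks (disk 0 is HD1) and 64-sector blocks; treat it as a model of this example rather than a general RAID 5 implementation.

```python
def parity_disk(row, disks=4, delay=4):
    """Member holding the parity in a row: starts on the last disk, moves left every `delay` rows."""
    return (disks - 1) - (row // delay) % disks


def locate_block(logical_block, disks=4, delay=4):
    """Map a logical data block to (row, disk); data runs left to right, skipping parity."""
    row, position = divmod(logical_block, disks - 1)
    parity = parity_disk(row, disks, delay)
    disk = position if position < parity else position + 1
    return row, disk


def locate_sector(lba, block_sectors=64, start_offset=1180, disks=4, delay=4):
    """Map a logical sector to (disk, physical sector on that disk)."""
    block, within = divmod(lba, block_sectors)
    row, disk = locate_block(block, disks, delay)
    return disk, start_offset + row * block_sectors + within


print([parity_disk(row) for row in range(0, 16, 4)])  # parity member for rows 1, 5, 9, 13
print(locate_sector(0))    # the MBR
print(locate_sector(128))  # first $MFT record in this example
```

Reading every logical sector through `locate_sector` and writing the results sequentially to a new image is, in essence, the reconstruction itself.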

This tutorial is somewhat idealistic: we might not be so lucky, and the first $MFT entry might not sit in the first block of the next hard disk. If we find the first $MFT entry a couple of blocks further in, then we do not know its exact position within the row of blocks it belongs to.

Sanity checks

Fixup values

NTFS employs a mechanism to detect corrupted file record entries: it replaces bytes 510 to 511 and 1022 to 1023 of each record, that is the last two bytes of each sector, with a signature value. The original values are stored in a structure called the fixup array in the $MFT record header, along with the signature value that the replaced bytes should match. A fixup failure could indicate a spurious RAID reconstruction. This is best understood with an example. Assume that the last $MFT record of a block is cut off at the block boundary, so that its second sector is missing. This sector should be located in the next block, on the next hard disk. If we concatenate a sector from the wrong block, the signature found at bytes 1022 to 1023 will differ from the expected one, hence the error. This can happen if we mistakenly apply a left asynchronous block pattern when in reality the RAID uses a right asynchronous one. This subtle issue results in a faulty NTFS volume with serious consequences, as file system metadata is affected.
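
A fixup check along these lines can be sketched as below. It assumes 512-byte sectors and the usual record header fields: the offset of the fixup array at byte 4 and its entry count (signature plus one entry per sector) at byte 6, both 16-bit little-endian. The function name and the demo record are hypothetical.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def apply_fixups(record):
    """Verify the sector signatures of a file record and restore the original bytes."""
    array_offset, entry_count = struct.unpack_from("<HH", record, 4)
    signature = bytes(record[array_offset:array_offset + 2])
    fixed = bytearray(record)
    for i in range(1, entry_count):
        end = i * SECTOR_SIZE
        if bytes(fixed[end - 2:end]) != signature:
            raise ValueError(f"fixup mismatch in sector {i - 1} of the record")
        original = array_offset + 2 * i
        fixed[end - 2:end] = record[original:original + 2]
    return bytes(fixed)


# Demo: a fake 1024-byte record with signature 0x0007 and two saved originals.
demo = bytearray(1024)
demo[0:4] = b"FILE"
struct.pack_into("<HH", demo, 4, 0x30, 3)
demo[0x30:0x36] = b"\x07\x00" + b"\xaa\xbb" + b"\xcc\xdd"
demo[510:512] = demo[1022:1024] = b"\x07\x00"
restored = apply_fixups(bytes(demo))
print(restored[510:512].hex(), restored[1022:1024].hex())
```

Running this over every record that straddles a block boundary is a cheap way to detect a wrong disk order or block pattern: a sector stitched in from the wrong block will almost never carry the expected signature.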

Mismatched $MFT parent record entry numbers

A potential cause behind this error, and here I am speculating a bit, is a misinterpreted child $MFT record entry. This child entry has its flags set to allocated in the record header. By extracting the parent directory record number from its $FILE_NAME attribute, we can navigate to the parent $MFT record entry. In the parent record we should check not only whether the sequence numbers match, but also the $MFT reference number of the index entry that corresponds to the child being investigated. Provided that the parent sequence number in the child's $FILE_NAME attribute matches the actual sequence number of the parent, there should be an entry for this child in the parent's index nodes as well. In this case, the B-tree is large enough that the non-resident index allocation attribute comes into play, and this structure contains the index entry of the problematic child. However, its virtual cluster number will not line up with the logical cluster number, because we messed up the order of blocks.
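
The sequence number comparison can be illustrated with a few lines of Python. An NTFS file reference is 8 bytes: the low 48 bits hold the record number and the high 16 bits the sequence number. The sketch assumes the parent reference is the first field of the $FILE_NAME attribute content and that the record's own sequence number sits at header offset 0x10; the function names are mine, and locating the attribute inside the record is left out.

```python
import struct


def split_file_reference(reference):
    """Split an 8-byte file reference into (record number, sequence number)."""
    return reference & 0xFFFFFFFFFFFF, reference >> 48


def parent_reference_matches(file_name_content, parent_record, parent_record_number):
    """Compare the child's parent reference with the actual parent record."""
    (reference,) = struct.unpack_from("<Q", file_name_content, 0)
    number, sequence = split_file_reference(reference)
    (actual_sequence,) = struct.unpack_from("<H", parent_record, 0x10)
    return number == parent_record_number and sequence == actual_sequence


# Demo: the child points at record 5 with sequence 3; the parent record carries sequence 3.
child_content = struct.pack("<Q", (3 << 48) | 5)
parent = bytearray(1024)
parent[0:4] = b"FILE"
struct.pack_into("<H", parent, 0x10, 3)
print(split_file_reference((3 << 48) | 5))
print(parent_reference_matches(child_content, bytes(parent), 5))
```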

© 2012 - 2021 Armen Arsakian. Updated at Thursday 07 November 2019. Contact: contact at arsakian.com