Notes@HKU by Jax

External storage

Hard disk drives

Components

  • Platter: A disk that stores data magnetically.
  • Track: A concentric ring on a platter that stores data.
  • Sector: A segment of a track, usually 512 bytes.
  • Cylinder: A set of tracks on different platters that are aligned vertically.
  • Read/write head: A device that reads and writes data on a platter.

Sector format

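  • Example format: 30 sectors per track, 600 bytes each (512 bytes of data plus 88 bytes of control information)
  • Sector fields:
    • Gap 1 (17 bytes): Separates sectors
    • ID Field (7 bytes): Contains a synch byte (1 byte), the track, head, and sector numbers (4 bytes), and a CRC (2 bytes)
    • Gap 2 (41 bytes): Separates the ID field from the data field
    • Data Field (515 bytes): Contains a synch byte (1 byte), the data (512 bytes), and a CRC (2 bytes)
    • Gap 3 (20 bytes): Separates the data field from the ID field of the next sector, bringing the total to 600 bytes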
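The field sizes can be checked against the stated 600-byte sector with a few lines of Python. This is only a sketch: the dictionary and function names are made up for illustration, and Gap 3 is assumed to be 20 bytes, which is the value that makes the fields sum to 600.

```python
# Field sizes in bytes for the example sector format.
# Assumption: Gap 3 is 20 bytes, so that the fields total 600 bytes.
SECTOR_FIELDS = {
    "gap1": 17,
    "id_field": 7,      # synch 1 + track/head/sector 4 + CRC 2
    "gap2": 41,
    "data_field": 515,  # synch 1 + data 512 + CRC 2
    "gap3": 20,
}
SECTORS_PER_TRACK = 30
DATA_BYTES_PER_SECTOR = 512


def sector_size(fields=SECTOR_FIELDS):
    """Total bytes occupied by one sector, including gaps and control fields."""
    return sum(fields.values())


def overhead_bytes(fields=SECTOR_FIELDS):
    """Bytes per sector that do not hold user data."""
    return sector_size(fields) - DATA_BYTES_PER_SECTOR


def track_capacity():
    """Return (raw bytes, user-data bytes) for one track."""
    raw = SECTORS_PER_TRACK * sector_size()
    usable = SECTORS_PER_TRACK * DATA_BYTES_PER_SECTOR
    return raw, usable


if __name__ == "__main__":
    raw, usable = track_capacity()
    print(f"sector size: {sector_size()} bytes, overhead: {overhead_bytes()} bytes")
    print(f"track: {raw} raw bytes, {usable} data bytes ({usable / raw:.1%} usable)")
```

With these sizes, roughly 85% of each track holds user data; the rest is gaps, synch bytes, addressing, and CRCs.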

Disk layout methods

There are two methods for laying out data on a disk:

  • Constant Angular Velocity (CAV): The disk rotates at a constant speed, and the data is read at a constant rate. Easier read/write, but density of data decreases from inner track to outer tracks, wasting space.
  • Multiple Zone Recording (MZR): Divides disk into zones, with each zone having a different number of sectors per track. Data density is nearly constant and allows maximized storage capacity.
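The capacity difference between the two layouts can be sketched as follows. All numbers here are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
def cav_capacity(num_tracks, sectors_per_track, bytes_per_sector=512):
    """CAV: every track holds the same number of sectors as the innermost track."""
    return num_tracks * sectors_per_track * bytes_per_sector


def zoned_capacity(zones, bytes_per_sector=512):
    """MZR: `zones` is a list of (tracks_in_zone, sectors_per_track) pairs,
    with outer zones holding more sectors per track."""
    return sum(tracks * sectors for tracks, sectors in zones) * bytes_per_sector


if __name__ == "__main__":
    # Hypothetical disk: 1000 tracks, innermost track fits 100 sectors.
    cav = cav_capacity(1000, 100)
    # Same disk split into 4 zones of 250 tracks, inner zone first.
    mzr = zoned_capacity([(250, 100), (250, 125), (250, 150), (250, 175)])
    print(f"CAV: {cav} bytes, MZR: {mzr} bytes, gain: {mzr / cav - 1:.1%}")
```

In this made-up layout the zoned disk stores about 37.5% more data, because the outer tracks are no longer limited to the sector count of the innermost track.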

Disk Access Time

  • Average seek time $T_\text{seek}$: Time taken to move the read/write head to the desired track.
  • Average rotational latency $T_\text{latency}$: Time for the desired sector to rotate under the head.
  • Data transfer time $T_\text{transfer}$: Time taken to read or write the data once the head is in position.
  • Time for one rotation $T_\text{rotation}$
  • Time to rotate past one sector $T_\text{sector}$
  • Number of sectors per track $n_\text{sectors}$

Formulas in seconds:

$$
\begin{aligned}
\frac{1}{\text{rps}} &= \frac{60}{\text{RPM}} \\
T_\text{rotation} &= \frac{1}{\text{rps}} \\
T_\text{latency} &= \frac12 \times T_\text{rotation} \\
T_\text{sector} &= \frac{T_\text{rotation}}{n_\text{sectors}} \\
T_\text{transfer} &= \frac{\text{bytes}}{\text{bytes per track}} \times \frac{1}{\text{rps}}
\end{aligned}
$$

Time to access $n$ consecutive sectors, ignoring transfer speed, is given by $T_\text{seek} + T_\text{latency} + n \times T_\text{sector}$

Time to access $n$ non-consecutive sectors, ignoring transfer speed, is given by $(T_\text{seek} + T_\text{latency} + T_\text{sector}) \times n$
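The formulas above can be turned into a small calculator. The following is a minimal sketch; the function names and the drive parameters in the example (15,000 RPM, 4 ms average seek, 500 sectors per track) are assumptions picked for illustration, not values from these notes.

```python
def rotation_time(rpm):
    """Seconds per full rotation: 1/rps = 60/RPM."""
    return 60.0 / rpm


def latency(rpm):
    """Average rotational latency: half a rotation."""
    return rotation_time(rpm) / 2


def sector_time(rpm, n_sectors):
    """Time for one sector to pass under the head."""
    return rotation_time(rpm) / n_sectors


def consecutive_access(n, t_seek, rpm, n_sectors):
    """One seek and one latency, then n sectors read back to back."""
    return t_seek + latency(rpm) + n * sector_time(rpm, n_sectors)


def scattered_access(n, t_seek, rpm, n_sectors):
    """Every sector pays its own seek and rotational latency."""
    return n * (t_seek + latency(rpm) + sector_time(rpm, n_sectors))


if __name__ == "__main__":
    # Hypothetical drive: 15,000 RPM, 4 ms average seek, 500 sectors per track.
    args = dict(t_seek=0.004, rpm=15_000, n_sectors=500)
    print(f"consecutive: {consecutive_access(2500, **args) * 1000:.1f} ms")
    print(f"scattered:   {scattered_access(2500, **args) * 1000:.1f} ms")
```

For 2500 sectors on this hypothetical drive, the consecutive read takes about 26 ms while the scattered read takes about 15 s, which shows how strongly seek time and rotational latency dominate random access.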

Solid State Drives (SSD)

SSD

  • Non-volatile storage devices
  • Limited write cycles
  • Faster I/O and shorter access times than HDDs
  • No moving parts: Lower power consumption, less heat, no noise, high durability

Redundant Array of Independent Disks (RAID)

The goal of RAID is to improve performance and reliability of data storage by using multiple disks.

| RAID Level | Data Layout / Parity | Fault Tolerance | Advantages | Disadvantages |
|---|---|---|---|---|
| 0 | Data is striped (distributed in a round-robin fashion) across disks; no parity | None: one disk failure causes total data loss | Excellent I/O performance; simple design | No redundancy: loss of any disk results in loss of the entire array |
| 1 | Data is mirrored on two disks | Can tolerate disk failure as long as one copy remains | 100% redundancy; potentially reduced seek time during read operations | Requires double the physical writes; highest storage overhead |
| 2 | Data is striped at the bit level with dedicated disks for Error Correction Codes (Hamming codes) | Error correction possible, but requires extra disks | High data transfer rate (especially with smaller strip sizes); simpler ECC controller design compared to later RAID levels | Expensive; high ECC disk-to-data disk ratio; not used commercially |
| 3 | Data is striped at the bit level; one dedicated disk stores parity bits | Recovers from a single disk failure | Very high read/write speeds; efficient use of ECC for fault recovery | Dedicated parity disk can become a bottleneck |
| 4 | Data is striped at the block level with a dedicated parity disk | Recovers from a single disk failure | Allows individual spindle control; similar benefits as RAID 3 | Write penalty (requiring extra read/write cycles to update parity); uncommon |
| 5 | Data and parity are distributed across all disks; parity is calculated on a block basis | Can endure one disk failure | High I/O performance; distributed parity avoids single-disk bottlenecks; widely used | More complex parity recalculation; rebuild process can be challenging |
| 6 | Similar to RAID 5 but with dual distributed parity (using two different parity methods) | Tolerates up to two disk failures | Highest fault tolerance; enhanced data reliability | Higher controller complexity and parity computation overhead |
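Two of the ideas in the table, round-robin striping (RAID 0) and single-parity recovery (RAID 3/4/5), can be illustrated in a few lines. This is a simplified sketch rather than how any particular controller works: it assumes XOR parity over equal-length blocks, and the function names are made up for illustration.

```python
from functools import reduce


def stripe_location(block, num_disks):
    """RAID 0 round-robin: map a logical block to (disk index, row on that disk)."""
    return block % num_disks, block // num_disks


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks; used here as the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the single missing block by XOR-ing the survivors with the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    # Logical blocks 0..5 spread over 3 disks.
    print([stripe_location(b, 3) for b in range(6)])

    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one stripe across 3 data disks
    parity = xor_parity(data)
    # Pretend the second disk failed and rebuild its block from the rest.
    recovered = rebuild([data[0], data[2]], parity)
    print(recovered == data[1])
```

Because XOR is its own inverse, any one missing block in a stripe can be rebuilt from the remaining blocks plus parity, which is why single-parity levels survive exactly one disk failure.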
