In this article, I’ll break down the overall structure of data stored on CD-ROMs, cover what data is currently missing from most CD-ROM image formats, and propose a new CD image format that obviates the need for CUE sheets to describe disc tracks, while also providing more complete disc preservation.
I’ll start from the highest level structure and then go progressively deeper into the lesser known details of the format.
Compact Discs can store between 650MB and 737MB of data. The data is written to the disc in a spiral pattern, and the exact maximum capacity depends on how tightly the spiral is wound. The spiral cannot be made too dense, or drives will be unable to read it.
Data is encoded into this spiral via pits and lands, which are roughly analogous to ones and zeroes, but we’ll delve into that more later.
We’re going to presume 650MB CDs for the remainder of this article, although the information is applicable to more dense CDs as well.
One 650MB CD holds 74 minutes of audio data in signed 16-bit stereo format at a sampling rate of 44.1kHz. This is known as the Redbook audio format.
The disc is divided into 333,000 sectors, each of which contains 2,352 bytes of data. Every 75 sectors represents exactly one second of audio, thus:
333000 sectors / 75 sectors per second = 4440 seconds = 74 minutes
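The 2,352-byte sector size is not arbitrary: it falls out of the audio format itself. Here is a minimal Python sketch of that arithmetic, plus a conversion from a sector count to minutes, seconds, and sectors. The constant and function names are my own, chosen for illustration.

```python
SAMPLE_RATE = 44_100       # samples per second, per channel
CHANNELS = 2               # stereo
BYTES_PER_SAMPLE = 2       # signed 16-bit
SECTORS_PER_SECOND = 75
SECTOR_SIZE = 2352         # bytes per sector


def sectors_to_msf(sectors: int) -> tuple[int, int, int]:
    """Convert a sector count to (minutes, seconds, leftover sectors)."""
    seconds, frames = divmod(sectors, SECTORS_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return minutes, seconds, frames


# One second of audio is exactly 75 sectors' worth of bytes.
bytes_per_second = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE
assert bytes_per_second == SECTORS_PER_SECOND * SECTOR_SIZE

print(sectors_to_msf(333_000))  # (74, 0, 0)
```

The same minutes:seconds:sectors notation shows up again in CUE sheets and in the Q-subchannel timestamps.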
The Redbook audio standard specifies a lead-in area, which encodes the disc’s table of contents, or TOC. It also specifies a lead-out area, which tells the disc player when to stop playing a CD. And it also specifies that there should be a two-second pregap of silence before each track.
Some audio CDs omit the gaps to allow one song to seamlessly transition into the next without any silence.
The TOC is used to tell the disc players where each track is located, within approximately one second of accuracy. CD players read the TOC as their first step when a disc is started, and they cache this information for track seeking later on.
CDs can have up to 99 tracks, numbered 1 – 99. Each track can further have up to 99 indexes, numbered 1 – 99 as well. I’m not personally aware of any CDs that attempt to use track 0, but index 0 does have a defined meaning: it marks the start of the track pregap, while index 1 marks the start of the music.
The TOC stores only the track numbers, and the individual tracks contain the index numbers in the Q-subchannel data, which we’ll get to shortly.
Some bands got clever with the first track’s index 1, and would set it further into the disc. The TOC points to each track’s index 1, so the portion of the track before it is skipped during normal playback. By rewinding from the start of track 1, you would reveal a hidden “track 0” of audio. But it’s really just audio hiding in the pregap of track 1.
Get used to abuses of the CD-ROM format. They’re very common.
Later on, the Yellowbook standard came along, which defined a method of storing data on CDs.
But it turns out that CDs aren’t all that reliable, and the lower-level CIRC coding (which we’ll get to in a bit) wasn’t enough error correction.
And so data CDs split up each 2,352-byte sector into 2,048 bytes of actual data, a 12-byte sync pattern to identify the start of each sector, a 3-byte absolute sector address, a 1-byte mode specifier, a 4-byte checksum, 8 bytes of reserved data, and 276 bytes of Reed-Solomon Product Code (RSPC) error correction. The error correction portion is split into 172 bytes of P-parity and 104 bytes of Q-parity. This gives us the following format for each sector:
sync pattern: 12 bytes
address: 3 bytes
mode: 1 byte
data: 2,048 bytes
checksum (EDC): 4 bytes
reserved: 8 bytes
P-parity: 172 bytes
Q-parity: 104 bytes
total: 2,352 bytes
333000 sectors * 2048 bytes = ~650 MB of storage per disc
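Since the field sizes are fixed, a raw mode 1 sector can be sliced apart by byte offset. The following is a rough Python sketch rather than a definitive parser. Two details are my assumptions and are not spelled out above: the sync pattern is one 0x00 byte, ten 0xFF bytes, and one 0x00 byte, and the address bytes are stored as binary-coded decimal. The function names are hypothetical.

```python
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # assumed 12-byte sync pattern


def bcd(byte: int) -> int:
    """Decode one binary-coded decimal byte, e.g. 0x16 -> 16."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def parse_mode1_sector(raw: bytes) -> dict:
    """Slice a raw 2,352-byte mode 1 sector into its fields."""
    if len(raw) != 2352:
        raise ValueError("expected a 2,352-byte raw sector")
    if raw[0:12] != SYNC:
        raise ValueError("sync pattern not found")
    if raw[15] != 1:
        raise ValueError("not a mode 1 sector")
    return {
        "address": (bcd(raw[12]), bcd(raw[13]), bcd(raw[14])),
        "mode": raw[15],
        "data": raw[16:2064],         # 2,048 bytes of actual data
        "edc": raw[2064:2068],        # 4-byte checksum
        "reserved": raw[2068:2076],   # 8 reserved bytes
        "p_parity": raw[2076:2248],   # 172 bytes
        "q_parity": raw[2248:2352],   # 104 bytes
    }


# A synthetic sector with empty data and empty (invalid) parity.
sector = SYNC + bytes([0x00, 0x02, 0x16, 0x01]) + bytes(2048) + bytes(4 + 8 + 276)
fields = parse_mode1_sector(sector)
print(fields["address"], len(fields["data"]))  # (0, 2, 16) 2048
```

This only splits the sector into its parts. It does not verify the checksum or apply any error correction.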
RSPC is used to provide a higher level of error correction. It can detect damage to the data caused by disc scratches and fingerprint smudges, and can repair some of the errors. The sector checksum, or EDC, is a simple cyclic redundancy check to ensure that the RSPC-corrected data is valid.
The above is what’s known as mode 1. The Yellowbook standard also describes mode 2, which can be used for more data storage when the absolute integrity of the data is not essential, such as for video data. We gain more storage at the expense of some error correcting ability:
333000 sectors * 2336 bytes = ~741 MB of storage per disc
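To compare the capacities side by side, here is a small Python sketch. It assumes that “MB” in the figures above means 1,048,576 bytes; the helper name is made up for this example.

```python
SECTORS = 333_000
MIB = 1024 * 1024


def capacity_mib(bytes_per_sector: int, sectors: int = SECTORS) -> float:
    """Usable capacity in units of 1,048,576 bytes."""
    return sectors * bytes_per_sector / MIB


for label, size in [("mode 1", 2048), ("mode 2", 2336), ("audio", 2352)]:
    print(f"{label}: {capacity_mib(size):.2f}")
```

Mode 2 lands between mode 1 and raw audio because it gives up the error correction bytes but still keeps the sync pattern, address, and mode byte.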
.iso CD-ROM images are data-only tracks that consist of only the 2,048 bytes of mode 1 data per sector. This is the most compact representation of a CD, but also the one that omits the most data.
It is really only suitable for distributing images to be burned onto CDs, e.g. Linux OS releases.
.bin CD-ROM images store 2,352 bytes per sector, and can thus encode both audio and data tracks (in modes 1 and 2).
The .bin format still omits the subchannel data, which we will get to soon, and the lead-in and lead-out portions of the disc.
In their place, .bin images come with .cue files, or CUE sheets, to describe the table of contents in text form.
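To make the role of the CUE sheet concrete, here is a minimal Python sketch that reads TRACK and INDEX lines and works out where each track begins inside the .bin file. It assumes a single FILE entry with 2,352 bytes per sector, and it ignores every other CUE command. The file name and the sheet contents are invented for the example.

```python
def msf_to_sectors(msf: str) -> int:
    """Convert "MM:SS:FF" to a sector count (75 sectors per second)."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * 75 + frames


def parse_cue(text: str) -> list[dict]:
    """Collect track numbers, track types, and index positions."""
    tracks = []
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "TRACK":
            tracks.append({"number": int(words[1]), "type": words[2], "indexes": {}})
        elif words[0] == "INDEX" and tracks:
            tracks[-1]["indexes"][int(words[1])] = msf_to_sectors(words[2])
    return tracks


# Hypothetical sheet: one data track followed by one audio track.
EXAMPLE_CUE = """FILE "example.bin" BINARY
  TRACK 01 MODE1/2352
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 00 10:00:00
    INDEX 01 10:02:00
"""

for track in parse_cue(EXAMPLE_CUE):
    start = track["indexes"][1]
    print(track["number"], track["type"], "starts at byte", start * 2352)
```

Note that the sheet carries only what the TOC and the index points would have told us. Nothing else from the subchannels or the lead-in survives.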
Now we’ll start going lower-level.
CD-ROMs have more than just 2,352 bytes per sector. Every sector is split into 98 F3 frames, each of which carries 24 bytes of the sector’s data.
These F3 frames are where you find the subchannel data. There are eight of these channels, labeled P, Q, R, S, T, U, V, W. Each F3 frame carries a single subchannel byte, with one bit for each of the eight channels. The first two frames of a sector hold sync patterns instead, so each subchannel gets 96 bits, or 12 bytes, of data per sector. Thus, you must decode an entire sector to get the eight subchannel blocks of data: each block is split across the sector’s F3 frames.
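Gathering the subchannel blocks back together might look like the Python sketch below. It assumes the usual raw layout, where each frame contributes one byte holding one bit per subchannel (P in bit 7 through W in bit 0), and where 96 of the 98 frames carry data. The function name is my own.

```python
CHANNEL_NAMES = "PQRSTUVW"


def deinterleave_subchannels(raw: bytes) -> dict[str, bytes]:
    """Split 96 interleaved subchannel bytes into eight 12-byte blocks."""
    if len(raw) != 96:
        raise ValueError("expected 96 subchannel bytes, one per data frame")
    blocks = {}
    for position, name in enumerate(CHANNEL_NAMES):
        bit = 7 - position  # assumed order: P is bit 7, W is bit 0
        block = bytearray(12)
        for i, byte in enumerate(raw):
            if byte & (1 << bit):
                block[i // 8] |= 0x80 >> (i % 8)  # fill each block MSB first
        blocks[name] = bytes(block)
    return blocks


# Synthetic input: P set in every frame, Q set only in the first frame.
raw = bytes([0x80 | (0x40 if i == 0 else 0x00) for i in range(96)])
blocks = deinterleave_subchannels(raw)
print(blocks["P"].hex(), blocks["Q"].hex())
```

Each of the eight resulting blocks is 12 bytes long, which is where the 12-byte figure for a subchannel block comes from.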
The P-subchannel is a very simple bit pattern that is used to identify the start of tracks. The Q-subchannel data is much more interesting: in the lead-in area, the table of contents is stored here.
Because the subchannel data is not protected by the RSPC codes (it’s at a lower level on the disc), it’s not always possible to read it back without errors. The Q-subchannel encodes a 2-byte CRC for each block, and then the lead-in repeats the TOC over and over again, usually for around 7,500 sectors, so that the disc player can keep reading it until it is able to decode all of the track starting locations.
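Checking that 2-byte CRC could look like the Python sketch below. The details here are my assumptions rather than something stated above: the polynomial is the common CCITT one (0x1021) with an initial value of zero, the CRC covers the first 10 bytes of the block, and it is stored inverted and big-endian in the last two bytes. The function names are hypothetical.

```python
def crc16_ccitt(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 and an initial value of 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def q_crc_ok(q: bytes) -> bool:
    """Validate a 12-byte Q block: 10 payload bytes plus a 2-byte CRC."""
    if len(q) != 12:
        raise ValueError("expected a 12-byte Q-subchannel block")
    stored = int.from_bytes(q[10:12], "big")
    return stored == (crc16_ccitt(q[:10]) ^ 0xFFFF)  # assumed: stored inverted


# Build a synthetic block, then damage one bit of it.
payload = bytes([0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00])
good = payload + (crc16_ccitt(payload) ^ 0xFFFF).to_bytes(2, "big")
bad = bytes([good[0] ^ 0x10]) + good[1:]
print(q_crc_ok(good), q_crc_ok(bad))  # True False
```

A player can simply discard any block that fails this check and wait for the next repetition, which is why repeating the TOC so many times works.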
The Q-subchannel is also used within the tracks, and this tells the disc player where the laser is currently reading, both in absolute and relative time, which is how disc players can display timestamps while playing music.
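As a rough illustration of how a player gets its timestamps, here is a Python sketch that decodes one Q block read from inside a track. The byte layout is an assumption on my part: a control/ADR byte, the track and index numbers, a relative time, a zero byte, and an absolute time, all in binary-coded decimal, followed by the CRC. It only handles blocks whose ADR value is 1, and the names are invented for the example.

```python
def bcd(byte: int) -> int:
    """Decode one binary-coded decimal byte, e.g. 0x25 -> 25."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def decode_q_position(q: bytes) -> dict:
    """Decode a 12-byte Q block that carries position information."""
    if len(q) != 12:
        raise ValueError("expected a 12-byte Q-subchannel block")
    control, adr = q[0] >> 4, q[0] & 0x0F
    if adr != 1:
        raise ValueError("not a position block")
    return {
        "control": control,
        "track": bcd(q[1]),
        "index": bcd(q[2]),
        "relative": (bcd(q[3]), bcd(q[4]), bcd(q[5])),  # time within the track
        "absolute": (bcd(q[7]), bcd(q[8]), bcd(q[9])),  # time within the disc
    }


# Synthetic block: track 2, index 1, with the CRC bytes left as zero.
q = bytes([0x01, 0x02, 0x01, 0x00, 0x01, 0x30, 0x00, 0x03, 0x25, 0x12, 0x00, 0x00])
info = decode_q_position(q)
print(info["track"], info["index"], info["relative"], info["absolute"])
```

The relative time is what a player shows as the elapsed time of the current song, and the absolute time is the position on the disc as a whole.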
The Q-subchannel data looks like this: