-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathformat.txt
178 lines (133 loc) · 6.2 KB
/
format.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
-----------------------------------------------------------------------------
ECM (Error Code Modeler) Format Specification v1.0
Written by Neill Corlett
-----------------------------------------------------------------------------
Introduction
------------
For many people who distribute images of CD-ROMs, the common practice is to
create a raw copy in either BIN/CUE (CDRWin), CDI (DiscJuggler), NRG (Nero),
or CCD (CloneCD) format, and then compress that image using an archive format
such as RAR. The goal is to reduce the CD image to the smallest possible
size for quicker Internet transfers. However, there is a major problem with
this technique.
Raw CD images contain both the useful sector data as well as subcodes,
including ECC and EDC (Error Correction/Detection Codes). In many cases,
it's necessary to include all subcode information to ensure a complete and
accurate CD image. However, most raw CD images include a large number of
"plain" sectors, for which the subcode data is merely redundant. Since the
RAR format includes CRC verification and the TCP/IP transport layer also
includes error detection, there is no purpose to transferring this redundant
data.
Making the problem worse, the ECC/EDC data in a typical CD sector is very
poorly modeled by typical Lempel-Ziv compression schemes (such as the one
used by RAR). This means that, for practical purposes, ECC/EDC data is
uncompressible.
However, with a simple algorithm, the ECC/EDC data can be reconstructed
for a typical CD sector. What's needed is a compressed file format which
uses the ECC/EDC algorithms as a model to eliminate the redundant data.
This is ECM, the Error Code Modeler format.
-----------------------------------------------------------------------------
Sector types
------------
There are three basic types of CD data sectors:
Type #1: Mode 1
-----------------------------------------------------
0 1 2 3 4 5 6 7 8 9 A B C D E F
0000h 00 FF FF FF FF FF FF FF FF FF FF 00 [-ADDR-] 01
0010h [---DATA...
...
0800h ...DATA---]
0810h [---EDC---] 00 00 00 00 00 00 00 00 [---ECC...
...
0920h ...ECC---]
-----------------------------------------------------
Type #2: Mode 2 (XA), form 1
-----------------------------------------------------
0 1 2 3 4 5 6 7 8 9 A B C D E F
0000h 00 FF FF FF FF FF FF FF FF FF FF 00 [-ADDR-] 02
0010h [--FLAGS--] [--FLAGS--] [---DATA...
...
0810h ...DATA---] [---EDC---] [---ECC...
...
0920h ...ECC---]
-----------------------------------------------------
Type #3: Mode 2 (XA), form 2
-----------------------------------------------------
0 1 2 3 4 5 6 7 8 9 A B C D E F
0000h 00 FF FF FF FF FF FF FF FF FF FF 00 [-ADDR-] 02
0010h [--FLAGS--] [--FLAGS--] [---DATA...
...
0920h ...DATA---] [---EDC---]
-----------------------------------------------------
Key:
ADDR: Sector address, encoded as minutes:seconds:frames in BCD
FLAGS: Used in Mode 2 (XA) sectors describing the type of sector; repeated
twice for redundancy
DATA: Area of the sector which contains the actual data itself
EDC: Error Detection Code
ECC: Error Correction Code
The ECM format uses these three basic types to model CD sectors.
-----------------------------------------------------------------------------
The ECM file format
-------------------
The first 4 bytes are the magic identifier: 45 43 4D 00, or "ECM".
After this comes an arbitrary (can be infinitely long) number of records.
Each record has 3 parts: Type (0...3), Count (1...2^31-1), and Data.
- Type 0: Literal bytes follow; length is indicated by Count.
- Type 1: Sectors of type #1 follow; Count tells how many.
- Type 2: Sectors of type #2 follow; Count tells how many.
- Type 3: Sectors of type #3 follow; Count tells how many.
The Type and Count are encoded first, in the following format.
(Second through fifth bytes are OPTIONAL.)
first byte second byte third byte fourth byte fifth byte
AaaaaaTT Bbbbbbbb Cccccccc Dddddddd 00eeeeee
T - Type.
a - Bits 0-4 of Count.
A - Set if the second byte exists.
b - Bits 5-11 of Count.
B - Set if the third byte exists.
c - Bits 12-18 of Count.
C - Set if the fourth byte exists.
d - Bits 19-25 of Count.
D - Set if the fifth byte exists.
e - Bits 26-31 of Count.
NOTE: The Count value that is encoded is the actual Count minus 1.
Type 0 and a count of exactly 2^32 (meaning it's encoded as FFFFFFFF)
indicates that there are no more records. Immediately following this
marker is a 4-byte EDC for the entire original (unencoded) file. This is
computed using the same algorithm as the EDC for a CD sector. After this,
the ECM file ends.
Counts of 2^31 and higher are invalid.
-----------------------------------------------------------------------------
Sector type #1
--------------
Stored in the ECM file as follows:
3 bytes - ADDR
2048 bytes - DATA
This expands to a complete 2352-byte Mode 1 sector.
The sync, reserved, mode (01), EDC, and ECC bytes are reconstructed upon
decoding.
-----------------------------------------------------------------------------
Sector type #2
--------------
Stored in the ECM file as follows:
4 bytes - FLAGS
2048 bytes - DATA
This expands to a 2336-byte Mode 2 Form 1 sector. The sync, address, and
mode bytes are NOT included.
The redundant flags, EDC, and ECC bytes are reconstructed upon decoding.
-----------------------------------------------------------------------------
Sector type #3
--------------
Stored in the ECM file as follows:
4 bytes - FLAGS
2324 bytes - DATA
This expands to a 2336-byte Mode 2 Form 2 sector. The sync, address, and
mode bytes are NOT included.
The redundant flags and EDC bytes are reconstructed upon decoding.
-----------------------------------------------------------------------------
Where to find me
----------------
email: [email protected]
www: http://lfx.org/~corlett/
-----------------------------------------------------------------------------