Jonathan Hays - Mu-Law and A-Law Compression Tutorial

Disclaimer: This post comes from my old blog back in 2004. I’m reposting it here so that I don’t lose the content. The source was hand-written HTML so the formatting probably appears a bit off.
	<table width="600" cellspacing="15"><tr>
		<td width="100%">
			<br>
			<br>
			<font size="3" face="Arial, Helvetica, sans-serif"><b>Overview:</b></font>
			What are A-Law and Mu-Law compression?  In the simplest terms, they are 
			standard forms of audio compression for 16 bit sounds.  Like most audio 
			compression techniques, they are lossy, which means that when you expand them
			back from their compressed state, they will not be exactly the same as
			when you compressed them.  The compression is always 2:1, meaning that 
			audio compressed with either of these algorithms will always be exactly half
			of their original size.
			<br>
			Mu-Law and A-Law compression are both logarithmic forms of data compression,
			and are extremely similar, as you will see in a minute.  One definition of Mu-Law is 
			<br><i>
			<br>          "...a form of logarithmic data compression
			<br>          for audio data.  Due to the fact that we hear logarithmically, 
			<br>          sound recorded at higher levels does not require the same 
			<br>          resolution as low-level sound.  This allows us to disregard
			<br>          the least significant bits in  high-level data.  This turns  
			<br>          out to resemble a logarithmic transformation.  The resulting
			<br>          compression forces a 16-bit number to be represented as an 8-bit 
			<br>          number." </i>
			<a href="https://web.archive.org/web/20040608152810/http://www-s.ti.com/sc/psheets/spra267/spra267.pdf">
			(www-s.ti.com/sc/psheets/spra267/spra267.pdf)
			</a>
			<br>
			<br>And from the comp.dsp newsgroup FAQ we also get this definition:
			<br><i>
			<br>          Mu-law (also "u-law") encoding is a form of logarithmic
			<br>          quantization or companding. It's based on the observation that
			<br>          many signals are statistically more likely to be near a low
			<br>          signal level than a high signal level. Therefore, it makes
			<br>          more sense to have more quantization points near a low level
			<br>          than a high level. In a typical mu-law system, linear samples
			<br>          of 14 to 16 bits are companded to 8 bits. Most telephone
			<br>          quality codecs (including the Sparcstation's audio codec) use
			<br>          mu-law encoded samples.
			<br></i>          
			<br>In simpler terms, this means that sound is represented as a wave, and humans
			can only hear audio in the middle of the wave.  We can remove data from the upper
			and lower frequencies of a sound, and humans will not be able to hear a significant
			difference.  Both Mu-Law and A-Law take advantage of this, and are able to
			compress 16-bit audio in an manner acceptable to human ears.
			A-Law and Mu-Law compression appear to have been developed at around the
			same time, and basically only differ by the particular logarithmic function
			used to determine the translation.  When we get to the work of
			implementing the algorithms, you will see that the differences are nominal.
			The main difference is that Mu-Law attempts to keep the top five bits of precision, 
			and uses a logarithmic function to determine the bottom three bits, while A-Law
			compression keeps the top four bits and uses the logarithmic function to figure out
			the bottom four.  Both of these algorithms are used as telecommunication standards,
			A-Law being used mainly in Europe, and Mu-Law being used in the United States.
			<br>
			<br><b><i>DISCLAIMER:</i></b>
			<br>Please understand that I am glossing over several of the details, but recognize that 
			the entire purpose of this document is to make two extremely useful algorithms much 
			more accessable to "average" programmers, like myself.
			<br>
			<br>
			<br><font size="3" face="Arial, Helvetica, sans-serif"><b>Mu-Law Compression:</b></font>
			<br>As you read this explanation, remember that the purpose of the algorithm is to
			compress a 16-bit source sample down to an 8-bit sample.  The crux of Mu-Law
			functionality is deciding which of the samples need to keep the most of their
			precision.  Even the "most-important" sample will still lose precision.  It
			simply becomes a matter of determining how much each sample loses, and minimizing
			the loss on samples deemed "more important".  
			<br>To generate a compressed Mu-Law sample from an uncompressed sample, the following
			algorithm is applied to the 16-bit source sample.  
			<br><i>(Please refer to the code listing for Mu-Law compression.)</i>
			<br>
			<br>First, the algorithm first stores off the sign.  It then adds in a bias value 
			which (due to wrapping) will cause high valued samples to lose precision.
			The top five most significant bits are pulled out of the sample (which has 
			been previously biased).  Then, the bottom three bits of the compressed byte are 
			generated using a small look-up table, based on the biased value of the source sample.  
			The 8-bit compressed sample is then finally created by logically OR'ing together 
			the 5 most important bits, the 3 lower bits, and the sign when applicable.  The bits 
			are the logically NOT'ed, which I assume is for transmission reasons (although you 
			might not transmit your sample.)
			<br>
			<br>
			<br><b>MuLaw Compresion Code:</b>
			<br>
			<font size="1" face="Courier, Helvetica, sans-serif">
			<br>const int cBias = 0x84;
			<br>const int cClip = 32635;
			<br>
			<br>static char MuLawCompressTable[256] = 
			<br>{
			<br>     0,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,
			<br>     4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
			<br>     5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
			<br>     5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
			<br>     6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
			<br>};
			<br>
			<br>unsigned char LinearToMuLawSample(short sample)
			<br>{
			<br>     int sign = (sample &gt;&gt; 8) &amp; 0x80;	
			<br>     if (sign)
			<br>          sample = (short)-sample;
			<br>     if (sample &gt; cClip)
			<br>          sample = cClip;
			<br>     sample = (short)(sample + cBias);
			<br>     int exponent = (int)MuLawCompressTable[(sample&gt;&gt;7) &amp; 0xFF];
			<br>     int mantissa = (sample &gt;&gt; (exponent+3)) &amp; 0x0F;
			<br>     int compressedByte = ~ (sign | (exponent &lt;&lt; 4) | mantissa);
			<br>	
			<br>     return (unsigned char)compressedByte;
			<br>}
			<br></font>
			<br>
			<br>
			<br>
			<br><font size="3" face="Arial, Helvetica, sans-serif"><b>A-Law Compression:</b></font>
			<br>
			<br>As mentioned earlier, A-Law compression is extremely similar to Mu-Law compression.  As you
			will see, they differ primarily in the way that they keep precision.  The 
			following is a short synopsis of the encoding algorithm, and the code example follows the
			written explanation.
			First, the sign is stored off.  Then the code branches.  If the absolute value of the source
			sample is less than 256, the 16-bit sample is simply shifted down 4 bits and converted 
			to an 8-bit value, thus losing the top 4 bits in the process.  
			However, if it is more than 256, a logarithmic algorithm is applied to the sample to 
			determine the precision to keep.  In that case, the sample is shifted down to access the
			seven most significant bits of the sample.  Those seven bits are then used to determine the
			precision of the bottom 4 bits.  Finally, the top seven bits are shifted back up four bits
			to make room for the bottom 4 bits.  The two are then logically OR'd together to create the
			eight bit compressed sample.  The sign is then applied, and the entire compressed sample
			is logically XOR'd, again, I assume for transmission reasons.
			<br>
			<br>
			<br><b>A-Law Compression Code:</b>
			<br>
			<font size="1" face="Courier, Helvetica, sans-serif">
			<br>static char ALawCompressTable[128] = 
			<br>{
			<br>     1,1,2,2,3,3,3,3,
			<br>     4,4,4,4,4,4,4,4,
			<br>     5,5,5,5,5,5,5,5,
			<br>     5,5,5,5,5,5,5,5,
			<br>     6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,
			<br>     6,6,6,6,6,6,6,6,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7,
			<br>     7,7,7,7,7,7,7,7
			<br>};
			<br>
			<br>unsigned char LinearToALawSample(short sample)
			<br>{
			<br>     int sign;
			<br>     int exponent;
			<br>     int mantissa;
			<br>     unsigned char compressedByte;
			<br>
			<br>     sign = ((~sample) &gt;&gt; 8) &amp; 0x80;
			<br>     if (!sign) 
			<br>          sample = (short)-sample;
			<br>     if (sample &gt; cClip)
			<br>          sample = cClip;
			<br>     if (sample &gt;= 256)
			<br>     {
			<br>          exponent = (int)ALawCompressTable[(sample &gt;&gt; 8) &amp; 0x7F];
			<br>          mantissa = (sample &gt;&gt; (exponent + 3) ) &amp; 0x0F;
			<br>          compressedByte = ((exponent &lt;&lt; 4) | mantissa);
			<br>     }
			<br>     else 
			<br>     {
			<br>          compressedByte = (unsigned char)(sample &gt;&gt; 4);
			<br>     }
			<br>     compressedByte ^= (sign ^ 0x55);
			<br>     return compressedByte;
			<br>}
			<br></font>
			<br>
			<br>
			<br><font size="3" face="Arial, Helvetica, sans-serif"><b>Decompression:</b></font>
			<br>Now, the most obvious way to decompress a compressed Mu-Law or A-Law sample would 
			be to reverse the algorithm.  But a more efficient method exists.  Consider for a moment the 
			fact that A-Law and Mu-Law both take a 16-bit value and crunch it down to an 8-bit value.  
			The reverse of that is to take an 8-bit value and turn it into a sixteen bit value.  In the 
			graphics world, it is extremely common to represent 32 and 24 bit values with an eight bit 
			index into a palette table.  So, why not take a page from the world of graphics and use 
			palettes for the Mu-Law and A-Law compression look up?  Sounds good to me.  In fact, 
			these palettes will be smaller than their 24 and 32 bit cousins because we only need to 
			represent 16 bit values, not 24 and 32.  In a nutshell, we will create static lookup 
			tables to do the reverse conversion from A-Law and Mu-Law.  The two differing tables 
			are presented below.  To convert from your compressed sample back to the raw 16-bit 
			sample, just use your compressed sample as the index into the table, and the corresponding 
			value in the table is your decompressed 16-bit sample.  Obviously, the downside is that 
			this method requires the memory overhead for the tables, but each table is only 512 bytes.  
			In this day and age, that's downright cheap for the absolute fastest decompression!
			<br>
			<br>
			<br><b>Decompression Code:</b>
			<br><font size="1" face="Courier, Helvetica, sans-serif">
			<br>static short MuLawDecompressTable[256] = 
			<br>{        
			<br>     -32124,-31100,-30076,-29052,-28028,-27004,-25980,-24956,
			<br>     -23932,-22908,-21884,-20860,-19836,-18812,-17788,-16764,
			<br>     -15996,-15484,-14972,-14460,-13948,-13436,-12924,-12412,
			<br>     -11900,-11388,-10876,-10364, -9852, -9340, -8828, -8316,
			<br>      -7932, -7676, -7420, -7164, -6908, -6652, -6396, -6140,
			<br>      -5884, -5628, -5372, -5116, -4860, -4604, -4348, -4092,
			<br>      -3900, -3772, -3644, -3516, -3388, -3260, -3132, -3004,
			<br>      -2876, -2748, -2620, -2492, -2364, -2236, -2108, -1980,
			<br>      -1884, -1820, -1756, -1692, -1628, -1564, -1500, -1436,
			<br>      -1372, -1308, -1244, -1180, -1116, -1052,   -988,   -924,
			<br>       -876,  -844,  -812,  -780,  -748,  -716,  -684,  -652,
			<br>       -620,  -588,  -556,  -524,  -492,  -460,  -428,  -396,
			<br>       -372,  -356,  -340,  -324,  -308,  -292,  -276,  -260,
			<br>       -244,  -228,  -212,  -196,  -180,  -164,  -148,  -132,
			<br>       -120,  -112,  -104,   -96,   -88,   -80,   -72,   -64,
			<br>        -56,   -48,   -40,   -32,   -24,   -16,    -8,     0,
			<br>      32124, 31100, 30076, 29052, 28028, 27004, 25980, 24956,
			<br>      23932, 22908, 21884, 20860, 19836, 18812, 17788, 16764,
			<br>      15996, 15484, 14972, 14460, 13948, 13436, 12924, 12412,
			<br>      11900, 11388, 10876, 10364,  9852,  9340,  8828,  8316,
			<br>       7932,  7676,  7420,  7164,  6908,  6652,  6396,  6140,
			<br>       5884,  5628,  5372,  5116,  4860,  4604,  4348,  4092,
			<br>       3900,  3772,  3644,  3516,  3388,  3260,  3132,  3004,
			<br>       2876,  2748,  2620,  2492,  2364,  2236,  2108,  1980,
			<br>       1884,  1820,  1756,  1692,  1628,  1564,  1500,  1436,
			<br>       1372,  1308,  1244,  1180,  1116,  1052,   988,   924,
			<br>        876,   844,   812,   780,   748,   716,   684,   652,
			<br>        620,   588,   556,   524,   492,   460,   428,   396,
			<br>        372,   356,   340,   324,   308,   292,   276,   260,
			<br>        244,   228,   212,   196,   180,   164,   148,   132,
			<br>        120,   112,   104,    96,    88,    80,    72,    64,
			<br>         56,    48,    40,    32,    24,    16,     8,     0
			<br>};
			<br>
			<br>
			</font>
			<br>
			<br><font size="1" face="Courier, Helvetica, sans-serif">
			<br>static short ALawDecompressTable[256] = 
			<br>{
			<br>     -5504, -5248, -6016, -5760, -4480, -4224, -4992, -4736,
			<br>     -7552, -7296, -8064, -7808, -6528, -6272, -7040, -6784,
			<br>     -2752, -2624, -3008, -2880, -2240, -2112, -2496, -2368,
			<br>     -3776, -3648, -4032, -3904, -3264, -3136, -3520, -3392,
			<br>     -22016,-20992,-24064,-23040,-17920,-16896,-19968,-18944,
			<br>     -30208,-29184,-32256,-31232,-26112,-25088,-28160,-27136,
			<br>     -11008,-10496,-12032,-11520,-8960, -8448, -9984, -9472,
			<br>     -15104,-14592,-16128,-15616,-13056,-12544,-14080,-13568,
			<br>     -344,   -328,   -376,   -360,   -280,   -264,   -312,   -296,
			<br>     -472,   -456,   -504,   -488,   -408,   -392,   -440,   -424,
			<br>     -88,     -72,    -120,   -104,    -24,      -8,      -56,     -40,
			<br>     -216,   -200,   -248,   -232,   -152,   -136,   -184,   -168,
			<br>     -1376, -1312, -1504, -1440, -1120, -1056, -1248, -1184,
			<br>     -1888, -1824, -2016, -1952, -1632, -1568, -1760, -1696,
			<br>     -688,   -656,   -752,   -720,   -560,   -528,   -624,   -592,
			<br>     -944,   -912,  -1008,  -976,   -816,   -784,   -880,   -848,
			<br>      5504,   5248,   6016,   5760,   4480,   4224,   4992,   4736,
			<br>      7552,   7296,   8064,   7808,   6528,   6272,   7040,   6784,
			<br>      2752,   2624,   3008,   2880,   2240,   2112,   2496,   2368,
			<br>      3776,   3648,   4032,   3904,   3264,   3136,   3520,   3392,
			<br>      22016, 20992, 24064, 23040, 17920, 16896, 19968, 18944,
			<br>      30208, 29184, 32256, 31232, 26112, 25088, 28160, 27136,
			<br>      11008, 10496, 12032, 11520,  8960,   8448,   9984,   9472,
			<br>      15104, 14592, 16128, 15616, 13056, 12544, 14080, 13568,
			<br>      344,   328,   376,   360,   280,   264,   312,   296,
			<br>      472,   456,   504,   488,   408,   392,   440,   424,
			<br>      88,    72,   120,   104,    24,     8,    56,    40,
			<br>      216,   200,   248,   232,   152,   136,   184,   168,
			<br>      1376,  1312,  1504,  1440,  1120,  1056,  1248,  1184,
			<br>      1888,  1824,  2016,  1952,  1632,  1568,  1760,  1696,
			<br>      688,   656,   752,   720,   560,   528,   624,   592,
			<br>      944,   912,  1008,   976,   816,   784,   880,   848
			<br>};
			<br></font>
			<br>
			<br>
		</td>
		<td width="25">
		</td>
	</tr></table>