Advanced Audio Coding

Advanced Audio Coding (AAC), standardized as MPEG-2 Part 7 and, in a slightly modified form, as MPEG-4 Part 3, is a digital audio encoding and lossy compression format. It was popularized by Apple Computer through its iPod and iTunes Music Store.

AAC was designed as an improved-performance codec relative to MP3 (which was specified in MPEG-1) and MPEG-2 Part 3 (which is also known as "MPEG-2 Audio" or ISO/IEC 13818-3).

AAC was promoted as the successor to MP3 for audio coding at medium to high bit rates.

How AAC works

AAC is a wideband audio coding algorithm that exploits two primary coding strategies (steps 1 and 2 below) to dramatically reduce the amount of data needed to convey high-quality digital audio. The overall process is:

  1. Signal components that are perceptually irrelevant are discarded.
  2. Redundancies in the coded audio signal are eliminated.
  3. The signal is processed by a modified discrete cosine transform (MDCT) according to its complexity.
  4. Internal error correction codes are added.
  5. The signal is stored or transmitted.

The MPEG-4 audio standard does not define a single or small set of highly efficient compression schemes, but rather a complex toolbox for a wide range of operations, from low-bit-rate speech coding to high-quality audio coding and music synthesis.

  • The MPEG-4 audio coding algorithm family spans the range from low bit-rate speech coding (down to 2 kbit/s) to high-quality audio coding (at 64 kbit/s per channel and higher).
  • AAC supports sampling frequencies between 8 kHz and 96 kHz and any number of channels between 1 and 48.
  • In contrast to MP3's hybrid filterbank, AAC uses a pure modified discrete cosine transform (MDCT) with an increased window length of 2,048 points. AAC is much more capable than MP3 or Musicam of encoding audio with streams of complex pulses and square waves.
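The relationship between window length and coefficient count can be illustrated with a direct MDCT. The following is a minimal sketch, not the transform as implemented in any AAC codec: the function names are hypothetical, the sine window is an assumption (it is one window that meets the overlap condition), and the O(N²) loops are chosen for readability rather than speed. It shows that a window of 2N samples yields N coefficients (so a 2,048-point window produces 1,024 coefficients) and that 50%-overlapping frames can be overlap-added to recover the input.

```python
import math

def sine_window(length):
    """Sine window; its squared halves sum to 1, as overlap-add requires."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(frame):
    """Direct O(N^2) MDCT: 2N windowed samples -> N coefficients."""
    n = len(frame) // 2
    shift = 0.5 + n / 2
    return [sum(x * math.cos(math.pi / n * (i + shift) * (k + 0.5))
                for i, x in enumerate(frame))
            for k in range(n)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n = len(coeffs)
    shift = 0.5 + n / 2
    return [2.0 / n * sum(c * math.cos(math.pi / n * (i + shift) * (k + 0.5))
                          for k, c in enumerate(coeffs))
            for i in range(2 * n)]

def roundtrip(signal, n):
    """Transform 50%-overlapping frames, then overlap-add the inverses.

    Assumes len(signal) is a multiple of n.
    """
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        frame = [padded[start + i] * window[i] for i in range(2 * n)]
        block = imdct(mdct(frame))
        for i in range(2 * n):
            out[start + i] += block[i] * window[i]  # aliasing cancels here
    return out[n:n + len(signal)]

signal = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
restored = roundtrip(signal, 8)
print(max(abs(a - b) for a, b in zip(signal, restored)))  # close to zero
```

Each inverse block on its own contains time-domain aliasing; it is only the sum of neighbouring blocks that reproduces the signal, which is why the frames overlap by half.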

AAC can switch dynamically between MDCT block lengths of 2,048 points and 256 points.

  • If a signal change or transient occurs, the short window of 256 points is chosen for better temporal resolution.
  • By default the longer 2,048-point window is used to improve the coding efficiency.
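The switching decision can be sketched as follows. This is an illustrative heuristic only: real encoders typically base the decision on a psychoacoustic model, and the function name, the energy-ratio test and the threshold value here are all assumptions rather than anything specified by the standard.

```python
import math

LONG_WINDOW = 2048   # default, better coding efficiency
SHORT_WINDOW = 256   # used around transients

def choose_window_length(frame, ratio_threshold=8.0, floor=1e-9):
    """Pick a window length from a crude energy-jump transient test.

    The frame is split into 256-sample segments; a sudden rise in energy
    from one segment to the next is treated as a transient.
    ratio_threshold is a hypothetical tuning value.
    """
    energies = [sum(x * x for x in frame[i:i + SHORT_WINDOW]) + floor
                for i in range(0, len(frame), SHORT_WINDOW)]
    for previous, current in zip(energies, energies[1:]):
        if current / previous > ratio_threshold:
            return SHORT_WINDOW
    return LONG_WINDOW

steady = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(LONG_WINDOW)]
attack = [0.0] * 1024 + [1.0] * 1024   # silence followed by a sudden onset

print(choose_window_length(steady))  # 2048
print(choose_window_length(attack))  # 256
```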

Modular encoding

AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:

  • Low Complexity Profile (LC), the simplest and most widely used and supported.
  • Main Profile (MAIN), which expands upon LC with backwards prediction.
  • Sample-Rate Scalable (SRS), also called Scalable Sample Rate (MPEG-4 AAC-SSR) in line with the standard nomenclature.
  • Long Term Prediction (LTP), added in MPEG-4, an improvement of the MAIN profile using a forward predictor with lower computational complexity.

Depending on the AAC profile and the MP3 encoder, 96 kbit/s AAC can give nearly the same perceptual quality as 128 kbit/s MP3, or better.
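In storage terms, the difference between those two bit rates is simple arithmetic. The sketch below assumes constant-bit-rate streams, ignores container overhead, and uses a hypothetical four-minute track; the function name is illustrative.

```python
def stream_size_bytes(bitrate_kbit_s, duration_s):
    """Size in bytes of a constant-bit-rate stream (1 kbit = 1000 bits)."""
    return bitrate_kbit_s * 1000 * duration_s // 8

duration_s = 4 * 60  # hypothetical four-minute track
mp3_bytes = stream_size_bytes(128, duration_s)
aac_bytes = stream_size_bytes(96, duration_s)
saving = 1 - aac_bytes / mp3_bytes

print(mp3_bytes, aac_bytes, f"{saving:.0%}")  # 3840000 2880000 25%
```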

AAC Low Delay

The MPEG-4 Low Delay Audio Coder (AAC-LD) is designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. The codec is closely derived from MPEG-2 Advanced Audio Coding (AAC).

  • The most stringent requirements are a maximum algorithmic delay of only 20 ms and good audio quality for all kinds of audio signals, including speech and music.
  • In this way, the AAC LD coding scheme bridges the gap between speech coding schemes and high-quality audio coding schemes.


Two-way communication with AAC LD is possible on ordinary analog telephone lines and via ISDN connections. Unlike conventional speech coders, the codec can code both music and speech signals with good quality. Also unlike speech coders, its coding quality scales up with bitrate, so transparent quality can be achieved.

AAC LD can also process stereo signals by using the advanced stereo coding tools of AAC. Thus it is possible to transmit a stereo signal with a bandwidth of 7 kHz via one ISDN line or with a bandwidth of 15 kHz via two ISDN lines.

Error protection toolkit

Applying error protection enables error correction up to a certain extent. Error correcting codes are usually applied equally to the whole payload.

But since different parts of an AAC payload show different sensitivity to transmission errors, this would not be a very efficient approach.

The AAC payload can be subdivided into parts with different error sensitivities. Independent error correcting codes can be applied to any of these parts using the Error Protection (EP) tool defined in MPEG-4 Audio. This provides error correcting capability just for the most sensitive parts of the payload, keeping the additional overhead low.
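The overhead argument can be made concrete with a small calculation. This is a sketch under stated assumptions, not the EP tool itself: the frame size, the split into sensitivity classes and the code rates are all hypothetical, and the function name is illustrative.

```python
import math
from fractions import Fraction

def protected_bits(parts):
    """Total bits after channel coding.

    parts is a list of (payload_bits, code_rate) pairs, where code_rate is
    payload bits divided by transmitted bits (1 means no redundancy).
    """
    return sum(math.ceil(bits / rate) for bits, rate in parts)

HALF_RATE = Fraction(1, 2)  # one redundancy bit per payload bit
UNCODED = Fraction(1, 1)    # no redundancy

# Hypothetical 2,000-bit frame: 200 highly sensitive bits, 1,800 less sensitive.
equal = protected_bits([(2000, HALF_RATE)])
unequal = protected_bits([(200, HALF_RATE), (1800, UNCODED)])

print(equal, unequal)  # 4000 2200
```

With these made-up numbers, protecting the whole payload doubles its size, while protecting only the sensitive tenth adds 10%.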

Error Resilient AAC

Error resilience techniques can be used to make the coding scheme itself more robust against errors. For AAC, three custom-tailored methods were developed and defined in MPEG-4 Audio:

  • Huffman Codeword Reordering (HCR) to avoid error propagation within spectral data
  • Virtual Codebooks (VCB11) to detect serious errors within spectral data
  • Reversible Variable Length Code (RVLC) to reduce error propagation within scale factor data

AAC's improvements over MP3

Some of AAC's advances over MP3:

  • Sample frequencies from 8 kHz to 96 kHz (official MP3: 16 kHz to 48 kHz)
  • Up to 48 channels
  • Higher efficiency and simpler filterbank (hybrid → pure MDCT)
  • Higher coding efficiency for stationary signals (blocksize: 576 → 1024 samples)
  • Higher coding efficiency for transient signals (blocksize: 192 → 128 samples)
  • Much better handling of frequencies above 16 kHz
  • More flexible joint stereo (separate for every scale band)
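The block sizes in the list above translate directly into frequency and time resolution. The sketch below assumes a 44.1 kHz sampling rate (an example value, not taken from the list) and uses hypothetical helper names: a block of N coefficients spans 0 Hz to half the sampling rate, so each coefficient covers rate / (2N) Hz, and a block of N samples lasts N / rate seconds.

```python
def bin_spacing_hz(sample_rate_hz, coefficients):
    """Frequency span per coefficient when N coefficients cover 0..rate/2."""
    return sample_rate_hz / (2 * coefficients)

def block_duration_ms(sample_rate_hz, block_samples):
    """Duration in milliseconds of a block of the given number of samples."""
    return 1000 * block_samples / sample_rate_hz

RATE = 44_100  # assumed sampling rate in Hz

for name, long_block, short_block in (("MP3", 576, 192), ("AAC", 1024, 128)):
    print(name,
          round(bin_spacing_hz(RATE, long_block), 1), "Hz per coefficient (long block),",
          round(block_duration_ms(RATE, short_block), 2), "ms (short block)")
```

At this rate the long blocks give roughly 38.3 Hz per coefficient for MP3 against 21.5 Hz for AAC, and the short blocks last roughly 4.35 ms against 2.90 ms, which is the sense in which AAC is finer in frequency for stationary signals and finer in time for transients.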

The result is a specification that gives developers more flexibility to design codecs that compress more efficiently than MP3. However, the advantages are not entirely decisive, and the MP3 specification, while outdated, has proven surprisingly robust. Although AAC and AAC+ completely dominate MP3 at very low bitrates, the two formats are more comparable at medium to higher bitrates. As developers learn to better exploit the AAC format, AAC is expected to gain additional ground and perhaps overtake MP3.

AAC ISO standard

AAC, which was first specified in the standard known formally as ISO/IEC 13818-7, was published in 1997 as a new "part" (distinct from ISO/IEC 13818-3) in the MPEG-2 family of international standards.

Products that support AAC

iTunes and iPod

In April 2003, Apple Computer brought mainstream attention to AAC by announcing that its iTunes and iPod products would support songs in MPEG-4 AAC format (via a firmware update for older iPods), and that customers could download popular songs in a protected version of the format via the iTunes Music Store. AAC has now become so associated with Apple hardware and software that people commonly and mistakenly believe that AAC stands for "Apple Audio Codec." Optionally, a digital rights management scheme (named FairPlay) can be employed in tandem.

Apple has added support for VBR encoding of AAC tracks in iTunes v5.0. They have also added certain enhancements in higher-end iPods such as chapters (bookmarks that can incorporate web links and pictures set to appear at certain times during playback of audio books and podcasts) which are not features of AAC itself, but of the Apple proprietary file-format that wraps the AAC bitstream.

Other Media Players

Portable Devices

For a number of years, many mobile (cell) phones from the big manufacturers such as Nokia, Motorola and Sony Ericsson have supported AAC playback. During 2005, interest in music on mobile phones increased dramatically. Many manufacturers announced dedicated music phones, such as the Sony Ericsson S700i, Sony Ericsson W600, Sony Ericsson K750i/Sony Ericsson W800, Sony Ericsson W900i, Nokia N91, Motorola ROKR E1, and Motorola SLVR, all with AAC playback as standard. This trend towards supporting AAC continues with the ever-increasing number of advanced phones on the market today.

Also, the PlayStation Portable has had support for AAC files as of the version 2 firmware update (released August 2005).

Epson supports AAC playback in the P-2000 and P-4000 Multimedia / Photo Storage Viewers. This support is not available in their older models, however.

Extensions & improvements

Some technology extensions have been added to the AAC standard:

  • High Efficiency AAC (HE-AAC) - SBR technology has been applied to AAC, and was incorporated into the standard to form High Efficiency AAC v1; see MPEG-4 Part 3 for further details.
  • The codec design was further improved in MPEG-4 Part 3, known formally as ISO/IEC 14496-3, with the addition of Perceptual Noise Substitution (PNS) and a Long Term Predictor (LTP).

Although the AAC codec specified in MPEG-2 Part 7 and the AAC specified in MPEG-4 Part 3 are somewhat different, they are both informally known as AAC.
