MPEG-2
MPEG-2 (1994) is a standard used to compress digital audio and video (AV) data. MPEG-2 is the designation for a group of AV coding standards agreed upon by MPEG (the Moving Picture Experts Group) and published as the ISO/IEC 13818 international standard. MPEG-2 is typically used to encode audio and video for broadcast signals, including direct broadcast satellite and cable TV. With some modifications, MPEG-2 is also the coding format used by standard commercial DVD movies.
MPEG-2 includes a Systems part (part 1) that defines Transport Streams, which are designed to carry digital video and audio over somewhat-unreliable media, and are used in broadcast applications.
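Transport Streams tolerate unreliable media partly because they are cut into short fixed-size packets that a receiver can resynchronise on. The sketch below decodes the header of one such packet. The 188-byte packet length, the 0x47 sync byte and the bit layout are taken from Part 1 as generally documented, not from this article, and the function names are made up for illustration.

```python
TS_PACKET_SIZE = 188  # bytes per Transport Stream packet (assumed from Part 1)
SYNC_BYTE = 0x47      # first byte of every packet (assumed from Part 1)


def parse_ts_header(packet: bytes) -> dict:
    """Decode the 4-byte header of one Transport Stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet is exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("lost sync: first byte is not 0x47")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet identifier: selects the video, audio or table stream
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        # 4-bit counter; a gap between packets of one PID signals a loss
        "continuity_counter": packet[3] & 0x0F,
    }


def make_null_packet() -> bytes:
    """Build a stuffing packet (PID 0x1FFF) to exercise the parser."""
    return bytes([SYNC_BYTE, 0x1F, 0xFF, 0x10]) + bytes([0xFF] * 184)


header = parse_ts_header(make_null_packet())
```

A receiver that loses sync can scan for 0x47 bytes spaced 188 bytes apart, and the continuity counter lets it notice dropped packets.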
The Video part (part 2) of MPEG-2 is similar to MPEG-1, but also provides support for interlaced video (the format used by broadcast TV systems). MPEG-2 video is not optimized for low bit-rates (less than 1 Mbit/s), but outperforms MPEG-1 at 3 Mbit/s and above. All standards-conforming MPEG-2 Video decoders are fully capable of playing back MPEG-1 Video streams.
With some enhancements, MPEG-2 Video and Systems are also used in most HDTV transmission systems.
The MPEG-2 Audio part (Part 3 of the standard) enhances MPEG-1's audio by allowing the coding of audio programs with more than two channels. Part 3 allows this to be done in a backwards-compatible way, so that MPEG-1 audio decoders can decode the two main stereo components of the presentation.
In part 7 of the MPEG-2 standard, audio can alternatively be coded in a non-backwards-compatible way, which allows encoders to make better use of available bandwidth. Part 7 is referred to as MPEG-2 AAC.
The standard
This section gives general information about MPEG-2 Video, Audio and Systems, excluding the modifications made for DVD and DVB.
An MPEG-2 Program Stream typically consists of two elements:
- video data + time stamps
- audio data + time stamps
Video coding (simplified)
MPEG-2 is for the generic coding of moving pictures and associated audio. It creates a video stream out of three types of frame data (intra frames, forward predictive frames and bidirectional predicted frames) that can be arranged in a specified order called the GOP structure (GOP = Group Of Pictures; see below). Strictly speaking, the standard itself does not define or use the term GOP, except in the name of a syntax structure called a GOP header; however, users of MPEG-2 have found that the GOP concept helps convey a basic understanding of the standard.
Typically the originating material is a video sequence at a pre-set pixel resolution at 25 (CCIR) or (approximately) 29.97 (FCC) frames/second with sound.
MPEG-2 supports both interlaced and progressive scan video streams. In progressive scan streams, the basic unit of encoding is a frame, while in interlaced streams, the basic unit may be either a field or a frame. In the discussion below, the generic terms "picture" and "image" refer to either fields or frames, depending on the type of stream.
An MPEG-2 video bitstream is made up of a series of data frames encoding pictures. The three ways of encoding a picture are: intra-coded (I picture), forward predictive (P picture) and bidirectional predictive (B picture).
The video image is separated into one luminance (Y) channel and two chrominance channels (also called the color difference signals Cb and Cr). Blocks of the luminance and chrominance arrays are organized into "macroblocks", which are the basic unit of coding within a picture. Each macroblock contains four 8×8 luminance blocks, covering a 16×16 luminance area. The number of 8×8 chrominance blocks per macroblock depends on the chrominance format of the source image. For example, in the common 4:2:0 format, there is one chrominance block per macroblock for each of the two chrominance channels, making a total of six blocks per macroblock.
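The block arithmetic can be sketched as follows. The 4:2:0 figure is the one given above. The 4:2:2 and 4:4:4 counts and the 16×16 macroblock size come from the standard's usual description rather than from this article, and the function names are illustrative.

```python
LUMA_BLOCKS = 4  # four 8x8 luminance blocks per macroblock

# Chrominance blocks per macroblock as (Cb, Cr) for each chrominance format.
CHROMA_BLOCKS = {"4:2:0": (1, 1), "4:2:2": (2, 2), "4:4:4": (4, 4)}


def blocks_per_macroblock(chroma_format):
    """Total number of 8x8 blocks carried by one macroblock."""
    cb, cr = CHROMA_BLOCKS[chroma_format]
    return LUMA_BLOCKS + cb + cr


def macroblocks_per_picture(width, height, mb_size=16):
    """Macroblocks needed to cover a picture, rounding partial ones up."""
    columns = (width + mb_size - 1) // mb_size
    rows = (height + mb_size - 1) // mb_size
    return columns * rows


total_420 = blocks_per_macroblock("4:2:0")    # 6
pal_frame = macroblocks_per_picture(720, 576)  # 45 * 36 = 1620
```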
In the case of I pictures, the actual image data is then passed through the encoding process described below. P and B pictures are first subjected to a process of "motion compensation", in which they are predicted from the previous (and in the case of B pictures, the next) image in time order. Each macroblock in the P or B picture is associated with an area in the previous or next image that is well-correlated with it, as selected by the encoder using a "motion vector". The motion vector that maps the macroblock to its correlated area is encoded, and then the difference between the two areas is passed through the encoding process described below.
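The text leaves open how an encoder finds the well-correlated area. One simple possibility, shown here as an assumption rather than as the method any particular encoder uses, is an exhaustive block-matching search that minimises the sum of absolute differences (SAD). The function names, the search range and the synthetic test frames are all made up for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size areas."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
    return total


def find_motion_vector(cur, ref, cx, cy, size=16, search=4):
    """Try every offset within +/- search samples; return ((dx, dy), sad)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate area falls outside the reference picture
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


def residual(cur, ref, cx, cy, mv, size=16):
    """Difference between the macroblock and its predicted area."""
    dx, dy = mv
    return [[cur[cy + y][cx + x] - ref[cy + dy + y][cx + dx + x]
             for x in range(size)] for y in range(size)]


# Synthetic demo: the current picture is the reference shifted by (2, 1).
# The sample values are a smooth pattern and are not limited to 8 bits.
SIZE = 32
ref = [[x * x + 3 * y * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[min(y + 1, SIZE - 1)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]
mv, cost = find_motion_vector(cur, ref, cx=8, cy=8)  # mv == (2, 1), cost == 0
```

When the match is perfect the residual is all zeros, so almost nothing remains to be coded for that macroblock apart from the motion vector itself.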
Each block is treated with an 8×8 discrete cosine transform (DCT). The resulting DCT coefficients are then quantized, re-ordered to maximize the probability of long runs of zeros and low amplitudes of subsequent values, and then run-length coded. Finally, a fixed-table Huffman encoding scheme is applied.
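The chain of transform, quantization, re-ordering and run-length coding can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses a single uniform quantizer step instead of the standard's weighting matrices, a plain zig-zag scan for the re-ordering, a textual end-of-block marker, and it omits the final Huffman stage. All names are illustrative.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Orthonormal 8x8 DCT-II, computed directly (slow but easy to read)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization: larger steps turn more coefficients into zero."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def zigzag_order():
    """(row, column) positions along anti-diagonals, low frequencies first."""
    return sorted(((r, c) for r in range(N) for c in range(N)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))


def run_length(levels):
    """Code scanned levels as (zero_run, level) pairs plus an end marker."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # trailing zeros collapse into one end-of-block mark
    return pairs


def encode_block(block, step=16):
    quantized = quantize(dct_2d(block), step)
    scanned = [quantized[r][c] for r, c in zigzag_order()]
    return run_length(scanned)


flat = [[100] * N for _ in range(N)]
pairs = encode_block(flat)  # [(0, 50), 'EOB']
```

A uniform block needs only its DC coefficient, which is why smooth image areas compress so well. Detailed areas leave more non-zero levels after quantization and therefore more pairs to code.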
I pictures exploit spatial redundancy, while P and B pictures also exploit temporal redundancy. Because adjacent frames in a video stream are often well-correlated, P pictures may be 10% of the size of I pictures, and B pictures 2% of their size.
The sequence of different frame types is called the Group of Pictures (GOP) structure. There are many possible structures but a common one is 15 frames long, and has the sequence I_BB_P_BB_P_BB_P_BB_P_BB_. A similar 12-frame sequence is also common. The ratio of I, P and B pictures in the GOP structure is determined by the nature of the video stream and the bandwidth constraints on the output stream, although encoding time may also be an issue. This is particularly true in live transmission and in real-time environments with limited computing resources, as a stream containing many B pictures can take three times longer to encode than an I-picture-only file.
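As a rough illustration, the 15-frame and 12-frame structures can be generated from two parameters, and the approximate size ratios quoted earlier (P about 10% and B about 2% of an I picture) give a feel for the saving. The function names and the parameterisation are assumptions made for this sketch, and real picture sizes vary with content.

```python
def gop_pattern(length=15, p_spacing=3):
    """Picture types for one GOP: an I picture, then a P every p_spacing."""
    types = []
    for i in range(length):
        if i == 0:
            types.append("I")
        elif i % p_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


# Rough relative picture sizes, with an I picture taken as 1.0.
RELATIVE_SIZE = {"I": 1.0, "P": 0.10, "B": 0.02}


def relative_gop_size(pattern):
    """Size of a whole GOP, measured in I-picture equivalents."""
    return sum(RELATIVE_SIZE[picture] for picture in pattern)


pattern = gop_pattern()              # 'IBBPBBPBBPBBPBB'
size = relative_gop_size(pattern)    # about 1.6 I-picture equivalents
saving = size / len(pattern)         # about 0.107 of an I-only stream
```

Under these assumed ratios, the 15-frame GOP needs roughly a tenth of the data of fifteen I pictures.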
The output bit rate of an MPEG-2 encoder can be constant or variable, with the maximum bit rate determined by the playback media; for example, a DVD movie is limited to 10.08 Mbit/s for video, audio and subpictures combined. To achieve a constant bit rate, the degree of quantization is adjusted iteratively until the output meets the bit-rate requirement. Increasing quantization leads to visible artifacts when the stream is decoded, generally in the form of "mosaicing", where the discontinuities at the edges of macroblocks become more visible as the bit rate is reduced.
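The iterative adjustment can be pictured as a loop that coarsens the quantizer until a picture fits its share of the bit budget. The sketch below is a toy: `toy_bits` is a hypothetical rate model (bits inversely proportional to the quantizer step), the 1 to 31 quantizer range is an assumption, and real encoders use far more elaborate rate control.

```python
def bits_per_picture(bit_rate, frame_rate):
    """Average bit budget for one picture at a constant bit rate."""
    return bit_rate / frame_rate


def toy_bits(complexity, q):
    """Hypothetical rate model: coarser quantization gives fewer bits."""
    return complexity / q


def pick_quantizer(complexity, target_bits, q_min=1, q_max=31):
    """Raise the quantizer step until the picture fits the budget."""
    q = q_min
    while q < q_max and toy_bits(complexity, q) > target_bits:
        q += 1
    return q


budget = bits_per_picture(6_000_000, 25)   # 240000.0 bits per picture
q = pick_quantizer(1_200_000, 150_000)     # 8
```

A picture that still exceeds the budget at the coarsest step is where the blocking artifacts described above become most visible.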
MPEG video compression also defines DC pictures (D-pictures), which are similar to I-pictures but include only the DC value of each block. A system can use a stream of only D-pictures to support rapid searching through another stream.
Audio encoding
MPEG-2 also introduces new audio encoding methods. These are:
- low bitrate encoding with halved sampling rate (MPEG-1 Layer 1/2/3 LSF)
- multichannel encoding with up to 5.1 channels
- MPEG-2 AAC
Profiles and Levels
Abbr. | Name | Picture types | Chroma format | Streams | Comment |
---|---|---|---|---|---|
SP | Simple Profile | P, I | 4:2:0 | 1 | no interlacing |
MP | Main Profile | P, I, B | 4:2:0 | 1 | |
422P | 4:2:2 Profile | P, I, B | 4:2:2 | 1 | |
SNR | SNR Profile | P, I, B | 4:2:0 | 1–2 | SNR: Signal to Noise Ratio |
Spatial | Spatial Profile | P, I, B | 4:2:0 | 1–3 | low, normal and high quality decoding |
HP | High Profile | P, I, B | 4:2:2 | 1–3 | |
Abbr. | Name | Max. pixels/line | Max. lines | Max. frame rate (Hz) | Max. bit rate (Mbit/s) |
---|---|---|---|---|---|
LL | Low Level | 352 | 288 | 30 | 4 |
ML | Main Level | 720 | 576 | 30 | 15 |
H-14 | High 1440 | 1440 | 1152 | 60 | 60 |
HL | High Level | 1920 | 1152 | 60 | 80 |
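The level table can be read as a set of upper bounds. A minimal sketch of a lookup that returns the lowest level able to carry a given stream is shown below. It checks only picture size and bit rate, leaves frame rate out, and uses the table's bit-rate limits even though the profile and level combinations listed further down show that some profiles allow different rates. The function name is illustrative.

```python
# (abbreviation, max pixels per line, max lines, max Mbit/s), lowest first.
LEVELS = [
    ("LL", 352, 288, 4),
    ("ML", 720, 576, 15),
    ("H-14", 1440, 1152, 60),
    ("HL", 1920, 1152, 80),
]


def lowest_level(width, height, mbit_per_s):
    """Return the first level whose limits cover the stream, else None."""
    for abbr, max_width, max_height, max_rate in LEVELS:
        if (width <= max_width and height <= max_height
                and mbit_per_s <= max_rate):
            return abbr
    return None


dvd_pal = lowest_level(720, 576, 9.8)      # 'ML'
hd_1080 = lowest_level(1920, 1080, 19.4)   # 'HL'
```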
Profile @ Level | Resolution (px) | Framerate max. (Hz) | Sampling | Bitrate (Mbit/s) | Application |
---|---|---|---|---|---|
SP@LL | 176 × 144 | 15 | 4:2:0 | 0.096 | Wireless handsets |
SP@ML | 352 × 288 | 15 | 4:2:0 | 0.384 | PDAs |
SP@ML | 320 × 240 | 24 | 4:2:0 | 0.384 | PDAs |
MP@LL | 352 × 288 | 30 | 4:2:0 | 4 | Set-top boxes (STB) |
MP@ML | 720 × 480 | 30 | 4:2:0 | 15 (DVD: 9.8) | DVD, SD-DVB |
MP@ML | 720 × 576 | 25 | 4:2:0 | 15 (DVD: 9.8) | DVD, SD-DVB |
MP@H-14 | 1440 × 1080i | 30 | 4:2:0 | 60 (HDV: 25) | HDV |
MP@H-14 | 1280 × 720p | 30 | 4:2:0 | 60 (HDV: 25) | HDV |
MP@HL | 1920 × 1080i | 30 | 4:2:0 | 80 | ATSC 1080i, 720p60, HD-DVB (HDTV) |
MP@HL | 1280 × 720p | 60 | 4:2:0 | 80 | ATSC 1080i, 720p60, HD-DVB (HDTV) |
422P@LL | | | 4:2:2 | | |
422P@ML | 720 × 480 | 30 | 4:2:2 | 50 | Sony IMX using I-frame only |
422P@H-14 | 1440 × 1080i | 30 | 4:2:2 | 80 | Potential future MPEG-2-based HD products from Sony and Panasonic |
422P@H-14 | 1280 × 720p | 60 | 4:2:2 | 80 | Potential future MPEG-2-based HD products from Sony and Panasonic |
422P@HL | 1920 × 1080i | 30 | 4:2:2 | 300 | Potential future MPEG-2-based HD products from Panasonic |
422P@HL | 1280 × 720p | 60 | 4:2:2 | 300 | Potential future MPEG-2-based HD products from Panasonic |
DVD
Additional restrictions and modifications of MPEG-2 on DVD are:
- Resolution
- 720 × 480, 704 × 480, 352 × 480, 352 × 240 pixel (NTSC)
- 720 × 576, 704 × 576, 352 × 576, 352 × 288 pixel (PAL)
- Aspect ratio (image) (Display AR)
- 4:3
- 16:9 (2.21:1 also specified but little if ever used)
- Frame rate
- 29.97 frame/s (NTSC)
- 25 frame/s (PAL)
- Note: By using a pattern of REPEAT_FIRST_FIELD flags on the headers of encoded pictures, pictures can be displayed for either two or three fields and almost any picture display rate (minimum 2/3 of the frame rate) can be achieved. This is most often used to display 23.976 (approximately film rate) video on NTSC.
- Audio+video bitrate
- Buffer average maximum 9.8 Mbit/s
- Peak 15 Mbit/s
- Minimum 300 kbit/s
- YUV 4:2:0
- Additional subtitles possible
- Closed captioning (NTSC only)
- Audio
- Linear Pulse Code Modulation (LPCM): 48 kHz or 96 kHz; 16- or 24-bit; up to six channels (not all combinations possible due to bitrate constraints)
- MPEG Layer 2 (MP2): 48 kHz, up to 5.1 channels (required in PAL players only)
- Dolby Digital (DD, also known as AC-3): 48 kHz, 32–448 kbit/s, up to 5.1 channels
- Digital Theater Systems (DTS): 754 kbit/s or 1510 kbit/s (not required for DVD player compliance)
- NTSC DVDs must contain at least one LPCM or Dolby Digital audio track.
- PAL DVDs must contain at least one MPEG Layer 2, LPCM, or Dolby Digital audio track.
- Players are not required to play back audio with more than two channels, but must be able to downmix multichannel audio to two channels.
- GOP structure
- A sequence header must be present for every GOP
- Maximum frames per GOP: 18 (NTSC) / 15 (PAL), i.e. about 0.6 seconds in both cases
- Closed GOP required for multiple-angle DVDs
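Two of the figures in the list above can be checked with exact arithmetic: the display rates reachable with REPEAT_FIRST_FIELD patterns, and the duration of the longest permitted GOP. The sketch assumes that a picture is shown for three fields when its flag is set and two otherwise, as the note in the frame rate item describes. The function names are illustrative.

```python
from fractions import Fraction

FILM = Fraction(24000, 1001)  # about 23.976 frame/s
NTSC = Fraction(30000, 1001)  # about 29.97 frame/s
PAL = Fraction(25)


def display_rate(coded_rate, repeat_flags):
    """Displayed frame rate for a repeating pattern of repeat flags."""
    fields = sum(3 if flag else 2 for flag in repeat_flags)
    # Two fields make up one displayed frame.
    return coded_rate * Fraction(fields, 2 * len(repeat_flags))


def gop_duration(frames, frame_rate):
    """Duration of a GOP in seconds."""
    return Fraction(frames) / frame_rate


pulldown = display_rate(FILM, [True, False])    # equals NTSC exactly
slowest = NTSC * Fraction(2, 3)                 # every picture repeated
ntsc_gop = float(gop_duration(18, NTSC))        # about 0.6006 s
pal_gop = float(gop_duration(15, PAL))          # 0.6 s
```

Alternating the flag turns four film pictures into ten fields, which is how 23.976 frame/s material fills a 29.97 frame/s display. Setting the flag on every picture gives the 2/3 lower bound mentioned in the note.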
DVB
Additional restrictions and modifications of MPEG-2 on DVB are listed below.
Restricted to one of the following resolutions for SDTV:
- 720, 640, 544, 480 or 352 × 480 pixel, 24/1.001, 24, 30/1.001 or 30 frame/s
- 352 × 240 pixel, 24/1.001, 24, 30/1.001 or 30 frame/s
- 720, 704, 544, 480 or 352 × 576 pixel, 25 frame/s
- 352 × 288 pixel, 25 frame/s
For HDTV:
- 720 × 576 pixel, 50 frame/s progressive (576p50)
- 1280 × 720 pixel, 25 or 50 frame/s progressive (720p50)
- 1440 or 1920 × 1080 pixel, 25 frame/s progressive (1080p25, film mode)
- 1440 or 1920 × 1080 pixel, 25 frame/s interlaced (1080i25)
- 1920 × 1080 pixel, 50 frame/s progressive (1080p50); possible future H.264/AVC format
ATSC
Restricted to one of the following resolutions:
- 1920 × 1080 pixel, 30 frame/s (1080i)
- 1280 × 720 pixel, 60 frame/s (720p)
- 720 × 576 pixel, 25 frame/s (576i, 576p)
- 720 or 640 × 480 pixel, 30 frame/s (480i, 480p)
Note: 1080i is encoded with 1920 × 1088 pixel frames, but the last 8 lines are discarded prior to display.
MPEG-2 over ATSC is used for digital television (DTV). The two largest of these resolutions are typically used to deliver HDTV content.
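The 1088-line figure in the note follows from coding pictures in whole macroblocks. Assuming a macroblock height of 16 lines, the coded height is the display height rounded up to the next multiple of 16. The function name is illustrative.

```python
def coded_height(display_height, macroblock=16):
    """Round the picture height up to a whole number of macroblock rows."""
    rows = -(-display_height // macroblock)  # ceiling division
    return rows * macroblock


padded = coded_height(1080)       # 1088, i.e. 68 macroblock rows
discarded = padded - 1080         # 8 lines dropped before display
```

The other listed heights (720, 576 and 480) are already multiples of 16, so they need no padding.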
ISO/IEC 13818
- Part 1
- Systems - describes synchronization and multiplexing of video and audio.
- Part 2
- Video - compression codec for interlaced and non-interlaced video signals.
- Part 3
- Audio - compression codec for perceptual coding of audio signals. A multichannel-enabled extension of MPEG-1 audio.
- Part 4
- Describes procedures for testing compliance.
- Part 5
- Describes systems for software simulation.
- Part 6
- Describes extensions for DSM-CC (Digital Storage Media Command and Control).
- Part 7
- Advanced Audio Coding (AAC)
- Part 9
- Extension for real time interfaces.
- Part 10
- Conformance extensions for DSM-CC.
(Part 8: 10-bit video extension. Its primary application was studio video. Part 8 was withdrawn due to lack of interest from industry.)
Patent holders
Approximately 640 patents worldwide make up the "essential" intellectual property surrounding MPEG-2. These are held by over 20 corporations and one university:
- Alcatel
- Canon Inc.
- Columbia University
- France Télécom (CNET)
- Fujitsu
- General Electric Capital Corporation
- General Instrument Corp. (now the broadband division of Motorola)
- GE Technology Development, Inc.
- Hitachi, Ltd.
- KDDI Corporation (KDDI)
- Lucent Technologies
- LG Electronics Inc.
- Matsushita
- Mitsubishi
- Nippon Telegraph and Telephone Corporation (NTT)
- Philips
- Robert Bosch GmbH
- Samsung
- Sanyo Electric Co., Ltd.
- Scientific Atlanta
- Sharp
- Sony
- Thomson Licensing S.A.
- Toshiba
- Victor Company of Japan, Limited (JVC).