H.264/MPEG-4 AVC

Wikipedia, “H.264/MPEG-4 AVC”.

MPEG-4 is a set of standards made up of several "parts", where each part standardizes certain multimedia elements, in particular audio, video, and file formats. For more information on which parts exist and what they cover, see the article on MPEG-4.

H.264/AVC/MPEG-4 Part 10 (Advanced Video Coding) is a standard for video compression. The final drafting work on the first version of the standard was completed in May 2003.

H.264/AVC is the most recent block-oriented, motion-compensation-based codec standard, developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG). The partnership that carried out the project is known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 AVC standard (formally, ISO/IEC 14496-10 - MPEG-4 Part 10, Advanced Video Coding) are jointly maintained so that they have identical technical content. H.264 is used in such applications as Blu-ray Discs, videos from YouTube and the iTunes Store, DVB broadcast, direct-broadcast satellite television services, cable television services, and real-time videoconferencing.

1. Overview

2. Standardization committee and history

3. Applications

4. Patent licensing

4.1. Patents and GNU Free Software licenses

5. Features

6. Profiles

7. Levels

8. Decoded picture buffering

9. Versions

10. Software encoder feature comparison

11. Hardware encoder and IP

12. See also

13. Notes

14. Footnotes

15. External links

15.1 Introduction

15.2 Standard

15.3 Reference codec

15.4 Standardization committee documents

15.5 Miscellaneous

---

Overview

The goal of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (for example, half or less of the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of the design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility for the standard to be used in a wide range of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP networks, and ITU-T multimedia telephony systems.

The H.264 standard is a "family of standards" whose members, the profiles, are described below. A given decoder must be able to decode at least one profile; support for all profiles is not required. The profiles that a particular decoder supports are stated in its specification.

The standardization of the first version of H.264/AVC was completed in May 2003. The JVT then developed extensions to the original standard, known as the Fidelity Range Extensions (FRExt).

Further recent extensions of the standard have included adding five new profiles intended primarily for professional applications, adding extended-gamut color space support, defining additional aspect ratio indicators, defining two additional types of "supplemental enhancement information" (post-filter hint and tone mapping), and deprecating one of the prior FRExt profiles that industry feedback indicated should have been designed differently.

Scalable Video Coding as specified in Annex G of H.264/AVC allows the construction of bitstreams that contain sub-bitstreams that conform to H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream scalability, i.e., the presence of a sub-bitstream with lower spatial resolution or quality than the bitstream, NAL (Network Abstraction Layer) units are removed from the bitstream when deriving the sub-bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal by data of the lower spatial resolution or quality signal, is typically used for efficient coding. The Scalable Video Coding extension was completed in November 2007.

The H.264 name follows the ITU-T naming convention, where the standard is a member of the H.26x line of VCEG video coding standards; the MPEG-4 AVC name relates to the naming convention in ISO/IEC MPEG, where the standard is part 10 of ISO/IEC 14496, which is the suite of standards known as MPEG-4. The standard was developed jointly in a partnership of VCEG and MPEG, after earlier development work in the ITU-T as a VCEG project called H.26L. It is thus common to refer to the standard with names such as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or MPEG-4/H.264 AVC, to emphasize the common heritage. The name H.26L, referring to its ITU-T history, is less common, but still used. Occasionally, it is also referred to as "the JVT codec", in reference to the Joint Video Team (JVT) organization that developed it. (Such partnership and multiple naming is not uncommon. For example, the video codec standard known as MPEG-2 also arose from the partnership between MPEG and the ITU-T, where MPEG-2 video is known to the ITU-T community as H.262.[1])

Standardization committee and history

In early 1998, the Video Coding Experts Group (VCEG - ITU-T SG16 Q.6) issued a call for proposals on a project called H.26L, with the target to double the coding efficiency (which means halving the bit rate necessary for a given level of fidelity) in comparison to any other existing video coding standards for a broad variety of applications. VCEG was chaired by Gary Sullivan (Microsoft [formerly PictureTel], USA). The first draft design for that new standard was adopted in August 1999. In 2000, Thomas Wiegand (Heinrich Hertz Institute, Germany) became VCEG co-chair. In December 2001, VCEG and the Moving Picture Experts Group (MPEG - ISO/IEC JTC 1/SC 29/WG 11) formed a Joint Video Team (JVT), with the charter to finalize the video coding standard. Formal approval of the specification came in March 2003. The JVT was (is) chaired by Gary Sullivan, Thomas Wiegand, and Ajay Luthra (Motorola, USA). In June 2004, the Fidelity Range Extensions (FRExt) project was finalized. From January 2005 to November 2007, the JVT worked on an extension of H.264/AVC towards scalability by an Annex (G) called Scalable Video Coding (SVC). The JVT management team was extended by Jens-Reiner Ohm (Aachen University, Germany). Since July 2006, the JVT has been working on Multiview Video Coding (MVC), an extension of H.264/AVC towards free viewpoint television and 3D television.

Applications

Further information: video services using H.264/MPEG-4 AVC

The H.264 video format has a very broad application range that covers all forms of digital compressed video, from low bit-rate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. With the use of H.264, bit rate savings of 50%[2] or more are reported. Digital satellite TV quality, for example, was reported to be achievable at 1.5 Mbit/s, compared to the current operation point of MPEG-2 video at around 3.5 Mbit/s.[3] In order to ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have amended or added to their video-related standards so that users of these standards can employ H.264/AVC.

Blu-ray and HD DVD support the H.264/AVC High Profile as a mandatory format. Sony has also chosen this format for Memory Stick Video.

In late 2004, DVB approved the use of H.264/AVC for broadcast television.

The Advanced Television Systems Committee (ATSC) standards body in the United States approved the use of H.264/AVC for broadcast television in July 2008, although the standard is not yet used for fixed ATSC broadcasts within the United States.[5] [6] It has since been approved for use with the more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of H.264.[7]

AVCHD is a high-definition recording format from Sony and Panasonic that conforms to H.264 (with additional application-specific features and constraints).

AVC-Intra is an intra-frame (I-frame) compression format from Panasonic.

The CCTV (closed-circuit TV) or video surveillance market has included the technology in many products. Prior to this technology, the compression formats used in the industry's DVRs (digital video recorders) generally had low compression capability. With the application of H.264 compression technology to the video surveillance industry, the quality of video recordings improved substantially. Starting in 2008, some in the surveillance industry promoted H.264 technology as synonymous with "high quality" video.

Patent licensing

In countries where patents on software algorithms are upheld, vendors and commercial users of products which make use of H.264/AVC are expected to pay patent licensing royalties for the patented technology[8] that their products use. This applies to the Baseline Profile as well.[9] A private organization known as MPEG LA, which is not affiliated in any way with the MPEG standardization organization, administers the licenses for patents applying to this standard, as well as the patent pools for MPEG-2 Part 1 Systems, MPEG-2 Part 2 Video, MPEG-4 Part 2 Video, and other technologies. The last US MPEG LA patents for H.264 may not expire until 2028[10].

On February 2, 2010 MPEG LA announced that H.264-encoded Internet Video that is free to end users would continue to be exempt from royalty fees until at least December 31, 2015. [11] However, other fees remain in place. The license terms are updated in 5-year blocks. [12]

In 2005, Qualcomm, which was the assignee of US Patents 5,452,104,[13] and 5,576,767[14] sued Broadcom in US District Court, alleging that Broadcom infringed the two patents by making products that were compliant with the H.264 video compression standard.[15] In 2007, the District Court found that the patents were unenforceable because Qualcomm had failed to disclose them to the JVT prior to the release of the H.264 standard in May 2003.[15] In December 2008, the US Court of Appeals for the Federal Circuit affirmed the District Court's order that the patents be unenforceable but remanded to the District Court with instructions to limit the scope of unenforceability to H.264 compliant products.[15]

Patents and GNU Free Software licenses

Discussions are often held regarding the legality of free software implementations of formats like H.264, especially concerning the legal use of GNU LGPL and GPL implementations of H.264 and other patented formats. Consensus in discussions is that the allowable use depends on the laws of local jurisdictions. If operating or shipping a product in a country or group of countries where none of the patents covering H.264 apply, then using, for example, an LGPL implementation of the format is not a problem: There is no conflict between the software license and the (non-existent) patent license.

Conversely, shipping a product in the U.S. which includes (though not necessarily implements) a GPL H.264 decoder/encoder requires that the copyright terms of the GPL license be upheld, otherwise conveying the codec would be in violation of the software license of the implementation. In simple terms, LGPL and GPL licenses version 3.0 and above require that any rights held in conjunction with distributing the code also apply to anyone receiving the code,[16] and no further restrictions are put on distribution or use.[17] A product which incorporates GPLed code must not rely upon a discriminatory patent license that would prohibit the user from exercising rights granted to them by the GPL.[18] Thus, the right to distribute patent-encumbered code under those licenses as part of the product is revoked per the terms of the GPL and LGPL.[18] It should be realized that the parties who would enforce any such breach of copyright are the copyright holders, that is, the authors of the code. Any suit over a breach of that clause would have to argue that there exist valid, applicable patents covering the capabilities of the GPL-licensed code,[18] a stance copyright holders[nb 1] have not taken.[19]

Features

H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more effectively than older standards and to provide more flexibility for application to a wide variety of network environments. In particular, some such key features include:

Multi-picture inter-picture prediction including the following features:

Using previously-encoded pictures as references in a much more flexible way than in past standards, allowing up to 16 reference frames (or 32 reference fields, in the case of interlaced encoding) to be used in some cases. This is in contrast to prior standards, where the limit was typically one; or, in the case of conventional "B pictures", two. This particular feature usually allows modest improvements in bit rate and quality in most scenes. But in certain types of scenes, such as those with repetitive motion or back-and-forth scene cuts or uncovered background areas, it allows a significant reduction in bit rate while maintaining clarity.

Variable block-size motion compensation (VBSMC) with block sizes as large as 16×16 and as small as 4×4, enabling precise segmentation of moving regions. The supported luma prediction block sizes include 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, many of which can be used together in a single macroblock. Chroma prediction block sizes are correspondingly smaller according to the chroma subsampling in use.

The ability to use multiple motion vectors per macroblock (one or two per partition) with a maximum of 32 in the case of a B macroblock constructed of 16 4×4 partitions. The motion vectors for each 8×8 or larger partition region can point to different reference pictures.

The ability to use any macroblock type in B-frames, including I-macroblocks, resulting in much more efficient encoding when using B-frames. This feature was notably left out from MPEG-4 ASP.

Six-tap filtering for derivation of half-pel luma sample predictions, for sharper subpixel motion compensation. Quarter-pixel motion is derived by linear interpolation of the half-pel values, to save processing power.

Quarter-pixel precision for motion compensation, enabling precise description of the displacements of moving areas. For chroma the resolution is typically halved both vertically and horizontally (see 4:2:0) therefore the motion compensation of chroma uses one-eighth chroma pixel grid units.

Weighted prediction, allowing an encoder to specify the use of a scaling and offset when performing motion compensation, and providing a significant benefit in performance in special cases—such as fade-to-black, fade-in, and cross-fade transitions. This includes implicit weighted prediction for B-frames, and explicit weighted prediction for P-frames.
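The six-tap half-pel filter and the quarter-pel averaging mentioned in the items above can be sketched in one dimension as follows. This is a minimal illustration, not a conformant interpolator. The tap weights (1, -5, 20, 20, -5, 1) with division by 32 are the values commonly quoted for H.264 luma interpolation and do not appear in the text above. The function names are hypothetical, and the two-dimensional case, where some positions are filtered from unrounded intermediate values, is left out.

```python
# Tap weights commonly quoted for H.264 half-pel luma interpolation (assumption,
# not taken from the text above). They sum to 32.
SIX_TAP = (1, -5, 20, 20, -5, 1)


def clip_sample(value, bit_depth=8):
    """Clamp a value to the valid sample range for the given bit depth."""
    return max(0, min((1 << bit_depth) - 1, value))


def half_pel(samples, i):
    """Half-pel value between samples[i] and samples[i + 1] (1-D case).

    Requires samples[i - 2] through samples[i + 3] to exist.
    """
    window = samples[i - 2 : i + 4]
    acc = sum(w * s for w, s in zip(SIX_TAP, window))
    return clip_sample((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(a, b):
    """Quarter-pel value as the rounded average of two neighbouring values."""
    return (a + b + 1) >> 1


row = [10, 10, 10, 50, 90, 90, 90, 90]
h = half_pel(row, 3)        # halfway between the samples 50 and 90
q = quarter_pel(row[3], h)  # a quarter of the way from 50 towards 90
print(h, q)
```

Because the longer filter is applied only at half-pel positions and quarter-pel positions are plain averages, the expensive step runs at most once per interpolated sample.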

Spatial prediction from the edges of neighboring blocks for "intra" coding, rather than the "DC"-only prediction found in MPEG-2 Part 2 and the transform coefficient prediction found in H.263v2 and MPEG-4 Part 2. This includes luma prediction block sizes of 16×16, 8×8, and 4×4 (of which only one type can be used within each macroblock).

Lossless macroblock coding features including:

A lossless "PCM macroblock" representation mode in which video data samples are represented directly,[20] allowing perfect representation of specific regions and allowing a strict limit to be placed on the quantity of coded data for each macroblock.

An enhanced lossless macroblock representation mode allowing perfect representation of specific regions while ordinarily using substantially fewer bits than the PCM mode.

Flexible interlaced-scan video coding features, including:

Macroblock-adaptive frame-field (MBAFF) coding, using a macroblock pair structure for pictures coded as frames, allowing 16×16 macroblocks in field mode (compared with MPEG-2, where field mode processing in a picture that is coded as a frame results in the processing of 16×8 half-macroblocks).

Picture-adaptive frame-field coding (PAFF or PicAFF) allowing a freely-selected mixture of pictures coded as MBAFF frames with pictures coded as individual single fields (half frames) of interlaced video.

New transform design features, including:

An exact-match integer 4×4 spatial block transform, allowing precise placement of residual signals with little of the "ringing" often found with prior codec designs. This is conceptually similar to the well-known DCT design, but simplified and made to provide exactly-specified decoding.

An exact-match integer 8×8 spatial block transform, allowing highly correlated regions to be compressed more efficiently than with the 4×4 transform. This is conceptually similar to the well-known DCT design, but simplified and made to provide exactly-specified decoding.

Adaptive encoder selection between the 4×4 and 8×8 transform block sizes for the integer transform operation.

A secondary Hadamard transform performed on "DC" coefficients of the primary spatial transform applied to chroma DC coefficients (and also luma in one special case) to obtain even more compression in smooth regions.
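To make the "exact-match integer transform" idea concrete, the sketch below applies a 4×4 integer core transform using only integer additions and multiplications, so every implementation gets bit-identical results. The matrix shown is the one widely cited for the H.264 4×4 forward core transform. It is not given in the text above, so treat it as an assumption, along with the hypothetical function names. The scaling that makes the transform orthonormal is normally folded into quantization and is omitted here.

```python
# Widely cited 4x4 forward core transform matrix (assumption, not from the text).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


# A perfectly flat residual block puts all of its energy in the "DC" coefficient.
flat = [[3] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)
print(coeffs[0][0], coeffs[1][1])
```

The flat-block case also shows why a secondary transform on the DC coefficients pays off in smooth regions: neighbouring blocks there produce little besides DC values.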

A quantization design including:

Logarithmic step size control for easier bit rate management by encoders and simplified inverse-quantization scaling.

Frequency-customized quantization scaling matrices selected by the encoder for perceptual-based quantization optimization.
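The logarithmic step size control can be illustrated with the relationship usually given in descriptions of H.264: the quantizer step size doubles each time the quantization parameter (QP) increases by 6. The six base values and the 0 to 51 QP range below come from commonly cited descriptions of 8-bit H.264, not from the text above, and `qstep` is a hypothetical helper name. The standard itself expresses this through integer scaling tables rather than a floating-point step size.

```python
# Step sizes for QP 0..5 as commonly cited for H.264 (assumption, not from the text).
QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Approximate quantizer step size for a QP in 0..51 (8-bit video)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # Each full group of 6 QP steps doubles the step size.
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


for qp in (4, 10, 16, 22, 28):
    print(qp, qstep(qp))
```

With this scale, adding a constant to QP changes the step size by a constant ratio, which gives a rate controller a roughly uniform "knob" across the whole range.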

An in-loop deblocking filter which helps prevent the blocking artifacts common to other DCT-based image compression techniques, resulting in better visual appearance and compression efficiency.

An entropy coding design including:

Context-adaptive binary arithmetic coding (CABAC), an algorithm to losslessly compress syntax elements in the video stream knowing the probabilities of syntax elements in a given context. CABAC compresses data more efficiently than CAVLC but requires considerably more processing to decode.

Context-adaptive variable-length coding (CAVLC), which is a lower-complexity alternative to CABAC for the coding of quantized transform coefficient values. Although lower complexity than CABAC, CAVLC is more elaborate and more efficient than the methods typically used to code coefficients in other prior designs.

A common simple and highly structured variable length coding (VLC) technique for many of the syntax elements not coded by CABAC or CAVLC, referred to as Exponential-Golomb coding (or Exp-Golomb).
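Exponential-Golomb coding is simple enough to show in a few lines. The sketch below encodes an unsigned value as a run of leading zeros followed by the binary form of the value plus one, decodes it again, and shows the usual mapping from signed values to unsigned code numbers. Bits are kept as text strings for readability, and the function names are hypothetical.

```python
def ue_encode(code_num):
    """Unsigned Exp-Golomb code word for a non-negative integer, as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]            # binary form of code_num + 1
    return "0" * (len(binary) - 1) + binary   # prefix of (length - 1) zeros


def ue_decode(bits, pos=0):
    """Decode one code word starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


def se_to_code_num(value):
    """Map a signed value to its code number: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * value - 1 if value > 0 else -2 * value


print([ue_encode(n) for n in range(5)])
```

Small values get short code words, which suits syntax elements whose most likely values are near zero.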

Loss resilience features including:

A Network Abstraction Layer (NAL) definition allowing the same video syntax to be used in many network environments. One very fundamental design concept of H.264 is to generate self contained packets, to remove the header duplication as in MPEG-4's Header Extension Code (HEC).[21] This was achieved by decoupling information relevant to more than one slice from the media stream. The combination of the higher-level parameters is called a parameter set.[21] The H.264 specification includes two types of parameter sets: Sequence Parameter Set (SPS) and Picture Parameter Set (PPS). An active sequence parameter set remains unchanged throughout a coded video sequence, and an active picture parameter set remains unchanged within a coded picture. The sequence and picture parameter set structures contain information such as picture size, optional coding modes employed, and macroblock to slice group map.[21]

Flexible macroblock ordering (FMO), also known as slice groups, and arbitrary slice ordering (ASO), which are techniques for restructuring the ordering of the representation of the fundamental regions (macroblocks) in pictures. Typically considered an error/loss robustness feature, FMO and ASO can also be used for other purposes.

Data partitioning (DP), a feature providing the ability to separate more important and less important syntax elements into different packets of data, enabling the application of unequal error protection (UEP) and other types of improvement of error/loss robustness.

Redundant slices (RS), an error/loss robustness feature allowing an encoder to send an extra representation of a picture region (typically at lower fidelity) that can be used if the primary representation is corrupted or lost.

Frame numbering, a feature that allows the creation of "sub-sequences", enabling temporal scalability by optional inclusion of extra pictures between other pictures, and the detection and concealment of losses of entire pictures, which can occur due to network packet losses or channel errors.

Switching slices, called SP and SI slices, allowing an encoder to direct a decoder to jump into an ongoing video stream for such purposes as video streaming bit rate switching and "trick mode" operation. When a decoder jumps into the middle of a video stream using the SP/SI feature, it can get an exact match to the decoded pictures at that location in the video stream despite using different pictures, or no pictures at all, as references prior to the switch.
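As a small illustration of the Network Abstraction Layer described above, each NAL unit starts with a one-byte header that tells a receiver what the unit carries, including whether it is a sequence or picture parameter set. The field layout and type numbers below are the commonly documented ones for H.264. They are not stated in the text above, and the helper name and the subset of type names are illustrative assumptions.

```python
# A few commonly documented nal_unit_type values (illustrative subset).
NAL_TYPE_NAMES = {
    1: "coded slice of a non-IDR picture",
    5: "coded slice of an IDR picture",
    6: "supplemental enhancement information (SEI)",
    7: "sequence parameter set (SPS)",
    8: "picture parameter set (PPS)",
}


def parse_nal_header(first_byte):
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # 1 bit
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # 2 bits
        "nal_unit_type": first_byte & 0x1F,              # 5 bits
    }


for byte in (0x67, 0x68, 0x65, 0x41):
    header = parse_nal_header(byte)
    name = NAL_TYPE_NAMES.get(header["nal_unit_type"], "other")
    print(hex(byte), header, name)
```

Because the type is visible in the first byte, a network element can find parameter sets or discard non-reference data without parsing the slice payload.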

A simple automatic process for preventing the accidental emulation of start codes, which are special sequences of bits in the coded data that allow random access into the bitstream and recovery of byte alignment in systems that can lose byte synchronization.
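The start code emulation prevention process can be sketched as a byte-stuffing pass. The rule used below is the commonly documented one, not a quotation from the text above: whenever two zero bytes would be followed by a byte of value 0x03 or less, an extra 0x03 byte is inserted, so the start code prefix 0x000001 never appears inside a payload. The function names are hypothetical, and handling of payloads that end in zero bytes is not covered.

```python
def add_emulation_prevention(payload: bytes) -> bytes:
    """Insert 0x03 after any 0x00 0x00 pair that precedes a byte <= 0x03."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just written
    for b in payload:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(data: bytes) -> bytes:
    """Drop each 0x03 byte that directly follows a 0x00 0x00 pair."""
    out = bytearray()
    zeros = 0
    for b in data:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


raw = b"\x25\x00\x00\x01\x00\x00\x00\x00\x03\x7f"
escaped = add_emulation_prevention(raw)
print(escaped.hex(" "))
```

A decoder can therefore search for start codes with a plain byte scan and undo the stuffing afterwards.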

Supplemental enhancement information (SEI) and video usability information (VUI), which are extra information that can be inserted into the bitstream to enhance the use of the video for a wide variety of purposes.

Auxiliary pictures, which can be used for such purposes as alpha compositing.

Support of monochrome, 4:2:0, 4:2:2, and 4:4:4 chroma subsampling (depending on the selected profile).

Support of sample bit depth precision ranging from 8 to 14 bits per sample (depending on the selected profile).

The ability to encode individual color planes as distinct pictures with their own slice structures, macroblock modes, motion vectors, etc., allowing encoders to be designed with a simple parallelization structure (supported only in the three 4:4:4-capable profiles).

Picture order count, a feature that serves to keep the ordering of the pictures and the values of samples in the decoded pictures isolated from timing information, allowing timing information to be carried and controlled/changed separately by a system without affecting decoded picture content.

These techniques, along with several others, help H.264 to perform significantly better than any prior standard under a wide variety of circumstances in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half of the bit rate or less, especially at high bit rates and high resolutions.[22]

Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded.[23] Its main purpose is to give examples of H.264/AVC features, rather than being a useful application per se. Some reference hardware design work is also under way in the Moving Picture Experts Group. The features listed above are the complete feature set of H.264/AVC, covering all of its profiles. A profile for a codec is a set of features of that codec identified to meet a certain set of specifications of intended applications. This means that many of the features listed are not supported in some profiles. The various profiles of H.264/AVC are discussed in the next section.

Profiles

The standard defines various sets of capabilities, which are referred to as profiles, targeting specific classes of applications.

Profiles for non-scalable 2D video applications include the following:

Constrained Baseline Profile (CBP)

Primarily for low-cost applications, this profile is most typically used in videoconferencing and mobile applications. It corresponds to the subset of features that are in common between the Baseline, Main, and High Profiles described below.

Baseline Profile (BP)

Primarily for low-cost applications that require additional data loss robustness, this profile is used in some videoconferencing and mobile applications. This profile includes all features that are supported in the Constrained Baseline Profile, plus three additional features that can be used for loss robustness (or for other purposes such as low-delay multi-point video stream compositing). The importance of this profile has faded somewhat since the definition of the Constrained Baseline Profile in 2009. All Constrained Baseline Profile bitstreams are also considered to be Baseline Profile bitstreams, as these two profiles share the same profile identifier code value.

Main Profile (MP)

This profile is used for standard-definition digital TV broadcasts that use the MPEG-4 format as defined in the DVB standard[24]. It is not, however, used for high-definition television broadcasts, as the importance of this profile faded when the High Profile was developed in 2004 for that application.

Extended Profile (XP)

Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.

High Profile (HiP)

The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (for example, this is the profile adopted by the Blu-ray Disc storage format and the DVB HDTV broadcast service).

High 10 Profile (Hi10P)

Going beyond typical mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.

High 4:2:2 Profile (Hi422P)

Primarily targeting professional applications that use interlaced video, this profile builds on top of the High 10 Profile, adding support for the 4:2:2 chroma subsampling format while using up to 10 bits per sample of decoded picture precision.

High 4:4:4 Predictive Profile (Hi444PP)

This profile builds on top of the High 4:2:2 Profile, supporting up to 4:4:4 chroma sampling, up to 14 bits per sample, and additionally supporting efficient lossless region coding and the coding of each picture as three separate color planes.

For camcorders, editing, and professional applications, the standard contains four additional all-Intra profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:

High 10 Intra Profile

The High 10 Profile constrained to all-Intra use.

High 4:2:2 Intra Profile

The High 4:2:2 Profile constrained to all-Intra use.

High 4:4:4 Intra Profile

The High 4:4:4 Profile constrained to all-Intra use.

CAVLC 4:4:4 Intra Profile

The High 4:4:4 Profile constrained to all-Intra use and to CAVLC entropy coding (i.e., not supporting CABAC).

As a result of the Scalable Video Coding (SVC) extension, the standard contains three additional scalable profiles, which are defined as a combination of a H.264/AVC profile for the base layer (identified by the second word in the scalable profile name) and tools that achieve the scalable extension:

Scalable Baseline Profile

Primarily targeting video conferencing, mobile, and surveillance applications, this profile builds on top of a constrained version of the H.264/AVC Baseline profile to which the base layer (a subset of the bitstream) must conform. For the scalability tools, a subset of the available tools is enabled.

Scalable High Profile

Primarily targeting broadcast and streaming applications, this profile builds on top of the H.264/AVC High Profile to which the base layer must conform.

Scalable High Intra Profile

Primarily targeting production applications, this profile is the Scalable High Profile constrained to all-Intra use.

As a result of the Multiview Video Coding (MVC) extension, the standard contains two multiview profiles:

Stereo High Profile

This profile targets two-view stereoscopic 3D video and combines the tools of the High profile with the inter-view prediction capabilities of the MVC extension.

Multiview High Profile

This profile supports two or more views using both inter-picture (temporal) and MVC inter-view prediction, but does not support field pictures and macroblock-adaptive frame-field coding.

Predefined profiles

Feature                             CBP      BP       XP       MP       HiP      Hi10P    Hi422P       Hi444PP
B slices                            No       No       Yes      Yes      Yes      Yes      Yes          Yes
SI and SP slices                    No       No       Yes      No       No       No       No           No
Flexible macroblock ordering (FMO)  No       Yes      Yes      No       No       No       No           No
Arbitrary slice ordering (ASO)      No       Yes      Yes      No       No       No       No           No
Redundant slices (RS)               No       Yes      Yes      No       No       No       No           No
Data partitioning                   No       No       Yes      No       No       No       No           No
Interlaced coding (PicAFF, MBAFF)   No       No       Yes      Yes      Yes      Yes      Yes          Yes
CABAC entropy coding                No       No       No       Yes      Yes      Yes      Yes          Yes
8×8 vs. 4×4 transform adaptivity    No       No       No       No       Yes      Yes      Yes          Yes
Quantization scaling matrices       No       No       No       No       Yes      Yes      Yes          Yes
Separate Cb and Cr QP control       No       No       No       No       Yes      Yes      Yes          Yes
Monochrome (4:0:0)                  No       No       No       No       Yes      Yes      Yes          Yes
Chroma formats                      4:2:0    4:2:0    4:2:0    4:2:0    4:2:0    4:2:0    4:2:0/4:2:2  4:2:0/4:2:2/4:4:4
Sample depths (bits)                8        8        8        8        8        8 to 10  8 to 10      8 to 14
Separate color plane coding         No       No       No       No       No       No       No           Yes
Predictive lossless coding          No       No       No       No       No       No       No           Yes
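One way to use the feature table above is as a lookup: given the coding tools a stream needs, list the profiles that offer all of them. The sketch below copies a few Yes/No rows from the table into a dictionary. The data structure and the `profiles_supporting` helper are hypothetical conveniences, not part of the standard.

```python
PROFILES = ["CBP", "BP", "XP", "MP", "HiP", "Hi10P", "Hi422P", "Hi444PP"]
HIGH_FAMILY = {"HiP", "Hi10P", "Hi422P", "Hi444PP"}

# Profiles marked "Yes" for each feature, copied from rows of the table above.
FEATURE_SUPPORT = {
    "B slices": {"XP", "MP"} | HIGH_FAMILY,
    "SI and SP slices": {"XP"},
    "Flexible macroblock ordering (FMO)": {"BP", "XP"},
    "Arbitrary slice ordering (ASO)": {"BP", "XP"},
    "Redundant slices (RS)": {"BP", "XP"},
    "Data partitioning": {"XP"},
    "Interlaced coding (PicAFF, MBAFF)": {"XP", "MP"} | HIGH_FAMILY,
    "CABAC entropy coding": {"MP"} | HIGH_FAMILY,
    "8×8 vs. 4×4 transform adaptivity": set(HIGH_FAMILY),
    "Predictive lossless coding": {"Hi444PP"},
}


def profiles_supporting(*features):
    """Return the profiles, in table order, that support every listed feature."""
    return [p for p in PROFILES
            if all(p in FEATURE_SUPPORT[f] for f in features)]


print(profiles_supporting("CABAC entropy coding", "Interlaced coding (PicAFF, MBAFF)"))
print(profiles_supporting("B slices", "Flexible macroblock ordering (FMO)"))
```

An empty result, such as for CABAC combined with data partitioning, means no single profile in the table offers that combination.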

Levels

Levels with maximum property values

Level  Max macroblocks        Max video bit rate (VCL), kbit/s               Examples for high resolution @ frame rate
       per second  per frame  BP, XP, MP  HiP      Hi10P    Hi422P, Hi444PP  (max stored frames)
1      1,485       99         64          80       192      256              128×[email protected] (8)
1b     1,485       99         128         160      384      512              128×[email protected] (8)
1.1    3,000       396        192         240      576      768              176×[email protected] (9)
1.2    6,000       396        384         480      1,152    1,536            320×[email protected] (7)
1.3    11,880      396        768         960      2,304    3,072            320×[email protected] (7)
2      11,880      396        2,000       2,500    6,000    8,000            320×[email protected] (7)
2.1    19,800      792        4,000       5,000    12,000   16,000           352×[email protected] (7)
2.2    20,250      1,620      4,000       5,000    12,000   16,000           352×[email protected] (10)
3      40,500      1,620      10,000      12,500   30,000   40,000           352×[email protected] (12)
3.1    108,000     3,600      14,000      17,500   42,000   56,000           720×[email protected] (13)
3.2    216,000     5,120      20,000      25,000   60,000   80,000           1,280×[email protected] (5)
                                                                             1,280×1,[email protected] (4)
4      245,760     8,192      20,000      25,000   60,000   80,000           1,280×[email protected] (9)
                                                                             1,920×1,[email protected] (4)
                                                                             2,048×1,[email protected] (4)
4.1    245,760     8,192      50,000      62,500   150,000  200,000          1,280×[email protected] (9)
                                                                             1,920×1,[email protected] (4)
                                                                             2,048×1,[email protected] (4)
4.2    522,240     8,704      50,000      62,500   150,000  200,000          1,920×1,[email protected] (4)
                                                                             2,048×1,[email protected] (4)
5      589,824     22,080     135,000     168,750  405,000  540,000          1,920×1,[email protected] (13)
                                                                             2,048×1,[email protected] (13)
                                                                             2,048×1,[email protected] (12)
                                                                             2,560×1,[email protected] (5)
                                                                             3,680×1,[email protected] (5)
5.1    983,040     36,864     240,000     300,000  720,000  960,000          1,920×1,[email protected] (16)
                                                                             4,096×2,[email protected] (5)
                                                                             4,096×2,[email protected] (5)
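The two macroblock columns of the table above bound the frame size and frame rate that a level allows. The sketch below is a rough illustration under stated assumptions: frame dimensions are rounded up to whole 16×16 macroblocks, only the macroblocks-per-frame and macroblocks-per-second limits are checked, and the other level constraints (bit rate, buffer sizes, and so on) are ignored. The function names are hypothetical.

```python
import math

# (max macroblocks per second, max macroblocks per frame), copied from the table above.
LEVEL_LIMITS = {
    "1": (1485, 99), "1b": (1485, 99), "1.1": (3000, 396), "1.2": (6000, 396),
    "1.3": (11880, 396), "2": (11880, 396), "2.1": (19800, 792),
    "2.2": (20250, 1620), "3": (40500, 1620), "3.1": (108000, 3600),
    "3.2": (216000, 5120), "4": (245760, 8192), "4.1": (245760, 8192),
    "4.2": (522240, 8704), "5": (589824, 22080), "5.1": (983040, 36864),
}


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame of the given size."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def max_frame_rate(level, width, height):
    """Highest frame rate allowed by the macroblock limits, or None if the
    frame has more macroblocks than the level permits."""
    max_mbs_per_second, max_mbs_per_frame = LEVEL_LIMITS[level]
    mbs = macroblocks_per_frame(width, height)
    if mbs > max_mbs_per_frame:
        return None
    return max_mbs_per_second / mbs


print(macroblocks_per_frame(1920, 1080))
print(max_frame_rate("4", 1920, 1080))
print(max_frame_rate("3.1", 1920, 1080))
```

A 1920×1080 frame needs 120 × 68 = 8,160 macroblocks, which fits within Level 4 (8,192 per frame) but not within Level 3.1 (3,600 per frame).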

Decoded picture buffering

Previously-decoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. Such pictures are stored in a virtual decoded picture buffer (DPB). The maximum capacity of the DPB in units of frames (or pairs of fields), as shown in parentheses in the right column of the table above, can be computed as follows:

Min(Floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)

where MaxDpbMbs is a constant value provided in the table below as a function of level number, and PicWidthInMbs and FrameHeightInMbs are the picture width and frame height for the coded video data, expressed in units of macroblocks. (This formula is specified in sections A.3.1.h and A.3.2.f of the 2009 edition of the standard.)

Level    MaxDpbMbs
1        396
1b       396
1.1      900
1.2      2,376
1.3      2,376
2        2,376
2.1      4,752
2.2      8,100
3        8,100
3.1      18,000
3.2      20,480
4        32,768
4.1      32,768
4.2      34,816
5        110,400
5.1      184,320

For example, for an HDTV picture that is 1920 samples wide (PicWidthInMbs = 120) and 1080 samples high (FrameHeightInMbs = 68), a Level 4 decoder has a maximum DPB storage capacity of Floor(32768/(120*68)) = 4 frames (or 8 fields). Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.
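The calculation above can be sketched in a few lines of Python. This is only an illustration, not text from the standard: the function name dpb_capacity_frames and the dictionary MAX_DPB_MBS are hypothetical, the MaxDpbMbs values are copied from the table above, and rounding the sample dimensions up to whole 16×16 macroblocks is an assumption chosen to match the worked example (1080 samples giving 68 macroblocks).

```python
import math

# MaxDpbMbs per level, copied from the table above.
MAX_DPB_MBS = {
    "1": 396, "1b": 396, "1.1": 900, "1.2": 2376, "1.3": 2376,
    "2": 2376, "2.1": 4752, "2.2": 8100, "3": 8100, "3.1": 18000,
    "3.2": 20480, "4": 32768, "4.1": 32768, "4.2": 34816,
    "5": 110400, "5.1": 184320,
}

MB_SIZE = 16  # assumption: a macroblock covers 16x16 luma samples


def dpb_capacity_frames(level, width, height):
    """Maximum DPB capacity in frames for a level and a frame size in samples."""
    # Round up so that partially covered macroblocks are counted (1080 -> 68).
    pic_width_in_mbs = math.ceil(width / MB_SIZE)
    frame_height_in_mbs = math.ceil(height / MB_SIZE)
    # Min(Floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
    frames = MAX_DPB_MBS[level] // (pic_width_in_mbs * frame_height_in_mbs)
    return min(frames, 16)


if __name__ == "__main__":
    print(dpb_capacity_frames("4", 1920, 1080))
```

Called with level "4" and a 1920×1080 frame, the function evaluates Floor(32768/(120*68)) and returns 4, in agreement with the example above.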

It is important to note that the current picture being decoded is not included in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame more than the maximum capacity of the DPB as calculated above.


Versions

Versions of the H.264/AVC standard include the following completed revisions, corrigenda, and amendments. Dates are final approval dates in ITU-T; final "International Standard" approval dates in ISO/IEC are somewhat different and, in most cases, slightly later. Each version represents changes relative to the next lower version, which are integrated into the text. Bold faced versions are published (or planned to be published).

Version 1: (May 2003) First approved version of H.264/AVC containing Baseline, Extended, and Main profiles.

Version 2: (May 2004) Corrigendum containing various minor corrections.

Version 3: (March 2005) Major addition to H.264/AVC: the first Amendment, providing the Fidelity Range Extensions (FRExt) with the High, High 10, High 4:2:2, and High 4:4:4 profiles.

Version 4: (September 2005) Corrigendum containing various minor corrections and adding three aspect ratio indicators.

Version 5: (June 2006) Amendment consisting of removal of prior High 4:4:4 profile (processed as a corrigendum in ISO/IEC).

Version 6: (June 2006) Amendment consisting of minor extensions like extended-gamut color space support (bundled with above-mentioned aspect ratio indicators in ISO/IEC).

Version 7: (April 2007) Amendment containing the addition of High 4:4:4 Predictive and four Intra-only profiles (High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra, and CAVLC 4:4:4 Intra).

Version 8: (November 2007) Major addition to H.264/AVC: the Amendment for Scalable Video Coding (SVC), with the Scalable Baseline, Scalable High, and Scalable High Intra profiles.

Version 9: (January 2009) Corrigendum containing minor corrections.

Version 10: (March 2009) Amendment containing definition of a new profile (the Constrained Baseline profile) with only the common subset of capabilities supported in various previously-specified profiles.

Version 11: (March 2009) Major addition to H.264/AVC: the Amendment for the Multiview Video Coding (MVC) extension, including the Multiview High profile.

Version 12: (November 2009) Amendment containing definition of a new MVC profile (the Stereo High profile) for two-view video coding with support of interlaced coding tools and specifying an additional SEI message (the frame packing arrangement SEI message).

Version 13: (November 2009) Corrigendum containing minor corrections.


Software encoder feature comparison

[citation needed]

AVC software implementations

Feature                             QT   Nero    LEAD    x264  MainConcept  DivX    Dicas  Elecard  TSE   VSofts   ProCoder  Avivo  Elemental  IPP
B slices                            Yes  Yes     Yes     Yes   Yes          Yes     Yes    Yes      Yes   Yes      Yes       No     Yes        Yes
SI and SP slices                    No   No      No      No    No           No      No     No       No    No       No        No     No         No
Multiple reference frames           Yes  Yes     Yes     Yes   Yes          Yes     Yes    Yes      Yes   Yes      Yes       No     Yes        Yes
Flexible Macroblock Ordering (FMO)  No   No      No      No    No           No      No     No       No    Yes      No        No     No         No
Arbitrary slice ordering (ASO)      No   No      No      No    No           No      No     No       No    No       No        No     No         No
Redundant slices (RS)               No   No      No      No    No           No      No     No       No    No       No        No     No         No
Data partitioning                   No   No      No      No    No           No      No     No       No    No       No        No     No         No
Interlaced coding (PicAFF, MBAFF)   No   MBAFF   MBAFF   Yes   Yes          Yes     Yes    Yes      No    MBAFF    Yes       MBAFF  Yes        No
CABAC entropy coding                Yes  Yes     Yes     Yes   Yes          Yes     Yes    Yes      Yes   Yes      Yes       No     Yes        Yes
8×8 vs. 4×4 transform adaptivity    No   Yes     Yes     Yes   Yes          Yes     Yes    Yes      Yes   Yes      Yes       No     Yes        Yes
Quantization scaling matrices       No   No      No      Yes   Yes          Yes     No     No       No    Yes      No        No     No         No
Separate Cb and Cr QP control       No   No      No      Yes   Yes          Yes     Yes    Yes      No    No       No        No     No         No
Monochrome (4:0:0)                  No   No      No      No    No           No      No     Yes      No    No       No        No     No         No
Chroma formats (4:2:x)              0    0       0       0     0, 2         0, 2    0      0, 2     0, 2  0, 2, 4  0         0      0          0
Largest sample depth (bit)          8    8       8       8     10           10      8      8        8     10       8         8      8          12
Separate color plane coding         No   No      No      No    No           No      No     No       No    No       No        No     No         No
Predictive lossless coding          No   No      No      Yes   No           No      No     Yes      No    No       No        No     No         No
Film grain modeling                 No   No      No      No    No           No      No     No       No    No       No        No     No         No

Fully supported profiles

Profile                             QT   Nero    LEAD    x264  MainConcept  DivX    Dicas  Elecard  TSE   VSofts   ProCoder  Avivo  Elemental  IPP
Constrained baseline                Yes  Yes     Yes     Yes   Yes          Yes     Yes    Yes      Yes   Yes      Yes       Yes    Yes        Yes
Baseline                            No   No      No      No    No           No      No     No       No    No       No        No     No         No
Extended                            No   No      No      No    No           No      No     No       No    No       No        No     No         No
Main                                No   Yes/No  Yes/No  Yes   Yes/No       Yes/No  Yes    Yes      No    Yes/No   Yes       No     Yes        No
High                                No   No      No      No    No           No      No     No       No    No       No        No     No         No


Hardware encoder and IP

Because H.264 encoding requires significant computing power, software encoders running on general-purpose CPUs are typically slow, especially when dealing with HD content. To offload the CPU, or to encode in real time, hardware encoders may be employed.

A hardware H.264 encoder can be an ASIC or an FPGA. An FPGA is a general-purpose programmable chip; to use one as a hardware encoder, an H.264 encoder IP is required. As of 2009, a full HD (Main profile, level 4.1, 1080p, 30 fps) H.264 encoder could run on a single low-cost FPGA chip.

ASICs with H.264 encoding functionality are available from many different semiconductor companies, but the H.264 encoder IP used in these ASICs is mostly licensed from a few IP vendors. Some vendors offer IP for FPGA only or for ASIC only, and some offer IP for both.[25]


See also

Codec

Comparison of video on demand services

Comparison of H.264 and VC-1

Dirac (codec) - An open competitor to H.264

H.263

H.264/MPEG-4 AVC Products and Implementations

H.265

IPTV

ISO/IEC Moving Picture Experts Group (MPEG)

ITU-T Video Coding Experts Group (VCEG)

MPEG-2

MPEG-4

Scalable Video Coding

Theora - Another open competitor to H.264 for use online.

x264 - Software encoder


Notes

^ Not all copyright holders are necessarily represented by the project as it is free software.


References

^ "H.262 : Information technology - Generic coding of moving pictures and associated audio information: Video". Retrieved 2007-04-15.

^ H.264 Joint Video Surveillance Group Compression Research Data: 2008

^ Wenger, et al.. RFC 3984 : RTP Payload Format for H.264 Video. p. 2.

^ Memorystick.org

^ ATSC Standard A/72 Part 1: Video System Characteristics of AVC in the ATSC Digital Television System

^ ATSC Standard A/72 Part 2: AVC Video Transport Subsystem Characteristics

^ ATSC Standard A/153 Part 7: AVC and SVC Video System Characteristics

^ "Summary of AVC/H.264 License Terms". Retrieved 2010-03-25.

^ "OMS Video, A Project of Sun's Open Media Commons Initiative". Retrieved 2008-08-26.

^ http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/020737.html

^ http://www.mpegla.com/Lists/MPEG%20LA%20News%20List/Attachments/226/n-10-02-02.pdf

^ http://www.mpegla.com/main/programs/AVC/Pages/FAQ.aspx

^ US Patent 5,452,104

^ US Patent 5,576,767

^ a b c See Qualcomm Inc. v. Broadcom Corp., No. 2007-1545, 2008-1162 (Fed. Cir. Dec. 1, 2008). For articles in the popular press, see signonsandiego.com, "Qualcomm loses its patent-rights case" and "Qualcomm's patent case goes to jury"; and bloomberg.com "Broadcom Wins First Trial in Qualcomm Patent Dispute".

^ GNU.org, GPL version 3, section 10

^ GPL, version 3, section 7

^ a b c GPL, version 3, section 11

^ FFmpeg Licence and Legal Considerations

^ Gary J. Sullivan, Pankaj Topiwala, and Ajay Luthra (2007) The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions, Retrieved 2009-10-08

^ a b c RFC 3984, p.3

^ Apple: H.264 FAQ

^ H.264/AVC JM Reference Software Download

^ http://www.etsi.org/deliver/etsi_ts/101100_101199/101154/01.09.01_60/ts_101154v010901p.pdf

^ Design-reuse.com


Original (English): H.264/MPEG-4 AVC

Translation: © rusxg, Michael_Dragunov, blueboar2, gHOSST666, Atreides .

License: Material from Wikipedia
