Applications
The standards that make up Book B
MPEG-2 restrictions
Subpictures
Vertical Blanking Interval (VBI) packets
Letterboxing
Audio
Interactive Functions
Title structure breakdown
Presentation of Program Chains
Standards
Video |
ITU-T H.262 / ISO/IEC 13818-2 (MPEG-2 Video) ISO/IEC 11172-2 (MPEG-1 Video) |
Audio |
ISO/IEC 13818-3 (MPEG-2 Audio) ISO/IEC 11172-3 (MPEG-1 Audio) Dolby AC-3 standard |
System |
ITU-T H.222 / ISO/IEC 13818-1 (MPEG-2 Systems) Program/PES stream only (no Transport streams) |
Video Data Specifications
DVD adds many additional restrictions to the popular compliance parameter sets of MPEG. One good example is the restriction on the coded size of a picture: MPEG-2 Main Profile @ Main Level allows any coded frame size between 16 and 720 pixels horizontally and 16 and 576 pixels vertically. DVD, however, restricts the coded frame sizes to a very limited but practical subset.
In MPEG, audio can be coded at a sample rate of 32, 44.1 or 48 kHz. In DVD, the rates of both Dolby AC-3 and MPEG audio are strictly set to 48 kHz.
MPEG is a generic representation meant for a wide variety of applications. DVD has taken a practical subset to promote interoperability by simplifying implementations and ensuring features (such as random accessibility).
Coded representation | MPEG-1 (SIF combo) MPEG-2 (Main Profile @ Main Level) |
Frame rate | 29.97 or 25 Hz |
TV system | 525/60 or 625/50 |
Aspect ratio | 4:3 (all video formats) 16:9 (all formats except 352 pixels/line) |
Display Mode | pan & scan, letterbox |
User_data | closed caption |
Coded frame sizes | 525/60: 720x480, 704x480, 352x480, 352x240 625/50: 720x576, 704x576, 352x576, 352x288 (MPEG-1 is allowed only in 352x240 or 352x288 res). |
GOP size | max 36 fields or 18 frames (NTSC) max 30 fields or 15 frames (PAL) |
Buffer size | 1.835008 Mbits (MPEG-2) max 327680 bits (MPEG-1) |
Transfer method | VBR, CBR (MPEG-2) only CBR for MPEG-1 |
Maximum bitrate | 9.8 Mbit/sec |
Low_delay | NOT permitted !!!! |
Notes:
the frame rate is the intended display frame rate. The number of coded frames in a sequence may vary due to 3:2 pulldown (the DVD MPEG decoder performs this function). The permitted values in DVD are more restrictive than MPEG-2 MP@ML which also includes 23.976, 24, and 30 frames/sec rates.
aspect ratio is the display aspect ratio. Only 16:9 and 4:3 are permitted. Note: MP@ML's 2.21:1 is not included.
MP@ML has no GOP size restriction. In fact, the GOP() is considered to be an insignificant layer in MPEG-2. Instead the sequence() layer serves as the most important boundary in the generic MPEG sense.
The MPEG-2 and MPEG-1 vbv_buffer_size limits are the same as MP@ML and Constrained Parameters Bitstreams, respectively.
The maximum bitrate of 9.8 Mbit/sec is more restrictive than MP@ML's 15 Mbit/sec limit. However, the point of diminishing returns (no visual difference between original video and compressed video) is widely known to be around 9 Mbit/sec.
user_data() fields in MPEG video picture headers contain closed captioning (similar to Grand Alliance and DVB methods). See this ATSC (Advanced Television Systems Committee) site for more information: http://www.atsc.org/
[is this the same in DVB, DVD, and ATSC ? ]
For picture sizes, only a very limited set of coded dimensions is legal.
Variable bit rate is permitted only in MPEG-2 streams since the VBV model in MPEG-2 has provisions for it. MPEG-1 was an earlier standard (by two years) and did not develop the VBV model to handle 3:2 pulldown cases.
Contrary to popular belief, all DVD players are required to decode video streams up to 9.8 Mbit/sec for indefinite periods of time. The popular average rates of 3.5 Mbit/sec or 4.7 Mbit/sec are merely canonical figures created by the notion that only single-sided, single-layer discs will hold feature-length films. Should single-sided, double-layer discs prevail, the average rate would be almost twice as great. ALL DVD PLAYERS MUST SUSTAIN A 9.8 MBIT/SEC VIDEO DECODE RATE! Hardwired (Application Specific Integrated Circuit, ASIC) implementations of MPEG-2 MP@ML decoders are generally capable of handling 15 Mbit/sec sustained rates.
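The restrictions in the table and notes above lend themselves to a simple conformance check. The following is a minimal sketch, not a reference implementation: the `VideoStream` record and the `dvd_violations` function are hypothetical names invented for illustration, and only the limits quoted in the table (coded frame sizes, aspect ratio, GOP length, maximum bitrate, CBR-only MPEG-1, no low_delay) are tested.

```python
from dataclasses import dataclass
from typing import List

# Limits transcribed from the table above.
CODED_FRAME_SIZES = {
    "525/60": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "625/50": {(720, 576), (704, 576), (352, 576), (352, 288)},
}
MPEG1_FRAME_SIZE = {"525/60": (352, 240), "625/50": (352, 288)}
MAX_GOP_FRAMES = {"525/60": 18, "625/50": 15}
MAX_VIDEO_BITRATE = 9_800_000  # bits/sec


@dataclass
class VideoStream:
    """Hypothetical description of a coded video stream (illustration only)."""
    tv_system: str      # "525/60" or "625/50"
    mpeg_version: int   # 1 or 2
    width: int
    height: int
    aspect: str         # display aspect ratio, e.g. "4:3"
    gop_frames: int
    bitrate: int        # bits/sec
    vbr: bool = False
    low_delay: bool = False


def dvd_violations(s: VideoStream) -> List[str]:
    """Return a list of the DVD restrictions (as listed above) that s breaks."""
    if s.tv_system not in CODED_FRAME_SIZES:
        return [f"unknown TV system {s.tv_system!r}"]
    problems = []
    size = (s.width, s.height)
    if s.mpeg_version == 1:
        if size != MPEG1_FRAME_SIZE[s.tv_system]:
            problems.append("MPEG-1 is allowed only at 352x240 or 352x288")
        if s.vbr:
            problems.append("MPEG-1 must be CBR")
    elif size not in CODED_FRAME_SIZES[s.tv_system]:
        problems.append(f"coded frame size {s.width}x{s.height} not allowed")
    if s.aspect not in ("4:3", "16:9"):
        problems.append("aspect ratio must be 4:3 or 16:9")
    elif s.aspect == "16:9" and s.width == 352:
        problems.append("16:9 is not allowed at 352 pixels/line")
    if s.gop_frames > MAX_GOP_FRAMES[s.tv_system]:
        problems.append("GOP too long")
    if s.bitrate > MAX_VIDEO_BITRATE:
        problems.append("bitrate exceeds 9.8 Mbit/sec")
    if s.low_delay:
        problems.append("low_delay is not permitted")
    return problems


ok = VideoStream("525/60", 2, 720, 480, "16:9", gop_frames=15,
                 bitrate=6_000_000, vbr=True)
bad = VideoStream("625/50", 1, 352, 288, "16:9", gop_frames=18,
                  bitrate=1_150_000, vbr=True)
print(dvd_violations(ok))
print(dvd_violations(bad))
```

The second stream is legal MPEG-1 in the generic sense but trips three DVD-specific rules at once (VBR, 16:9 at 352 pixels/line, and an 18-frame GOP on a 625/50 title).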
MPEG Display Formats
MPEG-2 video decoder chips have implemented pan & scan for a few years already since it has been a requirement for cable TV and direct broadcast satellite applications. The letterbox requirement (vertical filter) is a relatively new addition to the MPEG decoder universe. The second generation DVD MPEG-2 video decoders will most likely perform "on-chip" sub-picture reconstruction.
Aspect ratio | Display 4:3 | Display 16:9 |
Source 4:3 | No conversion | horizontal filtering accomplished by TV monitor. |
Source 16:9 | letterbox (vertical filter) or Pan & Scan | No conversion |
Note: Letterbox Conversion is a mandatory feature in the DVD Player !!!
Subpictures
run-length compressed bitmaps that are overlaid on top of the MPEG reconstructed video.
Applications include: Menus, sub-titles, karaoke, and simple animation.
Pixels are divided into four types: 1. Background 2. Foreground 3. Emphasis-1 4. Emphasis-2
4 colors out of 16 color palette (4 colors are determined once per PGC).
4 out of 16 contrast values
up to a maximum of 32 sub-picture bitstreams. Each subpicture stream could, for example, contain text from a particular language.
subpicture buffer size is restricted to 62 Kbytes. This means a maximum of 62 KB per GOP/cell. 32 Kbytes of this is control data.
Maximum number of bits per run-length coded line is 1440 bits.
Display area maximum: 720x480 (525/60) and 720x576 (625/50)
area, content, color, and contrast can be changed every video field
Sub-Picture Display Control Sequences (SP_DCSQ) control the presentation of Sub-pictures. Presentation effects include: scroll up/down, fade in/out, etc.
Structure of Sub-picture Decoding Unit (SPU):
[ SPUH ][ PXD ][ DCSQT ]
SPUH: Sub-picture Unit Header (size of SPU, start address of DCSQT)
PXD: Pixel Data (variable length run-length coded)
DCSQT: Display Control Sequence Table (one or more display control command sequences).
DCSQT: [DCSQ 0][DCSQ 1][DCSQ 2] ... [DCSQ n]
DCSQ: [Start time] [ Pointer to next DCSQ] [Command Sequence]
Command Sequence: [DCC 0][DCC 1]... [DCC m]
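The nesting of SPUH, PXD, DCSQT and DCSQ described above can be sketched as a small parser. This is an illustration under stated assumptions, not the format definition: the notes do not give field widths, so the code assumes every size, address, start time and pointer is a 16-bit big-endian value measured from the start of the SPU, and that the last DCSQ marks the end of the chain by pointing at itself. The names `parse_spu`, `SPU` and `DCSQ` are hypothetical, and the command bytes are returned undecoded.

```python
import struct
from typing import List, NamedTuple


class DCSQ(NamedTuple):
    offset: int        # position of this DCSQ inside the SPU
    start_time: int    # raw start-time field
    next_offset: int   # pointer to the next DCSQ
    commands: bytes    # undecoded display control commands (DCC)


class SPU(NamedTuple):
    size: int
    dcsqt_offset: int
    pixel_data: bytes
    sequences: List[DCSQ]


def parse_spu(buf: bytes) -> SPU:
    """Split an SPU into header fields, pixel data and its DCSQ chain.

    ASSUMPTION: all header and DCSQ fields are 16-bit big-endian, and the
    final DCSQ points to itself. Neither is stated in the notes above.
    """
    size, dcsqt_offset = struct.unpack_from(">HH", buf, 0)
    if size > len(buf) or not 4 <= dcsqt_offset <= size - 4:
        raise ValueError("inconsistent SPU header")
    pixel_data = buf[4:dcsqt_offset]          # PXD sits between SPUH and DCSQT
    sequences = []
    offset = dcsqt_offset
    while True:
        if offset + 4 > size:
            raise ValueError("DCSQ runs past end of SPU")
        start_time, next_offset = struct.unpack_from(">HH", buf, offset)
        last = next_offset <= offset          # assumed end-of-chain marker
        end = size if last else next_offset
        sequences.append(DCSQ(offset, start_time, next_offset,
                              buf[offset + 4:end]))
        if last:
            break
        offset = next_offset
    return SPU(size, dcsqt_offset, pixel_data, sequences)


# Synthetic example: 6 bytes of pixel data and two DCSQs (placeholder commands).
example = (struct.pack(">HH", 22, 10) + b"\xaa" * 6
           + struct.pack(">HH", 0, 16) + b"\x01\xff"
           + struct.pack(">HH", 50, 16) + b"\x02\xff")
spu = parse_spu(example)
print(len(spu.pixel_data), [(s.start_time, s.commands) for s in spu.sequences])
```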
Display Control Commands (DCC):
VBI Decoding
only 1 VBI channel per program (sub-pictures have up to 32)
Line range is from 10 to 23 NTSC and 6 to 23.5 for PAL.
Separate palette (16 Y values, Cr=Cb=128) from subpictures.
No highlight
Restricted DCSQ command set
closed captioning for NTSC-rate (525/60) is coded exclusively in the user_data() field of the group_of_pictures_header()
VBI information is losslessly represented as a waveform, and coded into packets. The 525/60 player uses a far more efficient alternative: the source character stream is coded in the MPEG video user_data() field. The NTSC/PAL modulator chip then creates the VBI closed caption (Line 21) signal from this character stream.
This brings our tally of closed caption representations to THREE ways!!
as packets of 16-level sampled VBI waveforms (PAL)
as user_data() character streams (NTSC)
as rendered subpictures (NTSC and PAL)
Letterbox
Picture Size Conversion
All DVD players are required to have built-in vertical filters which scale a 16:9 coded video image to fit within a traditional 4:3 display. This player feature is needed since it is anticipated that a majority of movies will be coded for the 16:9 aspect ratio, while at the same time most TV displays (in the early years) will be 4:3. In the same vein as multilingual audio, a single coded aspect ratio in market distribution reduces confusion and bolsters economy of scale.
525/60 (NTSC-rate display):
(Note: 480*(4/3)/(16/9) = 480*0.75 = 360)
_____________________
| 60 |
|---------------------|
| |
| 360 | 480 lines total
| |
|---------------------|
| 60 |
-----------------------
625/50 (PAL-rate display):
(Note: 576*(4/3)/(16/9) = 576*0.75 = 432)
_____________________
| 72 |
|---------------------|
| |
| 432 | 576 lines total
| |
|---------------------|
| 72 |
-----------------------
A simple bi-linear vertical filter can be applied, yielding good visual results. Here, two source samples (s[n],s[n+1]) are weighted by simple complementary factors and added together to form the destination sample value (d[m]). These weights are easily implemented with shifters. For interlaced displays, vertical filtering occurs only within the same field parity.
d[0] = (3/4)*s[0] + (1/4)*s[1]
d[1] = (1/2)*s[1] + (1/2)*s[2]
d[2] = (1/4)*s[2] + (3/4)*s[3]
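The three equations above can be sketched in code as follows. This is an illustrative sketch, assuming the three-tap pattern simply repeats for every group of four source lines (the text shows only the first group) and that results are rounded by adding half the divisor before shifting. The function name `letterbox_column` is hypothetical. For interlaced material it would be applied to the lines of each field separately, as the text requires.

```python
from typing import List, Sequence


def letterbox_column(src: Sequence[int]) -> List[int]:
    """Scale one column of samples vertically by 3/4 (4 lines in, 3 out).

    ASSUMPTION: the weight pattern d[0..2] repeats for each group of four
    source lines, and rounding uses a +half offset before the shift.
    """
    if len(src) % 4:
        raise ValueError("line count must be a multiple of 4")
    dst = []
    for base in range(0, len(src), 4):
        s0, s1, s2, s3 = src[base:base + 4]
        dst.append((3 * s0 + s1 + 2) >> 2)   # (3/4)*s[0] + (1/4)*s[1]
        dst.append((s1 + s2 + 1) >> 1)       # (1/2)*s[1] + (1/2)*s[2]
        dst.append((s2 + 3 * s3 + 2) >> 2)   # (1/4)*s[2] + (3/4)*s[3]
    return dst


print(letterbox_column([0, 4, 8, 12]))        # [1, 6, 11]
print(len(letterbox_column([16] * 480)))      # 360 active lines for 525/60
print(len(letterbox_column([16] * 576)))      # 432 active lines for 625/50
```

Only multiplications by 3 and shifts by 1 or 2 bits are needed, which is why the text describes the weights as easy to build with shifters.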
A decoder can determine whether inter- or intra-parity vertical filtering is applied by testing the progressive_frame flag of the MPEG-2 video stream (MPEG-1 frames are always progressive by definition). This flag indicates whether a picture contains interlaced or progressive vertically correlated information. Almost all MPEG-2 coded movies consist exclusively of progressive frames. In a sense, MPEG-2's interlaced prediction modes are underutilized by DVD.
Audio
Audio data specifications
Parameter | Linear PCM | Dolby AC-3 | MPEG-2 audio |
---|---|---|---|
Sampling frequency | 48 or 96 kHz | 48 kHz | 48 kHz |
Number of bits per sample | 16/20/24 | compressed (16 bits) | compressed (16 bits) |
Max transfer rate | 6.144 Mbit/sec | 448 kbits/sec | 640 kbits/sec |
Max Number of channels | 8 | 5.1 | 5.1 or 7.1 |
Support | NTSC | PAL |
---|---|---|
Mandatory | Dolby AC-3 and/or Linear PCM | MPEG-2 audio and/or Linear PCM |
Optional | MPEG-2 Audio | Dolby AC-3 |
Philips provided three practical scenarios for audio.
Use | Channels | kbits/sec |
---|---|---|
Case 1: One mono language channel to be mixed with the Center channel of the multichannel set. | | |
Multichannel music & effects | 5.1 or 7.1 | 384 |
Mono English dialogue | 1 | 64 |
Mono French dialogue | 1 | 64 |
Mono German dialogue | 1 | 64 |
Case 2: One of the stereo lingual signals mixed with the L & R channels of the playback multichannel set. | | |
Multichannel music & effects | 5.1 or 7.1 | 384 |
Stereo English dialogue | 2 | 128 |
Stereo French dialogue | 2 | 128 |
Stereo German dialogue | 2 | 128 |
Case 3: One to be selected for playback. | | |
Multichannel with English dialogue | 5.1 or 7.1 | 384 |
Multichannel with French dialogue | 5.1 or 7.1 | 384 |
Multichannel with German dialogue | 5.1 or 7.1 | 384 |
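Summing the stream rates of the three scenarios above shows how much each one costs in total. The sketch below is simple arithmetic on the figures quoted in the scenarios. The headroom figure assumes that all listed audio streams are multiplexed alongside the video and count against the 9.8 Mbit/sec combined audio and video limit quoted elsewhere in these notes; the names used are hypothetical.

```python
# Stream rates in kbits/sec, transcribed from the three scenarios above.
CASES = {
    "Case 1": [384, 64, 64, 64],      # music & effects + three mono dialogues
    "Case 2": [384, 128, 128, 128],   # music & effects + three 2-channel dialogues
    "Case 3": [384, 384, 384],        # three complete multichannel mixes
}
COMBINED_LIMIT_KBITS = 9800  # max total of combined audio and video


def audio_total(case: str) -> int:
    """Total audio rate of one scenario in kbits/sec."""
    return sum(CASES[case])


def video_headroom(case: str) -> int:
    """Rate left for video if every audio stream of the scenario is multiplexed."""
    return COMBINED_LIMIT_KBITS - audio_total(case)


for name in CASES:
    print(name, audio_total(name), video_headroom(name))
```

Under these assumptions, Case 1 costs 576 kbits/sec, Case 2 costs 768 kbits/sec, and Case 3 costs 1152 kbits/sec, so carrying full multichannel mixes per language is twice as expensive as mixing in mono dialogue.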
Audio Signal Decoding System
General
48 kHz and 96 kHz uncompressed PCM audio High Definition Audio Experience
Dolby AC-3 parameters
Sampling frequency | 48 kHz |
bitrate | 64 kbits/sec to 448 kbits/sec per stream |
Audio coding mode | 1/0, 2/0, 3/0, 2/1, 2/2, 3/1, and 3/2 (acmod) |
Characteristics |
|
MPEG Audio parameters
Sampling frequency | 48 kHz |
MPEG-1: |
Layer II only Mono (32 to 192 kb/s) and Stereo (64 to 384 kb/s) |
MPEG-2 |
|
LPCM Coding
Due to the user rate bandwidth limitation of 6.144 Mbit/sec for any LPCM audio stream, not all combinations of channel count, sample precision and sample rates are permitted. However, up to 8 separate streams are permitted, as long as the combined stream rate is less than or equal to 9.8 Mbit/sec. DVD nomenclature states that a single LPCM stream consists of one to 8 channels.
Sample Rate (Hz) | Sample Prec. (bits) | Mono | 2 CH | 5 CH | 8 CH |
---|---|---|---|---|---|
48,000 | 16 | Yes | Yes | Yes | Yes |
48,000 | 20 | Yes | Yes | Yes | No |
48,000 | 24 | Yes | Yes | Yes | No |
96,000 | 16 | Yes | Yes | No | No |
96,000 | 20 | Yes | Yes | No | No |
96,000 | 24 | Yes | Yes | No | No |
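Every Yes/No entry in the table above follows from the 6.144 Mbit/sec per-stream limit: the raw LPCM rate is sample rate times bits per sample times channel count. The sketch below regenerates the table from that rule. It assumes the raw sample payload is the only thing counted against the limit, and the function names are hypothetical.

```python
LPCM_MAX_BITS_PER_SEC = 6_144_000  # per-stream limit quoted in the text


def lpcm_rate(sample_rate_hz: int, bits: int, channels: int) -> int:
    """Raw LPCM payload rate in bits/sec."""
    return sample_rate_hz * bits * channels


def lpcm_allowed(sample_rate_hz: int, bits: int, channels: int) -> bool:
    """True if the combination fits within one LPCM stream."""
    return lpcm_rate(sample_rate_hz, bits, channels) <= LPCM_MAX_BITS_PER_SEC


for rate in (48_000, 96_000):
    for bits in (16, 20, 24):
        row = ["Yes" if lpcm_allowed(rate, bits, ch) else "No"
               for ch in (1, 2, 5, 8)]
        print(rate, bits, *row)
```

For example, 8 channels of 16-bit audio at 48 kHz come to exactly 6.144 Mbit/sec, which is why that is the only 8-channel combination marked Yes.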
Interactive functions
Basic User interface:
GUI Display:
Seamless play function
Still picture function
There are 6 search functions defined for DVD. Two are present
in most of today's VCRs: the linear style Time Search and Scan
(Fast forward, rewind). The other 4 are made possible thanks to
the non-linear, random-access playback capability of DVD.
User operation (ability to scan through or play) can be prohibited by
content. This is signalled by such attributes as the parental control level.
For example, certain Part_of_Titles which contain R-rated (US) scenes
can be skipped over.
Menu:
Basic
1. Title A
2. Title B
3. Title C
4. Previous 5. next
Multi-page Menu
1. Title A 4. Title D 7. Title G
2. Title B 5. Title E 8. Title H
3. Title C 6. Title F 9. Title I
Exit Next Prev Exit Next Prev Exit
Interactivity
Level of functionality
1. simply play
2. interactivity similar to Video-CD
3. Interactivity similar to PC Applications
Functions
Information Control
Menu
Search functions:
Search Functions by User
Title Search
User can select the exact title to shuttle to.
Part_of_Title Search
User can go to specific version (PG-13, R, directors cut, children's
version) or camera angle by either title name or number.
Program Search
User can go to a specific scene (car chase, opening
credits, gun fight, etc.) within a program chain.
Time Search
User can go to a specific SMPTE style time code
(HH:MM:SS:FF) location within a program chain.
Scan:
Scan (linearly) forward or backwards in time.
GoUp:
Within the current program chain, jump to the next
program chain. This command traverses the DVD control
information hierarchy.
For Time Searches, all DVD players are required to arrive at the nearest I picture. It is optional that DVD players be capable of arriving at the exact picture (regardless of its picture coding type).
Navigation Commands and Parameters
The author (content provider) is given the freedom of creating an arbitrary branching structure for a given title. Of course some restraint should be exercised since, thanks to interframe MPEG coding dependencies and physical servo mechanism limitations, a program chain cannot be constructed of 30 pictures/sec of totally randomly located information on the disc.
However, the constant DVD transfer rate of 11 Mbit/sec provides some flexibility when the average program rate is kept lower. For example, if the average bit rate is only 5 Mbit/sec, then the player can waste 6 Mbit/sec of potential transfer rate in random access overhead.
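The trade-off described above is simple arithmetic, sketched below. The spare rate is taken directly from the text (11 Mbit/sec minus the average program rate). The idle fraction is an added interpretation rather than a figure from the source: it assumes the drive reads at the full transfer rate whenever it is reading, so the remaining share of time is available for seeking. The function name is hypothetical.

```python
from typing import Tuple

DVD_TRANSFER_MBIT = 11.0  # constant transfer rate quoted in the text


def seek_budget(average_mbit: float,
                transfer_mbit: float = DVD_TRANSFER_MBIT) -> Tuple[float, float]:
    """Return (spare rate in Mbit/sec, fraction of time free for seeking)."""
    if not 0 < average_mbit <= transfer_mbit:
        raise ValueError("average rate must be positive and within the transfer rate")
    spare = transfer_mbit - average_mbit
    return spare, spare / transfer_mbit


for avg in (3.5, 5.0, 9.8):
    spare, idle = seek_budget(avg)
    print(f"{avg} Mbit/sec average: {spare:.1f} Mbit/sec spare, "
          f"{idle:.0%} of the time free for seeks")
```

At the 5 Mbit/sec average used in the text the player can spend a little over half its time repositioning, while a title running at the 9.8 Mbit/sec maximum leaves roughly a tenth.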
Player Settings:
There are 24 system parameters for player setting:
SPRM | Meaning |
---|---|
0 | Menu Description Language Code |
1 | Audio stream number |
2 | Sub-picture Stream number |
3 | Angle Number |
4 | Title Number |
5 | VTS title Number |
6 | Title PGC Number |
7 | Part of title number for one sequential_PGC_Title |
8 | Highlighted Button number |
9 | Navigation Timer |
10 | Title PGC number for Navigation Timer |
11 | Audio Mixing Mode for Karaoke |
12 | Country Code for Parental Management |
13 | Parental Level |
14 | Player Configuration for Video |
15 | Player Configuration for Audio |
16 | Initial Language Code for Audio |
17 | Initial Language Code Extension for Audio |
18 | Initial Language Code for Sub-picture |
19 | Initial Language Code Extension for Sub-picture |
20 | Reserved |
21 | Reserved |
22 | Reserved |
23 | Reserved |
General Parameters:
Used for interactive operation of titles, such as quizzes, or games.
16 general parameters for navigation. These are RAM variables in the DVD players for use as, e.g., arithmetic scratch pads, counters, etc. Arithmetical operations are available (add, compare, etc.)
Navigation Commands
Each command consists of a single instruction or a combination of two or three instructions.
Instruction Groups:
Goto | branch between commands |
Link | transfer within the same Domain |
Jump | transfer between Domains |
Compare | recognition of parameter value |
SetSystem | player system setting |
Set | calculate GPRM values |
Forum
File Structure Hierarchy
The DVD is broken into two separate types of information: Navigation Data (control) and Presentation (object) data. Control data acts as pointers (like an operating system's File Allocation Table) to the actual video and audio object data on the disc.
In the DVD reference player model, Presentation and Navigation data packets are separated at the track buffer.
Control data can be expressed as a series of nested layers:
Title | distinguishes multiple movies or TV episodes on one disc. Each title is one of two types: a single program chain (One_Sequential_PGC_Title) or a collection of different program chains (Multi_PGC_Title). |
Program Chain | A collection of programs with, e.g., a particular theme in common. |
Part_of_Title | Links to one or more Program (PG) units on the disc. Like PGC, this mechanism can be used to create different versions (camera angle, ratings, outcomes, etc.) of the same program chain. POTs can also be used to mark scenes. |
Program | Usually a scene. Consists of multiple cells. |
Cell | Preceded by a navigation packet, and alternating video and audio packets. A cell is typically all the video and audio data associated with an integer number of a group of pictures. |
VOBU | Video Object Unit: nominally a group of pictures (GOP) |
GOP | 1. Smallest granularity of random access on disc (a group of pictures begins with a coded Intra frame). 2. Largest interframe dependent coding unit (interframe compression is bounded within a GOP). Usually 15 display frames of data (0.5 seconds duration) for NTSC-rate (525/60) content. |
Packet | DVD packets are 2048 bytes (sector payload size) large. As per MPEG-2 PES/Program streams, they contain data from only one data elementary stream (video, audio, etc.) |
NAV packet | contains the optional Button-Command defining the playback behaviour of the current cell. |
1. Logical structure of Video Manager and Video Title Set [notes from Hitachi]
A disc volume may contain up to 99 different titles, each with an initial Navigation Menu allowing the user to select among different versions of the title. The root menu which branches to all titles on the disc is coded within the Video Manager. Each title is organized as a Video Title Set (VTS).
DVD: [VM][VTS #1][VTS #2] ..... [VTS #n] where n<=99
The VM's VMGI includes: Attributes for the Menu, Title Search Pointers, and the PGCI for the Menu.
VM: [VMGI][VOBS for Menu][Back up for VMGI]
The Control Data (VTSI) for the title (VTS) includes: attributes for Menu, Attributes for Title, Part of Title Search Pointer, Time Map Table, PGCI for Menu, and PGCI for Title. The Video Objects (VOBS) contain the actual program chains, Part_of_Titles, programs, and so forth.
VTS: [VTSI][VOBS for Menu][VOBS for Title][Back up for VTSI]
Legend:
VM Video Manager: sets up menus for a series of titles (1 through n)
VTS Video Title Set: a collection of video objects.
VMGI Video Manager Information:
VOBS Video Object Set
PGCI Program Chain Information
Structure of Title
A title begins with the entry program chain (Entry PGC). It can branch to a single program chain (One_Sequential_PGC_Title) or multiple program chains (Multi_PGC_Title). The location of the branch is determined by the link condition.
Structure of a Program Chain (PGC)
The program chain is broken into two separate entities: the Program Chain Information (PGCI) and the program cells it points to. The PGCI defines the playback order of Programs by acting as a table of addresses which point to the sector locations of the program cells on the DVD. A program cell is essentially a group of pictures (GOP), spanning multiple sectors, and contains the actual interleaved packets of compressed bits for video and audio data.
Part_of_Title (PTT)
The Part_of_Title divides a title into a maximum of 99 different pieces. The intent of the PTT is to aid in the construction of multiple versions of the same title.
One_Sequential_PGC_Title: The Part_of_Title and Program numbers are synchronized.
[ PTT #1 | PTT #2 | .... | PTT #n ] Part_of_Title
[ [PG #1] | [PG #2] | .... | [PG #n] ] Program Chain (PGC)
Multi_PGC_Title:
branch PTT #2
--> [PG #1] (PGC1)
PTT #3 PTT #m
PTT #1 --> [PG #1] [PG #j] ... [PG #k] (PGC2)
[PG #1]
--> [PG #1] (PGC3)
Presentation
Combinations of presentation types
Type | Count | Representation |
---|---|---|
Video | 1 stream only | MPEG-1 or MPEG-2 Video |
Audio | maximum of 8 streams | Linear PCM and/or: Dolby AC-3 (NTSC) MPEG audio (PAL) |
Sub-picture | max 32 streams | Run-length encoded with bitmap of 2 bits/pixel |
Presentation stream rates
max total of combined audio and video: 9.8 Mbit/sec
max sum of Elementary streams + systems overhead: 10.08 Mbit/sec.
Presentation of PGC
The program chain (PGC) can be presented either serially (linear) or in random/shuffle (non-linear) fashion.
For example, a quiz title should break each question into separate programs. The next program chain branched to would be determined by the answer provided by the user.
Still image presentation
Still pictures are coded as MPEG intra frames. They may be displayed for indefinite duration. They can be accompanied by background music, or audio can be muted.
The still function is created by the action of the navigation system. The same video frame and sub-picture are frozen (displayed over and over again on the TV) while audio is muted or playing in the background.
There are three types of the Still Function:
Type | Timing | Still time in seconds |
---|---|---|
PGC Still | Stills at end of the PGC | 0-254, limitless |
Cell Still | Stills at end of the Cell | 0-254, limitless |
VOBU Still | Stills in every VOBU in the Cell | limitless |
VOBU: Video Object Unit.
Location of each command
Within a program chain (PGC), commands can be located at the front of the chain, in between cells of the chain, and at the end of the chain.
Program chain
[Pre-Commands] [Cell] [Cell] [Cell-Command] [Cell] [Post-Commands]
Each cell can have one command. There is a restriction that
no more than 128 commands can be contained within a program chain:
Pre-commands + Cell Commands + Post Commands <= 128
Further, there are a maximum of 36 buttons, each of which can
have one associated command.
Example of a PGC transition
[taken from the Hitachi overheads] 3 quiz problems are presented to the user. Each quiz problem/question is coded as a separate program chain. One of the questions prompts the user for a "Yes" or "No" answer.
The Link command is used to branch from the original top-level menu to one of the three program chains. The Set Command is used to tally a score. Finally, the CompareLink command (which consists of two commands, Compare & Link) branches to a particular Program depending on the user's answer.
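The quiz flow described above can be modelled with a toy player. This is purely an illustrative sketch: the `ToyPlayer` class, its method names, the PGC labels, and the choice of which general parameter holds the score or the answer are all invented here and do not reflect the actual command encoding. It only mirrors the roles the text gives to Link, Set, and CompareLink, using the 16 general parameters mentioned earlier.

```python
from typing import List, Tuple


class ToyPlayer:
    """Hypothetical model of a player: 16 general parameters and a current PGC."""

    SCORE, ANSWER = 0, 1   # assumed register assignment, for illustration only

    def __init__(self) -> None:
        self.gprm = [0] * 16
        self.current_pgc = "menu"
        self.trace = ["menu"]

    def link(self, pgc: str) -> None:
        """Link: transfer playback to another program chain."""
        self.current_pgc = pgc
        self.trace.append(pgc)

    def set_add(self, index: int, value: int) -> None:
        """Set: arithmetic on a general parameter (here, addition)."""
        self.gprm[index] += value

    def compare_link(self, index: int, expected: int, pgc: str) -> bool:
        """CompareLink: Compare a parameter, then Link only if it matches."""
        if self.gprm[index] == expected:
            self.link(pgc)
            return True
        return False


def run_quiz(answers: List[int], correct: List[int]) -> Tuple[int, List[str]]:
    """Play one quiz; answers use 1 for "Yes" and 0 for "No"."""
    player = ToyPlayer()
    for number, (given, wanted) in enumerate(zip(answers, correct), start=1):
        player.link(f"question_{number}")       # branch from menu to the question
        player.gprm[ToyPlayer.ANSWER] = given   # user's button choice
        if player.compare_link(ToyPlayer.ANSWER, wanted, f"right_{number}"):
            player.set_add(ToyPlayer.SCORE, 1)  # tally the score
        else:
            player.link(f"wrong_{number}")
    return player.gprm[ToyPlayer.SCORE], player.trace


score, trace = run_quiz(answers=[1, 0, 1], correct=[1, 1, 1])
print(score, trace)
```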