DVD Technical Notes(3) - Applications


Applications

The standards that make up Book B
MPEG-2 restrictions
Subpictures
Vertical Blanking Interval (VBI) packets
Letterboxing
Audio
Interactive Functions
Title structure breakdown
Presentation of Program Chains


Standards

Video    ITU-T H.262 / ISO/IEC 13818-2 (MPEG-2 Video)
         ISO/IEC 11172-2 (MPEG-1 Video)
Audio    ISO/IEC 13818-3 (MPEG-2 Audio)
         ISO/IEC 11172-3 (MPEG-1 Audio)
         Dolby AC-3 standard
System   ITU-T H.222.0 / ISO/IEC 13818-1 (MPEG-2 Systems)
         Program/PES stream only (no Transport streams)


Video Data Specifications

DVD adds many additional restrictions to the popular compliance parameter sets of MPEG. One good example is the restriction on the coded size of a picture: MPEG-2 Main Profile @ Main Level allows any coded frame size between 16 and 720 pixels horizontally and 16 and 576 pixels vertically. DVD, however, restricts the coded frame sizes to a very limited but practical subset.

In MPEG, audio can be coded at a sample rate of 32, 44.1 or 48 kHz. In DVD, the rates of both Dolby AC-3 and MPEG audio are strictly set to 48 kHz.

MPEG is a generic representation meant for a wide variety of applications. DVD has taken a practical subset to promote interoperability by simplifying implementations and ensuring features (such as random accessibility).

Coded representation   MPEG-1 (SIF combo)
                       MPEG-2 (Main Profile @ Main Level)
Frame rate             29.97 or 25 Hz
TV system              525/60 or 625/50
Aspect ratio           4:3 (all video formats)
                       16:9 (all formats except 352 pixels/line)
Display mode           pan & scan, letterbox
User_data              closed caption
Coded frame sizes      525/60: 720x480, 704x480, 352x480, 352x240
                       625/50: 720x576, 704x576, 352x576, 352x288
                       (MPEG-1 is allowed only in 352x240 or 352x288 resolution)
GOP size               max 36 fields or 18 frames (NTSC)
                       max 30 fields or 15 frames (PAL)
Buffer size            max 1.835008 Mbits (MPEG-2)
                       max 327,680 bits (MPEG-1)
Transfer method        VBR or CBR (MPEG-2)
                       CBR only (MPEG-1)
Maximum bitrate        9.8 Mbit/sec
Low_delay              NOT permitted
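
The restrictions in the table above can be expressed as a small validity check. The following is only a sketch: the `VideoStream` container and its field names are hypothetical (they are not DVD data structures), and the check covers just the frame size, GOP size, transfer method and bitrate rows of the table.

```python
from dataclasses import dataclass

# Values transcribed from the table above.
FRAME_SIZES = {
    "525/60": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "625/50": {(720, 576), (704, 576), (352, 576), (352, 288)},
}
MPEG1_SIZE = {"525/60": (352, 240), "625/50": (352, 288)}
MAX_GOP_FRAMES = {"525/60": 18, "625/50": 15}
MAX_VIDEO_BITRATE = 9_800_000  # bits per second


@dataclass
class VideoStream:
    """Hypothetical description of a coded video stream (illustration only)."""
    tv_system: str   # "525/60" or "625/50"
    codec: str       # "MPEG-1" or "MPEG-2"
    width: int
    height: int
    gop_frames: int
    bitrate: int     # peak rate in bits per second
    vbr: bool = False


def violations(stream):
    """Return a list of the table rows that the stream violates."""
    problems = []
    size = (stream.width, stream.height)
    if size not in FRAME_SIZES[stream.tv_system]:
        problems.append("coded frame size not in the allowed set")
    if stream.codec == "MPEG-1" and size != MPEG1_SIZE[stream.tv_system]:
        problems.append("MPEG-1 is limited to 352x240 / 352x288")
    if stream.gop_frames > MAX_GOP_FRAMES[stream.tv_system]:
        problems.append("GOP too long")
    if stream.codec == "MPEG-1" and stream.vbr:
        problems.append("MPEG-1 must be CBR")
    if stream.bitrate > MAX_VIDEO_BITRATE:
        problems.append("bitrate above 9.8 Mbit/sec")
    return problems
```

For example, `violations(VideoStream("525/60", "MPEG-2", 720, 480, 18, 9_800_000, vbr=True))` returns an empty list, since every value sits exactly on or inside the limits.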


Notes:

  • the frame rate is the intended display frame rate. The number of coded frames in a sequence may vary due to 3:2 pulldown (the DVD MPEG decoder performs this function). The permitted values in DVD are more restrictive than MPEG-2 MP@ML which also includes 23.976, 24, and 30 frames/sec rates.
  • aspect ratio is the display aspect ratio. Only 16:9 and 4:3 are permitted. Note: MP@ML's 2.21:1 is not included.
  • MP@ML has no GOP size restriction. In fact, the GOP() is considered to be an insignificant layer in MPEG-2. Instead the sequence() layer serves as the most important boundary in the generic MPEG sense.
  • The MPEG-2 and MPEG-1 vbv_buffer_size limits are the same as those of MP@ML and Constrained Parameters Bitstreams, respectively.
  • The maximum bitrate of 9.8 Mbit/sec is more restrictive than MP@ML's 15 Mbit/sec limit. However, the point of diminishing returns (no visual difference between original video and compressed video) is widely known to be around 9 Mbit/sec.
  • user_data() fields in MPEG video picture headers contain closed captioning (similar to Grand Alliance and DVB methods). See this ATSC (Advanced Television Systems Committee) site for more information: http://www.atsc.org/
    [is this the same in DVB, DVD, and ATSC ? ]
  • For picture sizes, only a very limited set of coded dimensions are legal.
  • Variable bit rate is permitted only in MPEG-2 streams since the VBV model in MPEG-2 has provisions for it. MPEG-1 was an earlier standard (by two years), and its VBV model was not developed to handle 3:2 pulldown cases.
  • Contrary to popular belief, all DVD players are required to decode video streams up to 9.8 Mbit/sec for indefinite periods of time. The popular average rates of 3.5 Mbit/sec and 4.7 Mbit/sec are merely canonical figures created by the notion that only single-sided, single-layer discs will hold feature-length films. Should single-sided, double-layer discs prevail, the average rate would be almost twice as great. ALL DVD PLAYERS MUST SUSTAIN A 9.8 MBIT/SEC VIDEO DECODE RATE! Hardwired (Application Specific Integrated Circuit, ASIC) implementations of MPEG-2 MP@ML decoders are generally capable of handling 15 Mbit/sec sustained rates.

  • MPEG Display Formats

    MPEG-2 video decoder chips have implemented pan & scan for a few years already since it has been a requirement for cable TV and direct broadcast satellite applications. The letterbox requirement (vertical filter) is a relatively new addition to the MPEG decoder universe. The second generation DVD MPEG-2 video decoders will most likely perform "on-chip" sub-picture reconstruction.


                 Display 4:3                                Display 16:9
    Source 4:3   No conversion                              Horizontal filtering accomplished by TV monitor
    Source 16:9  Letterbox (vertical filter) or Pan & Scan  No conversion


    Note: Letterbox Conversion is a mandatory feature in the DVD Player !!!



    Subpictures

  • run-length compressed bitmaps that are overlaid on top of the MPEG reconstructed video.
  • Applications include: menus, subtitles, karaoke, and simple animation.
  • Pixels are divided into four types: 1. Background 2. Foreground 3. Emphasis-1 4. Emphasis-2
  • 4 colors out of 16 color palette (4 colors are determined once per PGC).
  • 4 out of 16 contrast values
  • up to a maximum of 32 sub-picture bitstreams. Each sub-picture stream could, for example, contain text in a particular language.
  • subpicture buffer size is restricted to 62 Kbytes. This means a maximum of 62 KB per GOP/cell. 32 Kbytes of this is control data.
  • Maximum number of bits per run-length coded line is 1440 bits.
  • Display area maximum: 720x480 (525/60) and 720x576 (625/50)
  • area, content, color, and contrast can be changed every video field
  • Sub-Picture Display Control Sequences (SP_DCSQ) control the presentation of Sub-pictures.
  • Presentation effects include: scroll up/down, fade in/out, etc.
  •   Structure of Sub-picture Decoding Unit (SPU):

    [ SPUH ][ PXD ][ DCSQT ]
    SPUH: Sub-picture Unit Header (size of SPU, start address of DCSQT)
    PXD: Pixel Data (variable length run-length coded)
    DCSQT: Display Control Sequence Table (one or more display control
    command sequences).
    DCSQT: [DCSQ 0][DCSQ 1][DCSQ 2] ... [DCSQ n]
    DCSQ: [Start time] [ Pointer to next DCSQ] [Command Sequence]
    Command Sequence: [DCC 0][DCC 1]... [DCC m]

    Display Control Commands (DCC):

  • Set start address in PXD
  • Set colors
  • Set contrast
  • Set SP screen position
  • Start/stop display
  • Set CHG_COLCON areas.
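
The SPU layout described above (header, pixel data, then a chain of display control sequences) can be walked with a few lines of code. This is a minimal sketch under stated assumptions: the notes do not give field widths, so the code assumes each header field (SPU size, DCSQT start address, DCSQ start time, pointer to next DCSQ) is a 16-bit big-endian integer, and that the last DCSQ points to itself. The function names are hypothetical, and the display control commands themselves are not decoded.

```python
import struct


def parse_spu_header(spu):
    """Return (spu_size, dcsqt_offset) from the SPUH.

    Assumption: both fields are 16-bit big-endian values at the start
    of the unit.
    """
    spu_size, dcsqt_offset = struct.unpack_from(">HH", spu, 0)
    return spu_size, dcsqt_offset


def walk_dcsqt(spu, dcsqt_offset):
    """Yield (offset, start_time, next_offset) for each DCSQ in the table.

    Assumption: each DCSQ starts with a 16-bit start time and a 16-bit
    pointer to the next DCSQ; the walk stops when a pointer revisits an
    offset already seen or runs past the end of the buffer.
    """
    seen = set()
    offset = dcsqt_offset
    while offset not in seen and offset + 4 <= len(spu):
        seen.add(offset)
        start_time, next_offset = struct.unpack_from(">HH", spu, offset)
        yield offset, start_time, next_offset
        offset = next_offset
```

The bytes between the end of the header and `dcsqt_offset` would then be the run-length coded PXD area.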



  • VBI Decoding

    The Vertical Blanking Interval (VBI) packet (multiplexed at the Cell level along with Navigation, Video, and Audio packets) contains information which is directly inserted into the reconstructed video signal, sans level adjustments (16 levels into a, e.g. 256 nominal level video signal).

  • only 1 VBI channel per program (sub-pictures have up to 32)
  • Line range is from 10 to 23 for NTSC and from 6 to 23.5 for PAL.
  • Separate palette (16 Y values, Cr=Cb=128) from subpictures.
  • No highlight
  • Restricted DCSQ command set
  • closed captioning for NTSC-rate (525/60) is coded exclusively in the user_data() field of the group_of_pictures_header()
  • VBI information is losslessly represented as a waveform, and coded into packets. The 525/60 player uses a far more efficient alternative: the source character stream is coded in the MPEG video user_data() field. The NTSC/PAL modulator chip then creates the VBI closed caption (Line 21) signal from this character stream.

    This brings our tally of closed caption representations to THREE ways!!

  • as packets of 16-level sampled VBI waveforms (PAL)
  • as user_data() character streams (NTSC)
  • as rendered subpictures (NTSC and PAL)


  • Letterbox

    Picture Size Conversion

    All DVD players are required to have built-in vertical filters which scale a 16:9 coded video image to fit within a traditional 4:3 display. This player feature is needed since it is anticipated that a majority of movies will be coded for the 16:9 aspect ratio, while at the same time most TV displays (in the early years) will be 4:3. In the same vein as multilingual audio, a single coded aspect ratio in market distribution reduces confusion and bolsters economy of scale.

    525/60 (NTSC-rate display):

    (Note: 480*(4/3)/(16/9) = 480*0.75 = 360)

     _____________________
    |         60          |
    |---------------------|
    |                     |
    |         360         |   480 lines total
    |                     |
    |---------------------|
    |         60          |
     ---------------------

    625/50 (PAL-rate display):

    (Note: 576*(4/3)/(16/9) = 576*0.75 = 432)

     _____________________
    |         72          |
    |---------------------|
    |                     |
    |         432         |   576 lines total
    |                     |
    |---------------------|
    |         72          |
     ---------------------

    A simple bi-linear vertical filter can be applied, yielding good visual results. Here, two source samples (s[n],s[n+1]) are weighted by simple complementary factors and added together to form the destination sample value (d[m]). These weights are easily implemented with shifters. For interlaced displays, vertical filtering occurs only within the same field parity.

    d[0] = (3/4)*s[0] + (1/4)*s[1]
    d[1] = (1/2)*s[1] + (1/2)*s[2]
    d[2] = (1/4)*s[2] + (3/4)*s[3]
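
As an illustration, the three equations above can be applied to every group of four source lines to produce three destination lines. The sketch below assumes the picture is held as a list of rows of integer samples; the function name and the rounding offset are choices made for this example and are not part of any player specification.

```python
def letterbox_rows(src):
    """Scale a picture vertically by 3/4 with the bilinear weights above.

    src is a list of rows (each a list of integer samples). Its length
    must be a multiple of 4; every 4 source rows yield 3 output rows.
    """
    if len(src) % 4:
        raise ValueError("row count must be a multiple of 4")

    def blend(a, b, wa, wb):
        # wa + wb == 4, so the division is a 2-bit right shift
        # (the +2 rounds to nearest).
        return [(wa * x + wb * y + 2) >> 2 for x, y in zip(a, b)]

    out = []
    for i in range(0, len(src), 4):
        s0, s1, s2, s3 = src[i:i + 4]
        out.append(blend(s0, s1, 3, 1))  # d[0] = 3/4*s[0] + 1/4*s[1]
        out.append(blend(s1, s2, 2, 2))  # d[1] = 1/2*s[1] + 1/2*s[2]
        out.append(blend(s2, s3, 1, 3))  # d[2] = 1/4*s[2] + 3/4*s[3]
    return out
```

A 480-line frame yields 360 lines, which are then centred between two 60-line black bars. For interlaced material the same function would be applied to each field separately (for example `src[0::2]` and `src[1::2]`), so that filtering stays within one field parity.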

    A decoder can determine whether inter- or intra-parity vertical filtering should be applied by testing the progressive_frame flag of the MPEG-2 video stream (MPEG-1 frames are always progressive by definition). This flag indicates whether a picture contains interlaced or progressive vertically correlated information. Almost all MPEG-2 coded movies consist exclusively of progressive frames. In a sense, MPEG-2's interlaced prediction modes are underutilized by DVD.



    Audio

    Audio data specifications


                               Linear PCM       Dolby AC-3            MPEG-2 audio
    Sampling frequency         48 or 96 kHz     48 kHz                48 kHz
    Number of bits per sample  16/20/24         compressed (16 bits)  compressed (16 bits)
    Max transfer rate          6.144 Mbit/sec   448 kbits/sec         640 kbits/sec
    Max number of channels     8                5.1                   5.1 or 7.1



               NTSC                          PAL
    Mandatory  Dolby AC-3 and/or Linear PCM  MPEG-2 audio and/or Linear PCM
    Optional   MPEG-2 Audio                  Dolby AC-3


    Philips provided three practical scenarios for audio.

    Use                                  Channels     kbits/sec

    Case 1: One mono language channel to be mixed with the Center channel of the multichannel set.
    Multichannel music & effects         5.1 or 7.1   384
    Mono English dialogue                1            64
    Mono French dialogue                 1            64
    Mono German dialogue                 1            64

    Case 2: One of the stereo language signals to be mixed with the L & R channels of the playback multichannel set.
    Multichannel music & effects         5.1 or 7.1   384
    Stereo English dialogue              2            128
    Stereo French dialogue               2            128
    Stereo German dialogue               2            128

    Case 3: One multichannel stream to be selected for playback.
    Multichannel with English dialogue   5.1 or 7.1   384
    Multichannel with French dialogue    5.1 or 7.1   384
    Multichannel with German dialogue    5.1 or 7.1   384
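
To compare the three scenarios, it helps to add up the audio bitrate that each one places on the disc. The short sketch below only sums the figures from the table; the dictionary layout is an arbitrary choice for the example.

```python
# Per-stream bitrates in kbits/sec, transcribed from the three cases above.
SCENARIOS = {
    "Case 1": [384, 64, 64, 64],     # multichannel M&E + three mono dialogue streams
    "Case 2": [384, 128, 128, 128],  # multichannel M&E + three 2-channel dialogue streams
    "Case 3": [384, 384, 384],       # three complete multichannel mixes
}

# Total audio payload carried on the disc for each scenario.
TOTALS = {name: sum(rates) for name, rates in SCENARIOS.items()}
```

The totals come to 576, 768 and 1152 kbits/sec respectively, so carrying separate dialogue streams costs roughly half (Case 1) or two thirds (Case 2) of the bandwidth needed for three full multichannel mixes.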

    Audio Signal Decoding System

    General

  • up to a maximum of 8 audio streams can be multiplexed into the same cell with a single video stream. Each stream for example is designated for a particular language or special effects & music tracks.
  • Dolby AC-3 is mandatory for 525/60 (NTSC) players and MPEG-2 audio is mandatory for 625/50 (PAL) players, but both are optional on discs themselves.
  • LPCM (Linear Pulse Code Modulated) is mandatory for all players, but optional on discs themselves.
  • 48 kHz and 96 kHz uncompressed PCM audio
  • High Definition Audio Experience
  • A 525/60 disc must contain either Dolby AC-3 or LPCM.
  • A 625/50 disc must contain either MPEG-2 audio or LPCM. Due to bandwidth efficiency, most titles will use the more compact Dolby AC-3 or MPEG-2 audio.
  • Extensibility is reserved for new algorithms such as DTS, Sony SDDS, et al.
  • IEC-958 Digital Audio Interface for external decoder/receiver. Output types: compressed AC-3 or MPEG stream, two channel LPCM. DVD players are required only to output a full reconstruction of the Left and Right channels. An external AC-3 decoder would optionally decode all 5.1 channels. A more expensive DVD player would output all 5.1 reconstructed channels.

    Dolby AC-3 parameters

    Sampling frequency 48 kHz
    bitrate 64 kbits/sec to 448 kbits/sec per stream
    Audio coding mode 1/0, 2/0, 3/0, 2/1, 2/2, 3/1, and 3/2 (acmod)

    Characteristics
  • dialog normalization
  • dynamic range compression
  • downmixing (5.1 -> 2 channel) capability
  • Dolby Pro-Logic Encoding (5.1 -> 2 channel)
  • Karaoke mode (voice overlay)

  • MPEG Audio parameters

    Sampling frequency 48 kHz
    MPEG-1: Layer II only
    Mono (32 to 192 kb/s) and Stereo (64 to 384 kb/s)
    MPEG-2
  • main stream (same as MPEG-1)
  • extension stream (up to 528 kbit/sec)
  • sum of main and extension stream up to 912 kb/s
  • unmatrix mode excluded (always MPEG-1 compatible)

  • LPCM Coding

  • Lossless/uncompressed PCM audio
  • Sampling frequency: either 48 kHz or 96 kHz
  • bits/sample: 16, 20, or 24 bits
  • up to 8 PCM channels.

    Due to the user rate bandwidth limitation of 6.144 Mbit/sec for any LPCM audio stream, not all combinations of channel count, sample precision and sample rates are permitted. However, up to 8 separate streams are permitted, as long as the combined stream rate is less than or equal to 9.8 Mbit/sec. DVD nomenclature states that a single LPCM stream consists of one to 8 channels.

    Sample rate (Hz)  Sample prec. (bits)  Mono  2 CH  5 CH  8 CH
    48,000            16                   Yes   Yes   Yes   Yes
    48,000            20                   Yes   Yes   Yes   No
    48,000            24                   Yes   Yes   Yes   No
    96,000            16                   Yes   Yes   No    No
    96,000            20                   Yes   Yes   No    No
    96,000            24                   Yes   Yes   No    No
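
The Yes/No entries in the table follow from the 6.144 Mbit/sec limit. As a rough check, assuming the stream rate is simply sample rate times bits per sample times channel count (no packet or header overhead), the table can be regenerated as follows; the function names are made up for this example.

```python
MAX_LPCM_STREAM_RATE = 6_144_000  # bits per second, per LPCM stream


def lpcm_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw payload rate of an LPCM stream in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def lpcm_allowed(sample_rate_hz, bits_per_sample, channels):
    """True if the combination fits within the per-stream limit."""
    return lpcm_rate(sample_rate_hz, bits_per_sample, channels) <= MAX_LPCM_STREAM_RATE


# Rebuild the table: one row per (rate, precision), one column per channel count.
TABLE = {
    (rate, bits): [lpcm_allowed(rate, bits, ch) for ch in (1, 2, 5, 8)]
    for rate in (48_000, 96_000)
    for bits in (16, 20, 24)
}
```

The only 8-channel combination that fits is 48 kHz at 16 bits, which lands exactly on the limit (48,000 x 16 x 8 = 6,144,000 bits/sec).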


  • Interactive functions

    Source: Pioneer

    Basic User interface:

  • Control: ten keys and cursor keys
  • Display: menu graphics and high-light

    GUI Display:

  • Menu picture with subpicture and MPEG graphics
  • highlighted area
    Menu:
      Basic

    1. Title A
    2. Title B
    3. Title C
    4. Previous 5. next

    Multi-page Menu

    1. Title A      4. Title D        7. Title G
    2. Title B      5. Title E        8. Title H
    3. Title C      6. Title F        9. Title I
    Exit Next       Prev Exit Next    Prev Exit


    Interactivity


    Level of functionality
    1. Simple play
    2. Interactivity similar to Video-CD
    3. Interactivity similar to PC applications

    Functions

    Information Control

  • parental control
  • copy management

    Menu

  • Title
  • Root
  • Audio
  • Sub-picture
  • Angle
  • Part of title

    Search functions:

  • program search
  • time search
  • angle search
  • part of title search

    Seamless play function

    Still picture function

    Search Functions by User

    There are 6 search functions defined for DVD. Two are present in most of today's VCRs: the linear style Time Search and Scan (Fast forward, rewind). The other 4 are made possible thanks to the non-linear, random-access playback capability of DVD.

    User operation (ability to scan through or play) can be prohibited by content. This is signalled by such attributes as the parental control level. For example, certain Part_of_Titles that contain R-rated (US) scenes can be skipped over.

    Title Search          User can select the exact title to shuttle to.
    Part_of_Title Search  User can go to a specific version (PG-13, R, director's cut, children's version) or camera angle by either title name or number.
    Program Search        User can go to a specific scene (car chase, opening credits, gun fight, etc.) within a program chain.
    Time Search           User can go to a specific SMPTE style time code (HH:MM:SS:FF) location within a program chain.
    Scan                  Scan (linearly) forward or backwards in time.
    GoUp                  Within the current program chain, jump to the next program chain. This command traverses the DVD control information hierarchy.


  • For Time Searches, all DVD players are required to arrive at the nearest I picture. It is optional that DVD players be capable of arriving at the exact picture (regardless of its picture coding type).
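
A Time Search can be pictured as two steps: convert the requested time code to a frame number, then snap to the nearest I picture. The sketch below is illustrative only. It assumes non-drop-frame counting at a nominal 30 or 25 frames/sec, and it takes a plain sorted list of I picture frame numbers as a hypothetical stand-in for whatever lookup structure (such as the Time Map Table) a real player consults.

```python
import bisect


def timecode_to_frame(timecode, fps):
    """Convert "HH:MM:SS:FF" to a frame count (non-drop-frame assumption)."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    if ff >= fps:
        raise ValueError("frame field must be smaller than the frame rate")
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def nearest_i_picture(target_frame, i_frames):
    """Return the entry of the sorted, non-empty list i_frames closest to target_frame."""
    pos = bisect.bisect_left(i_frames, target_frame)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = i_frames[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda f: abs(f - target_frame))
```

With one I picture every 15 frames, a request for 00:01:00:07 at 30 frames/sec maps to frame 1807 and is snapped to the I picture at frame 1800.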

    Navigation Commands and Parameters

    The author (content provider) is given the freedom of creating an arbitrary branching structure for a given title. Of course some restraint should be exercised since, thanks to interframe MPEG coding dependencies and physical servo mechanism limitations, a program chain cannot be constructed of 30 pictures/sec of totally randomly located information on the disc.

    However, the constant DVD transfer rate of 11 Mbit/sec provides some flexibility when the average program rate is kept lower. For example, if the average bit rate is only 5 Mbit/sec, then the player can waste 6 Mbit/sec of potential transfer rate in random access overhead.

    Player Settings:

    There are 24 system parameters for player setting:

    SPRM  Meaning
    0 Menu Description Language Code
    1 Audio stream number
    2 Sub-picture Stream number
    3 Angle Number
    4 Title Number
    5 VTS title Number
    6 Title PGC Number
    7 Part of title number for one sequential_PGC_Title
    8 Highlighted Button Number
    9 Navigation Timer
    10 Title PGC number for Navigation Timer
    11 Audio Mixing Mode for Karaoke
    12 Country Code for Parental Management
    13 Parental Level
    14 Player Configuration for Video
    15 Player Configuration for Audio
    16 Initial Language Code for Audio
    17 Initial Language Code for Sub-picture
    18 Initial Language Code Extension for Sub-picture
    19 Initial Language Code for Sub-picture
    20 Reserved
    21 Reserved
    22 Reserved
    23 Reserved


    General Parameters:

    Used for interactive operation of titles, such as quizzes, or games.

  • 16 general parameters for navigation. These are RAM variables in the DVD players for use as, e.g., arithmetic scratch pads, counters, etc.
  • Arithmetical operations are available (add, compare, etc.)
  • Navigation Commands

    Each command consists of a single instruction or a combination of two or three instructions.

    Instruction Groups:

    Goto       branch between commands
    Link       transfer within the same Domain
    Jump       transfer between Domains
    Compare    test of a parameter value
    SetSystem  player system setting
    Set        calculate GPRM values


    Forum

    File Structure Hierarchy

    The DVD is broken into two separate types of information: Navigation Data (control) and Presentation (object) data. Control data acts as pointers (like an operating system's File Allocation Table) to the actual video and audio object data on the disc.

    In the DVD reference player model, Presentation and Navigation data packets are separated at the track buffer.

    Control data can be expressed as a series of nested layers:

    Title          Distinguishes multiple movies or TV episodes on one disc. Each title is one of two types: a single program chain (One_Sequential_PGC_Title) or a collection of different program chains (Multi_PGC_Title).
    Program Chain  A collection of programs with, e.g., a particular theme in common.
    Part_of_Title  Links to one or more Program (PG) units on the disc. Like PGC, this mechanism can be used to create different versions (camera angle, ratings, outcomes, etc.) of the same program chain. PTTs can also be used to mark scenes.
    Program        Usually a scene. Consists of multiple cells.
    Cell           Begins with a navigation packet, followed by alternating video and audio packets. A cell is typically all the video and audio data associated with an integer number of groups of pictures.
    VOBU           Video Object Unit: nominally a group of pictures (GOP)
    GOP            1. Smallest granularity of random access on disc (a group of pictures begins with a coded Intra frame)
                   2. Largest interframe dependent coding unit (interframe compression is bounded within a GOP)
                   Usually 15 display frames of data (0.5 seconds duration) for NTSC-rate (525/60) content.
    Packet         DVD packets are 2048 bytes (sector payload size) large. As per MPEG-2 PES/Program streams, they contain data from only one elementary stream (video, audio, etc.)
    NAV packet     Contains the optional Button-Command defining the playback behaviour of the current cell.


    1. Logical structure of Video Manager and Video Title Set [notes from Hitachi]
    =========================================================

    A disc volume may contain up to 99 different titles, each with an initial Navigation Menu allowing the user to select among different versions of the title. The root menu which branches to all titles on the disc is coded within the Video Manager. Each title is organized as a Video Title Set (VTS).

      DVD:   [VM][VTS #1][VTS #2] ..... [VTS #n]       where n<=99

    The VM's VMGI includes: Attributes for the Menu, Title Search Pointers, and the PGCI for the Menu.

      VM:    [VMGI][VOBS for Menu][Back up for VMGI]

    The Control Data (VTSI) for the title (VTS) includes: Attributes for Menu, Attributes for Title, Part of Title Search Pointer, Time Map Table, PGCI for Menu, and PGCI for Title. The Video Objects (VOBS) contain the actual program chains, Part_of_Titles, programs, and so forth.

      VTS:   [VTSI][VOBS for Menu][VOBS for Title][Back up for VTSI]
     Legend:
    VM    Video Manager: sets up menus for a series of titles (1 through n)
    VTS   Video Title Set: a collection of video objects
    VMGI  Video Manager Information
    VOBS  Video Object Set
    PGCI  Program Chain Information

    Structure of Title

    A title begins with the entry program chain (Entry PGC). It can branch to a single program chain (One_Sequential_PGC_Title) or multiple program chains (Multi_PGC_Title). The location of the branch is determined by the link condition.

    Structure of a Program Chain (PGC)

    The program chain is broken into two separate entities:
  • program chain information (PGCI)
  • video object (VOB)

    The PGCI defines the playback order of Programs by acting as a table of addresses which point to the sector locations of the program cells on the DVD. A program cell is essentially a group of pictures (GOP), spanning multiple sectors, and contains the actual interleaved packets of compressed bits for video and audio data.

    Part_of_Title (PTT)

    The Part_of_Title divides a title into a maximum of 99 different pieces. The intent of the PTT is to aid in the construction of multiple versions of the same title.

    One_Sequential_PGC_Title: The Part_of_Title and Program numbers are synchronized.

    [  PTT #1  |  PTT #2  | .... |  PTT #n  ]    Part_of_Title
    [ [PG #1]  | [PG #2]  | .... | [PG #n]  ]    Program Chain (PGC)


    Multi_PGC_Title:

                 branch      PTT #2
                      -->   [PG #1]                         (PGC1)
                             PTT #3              PTT #m
     PTT #1           -->   [PG #1] [PG #j] ... [PG #k]     (PGC2)
    [PG #1]
                      -->   [PG #1]                         (PGC3)



  • Presentation

    Combinations of presentation types

    Type         Count                 Representation
    Video        1 stream only         MPEG-1 or MPEG-2 Video
    Audio        maximum of 8 streams  Linear PCM and/or:
                                       Dolby AC-3 (NTSC)
                                       MPEG audio (PAL)
    Sub-picture  max 32 streams        Run-length encoded bitmap of 2 bits/pixel


    Presentation stream rates

    max total of combined audio and video: 9.8 Mbit/sec
    max sum of Elementary streams + systems overhead: 10.08 Mbit/sec
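
The two limits above can be checked together when planning a multiplex. The sketch below is a hedged reading of them: it assumes that the first limit covers the video stream plus all audio streams, and that the second covers every elementary stream (including sub-pictures) plus systems overhead. The function and parameter names are hypothetical.

```python
MAX_AUDIO_VIDEO_RATE = 9_800_000   # bits per second
MAX_TOTAL_RATE = 10_080_000        # bits per second, including systems overhead


def fits_budget(video_bps, audio_bps, subpicture_bps=0, overhead_bps=0):
    """Check a planned multiplex against both presentation rate limits.

    audio_bps is a list with one entry per audio stream.
    """
    audio_video = video_bps + sum(audio_bps)
    total = audio_video + subpicture_bps + overhead_bps
    return audio_video <= MAX_AUDIO_VIDEO_RATE and total <= MAX_TOTAL_RATE
```

For example, 8 Mbit/sec of video with two 384 kbit/sec audio streams fits, while video at the full 9.8 Mbit/sec leaves no room for any audio at all.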

    Presentation of PGC

    The program chain (PGC) can be presented either serially (linear) or in random/shuffle (non-linear) fashion.

    For example, a quiz title should break each question into separate programs. The next program chain branched to would be determined by the answer provided by the user.

    Still image presentation

    Still pictures are coded as MPEG intra frames. They may be displayed for indefinite duration. They can be accompanied by background music, or audio can be muted.

  • still function is created by the action of the navigation system
  • The same video frame and sub-picture is frozen (displayed over and over again on the TV) while audio is muted or playing in the background.
  • There are three types of the Still Function:

    Type        Timing                            Still time in seconds
    PGC Still   Stills at end of the PGC          0-254, limitless
    Cell Still  Stills at end of the Cell         0-254, limitless
    VOBU Still  Stills in every VOBU in the Cell  limitless


    VOBU: Video Object Unit.

    Location of each command

    Within a program chain (PGC), commands can be located at the front of the chain, in between cells of the chain, and at the end of the chain.

                              Program chain

    [Pre-Commands] [Cell] [Cell] [Cell-Command] [Cell] [Post-Commands]

    Each cell can have one command. There is a restriction that
    no more than 128 commands can be contained within a program chain:

    Pre-commands + Cell Commands + Post Commands <= 128

    Further, there are a maximum of 36 buttons, each of which can
    have one associated command.
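
These limits are easy to check mechanically while authoring. The following sketch assumes a program chain is described only by counts of commands, cells and buttons; the function and its parameters are invented for this example.

```python
MAX_PGC_COMMANDS = 128
MAX_BUTTONS = 36


def pgc_limits_ok(pre_commands, cell_commands, post_commands, cells, buttons):
    """Check a program chain against the command and button limits above."""
    if cell_commands > cells:
        return False  # each cell can have at most one command
    if pre_commands + cell_commands + post_commands > MAX_PGC_COMMANDS:
        return False
    return buttons <= MAX_BUTTONS
```

For example, a chain with 10 pre-commands, 100 cell commands spread over 100 cells and 18 post-commands sits exactly at the 128 command limit.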

    Example of a PGC transition

    [taken from the Hitachi overheads]

    3 quiz problems are presented to the user. Each quiz problem/question is coded as a separate program chain. One of the questions prompts the user for a "Yes" or "No" answer.

    The Link command is used to branch from the original top-level menu to one of the three program chains. The Set command is used to tally a score. Finally, the CompareLink command (which consists of two instructions, Compare and Link) branches to a particular Program depending on the user's answer.
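
The flow of this quiz can be mimicked with a toy model of the player state. Everything below is hypothetical: the class, the method names, the register assignments and the PGC labels are inventions for illustration, and the real commands are binary instructions in the navigation data rather than Python calls. The sketch only shows how Link, Set and CompareLink could cooperate with the general parameters (GPRMs).

```python
class ToyPlayer:
    """Hypothetical model of the state touched by navigation commands."""

    def __init__(self):
        self.gprm = [0] * 16       # the 16 general parameters (scratch registers)
        self.current_pgc = "menu"

    def link(self, pgc):
        """Link: transfer playback to another program chain."""
        self.current_pgc = pgc

    def set_add(self, index, value):
        """Set: arithmetic on a general parameter (here, addition)."""
        self.gprm[index] += value

    def compare_link(self, index, value, pgc):
        """CompareLink: Link to pgc only if GPRM[index] equals value."""
        if self.gprm[index] == value:
            self.link(pgc)
            return True
        return False


SCORE, ANSWER = 0, 1   # arbitrary register assignments for this example
YES, NO = 1, 0


def answer_question(player, user_answer, correct_answer):
    """Record the answer, tally the score, then branch on the answer."""
    player.gprm[ANSWER] = user_answer
    if user_answer == correct_answer:
        player.set_add(SCORE, 1)
    if not player.compare_link(ANSWER, YES, "pgc_after_yes"):
        player.link("pgc_after_no")
```

A session would start with `player.link("question_1")` from the menu, after which each call to `answer_question` updates the score register and moves playback to the program chain that matches the user's answer.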