The present invention relates to the technical field of 2D/3D video playback.
Recent years have witnessed an increase in the number of movie theaters that offer stereoscopic viewing of 3D videos. Due to this trend, there has been a demand for optical discs having recorded thereon high-quality 3D videos.
An optical disc having recorded thereon 3D video must possess playback compatibility with a playback device that is capable of playing back only optical discs having recorded thereon 2D videos (hereafter, “a 2D playback device”). If a 2D playback device cannot play back 3D video recorded on an optical disc as 2D video, it will be necessary to manufacture two types of discs, namely a 3D disc and a 2D disc, of the same content. This could be a costly process. Accordingly, it is desired that an optical disc having recorded thereon 3D video be played back as 2D video on a 2D playback device, and as 2D or 3D video on a playback device that is capable of playing back both 2D and 3D videos (hereafter, “a 2D/3D playback device”).
Prior art for securing playback compatibility between a playback device and an optical disc having recorded thereon 3D video includes the technology disclosed in PLT 1 indicated below.
With a single optical disc that possesses such playback compatibility, a 2D/3D playback device can play back and present both 2D and 3D videos to a viewer (hereafter, the term “viewer” is used interchangeably with the term “user”).
Japanese Patent Publication No. 3935507
A wide band is required to transmit 3D video from one device to another. In view of this, a 2D/3D playback device and a display device are often connected to each other in compliance with the High Definition Multimedia Interface (HDMI) standard, which allows data transmission using a wide band. Devices connected in compliance with the HDMI standard exchange data once they have established synchronization with one another. In order to change the frame rate at which a video signal is output, these devices need to re-establish synchronization with one another; that is to say, video output ceases while they attempt to re-establish such synchronization.
During playback of 3D video, a 24 Hz left-view video and a 24 Hz right-view video are output to the display device. The 3D video as a whole is thereby output at a frame rate of 48 Hz. On the other hand, during playback of 2D video, only a 24 Hz left-view video is output to the display device. Therefore, when a 2D/3D playback device switches to playback of 2D video during playback of 3D video, the 2D/3D playback device needs to re-establish synchronization with the display device if the two are connected via HDMI. This gives rise to the problem that the start of playback of the 2D video is delayed.
In view of the above problem, the present invention aims to provide a playback device and a recording medium that enable seamless playback of 2D/3D videos.
In order to achieve the above aim, the present invention provides a playback device for playing back 3D video streams including a base-view video stream and a dependent-view video stream, wherein (i) when performing stereoscopic playback using the 3D video streams, the playback device outputs first picture data pieces and second picture data pieces to a display device, the first picture data pieces and the second picture data pieces being obtained by decoding the base-view video stream and the dependent-view video stream, respectively, and (ii) when performing 2D playback using the 3D video streams, the playback device outputs each of the first picture data pieces to the display device at least twice in succession.
When performing 2D playback using the 3D video streams, the playback device of the present invention structured in the above manner outputs, to the display device, each of picture data pieces obtained by decoding the base-view video stream at least twice in succession. This way, the output frame rate at which the 2D playback is performed matches the output frame rate at which stereoscopic playback is performed.
Accordingly, even if playback of the 3D video streams (stereoscopic playback) is switched to the 2D playback, there is no need for the playback device and the display device to re-establish synchronization with each other as required by the HDMI standard. This enables the playback device to perform seamless playback.
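As a rough illustration of the frame-repetition idea, consider the following sketch (a minimal, hypothetical fragment written for this description, not an actual player implementation):

```python
def output_frames(base_view_pictures, dependent_view_pictures, mode_3d):
    """Feed decoded pictures to the display at a constant output frame rate.

    In 3D mode, base-view (L) and dependent-view (R) pictures alternate.
    In 2D mode, each base-view picture is output twice in succession, so
    the HDMI output frame rate never changes and no re-synchronization is
    needed when switching between 2D and 3D playback.
    """
    for left, right in zip(base_view_pictures, dependent_view_pictures):
        if mode_3d:
            yield left    # frame 2n:   L picture
            yield right   # frame 2n+1: R picture
        else:
            yield left    # frame 2n:   L picture
            yield left    # frame 2n+1: the same L picture again
```

With 24 Hz left- and right-view videos, both branches produce 48 frames per second, which is what keeps the HDMI link synchronized across the switch.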
The following describes embodiments of a playback device comprising means for solving the aforementioned problem, with reference to the accompanying drawings. First, the principle of stereoscopic viewing is briefly discussed below.
In general, the right eye has a slightly different view of an object than the left eye, due to the difference in the locations of the right and left eyes. This binocular disparity enables a human to recognize an object seen by his/her eyes as a 3D object. Stereoscopic display can be realized by taking advantage of the binocular disparity of a human, i.e., by causing the viewer to visually recognize 2D images as if they are stereoscopic images.
More specifically, by alternately displaying, in a short time span, a 2D right-eye image and a 2D left-eye image which offer different visual perceptions to the right and left eyes of the viewer in a similar manner as the binocular disparity does, the viewer sees these images as if they are displayed in 3D.
This short time span should be a time period during which such alternate display of the 2D right-eye image and the 2D left-eye image can give a human the illusion of stereoscopic display. There are two methods to realize stereoscopic viewing. The first method utilizes a holography technique. The second method utilizes images that create binocular disparity effects (hereafter, “parallax images”), and is referred to as the “parallax image method”.
The first method, which utilizes the holography technique, is characterized in that it can create stereoscopic images of an object in such a manner that the viewer visually recognizes the three-dimensionality of the created stereoscopic images in the same way as he/she would visually recognize the three-dimensionality of the actual object. However, although a technical theory has already been established in the field of holography, it is extremely difficult to create and play back holograms of video using current technology, because doing so requires the use of (i) a computer that can perform an enormous number of operations to create the holograms of the video in real time, and (ii) a display device whose resolution is high enough to draw thousands of lines within a distance of 1 mm. For this reason, there are almost no practical examples of holography in commercial use.
The second method, namely the parallax image method, is beneficial in that stereoscopic viewing can be realized only by preparing right-eye video and left-eye video that give different perspectives to the right and left eyes. Technically speaking, the issue of the second method is how each of the right-eye and left-eye images can be presented only to the corresponding eye. In view of this, the second technique has already been implemented in various technical formats, one of which is an alternate-frame sequencing scheme.
With the alternate-frame sequencing scheme, left-eye video and right-eye video are displayed alternately in the time axis direction. Due to the afterimage effect, each left scene is overlapped with the corresponding right scene in the viewer's brain. As a result, the viewer visually recognizes an entirety of the left and right scenes as stereoscopic video.
The BD-ROM 100 provides the above home theater system with, for example, movies.
The playback device 200, which is connected to the television 300, is a 2D/3D playback device that plays back the BD-ROM 100.
The playback device 200 is connected to the television 300 in compliance with the HDMI standard.
The television 300 provides the user with an interactive operating environment by displaying a movie being played back, a menu, and the like. The display device 300 of the present embodiment realizes stereoscopic viewing by having the user wear the 3D glasses 400. However, if the display device 300 utilizes a lenticular lens, it can realize stereoscopic viewing without the user wearing the 3D glasses 400. The display device 300 utilizing the lenticular lens simultaneously arranges a left-eye picture and a right-eye picture next to each other on the screen. A lenticular lens having a semicircular shape is attached to the surface of the screen of the display device 300. Via this lenticular lens, the left eye converges only on pixels constituting the left-eye picture, and the right eye converges only on pixels constituting the right-eye picture. Stereoscopic viewing can be realized by the left and right eyes thus seeing two parallax pictures.
The 3D glasses 400 are composed of liquid crystal shutter glasses and allow the user to view parallax images using an alternate-frame sequencing scheme or a polarizing glass scheme. A pair of parallax images includes (i) an image to be presented to the right eye and (ii) an image to be presented to the left eye. Stereoscopic viewing is realized when the right and left eyes of the user only see the right-eye and left-eye pictures, respectively.
The remote control 500 is a device that receives operations relating to a multilayer GUI from the user. To receive such user operations, the remote control 500 is composed of: (i) a menu button for calling a menu constituting the GUI; (ii) arrow buttons for moving a focus for selecting one of GUI components constituting the menu; (iii) a select button that confirms selection of one of the GUI components constituting the menu; (iv) a return button for returning to the upper layer of the multilayer menu; and (v) number buttons.
This concludes the description of usage of the recording medium and the playback device.
In the present embodiment, a method of recording parallax images used for stereoscopic viewing on an information recording medium is described.
With the parallax image method, video to be presented to the right eye and video to be presented to the left eye are separately prepared. Here, stereoscopic viewing can be realized by making the right-eye and left-eye pictures visible only to the right and left eyes, respectively.
Of parallax images, images to be presented to the left eye are referred to as left-eye images (L images), and images to be presented to the right eye are referred to as right-eye images (R images). Video comprising left-eye pictures (L images) is referred to as a left-view video, and video comprising right-eye pictures (R images) is referred to as a right-view video. Video streams obtained by digitizing and compression encoding the left- and right-view videos are referred to as a left-view video stream and a right-view video stream, respectively.
The above left- and right-view video streams are compressed by using inter-picture predictive encoding which utilizes correlated characteristics of different visual perspectives, in addition to inter-picture predictive encoding which utilizes correlated characteristics of pictures in the time direction. Each picture of the right-view video stream is compressed by referring to a corresponding one of pictures of the left-view video stream which is assigned the same display time.
For example, the first P-picture of the right-view video stream refers to an I-picture of the left-view video stream. A B-picture of the right-view video stream refers to a Br-picture of the left-view video stream. The second P-picture of the right-view video stream refers to a P-picture of the left-view video stream.
Methods of compressing video by utilizing such correlated characteristics of different visual perspectives include Multiview Video Coding (MVC), which is an amendment to the H.264/MPEG-4 AVC standard. In July 2008, the Joint Video Team (JVT), which is a cooperative project between the ISO/IEC MPEG and the ITU-T VCEG, completed formulation of the amendment to the H.264/MPEG-4 AVC standard called Multiview Video Coding (MVC). The MVC is the standard intended to collectively encode images that show different visual perspectives. As the MVC enables predictive encoding by utilizing not only similarities between images in the time direction but also similarities between different visual perspectives, the MVC can improve compression efficiency as compared to when images that show different visual perspectives are each compressed individually.
Of the left- and right-view video streams that have been compression encoded using the MVC, a video stream that can be independently decoded is referred to as a “base-view video stream”. On the other hand, of the left- and right-view video streams, a video stream that (i) has been compression encoded based on its inter-frame correlated characteristics with respect to picture data pieces constituting the base-view stream, and (ii) can be decoded after the base-view stream has been decoded, is referred to as a “dependent-view stream”.
Described below is creation of a recording medium, i.e., manufacturing of the recording medium.
The volume area is divided into a plurality of access units, to which a series of consecutive numbers is assigned beginning from the first access unit. The optical disc can be accessed via each access unit. These consecutive numbers are called logical addresses. Data can be read out from the optical disc by designating logical addresses. Basically, in the case of a read-only disc such as the BD-ROM 100, sectors having consecutive logical addresses are physically arranged on the optical disc consecutively. That is, data of sectors having consecutive logical addresses can be read out without seek processing. However, at a boundary between recording layers, data cannot be read out consecutively even if the logical addresses are consecutive.
File system management information is recorded in the volume area immediately after the lead-in area. The file system management information is followed by a partition area managed by the file system management information. The file system is a system that expresses data on the disc in units called directories and files. In the case of the BD-ROM 100, the file system is recorded in the Universal Disc Format (UDF). A file system called FAT or NTFS is used in an ordinary personal computer (PC) to express data recorded on the hard disk using a directory/file structure, thus improving usability. The file system used on the BD-ROM 100 makes it possible to read logical data recorded on the BD-ROM 100 in the same manner as on an ordinary PC, using a directory/file structure.
Of the files accessible in the file system, a file storing AV streams obtained by multiplexing video streams and audio streams is called an “AV stream file”, and a file storing general data other than AV streams is called a “non-AV file”.
Elementary streams, representative examples of which include video and audio streams, are first converted into Packetized Elementary Streams (PESs) to which PES headers are assigned, and then converted into TS packets. Thereafter, the elementary streams are multiplexed. A file multiplexed in units of these TS packets is called a “transport stream file”.
Meanwhile, a file generated by (i) converting PES streams (results of converting elementary streams) into pack sequences and (ii) multiplexing the pack sequences is called a “program stream file”. This program stream file is different from the transport stream file.
An AV stream file recorded on a BD-ROM, a BD-RE and a BD-R is the former file, namely the transport stream file. An AV stream file recorded on a DVD-Video, a DVD-RW, a DVD-R and a DVD-RAM is the latter file, namely the program stream file, and is also called a Video Object.
Extents are formed on a plurality of sectors that are physically continuous in the partition area. The partition area is an area accessed by the file system and includes an “area in which file set descriptor is recorded”, an “area in which end descriptor is recorded”, a “ROOT directory area”, a “BDMV directory area”, a “JAR directory area”, a “BDJO directory area”, a “PLAYLIST directory area”, a “CLIPINF directory area”, and a “STREAM directory area”. The following explains these areas.
The “file set descriptor” includes a logical block number (LBN) that indicates a sector in which the file entry of the ROOT directory is recorded, among directory areas. The “end descriptor” indicates an end of the file set descriptor.
Next is a detailed description of the directory areas. The above-described directory areas have an internal structure in common. That is to say, each of the “directory areas” is composed of a “file entry”, “directory file”, and “file recording area of lower file”.
The “file entry” includes a “descriptor tag”, an “ICB tag”, and an “allocation descriptor”.
The “descriptor tag” is a tag that indicates the entity having the descriptor tag is a file entry.
The “ICB tag” indicates attribute information concerning the file entry itself.
The “allocation descriptor” includes a logical block number (LBN) that indicates a recording position of the directory file. This concludes the description of the file entry. Next is a detailed description of the directory file.
The “directory file” includes a “file identification descriptor of lower directory” and “file identification descriptor of lower file”.
The “file identification descriptor of lower directory” is information that is referenced to access a lower directory that belongs to the directory file itself, and is composed of identification information of the lower directory, the length of the directory name of the lower directory, a file entry address that indicates the logical block number of the block in which the file entry of the lower directory is recorded, and the directory name of the lower directory.
The “file identification descriptor of lower file” is information that is referenced to access a file that belongs to the directory file itself, and is composed of identification information of the lower file, the length of the lower file name, a file entry address that indicates the logical block number of the block in which the file entry of the lower file is recorded, and the file name of the lower file.
The file identification descriptors of the directory files of the directories indicate the logical blocks in which the file entries of the lower directory and the lower file are recorded. By tracing the file identification descriptors, it is therefore possible to reach from the file entry of the ROOT directory to the file entry of the BDMV directory, and reach from the file entry of the BDMV directory to the file entry of the PLAYLIST directory. Similarly, it is possible to reach the file entries of the JAR directory, BDJO directory, CLIPINF directory, and STREAM directory.
The “file recording area of lower file” is an area in which the substance of the lower file that belongs to a directory is recorded. A “file entry” of the lower file and one or more “extents” are recorded in the “file recording area of lower file”.
The “file entry” includes a “descriptor tag”, an “ICB tag”, and an “allocation descriptor”.
The “descriptor tag” is a tag that indicates the entity having the descriptor tag is a file entry. The tag is classified into a file entry descriptor, a space bit map descriptor, and the like. In the case of the file entry, “261” indicating the file entry is described in the descriptor tag.
The “ICB tag” indicates attribute information concerning the file entry itself.
The “allocation descriptor” includes a logical block number (LBN) that indicates the recording position of an extent that constitutes a lower file belonging to a directory. The allocation descriptor includes data indicating the extent length, and a logical block number that indicates the recording position of the extent. Here, when the higher two bits of the data indicating the extent length are set to “0”, it is indicated that the extent is an assigned and recorded extent; when the higher two bits are set to “1”, it is indicated that the extent is an assigned and unrecorded extent; and when they are set to “3”, it is indicated that the extent continues from the allocation descriptor. When a lower file belonging to a directory is sectioned into a plurality of extents, the file entry has a plurality of allocation descriptors, one for each extent.
By referring to the allocation descriptors of the above-described file entries, it is possible to recognize addresses of extents constituting an AV stream file and a non-AV file.
For example, the AV stream file is a file recording area that exists in the directory area of the directory to which the file belongs. It is possible to access the AV stream file by tracing the file identification descriptors of the directory files, and the allocation descriptors of the file entries.
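As a rough illustration of how such an allocation descriptor might be interpreted, consider the following sketch. It assumes the 8-byte short form of the descriptor (a little-endian 32-bit extent length whose upper two bits carry the flags described above, followed by a 32-bit logical block number); the function name is ours:

```python
import struct

def parse_allocation_descriptor(data: bytes) -> dict:
    """Interpret one 8-byte (short-form) allocation descriptor.

    The upper two bits of the length field distinguish assigned and
    recorded extents ("0"), assigned and unrecorded extents ("1"), and
    continuation extents ("3"), as described above.
    """
    length_field, lbn = struct.unpack("<II", data[:8])
    return {
        "extent_length": length_field & 0x3FFFFFFF,  # lower 30 bits
        "flags": length_field >> 30,                 # upper 2 bits
        "lbn": lbn,                                  # recording position
    }
```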
A BDMV directory has recorded therein data such as AV contents and management information recorded on the BD-ROM. Below the BDMV directory exist the following five sub-directories: a “PLAYLIST directory”; a “CLIPINF directory”; a “STREAM directory”; a “BDJO directory”; and a “JAR directory”. The BDMV directory includes two types of files, “index.bdmv” and “MovieObject.bdmv”.
The “index.bdmv” (fixed file name) stores an index table that shows (i) title numbers of a plurality of titles that can be played back from the BD-ROM, and (ii) program files (BD-J objects or movie objects) each defining a corresponding one of the titles. The index table is management information relating to the entirety of the BD-ROM. Once the disc has been loaded in the playback device, the playback device first reads the index.bdmv to uniquely identify the disc. The index table is the highest-level table defining the title structure, including all titles, a top menu, and FirstPlay, recorded on the BD-ROM. The index table designates the program file to be executed first from among the general titles, the top menu title, and the FirstPlay title. Each time a title or a menu is called, the playback device in which the BD-ROM has been loaded refers to the index table and executes a predetermined program file. Here, the FirstPlay title is set by a content provider, and indicates a program file to be executed automatically when the disc is loaded in the playback device. The top menu title designates a movie object or a BD-J object to be called when a command indicating “Return to Menu” or the like is executed according to a user operation received via the remote control. The index.bdmv contains Initial_output_mode information as information relating to stereoscopic viewing. The Initial_output_mode information defines the initial state that the output mode of the playback device should be in when the index.bdmv is loaded. The Initial_output_mode information can be configured to define an output mode desired by the manufacturer of the BD-ROM.
The “MovieObject.bdmv” (fixed file name) stores one or more movie objects. A movie object is a program file defining control procedures that the playback device should follow during an operation mode (HDMV mode) controlled by a command interpreter. The movie object includes a mask flag indicating whether menu calls and title calls should be masked when the user performs such calls on the GUI.
The “BDJO directory” includes a program file with the extension bdjo (“xxxxx.bdjo” where “xxxxx” is variable and the extension “bdjo” is fixed). This program file stores a BD-J object defining control procedures that the playback device should follow during an operation mode (BD-J mode) controlled by a Java® virtual machine, which is a byte code interpreter. The BD-J object contains an “application management table” for causing the playback device to perform application signaling for applications whose life cycles are bound to titles. The application management table contains (i) “application identifiers” that each identify an application to be executed when the title of the corresponding BD-J object becomes the current title, and (ii) “control codes”. In particular, an application whose life cycle is defined by the application management table is called a “BD-J application”. When a control code is set to AutoRun, it means that the corresponding application should be executed automatically after having been loaded onto a heap memory. When a control code is set to Present, it means that the corresponding application should be executed once it has been called by another application after having been loaded onto the heap memory. Meanwhile, some BD-J applications do not cease their operations even after the corresponding titles have ended. Such BD-J applications are called “title unboundary applications”.
An actual Java® application is the Java® archive file (YYYYY.jar) stored in the JAR directory below the BDMV directory. An application is, for example, a Java® application containing one or more xlet programs that have been loaded onto a heap area (also called a work memory) of the virtual machine. An application is composed of such one or more xlet programs and data loaded onto the work memory.
The “PLAYLIST directory” contains a playlist information file with the extension mpls (“xxxxx.mpls” where “xxxxx” is variable and the extension “mpls” is fixed).
A “playlist” defines playback sections along the time axis of an AV stream, and represents a playback path defined by logically specifying the playback order of these playback sections. The “playlist” defines (i) which AV stream(s) should be played back, (ii) which parts of the AV stream(s) should be played back, and (iii) in what order the scenes of the AV stream(s) should be played back. The playlist information file stores playlist information defining such a playlist. AV playback can be started by the Java® application for playback control instructing the Java® virtual machine to generate a Java® Media Framework (JMF) player instance that plays back the playlist information. The JMF player instance is the actual data generated in the heap memory of the virtual machine based on a JMF player class.
The “CLIPINF” directory contains a clip information file with the extension clpi (“xxxxx.clpi” where “xxxxx” is variable and the extension “clpi” is fixed).
The “STREAM” directory stores an AV stream file that is in compliance with the format xxxxx.m2ts (“xxxxx” is variable and the extension “m2ts” is fixed).
An AV stream file in the STREAM directory is a digital stream in the MPEG-2 transport stream (TS) format, and is generated by multiplexing a plurality of elementary streams, such as a video stream, an audio stream, and a graphics stream.
The AV stream file contains a “left-view AV stream”, which is a group of packets storing various types of PES streams for left-view playback, such as packets storing a left-view video stream, packets storing a graphics stream for the left view, and packets storing an audio stream to be played back together with these streams. When the left-view AV stream includes a base-view video stream and enables 2D playback, this left-view AV stream is referred to as a “2D/left-view video stream”. In the following description, the left-view video stream is the base-view video stream, and the left-view AV stream including the left-view video stream is the 2D/left-view AV stream, unless stated otherwise.
The AV stream file also contains a “right-view AV stream”, which is a group of packets storing various types of PES streams for right-view playback, such as source packets storing a right-view video stream, source packets storing a graphics stream for the right view, and source packets storing an audio stream to be played back together with these streams.
Clip information files in the CLIPINF directory are pieces of information that show, in one-to-one correspondence with the AV stream files, details of the AV stream files, such as the types of packets constituting them. Each clip information file is read out into memory prior to playback of the corresponding AV stream file, and is referenced within the playback device while the corresponding AV stream file is being played back.
This concludes the description of the internal structure of the recording medium. The following describes a method of creating the above-described recording medium.
The recording method of the present embodiment encompasses not only real-time recording (i.e., creating the aforementioned AV stream file and non-AV file in real time, and directly writing the created files into the volume area), but also pre-format recording (i.e., mass-producing optical discs by preparing the entire bitstreams to be recorded into the volume area, creating a master based on the prepared bitstreams, and performing press processing on the master). The recording medium of the present embodiment can also be specified by the recording method utilizing real-time recording and by the recording method utilizing pre-format recording.
Step S301 is a process of determining the title structure of the BD-ROM, and thus generating title structure information. Using a tree structure, the title structure information defines a relationship between units of playback on the BD-ROM, e.g., a relationship between a title, a movie object, a BD-J object and a playlist. More specifically, the title structure information is generated as follows. First, the following nodes are defined: (i) a node corresponding to the “disc name” of the BD-ROM to be created; (ii) a node corresponding to the “title” that can be played back from Index.bdmv of the BD-ROM; (iii) a node corresponding to the “movie object” or “BD-J object” constituting the title; and (iv) a node corresponding to the “playlist” that is played back from the movie object or BD-J object. Then, by connecting these nodes by branches, the relationship between the title, the movie object, the BD-J object and the playlist is defined.
Step S302 is a process of importing a video, audio, still images and subtitle information to be used for the title.
Step S303 is a process of creating BD-ROM scenario data by performing, on the title structure information, editing processing according to the user operation received via GUI. The BD-ROM scenario data is information for causing the playback device to play back an AV stream on a per-title basis. In the case of the BD-ROM, a scenario is information defined as the index table, the movie object, or the playlist. The BD-ROM scenario data includes material information constituting the stream, information showing playback sections and a playback path, menu screen arrangement, and information showing transition from the menu.
Step S304 is encode processing. A PES stream is acquired by performing the encode processing based on the BD-ROM scenario data.
Step S305 is multiplex processing that is performed in accordance with the BD-ROM scenario data. An AV stream is acquired by multiplexing the PES stream in Step S305.
Step S306 is a process of acquiring a database that is used for recording data on the BD-ROM. Here, the database is a general term referring to the above-described index table, movie object, playlist, BD-J object, etc. that are defined on the BD-ROM.
In Step S307, a Java® program, the AV stream acquired by the multiplex processing, and the BD-ROM database are input. Then, an AV stream file and a non-AV file are created in a file system format compliant with the BD-ROM.
Step S308 is a process of writing, from among data to be recorded on the BD-ROM, a non-AV file onto the BD-ROM. Step S309 is a process of writing, from among data to be recorded on the BD-ROM, an AV stream file onto the BD-ROM.
The multiplex processing of Step S305 includes (i) a first conversion process of converting a video stream, an audio stream and a graphics stream into a PES stream, then converting the PES stream into a transport stream, and (ii) a second conversion process of converting each TS packet constituting the transport stream into a source packet. The multiplex processing of Step S305 thus multiplexes a source packet sequence constituting a video, audio and graphics.
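The second conversion process can be sketched as follows; the 4-byte header that carries the ATS (together with a 2-bit copy-permission indicator) corresponds to the commonly documented TP_extra_header of the BD-ROM format, and the helper itself is illustrative:

```python
TS_PACKET_SIZE = 188

def to_source_packet(ts_packet: bytes, ats: int, copy_permission: int = 0) -> bytes:
    """Prepend a 4-byte header (2-bit copy permission + 30-bit ATS) to a
    188-byte TS packet, yielding a 192-byte source packet."""
    assert len(ts_packet) == TS_PACKET_SIZE and ts_packet[0] == 0x47
    header = (copy_permission << 30) | (ats & 0x3FFFFFFF)
    return header.to_bytes(4, "big") + ts_packet
```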
In Step S309, namely the process of writing the AV stream file, the source packet sequence is written into consecutive areas of the recording medium as AV stream file extents.
The following streams are written onto the recording medium.
(I) Video Stream
A video stream includes primary and secondary videos of a movie. Here, the primary video represents ordinary video to be displayed on the full screen as parent images during the Picture in Picture display. The secondary video represents video to be displayed in a small inset window during the Picture in Picture display. There are two types of primary video: a left-view video and a right-view video. Similarly, there are two types of secondary video: a left-view video and a right-view video.
The video stream is encoded and recorded by using, for example, MVC (described above), MPEG-2, MPEG-4 AVC and SMPTE VC-1.
(II) Audio Stream
An audio stream carries the audio of a movie. The audio stream is compression encoded and recorded using, for example, Dolby AC-3, Dolby Digital Plus, MLP, DTS, DTS-HD, or linear PCM. There are two types of audio streams: a primary audio stream and a secondary audio stream. The primary audio stream is output as the primary audio when playback is performed together with audio mixing, and the secondary audio stream is output as the secondary audio when playback is performed together with audio mixing.
(III) Presentation Graphics Stream
A Presentation Graphics (PG) stream presents graphics (e.g., movie subtitles and animated characters) to be displayed in close synchronization with pictures. Individual PG streams are provided in one to one correspondence with a plurality of different languages, such as English, Japanese, and French.
A PG stream is composed of a sequence of functional segments, namely a Presentation Control Segment (PCS), a Palette Definition Segment (PDS), a Window Definition Segment (WDS) and an Object Definition Segment (ODS). The ODS is a functional segment that defines graphics objects representing subtitles.
The WDS is a functional segment that defines a window, i.e., a rendering area for graphics objects on the screen. The PDS is a functional segment that defines colors to be used when drawing graphics objects. The PCS is a functional segment that defines page control during display of subtitles. Examples of such page control include Cut-In/Out, Fade-In/Out, Color Change, Scroll, and Wipe-In/Out. Page control defined by the PCS enables various display effects, one example of which is to display a new subtitle while gradually deleting a previous subtitle.
To play back the graphics stream, a graphics decoder executes the following processes in a pipeline: (i) decoding an ODS that belongs to a certain unit of display and writing its graphics objects into an object buffer, and (ii) writing, into plane memory, graphics objects acquired by decoding an ODS that belongs to a preceding unit of display. The above-mentioned close synchronization can be established by making the hardware operate to the fullest extent to execute these processes.
Other than the PG stream, a text subtitle (textST) stream is also one of the streams that present subtitles. The textST stream is not multiplexed on the AV stream file. The textST stream expresses contents of the subtitles in character codes. According to the BD-ROM standard, a pair of the PG stream and the textST stream is referred to as a “PGTextST stream”.
(IV) Interactive Graphics Stream
An Interactive Graphics (IG) stream is a graphics stream that realizes interactive control via a remote control. The interactive control defined by the IG stream is compatible with interactive control performed on the DVD playback device. The IG stream is composed of a plurality of functional segments, namely an Interactive Composition Segment (ICS), a Palette Definition Segment (PDS), and an Object Definition Segment (ODS). The ODS is a functional segment that defines graphics objects. Buttons on the interactive screen are drawn by aggregation of such graphics objects. The PDS is a functional segment that defines colors to be presented when drawing graphics objects. The ICS is a functional segment for causing a state change, or more specifically, for changing a state of each button in accordance with a user operation. The ICS includes button commands, each of which is to be executed when selection of the corresponding button is confirmed. The interactive graphics stream represents the interactive screen that is formed by arranging GUI components on the screen.
A video stream is composed of a plurality of Groups of Pictures (GOPs). Editing of and random access to video are made possible by performing encode processing on a per-GOP basis.
The first video access unit of a GOP is composed of a sequence header, a picture header, supplementary data, and compressed picture data, and stores the data of an I-picture. The sequence header stores information that is commonly shared within the GOP, such as the resolution, frame rate, aspect ratio, and bit rate. The frame rate, resolution, and aspect ratio stored in the sequence header of each GOP in the right-view video stream are the same as those stored in the sequence header of the corresponding GOP in the left-view video stream. The picture header stores information such as the method of encoding the entirety of the picture. The supplementary data is additional information that is not essential for decoding the compressed picture data. Examples of such additional information include character information for closed captions to be displayed on the TV screen in synchronization with the video, and time code information. The compressed picture data is picture data that has been compression encoded. Video access units other than the first video access unit of a GOP are each composed of a picture header, supplementary data, and compressed picture data.
Contents of the sequence header, picture header, supplementary data, and compressed picture data are configured in different manners depending on the method with which video is encoded. For example, when the video is encoded using MPEG-4 AVC, the sequence header, picture header and supplementary data correspond to a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS) and Supplemental Enhancement Information (SEI), respectively.
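For the MPEG-4 AVC case, this correspondence can be illustrated by splitting an Annex-B byte stream into NAL units and labeling the types that carry the sequence header (SPS), picture header (PPS), supplementary data (SEI), and compressed picture data. The NAL unit type codes below are those defined by H.264 itself; the parsing helper is a simplified sketch:

```python
import re

NAL_NAMES = {
    7: "SPS (sequence header)",
    8: "PPS (picture header)",
    6: "SEI (supplementary data)",
    5: "IDR slice (compressed picture data)",
    1: "non-IDR slice (compressed picture data)",
    9: "access unit delimiter",
}

def iter_nal_units(stream: bytes):
    """Yield (nal_unit_type, label) for each NAL unit in an Annex-B stream.

    Simplified: start codes may be 3 or 4 bytes long, so a stray zero byte
    from a following 4-byte start code may remain attached to a unit.
    """
    positions = [m.start() for m in re.finditer(b"\x00\x00\x01", stream)]
    for i, pos in enumerate(positions):
        begin = pos + 3
        end = positions[i + 1] if i + 1 < len(positions) else len(stream)
        nal = stream[begin:end]
        if nal:
            nal_type = nal[0] & 0x1F
            yield nal_type, NAL_NAMES.get(nal_type, "other")
```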
A description is now given of decode switch information.
The decode switch information is composed of a subsequent access unit type, a subsequent access unit size, and a decode counter.
The subsequent access unit type is information showing whether the video access unit to be decoded next is of the left-view video stream or of the right-view video stream. When the subsequent access unit type shows a value “1”, it means that the video access unit to be decoded next is of the left-view video stream. When it shows a value “2”, it means that the video access unit to be decoded next is of the right-view video stream. When it shows a value “0”, it means that the current video access unit is the last video access unit of the stream.
The subsequent access unit size is information showing a size of the video access unit to be decoded next. If the size of the video access unit to be decoded next is unknown, then it is required to identify the size of this video access unit by analyzing its structure when extracting this video access unit of an undecoded state from a corresponding buffer. However, with the aid of the subsequent access unit size, the video decoder can identify the size of the subsequent video access unit without analyzing its structure. This simplifies the processing of extracting a picture of an undecoded state from a corresponding buffer.
In a case where the first I-picture of a GOP in the left-view video stream is assigned a decode counter “0”, the video access units of the left- and right-view video streams following this I-picture are assigned decode counters that successively increment in the order in which they are decoded.
Use of such information (the decode counters) makes it possible to perform proper processing to resolve an error that arises when a video access unit cannot be read for some reason. For example, assume a case where the third video access unit of the left-view video stream (a Br-picture) cannot be read due to a reading error. In such a case, the gap in the decode counters reveals that an access unit is missing, allowing the decoder to perform appropriate error handling instead of misinterpreting the subsequent access units.
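A sketch of how a decoder might exploit the decode counters for such error handling is given below; the data model and the recovery policy (resynchronizing at the next unit actually read) are illustrative assumptions:

```python
def decode_with_recovery(access_units, decode):
    """access_units: iterable of (decode_counter, payload) pairs."""
    expected = None
    for counter, payload in access_units:
        if expected is not None and counter != expected:
            # A unit was lost (e.g., due to a reading error). Resynchronize
            # here rather than decoding pictures whose reference is missing.
            print(f"gap detected: expected counter {expected}, got {counter}")
        decode(payload)
        expected = counter + 1
```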
<Multiplexing of AV Stream>
A group of source packets whose ATSs are consecutive on the Arrival Time Clock (ATC) time axis is called an ATC sequence. A group of source packets whose DTSs and PTSs are consecutive on the System Time Clock (STC) time axis is called an STC sequence.
In each of the extents shown in the first row, groups of source packets constituting the right-view AV stream and groups of source packets constituting the left-view AV stream are arranged in an interleaved manner.
Here, each of the variables “i”, “i+1”, etc. included in the brackets indicates the numerical order in which the corresponding extent is played back.
The sizes of the extents EXT_L[i] and EXT_R[i] are expressed as SEXT_L[i] and SEXT_R[i], respectively.
The following explains how these sizes SEXT_L and SEXT_R are determined. The playback device has two buffers, namely a right-view read buffer and a left-view read buffer. The extents of the right- and left-view AV streams are alternately read into these two buffers. The capacity of the right-view read buffer must satisfy the following relationship:
Capacity of Right-View Read Buffer=Rmax1דTime Period for which Left-View Read Buffer Becomes Full, Including Jump Time Period(s)”
Here, a jump has the same meaning as a disc seek. This is because a BD-ROM has a limited number of consecutive areas that can be secured as recording areas, and the left-view and right-view video streams are not necessarily recorded on the BD-ROM right next to each other; that is, there are cases where the left-view video stream is recorded in an area that is distant from the area in which the right-view video stream is recorded on the BD-ROM.
The following discusses the “Time Period for which Left-View Read Buffer Becomes Full, Including Jump Time Period(s)”. TS packets accumulate in the left-view read buffer at a rate of Rud−Rmax2, which is the difference between (i) the input rate Rud, at which the left-view read buffer receives an input, and (ii) the output rate Rmax2, at which the left-view read buffer performs an output. Accordingly, the time period required for the left-view read buffer to become full is RB2/(Rud−Rmax2), where RB2 denotes the capacity of the left-view read buffer.
In order for the left-view read buffer to accumulate this data, it is necessary to take into consideration (i) a jump time period (Tjump) required to jump from the right-view AV stream to the left-view AV stream, and (ii) a jump time period (Tjump) required to jump from the left-view AV stream to the right-view AV stream. For this reason, a time period of (2×Tjump+RB2/(Rud−Rmax2)) is required to accumulate data in the left-view read buffer.
Given that the transfer rate to the right-view read buffer is Rmax1, all the source packets in the right-view read buffer need to be output at the transfer rate of Rmax1 during the above-described time period for which data is accumulated in the left-view read buffer. Therefore, the capacity RB1 of the right-view read buffer is:
RB1≧Rmax1×{2×Tjump+RB2/(Rud−Rmax2)}
In a similar manner, the capacity RB2 of the left-view read buffer can be calculated using the following expression:
RB2≧Rmax2×{2×Tjump+RB1/(Rud−Rmax1)}
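Since each inequality refers to the capacity of the other buffer, the two can be solved simultaneously at equality. The following sketch does so in closed form; the numeric rates and the jump time at the bottom are purely illustrative assumptions, not values taken from any specification:

```python
def minimum_buffer_sizes(rmax1, rmax2, rud, t_jump):
    """Solve RB1 = rmax1*(2*t_jump + RB2/(rud - rmax2)) together with
    RB2 = rmax2*(2*t_jump + RB1/(rud - rmax1)); rates in bits/second."""
    c = (rmax1 * rmax2) / ((rud - rmax1) * (rud - rmax2))
    assert c < 1, "the read rate must sufficiently exceed the output rates"
    rb1 = 2 * t_jump * rmax1 * (1 + rmax2 / (rud - rmax2)) / (1 - c)
    rb2 = rmax2 * (2 * t_jump + rb1 / (rud - rmax1))
    return rb1, rb2

# Illustrative figures only: 16 Mbps per view, a 72 Mbps read rate, 0.2 s jumps.
rb1, rb2 = minimum_buffer_sizes(16e6, 16e6, 72e6, 0.2)
print(f"RB1 >= {rb1 / 8e6:.2f} Mbytes, RB2 >= {rb2 / 8e6:.2f} Mbytes")
```

With these assumed figures, both minima come out at roughly 1.1 Mbytes, consistent with the memory sizes discussed next.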
A specific memory size of each of the right- and left-view read buffers is equal to or below 1.5 Mbytes. In the present embodiment, the extent sizes SEXT_R and SEXT_L are set to be exactly or substantially equal to the memory sizes of the right- and left-view read buffers, respectively. As the file extents are physically arranged in the above-described manner, the AV stream can be played back seamlessly without the video and audio being cut off partway through. This concludes the description of the method of recording the left- and right-view AV streams. Described below are the internal structures of the left- and right-view AV streams, more specifically, the internal structures of the extents EXT_R[i] and EXT_L[i].
The extent EXT_L[i] is composed of the following source packets.
Source packets with a packet ID “0x0100” constitute a Program Map Table (PMT). Source packets with a packet ID “0x1001” constitute a PCR.
Source packets with a packet ID “0x1011” constitute the left-view video stream.
Source packets with packet IDs “0x1220” to “0x123F” constitute the left-view PG stream.
Source packets with packet IDs “0x1420” to “0x143F” constitute the left-view IG stream.
Source packets with packet IDs “0x1100” to “0x111F” constitute the audio stream.
The extent EXT_R[i] is composed of the following source packets. Source packets with a packet ID “0x1012” constitute the right-view video stream. Source packets with packet IDs “0x1240” to “0x125F” constitute the right-view PG stream. Source packets with packet IDs “0x1440” to “0x145F” constitute the right-view IG stream.
In addition to the source packets of each stream (e.g., video, audio and graphics streams), the AV stream also includes source packets of a Program Association Table (PAT), a Program Map Table (PMT), a Program Clock Reference (PCR) and the like. The PAT shows the PID of the PMT used in the AV stream. The PID of the PAT itself is registered as “0x0000”. The PMT stores the PIDs of each stream (e.g., video, audio and graphics streams) included in the AV stream file, and attribute information of the streams corresponding to the PIDs. The PMT also has various descriptors relating to the AV stream. The descriptors carry information such as copy control information showing whether copying of the AV stream file is permitted or not permitted. The PCR stores STC time information corresponding to the ATS showing when the PCR packet is transferred to the decoder, in order to achieve synchronization between the Arrival Time Clock (ATC), which is the time axis of the ATSs, and the System Time Clock (STC), which is the time axis of the PTSs and DTSs.
More specifically, a PMT header is disposed at the top of the PMT. Information written in the PMT header includes the length of the data included in the PMT to which the PMT header is attached. A plurality of descriptors relating to the AV stream are disposed after the PMT header. Information such as the aforementioned copy control information is listed in the descriptors. Disposed after the descriptors are a plurality of stream information pieces relating to the streams included in the AV stream file. Each stream information piece is composed of stream descriptors, each listing information such as a stream type for identifying the compression codec of the stream, a stream PID, and stream attribute information (such as a frame rate or an aspect ratio). The stream descriptors are equal in number to the streams in the AV stream file.
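The layout just described can be traversed with a short parser. The following sketch extracts the stream type and PID of each stream information piece from a PMT section; the field widths follow the MPEG-2 systems specification, and CRC checking and descriptor parsing are omitted for brevity:

```python
def parse_pmt(section: bytes):
    """Return (stream_type, elementary_PID) pairs from one PMT section."""
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    program_info_length = ((section[10] & 0x0F) << 8) | section[11]
    pos = 12 + program_info_length       # skip program-level descriptors
    end = 3 + section_length - 4         # stop before the trailing CRC_32
    streams = []
    while pos < end:
        stream_type = section[pos]
        elementary_pid = ((section[pos + 1] & 0x1F) << 8) | section[pos + 2]
        es_info_length = ((section[pos + 3] & 0x0F) << 8) | section[pos + 4]
        streams.append((stream_type, elementary_pid))  # descriptors skipped
        pos += 5 + es_info_length
    return streams
```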
The following explains how the extents described above are associated with files in the file system.
Dotted arrows h1, h2, h3, h4 and h5 indicate belonging relationships based on allocation descriptors; in other words, these dotted arrows indicate to which files the extents EXT_R[i], EXT_L[i], EXT_R[i+1] and EXT_L[i+1] belong. According to the belonging relationships indicated by the dotted arrows h1, h2, h3, h4 and h5, the extents EXT_R[i], EXT_L[i], EXT_R[i+1] and EXT_L[i+1] are all registered as extents of the file XXXXX.m2ts.
This concludes the description of the AV stream file storing the AV stream. A description is now given of a clip information file.
<Clip Information File>
As indicated by the leading lines ch2, the clip information is composed of a “system rate”, a “playback start time”, and a “playback end time”. The system rate denotes the maximum transfer rate at which TS packets constituting the AV stream file are transferred to a PID filter of the system target decoder (described later). The intervals between the ATSs included in the AV stream file are set such that the transfer rate never exceeds the system rate. The playback start time is set to the PTS assigned to the first video frame of the AV stream file. The playback end time is set to a time obtained by adding a per-frame playback interval to the PTS assigned to the last video frame of the AV stream file.
As indicated by the leading lines ah1, the stream attribute information set shows the attributes of each PES stream constituted from the various types of source packets. More specifically, the stream attribute information set shows: (i) stream attribute information of the left-view video stream constituted from TS packets with the PID “0x1011”; (ii) stream attribute information of the right-view video stream constituted from TS packets with the PID “0x1012”; (iii) stream attribute information of the audio streams constituted from TS packets with the PIDs “0x1100” and “0x1101”; and (iv) stream attribute information of the PG streams constituted from TS packets with the PIDs “0x1220” and “0x1221”. As indicated by the leading lines ah1, attribute information is registered for each PID of each stream in the AV stream file. The attribute information differs depending on the type of the stream. Video stream attribute information carries information including the compression codec the video stream was compressed with, and the resolution, aspect ratio and frame rate of the pieces of picture data that compose the video stream. Audio stream attribute information carries information including the compression codec the audio stream was compressed with, the number of channels included in the audio stream, the languages the audio stream supports, and the sampling frequency. The above information in the video stream attribute information and the audio stream attribute information is used for purposes such as initialization of the decoder before the player performs playback.
A description is now given of video stream attribute information. The codec, frame rate, aspect ratio and resolution included in the left-view video stream attribute information, which corresponds to the PID “0x1011”, must match those included in the corresponding right-view video stream attribute information, which corresponds to the PID “0x1012”. If the codec included in the left-view video stream attribute information does not match the codec included in the corresponding right-view video stream attribute information, the two video streams cannot refer to each other. Furthermore, in order to play back the two video streams in synchronization with each other as 3D video on the display, the frame rate, aspect ratio and resolution included in the left-view video stream attribute information must match those included in the corresponding right-view video stream attribute information. Otherwise, playback of the two video streams would bring discomfort to the viewer.
The right-view video stream attribute information may further include a flag indicating that it is necessary to refer to the left-view video stream to decode the right-view video stream. The right-view video stream attribute information may also include information indicating the video stream to be referred to when decoding the right-view video stream. By configuring the left-view video stream attribute information and the right-view video stream attribute information in the above manner, a relationship between the two video streams can be judged by a tool for verifying whether data has been created in compliance with a specified format.
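Such a verification tool might perform checks along the following lines; the attribute keys are hypothetical names for the fields discussed here:

```python
def check_3d_pair(left_attr: dict, right_attr: dict) -> list:
    """Return a list of mismatches between left- and right-view attributes."""
    errors = []
    for key in ("codec", "frame_rate", "aspect_ratio", "resolution"):
        if left_attr.get(key) != right_attr.get(key):
            errors.append(f"{key} mismatch: {left_attr.get(key)!r} vs "
                          f"{right_attr.get(key)!r}")
    return errors

left  = {"codec": "MVC", "frame_rate": 24, "aspect_ratio": "16:9",
         "resolution": (1920, 1080)}
right = dict(left)          # a compliant right-view entry matches exactly
assert not check_3d_pair(left, right)
```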
The “entry map header information” includes information such as PIDs of video streams indicated by the entry maps and the number of entry points indicated by the entry maps.
The “extent start type” shows whether the first one of a plurality of extents arranged is of the left-view video stream or the right-view video stream. With reference to the “extent start type”, the 2D/3D playback device can easily judge which one of an extent of the left-view AV stream and an extent of the right-view AV stream it should request a BD-ROM drive to play back first.
The “entry map for the PID ‘0x1011’”, “entry map for the PID ‘0x1012’”, “entry map for the PID ‘0x1220’”, and “entry map for the PID ‘0x1221’” are respectively entry maps of PES streams composed of different types of source packets. A pair of a PTS and an SPN included in each entry map is called an “entry point”. Each entry point has an entry point ID (hereafter, “EP_ID”). Starting with the top entry point, which has an EP_ID “0”, the entry points have successively incrementing EP_IDs. A pair of the PTS and SPN of the first I-picture of each GOP included in the left-view video stream is registered as an entry point of the left-view video stream. Similarly, a pair of the PTS and SPN of the first picture of each GOP included in the right-view video stream is registered as an entry point of the right-view video stream. Using these entry maps, the player can specify the location of a source packet corresponding to an arbitrary point on the time axis of the video stream. For instance, when performing special playback such as fast forward or rewind, the player can perform processing efficiently without analyzing the AV stream file, by specifying, selecting and playing back the I-picture registered in each entry map. An entry map is created for each video stream multiplexed in the AV stream file. The entry maps are managed according to the PIDs.
The entry point with the EP_ID “2” shows a correspondence between an is_angle_change flag (set to “OFF”), the SPN “3200”, and the PTS “360000”. The entry point with the EP_ID “3” shows a correspondence between an is_angle_change flag (set to “OFF”), the SPN “4800”, and the PTS “450000”. Each is_angle_change flag indicates whether decoding can start independently from the corresponding entry point. Each is_angle_change flag is set to “ON” when the video stream has been encoded using MVC or MPEG-4 AVC and the picture of the corresponding entry point is an IDR picture. On the other hand, each is_angle_change flag is set to “OFF” when the video stream has been encoded using MVC or MPEG-4 AVC and the picture of the corresponding entry point is a non-IDR picture.
The entry point with the EP_ID “2” shows a source packet with the SPN “3200” in correspondence with the PTS “360000”. The entry point with the EP_ID “3” shows a source packet with the SPN “4800” in correspondence with the PTS “450000”.
Using the entry map, the player can specify the location of the AV stream file corresponding to an arbitrary point on the time axis of the video stream. For instance, when performing special playback such as fast forward or rewind, the player can perform processing efficiently without analyzing the AV stream file, by specifying, selecting and playing back the I-picture registered in each entry map.
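The lookup itself can be sketched as a binary search over the entry points, which are sorted by PTS. The first two entry points below are invented for illustration, while the last two reuse the SPN/PTS pairs from the example above:

```python
import bisect

entry_map = [  # (PTS, SPN) pairs registered for one PID
    (180000, 1600), (270000, 2400), (360000, 3200), (450000, 4800)]

def find_start_spn(target_pts: int) -> int:
    """Return the SPN of the last entry point at or before target_pts."""
    pts_list = [pts for pts, _ in entry_map]
    i = bisect.bisect_right(pts_list, target_pts) - 1
    if i < 0:
        raise ValueError("target precedes the first entry point")
    return entry_map[i][1]

print(find_start_spn(400000))  # -> 3200: start decoding from that I-picture
```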
Assume a case where, of the first I-picture of a GOP in the left-view video stream and the first I-picture of the corresponding GOP in the right-view video stream, one is registered in the corresponding entry map while the other is not. In this case, when random access (e.g., jump playback) is performed, it would be difficult to play back the left- and right-view video streams as stereoscopic video.
The above problem can be solved by a structure in which entry points are registered in pairs: whenever an entry point is registered for the first I-picture of a GOP in the left-view video stream, an entry point is also registered for the first picture of the corresponding GOP in the right-view video stream.
This concludes the description of the entry map table. The following is a detailed description of the 3D metadata set.
The 3D metadata set is a group of metadata that defines various information required for stereoscopic playback, and includes a plurality of offset entries. Each PID corresponds to a plurality of offset entries. The offset entries are in one to one correspondence with a plurality of display times. When playing back a PES stream of a certain PID, it is possible to define, for each PID, what offset should be used to perform stereoscopic playback at each display time in the PES stream.
3D metadata is information for adding depth information to 2D images of a presentation graphics stream, an interactive graphics stream, and a secondary video stream.
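As a rough sketch of how such an offset might be applied, the same decoded 2D plane can be shifted horizontally in opposite directions for the two views. The array representation below is illustrative, and a real player would fill the vacated columns with transparent pixels rather than wrap them around:

```python
def apply_plane_offset(plane, offset):
    """plane: list of pixel rows; returns (left_view, right_view).

    Shifting the plane by +offset for one eye and -offset for the other
    gives the graphics an apparent depth in front of or behind the screen.
    Wrap-around is used here only for brevity.
    """
    width = len(plane[0])
    def shifted(delta):
        return [[row[(x - delta) % width] for x in range(width)]
                for row in plane]
    return shifted(+offset), shifted(-offset)
```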
This concludes the description of the clip information file. The following is a detailed description of a playlist information file.
<Playlist Information File>
A playlist shows a playback path of an AV stream file. A playlist is composed of one or more playitems. Each playitem shows a corresponding playback section of an AV stream, and is identified by a corresponding playitem ID. The playitems are listed in the order in which they should be played back in the playlist. A playlist includes entry marks each showing a corresponding playback start point. Each entry mark can be assigned to a playback section defined by the corresponding playitem. Specifically, each entry mark is assigned to a position that could be the playback start point of the corresponding playitem. The entry marks are used for cue playback. For example, chapter playback can be performed by assigning entry marks to the positions that represent start points of chapters in a movie title.
The “main path” is composed of one or more playitems. In the example of
Each “subpath” shows a playback path to be played back together with the main path. Subpaths are assigned IDs (subpath IDs) in the order in which they are registered in the playlist. Subpath IDs are used to identify the subpaths. There are a subpath of the synchronized type and a subpath of the unsynchronized type. The subpath of the synchronized type is played back in synchronization with playback of the main path. The subpath of the unsynchronized type can be played back without being in synchronization with playback of the main path. The types of subpaths are stored as subpath types. Each subpath is composed of one or more pieces of sub-playitem information.
Each playitem includes a stream selection table, which is information showing the stream number of an elementary stream whose playback is permitted in the playitem or the corresponding sub-playitem. The playlist information, playitem information, sub-playitem information and stream selection table are described in detail in the later embodiments.
“AV clips #1, #2 and #3” constitute an AV stream that is (i) played back as 2D video, or (ii) played back as a left-view AV stream during 3D video playback.
“AV clips #4, #5 and #6” constitute an AV stream that is played back as a right-view AV stream during 3D video playback.
As shown by the reference numbers rf1, rf2 and rf3, the main path of the 2D playlist refers to the AV clips #1, #2 and #3 that store the left-view AV stream.
The 3D playlist is composed of (i) a main path including the playitems that refer to the left-view AV stream as shown by the reference numbers rf4, rf5 and rf6, and (ii) a subpath including the sub-playitems that refer to the right-view AV stream. More specifically, as shown by the reference numbers rf7, rf8 and rf9, the subpath of the 3D playlist refers to the AV clips #4, #5 and #6 that store the right-view AV stream. This subpath is configured to be synchronized with the main path on the time axis. The 2D and 3D playlists structured in the above manner can share AV clips storing the left-view AV stream. In the 3D playlist structured in the above manner, the left- and right-view AV streams are in correspondence with each other so that they are synchronized with each other on the time axis.
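The relationship between the two playlists and the shared AV clips can be sketched as follows in Python; the class names are illustrative stand-ins, not actual playlist syntax.

```python
from dataclasses import dataclass, field

@dataclass
class PlayItem:
    clip: str          # AV clip referred to by this playback section

@dataclass
class SubPlayItem:
    clip: str

@dataclass
class PlayList:
    main_path: list
    sub_path: list = field(default_factory=list)

# The 2D playlist and the main path of the 3D playlist share the AV
# clips storing the left-view AV stream; only the subpath of the 3D
# playlist refers to the AV clips storing the right-view AV stream.
left_clips = ["AVclip#1", "AVclip#2", "AVclip#3"]
right_clips = ["AVclip#4", "AVclip#5", "AVclip#6"]

playlist_2d = PlayList(main_path=[PlayItem(c) for c in left_clips])
playlist_3d = PlayList(
    main_path=[PlayItem(c) for c in left_clips],
    sub_path=[SubPlayItem(c) for c in right_clips],
)
```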
Referring to
As opposed to the 3D playlist of
In the example of
When playlist information is described so as to realize the 3D playlist of
A description is now given of a playlist in which 2D playitems and 3D playitems coexist. During playback of such a playlist, 2D and 3D playitems must be seamlessly connected with one another.
Content that stores 3D videos does not necessarily consist only of 3D videos. In some contents, 2D and 3D videos coexist. During playback of such contents, the 2D and 3D videos included therein need to be seamlessly played back.
When playing back the playback sections #1, #2 and #3 in this order, playback is not performed at the same frame rate throughout the playback sections #1, #2 and #3. Each time a frame rate is changed, the HDMI connection between the playback device and the television needs to be reset; this causes delay and therefore does not guarantee seamless playback. One method of avoiding this problem is illustrated in
In view of the above, the following structure, which is illustrated in
In the example of
The following describes a specific data structure of a 3D playlist with reference to
Each playitem includes a field for a duplicate flag as shown in
It is permissible to prohibit coexistence of 2D and 3D playitems in one playlist. This makes it possible to easily avoid the problem of delay caused by changing the frame rate when switching between 2D and 3D videos.
First, a description is given of the MainPath information. Leading lines mp1 indicate a close-up of the internal structure of the MainPath information. As indicated by the leading lines mp1, the MainPath information is composed of a plurality of pieces of PlayItem information, namely PlayItem information #1 through #N. The PlayItem information defines one or more logical playback sections that constitute the MainPath. Leading lines mp2 of
The following describes the “STN_table”, the “left-view/right-view identification information”, and the “multi_clip_entry”.
The “STN_table (Stream Number_table)” is a table in which logical stream numbers are assigned to pairs of (i) a stream entry including a packet ID and (ii) a stream attribute. The order of the pairs of a stream entry and a stream attribute in the STN_table indicates a priority order of the corresponding streams. This STN_table is provided for 2D playback, and an STN_table for 3D playback is provided independent of this table.
The “left-view/right-view identification information” is base-view video stream specification information that specifies which one of the left-view video stream and the right-view video stream is the base-view video stream. When the left-view/right-view identification information shows “0”, it means that the left-view video stream is the base-view video stream. When the left-view/right-view identification information shows “1”, it means that the right-view video stream is the base-view video stream.
The “connection_condition” indicates a type of connection between the current playitem and the preceding playitem. When the connection_condition of a playitem is “1”, it indicates that a seamless connection between the AV stream specified by the playitem and the AV stream specified by the preceding playitem is not guaranteed. When the connection_condition of a playitem is “5” or “6”, it indicates that a seamless connection between the AV stream specified by the playitem and the AV stream specified by the preceding playitem is guaranteed.
When the connection_condition is “5”, the STCs between playitems may be discontinuous. That is to say, the video display start time of the start of the starting AV stream of the post-connection playitem may be discontinuous from the video display end time of the end of the ending AV stream of the pre-connection playitem. It should be noted here that the AV streams need to be generated so that the decoding by the system target decoder (described later) does not fail when playback is performed after the AV stream of the post-connection playitem is input into the PID filter of the system target decoder immediately after the AV stream of the pre-connection playitem is input into the PID filter of the system target decoder. Also, there are some limiting conditions. For example, the audio end frame of the AV stream of the pre-connection playitem should overlap, on the playback time axis, with the audio start frame of the post-connection playitem.
When the connection_condition is “6”, an AV stream of the pre-connection playitem connected with an AV stream of the post-connection playitem should be playable as one AV clip. That is to say, the STCs and ATCs should be continuous throughout the AV streams of the pre-connection playitem and post-connection playitem.
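These semantics can be summarized by the following sketch in Python; it merely condenses the rules above and is not the actual player logic.

```python
def is_seamless(connection_condition: int) -> bool:
    """connection_condition = 1 : seamless connection not guaranteed.
    connection_condition = 5 : seamless; STCs may be discontinuous
                               across the boundary.
    connection_condition = 6 : seamless; STCs and ATCs are continuous,
                               so the two AV streams are playable as
                               one AV clip."""
    return connection_condition in (5, 6)

print(is_seamless(1), is_seamless(5), is_seamless(6))  # False True True
```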
The “Multi_clip_entry” is information that identifies AV streams representing videos of different angles when a multi-angle section is formed by the playitem.
This concludes the description of the MainPath information. Next, a detailed description is given of the SubPath information table.
The “Clip_information_file_name” is information that, with the file name of Clip information written therein, uniquely specifies a SubClip that corresponds to the SubPlayItem.
The “Clip_codec_identifier” indicates an encoding method of the AV stream file.
The “ref_to_STC_id[0]” uniquely indicates an STC_Sequence that is the target of the SubPlayItem.
The “SubPlayItem_In_time” is information that indicates the start point of the SubPlayItem on the playback time axis of the SubClip.
The “SubPlayItem_Out_time” is information that indicates the end point of the SubPlayItem on the playback time axis of the SubClip.
The “sync_PlayItem_id” is information that uniquely specifies, among PlayItems constituting the MainPath, a PlayItem with which the SubPlayItem is to be synchronized. The “SubPlayItem_In_time” is present on the playback time axis of the PlayItem specified by the “sync_PlayItem_id”.
The “sync_start_PTS_of_PlayItem” indicates, with the time accuracy of 45 kHz, the position of the start point of the SubPlayItem specified by the SubPlayItem_In_time, on the playback time axis of the PlayItem specified by the “sync_PlayItem_id”.
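The fields described above can be gathered into the following sketch of a piece of sub-playitem information, in Python; the field names mirror the description, and the conversion helper is merely illustrative.

```python
from dataclasses import dataclass

@dataclass
class SubPlayItemInfo:
    clip_information_file_name: str
    clip_codec_identifier: str
    ref_to_stc_id: int
    sub_playitem_in_time: int        # 45 kHz ticks, SubClip time axis
    sub_playitem_out_time: int       # 45 kHz ticks, SubClip time axis
    sync_playitem_id: int            # PlayItem to synchronize with
    sync_start_pts_of_playitem: int  # 45 kHz ticks, PlayItem time axis

def sync_start_seconds(spi: SubPlayItemInfo) -> float:
    """Start position of the SubPlayItem, in seconds, on the playback
    time axis of the PlayItem specified by sync_playitem_id."""
    return spi.sync_start_pts_of_playitem / 45000.0
```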
This concludes the description of the subpath information. Next is a detailed description of the entry mark information.
The entry mark information can be attached to a position within a playback section defined by the playitem. Namely, the entry mark information is attached to a position that can be a playback start point in the playitem, and is used for cue playback. For example, during playback of a movie title, chapter playback is realized when an entry mark is attached to a chapter start position.
This concludes the description of the entry mark information. Next is a detailed description of the extension data.
The extension data is an extension unique to the 3D playlist, and is not compatible with the 2D playlist. The extension data stores STN_table_SSs #1 through #N. Each STN_table_SS corresponds to a different piece of playitem information, and is a table in which logical stream numbers are assigned to pairs of a stream entry and a stream attribute for 3D playback. The order of the pairs of a stream entry and a stream attribute in the STN_table_SS indicates a priority order of the corresponding streams. The stream selection table is constituted from the STN_table in the playitem information and the STN_table_SS in the extension data.
The following describes the stream selection table which is included in the above-described internal structure of the PlayItem information.
As the stream entries of the STN_table, the audio/PG/IG for 2D that are playable during 2D playback can be registered. For this reason, the STN_table includes a 2D video stream entry group, a 2D audio stream entry group, a 2D PG stream entry group, and a 2D IG stream entry group, and the packet identifiers of the video, audio, PG, and IG streams can be described in these stream entry groups.
As the stream entries of the STN_table_SS, the audio/PG/IG for 3D that are playable during stereoscopic playback can be registered. For this reason, the STN_table_SS includes a 3D video stream entry group, a 3D audio stream entry group, a 3D PG stream entry group, a 3D IG stream entry group, and stream combination information, and the packet identifiers of the video, audio, PG, and IG streams can be described in these stream entry groups.
The “stream selection number” is a number assigned to each stream entry in the stream selection table, and is incremented by one in the order starting with the “stream entry 1”. The “stream selection number” is used by the playback device to identify each stream.
The “stream path information” is information that indicates an AV stream on which the stream indicated by the stream identification information is multiplexed. For example, when the “stream path information” is “main path”, it indicates an AV stream of the playitem; and when the “stream path information” is “subpath ID ‘1’”, it indicates an AV stream of a sub-playitem that corresponds to the playback section of the playitem, in the subpath indicated by the subpath ID.
The “stream identification information” is information such as the PID, and indicates a stream multiplexed on the referenced AV stream file. Attribute information of each stream is also recorded in each stream entry. Here, the attribute information is information that indicates characteristics of each stream. For example, in the case of audio, presentation graphics, or interactive graphics, the attribute information includes a language attribute or the like.
In the STN_table_SS, the stream entries for the left- and right-view video streams have the same values with respect to, for example, the frame rate, resolution, and video format. For this reason, the stream entry may include a flag that indicates whether the corresponding stream is the left-view video stream or the right-view video stream.
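The way the player consults the combined table can be sketched as follows in Python; the stream numbers and packet identifiers are illustrative values only.

```python
# STN_table (2D) and STN_table_SS (3D): stream number -> (path, PID).
stn_table = {
    1: ("main path", 0x1100),   # 2D audio
    2: ("main path", 0x1200),   # 2D PG
}
stn_table_ss = {
    1: ("main path", 0x1101),   # 3D audio
    2: ("subpath 1", 0x1220),   # 3D PG
}

def resolve_stream(stream_number: int, stereoscopic: bool):
    """Return (stream path, PID) for the current stream number; the
    3D table is consulted during stereoscopic playback."""
    table = stn_table_ss if stereoscopic else stn_table
    return table[stream_number]

print(resolve_stream(2, stereoscopic=True))  # ('subpath 1', 4640), i.e., PID 0x1220
```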
This concludes the description of the stream selection table. Next, a detailed description is given of the left-view/right-view identification information.
It has been presumed in the description that the left view serves as the main view, and the left view is displayed during 2D display. However, alternatively, the right view may serve as the main view. A playlist therefore includes information that indicates which one of the left view and the right view serves as the main view and is displayed during 2D playback; which view serves as the main view is determined according to this information. This information is the left-view/right-view identification information.
It is generally considered that a left-view video is generated as 2D video in a studio. However, some may think that it is preferable to create a right-view video as 2D video. Due to this possibility, the left-view/right-view identification information, which indicates which one of the left view and the right view serves as the base view, can be set for each piece of playitem information.
Each stream and the left-view/right-view identification information can be output to the display device, and the display device can use the left-view/right-view identification information to distinguish between the left-view and right-view streams. When the shutter glasses are used, it is necessary to recognize which one of the left-view video and the right-view video is the main video that is to be referenced by the playitem, in order to synchronize the operation of the shutter glasses with the display of the display device. Therefore, switch signals are sent to the shutter glasses so that the glass over the left eye becomes transparent during display of the left-view video while the glass over the right eye becomes transparent during display of the right-view video.
The distinction between the left view and the right view is also necessary even in the naked-eye stereoscopic view method in which the display device has a screen embedded with a prism, such as a lenticular lens. Therefore, the left-view/right-view identification information is also utilized when this method is used.
This concludes the description of the left-view/right-view identification information. The left-view/right-view identification information is provided on the assumption that either the left-view video or the right-view video among the parallax images can be played back as 2D video. However, some parallax images may not be suited for use as 2D images depending on their types.
The following describes left- and right-view images that are not suited for use as 2D images.
In
The “00003.mpls” specifies the video stream representing center images, using the main path. The movie object in the upper-left corner of
This concludes the description of implementation of the recording medium and recording method. The following describes the playback device in detail.
As with a 2D playback device, the BD-ROM drive 1 reads out data from a BD-ROM disc based on a request from the playback control unit 7. An AV stream file read out from the BD-ROM disc is transferred to the read buffer 2a or 2b.
When playing back 3D video, the playback control unit 7 issues a read request that instructs the BD-ROM drive 1 to read out the 2D/left-view AV stream and the right-view AV stream alternately on a per-extent basis. The BD-ROM drive 1 reads out extents constituting the 2D/left-view AV stream into the read buffer 2a, and reads out extents constituting the right-view AV stream into the read buffer 2b. When playing back 3D video, the BD-ROM drive 1 should have a higher reading speed than when playing back 2D video, since it is necessary to read out both the 2D/left-view AV stream and the right-view AV stream simultaneously.
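A minimal sketch of this alternating readout, in Python, is given below; the class and buffer names are hypothetical stand-ins for the BD-ROM drive 1 and the read buffers 2a and 2b.

```python
class FakeDrive:
    """Stand-in for the BD-ROM drive 1."""
    def read(self, extent):
        return f"data({extent})"

def read_3d_extents(drive, left_extents, right_extents, buf_2a, buf_2b):
    """Read extents of the 2D/left-view AV stream and the right-view
    AV stream alternately: left extents go to read buffer 2a, right
    extents to read buffer 2b. (Which of the two is read first within
    each pair depends on the extent start type, described later.)"""
    for left, right in zip(left_extents, right_extents):
        buf_2a.append(drive.read(left))
        buf_2b.append(drive.read(right))

buf_2a, buf_2b = [], []
read_3d_extents(FakeDrive(), ["L1", "L2"], ["R1", "R2"], buf_2a, buf_2b)
print(buf_2a, buf_2b)  # ['data(L1)', 'data(L2)'] ['data(R1)', 'data(R2)']
```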
The read buffer 2a is a buffer that may be realized by, for example, a dual-port memory, and stores data of the 2D/left-view AV stream read out by the BD-ROM drive 1.
The read buffer 2b is a buffer that may be realized by, for example, a dual-port memory, and stores data of the right-view AV stream read out by the BD-ROM drive 1.
The switch 3 is used to switch the source of data to be input into the read buffers, between the BD-ROM drive 1 and the local storage 18.
The system target decoder 4 decodes the streams by performing demultiplexing processing on the source packets read out into the read buffers 2a and 2b.
The plane memory set 5a is composed of a plurality of plane memories. The plane memories include a left-view video plane, a right-view video plane, a secondary video plane, an interactive graphics plane (IG plane), and a presentation graphics plane (PG plane).
The plane composition unit 5b instantaneously superimposes images in the left-view video plane, right-view video plane, secondary video plane, IG plane, PG plane, and GFX plane, and displays the superimposed images onto a screen such as a TV screen. In displaying such superimposed images, the plane composition unit 5b crops the images in a set of the secondary video plane, PG plane, and IG plane for the left view and the right view alternately, and composites the cropped images with the left- or right-view video plane. The composited images are transferred to the GFX plane for superimposition processing.
The plane composition unit 5b crops graphics in the IG plane for the left view and the right view alternately, by using the offset information specified by the API, and outputs, to the television, an image generated by superimposing the images in the left- or right-view video plane, the secondary video plane, the PG plane, and the IG plane.
The superimposed image is output to the television or the like in compliance with the 3D method. When it is necessary to play back the left-view images and the right-view images alternately by using the shutter glasses, the images are output as they are. When the superimposed images are output to, for example, a television with a lenticular lens, left- and right-view images are transferred and stored into a temporary buffer in this order; once the two images have been stored, they are output simultaneously.
The HDMI transmission/reception unit 6 includes an interface conforming to, for example, the HDMI standard. The HDMI transmission/reception unit 6 performs data transmission/reception so that the playback device and a device (in the present embodiment, the television 300) connected to the playback device using the HDMI connection conform to the HDMI standard. The picture data stored in the video planes and the uncompressed audio data decoded by the audio decoder are transferred to the television 300 via the HDMI transmission/reception unit 6. The television 300 holds, for example, (i) information indicating whether or not it supports the stereoscopic display, (ii) information regarding resolution with which the 2D display can be performed, and (iii) information regarding resolution with which the stereoscopic display can be performed. Upon receiving a request from the playback device via the HDMI transmission/reception unit 6, the television 300 returns the requested necessary information (e.g., the above pieces of information (i), (ii) and (iii)) to the playback device. In this way, the playback device can obtain information indicating whether or not the television 300 supports the stereoscopic display from the television 300 via the HDMI transmission/reception unit 6.
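The capability inquiry can be sketched as follows in Python; the class and method names are purely illustrative and are not part of the HDMI standard or of any actual device interface.

```python
class Television:
    """Stand-in for the television 300 and the information it holds."""
    def __init__(self, supports_3d, resolutions_2d, resolutions_3d):
        self._info = {
            "supports_3d": supports_3d,        # item (i) above
            "resolutions_2d": resolutions_2d,  # item (ii) above
            "resolutions_3d": resolutions_3d,  # item (iii) above
        }

    def query(self, key):
        """Return the requested information to the playback device."""
        return self._info[key]

tv = Television(True, ["1920x1080"], ["1920x2160"])
if tv.query("supports_3d"):
    print("the display supports stereoscopic display")
```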
When the program execution unit 11 or the like instructs the playback control unit 7 to play back a 3D playlist, the playback control unit 7 identifies a 2D/left-view AV stream of a playitem that is the playback target in the 3D playlist, and identifies a right-view AV stream of a sub-playitem in the 3D subpath that should be played back in synchronization with the playitem. Thereafter, the playback control unit 7 interprets the entry map of the corresponding clip information file, and requests the BD-ROM drive 1 to alternately read out extents of the 2D/left-view AV stream and the right-view AV stream, starting with the playback start point, based on the extent start type that indicates which one of the extents of the 2D/left-view AV stream and the right-view AV stream is disposed first. When the playback is started, the first extent is read out into the read buffer 2a or 2b. Once this readout has been completed, the first extent is transferred from the read buffer 2a or 2b to the system target decoder 4. When playing back the 3D playlist, the playback control unit 7 notifies the plane composition unit 5b of the 3D metadata included in the clip information file that corresponds to the 2D/left-view AV stream.
In performing the aforementioned control, the playback control unit 7 can read out a file into the memory by performing a system call for a file open.
The file open denotes a process in which the file system (i) searches for a directory using a file name that is given upon performing the system call, (ii) secures a File Control Block (FCB) if the file is found, and (iii) returns the number of the file handle. The FCB is generated by copying, into the memory, the contents of the directory entry of the target file. Afterward, the playback control unit 7 can transfer the target file from the BD-ROM to the memory by presenting this file handle to the BD-ROM drive 1.
The playback engine 7a executes AV playback functions. The AV playback functions denote a group of traditional functions inherited from CD and DVD players. Here, the AV playback functions are processes such as starting playback, stopping playback, pausing, canceling a pause, canceling the still-image function, fast forward at a specified playback speed, rewind at a specified playback speed, switching audio, switching picture data for secondary video, switching angles, etc.
The playback control engine 7b executes playlist playback functions in response to a function call from a command interpreter (operator of the HDMV mode) and a Java® platform (operator of the BD-J mode). The playlist playback functions are processing of performing, from among the aforementioned AV playback functions, the playback start and the playback stop in accordance with the current playlist information constituting the current playlist and the current clip information.
The management information memory 9 is a memory for storing the current playlist information and the current clip information. The current playlist information is a piece of playlist information that is currently being the processing target, from among a plurality of pieces of playlist information that can be accessed from the BD-ROM, built-in medium drive, or removable medium drive. The current clip information is a piece of clip information that is currently being the processing target, from among a plurality of pieces of clip information that can be accessed from the BD-ROM, built-in medium drive, or removable medium drive.
The register set 10 (a player status/setting register set) is a set of registers including: a player status register for storing a playlist playback status; a player setting register for storing configuration information indicating the configuration of the playback device; and a general-purpose register for storing arbitrary information that is to be used by contents. Here, the playlist playback status indicates, for example, the AV data that is being used from among various pieces of AV data information described in the playlist, and a position (time) at which a portion of the playlist which is currently being played back exists.
When the playlist playback status has changed, the playback control engine 7b stores the changed playlist playback status into the register set 10. Also, in accordance with an instruction issued from an application run by the command interpreter (operator of the HDMV mode) and the Java® platform (operator of the BD-J mode), a value specified by the application may be stored, and the stored value may be transferred to the application.
The program execution unit 11 is a processor for executing a program stored in a BD program file. The program execution unit 11 performs the following controls by operating in accordance with the stored program: (1) instructing the playback control unit 7 to play back a playlist; and (2) transferring, to the system target decoder, PNGs and JPEGs for a menu or graphics for a game, so that they can be displayed on the screen. These controls can be performed freely in accordance with the construction of the program; how they are performed is determined by the programming of the BD-J application in the authoring process.
The program memory 12 stores a current dynamic scenario, and is used for processing performed by the HDMV module (operator of the HDMV mode) and the Java® platform (operator of the BD-J mode). The current dynamic scenario is one of the Index.bdmv, BD-J object, and movie object recorded on the BD-ROM which is currently being targeted for execution. The program memory 12 includes a heap memory.
The heap memory is a stack region for storing byte codes of the system application, byte codes of the BD-J application, system parameters used by the system application, and application parameters used by the BD-J application.
The HDMV module 13 is a DVD virtual player that is an operator of the HDMV mode. The HDMV module 13 is also an executor of the HDMV mode. The HDMV module 13 has a command interpreter, and performs the control in the HDMV mode by interpreting and executing the navigation command constituting the movie object. The navigation command is described in a syntax that resembles a syntax used in the DVD-Video. Accordingly, it is possible to realize a DVD-Video-like playback control by executing the navigation command.
The BD-J platform 14 is a Java® platform that is an operator of the BD-J mode, and is fully implemented with Java 2 Platform, Micro Edition (J2ME) Personal Basis Profile (PBP 1.0) and Globally Executable MHP specification (GEM 1.0.2) for package media targets. The BD-J platform 14 is composed of a class loader, a byte code interpreter, and an application manager.
The class loader is one of system applications, and loads a BD-J application by reading out byte codes from the class file existing in the JAR archive file, and storing the byte codes into the heap memory.
The byte code interpreter is what is called a Java® virtual machine. The byte code interpreter converts (i) the byte codes constituting the BD-J application stored in the heap memory and (ii) the byte codes constituting the system application, into native codes, and causes the MPU to execute the native codes.
The application manager is one of system applications, and performs application signaling for the BD-J application (e.g., starts or ends the BD-J application) based on the application management table in the BD-J object. This concludes the internal structure of the BD-J platform.
The middleware 15 is an operating system for the embedded software, and is composed of a kernel and a device driver. The kernel provides the BD-J application with a function unique to the playback device, in response to a call for the Application Programming Interface (API) from the BD-J application. The middleware 15 also realizes control of the hardware, such as starting the interruption handler by sending an interruption signal.
The mode management module 16 holds Index.bdmv that was read out from the BD-ROM, built-in medium drive, or removable medium drive, and performs mode management and branch control. The mode management by the mode management module is module assignment to cause either the BD-J platform or the HDMV module to execute the dynamic scenario.
The user event processing unit 17 receives a user operation via a remote control, and causes the program execution unit 11 or the playback control unit 7 to perform processing as instructed by the received user operation. For example, when the user presses a button on the remote control, the user event processing unit 17 instructs the program execution unit 11 to execute a command included in the button. For example, when the user presses a fast forward or rewind button on the remote control, the user event processing unit 17 instructs the playback control unit 7 to execute the fast forward or rewind processing on the AV stream of the playlist currently being played back.
The local storage 18 includes the built-in medium drive for accessing a hard disk and the removable medium drive for accessing a semiconductor memory card, and stores downloaded additional contents, data to be used by applications, and the like. An area for storing the additional contents is divided into small areas that are in one to one correspondence with BD-ROMs. Also, an area for storing data used by applications is divided into small areas that are in one to one correspondence with applications.
The nonvolatile memory 19 is a recording medium that is, for example, a readable/writable memory, and is a medium such as flash memory or FeRAM that can preserve the recorded data even if power is not supplied thereto. The nonvolatile memory 19 is used to store a backup of data stored in the register set 10.
Next, the internal structures of the system target decoder 4 and the plane memory set 5a will be described.
The ATC counter 21 generates an Arrival Time Clock (ATC) for adjusting the operation timing within the playback device.
After a source packet is stored in the read buffer 2a, the source depacketizer 22 transfers a TS packet of the source packet to the PID filter. More specifically, the source depacketizer 22 transfers the TS packet to the PID filter according to the recording rate of the AV stream file the moment the value of the ATC generated by the ATC counter and the value of the ATS of the source packet become identical. In transferring the TS packet, the source depacketizer 22 adjusts the time of input into the decoder in accordance with the ATS of the source packet.
The PID filter 23 transfers, from among the TS packets output from the source depacketizer 22, TS packets having a PID that matches a PID required for playback, to the primary video decoder 31, the secondary video decoder 34, the IG decoder 38, the PG decoder 36, the primary audio decoder 40, or the secondary audio decoder 41.
The STC counter 24 generates a System Time Clock (STC) for adjusting the operation timing of each decoder.
The ATC counter 25 generates an Arrival Time Clock (ATC) for adjusting the operation timing within the playback device.
After a source packet is stored in the read buffer 2b, the source depacketizer 26 transfers a TS packet of the source packet to the PID filter. More specifically, the source depacketizer 26 transfers the TS packet to the PID filter according to the system rate of the AV stream the moment the value of the ATC generated by the ATC counter and the value of the ATS of the source packet become identical. In transferring the TS packet, the source depacketizer 26 adjusts the time of input into the decoder in accordance with the ATS of the source packet.
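The timing rule applied by both source depacketizers (a TS packet is released the moment the ATC value reaches the ATS of its source packet) can be sketched as follows in Python; the tick values are illustrative.

```python
def depacketize(source_packets, atc_ticks):
    """Yield TS packets to the PID filter as the ATC advances: each
    source packet is released the moment the ATC generated by the ATC
    counter reaches the packet's ATS."""
    for atc in atc_ticks:
        while source_packets and source_packets[0]["ats"] <= atc:
            yield source_packets.pop(0)["ts_packet"]

packets = [{"ats": 100, "ts_packet": "TS#0"},
           {"ats": 250, "ts_packet": "TS#1"}]
print(list(depacketize(packets, range(0, 300, 50))))  # ['TS#0', 'TS#1']
```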
The PID filter 27 transfers, from among the TS packets output from the source depacketizer 26, TS packets having a PID that matches a PID written in the stream selection table of the current playitem, to the primary video decoder, in accordance with the PID.
The primary video decoder 31 decodes the left-view video stream, and writes the decoding results, namely uncompressed video frames, into the left-view video plane 32.
The left-view video plane 32 is a plane memory that can store picture data with a resolution of, for example, 1920×2160 (1280×1440).
The right-view video plane 33 is a plane memory that can store picture data with a resolution of, for example, 1920×2160 (1280×1440).
The secondary video decoder 34 has the same structure as the primary video decoder, decodes a secondary video stream input thereto, and writes resultant pictures to the secondary video plane in accordance with respective display times (PTSs).
The secondary video plane 35 stores picture data for the secondary video that is output from the system target decoder 4 as a result of decoding the secondary video stream.
The PG decoder 36 extracts a presentation graphics stream from the TS packets input from the source depacketizer, decodes the extracted presentation graphics stream, and writes the resultant uncompressed graphics data to the PG plane in accordance with respective display times (PTSs).
The PG plane 37 stores an uncompressed graphics object obtained by decoding the presentation graphics stream.
The IG decoder 38 extracts an interactive graphics stream from the TS packets input from the source depacketizer, decodes the extracted interactive graphics stream, and writes the resultant uncompressed graphics object to the IG plane in accordance with respective display times (PTSs).
The IG plane 39 stores graphics data obtained by decoding the interactive graphics stream.
The primary audio decoder 40 decodes the primary audio stream.
The secondary audio decoder 41 decodes the secondary audio stream.
The mixer 42 mixes the decoding result of the primary audio decoder 40 with the decoding result of the secondary audio decoder 41.
The rendering engine 43 decodes graphics data (e.g., JPEG and PNG) used by the BD-J application when rendering a menu.
The GFX plane 44 is a plane memory into which graphics data (e.g., JPEG and PNG) is written after it is decoded.
Next, the internal structure of the primary video decoder 31 will be explained. The primary video decoder 31 is composed of a TB 51, an MB 52, an EB 53, a TB 54, an MB 55, an EB 56, a video decoder 57, a buffer switch 58, a DPB 59, and a picture switch 60.
The Transport Buffer (TB) 51 is a buffer for temporarily storing TS packets containing the left-view video stream as they are, after they are output from the PID filter 23.
The Multiplexed Buffer (MB) 52 is a buffer for temporarily storing PES packets when the video stream is output from the TB to the EB. When the data is transferred from the TB to the MB, the TS headers are removed from the TS packets.
The Elementary Buffer (EB) 53 is a buffer for storing video access units in the encoded state. When the data is transferred from the MB to the EB, the PES headers are removed.
The Transport Buffer (TB) 54 is a buffer for temporarily storing TS packets containing the right-view video stream as they are, after they are output from the PID filter.
The Multiplexed Buffer (MB) 55 is a buffer for temporarily storing PES packets when the video stream is output from the TB to the EB. When the data is transferred from the TB to the MB, the TS headers are removed from the TS packets.
The Elementary Buffer (EB) 56 is a buffer for storing video access units in the encoded state. When the data is transferred from the MB to the EB, the PES headers are removed.
The video decoder 57 generates a frame/field image by decoding each access unit constituting the video elementary stream at predetermined decoding times (DTSs). Since there are various compression encoding methods (e.g., MPEG-2, MPEG-4 AVC, and VC-1) that can be used to compression encode the video stream to be multiplexed on the AV stream file, the decoding method used by the video decoder 57 is selected in accordance with the stream attribute of each stream. When decoding picture data constituting the base-view video stream, the video decoder 57 performs motion compensation using pieces of picture data, which exist in the future and past directions, as reference pictures. When decoding picture data constituting the dependent-view video stream, the video decoder 57 performs motion compensation using pieces of picture data that constitute the base-view video stream as reference pictures. After each picture data is decoded in this way, the video decoder 57 transfers the decoded frame/field image to the DPB 59, and transfers the corresponding frame/field image to the picture switch at the display time (PTS) assigned thereto.
The buffer switch 58 determines from which one of the EB 53 and the EB 56 the next access unit should be extracted, by using the decode switch information obtained when the video decoder 57 decoded the video access units, and transfers a picture from the EB 53 or the EB 56 to the video decoder 57 at the decoding time (DTS) assigned to the video access unit. Since the DTSs of the left- and right-view video streams are set to alternate on the time axis on a per-picture basis, it is preferable that the video access units are transferred to the video decoder 57 on a per-picture basis when, for example, decoding is performed ahead of schedule regardless of the DTSs.
The Decoded Picture Buffer (DPB) 59 is a buffer for temporarily storing the decoded frame/field image. The DPB 59 is used by the video decoder 57 to refer to the decoded pictures when the video decoder 57 decodes a video access unit such as the P-picture or the B-picture having been encoded by the inter-picture predictive encoding.
When the decoded frame/field image transferred from the video decoder 57 is to be written into a video plane, the picture switch 60 switches the writing destination between the left-view video plane and the right-view video plane. When the left-view stream is being processed, uncompressed picture data is instantaneously written into the left-view video plane. When the right-view stream is being processed, uncompressed picture data is instantaneously written into the right-view video plane.
The plane memory set 5a includes a left-view video plane, a right-view video plane, a secondary video plane, a PG plane, an IG plane, and a GFX plane, which are arranged in the stated order. The system target decoder 4 writes image data into the left- or right-view video plane at the timing shown by the corresponding PTS.
Based on the value set to the PSR 22 and the duplicate flag assigned to the playitem currently being played back, the switch 62 connects to the left- or right-view video plane, and establishes a connection path with the connected plane so as to receive data via the connection path. Once the switch 62 has selected one of the planes that is connected thereto along the connection path, the switch 62 transfers data received from the selected plane. The transferred data is superimposed with data in the secondary video plane, the PG plane and the IG plane.
In this method, different contents are stored into the left- and right-view video planes to realize the stereoscopic view. However, even if the same content is stored into the left- and right-view video planes, it is possible to realize pseudo stereoscopic view by assigning different coordinates to the pixels in the left- and right-view video planes. Among the above-described plane memories, the PG plane realizes stereoscopic view by changing the coordinates of pixels in the plane memory. The following describes how the stereoscopic view is realized by the PG plane.
A description is now given of a method of compositing planes, by taking an example of the PG plane shown in
When the image plane to be superimposed is the right-view video plane, the plane composition unit 5b shifts the coordinates of the image data stored in the PG plane towards the negative direction along the X-axis by the offset value. The plane composition unit 5b then crops the PG plane in such a manner that the cropped PG plane would fit within the right-view video plane, and superimposes the cropped PG plane (see the lower row of
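The shift-and-crop operation described above can be sketched as follows in Python; the plane is modeled as rows of pixels, with 0 standing for a transparent pixel, and the shift directions follow the description above (+X for the left view, -X for the right view).

```python
def shift_and_crop(pg_plane, offset, view):
    """Shift each row of the PG plane horizontally by offset pixels,
    crop what falls outside the video plane, and pad the vacated
    pixels with transparency (0)."""
    pad = [0] * offset
    shifted = []
    for row in pg_plane:
        if view == "left":                       # shift towards +X
            shifted.append(pad + row[:len(row) - offset])
        else:                                    # shift towards -X
            shifted.append(row[offset:] + pad)
    return shifted

plane = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(shift_and_crop(plane, 1, "left"))   # [[0, 1, 2, 3], [0, 5, 6, 7]]
print(shift_and_crop(plane, 1, "right"))  # [[2, 3, 4, 0], [6, 7, 8, 0]]
```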
This concludes the description of plane composition. The following describes an internal structure of the register set 10 and the detail of the playback control engine 7b.
The left-hand side of
The values stored in the PSRs (see
First, representative PSRs will be described.
PSR 1 is a stream number register for the audio stream, and stores a current audio stream number.
PSR 2 is a stream number register for the PG stream, and stores a current PG stream number.
PSR 4 is set to a value ranging from “1” to “100” to indicate a current title number.
PSR 5 is set to a value ranging from “1” to “999” to indicate a current chapter number; and is set to a value “0xFFFF” to indicate that the chapter number is invalid in the playback device.
PSR 6 is set to a value ranging from “0” to “999” to indicate a current playlist number.
PSR 7 is set to a value ranging from “0” to “255” to indicate a current playitem number.
PSR 8 is set to a value ranging from “0” to “0xFFFFFFFF” to indicate a current playback time point (current PTM) with the time accuracy of 45 kHz.
PSR 10 is a stream number register for the IG stream, and stores a current IG stream number.
PSR 21 indicates whether or not the user intends to perform stereoscopic playback. A value is set to the PSR 21 via the navigation command of the BD program file, API, or the OSD of the player. The remote control 500 has a “2D/3D switch button”. When the user event processing unit 17 issues a notification that the 2D/3D switch button has been pressed at an arbitrary timing (e.g., during playback of a 3D playlist), the value of the PSR 21 is reset—i.e., changed from a value indicating stereoscopic playback to a value indicating 2D playback, or vice versa. This way, the user's preference can be taken into consideration.
PSR 22 indicates a display type value.
PSR 23 is used to set “Display Capability for 3D”. This indicates whether or not the display device connected to the playback device is capable of performing stereoscopic playback.
PSR 24 is used to set “Player Capability for 3D”. This indicates whether or not the playback device is capable of performing stereoscopic playback. The “Player Capability for 3D” stored in the PSR 24 means 3D playback capability of the playback device as a whole, and thus may be simply referred to as “3D-Capability”.
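For reference, the representative PSRs listed above can be sketched as a simple register file in Python; the stored values are illustrative only.

```python
psr = {
    1:  1,       # current audio stream number
    2:  1,       # current PG stream number
    4:  1,       # current title number (1-100)
    5:  0xFFFF,  # current chapter number (0xFFFF = invalid)
    6:  0,       # current playlist number (0-999)
    7:  0,       # current playitem number (0-255)
    8:  0,       # current PTM in 45 kHz ticks (0-0xFFFFFFFF)
    10: 1,       # current IG stream number
    21: 1,       # user intends stereoscopic playback
    22: 0,       # display type value
    23: 1,       # Display Capability for 3D
    24: 1,       # Player Capability for 3D ("3D-Capability")
}

def current_ptm_seconds(registers):
    """Convert the current PTM (PSR 8, 45 kHz ticks) to seconds."""
    return registers[8] / 45000.0
```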
On the other hand, the playback control engine 7b includes a procedure execution unit 8 for uniquely determining the display type of the current playlist by referring to the PSR 4, PSR 6, PSR 21, PSR 23, and PSR 24 in the register set 10, and the stream selection table of the current playlist information in the management information memory 9.
When the value of the PSR 21 is changed during playback of a 3D playlist, the procedure execution unit 8 resets the display type value of the PSR 22 by following the processing procedure shown in
This concludes the description of the register set 10.
Note, the playback device 200 of the present embodiment is a 2D/3D playback device capable of playing back 3D video. However, a 2D playback device plays back 2D video by referring only to a 2D playlist describing a playback path along which 2D video is played back.
A description is now given of a mechanism of a 2D playback device to play back a 2D playlist.
In S1, the value of the PSR 24 is checked. When the value of the PSR 24 is “0”, it means that the playback device is a 2D playback device, and therefore 2D video is played back. When the value of the PSR 24 is “1”, processing moves to S2.
In S2, the program displays a menu screen and makes an inquiry to the user as to whether he/she desires playback of 2D video or 3D video. The user selects one of the 2D and 3D videos via, for example, a remote control. When the user desires playback of the 2D video, the 2D playlist is played back. When the user desires playback of the 3D video, processing moves to S3.
In S3, the program checks whether the display supports playback of the 3D video. For example, the program connects the playback device to the display using the HDMI connection, so that the playback device can make an inquiry to the display as to whether the display supports playback of the 3D video. When the display does not support playback of the 3D video, the 2D playlist is played back. Here, alternatively, the program may present, on the menu screen or the like, a notification for notifying the user that the television does not support playback of the 3D video. When the display supports playback of the 3D video, the 3D playlist is played back.
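Steps S1 through S3 can be condensed into the following sketch in Python; the function names are illustrative, and the user inquiry and display inquiry are modeled as callables.

```python
def select_playlist(psr24, user_wants_3d, display_supports_3d):
    """S1: a 2D playback device (PSR 24 = 0) plays back the 2D playlist.
    S2: otherwise, ask the user via a menu screen.
    S3: if the user wants 3D, ask the display (over the HDMI
    connection) whether it supports 3D; fall back to the 2D playlist
    if it does not."""
    if psr24 == 0:
        return "2D playlist"
    if not user_wants_3d():
        return "2D playlist"
    if not display_supports_3d():
        return "2D playlist"   # optionally notify the user via a menu
    return "3D playlist"

print(select_playlist(1, lambda: True, lambda: False))  # 2D playlist
```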
Note, the prefix numbers (e.g., XXX in the case of XXX.mpls) given to the file names of the 2D and 3D playlists may be consecutive. This makes it easy to identify the 3D playlist corresponding to the 2D playlist. This concludes the description of selection of 2D and 3D playlists.
<How to Seamlessly Switch between Stereoscopic Playback and 2D Playback>
When the user selects 2D video playback during 3D video playback (stereoscopic playback), it is necessary to switch to 2D playback at an arbitrary timing. Similarly, when the user selects 3D (stereoscopic) video playback during 2D video playback, it is necessary to switch from 2D playback to stereoscopic playback smoothly. One example of the former case is when the user desires to switch from stereoscopic playback to 2D playback due to eye strain.
One method of switching from 3D video playback to 2D video playback is to switch from playback of a 3D playlist, which stores the playback path along which the 3D video is played back, to playback of a 2D playlist, which stores the playback path along which the 2D video is played back. According to this method, for example, while playing back the 3D video from the 3D playlist, the user issues an instruction to switch from the 3D video playback to the 2D video playback via a menu or the like displayed by the BD program file. In response to this instruction, the BD program file halts the playback of the 3D playlist, and specifies/selects the 2D playlist corresponding to this 3D playlist. Then, the BD program file specifies the playback start point of the specified/selected 2D playlist, which corresponds to the time at which the playback of the 3D playlist is halted, and performs jump playback from this specified playback start point. In the above manner, the playback device transitions from the 3D video playback to the 2D video playback. However, use of this method requires processing of (i) halting the playback of the 3D playlist and (ii) executing the BD program file. In other words, use of this method gives rise to the problem that the playback device cannot switch from the 3D video playback to the 2D video playback seamlessly.
Furthermore, when switching to 2D video playback during 3D video playback, the playback device needs to transition from the processing of alternately outputting pictures of the left- and right-view video streams to the processing of only outputting pictures of the left-view video stream. More specifically, the frame rate at which the pictures are output has to change from a 48 Hz frame rate to a 24 Hz frame rate. As a result, the playback device and the television must re-establish synchronization with each other (i.e., reset the HDMI connection therebetween). That is, delay is caused when switching from the 3D video playback to the 2D video playback. In view of this problem, there needs to be a mechanism that does not make the playback device change the frame rate.
Described below is a method of seamlessly switching between 3D (stereoscopic) playback and 2D playback.
As shown in
The value of the PSR 22 is changed when the PSR 21 indicating the display type set by the user is changed via, for example, the navigation command of the BD program file, API, OSD of the player, or button operations of the remote control 500. This enables a status change during playback. In the example of
With the above structure, there is no need to switch from one playlist to another upon switching from 3D video playback to 2D video playback. During the 3D video playback, the user can dynamically switch to the 2D video playback. Furthermore, as each picture of the 2D video is played back in duplicate, it is necessary neither to play back pictures of the 3D video nor to change the frame rate. This makes it possible to seamlessly switch between 3D video playback and 2D video playback without causing any delay.
Operations of the plane composition unit 5b make it possible to seamlessly switch between 3D (stereoscopic) playback and 2D playback as shown in
Meanwhile, the switch 62 of the plane composition unit 5b follows the procedure shown in
After the switch 62 has performed the switch control, data stored in each plane of the plane memory set 5a is read, transferred, and subjected to the superimposition processing. Subsequently, the current view is changed (Step S105), and processing moves to Step S101. The plane composition unit 5b repeats the sequence of processing of Steps S101 through S105 at 48 Hz. Note, the current view is a left view at the start of playback of a video stream. Each time processing of Step S105 is executed, the current view is changed alternately, i.e., the left view is changed to the right view, and vice versa.
The above switch control performed by the switch 62 allows (i) outputting 3D video at a 48 Hz frame rate when the PSR 22 indicates the “L-R display type”, and (ii) outputting 2D video at a 48 Hz frame rate when the PSR 22 indicates the “L-L display type”.
Note that it is possible to seamlessly switch between the 3D (stereoscopic) playback and the 2D playback as shown in
The playback control unit 7 notifies the primary video decoder 31 of the display type set to the PSR 22. When the display type is the L-R display type, the picture switch 60 outputs each of the decoded pictures of the left- and right-view video streams, which have been transferred from the video decoder 57, to the corresponding video plane at the timing shown by the corresponding PTS. When the display type is the L-L display type, the picture switch 60 outputs (i) the decoded picture of the left-view video stream, which has been transferred from the video decoder 57, to the left-view video plane at the timing shown by the corresponding PTS, and (ii) this decoded picture of the left-view video stream to the right-view video plane at the timing obtained by adding the 3D display delay to said PTS.
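The behavior of the picture switch 60 for the two display types can be sketched as follows in Python; the PTS values (in 45 kHz ticks, roughly one 48 Hz frame apart) and the data structures are illustrative.

```python
def output_pictures(display_type, left_pics, right_pics, delay_3d):
    """Yield (video plane, picture, output time) at a constant 48 Hz.
    L-R type: alternate left and right pictures at their own PTSs.
    L-L type: write each left picture twice, the second copy delayed
    by the 3D display delay, so the output frame rate never changes."""
    for left, right in zip(left_pics, right_pics):
        yield ("left plane", left["data"], left["pts"])
        if display_type == "L-R":
            yield ("right plane", right["data"], right["pts"])
        else:  # "L-L": duplicate the left picture
            yield ("right plane", left["data"], left["pts"] + delay_3d)

lefts = [{"pts": 0, "data": "L0"}, {"pts": 1875, "data": "L1"}]
rights = [{"pts": 937, "data": "R0"}, {"pts": 2812, "data": "R1"}]
for out in output_pictures("L-L", lefts, rights, 937):
    print(out)
```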
The above structure enables the seamless switching between the 3D (stereoscopic) playback and the 2D playback.
As set forth above, when the playback device is of the L-R display type, the present embodiment allows alternately outputting pieces of picture data obtained from the left- and right-view video streams. On the other hand, when the playback device is of the L-L display type, the present embodiment allows outputting each piece of picture data obtained from the left-view video stream twice in succession. This way, the 3D video, which is output when the display type is the L-R display type, and the 2D video, which is output when the display type is the L-L display type, are output at the same frame rate. Consequently, there is no need to re-establish synchronization, or reset the HDMI connection, between the playback device and the display device upon switching between the 3D video and the 2D video, thus enabling seamless playback of the 3D video and the 2D video.
It has been described in the present embodiment that the 2D playback device is configured to play back the left-eye video as 2D video. However, it goes without saying that the present embodiment can still achieve the same effects when the 2D playback device is configured to play back the right-eye video as 2D video.
It has been described above that the present invention is a method of recording 3D video. The present invention may also be utilized when recording high frame rate video. The high frame rate video is composed of (i) odd-numbered frame video, which stores the odd-numbered frames extracted from the high frame rate video, and (ii) even-numbered frame video, which stores the even-numbered frames extracted from the high frame rate video. By recording the odd-numbered frame video and the even-numbered frame video respectively as the 2D/left-eye video and the right-eye video by using the data structure described in the present embodiment, it is possible to obtain the same effects as when the 3D video is recorded. More specifically, with use of a BD-ROM on which the above high frame rate video has been recorded in accordance with the present embodiment, the 2D playback device can play back the odd-numbered frame video, whereas the 2D/3D playback device can play back either the odd-numbered frame video or the high frame rate video. Therefore, such a BD-ROM is compatible with and playable on both the 2D playback device and the 2D/3D playback device.
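The split described above can be sketched as follows in Python, counting frames from 1.

```python
def split_high_frame_rate(frames):
    """Split a high frame rate video into an odd-numbered frame video
    (recorded as the 2D/left-eye video) and an even-numbered frame
    video (recorded as the right-eye video)."""
    odd = frames[0::2]    # frames 1, 3, 5, ...
    even = frames[1::2]   # frames 2, 4, 6, ...
    return odd, even

odd, even = split_high_frame_rate(["f1", "f2", "f3", "f4"])
print(odd, even)  # ['f1', 'f3'] ['f2', 'f4']
```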
As shown in
According to the structure shown in
As shown in
Note, when the playback device transitions into a playback section in which 2D video storing only the left-view video stream is played back while the display type is set to the “L-R display type”, the PSR 22 is changed to the “L-L display type”. This way, the 2D video can be played back at a frame rate at which the 3D video was played back. Consequently, the 3D video playback can be succeeded by the 2D video playback without changing a frame rate.
Also, when the playback device transitions into a playback section in which 2D video storing only the left-view video stream is played back while the display type is set to the “L-R display type”, the duplicate flag assigned to this playback section may be prioritized. For example, the display type may be changed while giving priority to information (e.g., the aforementioned duplicate flag) assigned to the playback section. When the duplicate flag shows “duplicate playback”, the PSR 22 is changed to indicate the “L-L display type”. When the duplicate flag shows “normal playback”, the PSR 22 is changed to indicate the “2D normal playback type”.
An entry map may be configured such that, as shown in
In a case where an extent starts with a TS packet including the head of the I-picture that is located at the start of a GOP constituting a left-view video stream of a left-view AV stream, an entry point must also be created. Similarly, in a case where an extent starts with a TS packet including the head of the picture that is located at the start of a right-eye GOP constituting a right-view video stream of a right-view AV stream, an entry point must also be created.
It has been described above that an extent start flag is added to each entry point of an entry map. However, an entry map also includes a one-bit flag called an angle switch flag, which indicates a timing to switch to a different angle during multi-angle playback. In order to reduce the bit size, the extent start flag may be combined with the angle switch flag. In this case, the entry map header information may be provided with a flag indicating whether this one-bit field is the “extent start flag” or the “angle switch flag”. Here, the 2D/3D playback device interprets the meaning of this one-bit field in the entry map by checking said flag provided to the entry map header information, and switches to the proper processing according to the result of this check.
It has been described above that an extent start flag is added to each entry point of an entry map. The present invention, however, is not limited to this. The extent start flag may be replaced by any information that can identify the extent sizes of the corresponding AV stream. For example, the extent sizes of the corresponding AV stream may be listed and stored in a clip information file as metadata. Alternatively, a sequence of bits may be reserved in one to one correspondence with entry points of an entry map, so as to indicate that (i) each entry point is at the start of an extent when the corresponding bit shows “1”, and (ii) each entry point is not at the start of an extent when the corresponding bit shows “0”.
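The bit-sequence alternative mentioned last can be sketched as follows in Python; the packing format is one possible encoding, not a specified one.

```python
def pack_extent_start_flags(flags):
    """Pack one bit per entry point ("1" = the entry point is at the
    start of an extent) into bytes, least significant bit first."""
    bits = 0
    for i, flag in enumerate(flags):
        if flag:
            bits |= 1 << i
    return bits.to_bytes((len(flags) + 7) // 8, "little")

def entry_point_starts_extent(packed, ep_id):
    """Check the bit reserved for the entry point with this EP_ID."""
    return bool(packed[ep_id // 8] >> (ep_id % 8) & 1)

packed = pack_extent_start_flags([True, False, False, True])
print(entry_point_starts_extent(packed, 3))  # True
```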
This Second Embodiment section discusses a playback method and a playback device to be utilized in a case where a picture cannot be decoded due to damage during playback of 3D video recorded on a BD-ROM.
The upper row of
In light of the above problem, a playback device pertaining to Second Embodiment displays pictures during 3D video playback as shown in the lower row of
The following describes a playback device that has been modified to execute the above method. The playback device pertaining to the present embodiment comprises a modified version of the primary video decoder included in the playback device 200 described in First Embodiment.
As one modification of the present embodiment, the playback device may display pictures during 3D video playback as shown in the lower row of
As another modification, the 2D/3D playback device may display pictures during 3D video playback as shown in the lower row of
As yet another modification, the 2D/3D playback device may display pictures during 3D video playback as shown in the lower row of
As yet another modification, the 2D/3D playback device may display pictures during 3D video playback as shown in the lower row of
As yet another modification, the 2D/3D playback device may display pictures during 3D video playback as shown in the lower row of
The above has discussed the playback method and the playback device to be utilized in a case where pictures cannot be decoded due to damage. Although it has been described that the damaged pictures 6501 are included in the right-view video stream, it goes without saying that the above-described structures are also applicable in a case where the damaged pictures 6501 are included in the left-view video stream (in this case, the processing performed with respect to the left-view video stream and the processing performed with respect to the right-view video stream are interchanged). It should be noted that, as has been described in the present embodiment, in a case where the pictures of the right-view video stream are configured to refer to the pictures of the left-view video stream, the picture of the right-view video stream that corresponds to a damaged picture of the left-view video stream would, in effect, also become a damaged picture. For this reason, when the left-view video stream includes one or more damaged pictures, it is effective to use methods of replacing both the damaged picture(s) and the damaged picture counterpart(s) with other pictures; i.e., it is effective to use the methods explained with reference to
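As a minimal sketch of such a replacement (assuming the strategy of substituting the most recent correctly decoded pair; the names are illustrative and the pixel data is omitted):

    #include <stdbool.h>

    typedef struct { long pts; bool damaged; /* pixel data omitted */ } Picture;

    /* Output one L/R pair; if either picture is damaged, both views are
     * replaced so that the pair stays consistent and the frame rate is
     * unchanged. */
    void output_pair(Picture *l, Picture *r, Picture *last_l, Picture *last_r)
    {
        if (l->damaged || r->damaged) {
            *l = *last_l;   /* damage in the base view also corrupts the */
            *r = *last_r;   /* dependent view that references it         */
        } else {
            *last_l = *l;
            *last_r = *r;
        }
        /* ... hand *l and *r to the plane composition in L, R order ... */
    }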
This Third Embodiment section discusses pause processing of pausing 3D video recorded on a BD-ROM.
As shown in
The following describes a 2D/3D playback device whose functions have been extended to execute the above methods of performing pause processing during 3D video playback.
Firstly, below is a description of a 2D/3D playback device whose functions have been extended to execute the first method of displaying a picture of one of the 2D/left-eye video and the right-eye video at the time of issuance of the pause instruction. Upon receiving the pause instruction from the user event processing unit 17 or the program execution unit 11, the playback control unit 7 of the 2D/3D playback device 200 issues a decode cease instruction to the BD-ROM drive 1 and the system target decoder 4. Upon receiving the decode cease instruction, the BD-ROM drive 1 ceases read-out of data from the BD-ROM disc. Upon receiving the decode cease instruction, the system target decoder 4 ceases decoding as well as outputting of audio to the speaker. Here, if a picture is being written into the left- or right-view video plane at the time of receiving the decode cease instruction, the system target decoder 4 waits until writing of this picture is completed, and notifies the plane composition unit 5b of the pause status. As shown in
Secondly, below is a description of a 2D/3D playback device whose functions have been extended to execute the second method of displaying a pair of pictures of the 2D/left-eye video and the right-eye video displayed at the time of issuance of the pause instruction. Upon receiving the pause instruction from the user event processing unit 17 or the program execution unit 11, the playback control unit 7 of the 2D/3D playback device 200 issues a decode cease instruction to the BD-ROM drive 1 and the system target decoder 4. Upon receiving the decode cease instruction, the BD-ROM drive 1 ceases read-out of data from the BD-ROM disc. Upon receiving the decode cease instruction, the system target decoder 4 ceases decoding as well as outputting of audio to the speaker. Here, if a picture is being written to the left- or right-view video plane at the time of receiving the decode cease instruction, the system target decoder 4 waits until writing of this picture is completed. Also, if the last picture has been output to the left-view video plane (i.e., the last picture that has been output is of the left-view video stream), the system target decoder 4 further waits until a picture of the right-view video stream, which is to be displayed paired with the last picture output to the left-view video plane, is decoded and output to the right-view video plane. The system target decoder 4 then notifies the plane composition unit 5b of the pause status. Upon receiving the pause status from the system target decoder, the plane composition unit 5b alternately performs (i) processing of superimposing the left-view video plane with other planes and (ii) processing of superimposing the right-view video plane with other planes, at intervals equivalent to the frame rate at which 3D video is played back.
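The alternating output performed by the plane composition unit 5b during the pause can be sketched as follows; compose_and_output and the POSIX usleep timing source are placeholders for the unit's actual processing:

    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    typedef struct { const char *name; } VideoPlane;

    static void compose_and_output(const VideoPlane *p)
    {
        printf("superimpose %s video plane with other planes\n", p->name);
    }

    /* While paused, alternate the left- and right-view video planes at
     * intervals equivalent to the 3D playback frame rate. */
    void pause_output(const VideoPlane *left, const VideoPlane *right,
                      unsigned frame_interval_us, volatile bool *paused)
    {
        bool show_left = true;
        while (*paused) {
            compose_and_output(show_left ? left : right);
            show_left = !show_left;
            usleep(frame_interval_us);   /* e.g. 1000000 / 48 for 48 Hz */
        }
    }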
This concludes the description of the pause processing of pausing the 3D video.
This Fourth Embodiment section discusses the data structure of still images constituting 3D video and a playback method and a playback device for playing back the still images.
First, the following describes the data structure of still images constituting 3D video and a playback method for playing back the still images.
Below is a description of a 2D/3D playback device whose functions have been extended to play back still images constituting 3D video. Assume a case where the 2D/3D playback device attempts to play back the left-view video stream including the still images shown in
On the other hand, in a case where the 2D/3D playback device attempts to play back the left- and right-view video streams including the still images shown in
It has been described above that the sequence end code 6402, which is identical to the sequence end code 6401 of the left-view video stream, is stored at the end of the first picture of the right-eye GOP of the right-view video stream. Alternatively, the sequence end code 6402 may have a unique format designed only for the right-view video stream. For example, a new sequence end code may be defined, or the sequence end code 6402 may be defined exclusively for the right-view video stream by the supplementary data of the right-view video stream. Alternatively, the sequence end code 6402 may be replaced by the decode switch information illustrated in
This Fifth Embodiment section discusses a playback method and a playback device for performing special playback of 3D video recorded on a BD-ROM.
Blocks shown in
In order to perform fast-forward playback of the 3D video, it is necessary to play back pairs of (i) I-pictures indicated by the entry points of the left-view video stream and (ii) pictures indicated by the entry points of the right-view video stream. At this time, as shown by a playback path of the 3D video in
In view of the above problem, as shown in
Taking an example of the first I-picture shown in
In order to realize the above structure, the left-view AV stream may be assigned PIDs for special playback and store I-pictures of the right-view video stream in correspondence with these PIDs, as shown in
Also, in order to realize the above structure, the left-view AV stream may store pictures indicated by the entry points of the right-view video stream for special playback, as shown in
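As a hedged sketch of the pairing step implied above (entry maps are simplified to PTS/SPN pairs and all names are hypothetical), the right-view picture to be read together with a left-view I-picture can be located by matching presentation times:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint64_t pts; uint32_t spn; } EntryPoint;

    /* Find the right-view entry point paired with a left-view I-picture,
     * i.e. the one sharing the same PTS. Returns NULL when absent. */
    const EntryPoint *paired_right_entry(const EntryPoint *right_map,
                                         size_t right_count, uint64_t left_pts)
    {
        for (size_t i = 0; i < right_count; i++)
            if (right_map[i].pts == left_pts)
                return &right_map[i];
        return NULL;
    }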
This Sixth Embodiment section discusses a recording device for performing the recording method described in First Embodiment.
When the recording method is to be realized by the real-time recording technology, the recording device for performing the recording method creates an AV stream file in real time and records the AV stream file on the BD-RE, BD-R, hard disk, or semiconductor memory card.
In this case, the AV stream file may be a transport stream obtained by the recording device encoding an analog input signal in real time, or a transport stream obtained by the recording device extracting part of a digitally input transport stream.
The recording device for performing the real-time recording includes: a video encoder for obtaining a video stream by encoding a video signal; an audio encoder for obtaining an audio stream by encoding an audio signal; a multiplexer for obtaining a digital stream in the MPEG-2 TS format by multiplexing the video stream, audio stream, and the like; and a source packetizer for converting TS packets constituting the digital stream in the MPEG-2 TS format into source packets. The recording device stores an MPEG-2 digital stream having been converted into the source packet format into an AV stream file, and writes the AV stream file onto the BD-RE, BD-R, or the like. When the digital stream is written, the control unit of the recording device performs processing of generating the clip information and the playlist information in the memory. More specifically, when the user requests the recording processing, the control unit creates an AV stream file and a clip information file onto the BD-RE or the BD-R.
After this, when the starting position of a GOP in the video stream is detected from the transport stream input from outside the device, or when the GOP of the video stream is created by the encoder, the control unit of the recording device obtains (i) the PTS of the intra picture positioned at the start of the GOP and (ii) the packet number of the source packet that stores the starting portion of the GOP, and additionally writes the pair of the PTS and the packet number into the entry map of the clip information file as a pair of EP_PTS entry and EP_SPN entry. Thereafter, each time a GOP is generated, a pair of EP_PTS entry and EP_SPN entry is additionally written into the entry map of the clip information file. Here, when the starting portion of a GOP is an IDR picture, an “is_angle_change” flag having been set to “ON” is added to a pair of EP_PTS entry and EP_SPN entry. Also, when the starting portion of a GOP is not an IDR picture, an “is_angle_change” flag having been set to “OFF” is added to a pair of EP_PTS entry and EP_SPN entry.
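The entry registration step can be sketched as follows; the fixed-size array and the structure names are illustrative, not the actual clip information layout:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint64_t ep_pts;        /* PTS of the intra picture at the GOP head    */
        uint32_t ep_spn;        /* packet number of the source packet storing
                                 * the starting portion of the GOP             */
        bool     is_angle_change;
    } EpEntry;

    typedef struct { EpEntry entry[4096]; size_t count; } EntryMap;

    /* Called each time the head of a GOP is detected or generated. */
    void register_gop(EntryMap *map, uint64_t pts, uint32_t spn,
                      bool head_is_idr)
    {
        if (map->count >= 4096)
            return;                        /* map full; real code would grow it */
        EpEntry *e = &map->entry[map->count++];
        e->ep_pts = pts;
        e->ep_spn = spn;
        e->is_angle_change = head_is_idr;  /* "ON" only for IDR-headed GOPs */
    }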
Further, the attribute information of a stream in the clip information file is set in accordance with the attribute of the stream to be recorded. After the AV stream file and the clip information have been generated and written onto the BD-RE or the BD-R in the above manner, the playlist information defining the playback path via the entry map in the clip information is generated and written onto the BD-RE or the BD-R. When this process is executed with the real-time recording technology, a hierarchical structure composed of the AV stream, the clip information, and the playlist information is obtained on the BD-RE or the BD-R.
This concludes the description of the recording device for performing the recording method by the real-time recording. Next is a description of the recording device for performing the recording method by the pre-format recording.
The recording device described here is used by the authoring staff in a production studio for distributing movie contents. The recording device of the present invention is used as follows. First, according to the operations of the authoring staff, a digital stream that has been compression encoded in compliance with the MPEG standard, and a scenario describing how a movie title should be played back, are generated. Then, a volume bit stream for a BD-ROM including these data is generated.
The video encoder 501 generates left- and right-view video streams by encoding left- and right-view uncompressed bit map images in accordance with a compression method such as MPEG-4 AVC or MPEG-2. At this time, the right-view video stream is generated by inter-picture predictive encoding that references frames of the left-view video stream. In the process of the inter-picture predictive encoding, the depth information for 3D video is extracted from the motion vectors between the left- and right-view images, and the depth information is written into a frame depth information storage unit 501a. The video encoder 501 extracts motion vectors in units of 8×8 or 16×16 macroblocks, so as to perform image compression with use of the correlated characteristics of pictures.
Assume a case where motion vectors are extracted in units of macroblocks from video that shows a house in the background and a circle in the foreground, as shown in
From the detected motion vectors, depth information is generated on a per-frame basis for when the 3D video is displayed. The depth information is, for example, an image that has the same resolution as a frame and represents the depth of each pixel with eight bits.
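As an illustration only (the linear mapping below is a hypothetical simplification, not the encoder's actual algorithm), a per-macroblock depth value could be derived like this:

    #include <math.h>
    #include <stdint.h>

    /* Map a macroblock's motion-vector magnitude to an 8-bit depth value;
     * larger apparent motion between the left and right images is treated
     * as a nearer object. */
    uint8_t depth_from_motion(int mv_x, int mv_y, double max_magnitude)
    {
        if (max_magnitude <= 0.0)
            return 0;
        double mag = sqrt((double)mv_x * mv_x + (double)mv_y * mv_y);
        if (mag > max_magnitude)
            mag = max_magnitude;
        return (uint8_t)(255.0 * mag / max_magnitude);
    }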
The material creation unit 502 generates streams such as an audio stream, a presentation graphics stream, and an interactive graphics stream, and writes these generated streams into an audio stream storage unit 502a, a presentation graphics stream storage unit 502b, and an interactive graphics stream storage unit 502c, respectively.
The material creation unit 502 creates the audio stream by encoding uncompressed Linear PCM audio and the like by a compression method such as AC3. Other than this, the material creation unit 502 creates a presentation graphics stream in a PG stream format conforming to the BD-ROM standard, based on the subtitle information file including a subtitle image, a display timing, and subtitle effects such as fade-in and fade-out. The material creation unit 502 also creates an interactive graphics stream in a format for the menu screen conforming to the BD-ROM standard, based on the menu file describing bit-map images to be used for the menu, transition of the buttons arranged on the menu, and the display effects.
The scenario generation unit 503 generates a scenario in the BD-ROM format, in accordance with information on each stream generated by the material creation unit 502 and the operations of the authoring staff via the GUI. Here, the scenario means files such as an index file, a movie object file and a playlist file. The scenario generation unit 503 also generates a parameter file describing which stream(s) constitutes each AV stream for realizing the multiplex processing. The data structures of the generated files, namely the index file, the movie object file and the playlist file, are the same as the data structure described in First Embodiment.
The BD program creation unit 504 creates a source code for a BD program file and a BD program in accordance with a request from the user received via a user interface such as the GUI. At this time, the program of the BD program file can use the depth information output from the video encoder 501 to set the depth of the GFX plane.
The multiplex processing unit 505 generates an AV stream file in the MPEG-2 TS format by multiplexing a plurality of streams described in the BD-ROM scenario data, such as the left-view video stream, right-view video stream, video, audio, subtitles, and buttons. When generating the AV stream file, the multiplex processing unit 505 also generates a clip information file that makes a pair with the AV stream file.
The multiplex processing unit 505 generates the clip information file by associating, as a pair, (i) the entry map generated by the multiplex processing unit 505 itself and (ii) attribute information that indicates audio attribute, image attribute and the like for each stream included in the AV stream file. The structure of the clip information file is the same as the structure that has been described in the above embodiments.
The format processing unit 506 generates a disc image in the UDF format (a file system conforming to the BD-ROM standard) by arranging, in the format conforming to the BD-ROM standard, files and directories including the BD-ROM scenario data generated by the scenario generation unit 503, the BD program file created by the BD program creation unit 504, and the AV stream file and the clip information file generated by the multiplex processing unit 505.
At this time, the format processing unit 506 generates 3D metadata for the PG stream, IG stream, and secondary video stream by using the depth information output from the video encoder 501. The format processing unit 506 also (i) automatically sets the arrangement of images on the screen so that they do not overlap with objects of the 3D video, and (ii) adjusts offset values so that depths do not overlap each other. The file layout of the disc image generated in this way is set according to the data structure of the file layout described in First and Second Embodiments. The BD-ROM can be manufactured by converting the generated disc image into data suited for the BD-ROM press processing, and performing the press processing on this data.
(Embodiment as Recording Device for Realizing Managed Copy)
The recording device may have a function to write a digital stream by managed copy.
Managed copy is technology for communicating with a server and enabling execution of a copy only if the copy is authenticated and permitted. Managed copy is utilized when the digital stream, playlist information, clip information, and application program recorded on a read-only recording medium (e.g., a BD-ROM) are to be copied to another optical disc (e.g., BD-R, BD-RE, DVD-R, DVD-RW, or DVD-RAM), a hard disk, or a removable medium (e.g., an SD memory card, Memory Stick, CompactFlash, SmartMedia, or MultiMediaCard). This technology makes it possible to perform various controls, such as limiting the number of backups and permitting a backup only when a fee is charged for it.
When performing a copy from the BD-ROM to the BD-R or BD-RE, if the copy source and the copy destination have the same recording capacity, the managed copy only requires a sequential copy of the bit stream on the BD-ROM from the innermost circumference to the outermost circumference of the BD-ROM.
When the managed copy technology is used to copy from/to media of different types, transcoding is required. Here, “transcoding” denotes processing of adapting the digital stream recorded on the BD-ROM to the application format of the copy-destination medium by converting the format of the digital stream from the MPEG-2 transport stream format to the MPEG-2 program stream format and the like, or by performing re-encoding after lowering the bit rates assigned to video and audio streams. In order to perform transcoding, it is necessary to obtain the AV stream file, clip information and playlist information by performing the above-described real-time recording processing.
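A hedged sketch of the managed-copy gate described above; the server round trip and all names are placeholders, since the actual service defines its own authentication protocol:

    #include <stdbool.h>
    #include <stdio.h>

    /* Placeholder for the server round trip that authenticates the copy
     * and, if required, charges for it. */
    static bool server_permits_copy(const char *content_id)
    {
        (void)content_id;
        return false;   /* deny unless the server explicitly permits */
    }

    /* Execute a managed copy only when the server permits it. Copies
     * between same-capacity BD media proceed sequentially; copies to
     * other media types require transcoding first. */
    bool managed_copy(const char *content_id, bool same_capacity)
    {
        if (!server_permits_copy(content_id))
            return false;
        if (same_capacity)
            puts("sequential copy from innermost to outermost circumference");
        else
            puts("transcode, then record AV stream, clip and playlist info");
        return true;
    }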
(Additional Notes)
The present invention has been described above through the best embodiments that the Applicant acknowledges as of now. However, further improvements or changes can be added regarding the following technical topics. It is to be noted that whether to implement the present invention exactly as indicated by the above embodiments, or to add further improvements or changes to the above embodiments, is optional and may be determined by the subjectivity of a person who implements the present invention.
(Stereoscopic Viewing Methods)
The parallax image method used in First Embodiment displays left- and right-eye images alternately in the time axis direction. Thus, unlike an ordinary 2D movie that is displayed at 24 frames per second, this method needs to display a total of 48 left- and right-eye images per second. Therefore, this method is suitable for use in a display device that can rewrite the screen at relatively high speed. This stereoscopic viewing technique utilizing the parallax image method has been commonly used for attractions at amusement parks and the like; i.e., it has already been technically established. Hence, this technique may be the closest form of technology that could be practically implemented for home use. It should be mentioned that many other methods/techniques have been suggested to realize such stereoscopic viewing utilizing parallax images, such as a two-color separation method. Although the alternate-frame sequencing and the polarization glasses technique are explained in the present embodiment as examples of methods/techniques to realize the stereoscopic viewing, the stereoscopic viewing may be realized using methods/techniques other than these two, as long as it is realized using parallax images.
The lenticular lens in the display device 300 may be replaced with another device (e.g., liquid crystal elements) that has the same function as the lenticular lens. Alternatively, a vertical polarizing filter and a horizontal polarizing filter may be provided for left-eye pixels and right-eye pixels, respectively. Here, stereoscopic viewing can be realized by the viewer viewing the screen of the display device through polarizing glasses including a vertical polarizing filter for the left eye and a horizontal polarizing filter for the right eye.
(Data Structure of Index.bdmv for Storing 3D Video)
It is also possible to provide different types of index files to a 2D playback device and a 3D playback device, instead of providing different types of playlists thereto. In this case, the 2D playback device refers to “Index.bdmv” whereas the 3D playback device selects “Index.3dmv” upon starting the playback.
(Data Structure Used when Dealing with Plurality of Streams)
When there is a plurality of streams, the subpath information may be used as described above, or multi_clip_entries for multi-angle may be used. When the “multi_clip_entries” is used, it is preferable that the UO for changing the angle be prohibited after a proper stream is chosen according to the screen size of the display device, so as not to mistakenly switch to another stream that is dedicated for a different screen size.
(Targets of Application of Left and Right Views)
Not only video streams relating to the main content of the disc but also thumbnail images may be provided separately for the left and right images. Here, as is the case with the video stream, a 2D playback device displays conventional 2D thumbnails, but a 3D playback device outputs left-eye thumbnails and right-eye thumbnails, which have been prepared for 3D playback, according to the corresponding 3D display method.
The same rule applies to the following items: menu images; thumbnail images showing different scenes for chapter search; and reduced images showing different scenes.
(Creating Program of Each Embodiment)
The application program described in each embodiment of the present invention can be created as follows. First, the software developer writes, using a programming language, a source program that achieves each flowchart and functional component. Here, the software developer writes the source program that achieves each functional component by using the class structure, variables, array variables and calls to external functions in accordance with the sentence structure of the programming language.
The written source program is sent to the compiler as files. The compiler translates the source program and generates an object program.
The translation performed by the compiler includes processes such as syntax analysis, optimization, resource allocation, and code generation. In the syntax analysis, the characters, phrases, sentence structure and meaning of the source program are analyzed. The source program is then converted into an intermediate program. In the optimization, the intermediate program is subjected to processing such as the basic block setting, control flow analysis, and data flow analysis. In the resource allocation, to adapt to the instruction sets of the target processor, the variables in the intermediate program are allocated to the register or memory of the target processor. In the code generation, each intermediate instruction in the intermediate program is converted into a program code, and an object program is obtained.
The generated object program is composed of one or more program codes that cause the computer to execute each step of the flowcharts and each procedure of the functional components explained in the above embodiments. There are various types of program codes, such as the native code of the processor and the Java® byte code. There are also various forms in which the steps of the program codes are realized. For example, when the steps can be realized by using external functions, the call statements for calling the external functions are used as the program codes. Program codes that realize one step may belong to different object programs. In the RISC processor in which the types of instructions are limited, each step of the flowcharts may be realized by combining arithmetic operation instructions, logical operation instructions, branch instructions, and the like.
After the object program is generated, the programmer activates a linker. The linker allocates the memory spaces to the object programs and the related library programs, and links them together to generate a load module. The load module is generated under the assumption that it is read by the computer and causes the computer to execute the processing procedures of the flowcharts and the processing procedures of the functional components. The program described here may be recorded on a computer-readable recording medium to be provided to the user.
(How to Describe Data Structure)
Among the above-described data structures, a repetitive structure that has a plurality of pieces of a predetermined type of information can be defined by describing, in the “for” statement, (i) an initial value for the control variable and (ii) a repeat condition. The “Do While” statement may be used as well.
Also, an arbitrary data structure, in which predetermined information is defined when a predetermined condition is satisfied, can be defined by describing, into the “if” statement, (i) the condition to be satisfied and (ii) a variable to be set when the condition is satisfied. The “switch” statement or the “case” statement may be used as well.
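For example, purely as an illustration in C, a repetitive structure and a conditionally defined field could be described as follows:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint32_t ep_pts;
        uint32_t ep_spn;
        uint8_t  extent_start_flag;  /* set by the caller beforehand      */
        uint32_t extent_size;        /* defined only when the flag is set */
    } Entry;

    /* A repetitive structure: the initial value and the repeat condition
     * go into the "for" statement; a conditionally present field goes
     * into the "if" statement. */
    void define_entries(Entry *e, size_t number_of_entries)
    {
        for (size_t i = 0; i < number_of_entries; i++) {
            e[i].ep_pts = 0;
            e[i].ep_spn = 0;
            if (e[i].extent_start_flag == 1) {
                e[i].extent_size = 0;   /* present only under the condition */
            }
        }
    }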
As described above, the data structure of each Embodiment can be described in compliance with the grammar of a high-level programming language. Therefore, the data structure of each Embodiment is subjected to the translation processes performed by the compiler, including the syntax analysis, optimization, resource allocation, and code generation. In an object-oriented language, the data structure described in a high-level programming language is treated as a portion other than the method of the class structure, i.e., as an array-type member variable in the class structure, and constitutes a part of the program. That is to say, the data structure of each Embodiment is converted into computer code, then recorded into a computer-readable recording medium, and becomes a member variable of the program. Since it can be treated in this way, the data structure described up to now is substantially a program.
(Playback of Optical Disc)
The BD-ROM drive is equipped with an optical head that includes a semiconductor laser, a collimator lens, a beam splitter, an objective lens, a collecting lens, and a light detector. The light beams emitted from the semiconductor laser pass through the collimator lens, beam splitter, and objective lens, and are collected on the information surface of the optical disc.
The collected light beams are reflected/diffracted by the optical disc, pass through the objective lens, beam splitter, and collimator lens, and are collected in the light detector. A playback signal is generated depending on the amount of light collected in the light detector.
(Variations of Recording Medium)
The recording medium described in each Embodiment indicates a general package medium as a whole, including the optical disc and the semiconductor memory card. In each Embodiment, it is presumed, as one example, that the recording medium is an optical disc on which necessary data is preliminarily recorded (for example, an existing read-only optical disc such as the BD-ROM or DVD-ROM). However, the present invention is not limited to this. For example, the present invention may be implemented as follows: (i) obtain 3D content that includes the data necessary for implementing the present invention and is distributed by broadcast or via a network; (ii) record the 3D content onto a writable optical disc (for example, an existing writable optical disc such as the BD-RE or DVD-RAM) by using a terminal device having the function of writing onto an optical disc (the function may be embedded in a playback device, or the terminal device may be a device different from a playback device); and (iii) apply the optical disc having recorded thereon the 3D content to the playback device of the present invention.
(Embodiments of Semiconductor Memory Card Recording Device and Playback Device)
The following describes embodiments of a recording device for recording the data structure of each Embodiment into a semiconductor memory, and a playback device for playing back the data recorded in the semiconductor memory.
First, the mechanism for protecting the copyright of data recorded on the BD-ROM will be explained, as background technology.
Some of the data recorded on the BD-ROM may have been encrypted as necessary in view of the confidentiality of the data.
For example, the BD-ROM may contain, as encrypted data, the data corresponding to a video stream, an audio stream, or a stream including these.
The following describes decryption of the encrypted data among the data recorded on the BD-ROM.
The playback device preliminarily stores data (for example, a device key) that corresponds to a key that is necessary for decrypting the encrypted data recorded on the BD-ROM.
On the other hand, the BD-ROM has preliminarily recorded thereon (i) data (for example, a medium key block (MKB) corresponding to the above-mentioned device key) that corresponds to a key that is necessary for decrypting the encrypted data, and (ii) encrypted data (for example, an encrypted title key corresponding to the above-mentioned device key and MKB) that is generated by encrypting the key itself that is necessary for decrypting the encrypted data. Note here that the device key, MKB, and encrypted title key are treated as a set, and are further associated with an identifier (for example, a volume ID) written in an area (called BCA) of the BD-ROM that cannot be copied in general. Here, encrypted data cannot be decrypted if these elements are combined incorrectly. Only if the combination is correct, a key (for example, a title key that is obtained by decrypting the encrypted title key by using the above-mentioned device key, MKB, and volume ID) that is necessary for decrypting the encrypted data can be derived. The encrypted data can be decrypted by using the derived key.
When a playback device attempts to play back a BD-ROM loaded therein, it cannot play back the encrypted data unless the device itself has a device key that makes a pair with (or corresponds to) the encrypted title key and the MKB recorded on the BD-ROM. This is because the key (title key) that is necessary for decrypting the encrypted data has itself been encrypted and is recorded on the BD-ROM as the encrypted title key, and the key that is necessary for decrypting the encrypted data cannot be derived if the combination of the MKB and the device key is not correct.
Conversely, when the combination of the encrypted title key, MKB, device key, and volume ID is correct, the video and audio streams are decoded by the decoder with use of the above-mentioned key (for example, a title key that is obtained by decrypting the encrypted title key by using the device key, MKB, and volume ID) that is necessary for decrypting the encrypted data. The playback device is structured in this way.
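Conceptually, the derivation chain can be sketched as below; the XOR placeholders stand in for the real cryptographic operations, which are considerably more involved:

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t Key;

    /* Placeholder operations; a real implementation uses standardized
     * broadcast-encryption and decryption algorithms. */
    static Key process_mkb(Key device_key, Key mkb)  { return device_key ^ mkb;  }
    static Key bind_volume(Key mk, Key volume_id)    { return mk ^ volume_id;    }
    static Key decrypt_tk(Key k, Key encrypted_tk)   { return encrypted_tk ^ k;  }

    /* Derive the title key only when the device key, MKB and volume ID
     * form a correct combination; otherwise no usable key is obtained. */
    bool derive_title_key(Key device_key, Key mkb, Key volume_id,
                          Key encrypted_title_key, Key *title_key)
    {
        Key mk = process_mkb(device_key, mkb);
        if (mk == 0)
            return false;   /* e.g. the device key has been revoked */
        *title_key = decrypt_tk(bind_volume(mk, volume_id), encrypted_title_key);
        return true;
    }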
This concludes the description of the mechanism for protecting the copyright of data recorded on the BD-ROM. It should be noted here that this mechanism is not limited to being applied to the BD-ROM, but may be applicable to, for example, a readable/writable semiconductor memory (e.g., a portable semiconductor memory such as the SD card) for the implementation.
Described below is the playback procedure to be followed by the semiconductor memory card playback device. In a case where the playback device plays back an optical disc, the playback device is structured to read out data via an optical disc drive, for example. On the other hand, in a case where the playback device plays back a semiconductor memory card, the playback device is structured to read out data via an interface for reading out the data from the semiconductor memory card.
More specifically, the playback device may be structured such that, when a semiconductor memory card is inserted into a slot (not illustrated) provided therein, the playback device and the semiconductor memory card are electrically connected with each other via the semiconductor memory card interface, and the playback device reads out data from the semiconductor memory card via the semiconductor memory card interface.
(Embodiments of Receiving Device)
The playback device explained in each Embodiment may be realized as a terminal device that receives data (distribution data) that corresponds to the data explained in each Embodiment from a distribution server for an electronic distribution service, and records the received data into a semiconductor memory card.
Such a terminal device may be realized by structuring the playback device explained in each Embodiment so as to perform such operations, or may be realized as a dedicated terminal device that is different from the playback device explained in each Embodiment and stores the distribution data into a semiconductor memory card. The following describes a case where the playback device is used. Also, in the following description, an SD card is used as the recording-destination semiconductor memory.
When the playback device is to record distribution data into an SD memory card inserted in a slot provided therein, the playback device first requests a distribution server (not illustrated) that stores distribution data to transmit the distribution data. At this time, the playback device reads out identification information for uniquely identifying the inserted SD memory card (for example, identification information uniquely assigned to each SD memory card, or more specifically, the serial number or the like of the SD memory card), from the SD memory card, and transmits the read-out identification information to the distribution server together with the distribution request.
The identification information for uniquely identifying the SD memory card corresponds to, for example, the volume ID described earlier.
On the other hand, the distribution server stores necessary data (for example, the video stream, the audio stream and the like) in an encrypted state such that the necessary data can be decrypted by using a predetermined key (for example, a title key).
The distribution server holds, for example, a private key so that it can dynamically generate different pieces of public key information respectively in correspondence with identification numbers uniquely assigned to each semiconductor memory card.
Also, the distribution server is structured to be able to encrypt the key (title key) itself that is necessary for decrypting the encrypted data (that is to say, the distribution server is structured to be able to generate an encrypted title key).
The generated public key information includes, for example, information corresponding to the above-described MKB, volume ID, and encrypted title key. With this structure, when, for example, a combination of the identification number of the semiconductor memory card, the public key contained in the public key information which will be explained later, and the device key that is preliminarily recorded in the playback device, is correct, a key (for example, a title key obtained by decrypting the encrypted title key by using the device key, the MKB, and the identification number of the semiconductor memory) necessary for decrypting the encrypted data is obtained, and the encrypted data is decrypted by using the obtained necessary key (title key).
Subsequently, the playback device records the received piece of public key information and distribution data into a recording area of the semiconductor memory card being inserted in the slot thereof.
A description is now given of an example of the method for decrypting and playing back the encrypted data among the data contained in the public key information and distribution data recorded in the recording area of the semiconductor memory card.
The received public key information stores, for example, a public key (for example, the above-described MKB and encrypted title key), signature information, the identification number of the semiconductor memory card, and a device list, which is information regarding devices to be invalidated.
The signature information includes, for example, a hash value of the public key information.
The device list is, for example, information for identifying devices that might perform playback in an unauthorized manner. The information is used to uniquely identify such devices, parts of such devices, and functions (programs) that might perform playback in an unauthorized manner, and is composed of, for example, the device key and the identification number that are preliminarily recorded in the playback device, and the identification number of the decoder provided in the playback device.
The following describes playback of the encrypted data from among the distribution data recorded in the recording area of the semiconductor memory card.
First, before the encrypted data is decrypted using the decryption key, a check is performed on whether or not the decryption key itself can be used.
More specifically, the following checks are conducted.
(1) A check on whether the identification information of the semiconductor memory card contained in the public key information matches the identification number of the semiconductor memory card preliminarily stored in the semiconductor memory card.
(2) A check on whether the hash value of the public key information calculated in the playback device matches the hash value included in the signature information.
(3) A check, based on the information included in the device list, on whether the playback device to perform the playback is authentic (for example, the device key shown in the device list included in the public key information matches the device key preliminarily stored in the playback device).
These checks may be performed in any order.
After the above described checks (1) through (3) are conducted, the playback device performs a control not to decrypt the encrypted data when any of the following conditions is satisfied: (i) the identification information of the semiconductor memory card contained in the public key information does not match the identification number of the semiconductor memory card preliminarily stored in the semiconductor memory card; (ii) the hash value of the public key information calculated in the playback device does not match the hash value included in the signature information; and (iii) the playback device to perform the playback is not authentic.
On the other hand, when all of the conditions: (i) the identification information of the semiconductor memory card contained in the public key information matches the identification number of the semiconductor memory card preliminarily stored in the semiconductor memory card; (ii) the hash value of the public key information calculated in the playback device matches the hash value included in the signature information; and (iii) the playback device to perform the playback is authentic, are satisfied, it is judged that the combination of the identification number of the semiconductor memory, the public key contained in the public key information, and the device key that is preliminarily recorded in the playback device, is correct, and the encrypted data is decrypted by using the key necessary for the decryption (the title key that is obtained by decrypting the encrypted title key by using the device key, the MKB, and the identification number of the semiconductor memory).
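Condensed into a C sketch (the fixed field sizes and a single device-list entry are simplifying assumptions):

    #include <stdbool.h>
    #include <string.h>

    typedef struct {
        unsigned char card_id[16];     /* identification of the memory card  */
        unsigned char hash[32];        /* hash carried in the signature info */
        unsigned char listed_key[16];  /* an entry of the device list        */
    } PublicKeyInfo;

    /* Checks (1)-(3); decryption proceeds only when all three pass. */
    bool may_decrypt(const PublicKeyInfo *pki,
                     const unsigned char *id_on_card,
                     const unsigned char *computed_hash,
                     const unsigned char *player_device_key)
    {
        if (memcmp(pki->card_id, id_on_card, 16) != 0)        /* (1) */
            return false;
        if (memcmp(pki->hash, computed_hash, 32) != 0)        /* (2) */
            return false;
        if (memcmp(pki->listed_key, player_device_key, 16) == 0)
            return false;  /* (3) the player appears in the device list */
        return true;
    }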
When the encrypted data is, for example, a video stream and an audio stream, the video decoder decrypts (decodes) the video stream by using the above-described key necessary for the decryption (the title key that is obtained by decrypting the encrypted title key), and the audio decoder decrypts (decodes) the audio stream by using the above-described key necessary for the decryption.
With such a structure, when devices, parts of the devices, and functions (programs) that might be used in an unauthorized manner are known at the time of the electronic distribution, a device list showing such devices and the like may be distributed. This enables the playback device having received the list to inhibit the decryption with use of the public key information (public key itself) when the playback device includes anything shown in the list. Therefore, even if the combination of the identification number of the semiconductor memory, the public key itself contained in the public key information, and the device key that is preliminarily recorded in the playback device, is correct, a control is performed not to decrypt the encrypted data. This makes it possible to prevent use of the distribution data by an unauthorized device.
It is preferable that the identifier of the semiconductor memory card that is preliminarily recorded in the semiconductor memory card be stored in a highly secure recording area. This is because, when the identification number (for example, the serial number of the SD memory card) that is preliminarily recorded in the semiconductor memory card is tampered with, unauthorized copying can be done easily. More specifically, although unique identification numbers are respectively assigned to semiconductor memory cards, if the identification numbers are tampered with to be the same, the above-described judgment in (1) becomes meaningless, and as many semiconductor memory cards as there are tamperings may be copied in an unauthorized manner.
For this reason, it is preferable that information such as the identification number of the semiconductor memory card be stored in a highly secure recording area.
To realize this, the semiconductor memory card, for example, may have a structure in which: a recording area for recording highly confidential data such as the identifier of the semiconductor memory card (hereinafter, this recording area is referred to as the second recording area) is provided separately from a recording area for recording regular data (hereinafter, this recording area is referred to as the first recording area); a control circuit for controlling access to the second recording area is provided; and the second recording area is accessible only through the control circuit.
For example, data may be encrypted so that the encrypted data is recorded in the second recording area, and the control circuit may be embedded with a circuit for decrypting the encrypted data. In this structure, when an access is made to the second recording area, the control circuit decrypts the encrypted data and returns the decrypted data. As another example, the control circuit may hold information indicating the location where the data is stored in the second recording area, and when an access is made to the second recording area, the control circuit identifies the corresponding storage location of the data, and returns data that is read out from the identified storage location.
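As a sketch of that arrangement (the area sizes and the XOR "decryption" are placeholders for the control circuit's embedded circuit):

    typedef struct {
        unsigned char first_area[1024];  /* regular data, freely accessible */
        unsigned char second_area[64];   /* confidential data, encrypted    */
    } CardStorage;

    /* Only the control circuit may access the second recording area; here
     * a trivial XOR stands in for its embedded decryption circuit. */
    int control_circuit_read_id(const CardStorage *card,
                                unsigned char *out, int len)
    {
        for (int i = 0; i < len && i < 64; i++)
            out[i] = card->second_area[i] ^ 0x5A;  /* placeholder decryption */
        return len < 64 ? len : 64;
    }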
An application, which is running on the playback device and is to record data onto the semiconductor memory card with use of the electronic distribution, issues, to the control circuit via a memory card interface, an access request requesting to access the data (for example, the identification number of the semiconductor memory card) recorded in the second recording area. Upon receiving the request, the control circuit reads out the data from the second recording area and returns the data to the application running on the playback device. The application then sends the identification number of the semiconductor memory card to the distribution server and requests it to distribute the data, such as the public key information and the corresponding distribution data. The public key information and the corresponding distribution data that are sent from the distribution server are recorded into the first recording area.
Also, it is preferable that the application, which is running on the playback device and is to record data onto the semiconductor memory card with use of the electronic distribution, preliminarily check whether or not the application itself has been tampered with before it issues, to the control circuit via the memory card interface, an access request requesting to access the data (for example, the identification number of the semiconductor memory card) recorded in the second recording area. For this check, an existing digital certificate conforming to the X.509 standard, for example, may be used.
Also, the distribution data recorded in the first recording area of the semiconductor memory card may not necessarily be accessed via the control circuit provided in the semiconductor memory card.
(System LSI)
It is desirable that the part of the components of the playback device that is mainly composed of logic devices, such as the system target decoder, the playback control unit 7, and the program execution unit, be realized as a system LSI.
A system LSI is obtained by implementing a bare chip on a high-density substrate and packaging it. A system LSI may also be obtained by implementing a plurality of bare chips on a high-density substrate and packaging them, so that the plurality of bare chips have the outer appearance of one LSI (such a system LSI is called a multi-chip module).
System LSIs come in a QFP (Quad Flat Package) type and a PGA (Pin Grid Array) type. In the QFP-type system LSI, pins are attached to the four sides of the package. In the PGA-type system LSI, a large number of pins are attached to the entire bottom.
These pins function as an interface with other circuits. The system LSI, which is connected with other circuits through such pins as an interface, plays a role as the core of the playback device 200.
Such a system LSI can be embedded into various types of devices that can play back images, such as a television, a game console, a personal computer, and a one-segment mobile phone, as well as into the playback device 200. The system LSI thus greatly broadens the use of the present invention.
It is desirable that the system LSI conform to the UniPhier architecture.
A system LSI conforming to the UniPhier architecture includes the following circuit blocks.
The DPP is an SIMD-type processor in which a plurality of elemental processors perform the same operation. The DPP achieves parallel decoding of a plurality of pixels constituting a picture by causing the operating units, respectively embedded in the elemental processors, to operate simultaneously by one instruction.
The IPP includes: a local memory controller that is composed of instruction RAM, an instruction cache, data RAM, and a data cache; a processing unit that is composed of an instruction fetch unit, a decoder, an execution unit, and a register file; and a virtual multi-processing unit that causes the processing unit to execute a plurality of applications in parallel.
The MPU block is composed of: peripheral circuits such as an ARM core, an external bus interface (Bus Control Unit: BCU), a DMA controller, a timer, and a vector interrupt controller; and peripheral interfaces such as a UART, GPIO (General Purpose Input Output), and a synchronous serial interface.
The stream I/O block performs data input/output with the drive device, the hard disk drive device, and the SD memory card drive device, which are connected to the external buses via the USB interface and the ATA packet interface.
The AV I/O block, which is composed of audio input/output, video input/output, and OSD controller, performs data input/output with the television and the AV amplifier.
The memory control block performs reading and writing from/to the SD-RAM connected thereto via the external buses. The memory control block is composed of: an internal bus connection unit for controlling the internal connection between blocks; an access control unit for transferring data with the SD-RAM connected to the outside of the system LSI; and an access schedule unit for adjusting requests from the blocks to access the SD-RAM.
The following describes a detailed production procedure. First, a circuit diagram of a part to be the system LSI is drawn based on the drawings that show the structures of the embodiments. Then, the constituent elements of the target structure are realized using circuit elements, ICs, or LSIs.
While realizing the constituent elements in the above manner, the buses connecting the circuit elements, ICs, or LSIs, the peripheral circuits, the interfaces with external entities, and the like are defined. Further, the connection lines, power lines, ground lines, clock signals, and the like are defined. In making these definitions, the operation timings of the constituent elements are adjusted by taking the LSI specifications into consideration, and the bandwidths necessary for the constituent elements are secured. With other necessary adjustments, the circuit diagram is completed.
After the circuit diagram is completed, the implementation design is performed. The implementation design is a work for creating a board layout by determining how to arrange the parts (circuit elements, ICs, LSIs) of the circuit and the connection lines onto the board.
After the implementation design is performed and the board layout is created, the results of the implementation design are converted into CAM data, and the CAM data is output to equipment such as a Numerical Control (NC) machine tool. The NC machine tool performs the System on Chip (SoC) implementation or the System in Package (SiP) implementation. The SoC implementation is technology for printing a plurality of circuits onto a chip. The SiP implementation is technology for packaging a plurality of circuits by resin or the like. Through these processes, a system LSI of the present invention can be produced based on the internal structure of the playback device 200 described in each embodiment above.
It should be noted here that the integrated circuit generated as described above may be called IC, LSI, ultra LSI, super LSI or the like, depending on the level of integration.
It is also possible to achieve the system LSI by using a Field Programmable Gate Array (FPGA). In this case, a large number of logic elements are arranged in a lattice, and vertical and horizontal wires are connected based on the input/output combinations described in a Look-Up Table (LUT), so that the hardware structure described in each embodiment can be realized. The LUT is stored in SRAM. Since the contents of the SRAM are erased when the power is turned off, when the FPGA is used, it is necessary to define the Config information so as to write, into the SRAM, the LUT for realizing the hardware structure described in each embodiment.
The present embodiment is realized by: middleware; a hardware part corresponding to the system LSI; a hardware part other than the part corresponding to the system LSI; an interface part for the middleware; an interface part between the middleware and the system LSI; an interface part with the hardware other than the part corresponding to the system LSI; and a user interface part. When these are embedded in a playback device, they operate in cooperation with one another to provide unique functions.
By appropriately defining the interface part for the middleware, and the interface part for the middleware and system LSI, it is possible to develop, independently in parallel, the user interface part, middleware part, and system LSI part of the playback device. This makes it possible to develop the product more efficiently. Note that the interface can be segmented in various ways.
A playback device of the present invention does not require a change in an output frame rate when switching between 3D video playback and 2D video playback. The playback device of the present invention is therefore beneficial when connected to a monitor using the HDMI connection that necessitates synchronization between the output frame rate of the playback device and the output frame rate of the monitor.
Number      Date        Country
61101324    Sep 2008    US