This application claims priority under 35 U.S.C. §119(a) to a Korean Patent Application filed in the Korean Intellectual Property Office on May 19, 2008 and assigned Serial No. 10-2008-0046315, the contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates generally to an apparatus and method for creating and replaying media files, and more particularly, to an apparatus and method for creating and displaying stereoscopic media files in a media file playback apparatus.
2. Description of the Related Art
Moving Picture Experts Group (MPEG), which is an international multimedia standards organization, has released MPEG-2, MPEG-4, MPEG-7 and MPEG-21 standards since its first standardization of MPEG-1 in 1988. Multimedia Application Format (MAF), which is under standardization, aims to increase the value of existing standards by combining MPEG standards with non-MPEG standards to meet industry needs. The main purpose of the MAF standardization is to provide a standardized file format for a specific application, thereby enabling wide use of the application and broader adoption of MPEG standards.
Recently, extensive research has been conducted on methods for implementing Three-Dimensional (3D) videos to represent more realistic video information. One such method exploits human visual characteristics: it displays left-view and right-view images at their associated locations on an existing display device so that the left view is imaged on the user's left eye and the right view on the user's right eye, thereby allowing the user to experience 3D effects. For example, a portable terminal equipped with a barrier Liquid Crystal Display (LCD) may provide more lifelike videos to the user by replaying stereoscopic contents.
Stereoscopic contents composed of two or more tracks may include, for each track, the same stereoscopic video-related information, for example, the frame structure of the stereoscopic streams, which of the left view sequence and the right view sequence was encoded first, and whether each frame is a stereoscopic or monoscopic video frame. Such duplicate stereoscopic video-related information could be stored in only one track. Conventionally, however, no file format defines a syntax from which duplication of the stereoscopic video-related information can be determined. Therefore, there is a need for a method and apparatus capable of providing information from which duplication of stereoscopic video-related information can be determined when each track of stereoscopic contents consisting of two or more tracks has the same stereoscopic video-related information.
An aspect of the present invention is to address at least the above-mentioned problems and/or disadvantages and to provide at least the advantages described below. Accordingly, an aspect of the present invention provides an apparatus and method for providing information based on which duplication of stereoscopic video-related information can be determined, when each track of stereoscopic contents composed of two or more tracks includes the same stereoscopic video-related information.
According to the present invention, there is provided a computer-readable recording medium for storing stereoscopic content having two or more tracks and stereoscopic video-related information of each track in the stereoscopic content, including a duplication indication field indicating duplication of stereoscopic video-related information of each track in the stereoscopic content, and a track reference field including information used to connect a current track to at least one other track referred to by the current track.
According to the present invention, there is provided a computer-implemented method in a terminal, including receiving a media file, parsing a duplication indication field indicating duplication of stereoscopic video-related information of each track in stereoscopic content having two or more tracks, parsing a track reference field including information used to connect a current track to at least one other track that the current track refers to, and displaying the received media file based on the parsing.
According to the present invention, there is provided a terminal apparatus for receiving a media file, including a unit for receiving and storing a media file, a processor for parsing a duplication indication field indicating duplication of stereoscopic video-related information of each track in stereoscopic content having two or more tracks, and parsing a track reference field including information used to connect a current track to at least one other track that the current track refers to, and a display unit for displaying the received media file based on the parsing.
The above and other aspects, features and advantages of certain exemplary embodiments of the present invention will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Throughout the drawings, the same drawing reference numerals will be understood to refer to the same elements, features and structures.
Hereinafter, preferred embodiments of the present invention will be described with reference to the accompanying drawings. In the following description and drawings, a detailed description of known functions and configurations incorporated herein will be omitted for the sake of clarity and conciseness.
The terms and words used in the following description and claims are not limited to their dictionary meanings, and are merely used by the inventor to enable a clear and consistent understanding of the invention. Accordingly, it should be apparent to those skilled in the art that the following description of preferred embodiments of the present invention is provided for illustration purposes only and not for the purpose of limiting the invention as defined by the appended claims and their equivalents.
It is to be understood that the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a component surface” includes reference to one or more of such surfaces.
The present invention provides a method and apparatus for creating a file format capable of indicating, for stereoscopic contents composed of two or more tracks, that each track uses the same stereoscopic video-related information when that information is the same for each track.
Before a description of a stereoscopic content storage format is given, a block structure of a media file format based on the conventional ISO/IEC 14496-12 standard will be described with reference to
Although not illustrated in
The data in a movie box (‘moov’) 110 is created in an object-based structure, and includes content information such as a frame rate, a bit rate and an image size, sync information for supporting a playback function such as Fast Forward (FF)/REWind (REW), and all other information for replaying the file. In particular, as the ‘moov’ 110 includes information such as the total number of frames of video and audio data and a size of each frame, the video and audio data can be decoded and replayed by parsing the ‘moov’ 110 during playback.
Meanwhile, media data (‘mdat’) 120 includes actual stream data according to each track, and video data and audio data are stored in units of their frames.
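The box layout described above can be illustrated with a minimal Python sketch that walks box headers as defined in ISO/IEC 14496-12 (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size field). The helper names `iter_boxes` and `make_box` and the toy payloads are assumptions made for illustration only; they are not part of the described apparatus.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, payload_start, payload_end) for each box in data[start:end]."""
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # The box runs to the end of the range being scanned.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"malformed box at offset {pos}")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def make_box(box_type: str, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


# Toy file: the payloads are placeholder bytes, not valid 'ftyp'/'trak' contents.
sample = (make_box("ftyp", b"isom")
          + make_box("moov", make_box("trak", b""))
          + make_box("mdat", b"\x00" * 4))
top_level = [box_type for box_type, _, _ in iter_boxes(sample)]
```

Because container boxes such as 'moov' simply hold further boxes in their payload, the same function can be called again on the payload range of a container to reach the 'trak' level.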
By checking reference_type in the ‘tref’ box, it is possible to know that the track to be referred to contains additional view media information, i.e. a stereoscopic video stream paired with the track. Referring to
Table 1 shows an example of a conventional ‘svmi’ box with stereoscopic video-related information about, for example, a frame structure of stereoscopic streams, which of the left view sequence and the right view sequence was encoded first, and whether each frame is a stereoscopic video frame or a monoscopic video frame.
The present invention implements a storage format including stereoscopic content-related information by modifying the storage format of
Referring to
Table 2 is defined by adding information based on which duplication of stereoscopic video-related information is determined as described in
Referring to
When the track does not include a ‘tref’ box in step 409, the terminal directly proceeds to step 413 where it checks an ‘mdia’ box of the track, and then checks a sample table box (‘stbl’ box) in step 415. Thereafter, the terminal determines in step 417 whether a ‘svmi’ box with stereoscopic video-related information exists. If the ‘svmi’ box exists, the terminal checks the ‘svmi’ box in step 419, and checks ‘duplication_flag’ in the ‘svmi’ box in step 421. Based on a value of the ‘duplication_flag’, the terminal determines whether two paired tracks have the same stereoscopic video-related information. That is, ‘duplication_flag’ is set to ‘1’ (true) when details of the ‘svmi’ box are the same.
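As a rough sketch of how a terminal might use this flag, the Python fragment below models tracks in memory and looks up the stereoscopic video-related information for a track. The class names, the `details` dictionary and its key, and the rule that a track without an 'svmi' box borrows the box of a paired track whose 'duplication_flag' is 1 are all assumptions for illustration; the actual 'svmi' syntax is the one given in Table 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Svmi:
    duplication_flag: int        # 1 when paired tracks carry identical 'svmi' details
    details: Dict[str, int]      # hypothetical stand-in for the real 'svmi' fields


@dataclass
class Track:
    track_id: int
    tref: List[int] = field(default_factory=list)  # empty when there is no 'tref' box
    svmi: Optional[Svmi] = None


def resolve_svmi(tracks: List[Track], track_id: int) -> Optional[Svmi]:
    """Return the 'svmi' information to use for the given track, if any."""
    by_id = {t.track_id: t for t in tracks}
    track = by_id[track_id]
    if track.svmi is not None:   # steps 417 to 421: the track has its own 'svmi'
        return track.svmi
    # Assumption: paired tracks are linked through 'tref' in either direction.
    paired = [by_id[r] for r in track.tref if r in by_id]
    paired += [t for t in tracks if track_id in t.tref]
    for other in paired:
        if other.svmi is not None and other.svmi.duplication_flag == 1:
            return other.svmi    # identical details, so reuse them
    return None


tracks = [
    Track(1, svmi=Svmi(1, {"left_view_first": 1})),
    Track(2, tref=[1]),
]
```

With these two tracks, `resolve_svmi(tracks, 2)` returns the 'svmi' object stored on track 1, whereas a 'duplication_flag' of 0 on track 1 would make the lookup return `None`.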
In operation of all preferred embodiments of the present invention, the expression “checking a box” is equivalent to a process of parsing a file to decode data on each track, and indicates parsing information (fields and parameters) contained in the box.
Although a process of checking the respective boxes is needed in all operations of the terminal, the sequence in which the respective boxes are checked does not have to follow the sequence in the drawing. The parsing process for the file format and the terminal's operation, which are not described in detail herein, will conform to the ISO/IEC 14496-12 standard.
A new stereoscopic video additional information (‘svai’) box, which provides information based on which duplication of stereoscopic video-related information of stereoscopic contents composed of two or more tracks can be determined, will be described with reference to
The ‘svai’ box includes information, i.e. ‘duplication_flag’, based on which duplication of stereoscopic video-related information is determined, and includes ‘track_IDs’ indicating tracks with a ‘svmi’ box omitted, when details of the ‘svmi’ box with stereoscopic video-related information are the same among stereoscopic video streams, i.e. when ‘duplication_flag==1’. For example, if two paired tracks are equal in stereoscopic video-related information of track_ID=1 and track_ID=2, i.e. information contained in their ‘svmi’ boxes, and a ‘svmi’ box of the track with track_ID=2 is omitted, then ‘duplication_flag’ is set to ‘1’ (true), and track_ID=2 of the track with a ‘svmi’ box omitted is stored.
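The track_ID=1 and track_ID=2 example can be sketched from the file creator's side as follows. This is a minimal Python illustration under stated assumptions: the `Svai` class, the function name `deduplicate_svmi`, the choice to keep the 'svmi' box on the lowest track_ID, and the `info` keys are all hypothetical and are not taken from the box definitions in the tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Svai:
    duplication_flag: int = 0
    track_ids: List[int] = field(default_factory=list)  # tracks whose 'svmi' is omitted


def deduplicate_svmi(svmi_by_track: Dict[int, dict]) -> Tuple[Svai, Dict[int, dict]]:
    """Return the 'svai' contents and the 'svmi' boxes that remain in the file."""
    ids = sorted(svmi_by_track)
    if not ids:
        return Svai(), {}
    first = svmi_by_track[ids[0]]
    # Assumption: the box is kept on the lowest track_ID and omitted elsewhere.
    if len(ids) > 1 and all(svmi_by_track[i] == first for i in ids[1:]):
        return Svai(1, ids[1:]), {ids[0]: first}
    return Svai(0, []), dict(svmi_by_track)


info = {"left_view_first": 1, "stereoscopic_frames": 300}  # hypothetical fields
svai, kept = deduplicate_svmi({1: dict(info), 2: dict(info)})
```

Here `svai` ends up with `duplication_flag` equal to 1 and `track_ids` equal to `[2]`, and only track 1 keeps its 'svmi' box, matching the example in the text.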
The ‘svai’ box is defined as a box at a ‘moov’ level included in a ‘moov’ box as shown in
Table 3A, as described in
Referring to
In
Table 3B gives a definition of the new ‘svai’ box in
In the exemplary case of
In
Table 3C gives a definition of the new ‘svai’ box in
Tables 4A and 4B show new boxes that are added in the present invention based on an ISO based media file format according to the second, third and fourth embodiments of the present invention.
Referring to
In step 609, the terminal determines whether stereoscopic video-related information of two paired tracks is the same, based on a ‘duplication_flag’ value in the ‘svai’ box, and detects a track with a ‘svmi’ box omitted, by identifying the track_ID. That is, in step 609, the terminal checks ‘duplication_flag’ and track_ID with a ‘svmi’ box omitted. In step 611, the terminal checks the ‘trak’ box. The sub-process of checking the ‘trak’ box is repeatedly implemented on the first track to the last track of the media file. In step 613, the terminal checks a track header box (‘tkhd’ box) to identify a track_ID. In step 615, the terminal determines whether the track includes a ‘tref’ box. If the track includes a ‘tref’ box, the terminal checks the ‘tref’ box in step 617. Thereafter, the terminal checks an ‘mdia’ box in step 619.
However, when the track does not include the ‘tref’ box in step 615, the terminal bypasses step 617 and directly proceeds to step 619 where it checks an ‘mdia’ box of the track, and then checks a ‘stbl’ box in step 621. Thereafter, in step 623, the terminal checks the ‘svmi’ box if it determines, based on the information checked in the ‘svai’ box, that the track is not a track with a ‘svmi’ box omitted.
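Steps 609 to 623 can be sketched in Python on a simplified in-memory model of the 'moov' box. The dictionary layout (with 'tkhd', 'mdia' and 'stbl' flattened into each track entry), the key names, and the rule that an omitted track reuses the 'svmi' of the first track that kept one are assumptions for illustration, not the definitive parsing procedure.

```python
from typing import Dict, Optional


def resolve_tracks(moov: dict) -> Dict[int, Optional[dict]]:
    """Return the stereoscopic video-related information to use for each track."""
    svai = moov.get("svai", {"duplication_flag": 0, "track_IDs": []})
    # Step 609: read 'duplication_flag' and the track_IDs whose 'svmi' is omitted.
    omitted = set(svai["track_IDs"]) if svai["duplication_flag"] == 1 else set()

    resolved: Dict[int, Optional[dict]] = {}
    for trak in moov["traks"]:              # step 611, repeated for every track
        track_id = trak["track_ID"]         # step 613: 'tkhd'
        _refs = trak.get("tref", [])        # steps 615 to 617: 'tref' is optional
        # Steps 619 to 621: 'mdia' and 'stbl' are flattened in this toy model.
        if track_id in omitted:
            resolved[track_id] = None       # filled in after the loop
        else:
            resolved[track_id] = trak.get("svmi")   # step 623

    # Assumption: omitted tracks reuse the 'svmi' of the first track that kept one.
    shared = next((info for tid, info in resolved.items()
                   if tid not in omitted and info is not None), None)
    for track_id in omitted:
        if track_id in resolved:
            resolved[track_id] = shared
    return resolved


info = {"left_view_first": 1}               # hypothetical 'svmi' details
moov = {
    "svai": {"duplication_flag": 1, "track_IDs": [2]},
    "traks": [
        {"track_ID": 1, "svmi": info},
        {"track_ID": 2, "tref": [1]},
    ],
}
result = resolve_tracks(moov)
```

In this model both tracks resolve to the same `info` object even though only track 1 stores it.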
Although a process of checking the respective boxes is needed in all operations of the terminal, the sequence in which the respective boxes are checked does not have to follow the sequence in the drawing. The parsing process for the file format and the terminal's operation, which are not described in detail herein, will conform to the ISO/IEC 14496-12 standard.
Referring to
In
In
Table 5 gives a definition of a new ‘svai’ box that includes ‘duplication_flag’ information based on which duplication of stereoscopic video-related information is determined, as described in
Referring to
If the track does not include a ‘tref’ box in step 809, the terminal bypasses step 811 and directly proceeds to step 813 where it checks an ‘mdia’ box of the track, and then checks an ‘stbl’ box in step 815. In step 817, the terminal determines whether an ‘svmi’ box with stereoscopic video-related information exists in the ‘stbl’ box. If the ‘svmi’ box exists, the terminal checks the ‘svmi’ box in step 819. However, if the ‘svmi’ box does not exist, the terminal checks a ‘meta’ box in step 821, checks a ‘svai’ box, which is a ‘meta’ box at a ‘trak’ level including information about each track, in step 823, and checks ‘duplication_flag’ information in the ‘svai’ box in step 825.
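The per-track branch in steps 809 to 825 can be traced with a short Python sketch. The dictionary keys and the function name `inspect_track` are hypothetical, and the nested boxes are flattened into one dictionary per track for brevity; the sketch only records which boxes are checked and what is read from them.

```python
from typing import Dict, List, Optional, Tuple


def inspect_track(trak: Dict) -> Tuple[List[str], Optional[Dict], Optional[int]]:
    """Return (boxes checked in order, 'svmi' details, 'duplication_flag')."""
    checked: List[str] = []
    if "tref" in trak:               # step 809
        checked.append("tref")       # step 811
    checked.append("mdia")           # step 813
    checked.append("stbl")           # step 815
    svmi = trak.get("svmi")          # step 817: does 'stbl' hold an 'svmi' box?
    if svmi is not None:
        checked.append("svmi")       # step 819
        return checked, svmi, None
    checked.append("meta")           # step 821
    checked.append("svai")           # step 823
    flag = trak.get("svai", {}).get("duplication_flag")  # step 825
    return checked, None, flag


with_svmi = inspect_track({"svmi": {"left_view_first": 1}})
without_svmi = inspect_track({"tref": [1], "svai": {"duplication_flag": 1}})
```

A track that carries its own 'svmi' box never reaches the 'meta' and 'svai' checks, while a track without one reports the 'duplication_flag' found in its 'svai' box.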
Although a process of checking the respective boxes is needed in all operations of the terminal, the sequence in which the respective boxes are checked does not have to follow the sequence in the drawing. The parsing process for the file format and the terminal's operation, which are not described in detail herein, will conform to the ISO/IEC 14496-12 standard.
Referring to
The information indicating duplication of ‘svmi’ information is represented in the method described in the preferred embodiments of the present invention. Images photographed by the left view camera 900 and the right view camera 910 are input to the video signal processor 930 via the input unit 920. The left/right view image data undergoes preprocessing in the video signal processor 930. The preprocessing process includes converting analog external image values into digital values, and composing a left view sequence and a right view sequence. The preprocessed image is encoded by the encoder 940.
The file creator 950 creates a media file using the image data encoded by the encoder 940. Here, the image data is stored in a media data (‘mdat’) area, and a type of a media file and information for replaying the media file are stored in a file type (‘ftyp’) area and a movie (‘moov’) area, respectively. The created stereoscopic media file is input or transferred to a stereoscopic media file playback apparatus, which replays and displays the image. The file creator 950 includes an ‘svmi’ duplication-related information inserter 951. The ‘svmi’ duplication-related information inserter 951 inserts ‘svmi’ duplication-related information in the media file according to the present invention.
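The three areas written by the file creator can be illustrated with a minimal Python sketch that assembles 'ftyp', 'moov' and 'mdat' boxes. The function names, the use of the 'isom' brand as a placeholder, and the opaque `moov_payload` argument are assumptions; a real 'moov' box would contain the nested track boxes described earlier, and boxes larger than 4 GiB would need the 64-bit size form, which this sketch does not handle.

```python
import struct
from typing import Iterable


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Prefix the payload with a 32-bit size and a four-character type."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def create_media_file(encoded_frames: Iterable[bytes], moov_payload: bytes) -> bytes:
    """Assemble a toy media file: file type, replay information, then stream data."""
    # 'ftyp' payload: major_brand, minor_version, then compatible brands.
    # The 'isom' brand is a placeholder, not the brand used by the invention.
    ftyp_payload = b"isom" + struct.pack(">I", 0) + b"isom"
    mdat_payload = b"".join(encoded_frames)   # encoded image data, frame by frame
    return (make_box(b"ftyp", ftyp_payload)
            + make_box(b"moov", moov_payload)
            + make_box(b"mdat", mdat_payload))


media_file = create_media_file([b"L0", b"R0"], b"")
```

In this arrangement the 'svmi' duplication-related information would be part of `moov_payload`, since the 'moov' area carries the information needed for replay.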
Referring to
The file parser 1000 parses the media file created by the file creator 950 in the media file creation apparatus and the information stored in the ‘ftyp’ area and the ‘moov’ area, and the decoder 1010 decodes the image data stored in the ‘mdat’ area using the parsed information. The playback unit 1020 replays the decoded data, and the display unit 1030 displays the data replayed by the playback unit 1020 on a display device of the terminal.
Operations of the system for creating and replaying the stereoscopic media file are subject to change according to apparatuses.
Although not specifically described in the present invention, the basic operation performed on the file format will follow details of the ISO/IEC 14496-12 standard. In addition, as the file format suggested by the present invention may be compatible with the ISO Base Media File Format and with file formats extended from it, the stored media data can be transferred or applied to various multimedia applications.
As is apparent from the foregoing description, for stereoscopic contents consisting of two or more tracks, when the stereoscopic video-related information of each track is the same, the present invention efficiently indicates that the tracks use the same stereoscopic video-related information.
In addition, the present invention avoids storing the same information more than once, thereby reducing the file size and the terminal's overhead.
Preferred embodiments of the present invention are also embodied as computer-readable codes on a computer-readable recording medium. The computer-readable recording medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of the computer-readable recording medium include, but are not limited to, Read-Only Memory (ROM), Random-Access Memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the Internet via wired or wireless transmission paths). The computer-readable recording medium is also distributable over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion. Also, function programs, codes, and code segments for accomplishing the present invention can be easily construed as within the scope of the invention by programmers skilled in the art to which the present invention pertains.
While the invention has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
10-2008-0046315 | May 2008 | KR | national |
Number | Name | Date | Kind |
---|---|---|---|
6055012 | Haskell et al. | Apr 2000 | A |
6072831 | Chen | Jun 2000 | A |
7184036 | Dimsdale et al. | Feb 2007 | B2 |
7230986 | Wise et al. | Jun 2007 | B2 |
7486311 | Baker et al. | Feb 2009 | B2 |
7528830 | Redert | May 2009 | B2 |
7529400 | Katata et al. | May 2009 | B2 |
7561779 | Yahata et al. | Jul 2009 | B2 |
7580463 | Routhier et al. | Aug 2009 | B2 |
7580952 | Logan et al. | Aug 2009 | B2 |
7643025 | Lange | Jan 2010 | B2 |
7702016 | Winder et al. | Apr 2010 | B2 |
7710463 | Foote | May 2010 | B2 |
7746931 | Kato et al. | Jun 2010 | B2 |
7747765 | Jones et al. | Jun 2010 | B2 |
7825991 | Enomoto | Nov 2010 | B2 |
7848425 | Cho et al. | Dec 2010 | B2 |
7855737 | Petrescu et al. | Dec 2010 | B2 |
7857700 | Wilder et al. | Dec 2010 | B2 |
7898578 | Nakamura | Mar 2011 | B2 |
7908273 | DiMaria et al. | Mar 2011 | B2 |
7970221 | Yang | Jun 2011 | B2 |
8042094 | Napoli et al. | Oct 2011 | B2 |
8044994 | Vetro et al. | Oct 2011 | B2 |
8098728 | Winder et al. | Jan 2012 | B2 |
8111758 | Yun et al. | Feb 2012 | B2 |
20050041736 | Butler-Smith et al. | Feb 2005 | A1 |
20050244135 | Yahata et al. | Nov 2005 | A1 |
20050259147 | Nam et al. | Nov 2005 | A1 |
20060028479 | Chun et al. | Feb 2006 | A1 |
20060262856 | Wu et al. | Nov 2006 | A1 |
20070147502 | Nakamura | Jun 2007 | A1 |
20070206926 | Ando et al. | Sep 2007 | A1 |
20080033983 | Ko | Feb 2008 | A1 |
20080098052 | Kim et al. | Apr 2008 | A1 |
20080098083 | Shergill et al. | Apr 2008 | A1 |
20080117231 | Kimpe | May 2008 | A1 |
20080151112 | Basile et al. | Jun 2008 | A1 |
20080246836 | Lowe et al. | Oct 2008 | A1 |
20080247670 | Tam et al. | Oct 2008 | A1 |
20080259073 | Lowe et al. | Oct 2008 | A1 |
20090034629 | Suh et al. | Feb 2009 | A1 |
20090251531 | Marshall et al. | Oct 2009 | A1 |
20090279608 | Jeon et al. | Nov 2009 | A1 |
20090304068 | Pandit et al. | Dec 2009 | A1 |
20100020871 | Hannuksela et al. | Jan 2010 | A1 |
20100039499 | Nomura et al. | Feb 2010 | A1 |
20100060717 | Klein Gunnewiek et al. | Mar 2010 | A1 |
20100097525 | Mino | Apr 2010 | A1 |
20100146018 | Kim | Jun 2010 | A1 |
20100165077 | Yin et al. | Jul 2010 | A1 |
20100182403 | Chun et al. | Jul 2010 | A1 |
20100231689 | Bruls et al. | Sep 2010 | A1 |
20100271463 | Gutierrez Novelo | Oct 2010 | A1 |
20100289876 | Shin et al. | Nov 2010 | A1 |
20110050990 | Farkash | Mar 2011 | A1 |
20110187821 | Routhier et al. | Aug 2011 | A1 |
Number | Date | Country |
---|---|---|
1 501 317 | Jan 2005 | EP |
1 617 684 | Jan 2006 | EP |
2247115 | Nov 2010 | EP |
1020050121246 | Dec 2005 | KR |
1020080004772 | Jan 2008 | KR |
WO 2009093881 | Jul 2009 | WO |
Entry |
---|
“Coding of Audio-Visual Objects,” International Standard WD 3.0 ISO/IEC 14496-1, 2002, pp. 1-487. |
“Coding of Moving Pictures and Audio Information, XMT and MPEG-J Extensions,” ISO/IEC 14496-11, 2003, pp. 1-38. |
“ISO Base Media File Format,” ISO/IEC 14496-12, 2003, pp. 1-24. |
Goodwin, J. and Apel, H. “A Uniform Resource Name (URN) Namespace for the International Organization for Standardization (ISO),” RFC 5141, Mar. 2008, pp. 1-28. |
Puri, A. et al. “Basics of Stereoscopic Video, New Compression Results with MPEG-2 and a Proposal for MPEG-4,” Signal Processing: Image Communication, vol. 10, Issues 1-3, MPEG-4, Part 2, Jul. 1997, pp. 201-234. |
Aksay, Anil et al. “End-to-End Stereoscopic Video Streaming with Content-Adaptive Rate and Format Control,” Signal Processing: Image Communication, vol. 22, Issue 2, Feb. 2007, pp. 157-168. |
Pehlivan, S. et al. “End-to-End Stereoscopic Video Streaming System,” IEEE International Conference on Multimedia and Expo, Jul. 12, 2006, pp. 2169-2172. |
Yang, W. “An MPEG-4-Compatible Stereoscopic/Multiview Video Coding Scheme,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, Issue 2, Feb. 2006, pp. 286-290. |
Chang, Gan-Cheih. “Multi-View Image Compression and Intermediate View Synthesis for Stereoscopic Applications,” The 2000 IEEE International Symposium on Circuits and Systems, vol. 2, 2000, pp. 277-280. |
Siegel, Mel et al. “Compression and Interpolation of 3D-Stereoscopic and Multi-View Video,” Stereoscopic Displays, 1997, pp. 1-13. |
Ho, Yo-Sung. “Overview of Multi-View Coding,” 14th International Workshop on Systems, Signals and Image Processing, 2007 and 6th EURASIP Conference Focused on Speech and Image Processing, Multimedia Communications and Services, Jun. 27-30, 2007, pp. 5-12. |
Ohm, Jens-Rainer. “Stereo/Multiview Video Encoding using the MPEG Family of Standards,” Society of Photo-Optical Instrumentation Engineers, 1999. |
Lim, Jeong Eun et al. “A Multiview Sequence CODEC with View Scalability,” Signal Processing: Image Communication, vol. 19, Issue 3, Mar. 2004, pp. 239-256. |
Next Generation Broadcasting Forum, Korea: “Proposal for Technical Specification of Stereoscopic MAF”, ISO/IEC JTC 1/SC 29/WG 11, Jun. 2007. |
International Standard, Information Technology—Coding of Audio-Visual Objects—Part 12: ISO Base Media File Format, ISO/IEC 14496-12, Corrected Version, Oct. 1, 2005. |
Number | Date | Country | |
---|---|---|---|
20090284583 A1 | Nov 2009 | US |