The present invention relates to a method and apparatus for playing video data of a high bit rate format on a player capable of playing video data of a low bit rate format. More particularly, the present invention relates to the prediction of navigation information used for seamless AV media playback in multiple formats. In particular, the invention provides an apparatus and method to achieve real time streaming for networked computing devices and to enable offline playback of transcoded AV media requested by a Digital Media Player (DMP) among heterogeneous computing devices. Each computing device may support different AV media formats. Prediction of navigation information on a Digital Media Server (DMS) is invoked prior to transcoding of the original AV media format to the AV media format requested by the DMP. Both continuous and discontinuous (e.g. after user editing) AV media are supported by the prediction of navigation information.
Recent advances in networking technologies have enabled communication among networked computing devices, such as home consumer electronics (CE) devices and handheld devices. Networked CE devices such as Digital Versatile Disc (DVD) recorders are widespread in home networks, and each of them can function as a DMS, a DMP, or both. These devices can play back AV media stored locally, or play back AV media stored remotely in other devices via network streaming, e.g. the DMS streams AV media stored in its storage units to the DMP upon a DMP request.
On the other hand, the number of AV media formats is increasing as a result of the popularization of digital broadcasting and the emergence of new AV media formats such as the Blu-ray Disc (BD) format. As a result, format incompatibility occurs when a DMP attempts to play back AV media of the DMS in a format that the DMP does not support. Specifically, a DVD-VR device can play back DVD-VR media that is stored either locally or remotely. However, the device does not support new AV media formats such as the BD format. Therefore, this DVD-VR device is not able to play back BD media even though the BD media can be acquired successfully by the DVD-VR device, either through network streaming or by copying from a portable media storage unit, such as a Secure Digital (SD) Card. The portable media storage unit can be moved to another computing device for offline playback of the stored AV media. Backward compatibility and interoperability among networked computing devices are necessary so that a media device is not limited to playing back only its supported AV media formats.
For movie streaming, some computing devices, e.g. a DVD-VR device, need two types of information to achieve DVD-VR media playback. The first is the AV media that contains the AV stream. However, with only the AV media, the DMP can merely achieve normal playback mode, in which the AV media is played continuously from the beginning to the end. The second is the navigation information required to play back the AV media in either normal playback mode or trick play mode. Navigation information is defined by each AV media format and contains the mapping between the playback time and the position in the AV media. With the navigation information, trick play modes such as seek, fast forward and rewind can be achieved. In particular, each AV media format has its specifically defined structure of AV media and navigation information. One common example is the DVD format, which defines the video object block (VOB) and the information format (IFO) as its AV media and navigation information file respectively.
[Patent Document 1] U.S. Pat. No. 6,463,445 B1
[Patent Document 2] US Patent Application Publication 2004/0054689 A1
[Patent Document 3] EP 1524855 A1
[Patent Document 4] EP 0920203 A2
[Patent Document 5] US Patent Application Publication 2003/0021587 A1
Even though existing solutions are capable of providing backward compatibility and interoperability among networked computing devices, they have several drawbacks. First, existing solutions transcode the requested AV media and only then generate the corresponding navigation information from the transcoded AV media. Therefore, they have to allocate storage space for each of the different AV media formats so that the navigation information can be generated for the requested AV format. In this case, memory and storage consumption are high, especially when the AV media is lengthy and the number of AV media stored in the DMS is large.
The second drawback is that the time taken to transcode the entire AV media delays the delivery of the requested navigation information to the DMP, because the desired navigation information can only be generated upon completion of the transcoding of the whole AV media. As a result, existing solutions prevent real time playback of streamed AV media at the DMP, which has to wait for the navigation information to become available before requesting the AV media. Apart from this, other solutions provide interoperable AV media access without utilizing the navigation information. However, only normal playback is enabled at the DMP because trick play mode cannot be realized.
The present invention relates to the prediction of navigation information used for seamless playback of multi-format AV media. In particular, the invention provides an apparatus and method on the DMS to achieve on-demand real time streaming for networked computing devices and to enable offline playback of transcoded AV media requested by a Digital Media Player (DMP) among heterogeneous computing devices, e.g. home CE devices, which support different AV media formats.
The present invention counters the above drawbacks as follows. To avoid the problem of high storage consumption, the present invention transcodes only the requested range of AV media, stores it in temporary media storage, and sends the transcoded AV media to the DMP. This is important for resource-limited computing devices, especially in a home CE network, where high volumes of AV media can be exchanged among the networked CE devices. The present invention predicts navigation information based on the original navigation information rather than the transcoded AV media. Because the transcoded AV media does not need to be parsed, the predictive navigation information can be generated with negligible delay, which aids in achieving real-time AV playback at the DMP.
The predictive navigation information can be generated before the transcoding of the video data of high bit rate format to the video data of low bit rate format.
The transcoding of the video data can be carried out in real time. Thus, the viewer can watch the video without any waiting time.
The transcoding of the video data is carried out in sections. Thus, the memory can be kept small, as it only needs to be sufficient for transcoding one section.
The embodiments of the invention are supplemented with drawings to illustrate the invention with reference to specific figure elements. Repetitive instances are covered by the specific instances which have been covered with a specific figure element.
Preferred embodiments of the present invention are described below with reference to the accompanying figures. Note that like reference numerals in the figures denote identical components performing identical actions and operations.
Embodiments of the present invention are related to DMS (digital media server) and DMP (digital media player) involving AV media transcoder, AV media streaming and playback via network or offline AV media playback, prediction of navigation information for both continuous and discontinuous AV media and media retrieval method based on range mapping between two streams in different formats. The embodiment of the present invention also relates to the padding subsystem to ensure consistency of predictive and transcoded stream size. More specifically, the present invention is related to apparatus, system and method of navigation information prediction provided by DMS that enable offline or real time playback of transcoded AV media in requesting DMP.
First, the present invention is described according to one specific example.
Referring to
Referring to
It is to be noted that the original navigation information 902 as explained above is applicable to DMS 200, i.e., the Blu-ray Disc recorder, which has a bit rate of 20 Mbps (this number is just an example), but is not applicable to DMP 300, i.e., the DVD-VR player, which has a bit rate of 10 Mbps (this number is just an example).
According to the present invention, the original navigation information 902 is converted, using a conversion condition table 1004, to a predictive navigation information 912 which is applicable to DMP 300, i.e., the DVD-VR player.
As shown in
The conversion condition table 1004 indicates that: the bit rate of the original AV media is 20 Mbps, and that of the target AV media is 10 Mbps; the picture size of the original AV media is HD with a width-to-height screen ratio of 16:9, and that of the target AV media is D1 with a width-to-height screen ratio of 3:2; the frame rate of both media is 29.97; the key frame interval of both media is 15; the system stream format of the original AV media is the transport stream, and that of the target AV media is the program stream; the audio elementary stream format of the original AV media is AC3, and that of the target AV media is LPCM; and the video elementary stream format of both media is MPEG2. The conversion condition table 1004 is prepared and stored in DMS 200. When a new device is connected to the home network system, information on the new device is sent to DMS 200, which adds this information to the conversion condition table 1004.
In the conversion condition table 1004, the bit rate is the attribute that is necessary for the present invention. Other attributes can be omitted.
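As an illustration only, the conversion condition table described above could be held in a small data structure in which the bit rate is mandatory and every other attribute is optional. The class and field names below (`StreamProfile`, `ConversionCondition`, `BD_TO_DVD`) are hypothetical and do not appear in the specification; the values are the example values given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamProfile:
    """Attributes of one side (original or target) of a conversion."""
    bit_rate_mbps: float                       # the only attribute the prediction needs
    picture_size: Optional[str] = None         # e.g. "HD" or "D1"
    frame_rate: Optional[float] = None
    key_frame_interval: Optional[int] = None
    system_stream: Optional[str] = None        # e.g. "TS" or "PS"
    audio_es: Optional[str] = None             # e.g. "AC3" or "LPCM"
    video_es: Optional[str] = None             # e.g. "MPEG2"


@dataclass(frozen=True)
class ConversionCondition:
    """One row of a (hypothetical) conversion condition table."""
    original: StreamProfile
    target: StreamProfile

    def bit_rate_ratio(self) -> float:
        """Target bit rate divided by original bit rate."""
        return self.target.bit_rate_mbps / self.original.bit_rate_mbps


# Example values from the text: BD-RE source (20 Mbps), DVD-VR target (10 Mbps).
BD_TO_DVD = ConversionCondition(
    original=StreamProfile(20.0, "HD", 29.97, 15, "TS", "AC3", "MPEG2"),
    target=StreamProfile(10.0, "D1", 29.97, 15, "PS", "LPCM", "MPEG2"),
)
```

A row built from bit rates alone, such as `ConversionCondition(StreamProfile(20.0), StreamProfile(10.0))`, yields the same ratio, which reflects the statement that the other attributes can be omitted.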
According to the present invention, the original navigation information is converted to the predictive navigation information using the bit rate information. Generally, the following relationship can be given.
(Position in original AV media):(Position in target AV media)=(Original AV media bit rate):(Target AV media bit rate)
Thus,
(Position in target AV media)=(Position in original AV media)×(Target AV media bit rate)/(Original AV media bit rate)
is obtained.
For example, after 0.5 seconds from the beginning, the position (in bytes) in target AV media can be calculated by the following formula.
1200×(10/20)=600
In the same manner, the positions in target AV media at different times are calculated to obtain the predictive navigation information, as shown in
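The calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the navigation information is reduced to a list of (presentation time, byte position) pairs; the function name `predict_navigation` and the entry at 1.0 second are assumptions made for the example, while the 0.5 second entry (1200 bytes) and the bit rates come from the text.

```python
def predict_navigation(original_nav, original_bit_rate, target_bit_rate):
    """Scale every byte position by target/original bit rate.

    Presentation times are left unchanged; only positions are predicted.
    Bit rates are integers in the same unit (e.g. Mbps).
    """
    return [
        (time_s, position * target_bit_rate // original_bit_rate)
        for time_s, position in original_nav
    ]


# (time in seconds, position in bytes); the 1.0 s entry is hypothetical.
original_nav = [(0.0, 0), (0.5, 1200), (1.0, 3000)]
predictive_nav = predict_navigation(original_nav, 20, 10)
print(predictive_nav)  # [(0.0, 0), (0.5, 600), (1.0, 1500)]
```

No transcoded data is read at any point, which is why the table can be produced before transcoding starts.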
Referring again to
Thereafter, when the user wishes to watch the movie M1 in the bedroom, the user enters a play signal. In response to the play signal, DMP 300 reads the predictive navigation information of movie M1, and sends a transcoded AV media request 482 to DMS 200. For example, when a portion of movie M1 covering 0.00-0.50 second is requested, DMP 300 produces a transcoded AV media stream request of 0-599 bytes. This request, expressed in the data size of the transcoded AV media stream, is converted to a request in the time domain, that is, 0.00-0.50 second. The request in the time domain is transmitted from DMP 300 to DMS 200 as a transcoded AV media request 482. Then, in DMS 200, using the time domain request and the original navigation information 902, the corresponding portion of movie M1, measured in bytes in the original AV media, is determined. In this case, such a portion is 0-1199 bytes. Thus, in DMS 200, the portion (0-1199 bytes) of the original AV media is retrieved and converted from the BD-RE format to the DVD-VR format. The transcoded portion of movie M1 in the DVD-VR format has a data size of 600 bytes or less, which is about half of 1200 bytes, that is, about half of the data size of the same portion of movie M1 in the BD-RE format.
Then, the transcoded portion of movie M1 in the DVD-VR format, which is about 600 bytes of data, is transmitted from DMS 200 to DMP 300 as transcoded AV media delivery 480. Then, in DMP 300, the transcoded portion of movie M1 in the DVD-VR format is played on the screen for the requested period of 0.00-0.50 second.
Thereafter, DMP 300 may produce another transcoded AV media stream request of 600-1499 bytes for the following portion of the movie M1.
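The request flow just described (byte range in the target stream, then time range, then byte range in the original stream) can be sketched as follows. This is a hedged illustration, not the patented implementation: it assumes both navigation tables are lists of (time, byte position) pairs that share the same time stamps, and that requests are aligned to table entries. The helper names and the 1.0 second entries are hypothetical.

```python
from bisect import bisect_right


def byte_range_to_time(nav, start_byte, end_byte):
    """Map an inclusive byte range to (start_time, end_time) using nav."""
    positions = [pos for _, pos in nav]
    start_idx = bisect_right(positions, start_byte) - 1
    # Clamp so a range reaching the end of the table maps to the last entry.
    end_idx = min(bisect_right(positions, end_byte), len(nav) - 1)
    return nav[start_idx][0], nav[end_idx][0]


def time_to_byte_range(nav, start_time, end_time):
    """Map (start_time, end_time) to an inclusive byte range in nav's stream.

    Assumes both times are exact entries of the table.
    """
    lookup = dict(nav)
    return lookup[start_time], lookup[end_time] - 1


# Example tables; the 1.0 s entries are assumptions for illustration.
original_nav = [(0.0, 0), (0.5, 1200), (1.0, 3000)]
predictive_nav = [(0.0, 0), (0.5, 600), (1.0, 1500)]

# DMP side: request for bytes 0-599 of the transcoded stream.
time_range = byte_range_to_time(predictive_nav, 0, 599)
# DMS side: the same request expressed in bytes of the original stream.
source_range = time_to_byte_range(original_nav, *time_range)
print(time_range, source_range)  # (0.0, 0.5) (0, 1199)
```

Going through the time domain keeps the two sides decoupled: the DMP only needs the predictive table and the DMS only needs the original one.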
As apparent from the above, to play a video data of high bit rate format stored in DMS 200 by a player, DMP 300, which is capable of playing a video data of low bit rate format, the following steps are carried out.
(A1) Reading an original navigation information from the video data of high bit rate format in DMS 200.
(A2) Converting the original navigation information into a predictive navigation information in DMS 200, the predictive navigation information being applicable to the video data of low bit rate format.
(A3) Sending the predictive navigation information from DMS 200 to DMP 300.
(A4) Requesting by DMP 300, based on the predictive navigation information, a section of the video data of low bit rate format in a low bit rate byte information, wherein the low bit rate byte information includes a starting byte position and an end byte position of the section of the video data of low bit rate format for playing by DMP 300.
(A5) Converting, based on the predictive navigation information, the low bit rate byte information into a time domain information including a starting time and an end time for playing.
(A6) Converting, based on the predictive navigation information and the original navigation information, the time domain information into a high bit rate byte information, wherein the high bit rate byte information includes a starting byte position and an end byte position of the section of the video data of high bit rate format.
(A7) Retrieving, based on the high bit rate byte information, the section of the video data of high bit rate format from the video data stored in DMS 200.
(A8) Transcoding the section of the video data of high bit rate format to a section of the video data of low bit rate format in DMS 200.
(A9) Sending the section of the video data of low bit rate format from DMS 200 to DMP 300.
It is noted that, according to one embodiment, the high bit rate corresponds to the original AV media bit rate, and the low bit rate corresponds to the target AV media bit rate, but the invention is not limited to this arrangement. According to the present invention, it is possible to convert from the low bit rate to the high bit rate, or vice versa.
In the embodiment described above, steps (A6), (A7) and (A8) are explained as being carried out in DMS 200, but step (A6) can be carried out in DMP 300. In this case, the transcoded AV media request 482 can be, instead of the request in time domain, the high bit rate information obtained at step (A6). Also step (A8) can be carried out in DMP 300.
Furthermore, step (A5) which is explained as being carried out in DMP 300, can be carried out in DMS 200. In this case, the transcoded AV media request 482 can be, instead of the request in time domain, the low bit rate information obtained at step (A4).
In the embodiment described above, the video data stored in DMS 200 is transferred in sections and played by DMP 300. The present invention can be applied to a player which is capable of playing a video data of low bit rate format, and provided with a storage device for storing a video data of high bit rate format. To play by such a player, the following steps are carried out.
(B1) reading an original navigation information from the video data of high bit rate format.
(B2) converting the original navigation information into a predictive navigation information which is applicable to the video data of low bit rate format.
(B3) requesting, based on the predictive navigation information, a section of the video data of low bit rate format in a low bit rate byte information, said low bit rate byte information including a starting byte position and an end byte position of the section of the video data of low bit rate format for playing by the player.
(B4) converting, based on the predictive navigation information and the original navigation information, the low bit rate byte information into a high bit rate byte information, said high bit rate byte information including a starting byte position and an end byte position of the section of the video data of high bit rate format.
(B5) retrieving, based on the high bit rate byte information, the section of the video data of high bit rate format from the storage device.
(B6) transcoding the section of the video data of high bit rate format to a section of the video data of low bit rate format.
Step (B4) can be accomplished by multiplying each of the starting byte position and the end byte position of the section of the video data of low bit rate format by (high bit rate)/(low bit rate), which is the inverse of the ratio used to predict the positions in the low bit rate format. Such a player can be constructed by combining DMS 200 (without decoder 314) and DMP 300.
Such a player has an advantage that the player can play a video data which has a higher bit rate than the bit rate that the player can process.
The present invention will be explained in more detail below.
The internal media storage unit 106 contains, but is not limited to, the AV media 108, the conversion condition table, and the navigation information 110 for the AV media, along with the range map generated during navigation information prediction. The formats of the navigation information 110 and AV media 108 are supported by the computing devices where they reside. AV media and its navigation information generated for a requesting DMP are removed from the internal media storage unit 106 to avoid wasting storage. In addition to the internal media storage unit 106, each computing device may have the portable media storage 114, which allows copying of predictive navigation information and original AV media for offline playback on another computing device.
Navigation information is required to play back the AV media. The contents of navigation information include the size of the AV structure, which is format-dependent, the presentation time, the address of the AV structure, the mapping of the presentation time to the address of the AV structure, and other AV media attributes. Two examples of navigation information are IFO for the DVD-VR format, and clip information and real playlist for the BD-RE format. Note that the address of an AV structure means the offset or byte position of the AV structure from the beginning of the AV media for a particular presentation time. The address of either AV structure or AV media mentioned in the remainder of the document has the same meaning as the address used in this paragraph.
Each computing device in the network is able to access and playback the AV media stored in other networked computing devices, which work as DMS and have the navigation information prediction subsystem. Additionally, the computing device can also copy the predictive navigation information from DMS for offline playback. DMS provides backward compatibility feature so that requesting DMP that does not support new AV format is able to playback the AV media stored in DMS.
A media renderer 202 is responsible for matching the functionality and capability of the DMP, such as the supported AV format, by producing navigation information and AV media compatible with the DMP in response to the DMP request. In the illustrative embodiment, the media renderer 202 consists of four subsystems, namely the transcoder subsystem 204, the navigation information subsystem 206, the range mapping subsystem 220 and the padding subsystem 222.
The transcoder subsystem 204 transcodes AV media to and from different formats including MPEG video, MPEG ES, MPEG TS or MPEG PS. The transcoder subsystem 204 is able to transcode either the entire AV media or only a range, or a section, of the AV media, based on the range specified in the request signal from the DMP. In the latter case, the range of the stream to be transcoded is based on the range requested by the DMP. Since different AV media formats have different AV data structures, the same portion of one original media corresponds to a different range in each format. Therefore, a range mapping subsystem 220 provides the correspondence between a range in the original AV media format and a range in the requested AV media format. Next, the padding subsystem 222 is responsible for ensuring consistency between the predictive and transcoded stream sizes, which is important to guarantee proper AV playback based on the predictive navigation information.
On the other hand, navigation information prediction subsystem 206 is responsible for predicting and generating the navigation information for the AV media format compatible to requesting DMP. Prediction of navigation information is carried out prior to the transcoding of original AV media to the format compatible to requesting DMP. The outcome of the navigation information prediction subsystem 206 is the navigation information compatible to the AV media format requested by DMP.
A media directory 212 running a media directory service is involved in determining the AV media type based on the requesting DMP. The media directory includes both AV media 224 and navigation information 226 stored in the file system 218 for different media types. The navigation information includes, but is not limited to, the AV data size, AV bit rate, presentation time, discontinuity information and AV stream information. The media directory handler 232 categorizes the AV media and navigation information according to the AV format, e.g. media type 1 is the DVD-VR device-compliant PS and IFO files while another media type is the BD-RE device-compliant TS and its navigation information files. An example of the media directory 212 management is that only the default media directory on the DMS is accessible by a certain specific DVD-VR device, e.g. the directory /Directory level 1/Directory level 2/Directory level 3. The media directory can be either a pre-configured fixed directory or a dynamically created directory that exists prior to the sending of the AV media or navigation information file to the requesting device.
For content transfer purpose, the content transfer subsystem 214 is responsible to determine whether the request originates from remote computing device or portable media storage unit. The server application 216, along with the remote content transfer subsystem 208 and network interface control subsystem 230, handles the content transfer to requesting DMP in the network. On the other hand, the local content transfer subsystem 228 is responsible for the content transfer between internal media storage unit 106 and portable media storage unit 114. Lastly, an AV decoder 210 such as MPEG decoder may exist if DMS is capable of AV media playback locally.
To work with the DMS of the present invention, a DMP must be a computing device with either network capability or a portable media storage unit. In the former case, the DMP requests access to AV media in the DMS via a communication network. In the latter case, the DMP performs offline playback based on the predictive navigation information and original AV media copied from the DMS. Offline playback is performed on a DMP that has the capability to transcode the copied original AV media. The DMP transcodes the copied original AV media to the target AV media format playable in the DMP based on the copied predictive navigation information, along with the conversion condition table.
An example of a DMP is depicted in
Numerous functional subsystems include the content transfer subsystem 316, the media directory 320 containing both AV media 306 and navigation information 308, which are categorized by the media directory handler 322 within the file system 304, and a display device 318. For real time AV media streaming and playback, the DMP communicates with the DMS via the network to request predictive navigation information and AV media from the DMS. For offline playback, the DMP obtains the predictive navigation information and AV media copied from the DMS via portable media storage. The content transfer subsystem 316 handles media transfer to or from the DMS, either via the network if the request signal is from the DMS, or via the portable media storage unit if the request signal is from the portable media storage unit. The AV decoder 314 is responsible for rendering the AV media obtained from the content transfer subsystem 316 for presentation on the display device 318.
The DMP can acquire the predictive navigation information and AV media from the DMS in two different ways. First, the predictive navigation information and original AV media can be copied from the DMS via a portable media storage unit to enable offline playback on the DMP.
To illustrate the process,
Details on the prediction of the navigation information are elaborated in later section based on
Alternatively, the communication flow illustrated in
Based on the received navigation information 520, the DMP invokes stream playback 532 by sending an AV media request 510, specifying the desired range of AV media, to the DMS. Instead of transcoding the entire original AV media, the transcoder subsystem 204 retrieves and transcodes only the specified range of original AV media 530 to the requested AV format 528. The transcoded AV media is then delivered 512 to the requesting DMP, which begins the playback of the transcoded AV images 534 via its display device 318. In case trick play 536 (e.g. fast forward, skip) is desired, the DMP initiates a request to the DMS 514 that specifies the range of AV media corresponding to the selected trick play mode. The transcoding process is then invoked at the DMS to transcode only the requested range of AV media, which is then delivered to the DMP 516. The DMP plays back the desired trick play image 538 upon receiving the requested range of transcoded AV media. Normal and trick play modes can be selected by the DMP, and the desired range of AV media for playback is specified in the AV media request directed to the DMS.
Based on the communication flow in
Based on these three sources, the navigation information generator 694 predicts the contents of the desired navigation information and sends the newly generated navigation information 616 to the DMP 690. At the same time, range map information generated while preparing the predictive navigation information is stored 628 in the range map information storage unit 682 of DMS. The range map information 682 is used by both transcode manager 692 and padding handler 684.
Once DMP 690 receives the requested predictive navigation information 616, the DMP 690 starts requesting the range of AV media 618 it intends to play based on the contents of the predictive navigation information 616. Upon receiving the request 618, which specifies the range of AV media in the desired format, the transcode manager 692 is invoked to determine the corresponding range of original AV media it needs to retrieve for transcoding. First, the transcode manager 692 requests the range map information 630 from the range map information storage unit 682. The range map information is retrieved and delivered to the transcode manager 692. Based on the range map information 634, the transcode manager 692 retrieves only the corresponding requested range of the original AV media 620 from the media stream storage 696, which contains both continuous and discontinuous streams. Once the range of AV media 622 is obtained, the transcode manager 692 transcodes this range of the original AV media 622 to the format supported by the DMP 690. Along with the range map information 632 obtained from the range map information storage 682, the transcoded AV media 624 is then delivered to the padding handler 684 to determine whether padding is required to maintain consistency between the predictive stream size and the transcoded stream size. Lastly, the transcoded AV media, padded if necessary, is delivered 626 to the requesting DMP 690. Because only the requested range is transcoded, the present invention consumes less memory and storage than transcoding the entire AV media would require.
The most important data in navigation information is the mapping between presentation time and the address or position of its corresponding AV media.
The navigation information subsystem first parses and gathers the necessary information from the original navigation information and the conversion condition table at step 796. The prediction subsystem obtains the original stream information and target stream information from the conversion condition table at step 794. The original stream information includes the discontinuity flag, recording time, character set and others, while the conversion condition information includes the AV media bit rate, the frame rate, the key frame interval and the stream size of both the original and target streams. From the information parsed at steps 796 and 794, the presentation time stamp and its corresponding address in the original stream are extracted at step 792. Based on the conversion condition information, a check is made at step 790 to determine whether the bit rate differs between the original and target streams. If the target stream has a different bit rate from the original stream, the size of each AV data structure of the requested format is predicted and accumulated based on the presentation time interval and the target bit rate at step 788. Otherwise, the mapping between the original stream address and the target stream address for each AV data structure is done at step 786 based on the presentation time interval, which is unchanged even after the transcoding process. The mapping process between original and target stream address is elaborated in the description with reference to
As elaborated in the description for
The transcode manager is responsible to transcode any range of AV media requested by the DMP at step 812 based on the navigation information generated by navigation information prediction subsystem 206. Depending on the playback modes, which can be either normal playback mode or trick play mode, the requested range of AV media is specified in the request initiated by DMP. Every step illustrated in
While predicting the navigation information for the requested AV media format, the navigation information prediction subsystem 206 in the DMS performs several steps. Among them, the generation of range map information is important for determining the ranges of original and target AV media that match the AV media request made by the DMP. Range map information is generated based on two types of information: the presentation time stamp and its corresponding address or position in the AV media. The mapping of a presentation time stamp to its corresponding position in the AV media serves as the reference for the decoder to present the corresponding portion of AV media at a specific presentation time.
Another important source for predicting navigation information is the conversion condition table 1004. The conversion condition table is used by the DMS to predict the contents of the navigation information requested by the DMP. The DMS refers to the conversion condition table 1004 to get the target bit rate and other transcoding parameters, which are necessary to predict the requested navigation information 912.
The mapping between the presentation time stamp and its corresponding address in AV media is based on presentation time values that do not change for each particular AV media without editing. The AV position is calculated based on the target bit rate. Bit rate represents the total size of AV media in bits to be presented within a time interval. Therefore, for example, when the bit rate is reduced, the AV position at each presentation time is reduced to a lower offset as well. The relationship between bit rate and AV position is proportional.
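To make the proportional relationship concrete, the sketch below predicts the address of each AV data structure by accumulating its expected size from the presentation time interval and the target bit rate. It is an illustration under stated assumptions: a constant bit rate, addresses in bytes, and presentation times in seconds. The function name `predict_addresses` and the sample time stamps are hypothetical.

```python
def predict_addresses(presentation_times, target_bit_rate_bps):
    """Return the predicted start address (in bytes) of each AV data structure.

    The size of each structure is taken as
    (presentation time interval) * (target bit rate) / 8,
    and the sizes are accumulated to obtain addresses.
    """
    addresses = [0]
    for prev, cur in zip(presentation_times, presentation_times[1:]):
        size_bytes = round((cur - prev) * target_bit_rate_bps / 8)
        addresses.append(addresses[-1] + size_bytes)
    return addresses


times = [0.0, 0.5, 1.0]                         # hypothetical time stamps
print(predict_addresses(times, 10_000_000))     # [0, 625000, 1250000]
print(predict_addresses(times, 5_000_000))      # [0, 312500, 625000]
```

Halving the bit rate halves every predicted address, while the presentation times themselves are untouched.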
Consistency between the predictive navigation information and the transcoded AV media has to be maintained to ensure proper playback at the DMP. In the present invention, after the DMS transcodes and produces the requested AV media in the requested format, the DMS has to ensure that the transcoded AV media size is consistent with the requested AV media size as predicted and indicated in the predictive navigation information referred to by the DMP.
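One simple way to picture this size check is a helper that pads a transcoded section up to the size promised by the predictive navigation information. This is only a sketch: the function name `pad_to_predicted_size` is hypothetical, the use of zero bytes as filler is an assumption (a real program stream would use format-specific padding or stuffing), and treating an oversized section as an error is a choice made for the example rather than something stated in the text.

```python
def pad_to_predicted_size(transcoded: bytes, predicted_size: int,
                          pad_byte: bytes = b"\x00") -> bytes:
    """Pad a transcoded section so its length equals the predicted size.

    pad_byte is a single filler byte (assumed; real formats define their own).
    """
    shortfall = predicted_size - len(transcoded)
    if shortfall < 0:
        # The section is larger than predicted; padding cannot fix that.
        raise ValueError("transcoded section exceeds the predicted size")
    return transcoded + pad_byte * shortfall


section = bytes(590)                      # stand-in for a transcoded section
padded = pad_to_predicted_size(section, 600)
print(len(padded))  # 600
```

With the section padded to exactly the predicted length, the byte offsets that the DMP derives from the predictive navigation information remain valid for the following sections.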
As described earlier, the navigation information prediction subsystem 206 supports both continuous and discontinuous AV media.
The present invention can be used in method and apparatus for playing video data of high bit rate format by a player capable of playing video data of low bit rate format.
Number | Date | Country | Kind |
---|---|---|---|
2006-240281 | Sep 2006 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2007/066449 | 8/24/2007 | WO | 00 | 2/19/2009 |