The present invention generally relates to subtitles and, more particularly, to a method, apparatus and system for determining disparity estimation for stereoscopic subtitles.
On two-dimensional content, subtitles are usually placed in the same location, for example, at the bottom of a frame or sequence of frames. In contrast, for three-dimensional content, it makes sense to place the subtitles in a particular area of a frame or sequence of frames depending on the elements in the frame(s).
Another factor to consider for three-dimensional content is the disparity involved with displaying three-dimensional content. More specifically, while in two-dimensional content both eyes receive the same frame, for three-dimensional content each eye receives a different frame. As such, the subtitles for three-dimensional content can be rendered in different positions on the horizontal axis. The difference of horizontal positions is called disparity. Disparity of three-dimensional images can cause problems in placing subtitles within three-dimensional content. More specifically, not applying enough disparity or providing too much disparity to a subtitle in a stereoscopic image can negatively affect the image.
For example,
In addition,
As such, because there are many more variables that have to be controlled and taken into account, providing subtitles for three-dimensional content is much more complicated than for two-dimensional content.
Embodiments of the present invention address the deficiencies of the prior art by providing a method, apparatus and system for disparity estimation for determining a position of a subtitle for stereoscopic content. In various embodiments of the present invention, an algorithm is provided to estimate the disparity of subtitles for stereo sequences.
In one embodiment of the present invention, the difference of disparity between subtitles along time is constrained by a function of time and disparity. This guarantees that two consecutive subtitles will have similar disparity if they are close in time.
More specifically, in one embodiment of the present invention, a method for the positioning of subtitles in stereoscopic content includes estimating a position for a subtitle in at least one frame of the stereoscopic content and constraining a difference in disparity between subtitles in at least two frames by a function of time and disparity. In such an embodiment, the estimating can include computing a disparity value for the subtitle using a disparity of an object in a region in the at least one frame in which the subtitle is to be inserted. The subtitle can then be adjusted to be in front of or behind the object.
In an alternate embodiment of the present invention, a subtitling device for determining a position of subtitles in stereoscopic content includes a memory for storing at least program routines, content and data files and a processor for executing the program routines. In such an embodiment, the processor, when executing the program routines, is configured to estimate a position for a subtitle in at least one frame of the stereoscopic content and constrain a difference in disparity between subtitles in at least two frames by a function of time and disparity.
In an alternate embodiment of the present invention, a system for determining a position of subtitles for stereoscopic content includes a source of at least one left-eye view frame of stereoscopic content in which a subtitle is to be inserted, a source of at least one right-eye view frame of stereoscopic content in which a subtitle is to be inserted and a subtitling device for estimating a position for a subtitle in at least one frame of the stereoscopic content, constraining a difference in disparity between subtitles in at least two frames by a function of time and disparity and inserting the subtitle in the frames using the estimated and constrained position.
The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:
It should be understood that the drawings are for purposes of illustrating the concepts of the invention and are not necessarily the only possible configuration for illustrating the invention. To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
The present invention advantageously provides a method, apparatus and system for providing subtitles and disparity estimations for stereoscopic content. Although the present invention will be described primarily within the context of providing subtitles for three-dimensional content, the specific embodiments of the present invention should not be treated as limiting the scope of the invention. It will be appreciated by those skilled in the art and informed by the teachings of the present invention that the concepts of the present invention can be applied to substantially any stereoscopic image content.
The functions of the various elements shown in the figures can be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions can be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which can be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and can implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future (i.e., any elements developed that perform the same function, regardless of structure).
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative system components and/or circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
As previously mentioned, adding subtitles to stereoscopic content, such as three-dimensional (3D) content, is much more complicated than adding subtitles to two-dimensional content. For example, for 3D content, it makes sense to place the subtitles in a particular area of a frame or sequence of frames depending on the elements in the frame(s). In addition, for 3D content, the disparity involved with displaying the 3D content has to be taken into account. As such, the subtitles for three-dimensional content can be rendered in different positions on the horizontal axis.
A previously proposed solution is to put the subtitles as close as possible to the objects of the scene, but this can lead to problems too. There is no guarantee that consecutive subtitles close to each other in time will have a similar disparity. A considerable difference in disparity between subtitles close in time can cause visual fatigue for the user and ruin the visual experience. More specifically, the disparity of an object present in the left and right frames of a stereo sequence can be zero, positive or negative. When the disparity is zero, the 3D projection of the object will be in the plane of the screen. When the disparity is positive, the object will pop into the screen, and when it is negative, the object will pop out of the screen. Typically, the disparity is measured in pixels.
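The sign convention described above can be summarized in a short sketch. The function name `describe_disparity` is hypothetical and is used only for illustration.

```python
def describe_disparity(disparity_px: float) -> str:
    """Map a disparity in pixels to the perceived depth relative to the screen."""
    if disparity_px == 0:
        # Zero disparity: the object is projected in the plane of the screen.
        return "on the screen plane"
    if disparity_px > 0:
        # Positive disparity: the object pops into the screen.
        return "into the screen"
    # Negative disparity: the object pops out of the screen.
    return "out of the screen"
```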
There are several methods to estimate the disparity of the objects of the scene. A possible classification of the methods is by the number of disparity points that they provide. Therefore, two categories are:
Dense disparity maps, where each pixel (or almost each pixel) has a disparity value.
Sparse disparity maps, where only a few pixels have a disparity value.
The implementation and description of the methods of the various embodiments of the present invention described herein implement a sparse disparity map, but a dense disparity map can also be used in accordance with the concepts of the present invention without affecting the procedure or the results.
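As a minimal sketch of what a sparse disparity map can look like, the following assumes that feature points have already been matched between the left and right frames by some external method. The names `DisparityPoint` and `sparse_disparity_map` are hypothetical, and the convention of taking the right x coordinate minus the left x coordinate (so that positive values correspond to objects behind the screen) is an assumption rather than something stated in the text.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


class DisparityPoint(NamedTuple):
    x: float          # x position of the feature in the left frame
    y: float          # y position of the feature in the left frame
    disparity: float  # horizontal displacement in pixels


def sparse_disparity_map(matches: List[Tuple[Point, Point]]) -> List[DisparityPoint]:
    """Build one disparity value per matched (left_point, right_point) pair."""
    result = []
    for (x_left, y_left), (x_right, _y_right) in matches:
        # Assumption: disparity = x_right - x_left (horizontal component only).
        result.append(DisparityPoint(x_left, y_left, x_right - x_left))
    return result
```

Only the matched pixels carry a disparity value here, which is what distinguishes this representation from a dense map.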
In describing the concepts of the present invention, the inventors define subtitles as being divided in units that are defined as cells. Each cell is typically composed of an incremental unique identifier, a timestamp and the text itself.
In one embodiment of the present invention, the fields in a subtitle cell are:
Timestamp, which dictates when the subtitle has to be rendered.
Text, which is the subtitle text to be rendered.
In accordance with an embodiment of the present invention, the location of subtitles for a stereoscopic image begins with an estimation. That is, the region in which the subtitles are going to be rendered can be estimated before rendering. Even if the exact dimensions or placement of the region is not completely known (the size and font of the subtitles can vary, so can the region) a rough estimate is enough to begin. For example,
In one embodiment of the present invention, the size and placement of the subtitle region are defined as percentages of the frame size, with the X-range spanning 10% to 90% of the frame width and the Y-range spanning 70% to 100% of the frame height.
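The percentage-based region can be converted to pixel coordinates as in the sketch below, which also collects the disparities of sparse points that fall inside the region, sorted in increasing order. The function names, the `(x, y, disparity)` tuple layout and the half-open pixel bounds are assumptions made for illustration.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive


def subtitle_region(frame_width: int, frame_height: int,
                    x_range: Tuple[float, float] = (0.10, 0.90),
                    y_range: Tuple[float, float] = (0.70, 1.00)) -> Region:
    """Convert fractional X and Y ranges into a pixel rectangle."""
    x0 = round(frame_width * x_range[0])
    x1 = round(frame_width * x_range[1])
    y0 = round(frame_height * y_range[0])
    y1 = round(frame_height * y_range[1])
    return (x0, y0, x1, y1)


def disparities_in_region(points: List[Tuple[float, float, float]],
                          region: Region) -> List[float]:
    """Return the disparities of points inside the region, in increasing order."""
    x0, y0, x1, y1 = region
    inside = [d for (x, y, d) in points if x0 <= x < x1 and y0 <= y < y1]
    return sorted(inside)
```

For a 1920x1080 frame, the default ranges give a rectangle from x = 192 to x = 1728 and from y = 756 to y = 1080.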
In accordance with various embodiments of the present invention, the disparity of a subtitle cell is estimated according to the following relations:
C={c1, c2, . . . , cM} depicts the set of subtitle cells and ti the timestamp of the subtitle cell ci (note that the timestamp ti indicates in which frames the text of the subtitle cell ci has to be rendered). Ft
depicts the set of disparities D (sorted in increasing order) inside the region R of the jt
The relations described above assign a disparity value d̂i to the subtitle cell ci. For this purpose the set of disparity values DRt
It should be noted that some of the disparities in DRt
For example,
In one embodiment of the present invention, the disparity values are computed using the horizontal component of the displacement vector between two feature points. In addition, the variables of the algorithm explained in
In accordance with the present invention, a disparity value d̂i is assigned to each subtitle cell ci as described above. The values of the embodiment of
In accordance with an embodiment of the present invention, in order to fix this problem, the subtitle cells have to be balanced. This consists of introducing a constraint, which is a function of time and disparity, on the set of disparities of C. In one embodiment of the present invention, the subtitles close in time (i.e., in number of frames) are forced to have a similar disparity. In one embodiment of the present invention, this is accomplished by adding a negative value to the subtitle cell with the higher disparity (i.e., 3D projection closer to the screen) in order to avoid the problem depicted in
For example,
In one embodiment of the present invention, an algorithm for adding a negative value to the subtitle cell with the higher disparity is as follows:
where gap(ti,ti+1) is the number of frames between the end of the timestamp ti and the beginning of the timestamp ti+1, T is a threshold and ε is a negative value. In one embodiment T=3 and ε=1.
In various embodiments of the present invention, subtitle cells of C can be sliced into one-frame-long cells, generating a new set of cells. Applying the disparity estimation method of the present invention to this new set of subtitle cells leads to subtitles that move smoothly along the Z axis according to the disparity of the elements in DR. This technique leads to a better user experience. Although one-frame-long cells have been generated in the described embodiment, in alternate embodiments of the present invention it is also possible to generate cells spanning a larger number of frames. In addition, the disparity values can be filtered again to further enforce temporal consistency.
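The slicing of a subtitle cell into shorter cells can be sketched as follows. The sketch assumes that a timestamp is represented as an inclusive range of frame numbers; the `SubtitleCell` class and the `slice_cell` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubtitleCell:
    start_frame: int  # first frame in which the text is rendered
    end_frame: int    # last frame in which the text is rendered (inclusive)
    text: str


def slice_cell(cell: SubtitleCell, frames_per_slice: int = 1) -> List[SubtitleCell]:
    """Split one cell into consecutive cells of at most frames_per_slice frames."""
    if frames_per_slice < 1:
        raise ValueError("frames_per_slice must be at least 1")
    slices = []
    for start in range(cell.start_frame, cell.end_frame + 1, frames_per_slice):
        # The last slice may be shorter if the length is not an exact multiple.
        end = min(start + frames_per_slice - 1, cell.end_frame)
        slices.append(SubtitleCell(start, end, cell.text))
    return slices
```

Each resulting cell can then receive its own disparity estimate, so that the subtitle depth can follow the scene from frame to frame.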
For example,
In accordance with the concepts of the present invention, subtitles can be treated as other objects of the scene. That is, subtitles can be occluded partially or totally by objects present in the content. For example,
In addition, in accordance with the concepts of the present invention, besides disparity, other features of the subtitles (such as size, color, texture and font) can also change depending on the characteristics of the scene. For example, the size of a subtitle can increase when it pops out of the screen. In addition, the algorithm of the present invention can be improved to balance the subtitles in a faster way. For example, in one embodiment of the present invention, a maximum disparity difference can be set such that when the difference of disparity between two subtitle cells is higher than the maximum allowed, the disparity of the cell that has to change is set to the disparity of the other cell plus the maximum difference of disparity allowed between them.
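The faster balancing variant described above can be sketched as follows. This is an illustration under several assumptions, not the algorithm of the specification: timestamps are taken to be inclusive frame ranges, two consecutive cells are treated as close in time when the number of frames between them is below a threshold, the cell that has to change is taken to be the one with the higher disparity, and a forward pass followed by a backward pass is used to propagate the limit along a chain of cells. The names `Cell`, `gap` and `balance`, and the default value of `max_diff`, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    start_frame: int
    end_frame: int    # inclusive
    disparity: float  # in pixels


def gap(a: Cell, b: Cell) -> int:
    """Number of frames between the end of cell a and the beginning of cell b."""
    return b.start_frame - a.end_frame - 1


def _limit_pair(a: Cell, b: Cell, threshold: int, max_diff: float) -> None:
    if gap(a, b) >= threshold:
        return  # assumption: cells this far apart are not constrained
    if a.disparity - b.disparity > max_diff:
        a.disparity = b.disparity + max_diff  # lower the higher-disparity cell
    elif b.disparity - a.disparity > max_diff:
        b.disparity = a.disparity + max_diff


def balance(cells: List[Cell], threshold: int = 3, max_diff: float = 4.0) -> List[Cell]:
    """Limit the disparity difference between consecutive cells, in place."""
    for i in range(len(cells) - 1):            # forward pass
        _limit_pair(cells[i], cells[i + 1], threshold, max_diff)
    for i in reversed(range(len(cells) - 1)):  # backward pass
        _limit_pair(cells[i], cells[i + 1], threshold, max_diff)
    return cells
```

With three cells separated by one-frame gaps and disparities 0, -10 and -2, a maximum difference of 4 pulls the first and third cells to -6 while leaving the cell at -10 untouched.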
Even further, in alternate embodiments of the present invention, regions of interest are determined and the subtitles are placed at the same disparity as the objects there. If there are objects with more negative disparity in the subtitle region, the disparity of the subtitle will be set to that more negative value. The subtitles can be balanced in this case as well.
Furthermore, in accordance with various embodiments of the present invention, a default disparity value can be set. As such, subtitle cells with the default disparity value can be disregarded as anchor points that pull other subtitle cells to their position. In addition, the disparity values can be computed using the horizontal component of the displacement vector between two feature points; alternatively, both the horizontal and vertical components can be used to compute the disparity values. In such an embodiment, the region DR can change with time.
In the system 100 of
Again, although the subtitle device 115 of
The GUI of
As depicted in
At step 1304, a difference in disparity between subtitles in at least two frames is constrained by a function of time and disparity. As described above, in one embodiment of the present invention, a difference in disparity between subtitles in the at least two frames is constrained by applying a negative disparity value to a subtitle having a higher disparity value. That is, in various embodiments of the present invention, a maximum difference of disparity in subtitles between frames is set such that when a difference of disparity between two subtitles is higher than the maximum, the disparity value of the subtitle that has to change is set to the disparity value of the other subtitle plus the maximum difference of disparity. The method 1300 is then exited.
Having described various embodiments for a method, apparatus and system for disparity estimation for providing subtitles for stereoscopic content (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention. While the foregoing is directed to various embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof.
This application claims the benefit of U.S. Provisional Application Ser. No. 61/308,174, filed Feb. 25, 2010, which is hereby incorporated by reference in its entirety for all purposes.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US10/03217 | 12/20/2010 | WO | 00 | 8/23/2012 |
Number | Date | Country |
---|---|---|
61308174 | Feb 2010 | US |