This invention relates to a method and apparatus for generating a video field or frame, and in particular to a method and apparatus for generating a video field Rk within a video field sequence R of NR video fields.
Television broadcast schedules must be optimised to generate the highest possible revenue, which is achieved by selling high-value advertising spots. In many countries, legislation mandates a minimum amount of actual programming per elapsed hour, thereby permitting only a certain amount of advertising per hour.
In order to maximise usage of the allowed advertising slots, broadcasters want programming material which does not exceed the minimum legal requirement, and as many advertisements as possible to fill the allowed advertising slots. Accordingly, if a programme delivered to a broadcaster is longer than required for optimal scheduling, it is desirable to reduce the running time of the content. Similarly, if the programme is too short for the legally required duration, it is desirable to increase the running time of the programme. Such modifications can be made to advertisements also.
In the early days of television broadcasting, such programme duration increases or decreases were made by manual editing, e.g. removing segments of a programme or repeating segments. More recently, automated techniques have been developed that allow the running time of video material to be increased or reduced. Such known automated techniques involve the dropping or repeating of frames or fields, and/or interpolation (linear or motion compensated) of frames or fields.
The problem with the dropping or repeating of frames or fields is that programme material is essentially discarded in the case of frame dropping, and that visually disturbing freezes can be created in the case of frame repeating. Furthermore, there can be audible audio disturbances if relevant audio information is dropped or repeated when the video frame is processed. Significantly, where a large programme length change is needed, there would not be, in general, enough scene cuts for such methods to achieve the required duration modification.
Interpolation methods involve a continuous interpolation process, effectively creating an output video sequence at a nominally higher or lower frame rate than the input. When this is replayed at the original frame rate, the sequence will be of longer playback duration (in the case of a higher nominal output frame rate) or of shorter playback duration (in the case of a lower nominal output frame rate). The main disadvantage of such methods is that, unless the frame interpolation method is very sophisticated (e.g. using motion compensated interpolation), the output video may suffer visible quality defects such as blurring in areas of motion.
More complex methods apply a hybrid of the frame drop/repeat method combined with some form of interpolation when there are insufficient scene cuts or static areas. Such methods risk the introduction of artefacts from both blurring/frame blending due to the interpolation, and loss of relevant picture material due to frame dropping.
Accordingly, there is a need for an improved technique for enabling the running time of video material to be adjusted (reduced or increased) that overcomes at least some of the above identified problems with conventional techniques.
According to a first aspect of the present invention there is provided a method of generating a video field Rk within an output video field sequence R, the output video field sequence R consisting of NR fields, where 0≤k<NR. The method comprises determining a temporal alignment parameter CRk indicative of a temporal alignment of a start time TCk of a conversion time interval Ck within a sequence C of NR conversion time intervals with respect to a source video frame Si within the source video frame sequence S, wherein the sequence C of conversion time intervals Ci has a total duration equal to the duration PS of the source video frame sequence S. A source video frame from the source video frame sequence S from which to generate the video field Rk is then determined based on the temporal alignment parameter CRk, and the video field Rk is then generated from the determined source video frame.
In this manner, video fields Rk for the output video field sequence R are able to be dynamically generated such that the resulting output video sequence R comprises a predetermined and adaptable duration PR.
According to some further embodiments, the method may comprise comparing the temporal alignment parameter CRk to a threshold value Z, and determining the source video frame from which to generate the video field Rk based on the comparison of the temporal alignment parameter CRk to the threshold value Z.
According to some further embodiments, the method may comprise selecting one of the source video frame Si and the source video frame Si+1 as the source video frame from which to generate the video field Rk based on the comparison of the temporal alignment parameter CRk to the threshold value Z.
According to some further embodiments, the temporal alignment parameter CRk may comprise a fractional component of the start time TCk in source frame units.
According to some further embodiments, the method may further comprise determining whether the field Rk comprises a field 1 or a field 2, and generating the video field Rk further based on the determination of whether the field Rk comprises a field 1 or field 2 where field 1 and field 2 are defined below.
According to some further embodiments, the method may comprise:
According to some further embodiments, the method may further comprise outputting the generated video field Rk.
According to some further embodiments, the method may comprise outputting the generated video field Rk as part of a video stream comprising the video field sequence R of NR video fields.
According to some further embodiments, the method may comprise outputting the generated video field Rk to at least one of:
According to a second aspect of the present invention there is provided a video processing apparatus comprising at least one video field generator component for performing the method of the first aspect of the invention.
Further details, aspects and embodiments of the invention will be described, by way of example only, with reference to the drawings. In the drawings, like reference numbers are used to identify like or functionally similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.
The present invention will now be described with reference to the accompanying drawings. However, it will be appreciated that the present invention is not limited to the specific examples herein described and as illustrated in the accompanying drawings. For example, examples of the present invention are herein described primarily with reference to the reduction of the program length of a source video sequence by means of adjusting the generation of output video fields. However, it will be appreciated that the present invention provides a method for generating video fields that enables both the reduction and increase in the resulting output program relative to the source content. In addition, and as will become apparent below, the use of the term ‘field’ as used herein is intended to encompass, without limitation, an individual field within video sequences consisting of segmented frames such as interlaced/progressive segmented video sequences, and an individual frame within non-segmented frame video sequences.
Furthermore, because the illustrated embodiments of the present invention may, for the most part, be implemented using electronic components and circuits known to those skilled in the art, details will not be explained to any greater extent than considered necessary for the understanding and appreciation of the underlying concepts of the present invention, and in order not to obfuscate or distract from its teachings.
It is understood that in the context of interlaced video sequences, two fields, field 1 and field 2, make up one interlaced video frame, and field 1 always precedes field 2 within the interlaced video sequence. Field 1 typically may comprise the odd numbered lines of the frame or the even numbered lines of the frame, depending on the originating video standard. Similarly, field 2 typically may comprise the even numbered lines of the frame or the odd numbered lines of the frame, depending on the originating video standard. As an example, in Phase Alternating Line (PAL) television systems, field 1 comprises the odd numbered active frame lines in each interlaced video frame and field 2 comprises even numbered active frame lines, whereas in National Television System Committee (NTSC) systems, field 1 comprises even numbered active frame lines from each interlaced video frame and field 2 comprises odd numbered active frame lines.
Furthermore, references are made herein to video rates of 24 Hz and 60 Hz. Common content acquisition rates are 24/1.001 Hz and 60/1.001 Hz (sometimes referred to in the broadcast industry as 23.98 and 59.94 Hz). It is to be understood that the use of the term “24 Hz” as used herein is intended to encompass both 24 Hz and 24/1.001 Hz video rates unless expressly stated otherwise, and the use of the term 60 Hz as used herein is intended to encompass both 60 Hz and 60/1.001 Hz video rates unless expressly stated otherwise.
Referring first to
The process by which 24 Hz material is commonly converted to 60 Hz is informally termed “telecine” or 2:3 pull-down, referring to the original transfer of 24 Hz film content to 60 Hz interlaced video for television transmission. As is well known in the field, creating an output 60 Hz interlaced sequence of the same running length as the 24 Hz source material requires deriving ten output fields 125 for every four source frames 115.
A simple diagram representing a conventional 2:3 pull-down process for the 24 Hz to 60 Hz conversion is shown in
For example, source frame S0 is used to generate two intermediate interlaced fields (f01 and f02), whereby source frame S0 is optically scanned or digitally sampled with appropriate filtering using a field 1 sample grid to generate the intermediate field f01. The same source frame S0 is then sampled again using a field 2 sample grid to generate the intermediate field f02.
The second source frame S1 is used to generate three intermediate interlaced fields (f11, f12 and a repeated f11), whereby image content of the second source frame S1 is sampled using the field 1 sample grid, generating intermediate field f11. The second source frame S1 is then sampled again using the field 2 sample grid, generating intermediate field f12. Finally, image content of the second source frame S1 is sampled once more using the field 1 sample grid, generating a further intermediate field identical to f11.
The output frame sequence 220 is then created using the intermediate fields 230.
Irrespective of how many fields are generated from a given source frame Si, the sampled intermediate fields fi must always alternate between a field 1 and a field 2, as per normal interlaced video. Interlaced video is recorded and broadcast as a whole number of interleaved frames, each comprising a field 1 and a subsequent field 2. For example, as illustrated in
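The conventional 2:3 cadence described above can be sketched in code. This is an illustrative model only: fields are represented as (frame index, field type) tuples rather than as sampled image data, and the sampling functions are implied by comments.

```python
def pulldown_2_3(source_frames):
    """Generate the conventional 2:3 pull-down field sequence.

    Each source frame alternately yields 2 then 3 fields, and the
    output always alternates between field 1 and field 2 sampling,
    so 4 source frames (24 Hz) yield 10 output fields (60 Hz).
    A real implementation would sample each frame's image content
    with the field-1 or field-2 sample grid as appropriate.
    """
    fields = []
    next_field_type = 1  # fields must alternate: 1, 2, 1, 2, ...
    for i, _frame in enumerate(source_frames):
        count = 2 if i % 2 == 0 else 3  # the 2:3 cadence
        for _ in range(count):
            fields.append((i, next_field_type))
            next_field_type = 2 if next_field_type == 1 else 1
    return fields

fields = pulldown_2_3(range(4))
```

Note that frame S1 contributes three fields, the first and third of which are both field-1 samples of the same image content, matching the repeated f11 described above.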
As outlined above, there is a need for enabling the running time of video material to be adjusted (reduced or increased). In accordance with some example embodiments of the present invention, and referring back to
In this manner, and as described in greater detail below, video fields Rk 125 for the output video field sequence R 120 are able to be dynamically generated such that the resulting output video sequence R 120 comprises a predetermined and adaptable duration PR 122.
The method of adapting the running length of the output video field sequence R 120 is achieved by adapting the ratio of the number of output fields NR to the number of source frames NS. For example, taking the source video sequence S 110 of duration PS 112, consisting of NS frames, which are timed at 24 frames per second, i.e. having a frame period TS 114 of 1/24 seconds, it is desired to produce output video field sequence R 120 having a duration PR 122.
Given that the output video field sequence R 120 must comprise a whole number NR of fields Rk 125, the duration PR 122 is necessarily an integer number NR of fields, each with a predefined duration TR 124. Accordingly:
PR=TR*NR Equation 1
Furthermore, in the case of the output video field sequence R 120 comprising an interlace format, the output video field sequence R 120 must necessarily be a whole number of frames, with each frame consisting of consecutive fields 1 and 2. Thus, the desired choice of output duration PR 122 may only be such that NR is even. In any event, the desired choice of output duration PR 122 defines the number NR of fields Rk 125 in the output video field sequence R 120.
As shown in
Tx=PS/NR Equation 2
Accordingly, and as described in greater detail below, the sequence C 130 of NR conversion time intervals Ci 135 may be used to determine from which source frame 115 each output video field Rk 125 is to be generated.
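Equations 1 and 2 can be checked numerically. The following sketch uses hypothetical values (a 600-second 24 Hz source, 60 Hz output fields, and a 4% duration reduction) to derive the number of output fields NR and the conversion interval period Tx:

```python
# Hypothetical example: 600 s of 24 Hz source, 60 fields/s output,
# with a 4% programme duration reduction (d = 0.96).
P_S = 600.0          # source duration in seconds
T_R = 1.0 / 60.0     # output field period (Equation 1: P_R = T_R * N_R)
d = 0.96             # duration change scaling

P_R = d * P_S                 # desired output duration (576 s)
N_R = round(P_R / T_R)        # whole number of output fields
if N_R % 2:                   # interlaced output needs field 1/2 pairs
    N_R -= 1
T_x = P_S / N_R               # conversion interval period (Equation 2)
```

Here NR comes out at 34560 fields (17280 interlaced frames), and Tx is slightly longer than the output field period TR, reflecting the shortened running time.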
The method starts at 305, and moves on to 310 where the duration PS 112 of the source video sequence S 110 and the number NR of fields Rk 125 in the output video field sequence R 120 are determined. Such determinations may be by way of user input, or through being derived from the source video sequence S 110 itself (in the case of the duration PS 112) and a required duration PR 122 for the output video field sequence R 120 (in the case of the NR of fields Rk 125). The period TX 134 for the conversion time intervals Ci 135 is then computed, at 315, and counter values k and i are initialised at 320.
A first field (Rk=R0) 125 is then generated from the first source frame (Si=S0) 115 using a field 1 sample grid, and outputted at 325. Since the field Rk 125 generated at 325 is a first field within the interlaced output video field sequence R 120, at least one further field must be generated in order to complete the field 1/field 2 interlaced field pairing. Accordingly, the method moves on to 330, where the counter k is incremented.
A temporal alignment parameter CRk indicative of a temporal alignment of a start time TCk of the conversion time interval Ck 135 within the sequence C 130 of conversion time intervals with respect to the source video frame Si is then computed, at 335. For example, and referring back to
TCk=k*Tx Equation 3
The start time TCk of a conversion time interval Ck 135 may further be expressed in terms of source frame units SCk:
SCk=TCk/TS Equation 4
A fractional component FCk of this source frame unit value SCk may then be obtained, which represents the temporal alignment of the start time TCk 140 of the conversion time interval Ck 135 with respect to its immediately preceding source frame Si:
FCk=SCk modulo 1 Equation 5
Substituting Equations 3 and 4 into Equation 5 gives:
FCk=(k*Tx/TS) modulo 1 Equation 6
The index i of the immediately preceding source frame Si is given by:
i=floor(k*Tx/TS) Equation 7
That is to say, i is equal to the largest integer not greater than the source frame unit value SCk, and thus is equal to the largest integer not greater than (k*Tx/TS). Accordingly, the fractional component FCk may be obtained by the alternative expression:
FCk=(k*Tx/TS)−i Equation 8
The fractional component FCk may then be used as (or otherwise used to derive) the temporal alignment parameter CRk indicative of the temporal alignment of the start time TCk of the conversion time interval Ck 135.
Accordingly, referring back to
CRk=(k*Tx/TS)−i Equation 9
Hence, in this first instance for step 335, when k=1 and i=0, a temporal alignment parameter CR1 for the start time TC1 (
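Equations 7 and 9 can be sketched as a small helper. The numbers below are hypothetical, assuming a 24 Hz source, 60 Hz output fields and a 4% duration reduction, so that Tx/TS = 0.4/0.96:

```python
import math

def alignment(k, T_x, T_S):
    """Temporal alignment of conversion interval C_k (Equations 7-9).

    Returns (i, C_Rk): the index of the immediately preceding source
    frame and the fractional alignment of the interval's start time
    with respect to that frame, in source frame units.
    """
    s = k * T_x / T_S          # start time in source frame units (Eq. 4)
    i = math.floor(s)          # preceding source frame index (Eq. 7)
    return i, s - i            # fractional component (Equations 8/9)

# Hypothetical values: 24 Hz source, 60 Hz fields, d = 0.96,
# giving T_x / T_S = 0.4 / d.
T_S = 1.0 / 24.0
T_x = (0.4 / 0.96) * T_S

i, c_r1 = alignment(1, T_x, T_S)   # the k = 1 instance discussed above
```

For k = 1 this yields i = 0 and CR1 ≈ 0.417, i.e. the start time of C1 falls roughly 42% of the way into source frame S0.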
Having computed the temporal alignment parameter CRk, the method moves on to 340 where, in the illustrated example, it is compared to a threshold value Z to determine the source video frame 115 from which to generate the video field Rk 125. In particular for the example illustrated in
The choice of threshold Z determines the timing of a transition from one source frame 115 to the next, from which the output video fields 125 are generated. In effect, adjusting the threshold Z creates a variable sub-field latency, on average, in the delivery of image samples. Hence, the threshold Z has implications for the alignment of the image component of a programme following conversion with respect to other programme content such as audio.
Ultimately, the relative timing of the image component with respect to the audio component is a subjective choice. However, where image samples deviate from the presentation of audio samples, it is widely recognised that images presented early with respect to audio are significantly preferable to images presented late with respect to audio. The reason is simple: the former never occurs in normal real-world experience, whereas, owing to the finite and differing speeds of light and sound, the latter occurs frequently.
For the case of normal 2:3 pull-down as illustrated in
In general, to ensure that the audio always lags behind the video, the index i of the source frame 115 from which an output field Rk 125 is generated must be chosen such that the start time of the chosen source frame Si used to generate the output field Rk 125 is at most one output field period TR 124 early.
For a normal 2:3 pattern, one output field period is 40% of the duration of a source frame period. Therefore Z=1−(40/100)=0.6 would provide the optimum output for a normal 2:3 pattern. When the programme length PR 122 (
It is preferable that the difference in time between the presentation of any source frame 115 versus the presentation of the same image content in an output field 125 should never be greater than one output field period TR 124 (
The optimum choice of threshold Z can be calculated from the desired programme length duration reduction.
Let us define a duration change scaling, d, such that:
PR=d*PS Equation 10
where d>0, with d<1 corresponding to a programme duration reduction and d>1 to a programme duration increase. As noted above, to ensure that the difference in time between the presentation of any source frame 115 and the presentation of the same image content in an output field 125 is never greater than one output field period TR 124, we must ensure that the start time of the chosen source frame Si used to generate the output field Rk 125 is at most one output field period TR 124 early. In the case where one output field period TR 124 is 40% of the duration of a source frame period TS 114 (for example where TS=1/24 s and TR=1/60 s), and for a decrease or increase in programme length of d, the optimum threshold value Z may be determined as:
Z=1−(0.4/d) Equation 11
For example, for a 4% programme duration decrease, d=0.96, giving a threshold value of: Z=1−0.4/0.96=0.583. Conversely, for a 4% programme duration increase, d=1.04, giving a threshold value of: Z=1−0.4/1.04=0.615.
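Equation 11 and the two worked examples above can be verified with a one-line helper (the 0.4 ratio assumes a 24 Hz to 60 Hz conversion, where one output field period is 40% of a source frame period):

```python
def optimum_threshold(d):
    """Optimum threshold Z from Equation 11, assuming one output field
    period is 40% of a source frame period (24 Hz -> 60 Hz)."""
    return 1.0 - 0.4 / d

z_reduce = optimum_threshold(0.96)   # 4% duration decrease
z_expand = optimum_threshold(1.04)   # 4% duration increase
```

As in the worked examples, a 4% reduction gives Z ≈ 0.583 and a 4% increase gives Z ≈ 0.615, while d = 1 (no length change) recovers the normal 2:3 value of Z = 0.6.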
Referring back to
Since the field Rk 125 generated at 355 is a field 2 within the interlaced output video field sequence R 120, the method moves on to 360 where the counter k is incremented. It is then determined whether the output field Rk 125 generated at step 355 was the last field in the sequence R 120; i.e. whether the counter value k=(NR−1), at 365. If it is determined that the output field Rk 125 generated at step 355 was the last field in the sequence R 120, the method ends at 370. However, if it is determined that the output field Rk 125 generated at step 355 was not the last field in the sequence R 120, the method loops back to step 335.
Referring back to step 350, if it is determined that the counter value k is an even number, the method loops back to 325 where the next field (Rk) 125 is generated from the first source frame (S0) 115 using the field 1 sample grid.
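The loop described above can be sketched as follows. This is a simplified, stateless reading of the method: the comparison direction (selecting Si+1 when CRk meets or exceeds Z) and the `sample` callback are assumptions made for illustration, since the corresponding figure details are not reproduced here.

```python
def generate_fields(n_r, n_s, T_x, T_S, Z, sample):
    """Sketch of the field-generation loop described above.

    `sample(i, field_type)` is a hypothetical callback that samples
    source frame S_i with the field-1 or field-2 sample grid.
    Returns the list of generated output fields R_0 .. R_{n_r-1}.
    """
    out = []
    for k in range(n_r):
        s = k * T_x / T_S            # start time of C_k in frame units
        i = int(s)                   # preceding source frame (Eq. 7)
        c_rk = s - i                 # alignment parameter (Eq. 9)
        frame = i + 1 if c_rk >= Z else i   # threshold comparison
        frame = min(frame, n_s - 1)  # clamp at the final source frame
        field_type = 1 if k % 2 == 0 else 2  # field 1/field 2 pairing
        out.append(sample(frame, field_type))
    return out

# Hypothetical 1:1 conversion (d = 1): T_x / T_S = 0.4, Z = 0.6.
fields = generate_fields(10, 4, 0.4, 1.0, 0.6, lambda i, f: (i, f))
```

With these values the loop yields, on average, 2.5 output fields per source frame, consistent with a conventional 2:3 cadence.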
Referring now to
The method starts at 505, and moves on to 510 where the duration PS 112 of the source video sequence S 110 and the number NR of fields Rk 125 in the output video field sequence R 120 are determined. Such determinations may be by way of user input, or through being derived from the source video sequence S 110 itself (in the case of the duration PS 112) and a required duration PR 122 for the output video field sequence R 120 (in the case of the NR of fields Rk 125). The period TX 134 for the conversion time intervals Ci 135 is then computed, at 515, and counter values k and i are initialised at 520.
A first field (Rk=R0) 125 is then generated from the first source frame (Si=S0) 115, and outputted at 525. The method then moves on to 530 where it is then determined whether the output field Rk 125 generated at step 525 was the last field in the sequence R 120; i.e. whether the counter value k=(NR−1). If it is determined that the output field Rk 125 generated at step 525 was the last field in the sequence R 120, the method ends at 560. However, if it is determined that the output field Rk 125 generated at step 525 was not the last field in the sequence R 120, the method moves on to 535 where the counter k is incremented.
A temporal alignment parameter CRk indicative of a temporal alignment of a start time TCk of the conversion time interval Ck 135 within the sequence C 130 of conversion time intervals with respect to the source video frame Si is then computed at 540, for example as described above in relation to the method of
Having computed the temporal alignment parameter CRk, the method moves on to 545 where, in the illustrated example, it is compared to a threshold value Z to determine the source video frame 115 from which to generate the video field Rk 125. In particular for the example illustrated in
Although the above description refers specifically to programme duration reduction or expansion in the case of a 24 Hz progressive to 60 Hz interlaced conversion, it is equally applicable to any other frame rate conversion. In particular, content which is sourced at 60 Hz interlaced with a 2:3 cadence initially applied may in a first step be reverted to its original 24 Hz frame pattern using a method known to broadcast engineers as “reverse telecine”.
It should be noted that the amount of programme length reduction or expansion enabled by the method disclosed above will be limited by the number of consecutive 3-field or 2-field frames that the end viewer will find subjectively acceptable. Thus it may be noted that the method allows a maximum of 20% reduction in programme length (where d=0.8) and a maximum of 20% increase in programme length (where d=1.2).
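These 20% limits follow from the average number of output fields generated per source frame, which for a 24 Hz to 60 Hz conversion is 2.5·d: at d=0.8 every source frame yields exactly two fields, and at d=1.2 exactly three. A quick check:

```python
# Average output fields per source frame for 24 Hz -> 60 Hz conversion,
# scaled by the duration change d (Equation 10): N_R/N_S = d * T_S/T_R.
def fields_per_frame(d, T_S=1.0 / 24.0, T_R=1.0 / 60.0):
    return d * T_S / T_R

limit_reduce = fields_per_frame(0.8)   # every frame a 2-field frame
limit_expand = fields_per_frame(1.2)   # every frame a 3-field frame
```

Beyond these bounds the cadence could no longer be expressed as a mix of 2-field and 3-field frames, so frames would have to be dropped or held for four or more fields.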
Referring now to
In the example illustrated in
The memory element 620 may comprise, for example and without limitation, one or more of the following: magnetic storage media including disk and tape storage media; optical storage media such as compact disk media (e.g., CD-ROM, CD-R, etc.) and digital video disk storage media; non-volatile memory storage media including semiconductor-based memory units such as FLASH memory, EEPROM, EPROM, ROM; ferromagnetic digital memories; Magnetoresistive random-access memory (MRAM); volatile storage media including registers, buffers or caches, main memory, RAM, etc.
In the example illustrated in
In accordance with some example embodiments of the present invention, the video field generator component 610 may be arranged to perform one of the methods of generating a video field Rk as illustrated in
The video field generator 610 has been illustrated and described as being implemented by way of computer program code executed on one or more processor devices. However, it is contemplated that the video field generator 610 is not limited to being implemented by way of computer program code, and it is contemplated that any suitable alternative implementation may equally be employed. For example, it is contemplated that one or more steps of the methods illustrated in
As previously identified, the invention may be implemented in a computer program for running on a computer system, at least including code portions for performing steps of a method according to the invention when run on a programmable apparatus, such as a computer system or enabling a programmable apparatus to perform functions of a device or system according to the invention.
A computer program is a list of instructions such as a particular application program and/or an operating system. The computer program may for instance include one or more of: a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.
The computer program may be stored internally on a tangible and non-transitory computer readable storage medium or transmitted to the computer system via a computer readable transmission medium. All or some of the computer program may be provided on computer readable media permanently, removably or remotely coupled to an information processing system. The tangible and non-transitory computer readable media may include, for example and without limitation, any number of the following: magnetic storage media including disk and tape storage media; optical storage media such as compact disk media (e.g., CD-ROM, CD-R, etc.) and digital video disk storage media; non-volatile memory storage media including semiconductor-based memory units such as FLASH memory, EEPROM, EPROM, ROM; ferromagnetic digital memories; MRAM; volatile storage media including registers, buffers or caches, main memory, RAM, etc.
A computer process typically includes an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process. An operating system (OS) is the software that manages the sharing of the resources of a computer and provides programmers with an interface used to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system.
The computer system may for instance include at least one processing unit, associated memory and a number of input/output (I/O) devices. When executing the computer program, the computer system processes information according to the computer program and produces resultant output information via I/O devices.
In the foregoing specification, the invention has been described with reference to specific examples of embodiments of the invention. It will, however, be evident that various modifications and changes may be made therein without departing from the scope of the invention as set forth in the appended claims and that the claims are not limited to the specific examples described above.
Those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. For example, multiple operations may be combined into a single operation, a single operation may be distributed in additional operations and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.
Also for example, the examples, or portions thereof, may be implemented as software or code representations of physical circuitry, or of logical representations convertible into physical circuitry, such as in a hardware description language of any appropriate type.
However, other modifications, variations and alternatives are also possible. The specification and drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms ‘a’ or ‘an,’ as used herein, are defined as one or more than one. Also, the use of introductory phrases such as ‘at least one’ and ‘one or more’ in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles ‘a’ or ‘an’ limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases ‘one or more’ or ‘at least one’ and indefinite articles such as ‘a’ or ‘an.’ The same holds true for the use of definite articles. Unless stated otherwise, terms such as ‘first’ and ‘second’ are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.
Priority application: GB 1602790.6, filed February 2016 (national).