The present invention relates generally to the field of video browsing. The invention concerns a method for browsing video frames in a timeline user interface.
Timeline user interfaces are classically used for browsing collections of video frames. Such a timeline user interface is generally embedded in video players, video editing software or video summarizing software. In these user interfaces, the video frames are displayed in chronological order, which introduces the notions of a current frame, past frames (to the left of the current frame) and future frames (to the right of the current frame).
The layouts (designs) of these timeline user interfaces may be basically classified into two categories: 1-D layouts and 2-D layouts.
The 1-D layout comprises generally a horizontal timeline whose left end is associated to the start of the video sequence and the right end is associated to the end of the video sequence. Classic video players, such as the one illustrated by
As other 1-D horizontal layouts, key-frame filmstrips as illustrated by
For both key-frame filmstrips and tapestries, the length of the strip is generally much greater than the width of the user's screen, so the user cannot get a global overview of the whole video sequence on a single screen. The user can view the whole strip piece by piece, either by dragging the strip to bring another time range into view, or by adjusting the time subsampling step of the displayed key frames.
Tapestries and key-frame filmstrips only exploit the width of the user's screen but not the height. Such representations do not allow a global overview of the video sequence. So 2-D layouts have been developed in order to exploit the two dimensions (width and height) of the user's screen.
This system is configured to allow the user to scroll through the video frames. More specifically, it may be configured to move (fast forward or rewind) the video frames along the path by dragging a video frame to the left or to the right.
This layout is adapted to display consecutive video frames of a video sequence. When the video sequence, for example a movie, comprises a large number of video frames, they cannot all be displayed in the path. Moreover, this layout is not adapted to zooming in or zooming out on parts of the video sequence.
There is thus a need for enhanced browsing solutions that provide an overview of the whole video sequence by displaying key frames representative of the whole video sequence, and that enable the user to zoom in or zoom out easily on some part of the video sequence.
The present invention concerns a method for browsing a collection of P video frames through a user interface, wherein each video frame has a timestamp and wherein the user interface comprises N cells disposed along a time line with N<P, said method comprising the steps of:
Hence, according to the invention, a subset of N key frames representative of the collection is first displayed in the cells of the user interface. Then, the user can stretch (zoom-in) or compress (zoom-out) a portion of the displayed collection by moving a selected key frame displayed in a given cell to another cell, thus adjusting the uniform subsampling step piecewise. According to the invention, the zoom-in or zoom-out is fully customizable by appropriately selecting the cells Ci and Cj.
According to a particular embodiment, in response to the input command, if the cells Ci and Cj are both past cells, only the key frames displayed in the past cells and the current cell are updated and if the cells Ci and Cj are both future cells, only the key frames displayed in the future cells and the current cell are updated.
According to another embodiment, in response to the input command, the key frames displayed in all cells are updated, except possibly the first cell and the last cell.
Thus, if the user moves the key frame of the cell Ci to the cell Cj with j>i, it will result in stretching (zooming-in) the portion of the collection displayed in the cells disposed along the time line before the cell Cj while compressing (zooming-out) the portion of the collection displayed in the cells disposed along the time line after the cell Cj. Conversely, if the user moves the key frame of the cell Cj to the cell Ci, it will result in compressing (zooming-out) the portion of the collection displayed in the cells disposed along the time line before the cell Cj while stretching (zooming-in) the portion of the collection displayed in the cells disposed along the time line after the cell Cj.
According to a particular embodiment, the collection of frames is a video sequence.
According to a particular embodiment, the N key frames displayed in the N cells are determined by subsampling the video sequence.
According to a particular embodiment, the temporal distance between two key frames displayed in two consecutive cells is substantially constant. In this case, the subsampling step is fixed.
According to a particular embodiment, the N key frames displayed in the N cells are determined by:
According to a particular embodiment, the temporal distance between two frames displayed in two consecutive cells of the past cells or the future cells is substantially constant. The subsampling step for determining the key frames of the past cells may be different from the subsampling step for determining the key frames of the future cells.
According to a particular embodiment, in response to the input command, the updating key frames are determined by subsampling a portion of the video sequence preceding the key frame displayed in the cell Ci and subsampling a portion of the video sequence subsequent to the key frame displayed in the cell Ci.
According to another embodiment, the N key frames displayed in the N cells are determined by selecting N key frames in the video sequence according to a predetermined selection criterion. This selection may be based on a saliency criterion or on the presence of faces, cuts or shots in the video sequence.
In this embodiment, the N key frames are advantageously selected among Q key frames determined according to said predetermined selection criterion, with N<Q<P, and, when a transfer of a key frame from a cell Ci to a cell Cj is requested, the updating key frames are selected among the Q frames.
According to a particular embodiment, the updating key frames are determined by firstly selecting key frames among the key frames previously displayed in the N cells and, when appropriate, adding intermediate key frames issued from the collection of video frames.
According to a particular embodiment, the user interface comprises N=2M+1 cells, one cell for the current cell, M cells for the past cells and M cells for the future cells.
According to a particular embodiment, the cells are disposed along the time line such that the first M cells define a first spiral of cells and the last M cells define a second spiral of cells, said first and second spirals of cells being linked to both sides of the current cell.
According to a particular embodiment, the collection of video frames is a collection of reference frames, each reference frame being representative of its own video sequence having a creation date, said creation date being used as timestamp for the reference frame.
The invention also concerns a processing device for browsing a collection of P video frames through a user interface, wherein each video frame has a timestamp and wherein the user interface comprises N cells disposed along a time line with N<P, said processing device comprising a processor for processing said P video frames, a display element for displaying said user interface, an input circuit for receiving input commands, the processor being configured to:
The invention can be better understood with reference to the following description and drawings, given by way of example and not limiting the scope of protection, and in which:
While example embodiments are capable of various modifications and alternative forms, embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit example embodiments to the particular forms disclosed, but on the contrary, example embodiments are to cover all modifications, equivalents, and alternatives falling within the scope of the claims. Like numbers refer to like elements throughout the description of the figures.
Methods discussed below, some of which are illustrated by the drawings, may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine or computer readable medium such as a storage medium. A processor(s) may perform the necessary tasks. Specific structural and functional details disclosed herein are merely representative for purposes of describing example embodiments of the present invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising,” “includes” and/or “including”, when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention will be described hereinafter by using a user interface layout having N=2M+1 cells Ci disposed along a time line as illustrated by
In this figure, the first M cells along the time line (past cells) define a first spiral SP1 of cells and the last M cells (future cells) define a second spiral SP2 of cells, these first and second spirals of cells being linked to both sides of the current cell. The current cell is referenced C0, the past cells are referenced Ci with i ∈ [−M . . . −1] and the future cells are referenced Ci with i ∈ [1 . . . +M].
Of course, other geometries of 1-D or 2-D layout with cells disposed along a time line may be used for the user interface without departing from the scope of the invention. Likewise, the cells do not necessarily have the same size, and this size can vary according to the position of the cell. In the example of
According to the invention and as illustrated by
In a particular embodiment, the step S1 is implemented by subsampling the video sequence. Advantageously, the temporal distance between two successive video frames of the N video frames is substantially constant (the subsampling interval is thus fixed).
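The fixed-interval subsampling of step S1 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that the subsampling interval equals the time span between the first and last displayed frames divided by 2M, and that each target timestamp is mapped to the nearest available frame. The function names `uniform_key_timestamps` and `nearest_frame` are hypothetical.

```python
def uniform_key_timestamps(t_first, t_last, m):
    """Return {i: t_i} for cells C_-m .. C_m, using one constant step.

    Assumption: the step is (t_last - t_first) / (2 * m), so that cell C_-m
    shows the frame at t_first and cell C_m shows the frame at t_last.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if t_last <= t_first:
        raise ValueError("t_last must be greater than t_first")
    step = (t_last - t_first) / (2 * m)
    return {i: t_first + (i + m) * step for i in range(-m, m + 1)}


def nearest_frame(frame_timestamps, t):
    """Return the available frame timestamp closest to the target time t."""
    return min(frame_timestamps, key=lambda ts: abs(ts - t))


# Example: a 120-second sequence shown in N = 2M + 1 = 7 cells.
cells = uniform_key_timestamps(0.0, 120.0, 3)
```

With these values the step is 20 seconds, so the current cell C0 is assigned the timestamp 60.0.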
This embodiment is illustrated by
(also called first subsampling interval) where t−M and tM designate the timestamp of the first and last video frames displayed in the cells C−M and CM. The key frames F(ti) are displayed in the cells Ci with
In a variant illustrated by
In this case illustrated by
and the future cells Ci (i ∈ [1 . . . M]) are filled with the key frames F(ti) with
In this embodiment, the portion of the video sequence preceding the key frame F(t0) is subsampled by a subsampling step
(also called fourth subsampling interval) for determining the key frames to be displayed in the past cells C−M . . . C−1 and the portion of the video sequence subsequent to the key frame F(t0) is subsampled by a subsampling step
(also called fifth subsampling interval) for determining the key frames to be displayed in the future cells C1 . . . CM.
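The variant above, in which the past and future portions are subsampled separately around the key frame F(t0), can be sketched as follows. This is an illustration under stated assumptions: the past step is taken as (t0 − t−M)/M and the future step as (tM − t0)/M, which is one natural reading of the fourth and fifth subsampling intervals. The function name `two_sided_key_timestamps` is hypothetical.

```python
def two_sided_key_timestamps(t_first, t_current, t_last, m):
    """Return {i: t_i} with separate constant steps before and after t_current.

    Assumed steps: (t_current - t_first) / m for the past cells and
    (t_last - t_current) / m for the future cells.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if not (t_first < t_current < t_last):
        raise ValueError("expected t_first < t_current < t_last")
    past_step = (t_current - t_first) / m
    future_step = (t_last - t_current) / m
    cells = {0: t_current}
    for i in range(1, m + 1):
        cells[-i] = t_current - i * past_step    # past cells C_-1 .. C_-m
        cells[i] = t_current + i * future_step   # future cells C_1 .. C_m
    return cells


# Example: current frame at 30 s in a 120-second sequence, M = 3.
cells = two_sided_key_timestamps(0.0, 30.0, 120.0, 3)
```

In this example the past cells are 10 seconds apart and the future cells are 30 seconds apart, so the past spiral shows finer temporal detail than the future spiral.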
In reference to
In response to this input command, the method of the invention comprises a step S4 (
Examples of input commands will be described hereinafter in reference to
In that case, the key frames F(ti) displayed in the cells Ci are updated as illustrated by
are displayed in the past cells Ci (i ∈ [−M . . . −1]) and the key frames F(ti) with
are displayed in the future cells Ci (i ∈ [1 . . . M]). The hatched cells indicate the cells in which the key frame is updated.
It creates a zooming-in (stretching) effect in the future spiral (future cells) since the same number of key frames (M key frames) is used to describe less time (tM − tu instead of tM − t0). Hence, more temporal details on what happens between instants tu and tM are revealed in the future spiral. Conversely, it creates a zooming-out (compressing) effect in the past spiral (past cells) since the same number of key frames (M key frames) is used to describe more time (tu − t−M instead of t0 − t−M). Accordingly, in this embodiment, the portion of the video sequence preceding the key frame F(tu) is subsampled by a subsampling step
(also called second subsampling interval) for determining the key frames to be displayed in the past cells C−M . . . C−1 and the portion of the video sequence subsequent to the key frame F(tu) is subsampled by a subsampling step
(also called third subsampling interval) for determining the key frames to be displayed in the future cells C1 . . . CM.
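The zoom effect described above can be quantified by comparing the subsampling steps before and after the key frame F(tu) becomes the current frame. The sketch below assumes that each spiral always holds M key frames, so that the step of a spiral is its time span divided by M and the ratio of new step to old step does not depend on M. The function name `zoom_ratios` is hypothetical.

```python
def zoom_ratios(t_first, t_old, t_new, t_last):
    """Return (past_ratio, future_ratio): new step divided by old step.

    A ratio below 1 means zooming in (finer step); above 1 means zooming out.
    Assumption: each spiral keeps the same number of cells, so only the time
    span covered by the spiral changes.
    """
    if not (t_first < t_old < t_last and t_first < t_new < t_last):
        raise ValueError("t_old and t_new must lie strictly inside the sequence")
    past_ratio = (t_new - t_first) / (t_old - t_first)
    future_ratio = (t_last - t_new) / (t_last - t_old)
    return past_ratio, future_ratio


# Example: current frame moves from t0 = 60 s to tu = 90 s in a 120 s sequence.
past_ratio, future_ratio = zoom_ratios(0.0, 60.0, 90.0, 120.0)
```

Here the past step grows by a factor of 1.5 (zoom-out) while the future step is halved (zoom-in), matching the behaviour described for a key frame taken from the future spiral.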
In that case, the key frames F(ti) displayed in the cells Ci are updated as illustrated by
are displayed in the future cells Ci (i ∈ [1 . . . M]). The subsampling step of the future spiral is reduced
which provides the temporal zoom effect. To zoom out, the user may touch the key frame of the future spiral and drag it out of the spiral, then the key frame F(tM) will be re-set at the center of the spiral.
This action reveals more temporal details on what happens between the key frame F(t0) and the key frame F(tu).
In that case, the key frames F(ti) displayed in the cells Ci are updated as illustrated by
are displayed in the future cells Ci (i ∈ [1 . . . j]) and the key frames F(ti) with
are displayed in the future cells Ci (i ∈ [j+1 . . . M]).
Two different subsampling steps δ1 and δ2 are then used inside the future spiral.
(also called second subsampling interval in this embodiment) is applied to the j first key frames and
(also called third subsampling interval in this embodiment) is applied to the M-j last key-frames.
Given k (1≤k<M and k≠j) the index of the cell containing the key frame F(tu) before user interaction, j>k corresponds to using more key frames to represent the time interval [t0, tu] while using fewer key frames to represent the time interval [tu, tM], i.e. stretching the representation of the time interval [t0, tu] while compressing the representation of the time interval [tu, tM].
On the contrary, j<k corresponds to using fewer key frames to represent [t0, tu] while using more key frames to represent [tu, tM], i.e. compressing the representation of the time interval [t0, tu] while stretching the representation of the time interval [tu, tM].
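The piecewise subsampling obtained when the key frame F(tu) is dropped onto the cell Cj can be sketched as follows. This is an illustration under assumptions: the first step is taken as the span from the start timestamp to tu divided by the number of cells up to Cj, and the second step as the span from tu to the end timestamp divided by the number of remaining cells. The function name `piecewise_key_timestamps` and the parameters `start` and `end` (indices of the first and last cells of the updated zone) are hypothetical.

```python
def piecewise_key_timestamps(t_start, t_u, t_end, start, j, end):
    """Return ({i: t_i}, step1, step2) for cells C_start .. C_end.

    The frame at t_u is pinned to cell C_j. Cells before C_j use step1 and
    cells after C_j use step2 (assumed formulas, see the lead-in text).
    """
    if not (start < j < end):
        raise ValueError("expected start < j < end")
    if not (t_start < t_u < t_end):
        raise ValueError("expected t_start < t_u < t_end")
    step1 = (t_u - t_start) / (j - start)
    step2 = (t_end - t_u) / (end - j)
    cells = {}
    for i in range(start, j + 1):
        cells[i] = t_start + (i - start) * step1
    for i in range(j + 1, end + 1):
        cells[i] = t_u + (i - j) * step2
    return cells, step1, step2


# Example: future spiral with M = 6, t0 = 0 s, tM = 100 s; F(40 s) dropped on C4.
cells, step1, step2 = piecewise_key_timestamps(0.0, 40.0, 100.0, 0, 4, 6)
```

With `start` set to 0 and `end` set to M, the function covers the future spiral only: the j first key frames use the first step and the M−j last key frames use the second step. In this example the interval from 0 to 40 seconds is stretched over four cells (step of 10 seconds) while the interval from 40 to 100 seconds is compressed into two cells (step of 30 seconds).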
In that case, the key frames F(ti) displayed in the cells Ci are updated as illustrated by
are displayed in the cells Ci (i ∈ [−M . . . j]) and the key frames F(ti) with
are displayed in the future cells Ci (i ∈ [j+1 . . . M]). Accordingly, two different subsampling steps δ1 and δ2 are then used inside the whole spiral.
(also called second subsampling interval in this embodiment) is applied to the j first key frames and
(also called third subsampling interval in this embodiment) is applied to the M-j last key-frames.
Of course, other interactions by dragging the key frame of a cell onto any other cell of the spirals are possible. The update operation can be applied to the cells of only one spiral (e.g.
In some cases, it may be a bit disturbing for the user to see the key frames of all the cells change as soon as the key frame of the current cell changes, and to no longer see the previously displayed key frames. To address this problem, it is proposed, when zooming-in in a zone of cells, to select the updating key frames among the key frames previously displayed in the cells of this zone, to redistribute them linearly in this zone and to fill the empty cells with new key frames selected from the video sequence by using a uniform subsampling. When zooming-out in a zone of cells, it is proposed to select the updating key frames among the key frames previously displayed in this zone and to redistribute them linearly in the cells of the zone. This update operation is illustrated by
In all the examples given hereinabove, the key frames and the updating key frames are selected automatically by subsampling in time. Alternative solutions may be used for selecting the key frames, such as a selection based on saliency or on the detection of faces, cuts or shots in the video sequence. For example, Q frames can be selected among the P frames of a video sequence, with Q<P, based on a saliency criterion. N key frames are then selected among the Q frames in order to be displayed in the N cells of the user interface. When a transfer of a key frame from a cell Ci to a cell Cj is requested, the updating key frames are selected among the Q frames. For example, if there are K key frames to be updated, the K updating key frames can be selected by subsampling a portion of the Q frames, said portion being related to the part of the video sequence comprising the key frames to be updated. In that case, this is not a time subsampling.
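The selection among Q pre-selected frames can be sketched as follows. This is an illustration only: it assumes that the Q candidates are given as a list of timestamps sorted in chronological order, and that "subsampling a portion of the Q frames" means picking entries evenly spaced by list index rather than by time. The function names `subsample_by_index` and `update_portion` are hypothetical, and the saliency detection itself is outside the scope of the sketch.

```python
def subsample_by_index(candidates, n):
    """Pick n candidates evenly spaced by index, keeping the first and last."""
    if n < 2 or n > len(candidates):
        raise ValueError("expected 2 <= n <= len(candidates)")
    last = len(candidates) - 1
    # Integer arithmetic keeps the picked indices distinct and reproducible.
    return [candidates[k * last // (n - 1)] for k in range(n)]


def update_portion(candidates, t_lo, t_hi, k):
    """Pick k updating key frames among the candidates lying in [t_lo, t_hi]."""
    portion = [t for t in candidates if t_lo <= t <= t_hi]
    return subsample_by_index(portion, k)


# Example: Q = 10 candidate timestamps (in seconds), N = 4 cells.
q_frames = [3, 8, 15, 21, 34, 40, 55, 61, 78, 90]
initial = subsample_by_index(q_frames, 4)
updated = update_portion(q_frames, 21, 61, 3)
```

Because the candidates are not evenly spaced in time, the selected key frames are evenly spaced in the candidate list but not along the time axis, which is the sense in which this is not a time subsampling.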
Likewise, the invention has been described hereinabove for browsing a video sequence. It can also be used for browsing other collections of frames, for example a collection of reference frames, each reference frame being representative of its own video sequence having a creation date, said creation date being used as timestamp for the reference frame. In that case, the key frames are selected by subsampling the collection of reference frames. The key frames are then displayed in the cells of the user interface in chronological order by using the associated creation date.
The internal memory 11 stores a computer program comprising instructions which, when executed by the processing device 1, in particular by the processor 10, make the processing device 1 carry out the processing method described in
According to exemplary and non-limitative embodiments, the processing device 1 is a device, which belongs to a set comprising:
According to the invention, the processor is configured to implement the following steps:
Number | Date | Country | Kind |
---|---|---|---|
1530532.7 | Mar 2015 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2016/054592 | 3/3/2016 | WO | 00 |