Method and system for generating a video frame

Abstract
The present disclosure provides a method for generating a video frame and a system thereof, including: receiving at least two frames of a video captured by an image capture unit through a network; calculating a first set of optical flow vectors of the at least two frames by a first algorithm; generating a set of modified vectors according to at least one parameter; combining the set of modified vectors and the first set of optical flow vectors to obtain a second set of optical flow vectors; and shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image. Therefore, the present disclosure can reduce deviation caused by the latency of the network and improve user experience.
Description
CROSS-REFERENCE TO RELATED APPLICATION

The present disclosure is based on, and claims priority from, Taiwan Application Number 105136858, filed Nov. 11, 2016, the disclosure of which is hereby incorporated by reference herein in its entirety.


BACKGROUND
1. Technical Field

The present disclosure relates to methods and systems for generating a video frame, and, more particularly, to a method and a system for generating a video frame that address the deviation caused by the latency of the network.


2. Description of the Prior Art

Unmanned vehicles (e.g., cars or aircraft) are usually remotely controlled by viewing images from a first-person perspective. The images captured by unmanned vehicles are usually transmitted to a remote display via a mobile broadband network using, for example, 2.4 GHz automatic frequency hopping technology. However, the video stream data of a typical image is bandwidth-demanding, and the mobile broadband network often has latency (e.g., on the order of tens of milliseconds or even 1 to 2 seconds), resulting in non-real-time transmission of images to the remote display. If a user controls an unmanned vehicle by looking at the delayed video images, the unmanned vehicle may deviate from its course or collide.


Accordingly, there is an imperative need for a method and a system for generating a video frame that address the aforementioned issues in the prior art.


SUMMARY

The present disclosure provides a method for generating a video frame, which may include: obtaining at least two frames in a video captured by an image capture unit through a network; calculating a first set of optical flow vectors of the at least two frames by a first algorithm; generating a set of modified vectors according to at least one parameter; combining the set of modified vectors and the first set of optical flow vectors to obtain a second set of optical flow vectors; and shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image.


The present disclosure provides a system for generating a video frame, which may include: an image capture unit for capturing a video; and a calculation unit connected with the image capture unit through a network and configured for receiving at least two frames in the video, wherein the calculation unit includes: a modified vector generating module configured for generating a set of modified vectors based on at least one parameter; an optical flow vector generating module configured for calculating a first set of optical flow vectors of the at least two frames by a first algorithm, and combining the set of modified vectors and the first set of optical flow vectors to generate a second set of optical flow vectors; and a virtual image generating module configured for shifting one of the at least two frames based on the second set of optical flow vectors to generate a virtual image.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a flowchart illustrating a method for generating a video frame in accordance with a first embodiment of the present disclosure;



FIG. 2 is a flowchart illustrating a method for generating a video frame in accordance with a second embodiment of the present disclosure;



FIG. 3 is a schematic diagram depicting optical flow vectors of objects in a frame in accordance with the present disclosure;



FIG. 4 is a schematic diagram depicting a first set of optical flow vectors of the objects in the frame in accordance with the present disclosure;



FIG. 5 is a schematic diagram depicting a set of modified vectors in accordance with the present disclosure;



FIG. 6 is a schematic diagram depicting a second set of optical flow vectors in accordance with the present disclosure;



FIG. 7 is a schematic diagram depicting a virtual image generated in accordance with the first embodiment of the present disclosure;



FIG. 8 is a schematic diagram depicting a virtual image generated in accordance with the second embodiment of the present disclosure; and



FIG. 9 is a schematic diagram depicting a system for generating a video frame in accordance with the present disclosure.





DETAILED DESCRIPTION

In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In other instances, well-known structures and devices are schematically shown in order to simplify the drawing.


Referring to FIG. 1, a method for generating a video frame of the present disclosure includes steps S11 to S14. In step S11, a first set of optical flow vectors of frames is calculated by a first algorithm. In an embodiment, at least two frames of a video captured by an image capture unit are obtained through a network, and a first set of optical flow vectors of the at least two frames is calculated by a first algorithm. In an embodiment, the at least two frames are contiguous or non-contiguous frames in time, and an object that appears in both of the at least two frames is used as a reference. In an embodiment, the image capture unit can be installed on an unmanned vehicle (e.g., a car or an aircraft), and the video captured by the image capture unit is obtained while the unmanned vehicle is moving.


Upon obtaining the at least two frames, an optical flow estimation technique is used for estimating a motion vector of each object in the frames. In an embodiment, the Lucas-Kanade optical flow method can be used for the calculation in order to obtain the first set of optical flow vectors. The method proceeds to step S12.


In step S12, a set of modified vectors is generated. In an embodiment, a set of modified vectors is generated according to at least one parameter. In an embodiment, the parameter is the latency information of the network, a direction of the image capture unit, a speed of the image capture unit, or a combination thereof. The method proceeds to step S13.


In step S13, the set of modified vectors and the first set of optical flow vectors are combined to obtain a second set of optical flow vectors. The method proceeds to step S14.


In step S14, one of the at least two frames is shifted according to the second set of optical flow vectors to generate a virtual image. In an embodiment, a frame that occurs later in time can be chosen as the frame being shifted, but the present disclosure is not limited as such.


Referring to FIGS. 2 to 6, a method for generating a video frame in accordance with a second embodiment of the present disclosure is shown. The technical contents of the second embodiment below can also be applied to the previous first embodiment.


In step S21, at least two frames of a video captured by an image capture unit are obtained through a network, and a set of optical flow vectors of the at least two frames is calculated by a first algorithm. As shown in FIG. 3, optical flow vectors 211, 221, 231 and 241 of objects 21, 22, 23 and 24 in a frame are calculated by the first algorithm, to generate the set of optical flow vectors.


In an embodiment, the first algorithm is optical flow estimation. Assuming that the optical flow is constant within a neighborhood of a pixel, matching a pixel H(x, y) in a frame H to its counterpart in a frame I can be regarded as a fit over all pixels in the neighborhood, and the fundamental optical flow equation can be given as follows:








$$E(u, v) = \sum_{x, y} \bigl( I(x + u, y + v) - H(x, y) \bigr)^2 = \sum_{x, y} \bigl( I(x, y) - H(x, y) + u I_x + v I_y \bigr)^2,$$

wherein u and v are the horizontal and vertical displacement components of the pixel, I_x and I_y are the partial derivatives of I with respect to x and y, and the second expression follows from a first-order Taylor expansion of I. As such, the optical flow vectors 211, 221, 231 and 241 of the objects 21, 22, 23 and 24 in the frame can be calculated to obtain a set of optical flow vectors. In another embodiment, the Lucas-Kanade optical flow method may also be used, and in addition to pixels, the calculation can be made with regard to blocks, but the present disclosure is not limited as such. The method proceeds to step S22.
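
The least-squares problem stated above can be illustrated for a single window. The following is a minimal sketch, not the disclosed implementation: it assumes grey-value frames stored as nested lists indexed [y][x], central-difference gradients, and one displacement per window; the function name lucas_kanade_window is hypothetical.

```python
def lucas_kanade_window(I, H):
    """Estimate one displacement (u, v) that minimizes
    sum((I - H + u*Ix + v*Iy)**2) over the interior of a small window.

    I and H are equal-sized 2-D lists of grey values indexed [y][x].
    Assumption: gradients of I are taken by central differences.
    """
    rows, cols = len(I), len(I[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix = (I[y][x + 1] - I[y][x - 1]) / 2.0  # dI/dx
            iy = (I[y + 1][x] - I[y - 1][x]) / 2.0  # dI/dy
            it = I[y][x] - H[y][x]                  # I(x, y) - H(x, y)
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has no two-dimensional texture (aperture problem)")
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v
```

The explicit determinant check reflects that a window whose gradients all point the same way cannot fix both components of the displacement.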


In step S22, it is determined whether an object is stationary or moving. Since the image capture unit is installed on a moving remotely controlled unmanned vehicle (e.g., an unmanned aerial vehicle or an unmanned car), the video images captured are not stationary, and the objects in the frames of the video have relative displacements with respect to the movement of the remotely controlled unmanned vehicle. Thus, it is determined whether the objects 21, 22, 23 and 24 shown in FIG. 3 are stationary or moving based on characteristic point directionality in conjunction with an associated learning mechanism (such as an artificial neural network). For example, as shown in FIG. 3, the objects 21 and 22 are trees, and should be determined as stationary objects; the object 24 is a zebra crossing, and should also be determined as a stationary object; and the object 23 is a car, and should be determined as a moving object. This is called ego-motion estimation. The method proceeds to step S23.
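
The stationary/moving decision can be illustrated with a deliberately simple stand-in. The sketch below does not implement the learning mechanism mentioned above; it swaps in a heuristic that assumes stationary objects share a similar flow (roughly true for lateral camera motion, not for strong forward motion) and flags an object as moving when its vector deviates from the median flow. The function name classify_objects and the threshold value are hypothetical.

```python
from statistics import median


def classify_objects(flows, threshold=1.5):
    """Label each object 'stationary' or 'moving'.

    flows: dict mapping object id -> (u, v) optical flow vector.
    Assumption: the median flow approximates camera-induced motion.
    """
    med_u = median(u for u, _ in flows.values())
    med_v = median(v for _, v in flows.values())
    labels = {}
    for obj_id, (u, v) in flows.items():
        # Euclidean distance of this object's flow from the median flow.
        deviation = ((u - med_u) ** 2 + (v - med_v) ** 2) ** 0.5
        labels[obj_id] = "moving" if deviation > threshold else "stationary"
    return labels
```

With flows resembling FIG. 3 (trees 21 and 22 and zebra crossing 24 drifting together, car 23 moving differently), only object 23 would be labeled as moving.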


In step S23, the optical flow vectors 211, 221, 231 and 241 of the objects 21, 22, 23 and 24 in the at least two frames are adjusted based on the optical flow vectors 211, 221 and 241 of the stationary objects 21, 22 and 24. As shown in FIG. 4, the optical flow vector 231 of the object 23 becomes the first optical flow vector 200, while the optical flow vectors 211, 221 and 241 of the objects 21, 22 and 24 become the first optical flow vectors 210, thereby obtaining a set of the first optical flow vectors 200 and 210. The method proceeds to step S24.
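
The disclosure does not spell out how the vectors are adjusted. One plausible reading, offered here only as an assumption, is that the mean flow of the stationary objects is treated as camera-induced motion and subtracted from every object's vector. The function name adjust_flows is hypothetical.

```python
def adjust_flows(flows, stationary_ids):
    """Subtract the mean flow of the stationary objects from all flows.

    flows: dict mapping object id -> (u, v).
    stationary_ids: ids of objects judged stationary in the previous step.
    Assumption: the stationary objects' mean flow is the ego-motion component.
    """
    n = len(stationary_ids)
    if n == 0:
        return dict(flows)  # nothing to compensate against
    ego_u = sum(flows[i][0] for i in stationary_ids) / n
    ego_v = sum(flows[i][1] for i in stationary_ids) / n
    return {i: (u - ego_u, v - ego_v) for i, (u, v) in flows.items()}
```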


In step S24, a set of modified vectors is generated. In an embodiment, the set of modified vectors is generated according to at least one parameter. In an embodiment, the parameter is the latency information of the network, the direction of the image capture unit, the speed of the image capture unit, or a combination thereof. For example, assuming that the remotely controlled unmanned vehicle exhibits a linear motion in a very short period of time, with the latency information of the network, vector values indicating the forward or backward motions of the first optical flow vectors 200 and 210 as shown in FIG. 4 can be readily estimated, thereby forming a set of modified vectors 250 as shown in FIG. 5. Alternatively, using the direction of the image capture unit or the speed of the image capture unit, vector values indicating the direction in which the first optical flow vectors 200 and 210 shown in FIG. 4 should move can be estimated, thereby forming a set of modified vectors 250 as shown in FIG. 5. The method proceeds to step S25.
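
Under the linear-motion assumption, one simple way to derive modified vectors from the latency is to extrapolate each first optical flow vector by the ratio of the latency to the interval between the two frames. This is a sketch under that assumption only; the names modified_vectors, latency_ms and frame_interval_ms are hypothetical and do not appear in the disclosure.

```python
def modified_vectors(first_flow, latency_ms, frame_interval_ms):
    """Extrapolate each flow vector over the network latency.

    first_flow: list of (u, v) vectors measured across one frame interval.
    Assumption: motion is linear, so displacement scales with elapsed time.
    """
    if frame_interval_ms <= 0:
        raise ValueError("frame_interval_ms must be positive")
    scale = latency_ms / frame_interval_ms
    return [(u * scale, v * scale) for u, v in first_flow]
```

For instance, a latency of 100 ms with frames 50 ms apart doubles each vector, which is the extra displacement expected to accumulate while the frame is in transit.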


In step S25, as shown in FIGS. 4, 5 and 6, the set of modified vectors 250 and the first set of optical flow vectors 200 and 210 are combined to obtain a second set of optical flow vectors 200′ and 210′. This is purely a vector operation. The method proceeds to step S26.
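
Read as element-wise vector addition, which is an assumption about what "combined" means here, the step can be sketched as follows. The function name combine_flows is hypothetical.

```python
def combine_flows(first_flow, modified):
    """Add each modified vector to the corresponding first flow vector.

    Both arguments are lists of (u, v) tuples in the same object order.
    """
    if len(first_flow) != len(modified):
        raise ValueError("vector sets must have the same length")
    return [(u + du, v + dv) for (u, v), (du, dv) in zip(first_flow, modified)]
```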


In step S26, as shown in FIG. 7, the frame is shifted according to the second set of optical flow vectors 200′ and 210′. For example, the object 23 shown in FIG. 3 is shifted to the location of object 23′ shown in FIG. 7, thus generating a virtual image 30. In an embodiment, a frame that occurs later in time can be chosen as the frame being shifted, but the present disclosure is not limited as such.
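
The shifting step can be pictured on a toy frame. The sketch below assumes a frame stored as a nested list indexed [y][x], an object given as a set of (x, y) pixel coordinates, and a displacement rounded to whole pixels; vacated pixels are marked with a placeholder so that they can be restored later. The function name shift_object and the hole marker are hypothetical.

```python
def shift_object(frame, object_pixels, vector, hole=None):
    """Return a virtual image with one object moved by `vector`.

    frame: 2-D list of pixel values indexed [y][x].
    object_pixels: set of (x, y) coordinates belonging to the object.
    vector: (u, v) displacement, rounded to whole pixels (assumption).
    """
    rows, cols = len(frame), len(frame[0])
    dx, dy = round(vector[0]), round(vector[1])
    virtual = [row[:] for row in frame]
    # Vacate the old location first so uncovered pixels become holes.
    for x, y in object_pixels:
        virtual[y][x] = hole
    # Paint the object at its new location; pixels leaving the frame are dropped.
    for x, y in object_pixels:
        nx, ny = x + dx, y + dy
        if 0 <= nx < cols and 0 <= ny < rows:
            virtual[ny][nx] = frame[y][x]
    return virtual
```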


In an embodiment, assuming the set of frames in a video is X={X1, X2, . . . , Xm}, a set of virtual images to be generated is Y={Y1, Y2, . . . , Yn}. Assuming that the first set of optical flow vectors is Ed, the network latency information is represented as λ={direction, speed, timestamp}, wherein direction, speed are the moving direction and traveling speed of the image capture unit (that is, the moving direction and traveling speed of the remotely controlled unmanned vehicle), and timestamp is the timestamp of a frame of a video obtained by the remote end, then at least one difference between the estimated set of virtual images Y and the set of frames actually received X can be represented by an equation

$$L_p(X, Y) = \lVert (E_d(X) + \lambda) - Y \rVert_p \quad (p = 1 \text{ or } p = 2).$$
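
The norm in this expression can be evaluated numerically once the predicted quantity and Y are flattened to equal-length sequences of numbers. The sketch below only computes the p-norm of their difference; how E_d(X) and λ are merged into a single prediction is not specified here and is left to the caller. The function name lp_loss is hypothetical.

```python
def lp_loss(predicted, actual, p=2):
    """p-norm of (predicted - actual) for p = 1 or p = 2.

    predicted, actual: equal-length sequences of numbers.
    """
    if p not in (1, 2):
        raise ValueError("p must be 1 or 2")
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    diffs = [abs(a - b) for a, b in zip(predicted, actual)]
    if p == 1:
        return sum(diffs)
    return sum(d * d for d in diffs) ** 0.5
```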


In another example, after a virtual image is obtained, the following steps can be performed. In step S27, at least one patched area in the virtual image is obtained. In an embodiment, differences between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors, for example, the difference between the location of the object 23′ in the frame of FIG. 7 and the location of the object 23 in the frame of FIG. 3, are calculated to obtain a patched area 31 in the virtual image 30 shown in FIG. 7. The method proceeds to step S28.
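
In terms of pixel sets, the patched area can be thought of as the pixels the object occupied before the shift that it no longer covers afterwards. The sketch below assumes the object is given as a set of (x, y) coordinates and that the displacement is rounded to whole pixels; the function name patched_area is hypothetical.

```python
def patched_area(object_pixels, vector):
    """Pixels uncovered when the object moves by `vector`.

    object_pixels: set of (x, y) coordinates of the object before shifting.
    vector: (u, v) displacement, rounded to whole pixels (assumption).
    """
    dx, dy = round(vector[0]), round(vector[1])
    shifted = {(x + dx, y + dy) for x, y in object_pixels}
    # Old positions not re-covered by the shifted object need restoration.
    return set(object_pixels) - shifted
```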


In step S28, image restoration is performed on the patched area 31 by a second algorithm, for example, an exemplar-based image inpainting (EBI) method. FIG. 8 shows a virtual image 30′ after restoration, wherein the zebra crossing is restored in the patched area. In an embodiment, since the patched area is obtained by comparing the location of the object 23′ in the frame of FIG. 7 with the location of the object 23 in the frame of FIG. 3, more time and effort are saved than in the prior art, where a target block is manually selected for restoration.


Referring to FIG. 9, a system for generating a video frame 10 is further provided by the present disclosure, which includes an image capture unit 11 and a calculation unit 12. The calculation unit 12 includes a modified vector generating module 121, an optical flow vector generating module 122 and a virtual image generating module 123. Some of the technical contents of the system for generating a video frame 10 of the present disclosure are the same as those described with respect to the method for generating a video frame above, and therefore will not be repeated.


In an embodiment, the calculation unit 12 is an electronic device having a processor, e.g., a computer, a mobile phone, a tablet or a remote controller with a screen (for a remotely controlled unmanned vehicle). In another embodiment, each of the modules is software executed by a processor.


The image capture unit 11 is used for capturing a video, and can be provided on a remotely controlled unmanned vehicle. The calculation unit 12 is connected to the image capture unit 11 via a network 13 (e.g., 4G, Wi-Fi, WiMAX or the like), and is used for receiving the video captured by the image capture unit 11. In an embodiment, the video can first be subjected to real-time video data compression and then transmitted to the calculation unit 12 through the network 13 in a streaming manner. In another embodiment, the video, after being transmitted to the calculation unit 12, can be stored in a storage medium (e.g., a hard disk or a memory) of the calculation unit 12 for further processing.


The modified vector generating module 121 of the calculation unit 12 is configured for generating a set of modified vectors based on at least one parameter. In an embodiment, the parameter is the latency information of network 13, the direction of the image capture unit 11, the speed of the image capture unit 11, or a combination of the above.


The optical flow vector generating module 122 of the calculation unit 12 is configured for calculating a first set of optical flow vectors of at least two frames in the video by a first algorithm. In an embodiment, the first algorithm is an optical flow estimation method. In another embodiment, the at least two frames are contiguous or non-contiguous frames in time, and an object that appears in both of the at least two frames is used as a reference. After calculating a set of optical flow vectors of objects in the at least two frames in the video by the first algorithm, the optical flow vector generating module 122 is configured for determining whether the objects are stationary or moving, and modifying the set of optical flow vectors of the objects in the at least two frames based on a set of optical flow vectors corresponding to the stationary objects to obtain the first set of optical flow vectors.


After generating the set of modified vectors by the modified vector generating module 121, the optical flow vector generating module 122 is further configured for combining the set of modified vectors and the first set of optical flow vectors to generate a second set of optical flow vectors.


The virtual image generating module 123 of the calculation unit 12 is configured for shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image. The virtual image can be displayed on a screen of the calculation unit 12 for user viewing. The virtual image can also be transmitted to an external screen through the network of the calculation unit 12. In an embodiment, the calculation unit 12 is a cloud server, and, after the virtual image is calculated, the virtual image can be transmitted to a screen of a controller for controlling the unmanned vehicle.


The system for generating a video frame 10 according to the present disclosure further includes an image compensation module 124 configured for calculating differences between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors to obtain a patched area in the virtual image, and performing image compensation on the patched area by a second algorithm. In an embodiment, the second algorithm is an exemplar-based image inpainting (EBI) method. This part of the technical contents is as described before, and will not be repeated.
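
The division of labor among the modules 121 to 124 can be sketched as a small composition of callables. This is an architectural illustration only: the class name CalculationUnit, its field names, and the choice to shift the most recent frame are assumptions, and each callable stands in for whatever algorithm a given embodiment uses.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class CalculationUnit:
    compute_first_flow: Callable[[Sequence[Any]], Any]    # module 122
    generate_modified: Callable[[Any, dict], Any]         # module 121
    combine_flows: Callable[[Any, Any], Any]              # module 122
    generate_virtual_image: Callable[[Any, Any], Any]     # module 123
    compensate_image: Callable[[Any, Any], Any]           # module 124

    def generate(self, frames, params):
        """Run the modules in order and return the restored virtual image."""
        first = self.compute_first_flow(frames)
        modified = self.generate_modified(first, params)
        second = self.combine_flows(first, modified)
        latest = frames[-1]  # assumption: the later frame is the one shifted
        virtual = self.generate_virtual_image(latest, second)
        return self.compensate_image(virtual, latest)
```

Passing the modules in as callables keeps the sketch independent of any particular optical flow or inpainting routine.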


With the method and the system for generating a video frame according to the present disclosure, a first set of optical flow vectors of at least two frames in a video is obtained, a set of modified vectors is generated based on a parameter, and the first set of optical flow vectors and the set of modified vectors are combined to obtain a second set of optical flow vectors. The second set of optical flow vectors represents the estimated optical flows, which can be used for generating an estimated virtual image. The virtual image can reduce the deviation caused by the latency of the network. The present disclosure is suitable for a long-range (5 km or more) wireless network environment, and can effectively reduce the deviation caused by time delays of the network transmissions. The virtual image with low latency also improves user experience associated with operating the unmanned vehicle (e.g., a car or an aircraft).


The above embodiments are only used to illustrate the principles of the present disclosure, and should not be construed as to limit the present disclosure in any way. The above embodiments can be modified by those with ordinary skill in the art without departing from the scope of the present disclosure as defined in the following appended claims.

Claims
  • 1. A method for generating a video frame, comprising: obtaining at least two frames in a video captured by an image capture unit through a network; calculating a first set of optical flow vectors of the at least two frames by a first algorithm; generating a set of modified vectors according to at least one parameter; combining the set of modified vectors and the first set of optical flow vectors to obtain a second set of optical flow vectors; shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image; calculating a difference between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors to obtain a patched area in the virtual image, and performing image compensation on the patched area by a second algorithm, wherein the second algorithm is an exemplar-based image inpainting (EBI) method.
  • 2. The method of claim 1, wherein calculating the first set of optical flow vectors of the at least two frames by the first algorithm includes: calculating optical flow vectors of objects in the at least two frames by the first algorithm; determining whether the objects are stationary or moving; and modifying the optical flow vectors of the objects in the at least two frames based on optical flow vectors corresponding to the stationary objects to obtain the first set of optical flow vectors.
  • 3. The method of claim 2, wherein the at least two frames are contiguous frames in time.
  • 4. The method of claim 2, wherein the at least two frames are non-contiguous frames in time.
  • 5. The method of claim 2, wherein the at least two frames have the same object.
  • 6. The method of claim 1, wherein the first algorithm is an optical flow method.
  • 7. The method of claim 1, wherein the parameter is latency information of the network, a direction of the image capture unit, a speed of the image capture unit, or a combination thereof.
  • 8. A system for generating a video frame, comprising: an image capture unit configured for capturing a video; and a calculation unit connected with the image capture unit through a network and configured for receiving at least two frames in the video, the calculation unit including: a modified vector generating module configured for generating a set of modified vectors based on at least one parameter; an optical flow vector generating module configured for calculating a first set of optical flow vectors of the at least two frames by a first algorithm, and combining the set of modified vectors and the first set of optical flow vectors to generate a second set of optical flow vectors; a virtual image generating module configured for shifting one of the at least two frames based on the second set of optical flow vectors to generate a virtual image; and an image compensation module configured for calculating a difference between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors to obtain a patched area in the virtual image, and performing image compensation on the patched area by a second algorithm, wherein the second algorithm is an exemplar-based image inpainting (EBI) method.
  • 9. The system of claim 8, wherein the optical flow vector generating module is configured for calculating optical flow vectors of objects in the at least two frames by the first algorithm, determining whether the objects are stationary or moving, and modifying the optical flow vectors of the objects in the at least two frames based on optical flow vectors corresponding to the stationary objects to obtain the first set of optical flow vectors.
  • 10. The system of claim 8, wherein the at least two frames are contiguous frames in time.
  • 11. The system of claim 8, wherein the at least two frames are non-contiguous frames in time.
  • 12. The system of claim 8, wherein the at least two frames have the same object.
  • 13. The system of claim 8, wherein the first algorithm is an optical flow method.
  • 14. The system of claim 8, wherein the parameter is latency information of the network, a direction of the image capture unit, a speed of the image capture unit, or a combination thereof.
  • 15. The system of claim 8, wherein the calculation unit is a computer, a mobile phone, a tablet or a remote controller with a screen.
Priority Claims (1)
Number Date Country Kind
105136858 A Nov 2016 TW national
Related Publications (1)
Number Date Country
20180139362 A1 May 2018 US