The present disclosure is based on, and claims priority from, Taiwan Application Number 105136858, filed Nov. 11, 2016, the disclosure of which is hereby incorporated by reference herein in its entirety.
The present disclosure relates to methods and systems for generating a video frame, and, more particularly, to a method and a system for generating a video frame that address the deviation caused by the latency of the network.
Unmanned vehicles (e.g., cars or aircraft) are usually remotely controlled by viewing images from a first-person perspective. The images captured by unmanned vehicles are usually transmitted to a remote display via a mobile broadband network using, for example, 2.4 GHz automatic frequency hopping technology. However, the video stream data of a typical image is bandwidth-demanding, and the mobile broadband network often has latency (e.g., on the order of tens of milliseconds, or even 1 to 2 seconds), so the images reach the remote display with a delay rather than in real time. If a user controls an unmanned vehicle while looking at the delayed video images, the unmanned vehicle may deviate from its intended path or collide with an obstacle.
Accordingly, there is an imperative need for a method and a system for generating a video frame that address the aforementioned issues in the prior art.
The present disclosure provides a method for generating a video frame, which may include: obtaining at least two frames in a video captured by an image capture unit through a network; calculating a first set of optical flow vectors of the at least two frames by a first algorithm; generating a set of modified vectors according to at least one parameter; combining the set of modified vectors and the first set of optical flow vectors to obtain a second set of optical flow vectors; and shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image.
The present disclosure provides a system for generating a video frame, which may include: an image capture unit for capturing a video; and a calculation unit connected with the image capture unit through a network and configured for receiving at least two frames in the video, wherein the calculation unit includes: a modified vector generating module configured for generating a set of modified vectors based on at least one parameter; an optical flow vector generating module configured for calculating a first set of optical flow vectors of the at least two frames by a first algorithm, and combining the set of modified vectors and the first set of optical flow vectors to generate a second set of optical flow vectors; and a virtual image generating module configured for shifting one of the at least two frames based on the second set of optical flow vectors to generate a virtual image.
In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In other instances, well-known structures and devices are schematically shown in order to simplify the drawing.
Referring to
Upon obtaining the at least two frames, an optical flow estimation technique is used for estimating a motion vector of each object in the frames. In an embodiment, the Lucas-Kanade optical flow method can be used for the calculation in order to obtain a first set of optical flow vectors. The method proceeds to step S12.
In step S12, a set of modified vectors is generated. In an embodiment, a set of modified vectors is generated according to at least one parameter. In an embodiment, the parameter is the latency information of the network, a direction of the image capture unit, a speed of the image capture unit, or a combination thereof. The method proceeds to step S13.
In step S13, the set of modified vectors and the first set of optical flow vectors are combined to obtain a second set of optical flow vectors. The method proceeds to step S14.
In step S14, one of the at least two frames is shifted according to the second set of optical flow vectors to generate a virtual image. In an embodiment, a frame that occurs later in time can be chosen as the frame being shifted, but the present disclosure is not limited as such.
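Steps S12 to S14 can be illustrated with a minimal sketch. It assumes, purely for illustration, a dense per-pixel flow field stored as an array of (u, v) displacements, and it assumes that the modified vectors are obtained by linearly extrapolating the first set of optical flow vectors over the network latency. The function names, the `latency_s` and `frame_interval_s` parameters, and the nearest-pixel forward warp are hypothetical choices and are not taken from the disclosure.

```python
import numpy as np

def modified_vectors(first_flow, latency_s, frame_interval_s):
    """Assumed rule: extrapolate each flow vector linearly over the latency."""
    return first_flow * (latency_s / frame_interval_s)

def combine(first_flow, modified):
    """Second set of optical flow vectors = first set + modified vectors."""
    return first_flow + modified

def shift_frame(frame, flow):
    """Forward-warp `frame`: move every pixel by its (u, v) vector.

    Returns the virtual image and a boolean mask of the pixels that
    received a value. If two pixels land on the same target, one of
    them overwrites the other.
    """
    h, w = frame.shape[:2]
    virtual = np.zeros_like(frame)
    filled = np.zeros((h, w), dtype=bool)
    ys, xs = np.mgrid[0:h, 0:w]
    new_x = np.rint(xs + flow[..., 0]).astype(int)
    new_y = np.rint(ys + flow[..., 1]).astype(int)
    # Keep only pixels whose target position is still inside the frame.
    inside = (new_x >= 0) & (new_x < w) & (new_y >= 0) & (new_y < h)
    virtual[new_y[inside], new_x[inside]] = frame[ys[inside], xs[inside]]
    filled[new_y[inside], new_x[inside]] = True
    return virtual, filled
```

Pixels for which the mask is false received no value from the shifted frame; in this sketch they simply stay at zero.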
Referring to
In step S21, at least two frames of a video captured by an image capture unit are obtained through a network, and a set of optical flow vectors of the at least two frames is calculated by a first algorithm. As shown in
In an embodiment, the first algorithm is optical flow estimation. Assuming that the optical flow is constant within a neighborhood of a pixel, estimating the pixel I(x, y) in a frame I that corresponds to a pixel H(x, y) in a frame H can be regarded as a search over all pixels in the neighborhood, and the fundamental optical flow equation can be given as follows:
wherein u and v are the displacement vectors of the pixel. As such, the optical flow vectors 211, 221, 231 and 241 of the objects 21, 22, 23 and 24 in the frame can be calculated to obtain a set of optical flow vectors. In another embodiment, the Lucas-Kanade optical flow method may also be used, and, in addition to pixels, the calculation can be made with regard to blocks, but the present disclosure is not limited as such. The method proceeds to step S22.
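The neighborhood search described above can be illustrated with a small block-matching sketch. It assumes, as a commonly used form rather than a quotation from the disclosure, that the quantity being minimized is the sum of squared differences E(u, v) = Σ (I(x+u, y+v) − H(x, y))² taken over a block around the pixel. The function name `block_flow` and the `half` and `search` parameters are hypothetical, and this is plain block matching, not the Lucas-Kanade method.

```python
import numpy as np

def block_flow(H, I, y, x, half=2, search=3):
    """Return the (u, v) that minimizes the sum of squared differences
    between the block of H centred on (y, x) and the displaced block of I.

    The caller is assumed to pick (y, x) far enough from the border of H
    that the reference block lies fully inside the frame.
    """
    size = 2 * half + 1
    ref = H[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_err, best_uv = np.inf, (0, 0)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            y0, x0 = y + v - half, x + u - half
            # Skip candidate blocks that fall outside frame I.
            if y0 < 0 or x0 < 0 or y0 + size > I.shape[0] or x0 + size > I.shape[1]:
                continue
            cand = I[y0:y0 + size, x0:x0 + size].astype(float)
            err = np.sum((cand - ref) ** 2)
            if err < best_err:
                best_err, best_uv = err, (u, v)
    return best_uv
```

Running the same search once per object or block would give one optical flow vector per object, in the spirit of the vectors 211, 221, 231 and 241.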
In step S22, it is determined whether an object is stationary or moving. Since the image capture unit is installed on a moving remotely controlled unmanned vehicle (e.g., an unmanned aerial vehicle or an unmanned car), the video images captured are not stationary, but the objects in the frames of the video will have relative displacements with respect to the movement of the remotely controlled unmanned vehicle. Thus, it is determined whether the objects 21, 22, 23 and 24 shown in
In step S23, the optical flow vectors 211, 221, 231 and 241 of the objects 21, 22, 23 and 24 in the at least two frames are adjusted based on the optical flow vectors 211, 221 and 241 of the stationary objects 21, 22 and 24. As shown in
In step S24, a set of modified vectors is generated. In an embodiment, the set of modified vectors is generated according to at least one parameter. In an embodiment, the parameter is the latency information of the network, the direction of the image capture unit, the speed of the image capture unit, or a combination thereof. For example, assuming that the remotely controlled unmanned vehicle exhibits a linear motion in a very short period of time, with the latency information of the network, vector values indicating the forward or backward motions of the first optical vectors 200 and 210 as shown in
As shown in step S25, as shown in
In step S26, as shown in
In an embodiment, assume that the set of frames in a video is X={X1, X2, . . . , Xm} and that the set of virtual images to be generated is Y={Y1, Y2, . . . , Yn}. Assume further that the first set of optical flow vectors is Ed and that the network latency information is represented as λ={direction, speed, timestamp}, wherein direction and speed are the moving direction and traveling speed of the image capture unit (that is, the moving direction and traveling speed of the remotely controlled unmanned vehicle), and timestamp is the timestamp of a frame of the video obtained by the remote end. Then at least one difference between the estimated set of virtual images Y and the set of frames actually received X can be represented by the equation
$$L_p(X, Y) = \left\lVert \left(E_d(X) + \lambda\right) - Y \right\rVert_p \quad (p = 1 \text{ or } p = 2).$$
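As a numerical illustration of this difference measure, the sketch below evaluates the p-norm between two arrays of pixel values. It assumes that the term Ed(X)+λ has already been turned into a predicted image, and that this image and the virtual image Y are arrays of equal shape; the function and argument names are hypothetical.

```python
import numpy as np

def lp_difference(predicted, virtual, p=2):
    """p-norm (p = 1 or p = 2) of the pixel-wise difference of two images."""
    if p not in (1, 2):
        raise ValueError("p must be 1 or 2")
    diff = predicted.astype(float) - virtual.astype(float)
    # Flatten so that the vector p-norm is used, not a matrix norm.
    return float(np.linalg.norm(diff.ravel(), ord=p))
```

With p=1 the measure sums the absolute pixel differences; with p=2 it is the Euclidean distance, which penalizes large individual deviations more strongly.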
In another example, after a virtual image is obtained, the following steps can be performed. In step S27, at least one patched area in the virtual image is obtained. In an embodiment, differences between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors, for example, the difference between the location of the object 23′ in the frame of
In step S28, image restoration is performed on the patched area 31 by a second algorithm, for example, an exemplar-based image inpainting (EBI) method.
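The restoration step can be illustrated with a deliberately simple stand-in. The sketch below is not exemplar-based image inpainting: it fills the patched area layer by layer with the mean of the already-known 4-neighbors, which is enough to show where a second algorithm would plug in. It assumes the patched area is given as a boolean mask; the function name and the mask convention are hypothetical.

```python
import numpy as np

def fill_patched_area(image, patched):
    """Fill pixels flagged True in `patched` from known 4-neighbours.

    A simple neighbour-averaging fill used as a stand-in for EBI.
    Works on grayscale (H, W) and colour (H, W, C) images.
    """
    out = image.astype(float).copy()
    known = ~patched
    h, w = known.shape
    while not known.all():
        newly = []
        for y, x in zip(*np.nonzero(~known)):
            vals = [out[ny, nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and known[ny, nx]]
            if vals:
                newly.append((y, x, np.mean(vals, axis=0)))
        if not newly:
            break  # no known pixel to copy from
        # Commit the whole layer at once so the result does not depend on scan order.
        for y, x, value in newly:
            out[y, x] = value
            known[y, x] = True
    return out
```

An exemplar-based method would instead copy whole patches from the known region, which preserves texture better than this averaging fill.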
Referring to
In an embodiment, the calculation unit 12 is an electronic device having a processor, e.g., a computer, a mobile phone, a tablet or a remote controller with a screen (for a remotely controlled unmanned vehicle). In another embodiment, each module is software executed by a processor.
The image capture unit 11 is used for capturing a video, and can be provided on a remotely controlled unmanned vehicle. The calculation unit 12 is connected to the image capture unit 11 via a network 13 (e.g., 4G, Wi-Fi, WiMAX or the like), and is used for receiving the video captured by the image capture unit 11. In an embodiment, the video can first be subjected to real-time video data compression and then transmitted to the calculation unit 12 through the network 13 in a streaming manner. In another embodiment, the video, after being transmitted to the calculation unit 12, can be stored in a storage medium (e.g., a hard disk or a memory) of the calculation unit 12 for further processing.
The modified vector generating module 121 of the calculation unit 12 is configured for generating a set of modified vectors based on at least one parameter. In an embodiment, the parameter is the latency information of network 13, the direction of the image capture unit 11, the speed of the image capture unit 11, or a combination of the above.
The optical flow vector generating module 122 of the calculation unit 12 is configured for calculating a first set of optical flow vectors of at least two frames in the video by a first algorithm. In an embodiment, the first algorithm is an optical flow estimation method. In another embodiment, the at least two frames are contiguous or non-contiguous frames in time, and an object that appears in both of the at least two frames is used as a reference. After calculating a set of optical flow vectors of objects in the at least two frames in the video by the first algorithm, the optical flow vector generating module 122 is configured for determining whether the objects are stationary or moving, and modifying the set of optical flow vectors of the objects in the at least two frames based on a set of optical flow vectors corresponding to the stationary objects to obtain the first set of optical flow vectors.
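The stationary/moving determination and the subsequent modification can be read in more than one way; the sketch below shows one possible reading under stated assumptions. It takes one flow vector per object, treats the component-wise median as the background motion caused by the moving image capture unit, flags objects within a tolerance of it as stationary, and replaces the vectors of the stationary objects with their common mean. The median rule, the `tol` threshold and the function name are hypothetical and are not specified in the disclosure.

```python
import numpy as np

def split_and_adjust(vectors, tol=1.0):
    """Classify per-object flow vectors and unify the stationary ones.

    vectors: array-like of shape (N, 2), one (u, v) vector per object.
    Returns (stationary_mask, adjusted_vectors).
    """
    vectors = np.asarray(vectors, dtype=float)
    # Assumed background motion: component-wise median over all objects.
    background = np.median(vectors, axis=0)
    stationary = np.linalg.norm(vectors - background, axis=1) <= tol
    adjusted = vectors.copy()
    if stationary.any():
        # Stationary objects share one vector; moving objects keep theirs.
        adjusted[stationary] = vectors[stationary].mean(axis=0)
    return stationary, adjusted
```

For example, with four objects ordered as 21, 22, 23 and 24 and vectors (2, 0), (2.2, 0), (6, 3) and (1.8, 0), only the third object is flagged as moving, and the other three end up with the common vector (2, 0).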
After generating the set of modified vectors by the modified vector generating module 121, the optical flow vector generating module 122 is further configured for combining the set of modified vectors and the first set of optical flow vectors to generate a second set of optical flow vectors.
The virtual image generating module 123 of the calculation unit 12 is configured for shifting one of the at least two frames according to the second set of optical flow vectors to generate a virtual image. The virtual image can be displayed on a screen of the calculation unit 12 for user viewing. The virtual image can also be transmitted to an external screen through the network of the calculation unit 12. In an embodiment, the calculation unit 12 is a cloud server, and, after the virtual image is calculated, the virtual image can be transmitted to a screen of a controller for controlling the unmanned vehicle.
The system for generating a video frame 10 according to the present disclosure further includes an image compensation module 124 configured for calculating differences between the virtual image and the one of the at least two frames being shifted based on the second set of optical flow vectors to obtain a patched area in the virtual image, and performing image compensation on the patched area by a second algorithm. In an embodiment, the second algorithm is an exemplar-based image inpainting (EBI) method. This part of the technical contents is as described before, and will not be repeated.
With the method and the system for generating a video frame according to the present disclosure, a first set of optical flow vectors of at least two frames in a video is obtained, a set of modified vectors is generated based on a parameter, and the first set of optical flow vectors and the set of modified vectors are combined to obtain a second set of optical flow vectors. The second set of optical flow vectors represents the estimated optical flow, which can be used for generating an estimated virtual image. The virtual image can reduce the deviation caused by the latency of the network. The present disclosure is suitable for long-range (5 km or more) wireless network environments, and can effectively eliminate the deviation caused by time delays in network transmissions. The low-latency virtual image also improves the user experience of operating the unmanned vehicle (e.g., a car or an aircraft).
The above embodiments are only used to illustrate the principles of the present disclosure, and should not be construed to limit the present disclosure in any way. The above embodiments can be modified by those with ordinary skill in the art without departing from the scope of the present disclosure as defined in the following appended claims.
Number | Date | Country | Kind |
---|---|---|---|
105136858 A | Nov 2016 | TW | national |
Number | Name | Date | Kind |
---|---|---|---|
7616782 | Badawy | Nov 2009 | B2 |
8610608 | Ratnakar Aravind et al. | Dec 2013 | B2 |
8774950 | Kelly et al. | Jul 2014 | B2 |
8928813 | Liu | Jan 2015 | B2 |
9350969 | Cohen | May 2016 | B2 |
9380286 | Cohen | Jun 2016 | B2 |
9602814 | Bhagavathy | Mar 2017 | B2 |
20040057446 | Varsa et al. | Mar 2004 | A1 |
20050275727 | Lai | Dec 2005 | A1 |
20060125835 | Sha et al. | Jun 2006 | A1 |
20060257042 | Ofek | Nov 2006 | A1 |
20090300692 | Mavlankar | Dec 2009 | A1 |
20110013028 | Zhou | Jan 2011 | A1 |
20110273582 | Gayko et al. | Nov 2011 | A1 |
20120105728 | Liu | May 2012 | A1 |
20130156297 | Shotton | Jun 2013 | A1 |
20140072228 | Rubinstein | Mar 2014 | A1 |
20140072229 | Wadhwa | Mar 2014 | A1 |
20140348249 | Sullivan | Nov 2014 | A1 |
20150281597 | Salvador Marcos | Oct 2015 | A1 |
20160068114 | Liao | Mar 2016 | A1 |
20160134673 | MacInnis | May 2016 | A1 |
Number | Date | Country |
---|---|---|
102123234 | Jul 2011 | CN |
105405319 | Mar 2016 | CN |
200824433 | Jun 2008 | TW |
I444593 | Jul 2014 | TW |
201523516 | Jun 2015 | TW |
Entry |
---|
Criminisi et al., “Region Filling and Object Removal by Exemplar-Based Image Inpainting,” IEEE Transactions on Image Processing, vol. 13, No. 9, Sep. 2004 (Year: 2004). |
Roberts, Richard, et al., “Learning General Optical Flow Subspaces for Egomotion Estimation and Detection of Motion Anomalies,” Computer Vision and Pattern Recognition, 2009, pp. 57-64. |
Pintea, S. L., “Dejavu: Motion Prediction in Static Images,” European Conference on Computer Vision, 2014, pp. 172-187. |
Rosello, Pol, “Predicting Future Optical Flow from Static Video Frames,” Technical Report, Stanford University, 2016. pp. 1-16. |
Zhang, Qing and Lin, Jiagun, “Exemplar-based Image Inpainting Using Color Distribution Analysis,” J. Inf. Sci. Eng., 2012, pp. 641-654. |
Criminisi, Antonio, et al., “Region filling and object removal by exemplar-based inpainting,” IEEE Transactions on Image Processing, 2004, pp. 1200-1212. |
Wang, Lu, et al., “The adaptive compensation algorithm for small UAV image stabilization,” IEEE International Geoscience and Remote Sensing Symposium, 2012, 4391-4394. |
Walker, Jacob, et al., “Dense Optical Flow Prediction from a Static Image,” IEEE International Conference on Computer Vision, 2015, pp. 2443-2451. |
Mathieu, Matthew, et al., “Deep multi-scale video prediction beyond mean square error,” Conference Paper, ICLR 2016, 2015, pp. 1-14. |
Shaou-Gang Miaou, “The technical level of a person having ordinary skill in the art is determined by reference to ‘Digital Image Processing—Using Matlab nimbly, second education’,” Dec. 2015. |
Number | Date | Country | |
---|---|---|---|
20180139362 A1 | May 2018 | US |