This disclosure is related to the subject matter of commonly-assigned U.S. patent application Ser. No. 14/292,547, entitled, “Systems And Methods For Exposure Metering for Timelapse Video,” which was filed on May 30, 2014 (“the '547 application”).
This disclosure relates generally to the field of video capture, and more particularly, to acquiring timelapse video.
The advent of portable integrated computing devices has caused a wide proliferation of cameras and video devices. These integrated computing devices commonly take the form of smartphones or tablets and typically include general purpose computers, cameras, sophisticated user interfaces including touch sensitive screens, and wireless communications abilities through WiFi, LTE, HSDPA and other cell-based or wireless technologies. The wide proliferation of these integrated devices provides opportunities to use the devices' capabilities to perform tasks that would otherwise require dedicated hardware and software. For example, as noted above, integrated devices such as smartphones and tablets typically have one or two embedded cameras. These cameras generally amount to lens/camera hardware modules that may be controlled through the general purpose computer using firmware and/or software (e.g., “Apps”) and a user interface including the touchscreen, fixed buttons, and touchless control such as voice control.
The integration of cameras into communication devices such as smartphones and tablets has enabled people to share images and videos in ways never before possible. It is now very popular to acquire and immediately share photos with other people by either sending the photos via text message, SMS, or email, or by uploading the photos to an internet-based website, such as a social networking site or a photosharing site.
Immediately sharing video is likewise possible, as described above for sharing of photos. However, bandwidth limitations and upload times significantly constrain the length of video that can easily be shared. In many instances, a short video clip that captures the essence of the entire action recorded may be desirable. The duration of the video clip may depend on the subject matter of the video clip. For example, a several hour car ride or an evening at a party might be reduced to a timelapse video clip lasting only a minute or two. Other action, such as a sunset or the movement of clouds, might be better expressed in a clip of twenty to forty seconds. While a timelapse video of shortened duration may be desired, a user often may wish to acquire video (termed herein “source video”) over a greater length of time, for example, over minutes, hours, or even days. A user may desire to reduce the length of the source video to provide a shortened, timelapse video clip. The user may wish to share the video, as mentioned above, or, may simply desire shortened, timelapse playback.
Disclosed herein is an adaptive method whereby source video acquired over any given length of time is automatically processed to provide a short, timelapse video clip. To implement the method, a user need not know ahead of time the duration for which the source video will be acquired. Regardless of the acquisition time, the resulting video is automatically edited to provide a timelapse clip of a predefined length or of a length within a predefined range. The method involves saving images during recording but periodically deleting some of the images as filming continues. Moreover, the method involves decreasing the rate at which images are captured as the filming time increases. The method is adaptive, in that it adapts the effective acquisition frame rate as filming continues. Once acquisition has stopped, the saved images are encoded into a timelapse video clip. A further embodiment is an apparatus programmed to implement the adaptive methods described herein.
Systems, methods and program storage devices are disclosed, which provide instructions to cause one or more processing units to record timelapse video. The techniques disclosed herein are applicable to any number of electronic devices with displays, such as digital cameras, digital video cameras, mobile phones, personal data assistants (PDAs), portable music players, monitors, and, of course, desktop, laptop, and tablet computer displays.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the inventive concept. As part of this description, some of this disclosure's drawings represent structures and devices in block diagram form in order to avoid obscuring the invention. In the interest of clarity, not all features of an actual implementation are described in this specification. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter. Reference in this disclosure to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.
It will be appreciated that, in the development of any actual implementation (as in any development project), numerous decisions must be made to achieve the developers' specific goals (e.g., compliance with system- and business-related constraints), and that these goals may vary from one implementation to another. It will also be appreciated that such development efforts might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the design of an implementation of image processing systems having the benefit of this disclosure.
Timelapse reduces the playback time of a video compared to the length of time it took to acquire the video. The examples discussed herein focus on providing a timelapse clip of 20 to 40 seconds. But it will be appreciated that any duration may be chosen. A method of reducing a 40 second clip of source video to 20 seconds of timelapse video would be to: (1) acquire source video for 40 seconds at a frame rate of 30 fps, yielding 1200 images total; (2) discard half of the images (for example, discard every other image), yielding 600 images total; and (3) play the remaining 600 images back at 30 fps, yielding 20 seconds of timelapsed video. Because half of the images are discarded, the acquisition frame rate is “effectively” 15 fps, even though the video was actually acquired at 30 fps. Thus, the term “effective acquisition frame rate” is used herein to refer to the number of images remaining divided by the true acquisition time. When played back at 30 fps, the action in the video will appear to move at twice the speed of the “true-to-life” action. To create 20 seconds of timelapse video from a longer segment of source video, more images would have to be discarded. For example, 80 seconds of source video recorded at 30 fps would yield 2400 images. Discarding 1800 of those images (i.e., keeping every fourth image) would leave 600 images, again providing 20 seconds of timelapse video for playback at 30 fps.
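By way of illustration only, the arithmetic of the preceding example can be sketched in Python. The function names `decimate` and `effective_frame_rate` are hypothetical and are not part of this disclosure; integers stand in for captured images.

```python
PLAYBACK_FPS = 30  # playback frame rate used throughout the examples


def decimate(images, keep_every):
    """Keep every `keep_every`-th image, starting with the first."""
    return images[::keep_every]


def effective_frame_rate(num_remaining, acquisition_seconds):
    """Number of images remaining divided by the true acquisition time."""
    return num_remaining / acquisition_seconds


# 40 seconds of source video at 30 fps, keeping every other image.
kept_40s = decimate(list(range(40 * 30)), keep_every=2)
print(len(kept_40s))                            # 600 images remain
print(effective_frame_rate(len(kept_40s), 40))  # 15.0 fps
print(len(kept_40s) / PLAYBACK_FPS)             # 20.0 seconds of playback

# 80 seconds of source video at 30 fps, keeping every fourth image.
kept_80s = decimate(list(range(80 * 30)), keep_every=4)
print(len(kept_80s))                            # 600 images remain
print(effective_frame_rate(len(kept_80s), 80))  # 7.5 fps
```

In both cases 600 images remain, so the playback time is the same even though the effective acquisition frame rate differs.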
As the length of time source video is acquired increases, the number of images that are discarded to yield the timelapse video increases also, and quickly far exceeds the number of images that are actually used. The acquisition and storage of those unused images consumes processing and storage resources that could otherwise be used for other operations.
An alternative to acquiring and then discarding the unused images would be to not acquire them in the first place. For example, if the user acquired 40 seconds of source video at an acquisition frame rate of 15 fps (the same as the “effective frame rate” in the above example) instead of 30 fps, then they would collect a total of 600 images. Playing back those 600 images at 30 fps would yield 20 seconds of timelapse video. Likewise, the user could collect 80 seconds of source video at a rate of 7.5 fps to yield 600 images that could be played back at 30 fps to provide 20 seconds of timelapse video.
The problem with the alternative method is that the user must know, before they begin recording, how long they will be acquiring the source video in order to know what frame rate to use for the recording. For example, if the user acquires source video at a frame rate of 7.5 fps (e.g., they expect to acquire for 80 seconds) but only acquires source video for 40 seconds, then they will end up with only 300 images of video, providing only 10 seconds of timelapse video.
In many cases, when the user begins acquiring video, they may not know how long they will be filming. For example, if they are filming a sunset, the user may not know if they will wish to film for fifteen minutes or thirty minutes. The user, therefore, does not know ahead of time the factor by which to reduce the acquisition frame rate.
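The dependence of the fixed-rate alternative on a correct guess of the recording duration can be illustrated with a short Python sketch. The function names are hypothetical, and the 20 second target and 30 fps playback rate are taken from the examples above.

```python
PLAYBACK_FPS = 30


def required_acquisition_fps(recording_seconds, target_playback_seconds,
                             playback_fps=PLAYBACK_FPS):
    """Fixed acquisition rate that yields the target clip length, provided
    the recording really lasts `recording_seconds`."""
    return target_playback_seconds * playback_fps / recording_seconds


def resulting_playback_seconds(acquisition_fps, recording_seconds,
                               playback_fps=PLAYBACK_FPS):
    """Clip length obtained when every acquired image is played back."""
    return acquisition_fps * recording_seconds / playback_fps


# The user expects to record for 80 seconds and wants a 20 second clip.
planned_fps = required_acquisition_fps(80, 20)
print(planned_fps)  # 7.5

# The clip shrinks in proportion if recording stops early.
for actual_seconds in (80, 40, 20):
    print(actual_seconds, resulting_playback_seconds(planned_fps, actual_seconds))
```

With the rate fixed at 7.5 fps, stopping at 80, 40, or 20 seconds gives 20, 10, or 5 seconds of playback, respectively.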
Herein is described an adaptive algorithm for acquiring and processing timelapse video. An embodiment of the adaptive algorithm is illustrated as a flow chart in
Recording proceeds at the first frame rate until a critical number of images of video are acquired. The critical number of images is determined by the desired playback time and playback rate. According to the illustrated embodiment, the playback time is actually a range of times, t1 to t2. The reason a range, rather than a specific time, is specified will become apparent from the following explanation. For the purposes of this discussion, the playback rate will be assumed to be 30 fps and the playback time will be from t1=20 to t2=40 seconds. In other words, regardless of the length of time source video is recorded, playback will be from 20 to 40 seconds at 30 fps. According to the operation 100, the user does not select the playback time and frame rate. Instead, the user simply selects to record video in timelapse mode and the playback time and frame rate are pre-programmed into the device. According to other embodiments, the user may be able to select the playback time and frame rate. In either case, the playback time and frame rate determine the critical number of images, as follows: the critical number of images is the number of images that would provide the longest desired playback time at the playback frame rate. For example, if the longest desired playback time (as pre-programmed into the recording device or as chosen by the user) is 40 seconds and the playback frame rate is 30 fps, then the critical number of images would be 1200 images.
Once the critical number of images has been reached, half of the stored images are discarded 103 and the frame rate R is decreased to R/2. For example, if the initial acquisition frame rate is 30 fps and the critical number of images is 1200 images, then the critical number is reached after 40 seconds of recording. Discarding half of the images leaves 600 images.
Generally, discarding half of the images is accomplished by discarding every other image. It will be apparent that discarding every other image doubles the capture time interval ΔT between each of the images, providing a series of images that effectively correspond to an image capture rate of 15 fps (R/2). Thus, the remaining 600 images have an effective acquisition frame rate of 15 fps.
Moreover, recording continues at R/2 (i.e., 15 fps in this example). If the process were stopped at that point, the 600 images remaining would provide 20 seconds of timelapsed video for playback at 30 fps. Recall playback will always be at 30 fps, regardless of the frame rate R used for recording the source video, in this example. A playback frame rate of 30 fps has been found to provide a pleasing playback experience. However, it will be appreciated that another frame rate may be chosen.
If recording continues and the critical number of images (i.e., 1200) is reached again, then, again, half of the images will be discarded 103 and the recording frame rate will again be reduced by half 104. Operation 100 can be executed for as long as the user desires. The user can stop recording at any time, which causes operation 100 to cease. Once the critical number of images has been reached for the first time, whenever the user quits recording there will be between 600 and 1200 images stored, providing between 20 and 40 seconds of timelapse playback at 30 fps. The acquisition frame rate adaptively decreases as the recording time increases. Moreover, the effective acquisition frame rate of the remaining images decreases as recording time increases.
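The capture, discard, and rate-halving steps described above can be simulated with the following Python sketch. It is one possible reading of the described operation under stated assumptions, not a definitive implementation: the class name `AdaptiveTimelapse` and its methods are hypothetical, capture times (counted in ticks of the initial frame period) stand in for image data, and the discard step keeps the first image of each pair.

```python
class AdaptiveTimelapse:
    """Simulates capture with periodic discarding and rate halving."""

    def __init__(self, initial_fps=30, playback_fps=30, max_playback_s=40):
        self.initial_fps = initial_fps
        self.playback_fps = playback_fps
        # Critical number: longest desired playback time at the playback rate.
        self.critical = max_playback_s * playback_fps
        self.interval = 1   # ticks between captures; one tick = 1/initial_fps s
        self.next_tick = 0  # tick at which the next image will be captured
        self.ticks = []     # capture ticks of the images currently stored

    @property
    def fps(self):
        """Current acquisition frame rate."""
        return self.initial_fps / self.interval

    def capture(self):
        """Store one image; on reaching the critical number, discard every
        other stored image and halve the acquisition rate."""
        self.ticks.append(self.next_tick)
        self.next_tick += self.interval
        if len(self.ticks) >= self.critical:
            self.ticks = self.ticks[::2]
            self.interval *= 2

    def record_until(self, seconds):
        """Capture images until `seconds` of recording time have elapsed."""
        end_tick = int(seconds * self.initial_fps)
        while self.next_tick < end_tick:
            self.capture()
        return self

    def playback_seconds(self):
        """Length of the clip if the stored images are played back now."""
        return len(self.ticks) / self.playback_fps


for seconds in (40, 60, 3600):
    rec = AdaptiveTimelapse().record_until(seconds)
    print(seconds, len(rec.ticks), rec.fps, rec.playback_seconds())
```

In this sketch, stopping after 40 seconds leaves 600 images and a current rate of 15 fps, and stopping after 60 seconds leaves 900 images, or 30 seconds of playback. Because the discard step always removes every other image, the stored images remain evenly spaced in capture time after each halving.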
Once operation 100 has stopped, the stored images are encoded into a movie clip according to a video encoding protocol, such as MPEG. It will be appreciated that other encoding protocols can be used, as are known in the art. The video is encoded at a predetermined frame rate (i.e., 30 fps in the illustration). The total playback time is determined by the number of stored images at the time recording was stopped. In the example, there are between 600 and 1200 stored images at any given time, giving between 20 and 40 seconds of timelapse playback. According to an alternative embodiment, the total playback time can be predetermined and the playback speed can be adjusted based on the number of images in memory.
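For the alternative embodiment in which the total playback time is predetermined, the playback rate follows directly from the number of stored images. The Python sketch below is illustrative only; the function name is hypothetical and the 30 second target is an assumed value that does not come from this disclosure.

```python
def playback_fps_for_fixed_duration(num_images, target_seconds):
    """Playback rate at which `num_images` images last `target_seconds`."""
    return num_images / target_seconds


# Assumed target of 30 seconds, with between 600 and 1200 stored images.
for stored in (600, 900, 1200):
    print(stored, playback_fps_for_fixed_duration(stored, 30))
```

Under this assumption the playback rate would range from 20 fps to 40 fps, depending on when recording stopped.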
It will be apparent that many modifications of operation 100 are possible. For example, it may be desirable to begin with a lower frame rate, for example, 2 fps. If an initial frame rate of 2 fps is used instead of 30 fps, and if the critical number of images is 1200, then the critical number will be reached after recording for 10 minutes.
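The recording time at which each discard occurs can be estimated from the initial frame rate and the critical number of images. The Python sketch below assumes that exactly half of the images are kept at each discard, so that the k-th discard occurs at (critical number / initial rate) × 2^(k−1) seconds; the function name `halving_times` is hypothetical.

```python
def halving_times(initial_fps, critical_images, count):
    """Recording times, in seconds, of the first `count` discard steps.

    After each discard, half of the critical number of images remain and
    the rate is halved, so refilling to the critical number takes as long
    as all of the recording so far; each discard time is therefore double
    the previous one.
    """
    first = critical_images / initial_fps
    return [first * 2 ** k for k in range(count)]


print(halving_times(2, 1200, 4))   # [600.0, 1200.0, 2400.0, 4800.0]
print(halving_times(30, 1200, 3))  # [40.0, 80.0, 160.0]
```

With an initial rate of 2 fps, the first discard occurs at 600 seconds (10 minutes), consistent with the example above.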
Processor 205 may execute instructions necessary to carry out or control the operation of many functions performed by device 200 (e.g., the generation and/or processing of timelapse video in accordance with operation 100). Processor 205 may, for instance, drive display 210 and receive user input from user interface 215. User interface 215 can take a variety of forms, such as a button, keypad, dial, a click wheel, keyboard, display screen and/or a touch screen. Processor 205 may be a system-on-chip such as those found in mobile devices and include a dedicated graphics processing unit (GPU). Processor 205 may be based on reduced instruction-set computer (RISC) or complex instruction-set computer (CISC) architectures or any other suitable architecture and may include one or more processing cores. Graphics hardware 220 may be special purpose computational hardware for processing graphics and/or assisting processor 205 in processing graphics information. In one embodiment, graphics hardware 220 may include a programmable graphics processing unit (GPU).
Sensor and camera circuitry 250 may capture still and video images that may be processed to generate images in accordance with this disclosure. Output from camera circuitry 250 may be processed, at least in part, by video codec(s) 255 and/or processor 205 and/or graphics hardware 220, and/or a dedicated image processing unit incorporated within circuitry 250. Images so captured may be stored in memory 260 and/or storage 265. Memory 260 may include one or more different types of media used by processor 205, graphics hardware 220, and image capture circuitry 250 to perform device functions. For example, memory 260 may include memory cache, read-only memory (ROM), and/or random access memory (RAM). Storage 265 may store media (e.g., audio, image and video files), computer program instructions or software, preference information, device profile information, and any other suitable data. Storage 265 may include one or more non-transitory storage media including, for example, magnetic disks (fixed, floppy, and removable) and tape, optical media such as CD-ROMs and digital video disks (DVDs), and semiconductor memory devices such as Electrically Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM). Memory 260 and storage 265 may be used to retain computer program instructions or code organized into one or more modules and written in any desired computer programming language. When executed by, for example, processor 205, such computer program code may implement one or more of the methods described herein.
Referring again to
It is generally not optimal to compress the images iteratively based on their neighbors, because, as pointed out above, the final set of images is not determined until the operation is completed. More aggressive compression techniques can therefore be difficult to implement. However, if particular images are predicted ahead of time to be deleted, then more aggressive compression techniques can be implemented to reduce the size of the image data. For example, if operation 100 is implemented such that all odd numbered images are generally deleted, then all or some of the odd numbered images can be more aggressively compressed. Likewise, if particular images are slated to be deleted based on an image parameter, as described in more detail below, then those images can be aggressively compressed.
Once operation 100 has stopped, the stored images are encoded into a movie clip according to a video encoding format, such as one of the MPEG formats. It will be appreciated that other encoding formats, such as HEVC, Dirac, RealVideo, etc., can be used, as known in the art.
Operation 300 allows images to be discarded in an intelligent fashion. For example, if an anomaly is detected in an image, that image can be tagged for deletion. Perhaps the camera is disturbed or obscured during image acquisition. If that event causes an anomaly in the measured parameter, then the corresponding frame may be slated for deletion. Many techniques of implementing the tagging operation 300 will be apparent to the skilled artisan. For example, the default operation may call for all odd numbered images to be deleted. But the operation may check for any anomalous images within one or two neighboring images and delete the anomalous images preferentially. Generally, it will not be desirable to delete several consecutive images. According to an alternative embodiment, tagged images may not be deleted. Rather, the tag(s) is maintained with the images and embedded into the resulting output movie file such that the tag(s) can be used to inform further video editing.
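One way the preferential deletion of anomalous images might be arranged is sketched below in Python. This is an assumption-laden illustration rather than the disclosed tagging operation: the function names, the use of a per-image brightness value as the measured parameter, the median-based anomaly test, and the threshold are all hypothetical. Exactly one image of each consecutive pair is kept, so half of the images are still discarded, although two adjacent images in neighboring pairs may both be discarded.

```python
from statistics import median


def tag_anomalies(values, threshold):
    """Indices whose measured parameter deviates from the median of all
    values by more than `threshold` (an assumed anomaly criterion)."""
    mid = median(values)
    return {i for i, v in enumerate(values) if abs(v - mid) > threshold}


def select_survivors(num_images, tagged):
    """Keep one image from each consecutive pair.

    Default: keep the first image of the pair and discard the second.
    If only the first image is tagged, keep the second instead.
    """
    survivors = []
    for first in range(0, num_images - 1, 2):
        second = first + 1
        if first in tagged and second not in tagged:
            survivors.append(second)
        else:
            survivors.append(first)
    if num_images % 2:  # an unpaired final image is simply kept
        survivors.append(num_images - 1)
    return survivors


# Hypothetical per-image brightness; image 2 was captured while obscured.
brightness = [100, 101, 5, 99, 100, 102, 98, 100]
tagged = tag_anomalies(brightness, threshold=20)
print(tagged)                                     # {2}
print(select_survivors(len(brightness), tagged))  # [0, 3, 4, 6]
```

Here the default rule would have kept image 2 and discarded image 3; because image 2 is tagged, image 3 is kept in its place.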
It is to be understood that the above description is intended to be illustrative, and not restrictive. The material has been presented to enable any person skilled in the art to make and use the invention as claimed and is provided in the context of particular embodiments, variations of which will be readily apparent to those skilled in the art (e.g., some of the disclosed embodiments may be used in combination with each other). In addition, it will be understood that some of the operations identified herein may be performed in different orders. The scope of the invention therefore should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.”
Number | Date | Country |
---|---|---|
20150350591 A1 | Dec 2015 | US |