Augmented reality is becoming a greater part of the computer user experience. Through augmented reality, a computer user wears a head mounted display (“HMD”) that projects computer generated images onto a real-world scene, thus augmenting the scene with computer generated information. This information can be in the form of graphics or text. Cameras mounted on the head mounted display pick up the images of what the user is looking at in the real world. In order to match the high resolution of human viewing, these cameras need to have extraordinarily high resolution on the order of 50 megapixels. A 50 megapixel camera requires tremendous processing power to process all of those pixels, but HMDs are limited in their computing power and battery power. It is not economical or practical from an engineering standpoint to use 50 megapixel cameras. However, there remains a desire for high resolution images for a user to view in order to make the augmented reality experience truly immersive.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description section. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.
Aspects are directed to a method for using a foveated camera for video augmented reality and a head mounted display. The method includes placing a camera in a binning mode; capturing a full frame binned image using the camera and turning off the binning mode of the camera; selecting a region of interest within the full frame; capturing the region of interest at a higher resolution than the resolution of the camera when in binning mode; compositing the region of interest with the full frame binned image; and displaying the composited image.
Additional aspects include a head mounted display that includes a camera capable of capturing a region of interest and of binning; a display; a processor coupled to the camera and the display, the processor operable to place the camera in a binning mode; capture a full frame binned image; designate a region of interest within the camera frame; capture the region of interest using the camera; and composite the region of interest with the full frame binned image.
Additional aspects include computer readable media containing computer executable instructions which, when executed by a computer, perform a method comprising the acts of receiving a lower resolution binned full frame image from a camera; receiving a higher resolution region of interest image from within a frame of the camera; compositing the received lower resolution image with the higher resolution image; and displaying the composited image.
Additional aspects include a method of reducing motion blur by enhancing the perceived frame rate. The method includes computing a scene for a left frame at time t=0; displaying the computed scene for left frame at time t=0; computing a scene for a right frame at time t=n/2, where n is the refresh period for the left frame and the right frame; and displaying the computed scene for the right frame at time t=n/2.
Examples are implemented as a method, computer process, a computing system, or as an article of manufacture such as a device, computer program product, or computer readable media. According to an aspect, the computer program product is a computer storage media readable by a computer system and encoding a computer program of instructions for executing a computer process.
The details of one or more aspects are set forth in the accompanying drawings and description below. Other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that the following detailed description is explanatory only and is not restrictive of the claims.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate various aspects. In the drawings:
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar elements. While examples may be described, modifications, adaptations, and other implementations are possible. For example, substitutions, additions, or modifications may be made to the elements illustrated in the drawings, and the methods described herein may be modified by substituting, reordering, or adding stages to the disclosed methods. Accordingly, the following detailed description is not limiting, but instead, the proper scope is defined by the appended claims. Examples may take the form of a hardware implementation, or an entirely software implementation, or an implementation combining software and hardware aspects. The following detailed description is, therefore, not to be taken in a limiting sense.
Aspects of the present disclosure are directed to a method and system for using a foveated camera for video augmented reality and a head mounted display. Foveated imaging is a digital image processing technique in which the image resolution, or amount of detail, varies across the image according to one or more fixation points. A fixation point indicates the highest resolution region of the image and corresponds to the center of the eye's retina, the fovea. Matching human vision requires high resolution cameras, on the order of 50 megapixels, and processing this amount of information is computationally expensive, relatively slow, and power hungry. This is not practical for most head mounted display units, many of which are not tethered to workstations that can provide that kind of power.
However, modern cameras may have two features that can be taken advantage of to provide the high immersion sought, while not suffering from the high computational and power requirements of processing a full 50 megapixel image. The first feature is called “region of interest” that allows a system to inform a camera to capture just a part of an image. So, for example, instead of capturing a full 16 megapixels or 50 megapixels of an image, the system can instruct the camera to only capture a portion, for example, a 200×200 pixel square from anywhere in the field of view.
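The region of interest feature can be illustrated with a minimal sketch that works on an in-memory frame rather than on camera hardware. The function names `roi_window` and `crop`, and the choice to clamp the window so it stays inside the frame, are assumptions made for illustration; the disclosure does not specify how a camera handles a window near the frame edge.

```python
def roi_window(center_x, center_y, size, frame_width, frame_height):
    """Return (left, top) of a size x size window centered near the given
    point, shifted as needed so the window stays inside the frame.
    Clamping is an illustrative assumption, not taken from the disclosure."""
    left = min(max(center_x - size // 2, 0), frame_width - size)
    top = min(max(center_y - size // 2, 0), frame_height - size)
    return left, top


def crop(frame, left, top, size):
    """Extract a size x size square from a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


# A small 6 x 6 frame whose pixel value encodes its position (10 * y + x).
frame = [[10 * y + x for x in range(6)] for y in range(6)]
left, top = roi_window(center_x=5, center_y=0, size=2, frame_width=6, frame_height=6)
roi = crop(frame, left, top, 2)
```

With a real sensor the window would be requested from the camera itself, so that only the selected pixels are read out and processed.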
The second feature is known as “binning.” Binning permits a system to instruct the camera to output the average of each n×n block of pixels rather than every individual pixel. For example, the camera can be instructed to capture the full image by averaging 2×2 or 3×3 areas of pixels. Thus, the effective resolution of the image is reduced. For example, for a 16 megapixel camera instructed to perform 4×4 binning, the resultant image has only 1/16th the pixel count of a full 16 megapixel image, or about 1 megapixel.
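As a rough software illustration of what binning does, the following sketch averages non-overlapping n×n blocks of a grayscale image held as a list of rows. On an actual camera the averaging happens on the sensor; the function name `bin_image` and the requirement that the image dimensions be multiples of n are assumptions made to keep the example short.

```python
def bin_image(pixels, n):
    """Average each non-overlapping n x n block of a 2D grayscale image."""
    height, width = len(pixels), len(pixels[0])
    if height % n or width % n:
        raise ValueError("image dimensions must be multiples of n")
    binned = []
    for top in range(0, height, n):
        row = []
        for left in range(0, width, n):
            block = [pixels[y][x]
                     for y in range(top, top + n)
                     for x in range(left, left + n)]
            row.append(sum(block) / (n * n))
        binned.append(row)
    return binned


# An 8 x 8 test image binned 4 x 4 yields a 2 x 2 image: 4 pixels instead of 64.
full = [[float(x + y) for x in range(8)] for y in range(8)]
binned = bin_image(full, 4)
```

The 64-pixel input becomes a 4-pixel output, the same 1/16th reduction described for 4×4 binning of a 16 megapixel image.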
The present disclosure takes advantage of both “region of interest” and binning to produce a composite picture that features a high resolution region of interest embedded within a full frame that has been binned. The “region of interest” can be either the center of the screen or, when eye tracking is used, the area at which the eye is looking. Furthermore, in order to enhance the look of the compositing, late stage re-projection is used to make up for the delay between the time at which the region of interest is captured and the time at which the full binned image is captured. In addition, the disclosure may use feathering to smooth out the periphery around the region of interest where it meets the binned image.
A Storage Device 120 stores the region of interest image and the full frame binned image that will be used in compositing the picture.
Interface 122 couples the Processor 118 to a computer 124. The computer 124 typically provides the augmented images and text to be superimposed upon the images received from the cameras. While the compositing has been described as being performed by the Processor 118, processing of compositing the images may be performed by the computer 124 in alternative embodiments of the disclosure.
The remainder of the frame 210, such as palm fronds 230, will have a binned picture taken. This binned picture will be a lower resolution picture captured with a binning value of, for example, 4×4. Other binning sizes are also possible depending upon the amount of computing power available and the degree of resolution desired for the remainder of frame 210. Once the binned picture is composited with the region of interest image, the frame will comprise a high resolution image within the region of interest and a low resolution image across the remainder of the frame.
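The compositing step can be sketched as follows, under the assumption that the binned image is first expanded back to full frame size by pixel replication and the region of interest is then pasted over it. The names `upsample` and `composite` are hypothetical, and pixel replication is only one possible way to enlarge the binned image; the disclosure does not prescribe one.

```python
def upsample(binned, n):
    """Expand each binned pixel into an n x n block (pixel replication)."""
    frame = []
    for row in binned:
        wide = [value for value in row for _ in range(n)]
        frame.extend(list(wide) for _ in range(n))
    return frame


def composite(binned, n, roi, left, top):
    """Paste a full resolution region of interest over the upsampled binned frame."""
    frame = upsample(binned, n)
    for dy, roi_row in enumerate(roi):
        frame[top + dy][left:left + len(roi_row)] = roi_row
    return frame


binned = [[1.0, 2.0], [3.0, 4.0]]   # 2 x 2 image produced with 2 x 2 binning
roi = [[9.0, 9.0], [9.0, 9.0]]      # 2 x 2 full resolution region of interest
frame = composite(binned, 2, roi, left=1, top=1)
```

This sketch omits re-projection and feathering, so the seam between the two resolutions is a hard edge.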
Also, prior to compositing, late stage re-projection may need to be performed to align the region of interest and the low resolution, binned full frame image. This is because there is some small time period, say two milliseconds for a camera with a fast shutter speed, in which there may have been some movement between when the region of interest was captured and when the full frame was captured. Thus, the binned, full frame image may have to be slightly translated and/or rotated prior to compositing.
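The alignment described above can be approximated, for illustration only, by a 2D rigid transform applied to pixel coordinates of the binned image. A real late stage re-projection would typically be driven by head pose data; the function name `reproject_point` and the reduction to an in-plane rotation plus a translation are simplifying assumptions.

```python
import math


def reproject_point(x, y, dx, dy, angle, cx=0.0, cy=0.0):
    """Rotate (x, y) by `angle` radians about (cx, cy), then translate by (dx, dy)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rx = cx + (x - cx) * cos_a - (y - cy) * sin_a
    ry = cy + (x - cx) * sin_a + (y - cy) * cos_a
    return rx + dx, ry + dy


# Pure translation: a small shift accumulated between the two captures.
shifted = reproject_point(10.0, 20.0, dx=1.5, dy=-0.5, angle=0.0)

# Pure rotation: a quarter turn about the origin, used here only as a check.
rotated = reproject_point(1.0, 0.0, dx=0.0, dy=0.0, angle=math.pi / 2)
```

In practice the translation and rotation over a two millisecond interval would be very small, and the transform would be applied to every pixel of the binned image before compositing.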
In an alternative embodiment, there may be more than the two levels of resolution described above (a high resolution region of interest image and a lower resolution full frame image). For example, one can foresee a three level resolution composite image. This would comprise the high resolution region of interest, a medium resolution binned area surrounding the region of interest, and a lower resolution binned area for the remainder of the frame. For example, the region of interest may be captured as a full resolution 200×200 pixel square; a 400×400 pixel square around the region of interest may be captured at a binning value of 2×2; and the remainder of the frame may be captured at a binning value of 4×4. Then, all three of these images, following late stage re-projection, would be combined into the final image to be displayed.
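The three level example can be worked through numerically. The sketch below assigns a binning value to each pixel position and estimates how many pixels would be captured, assuming a hypothetical 4000×4000 (16 megapixel) sensor and assuming that each of the three captures covers its whole area, so the captures overlap. Neither assumption comes from the disclosure.

```python
def inside_square(x, y, center_x, center_y, size):
    """True if (x, y) lies in the size x size square centered at the given point."""
    half = size // 2
    return -half <= x - center_x < half and -half <= y - center_y < half


def binning_level(x, y, center_x, center_y):
    """Binning value used for the pixel at (x, y) in the three level example."""
    if inside_square(x, y, center_x, center_y, 200):
        return 1   # full resolution region of interest
    if inside_square(x, y, center_x, center_y, 400):
        return 2   # medium resolution surround, 2 x 2 binning
    return 4       # remainder of the frame, 4 x 4 binning


def captured_pixels(frame_width, frame_height):
    """Pixels captured when each level is captured over its whole area."""
    roi = 200 * 200
    surround = (400 * 400) // (2 * 2)
    full_frame = (frame_width * frame_height) // (4 * 4)
    return roi + surround + full_frame


total = captured_pixels(4000, 4000)
```

Under these assumptions the three captures together hold roughly 1.08 million pixels, compared with 16 million for an unbinned frame.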
At OPERATION 312, late stage re-projection is performed to account for small translational and rotational changes that may have taken place between when the full frame binned image was captured and when the region of interest was captured. In other words, the full frame image may be slightly translated and rotated to account for changes that occur between shots, typically 2 ms or less. At OPERATION 314, the re-projected binned image and the region of interest are composited together and sent to the displays. During compositing, the region around the region of interest may also be feathered to make a smoother transition between the region of interest and the remainder of the frame. The method 300 then returns to OPERATION 304.
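Feathering during compositing can be sketched as a per-pixel blend whose weight falls off toward the edge of the region of interest. The linear ramp, the `border` width parameter, and the function names below are assumptions chosen for illustration; the disclosure does not specify a particular feathering function.

```python
def feather_weight(x, y, width, height, border):
    """Weight of the region of interest pixel at (x, y): 1.0 in the interior,
    ramping down linearly over `border` pixels toward the edge."""
    edge_distance = min(x, y, width - 1 - x, height - 1 - y)
    return min(1.0, (edge_distance + 1) / (border + 1))


def blend(roi_value, background_value, weight):
    """Mix a region of interest pixel with the re-projected binned pixel beneath it."""
    return weight * roi_value + (1.0 - weight) * background_value


# Corner pixel of a 200 x 200 region with a 3 pixel feather border.
corner_weight = feather_weight(0, 0, 200, 200, border=3)
corner_value = blend(10.0, 2.0, corner_weight)

# A pixel well inside the region keeps its full resolution value.
center_weight = feather_weight(100, 100, 200, 200, border=3)
```

A wider border gives a softer transition at the cost of a smaller area shown at full sharpness.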
In an additional improvement, it is possible to improve the apparent refresh rate of the displays 104 and 106. Refresh rates of HMDs are currently around 90 Hz, so the image is refreshed approximately every 11 ms. Both displays are typically refreshed in sync, so that at time t=0, both begin scanning from the top of the display downward. A problem is that this refresh rate still leads to a strobing appearance if the user of the HMD 102 moves his or her head quickly.
It is desirable to increase the refresh rate of the displays 104 and 106. One method is to leave the refresh rate of the displays 104, 106 unchanged and instead improve the apparent refresh rate. The apparent refresh rate may be improved by staggering the refresh times of the left and right displays 104 and 106, instead of having both displays 104, 106 refreshed at the same time. With staggered refresh, the left display 104 may begin to be refreshed at time t=0 ms as before; however, given a refresh rate of 90 Hz, for example, the right display 106 will be refreshed from top to bottom beginning at time t=5.5 ms. Thus the refresh rates of both displays are still 90 Hz, but because the refresh of one of the displays is offset by half the refresh period, the user's brain sees a refresh rate of 180 Hz.
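The timing figures follow directly from the refresh rate, as the short calculation below shows. The function name `stagger_timing` is hypothetical. The text rounds the 90 Hz period to 11 ms and the half-period offset to 5.5 ms; the unrounded values are slightly larger.

```python
def stagger_timing(refresh_hz):
    """Return (period_ms, right_offset_ms, apparent_hz) for staggered displays."""
    period_ms = 1000.0 / refresh_hz      # time between refreshes of one display
    right_offset_ms = period_ms / 2.0    # right display starts half a period later
    apparent_hz = 2.0 * refresh_hz       # a refresh begins twice per period overall
    return period_ms, right_offset_ms, apparent_hz


period_ms, right_offset_ms, apparent_hz = stagger_timing(90.0)
print(f"period {period_ms:.2f} ms, offset {right_offset_ms:.2f} ms, apparent {apparent_hz:.0f} Hz")
```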
Because of the time delay between refreshing the left display 104 and the right display 106 (recall that the left display 104 is refreshed at time t=0 ms, while the right display 106 is refreshed at time t=5.5 ms) the images shown in the display must be calculated and generated for these different times. Also, any cameras associated with the system need to be synced with the displays 104, 106, so that the camera 126 associated with the left display 104 will have a sync signal sent that corresponds to the refresh time of the left display 104 and the camera 128 associated with the right display 106 will have a sync signal sent that corresponds to the refresh time of the right display 106.
Thus, everything is staggered by half a frame: both the images generated by the computer 124 and the images generated by the cameras 126 and 128.
At time t=n/2, or 5.5 ms into the display period, the left display 408 is halfway through its scan. The scan of the right display 410 is just beginning at this time. At time t=n, or 11 ms into the display period, the scan of the left display 412 has completed and scanning has returned to the top row of the display. The right display 414 is now halfway through its scan.
Finally, at time t=3n/2 or 16.5 ms, the left display 416 is in the middle of its second scan and the right display 418 has completed its first scan and is just beginning its second scan.
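The scan positions described at each of these times can be reproduced with a small calculation, using the rounded period of n=11 ms from the text. The function name `scan_phase` is hypothetical; a phase of 0.0 means the scan is at the top row and 0.5 means it is halfway down.

```python
def scan_phase(t_ms, period_ms, start_offset_ms=0.0):
    """Fraction of the current top-to-bottom scan completed at time t_ms."""
    if t_ms < start_offset_ms:
        return None  # this display has not started scanning yet
    return ((t_ms - start_offset_ms) % period_ms) / period_ms


PERIOD_MS = 11.0  # rounded refresh period n used in the text
phases = {}
for t in (5.5, 11.0, 16.5):
    left = scan_phase(t, PERIOD_MS)
    right = scan_phase(t, PERIOD_MS, start_offset_ms=PERIOD_MS / 2)
    phases[t] = (left, right)
    print(t, left, right)
```

At each of the three times one display is at the top of its scan while the other is halfway through, which is the staggering the passage describes.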
Thus, the refresh rates of both displays are 90 Hz; however, the refresh time is staggered between the left and right displays 104, 106 giving the brain the impression of a 180 Hz scan rate.
At OPERATION 506, the method 500 begins to display the left display at time t=i. At OPERATION 508, the right display will begin displaying at time t=n/2+i. If the method 500 is operating as an augmented display device, versus a virtual reality device, then at OPERATION 510, the cameras are synchronized with the respective vertical retrace signals for the respective displays. In other words, the right display is synced with the right camera and the left display is synced with the left camera. Then, at OPERATION 512, i is incremented by n milliseconds and the method 500 returns to OPERATION 504.
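The loop formed by these operations can be sketched as a generator of start times. This is only a schematic of the timing: the function name `staggered_schedule`, the dictionary keys, and the `augmented` flag are hypothetical, and a real device would drive the displays and cameras from hardware sync signals rather than from computed timestamps.

```python
def staggered_schedule(period_ms, frames, augmented=True):
    """Yield, per refresh cycle, the start times for the left and right
    displays and, for an augmented reality device, the matching camera syncs."""
    i = 0.0
    for _ in range(frames):
        left_start = i                      # left display begins at t = i
        right_start = i + period_ms / 2     # right display begins at t = n/2 + i
        entry = {"left_display": left_start, "right_display": right_start}
        if augmented:
            # Each camera is synced to the display on its own side.
            entry["left_camera_sync"] = left_start
            entry["right_camera_sync"] = right_start
        yield entry
        i += period_ms                      # i is incremented by n milliseconds


schedule = list(staggered_schedule(11.0, frames=3))
```

Passing `augmented=False` models the virtual reality case, in which no camera synchronization entries are produced.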
The aspects and functionalities described herein may operate via a multitude of computing systems including, without limitation, head mounted displays with and without computer assistance, or head mounted displays in conjunction with desktop computer systems, wired and wireless computing systems, mobile computing systems (e.g., mobile telephones, netbooks, tablet or slate type computers, notebook computers, and laptop computers), hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, and mainframe computers.
In addition, according to an aspect, the aspects and functionalities described herein may operate over distributed systems (e.g., cloud-based computing systems), where application functionality, memory, data storage and retrieval and various processing functions are operated remotely from each other over a distributed computing network, such as the Internet or an intranet. According to an aspect, user interfaces and information of various types are displayed via on-board computing device displays or via remote display units associated with one or more computing devices. For example, user interfaces and information of various types are displayed and interacted with on a wall surface onto which user interfaces and information of various types are projected. Interaction with the multitude of computing systems with which implementations are practiced includes keystroke entry, touch screen entry, voice or other audio entry, gesture entry where an associated computing device is equipped with detection (e.g., camera) functionality for capturing and interpreting user gestures for controlling the functionality of the computing device, and the like.
As stated above, according to an aspect, a number of program modules and data files are stored in the system memory 604. While executing on the processing unit 602, the program modules 606 (e.g., compositing system 655) perform processes including, but not limited to, one or more of the stages of the methods 300 and 500 illustrated in
According to an aspect, aspects are practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. For example, aspects are practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in
According to an aspect, the computing device 600 has one or more input device(s) 612 such as a keyboard, a mouse, a pen, a sound input device, a touch input device, etc. The output device(s) 614 such as a head mounted display, display, speakers, a printer, etc. are also included according to an aspect. The aforementioned devices are examples and others may be used. According to an aspect, the computing device 600 includes one or more communication connections 616 allowing communications with other computing devices 618. Examples of suitable communication connections 616 include, but are not limited to, radio frequency (RF) transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.
The term computer readable media as used herein includes computer storage media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules. The system memory 604, the removable storage device 609, and the non-removable storage device 610 are all computer storage media examples (i.e., memory storage). According to an aspect, computer storage media includes RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 600. According to an aspect, any such computer storage media is part of the computing device 600. Computer storage media does not include a carrier wave or other propagated data signal.
According to an aspect, communication media is embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. According to an aspect, the term “modulated data signal” describes a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
According to an aspect, one or more application programs 750 are loaded into the memory 762 and run on or in association with the operating system 764. Examples of the application programs include phone dialer programs, e-mail programs, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, and so forth. According to an aspect, the compositing system 655 is loaded into memory 762. The system 702 also includes a non-volatile storage area 768 within the memory 762. The non-volatile storage area 768 is used to store persistent information that should not be lost if the system 702 is powered down. The application programs 750 may use and store information in the non-volatile storage area 768, such as e-mail or other messages used by an e-mail application, and the like. A synchronization application (not shown) also resides on the system 702 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 768 synchronized with corresponding information stored at the host computer. As should be appreciated, other applications may be loaded into the memory 762 and run on the mobile computing device 700.
According to an aspect, the system 702 has a power supply 770, which is implemented as one or more batteries. According to an aspect, the power supply 770 further includes an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries.
According to an aspect, the system 702 includes a radio 772 that performs the function of transmitting and receiving radio frequency communications. The radio 772 facilitates wireless connectivity between the system 702 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 772 are conducted under control of the operating system 764. In other words, communications received by the radio 772 may be disseminated to the application programs 750 via the operating system 764, and vice versa.
According to an aspect, the visual indicator 720 is used to provide visual notifications and/or an audio interface 774 is used for producing audible notifications via the audio transducer 725. In the illustrated example, the visual indicator 720 is a light emitting diode (LED) and the audio transducer 725 is a speaker. These devices may be directly coupled to the power supply 770 so that when activated, they remain on for a duration dictated by the notification mechanism even though the processor 760 and other components might shut down for conserving battery power. The LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device. The audio interface 774 is used to provide audible signals to and receive audible signals from the user. For example, in addition to being coupled to the audio transducer 725, the audio interface 774 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation. According to an aspect, the system 702 further includes a video interface 776 that enables an operation of an on-board camera 730 to record still images, video stream, and the like.
According to an aspect, a mobile computing device 700 implementing the system 702 has additional features or functionality. For example, the mobile computing device 700 includes additional data storage devices (removable and/or non-removable) such as, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
According to an aspect, data/information generated or captured by the mobile computing device 700 and stored via the system 702 is stored locally on the mobile computing device 700, as described above. According to another aspect, the data is stored on any number of storage media that is accessible by the device via the radio 772 or via a wired connection between the mobile computing device 700 and a separate computing device associated with the mobile computing device 700, for example, a server computer in a distributed computing network, such as the Internet. As should be appreciated such data/information is accessible via the mobile computing device 700 via the radio 772 or via a distributed computing network. Similarly, according to an aspect, such data/information is readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.
Implementations, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to aspects. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
The description and illustration of one or more examples provided in this application are not intended to limit or restrict the scope as claimed in any way. The aspects, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode. Implementations should not be construed as being limited to any aspect, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an example with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate examples falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope.
Number | Date | Country | |
---|---|---|---|
20180220068 A1 | Aug 2018 | US |