Stereoscopic images create the perception of depth by presenting slightly different versions of an image to the left and right eyes of a user. Typically, the differences between the images lie in the horizontal locations of objects within them. When processed by the brain, these location differences create the perception of depth. Stereoscopic images and videos are often used in augmented reality and virtual reality applications.
Depth cues in images or videos can make them appear more realistic. For example, a rendered virtual reality (VR) environment in which objects located further from a viewing location appear further away from a viewer can result in a more immersive VR experience. Generation of 360-degree content for VR applications is of interest given its wide field of view, but some existing mechanisms for 360-degree VR content capture provide limited depth cues. VR content generated by such mechanisms can thus result in a less immersive experience. Such approaches typically have the additional drawback that individual images need to be stitched together before the VR content can be viewed.
Devices exist that can generate stereoscopic content with a wide field of view. For example, devices compliant with the VR180 format can generate stereoscopic content having a horizontal field of view of substantially 180 degrees. This horizontal field of view is wider than that of typical existing head-mounted devices (HMDs). As such, while viewing such stereoscopic content on a typical HMD, a user can move their head to the left or to the right to some extent and see new content, similar to how new content is observed in the real world.
One consequence of using a wide-angle camera to generate stereoscopic content is that a portion of one stereoscopic lens can be captured in images taken using the other lens. That is, the left image of a stereoscopic image can contain a portion of the right lens of the camera, and the right image can contain a portion of the left lens. The lens portions captured in the left and right images are referred to herein as lens artifacts. The presence of lens artifacts in stereoscopic images and videos can result in a less immersive experience for a viewer of the stereoscopic content. Lens artifacts can remind a viewer that they are viewing recorded content, and the fact that a lens artifact is not captured in both the left and right images can leave a viewer feeling disoriented. Narrowing the field of view of a stereoscopic camera may reduce or eliminate lens artifacts, but only at the cost of capturing less of the scene. The technologies described herein remove lens artifacts from stereoscopic images and replace them with image data that blends in with the remainder of the image while retaining the camera's full field of view. The technologies described herein can remove lens artifacts from stereoscopic videos as well as stereoscopic images.
In the following description, specific details are set forth, but embodiments of the technologies described herein may be practiced without these specific details. Well-known circuits, structures, and techniques have not been shown in detail to avoid obscuring an understanding of this description. “An embodiment,” “various embodiments,” “some embodiments,” and the like may include features, structures, or characteristics, but not every embodiment necessarily includes the particular features, structures, or characteristics.
Some embodiments may have some, all, or none of the features described for other embodiments. “First,” “second,” “third,” and the like describe a common object and indicate different instances of like objects being referred to. Such adjectives do not imply objects so described must be in a given sequence, either temporally or spatially, in ranking, or any other manner. “Connected” may indicate elements are in direct physical or electrical contact with each other and “coupled” may indicate elements co-operate or interact with each other, but they may or may not be in direct physical or electrical contact. Terms modified by the word “substantially” include arrangements, orientations, spacings, or positions that vary slightly from the meaning of the unmodified term. For example, a stereoscopic camera with a field of view of substantially 180 degrees includes cameras that have a field of view within a few degrees of 180 degrees.
The description may use the phrases “in an embodiment,” “in embodiments,” “in some embodiments,” and “in various embodiments,” each of which may refer to one or more of the same or different embodiments. Furthermore, the terms “comprising,” “including,” “having,” and the like, as used with respect to embodiments of the present disclosure, are synonymous.
Reference is now made to the drawings wherein similar or same numbers may be used to designate the same or similar parts in different figures. The use of similar or same numbers in different figures does not mean all figures including similar or same numbers constitute a single or same embodiment. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding thereof. It may be evident, however, that the embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form to facilitate a description thereof. The intention is to cover all modifications, equivalents, and alternatives within the scope of the claims.
The computing device 100 can generate stereoscopic images and videos through the use of the stereoscopic camera 110. As used herein, the term “content” can refer to an image, a portion of an image, multiple images, a video, a portion of a video, or multiple videos. The stereoscopic camera 110 comprises a left image sensor 140, a left lens 145, a right image sensor 150, and a right lens 155. The left and right image sensors 140 and 150 produce left and right image sensor data, respectively. Left and right image sensor data are used to generate images from which stereoscopic images or videos can be generated. A stereoscopic image can comprise a left image and a right image. A stereoscopic video comprises a series of stereoscopic images (or frames) and an image in a stereoscopic video comprises a left image and a right image. The series of left images in a stereoscopic video can be referred to as a left video and the series of right images in a stereoscopic video can be referred to as a right video.
In some embodiments, stereoscopic images and videos are generated by the stereoscopic camera 110. In other embodiments, stereoscopic content is generated by the one or more processors 120 based on left and right image sensor data or left and right images provided to the one or more processors 120 by the stereoscopic camera 110. Reference to stereoscopic content that is generated by a stereoscopic camera, generated using a stereoscopic camera, or captured by a stereoscopic camera refers to stereoscopic content that is provided by a stereoscopic camera or stereoscopic content generated by components (e.g., one or more processors 120 of
In some embodiments, stereoscopic content comprises digital images and videos and can conform to any image or video format, such as JPEG (Joint Photographic Experts Group), TIFF (Tagged Image File Format), GIF (Graphics Interchange Format), and PNG (Portable Network Graphics) image formats; and AVI (Audio Video Interleave), WMV (Windows Media Video), any of the various MPEG (Moving Picture Experts Group) formats (e.g., MPEG-1, MPEG-2, MPEG-4), QuickTime, and 3GPP (3rd Generation Partnership Project) video formats.
The stereoscopic camera 110 can be incorporated into the computing device 100 (such as in a head-mounted device (HMD) or smartphone) or communicatively coupled to the computing device 100 through a wired or wireless connection. In some embodiments, the one or more processors 120 can comprise one or more artificial intelligence (AI) accelerators that implement inpainting models that can generate revised stereoscopic content.
In some embodiments, stereoscopic content can contain portions of the camera 110 that protrude from a surface of the camera 110 or the device 100. For example, in embodiments where the camera 110 is a wide-angle stereoscopic camera with left and right lenses 145 and 155 that protrude from a front surface of the camera, a portion of the left and right lenses 145 and 155 can be captured in stereoscopic images or videos captured by the camera 110. As will be discussed in greater detail below, the one or more processors 120 can take a stereoscopic image or video containing lens artifacts and generate a revised stereoscopic image or video in which the lens artifacts have been removed and replaced with content that blends in with the remainder of the stereoscopic image or video.
Stereoscopic content and revised stereoscopic content can be stored in the one or more computer-readable storage media 130, which can be a removable memory card (e.g., Secure Digital (SD) memory card), module, stick, or any other type of removable or non-removable computer-readable storage media described herein. The computing device 100 can further optionally comprise a display 160, a battery 170, a network interface 180, and an antenna 185. In some embodiments, the device 100 is an HMD comprising a screen upon which stereoscopic content can be displayed. In other embodiments, the device 100 is a smartphone and left and right images or left and right videos are shown on left and right portions of the smartphone display, respectively. In such embodiments, the smartphone can be positioned within a virtual reality viewer or other device that, when looked into by a viewer, limits the left eye to seeing only the left portion of the smartphone display and limits the right eye to seeing only the right portion. The network interface 180 allows the computing device 100 to communicate in wired or wireless fashion with other computing devices, such as the remote system 190 or a remote display device 199, using any communication interface, protocol, technology, or combination thereof. The antenna 185 enables wireless communications between the device 100 and other computing devices.
The computing device 100 can send stereoscopic content to the remote system 190 for the generation of revised stereoscopic content. The remote system 190 can be a smartphone 192, laptop 194, personal computer 196, server 197, or any other computing device. The generation of revised stereoscopic images and videos by the remote system 190 can be performed during the post-processing of stereoscopic content captured by the device 100. That is, the remote system 190 can store the received stereoscopic content and generate revised stereoscopic content at any later time. Revised stereoscopic images and videos generated by the remote system 190 can be sent back to the capturing device (i.e., device 100) or the remote display device 199 for display or storage. Revised stereoscopic content can be sent to a remote storage 198 for later retrieval by any of the devices shown in
In an embodiment involving all of the devices illustrated in
The display of revised stereoscopic content can be done in real-time or at a later time after capture. As used herein, the term “real-time” when used in the context of displaying revised stereoscopic images refers to displaying revised stereoscopic content within a short enough time after the stereoscopic content has been captured that a user is unlikely to notice the delay between capture and display or to suffer motion sickness or any other physical effects due to this delay. For example, in a “see-through” augmented reality (AR) embodiment, an HMD with an integrated stereoscopic camera can generate and display revised stereoscopic content in a short enough time after capture that the user feels that they are seeing a live view of what the camera is viewing. Additional content can be added to the revised stereoscopic content before being shown on the display to enable augmented reality use cases.
The camera 200 further comprises a left image sensor associated with the left lens 260 and a right image sensor associated with the right lens 240 (image sensors not shown). The camera 200 further accommodates removable storage media, such as Secure Digital (SD) (e.g., SD, SD High Capacity (SDHC), SD Extended Capacity (SDXC)) or Compact Flash (CF) memory cards for storing stereoscopic content, revised stereoscopic content, or any other content captured or generated by the camera 200. The camera 200 further comprises one or more processors that can process stereoscopic images and videos generated by the camera to produce revised stereoscopic images and videos using the technologies described herein. The camera 200 further comprises a network interface that allows the camera 200 to communicate with other devices via one or more wired or wireless interfaces. The camera 200 further comprises an antenna to enable wireless communication and a rechargeable battery.
The technologies described herein remove lens artifacts captured in stereoscopic images and videos and replace them with content that blends in with the remainder of the image to create revised stereoscopic images and videos. As used herein, the term “blends in” with reference to content that replaces a lens artifact in an image means that the content replacing the lens artifact more closely matches or better fits the image than the lens artifact. In some embodiments, the content replacing a lens artifact can be content that a user would have expected to see had the lens responsible for the artifact not been in the way. For example, with reference to
In some embodiments, a lens artifact is replaced with content that blends in with a stereoscopic image or video via inpainting. As used herein, the term “inpainting” refers to the process of filling in one or more missing portions or replacing one or more portions of an image or video with content generated based on the remainder of the image or video (or a portion thereof). In some embodiments, inpainting is performed using artificial intelligence approaches. For example, an image or image portion with one or more missing portions or portions marked for replacement can be provided as input to a model that can perform inpainting on images and the model can output a revised image with the missing portions filled in or the marked portions replaced with content that blends in with the remainder of the image. Such models can be referred to herein as inpainting models. In this way, revised stereoscopic content can be generated. In some embodiments, an inpainting model can be a trained machine learning model, such as a trained convolutional neural network. In other embodiments, the model can be based on a generative adversarial network (GAN).
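By way of a non-limiting illustration, the following Python sketch applies inpainting to one image of a stereoscopic pair using a binary mask that marks the lens artifact region. OpenCV's classical inpainting routine is used here only as a stand-in for a trained inpainting model, and the file names are assumptions made for the example.

```python
import cv2

# Load one image of a stereoscopic pair and a binary mask whose non-zero
# pixels mark the lens artifact region to be replaced (assumed file names).
left_image = cv2.imread("left.png")
lens_mask = cv2.imread("lens_mask.png", cv2.IMREAD_GRAYSCALE)

# Classical inpainting fills the masked region with content synthesized from
# the surrounding pixels. A trained inpainting model would be invoked in a
# similar way: image and mask in, revised image out.
revised_left = cv2.inpaint(left_image, lens_mask, 5, cv2.INPAINT_TELEA)

cv2.imwrite("revised_left.png", revised_left)
```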
In some embodiments, a device that captures stereoscopic content and generates revised stereoscopic content can utilize trained inpainting models to generate the revised stereoscopic content. The inpainting models can be implemented in dedicated hardware, such as one or more GPUs (graphics processing units), FPGAs (field-programmable gate arrays), or AI accelerators. Such dedicated hardware can be located in any device described or referenced herein that generates revised stereoscopic content.
In some embodiments, performing inpainting on one image (left/right) of a stereoscopic image can be based on a portion of the other image in the stereoscopic image (right/left). This approach takes advantage of the fact that lens artifacts are not captured stereoscopically. For example, a right lens artifact in a left image can be replaced with content based on a portion of the right image that corresponds to where the right lens artifact resides in the left image. Referring back to
In some embodiments, inpainting comprises copying information from one image to another. For example, the left lens artifact 330 in
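A minimal sketch of this cross-copy approach is shown below. The fixed horizontal disparity offset is an assumption standing in for whatever correspondence (for example, one derived from camera calibration) a particular embodiment would use.

```python
import numpy as np

def copy_from_other_view(target, source, mask, disparity_px=40):
    """Replace masked pixels in `target` with pixels taken from `source`.

    `mask` is a boolean array marking the lens artifact in `target`.
    `disparity_px` shifts the sampling location horizontally to account for
    the offset between the left and right viewpoints (an assumed, fixed
    value here; a real system could derive it from calibration).
    """
    revised = target.copy()
    ys, xs = np.nonzero(mask)
    src_xs = np.clip(xs + disparity_px, 0, source.shape[1] - 1)
    revised[ys, xs] = source[ys, src_xs]
    return revised

# Example: remove the right lens artifact from the left image by copying
# the corresponding region from the right image.
# revised_left = copy_from_other_view(left_image, right_image, right_lens_mask)
```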
In some embodiments, inpainting comprises providing content taken from a source image to an inpainting model to generate the content that replaces a lens artifact. For example, inpainting a left image of a stereoscopic image can comprise providing to an inpainting model the left image (or a portion thereof) along with a portion of the right image corresponding to the region of the left image occupied by the right lens artifact.
In embodiments involving the use of inpainting models, only a portion of the image may be provided to the inpainting model. For example, with reference to
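The following sketch illustrates both ideas: cropping a bounding region around the lens mask rather than passing the full image, and gathering the corresponding portion of the other image as guidance. The `model` callable is a hypothetical stand-in for a trained inpainting model and is not part of any particular library; the margin value is likewise an assumption.

```python
import numpy as np

def crop_around_mask(image, mask, margin=32):
    """Return the bounding-box crop of `image` around the masked region,
    expanded by `margin` pixels, together with the crop coordinates."""
    ys, xs = np.nonzero(mask)
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin, image.shape[0])
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin, image.shape[1])
    return image[y0:y1, x0:x1], (y0, y1, x0, x1)

def inpaint_with_reference(left_image, right_image, right_lens_mask, model):
    """Inpaint the right lens artifact in the left image, guided by the
    corresponding portion of the right image (hypothetical model interface)."""
    left_crop, (y0, y1, x0, x1) = crop_around_mask(left_image, right_lens_mask)
    mask_crop = right_lens_mask[y0:y1, x0:x1]
    right_reference = right_image[y0:y1, x0:x1]  # same region in the other view
    filled_crop = model(left_crop, mask_crop, right_reference)
    revised_left = left_image.copy()
    revised_left[y0:y1, x0:x1] = filled_crop
    return revised_left
```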
In some embodiments, the portion of an image to be inpainted can be defined by a mask.
In some embodiments, as the location, size, and shape of a lens artifact in stereoscopic content are fixed for a particular camera, the shape, size, and location of a lens mask for the particular camera can be fixed as well. Lens mask information can identify a lens mask in various fashions. For example, lens mask information can comprise a plurality of (x,y) coordinates that define a polygon. Although the mask 510 in
In other examples, a mask can be a lens mask image in which pixels having one or more specified characteristics specify the region to be filled in or replaced during inpainting. For example, mask 550 of
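The sketch below constructs a binary lens mask from each form of lens mask information discussed above, using OpenCV. The polygon coordinates, image resolution, and the choice of non-black pixels as the marking characteristic are assumptions made for the example.

```python
import cv2
import numpy as np

# Form 1: lens mask defined by a polygon of (x, y) coordinates
# (placeholder coordinates; a real camera would ship fixed, calibrated values).
polygon = np.array([[1800, 900], [1920, 860], [1920, 1080], [1750, 1080]],
                   dtype=np.int32)
mask_from_polygon = np.zeros((1080, 1920), dtype=np.uint8)
cv2.fillPoly(mask_from_polygon, [polygon], 255)

# Form 2: lens mask defined by a mask image in which pixels having a specified
# characteristic (here, assumed to be non-black pixels) mark the region to be
# filled in or replaced during inpainting.
mask_image = cv2.imread("lens_mask_image.png", cv2.IMREAD_GRAYSCALE)
mask_from_image = (mask_image > 0).astype(np.uint8) * 255
```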
In some embodiments, lens mask information can be stored with stereoscopic content for use during post-processing or playback. Lens mask information can be stored as part of the stereoscopic content as metadata or otherwise. In some embodiments, lens mask information can be provided to a device that is to generate revised stereoscopic content, such as the remote system 190 in
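As one purely illustrative arrangement, lens mask information could be carried in a small sidecar metadata record alongside the stereoscopic content so that a post-processing device can locate the artifact regions without re-deriving them. The field names and coordinates below are assumptions, not a defined format.

```python
import json

# Hypothetical per-camera lens mask metadata stored with the stereoscopic
# content (e.g., as a sidecar file) for use during post-processing or playback.
lens_mask_metadata = {
    "left_image": {"right_lens_mask_polygon": [[1800, 900], [1920, 860],
                                               [1920, 1080], [1750, 1080]]},
    "right_image": {"left_lens_mask_polygon": [[120, 900], [0, 860],
                                               [0, 1080], [170, 1080]]},
}

with open("stereo_capture_mask_info.json", "w") as f:
    json.dump(lens_mask_metadata, f, indent=2)
```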
In some embodiments, stereoscopic content is played back on a device with a display having a field of view less than that of the stereoscopic content. For example, existing HMDs or other VR viewers generally have a field of view (FOV) that is narrower than wide-angle stereoscopic cameras, such as cameras conforming to the VR180 format. If a user is viewing stereoscopic content having a FOV greater than the display FOV of the device showing the stereoscopic content, lens artifacts would be displayed when the viewer looks to the far left (and would see the left lens artifact in the right image with their right eye) or to the far right (and would see the right lens artifact in the left image with their left eye). Lens artifacts would not be displayed when a viewer is looking generally straight ahead or moves their head to the left or the right within a certain range from center. In such embodiments, a device can generate and display revised stereoscopic content if it determines that regions of stereoscopic content containing lens artifacts would otherwise be shown on the display. In some embodiments, a device can make such a determination based on the orientation of the viewing device. The orientation of a viewing device can be determined based on sensor data provided by one or more viewing device sensors, such as an accelerometer or gyroscope.
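A simplified sketch of such a determination follows, based only on the viewing device's yaw: if the displayed window of the content overlaps the region known to contain a lens artifact, revised content is generated before display. The display FOV, content FOV, and artifact margin values are illustrative assumptions rather than properties of any particular device.

```python
def artifact_would_be_visible(yaw_deg, display_fov_deg=100.0,
                              content_fov_deg=180.0, artifact_margin_deg=15.0):
    """Return True if the displayed portion of the content would include a
    region containing a lens artifact, based on viewing-device yaw.

    Assumes yaw 0 is straight ahead and that lens artifacts occupy the
    outermost `artifact_margin_deg` at each edge of the content FOV
    (illustrative values, not taken from any particular camera).
    """
    display_edge = abs(yaw_deg) + display_fov_deg / 2.0
    artifact_start = content_fov_deg / 2.0 - artifact_margin_deg
    return display_edge > artifact_start

# Example: decide whether to run inpainting before showing the next frame.
# if artifact_would_be_visible(current_yaw):
#     frame = generate_revised_stereoscopic_frame(frame)
```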
As previously discussed, the inpainting of stereoscopic content to remove and replace lens artifacts can be performed locally at a capturing device (e.g., device 100 of
In some embodiments, the technologies described herein can be used to remove additional artifacts from stereoscopic content. These additional artifacts can be caused by the presence of camera or device features other than lenses, such as buttons, switches, latches, housing portions, or any other camera or device feature located in the camera's field of view. The technologies disclosed herein can be used to generate revised stereoscopic content in which these additional artifacts are removed and replaced with content that blends in with the stereoscopic content.
The technologies, techniques, and embodiments described herein can be performed by any of a variety of computing devices, including mobile devices (e.g., smartphones, handheld computers, tablet computers, laptop computers, media players, portable gaming consoles, cameras and video recorders, wearables (e.g., smartwatches)), non-mobile devices (e.g., desktop computers, servers, stationary gaming consoles, set-top boxes, smart televisions), embedded devices (e.g., devices incorporated into a vehicle, home, or place of business), and any other device described or referenced herein. As used herein, the term “computing device” includes computing systems and includes devices comprising multiple discrete physical components.
As shown in
Processors 1002 and 1004 further comprise at least one shared cache memory 1012 and 1014, respectively. The shared caches 1012 and 1014 can store data (e.g., instructions) utilized by one or more components of the processor, such as the processor cores 1008-1009 and 1010-1011. The shared caches 1012 and 1014 can be part of a memory hierarchy for the device 1000. For example, the shared cache 1012 can locally store data that is also stored in a memory 1016 to allow for faster access to the data by components of the processor 1002. In some embodiments, the shared caches 1012 and 1014 can comprise multiple cache layers, such as level 1 (L1), level 2 (L2), level 3 (L3), level 4 (L4), and/or other caches or cache layers, such as a last level cache (LLC).
Although the device 1000 is shown with two processors, the device 1000 can comprise any number of processors. Further, a processor can comprise any number of processor cores. A processor can take various forms such as a central processing unit, a controller, a graphics processor, an accelerator (such as a graphics accelerator, digital signal processor (DSP), or AI accelerator) or a field programmable gate array (FPGA). A processor in a device can be the same as or different from other processors in the device. In some embodiments, the device 1000 can comprise one or more processors that are heterogeneous or asymmetric to a first processor, accelerator, FPGA, or any other processor. There can be a variety of differences between the processing elements in a system in terms of a spectrum of metrics of merit including architectural, microarchitectural, thermal, power consumption characteristics and the like. These differences can effectively manifest themselves as asymmetry and heterogeneity amongst the processors in a system. In some embodiments, the processors 1002 and 1004 reside in the same die package.
Processors 1002 and 1004 further comprise memory controller logic (MC) 1020 and 1022. As shown in
Processors 1002 and 1004 are coupled to an Input/Output (I/O) subsystem 1030 via P-P interconnections 1032 and 1034. The point-to-point interconnection 1032 connects a point-to-point interface 1036 of the processor 1002 with a point-to-point interface 1038 of the I/O subsystem 1030, and the point-to-point interconnection 1034 connects a point-to-point interface 1040 of the processor 1004 with a point-to-point interface 1042 of the I/O subsystem 1030. Input/Output subsystem 1030 further includes an interface 1050 to couple I/O subsystem 1030 to a graphics engine 1052, which can be a high-performance graphics engine. The I/O subsystem 1030 and the graphics engine 1052 are coupled via a bus 1054. Alternatively, the bus 1054 could be a point-to-point interconnection.
Input/Output subsystem 1030 is further coupled to a first bus 1060 via an interface 1062. The first bus 1060 can be a Peripheral Component Interconnect (PCI) bus, a PCI Express bus, another third generation I/O interconnection bus or any other type of bus.
Various I/O devices 1064 can be coupled to the first bus 1060. A bus bridge 1070 can couple the first bus 1060 to a second bus 1080. In some embodiments, the second bus 1080 can be a low pin count (LPC) bus. Various devices can be coupled to the second bus 1080 including, for example, a keyboard/mouse 1082, audio I/O devices 1088, and a storage device 1090, such as a hard disk drive, solid-state drive, or other storage device for storing computer-executable instructions (code) 1092. The code 1092 can comprise computer-executable instructions for performing technologies described herein. Additional components that can be coupled to the second bus 1080 include communication device(s) 1084, which can provide for communication between the device 1000 and one or more wired or wireless networks 1086 (e.g., Wi-Fi, cellular, or satellite networks) via one or more wired or wireless communication links (e.g., wire, cable, Ethernet connection, radio-frequency (RF) channel, infrared channel, Wi-Fi channel) using one or more communication standards (e.g., the IEEE 802.11 standard and its supplements).
The device 1000 can comprise removable memory such as flash memory cards (e.g., SD (Secure Digital) cards), memory sticks, and Subscriber Identity Module (SIM) cards. The memory in device 1000 (including caches 1012 and 1014, memories 1016 and 1018, and storage device 1090) can store data and/or computer-executable instructions for executing an operating system 1094 and application programs 1096. Example data includes web pages, text messages, images, sound files, video data, stereoscopic images or videos, or other data to be sent to and/or received from one or more network servers or other devices by the device 1000 via one or more wired or wireless networks, or for use by the device 1000. The device 1000 can also have access to external memory (not shown) such as external hard drives or cloud-based storage.
The operating system 1094 can control the allocation and usage of the components illustrated in
The device 1000 can support various input devices, such as a touchscreen, microphone, monoscopic camera, stereoscopic camera, trackball, touchpad, trackpad, mouse, keyboard, proximity sensor, light sensor, electrocardiogram (ECG) sensor, PPG (photoplethysmogram) sensor, galvanic skin response sensor, and one or more output devices, such as one or more speakers or displays. Other possible input and output devices include piezoelectric and other haptic I/O devices. Any of the input or output devices can be internal to, external to or removably attachable with the device 1000. External input and output devices can communicate with the device 1000 via wired or wireless connections.
In addition, the computing device 1000 can provide one or more natural user interfaces (NUIs). For example, the operating system 1094 or applications 1096 can comprise speech recognition logic as part of a voice user interface that allows a user to operate the device 1000 via voice commands. Further, the device 1000 can comprise input devices and logic that allow a user to interact with the device 1000 via body, hand, or face gestures. For example, a user's hand gestures can be detected and interpreted to provide input to a gaming application or virtual reality application.
The device 1000 can further comprise one or more communication components 1084. The components 1084 can comprise wireless communication components coupled to one or more antennas to support communication between the system 1000 and external devices. The wireless communication components can support various wireless communication protocols and technologies such as Near Field Communication (NFC), IEEE 802.11 (Wi-Fi) variants, WiMax, Bluetooth, Zigbee, 4G Long Term Evolution (LTE), Code Division Multiple Access (CDMA), Universal Mobile Telecommunication System (UMTS), and Global System for Mobile Communications (GSM). In addition, the wireless modems can support communication with one or more cellular networks for data and voice communications within a single cellular network, between cellular networks, or between the mobile computing device and a public switched telephone network (PSTN).
The device 1000 can further include at least one input/output port (which can be, for example, a USB, IEEE 1394 (FireWire), Ethernet, and/or RS-232 port) comprising physical connectors, a power supply (such as a rechargeable battery), a satellite navigation system receiver (such as a GPS receiver), a gyroscope, an accelerometer, a proximity sensor, and a compass. A GPS receiver can be coupled to a GPS antenna. The device 1000 can further include one or more additional antennas coupled to one or more additional receivers, transmitters, and/or transceivers to enable additional functions.
In wearable embodiments, the device 1000 can comprise attachment mechanisms such as straps, clasps, or frames that allow the device to be attached to a body. In some embodiments, the device 1000 comprises a propulsion system, such as a motor to drive one or more propellers, fans, or wheels.
It is to be understood that
The processor core comprises front-end logic 1120 that receives instructions from the memory 1110. An instruction can be processed by one or more decoders 1130. The decoder 1130 can generate as its output a micro operation, such as a fixed-width micro operation in a predefined format, or generate other instructions, microinstructions, or control signals that reflect the original code instruction. The front-end logic 1120 further comprises register renaming logic 1135 and scheduling logic 1140, which generally allocate resources and queue operations corresponding to an instruction for execution.
The processor core 1100 further comprises execution logic 1150, which comprises one or more execution units (EUs) 1165-1 through 1165-N. Some processor core embodiments can include a number of execution units dedicated to specific functions or sets of functions. Other embodiments can include only one execution unit or one execution unit that can perform a particular function. The execution logic 1150 performs the operations specified by code instructions. After completion of execution of the operations specified by the code instructions, back-end logic 1170 retires instructions using retirement logic 1175. In some embodiments, the processor core 1100 allows out-of-order execution but requires in-order retirement of instructions. The retirement logic 1175 can take a variety of forms as known to those of skill in the art (e.g., re-order buffers or the like).
The processor core 1100 is transformed during execution of instructions, at least in terms of the output generated by the decoder 1130, hardware registers and tables utilized by the register renaming logic 1135, and any registers (not shown) modified by the execution logic 1150. Although not illustrated in
As used in any embodiment herein, the term “module” refers to logic that may be implemented in a hardware component or device, software or firmware running on a processor, or a combination thereof, to perform one or more operations consistent with the present disclosure. Software may be embodied as a software package, code, instructions, instruction sets and/or data recorded on non-transitory computer readable storage mediums. Firmware may be embodied as code, instructions or instruction sets and/or data that are hard-coded (e.g., nonvolatile) in memory devices. As used in any embodiment herein, the term “circuitry” can comprise, for example, singly or in any combination, hardwired circuitry, programmable circuitry such as computer processors comprising one or more individual instruction processing cores, state machine circuitry, and/or firmware that stores instructions executed by programmable circuitry. Modules described herein may, collectively or individually, be embodied as circuitry that forms a part of one or more devices. Thus, any of the modules can be implemented as circuitry, such as stereoscopic content generation circuitry, revised stereoscopic generation circuitry, etc. A computer device referred to as being programmed to perform a method can be programmed to perform the method via software, hardware, firmware or combinations thereof.
Any of the disclosed methods can be implemented as computer-executable instructions or a computer program product. Such instructions can cause a computer or one or more processors that can execute computer-executable instructions to perform any of the disclosed methods. Generally, as used herein, the term “computer” refers to any computing device or system described or mentioned herein, or any other computing device. Thus, the term “computer-executable instruction” refers to instructions that can be executed by any computing device described or mentioned herein, or any other computing device.
The computer-executable instructions or computer program products as well as any data created and used during implementation of the disclosed technologies can be stored on one or more tangible or non-transitory computer-readable storage media, such as optical media discs (e.g., DVDs, CDs), volatile memory components (e.g., DRAM, SRAM), or non-volatile memory components (e.g., flash memory, solid state drives, chalcogenide-based phase-change non-volatile memories). Computer-readable storage media can be contained in computer-readable storage devices such as solid-state drives, USB flash drives, and memory modules. Alternatively, the computer-executable instructions may be performed by specific hardware components that contain hardwired logic for performing all or a portion of disclosed methods, or by any combination of computer-readable storage media and hardware components.
The computer-executable instructions can be part of, for example, a dedicated software application or a software application that is accessed via a web browser or other software application (such as a remote computing application). Such software can be read and executed by, for example, a single computing device or in a network environment using one or more networked computers. Further, it is to be understood that the disclosed technology is not limited to any specific computer language or program. For instance, the disclosed technologies can be implemented by software written in C++, Java, Perl, JavaScript, Adobe Flash, or any other suitable programming language. Likewise, the disclosed technologies are not limited to any particular computer or type of hardware.
Furthermore, any of the software-based embodiments (comprising, for example, computer-executable instructions for causing a computer to perform any of the disclosed methods) can be uploaded, downloaded or remotely accessed through a suitable communication technology. Such suitable communication technologies include, for example, the Internet, the World Wide Web, an intranet, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication technologies.
As used in this application and in the claims, a list of items joined by the term “and/or” can mean any combination of the listed items. For example, the phrase “A, B and/or C” can mean A; B; C; A and B; A and C; B and C; or A, B and C. As used in this application and in the claims, a list of items joined by the term “at least one of” can mean any combination of the listed terms. For example, the phrase “at least one of A, B or C” can mean A; B; C; A and B; A and C; B and C; or A, B and C.
The disclosed methods, apparatuses and systems are not to be construed as limiting in any way. Instead, the present disclosure is directed toward all novel and nonobvious features and aspects of the various disclosed embodiments, alone and in various combinations and subcombinations with one another. The disclosed methods, apparatuses, and systems are not limited to any specific aspect or feature or combination thereof, nor do the disclosed embodiments require that any one or more specific advantages be present or problems be solved.
Theories of operation, scientific principles or other theoretical descriptions presented herein in reference to the apparatuses or methods of this disclosure have been provided for the purposes of better understanding and are not intended to be limiting in scope. The apparatuses and methods in the appended claims are not limited to those apparatuses and methods that function in the manner described by such theories of operation.
Although the operations of some of the disclosed methods are described in a particular, sequential order for convenient presentation, it is to be understood that this manner of description encompasses rearrangement, unless a particular ordering is required by specific language set forth herein. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, the attached figures may not show the various ways in which the disclosed methods can be used in conjunction with other methods.
The following examples pertain to additional embodiments of technologies disclosed herein.
Example 1 is an apparatus comprising: a left image sensor; a left lens; a right image sensor; a right lens; one or more processors; and one or more computer-readable media having stored thereon instructions that when executed cause the one or more processors to: generate, using the left image sensor and the right image sensor, a stereoscopic image comprising a left image and a right image, the left image including a right lens artifact, the right image including a left lens artifact; and generate a revised stereoscopic image by replacing the right lens artifact with left image content that blends in with the left image and replacing the left lens artifact with right image content that blends in with the right image.
Example 2 is the apparatus of Example 1, wherein the replacing the right lens artifact comprises generating the left image content via inpainting and the replacing the left lens artifact comprises generating the right image content via inpainting.
Example 3 is the apparatus of Example 2, wherein a right lens mask defines the portion of the left image to be replaced with the left image content and a left lens mask defines the portion of the right image to be replaced with the right image content.
Example 4 is the apparatus of Example 1, wherein the left image content is based on a right image portion of the right image corresponding to a location of the right lens artifact in the left image and the right image content is based on a left image portion of the left image corresponding to a location of the left lens artifact in the right image.
Example 5 is the apparatus of Example 4, wherein: the left image content is based on a transformation of the right image portion from an image space of the right image to an image space of the left image; and the right image content is based on a transformation of the left image portion from the image space of the left image to the image space of the right image.
Example 6 is the apparatus of Example 4, wherein: replacing the right lens artifact with the left image content comprises providing the right image portion to an inpainting model, wherein the left image content is based on the resulting output of the inpainting model; and replacing the left lens artifact with the right image content comprises providing the left image portion to the inpainting model, wherein the right image content is based on the resulting output of the inpainting model.
Example 7 is the apparatus of Example 1, wherein the apparatus further comprises a display, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to cause the revised stereoscopic image to be shown on the display.
Example 8 is the apparatus of Example 7, wherein the revised stereoscopic image shown on the display is an image in a revised stereoscopic video and the revised stereoscopic video is shown on the display in real time.
Example 9 is the apparatus of Example 1, wherein the left image comprises one or more additional feature artifacts, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to replace the one or more additional feature artifacts with additional left image content that blends in with the left image.
Example 10 is the apparatus of Example 1, wherein the right image comprises one or more additional feature artifacts, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to replace the one or more additional feature artifacts with additional right image content that blends in with the right image.
Example 11 is the apparatus of Example 1, wherein the apparatus is a head-mounted device or an optical head-mounted display device.
Example 12 is the apparatus of Example 1, wherein the apparatus further comprises an antenna, a battery, and a network interface.
Example 13 is the apparatus of Example 1, wherein the apparatus is a stereoscopic camera.
Example 14 is a computing device comprising: one or more processors; a display having a display field of view (FOV); and one or more computer-readable media having stored thereon instructions that when executed cause the one or more processors to: determine an orientation of the computing device; determine a portion of stereoscopic content to show on the display based on the orientation of the computing device, the stereoscopic content having a stereoscopic content FOV greater than the display FOV; if the portion of the stereoscopic content to show on the display includes at least a portion of a lens artifact, generate revised stereoscopic content by replacing the lens artifact with content that blends in with the stereoscopic content and show the revised stereoscopic content instead of the stereoscopic content on the display; and if the portion of the stereoscopic content to show on the display does not include at least a portion of the lens artifact, show the stereoscopic content on the display.
Example 15 is a system comprising: one or more processors; one or more computer-readable media having stored thereon instructions that when executed cause the one or more processors to: generate a revised stereoscopic image from a stereoscopic image, the stereoscopic image comprising a left image and a right image, the left image including a right lens artifact, the right image including a left lens artifact, the generating comprising replacing the right lens artifact with left image content that blends in with the left image and replacing the left lens artifact with right image content that blends in with the right image; and store the revised stereoscopic image.
Example 16 is the system of Example 15, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to: receive the stereoscopic image from a computing device; and send the revised stereoscopic image to the computing device.
Example 17 is the system of Example 15, wherein the replacing the right lens artifact comprises generating the left image content via inpainting and the replacing the left lens artifact comprises generating the right image content via inpainting.
Example 18 is the system of Example 17, wherein a right lens mask defines the portion of the left image to be replaced with the left image content and a left lens mask defines the portion of the right image to be replaced with the right image content.
Example 19 is the system of Example 18, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to receive right lens mask information describing the right lens mask and left lens mask information defining the left lens mask.
Example 20 is the system of Example 15, wherein the left image content is based on a right image portion of the right image corresponding to a location of the right lens artifact in the left image and the right image content is based on a left image portion of the left image corresponding to a location of the left lens artifact in the right image.
Example 21 is the system of Example 20, wherein: the left image content is based on a transformation of the right image portion from an image space of the right image to an image space of the left image; and the right image content is based on a transformation of the left image portion from the image space of the left image to the image space of the right image.
Example 22 is the system of Example 20, wherein: replacing the right lens artifact with the left image content comprises providing the right image portion to an inpainting model, wherein the left image content is based on the resulting output of the inpainting model; and replacing the left lens artifact with the right image content comprises providing the left image portion to the inpainting model, wherein the right image content is based on the resulting output of the inpainting model.
Example 23 is the system of Example 15, wherein the left image comprises one or more additional feature artifacts, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to replace the one or more additional feature artifacts with additional left image content that blends in with the left image.
Example 24 is the system of Example 15, wherein the right image comprises one or more additional feature artifacts, the one or more computer-readable media having stored thereon instructions that when executed further cause the one or more processors to replace the one or more additional feature artifacts with additional right image content that blends in with the right image.
Example 25 is a stereoscopic content generation method comprising: generating, using a left image sensor and a right image sensor, a stereoscopic image comprising a left image and a right image, the left image including a right lens artifact, the right image including a left lens artifact; and generating a revised stereoscopic image by replacing the right lens artifact with left image content that blends in with the left image and replacing the left lens artifact with right image content that blends in with the right image.
Example 26 is the method of Example 25, wherein the replacing the right lens artifact comprises generating the left image content via inpainting and the replacing the left lens artifact comprises generating the right image content via inpainting.
Example 27 is the method of Example 25, wherein a right lens mask defines the portion of the left image to be replaced with the left image content and a left lens mask defines the portion of the right image to be replaced with the right image content.
Example 28 is the method of Example 25, wherein the left image content is based on a right image portion of the right image corresponding to a location of the right lens artifact in the left image and the right image content is based on a left image portion of the left image corresponding to a location of the left lens artifact in the right image.
Example 29 is the method of Example 28, wherein: the left image content is based on a transformation of the right image portion from an image space of the right image to an image space of the left image; and the right image content is based on a transformation of the left image portion from the image space of the left image to the image space of the right image.
Example 30 is the method of Example 28, wherein: replacing the right lens artifact with the left image content comprises providing the right image portion to an inpainting model, wherein the left image content is based on the resulting output of the inpainting model; and replacing the left lens artifact with the right image content comprises providing the left image portion to the inpainting model, wherein the right image content is based on the resulting output of the inpainting model.
Example 31 is the method of Example 25, further comprising causing the revised stereoscopic image to be shown on a display.
Example 32 is the method of Example 31, wherein the revised stereoscopic image shown on the display is an image in a revised stereoscopic video and the revised stereoscopic video is shown on the display in real time.
Example 33 is the method of Example 25, wherein the left image comprises one or more additional feature artifacts, the method further comprising replacing the one or more additional feature artifacts with additional left image content that blends in with the left image.
Example 34 is the method of Example 25, wherein the right image comprises one or more additional feature artifacts, the method further comprising replacing the one or more additional feature artifacts with additional right image content that blends in with the right image.
Example 35 is a stereoscopic content display method comprising: determining an orientation of a device comprising a display having a display field of view (FOV); determining a portion of stereoscopic content to show on the display based on the orientation of the device, the stereoscopic content having a stereoscopic content FOV greater than the display FOV; if the portion of the stereoscopic content to show on the display includes at least a portion of a lens artifact, generating revised stereoscopic content by replacing the lens artifact with content that blends in with the stereoscopic content and showing the revised stereoscopic content instead of the stereoscopic content on the display; and if the portion of the stereoscopic content to show on the display does not include at least a portion of the lens artifact, showing the stereoscopic content on the display.
Example 36 is a stereoscopic content generation method comprising: generating a revised stereoscopic image from a stereoscopic image captured using a left image sensor and a right image sensor, the stereoscopic image comprising a left image and a right image, the left image including a right lens artifact, the right image including a left lens artifact, the generating comprising replacing the right lens artifact with left image content that blends in with the left image and replacing the left lens artifact with right image content that blends in with the right image; and storing the revised stereoscopic image.
Example 37 is the method of Example 36, further comprising: receiving the stereoscopic image from a computing device; and sending the revised stereoscopic image to the computing device.
Example 38 is the method of Example 36, wherein the replacing the right lens artifact comprises generating the left image content via inpainting and the replacing the left lens artifact comprises generating the right image content via inpainting.
Example 39 is the method of Example 36, wherein a right lens mask defines the portion of the left image to be replaced with the left image content and a left lens mask defines the portion of the right image to be replaced with the right image content.
Example 40 is the method of Example 39, further comprising receiving right lens mask information describing the right lens mask and left lens mask information defining the left lens mask.
Example 41 is the method of Example 39, wherein the left image content is based on a right image portion of the right image corresponding to a location of the right lens artifact in the left image and the right image content is based on a left image portion of the left image corresponding to a location of the left lens artifact in the right image.
Example 42 is the method of Example 41, wherein: the left image content is based on a transformation of the right image portion from an image space of the right image to an image space of the left image; and the right image content is based on a transformation of the left image portion from the image space of the left image to the image space of the right image.
Example 43 is the method of Example 41, wherein: replacing the right lens artifact with the left image content comprises providing the right image portion to an inpainting model, wherein the left image content is based on the resulting output of the inpainting model; and replacing the left lens artifact with the right image content comprises providing the left image portion to the inpainting model, wherein the right image content is based on the resulting output of the inpainting model.
Example 44 is the method of Example 36, wherein the left image comprises one or more additional feature artifacts, the method further comprising replacing the one or more additional feature artifacts with additional left image content that blends in with the left image.
Example 45 is the method of Example 36, wherein the right image comprises one or more additional feature artifacts, the method further comprising replacing the one or more additional feature artifacts with additional right image content that blends in with the right image.
Example 46 is one or more computer-readable storage media having instructions stored thereon that when executed cause one or more processors to perform the method of any one of Examples 25-45.
Example 47 is an apparatus comprising means to perform the method of any one of Examples 25-45.