This application relates to generating and displaying a three-dimensional (3D) medical image. In some cases, a handheld device generates and displays the 3D medical image.
Otoscopes are devices that illuminate and image the interior of a patient's ear. In many cases, otoscopes are handheld devices operated by clinicians, such as physicians or nurses. In general, an otoscope can be equipped with a disposable adapter configured to hold the otoscope to the patient's ear during use. Otoscopes are important devices to diagnose patients in a variety of clinical settings, such as emergency rooms, pediatric offices, and general practice clinics.
Most otoscopes are monocular devices with a single screen. Using the single screen, otoscopes can provide a two-dimensional (2D) image of the ear canal and its contents. For instance, an otoscope can be used to visualize the eardrum. As an alternative to conventional otoscopes, clinicians can visualize the ear canal using a binocular microscope in conjunction with a speculum disposed in the patient's ear. The binocular microscope can be used to visualize the ear canal in three dimensions and potentially with enhanced illumination, but has a number of drawbacks that make it less convenient than conventional otoscopes. For example, the binocular microscope must be used while the patient is in a supine position with their head lifted, which can be inconvenient for both the patient and the clinician. It can be difficult, for instance, to position pediatric patients for a binocular microscope. Furthermore, the binocular microscope is generally heavier and more expensive than a conventional otoscope.
Due to the inconvenience of binocular microscopes, monocular otoscopes are the dominant tool for visualization of the inner ear. However, studies have shown that reliance on monocular otoscopes to diagnose ear diseases results in more than a 50% chance of misdiagnosis, as compared to binocular microscopic otoscopy. Thus, there is a need for an otoscope that provides a 3D view of the inner ear, but that is lightweight, inexpensive, and easy to use.
Various implementations of the present disclosure relate to a handheld instrument that can obtain and display a 3D image of a subject. The instrument, for example, can be a dermascope, a fundascope, an otoscope, a nasoscope, an anoscope, a proctoscope, or an endoscope.
According to various cases, the instrument images a 3D surface of a subject. In some implementations, the instrument captures a 2D colorized image of the surface. For example, the instrument includes an imager including a 2D array of light sensors configured to detect the color and intensity of light reflected from the surface. Further, the instrument may capture a depth image of the surface of the subject. The instrument, for instance, utilizes optical coherence tomography (OCT), ultrasound, or some other depth imaging technique to generate the depth image. In various cases, the instrument generates a 3D image of the surface by combining the 2D image of the surface and the depth image of the surface. In some examples, the instrument generates a 3D mesh of the surface using the depth image, and then colorizes and/or texturizes the 3D mesh using the 2D image.
Further, the instrument displays the 3D image. In some cases, the instrument includes a lenticular display that includes a single 2D screen. The lenticular display can output the 3D image using an array of lenses that direct light output by the screen into different directions towards the two eyes of a user. As a result, the user can perceive the displayed image in three dimensions.
The following figures, which form a part of this disclosure, are illustrative of described technology and are not meant to limit the scope of the claims in any manner.
Various implementations of the present disclosure will be described in detail with reference to the drawings, wherein like reference numerals represent like parts and assemblies throughout the several views. Additionally, any examples set forth in this specification are not intended to be limiting and merely set forth some of the many possible implementations.
The environment 100 includes a scope 104. In various implementations, the scope 104 is an imaging device configured to image a surface 106 of the subject 102. In some examples, the surface 106 is an external surface of the subject. For instance, the surface 106 may be an outer surface of the skin of the subject 102. In some cases, the surface 106 is an interior surface of an orifice of the subject 102. For example, the surface 106 is an interior surface of a nostril, a mouth, an ear canal, an anus, or a vagina of the subject. According to some cases, the scope 104 includes an adapter 108 configured to be inserted into the orifice of the subject 102. In various implementations, the surface 106 is an interior surface of the subject 102, such as an interior surface of an abdominal cavity, a blood vessel, a heart, or a gastrointestinal tract. According to various implementations, the surface 106 can be planar or any curved surface that can be modeled as a tessellated mesh. The scope 104, for instance, is a dermascope, a fundascope, an otoscope, a nasoscope, an anoscope, a proctoscope, or an endoscope.
In various implementations, the scope 104 is portable. For example, the scope 104 may be equipped with one or more batteries that supply power to the scope 104 during operation. In some cases, the one or more batteries are rechargeable. For instance, the one or more batteries are configured to be recharged wirelessly in the presence of an electromagnetic field, or recharged via a wire configured to connect the scope 104 to mains power (e.g., via a port in the scope 104).
The elements of the scope 104, according to some examples, are enclosed in a housing. In some examples, the housing is waterproof. For instance, the housing includes a plastic shell that at least partially encloses the elements of the scope 104. In various cases, the scope 104 is configured to be hand-held. For instance, the housing of the scope 104 includes a handle 110 configured to be grasped by a user 112 during operation. The scope 104, for example, is relatively lightweight and is configured to be held by a single hand of the user 112. For instance, the scope 104 may weigh less than 2 kilograms (kg). The user 112, according to various cases, is a clinician. For instance, the user 112 is a physician, a nurse, a medical student, a medical technician, a resident, or a fellow. For instance, the user 112 may be examining the subject 102 with the scope 104 during a medical appointment. In various cases, the scope 104 is utilized in a primary care context. For instance, in some cases, the user 112 may be a family medicine practitioner, a nurse practitioner, a physician's assistant, or an internal medicine practitioner rather than a specialist.
In some cases, the scope 104 captures a two-dimensional (2D) image of the surface 106 of the subject. For example, the scope 104 emits light toward the surface 106 and detects the light reflected from the surface. According to some cases, the scope 104 includes at least one image sensor configured to generate the 2D image of the surface 106. In various implementations, the 2D image includes a 2D array of image pixels. As used herein, the term “image pixel,” and its equivalents, can refer to a unit of data that represents an area of a 2D image. In some implementations, the 2D image is referred to as “2D image data.” An example image sensor includes a 2D array of light sensors, wherein each individual light sensor is configured to generate an individual image pixel in the 2D image of the surface 106. For instance, the image sensor includes a single photon avalanche diode (SPAD), a charge-coupled device (CCD), an active-pixel sensor (APS), or any combination thereof. The image pixels of the 2D image may respectively correspond to the intensity and/or frequency of light reflected from different portions of the surface 106 and detected by the scope 104. In some examples, the 2D image is a grayscale image wherein each image pixel corresponds to a value between a minimum value and a maximum value, the value corresponding to the intensity of light received by the corresponding light sensor. In some instances, the 2D image is a color image wherein each image pixel corresponds to a set of values respectively corresponding to the intensities of a set of base colors of the light received by the corresponding light sensor. For example, the 2D image is a red-green-blue (RGB) image, wherein each image pixel is a set of three numbers corresponding to the intensity of red light, green light, and blue light components in the light received by the corresponding light sensor.
However, the 2D image may omit clinically significant details of the surface 106. For example, the surface 106 may include a bump or other three-dimensional (3D) structure that is indicative of a particular medical diagnosis. For instance, if the subject 102 has a foreign body in their ear canal, but the foreign body is the same color and apparent texture as the ear canal, it may be difficult for the user 112 to identify the foreign body using the 2D image alone. Thus, it may be advantageous for the scope 104 to capture 3D information about the surface 106 in addition to the 2D image.
In various implementations of the present disclosure, the scope 104 generates a depth image representing the surface 106. For example, the scope 104 includes an ultrasound transducer configured to emit ultrasound toward the surface 106, to detect a reflection of the ultrasound from the surface 106, and to generate a depth image representing the distance between the surface 106 and the scope 104 based on the detected ultrasound. In some implementations, the depth image is also referred to as “depth image data.”
In some cases, the scope 104 generates the depth image using optical coherence tomography (OCT). For instance, the scope 104 may emit low-coherence light toward the surface 106 and detect an interference pattern including a reflection of the low-coherence light from the surface 106 and a reference beam of the low-coherence light. As used herein, the term “low-coherence light,” and its equivalents, may refer to a group of photons that are transmitted together and which have at least two frequencies. For instance, the scope 104 may include a halogen bulb, a light-emitting diode (LED), or a tungsten filament configured to generate the low-coherence light. In various cases, the scope 104 omits a laser that would otherwise generate coherent light. The scope 104 may generate the depth image based on the interference pattern. In various implementations, a single imager is configured to detect the 2D image and the interference pattern, which can reduce the size, weight, and complexity of the scope 104. For example, the same image sensor used to generate the 2D image may also be used to generate the depth image using OCT.
According to various examples, the scope 104 generates a 3D image of the surface 106 based on the 2D image and the depth image. For example, the scope 104 may texturize and/or colorize the depth image using the 2D image. According to some instances, the 2D image and the depth image have different resolutions. In various implementations, the scope 104 interpolates values of the 2D image onto locations of the surface 106 indicated by the depth image. The 3D image, in various implementations, includes voxels defined in three dimensions. As used herein, the term “voxel,” and its equivalents, may refer to a unit of data representing the intensity and/or frequency of light representing a unit of volume in a 3D image. In some cases, voxels with nonzero values in the 3D image correspond to the location distribution of the surface 106. These nonzero values, in various implementations, correspond to the color and/or intensity of the light reflected from the surface 106 along the location distribution. In some implementations, voxels representing volumes in the 3D image that do not align with the surface 106 are assigned zero values. According to some examples, the 3D image can be referred to as “3D image data.”
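As one illustration of the colorization described above, the following sketch (written in Python with NumPy, which this disclosure does not prescribe; the function name, grid dimensions, and nearest-neighbor sampling are illustrative assumptions) combines a depth image and a 2D color image into a sparse colorized voxel grid, where only voxels that coincide with the imaged surface receive nonzero values.

```python
import numpy as np

def colorize_depth_to_voxels(depth, color, grid_shape=(64, 64, 64), max_depth=30.0):
    """Combine a depth image and a 2D color image into a sparse colorized
    voxel grid. Voxels that coincide with the imaged surface receive colors
    sampled from the 2D image; all other voxels remain zero."""
    gx, gy, gz = grid_shape
    hd, wd = depth.shape
    hc, wc, _ = color.shape

    # Map each depth pixel to a voxel column (x, y) and a depth slice (z).
    ys, xs = np.meshgrid(np.arange(hd), np.arange(wd), indexing="ij")
    vx = (xs * gx // wd).astype(int)
    vy = (ys * gy // hd).astype(int)
    vz = np.clip(depth / max_depth * (gz - 1), 0, gz - 1).astype(int)

    # Sample the (possibly higher-resolution) color image at the location
    # corresponding to each depth pixel (nearest-neighbor interpolation).
    cy = (ys * hc // hd).astype(int)
    cx = (xs * wc // wd).astype(int)
    sampled = color[cy, cx]                      # shape (hd, wd, 3)

    voxels = np.zeros((gx, gy, gz, 3), dtype=color.dtype)
    voxels[vx, vy, vz] = sampled                 # nonzero only along the surface
    return voxels

# Example: a 32x32 depth map combined with a higher-resolution 128x128 RGB image.
depth = np.random.uniform(5.0, 25.0, size=(32, 32))
color = np.random.randint(0, 256, size=(128, 128, 3), dtype=np.uint8)
voxel_grid = colorize_depth_to_voxels(depth, color)   # shape (64, 64, 64, 3)
```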
In some cases, the scope 104 is equipped with a single screen to display the 2D image, the depth image, or the 3D image. However, the single screen may include display pixels that are arranged two-dimensionally. In cases where the scope 104 includes two screens, the scope 104 can portray the 3D image to the user 112 by generating two derived 2D images that are projected from the 3D image at two different locations. The locations, for instance, can correspond to the perspective of two eyes 114 of the user 112. If the two screens respectively display the two derived 2D images to the respective eyes 114 of the user 112, the user 112 may perceive the 3D image due to the parallax induced from the two derived 2D images.
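The derivation of the two eye-specific 2D images can be sketched as a pair of perspective projections of the colored 3D surface points from two horizontally offset positions. The pinhole camera model, focal length, and 63 mm interpupillary distance in the Python/NumPy sketch below are illustrative assumptions rather than values specified by this disclosure.

```python
import numpy as np

def project_view(points, colors, eye_x, focal=200.0, size=(240, 240), z_offset=50.0):
    """Derive one 2D view of a colored 3D surface by pinhole projection from
    a horizontally shifted eye position. `points` is an (N, 3) array of
    surface coordinates and `colors` an (N, 3) array of RGB values."""
    h, w = size
    view = np.zeros((h, w, 3), dtype=colors.dtype)
    # Shift the scene so the projection center sits at the eye position.
    x = points[:, 0] - eye_x
    y = points[:, 1]
    z = points[:, 2] + z_offset
    u = (focal * x / z + w / 2).astype(int)      # horizontal pixel coordinate
    v = (focal * y / z + h / 2).astype(int)      # vertical pixel coordinate
    keep = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    view[v[keep], u[keep]] = colors[keep]
    return view

# Two derived views, offset by an assumed 63 mm interpupillary distance.
points = np.random.uniform(-10, 10, size=(5000, 3))
colors = np.random.randint(0, 256, size=(5000, 3), dtype=np.uint8)
right_image = project_view(points, colors, eye_x=+31.5)
left_image = project_view(points, colors, eye_x=-31.5)
```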
However, in some cases, it may be disadvantageous for the scope 104 to include two screens respectively displaying images to the eyes 114 of the user 112. For instance, the two screens can greatly increase the cost, size, weight, and energy consumption of the scope 104. Further, in some cases, the user 112 may seek to view the 3D image at a distance from the scope 104. For example, it may be cumbersome for the user 112 to bring her face up to the scope 104 in order to view the 3D image.
In various implementations of the present disclosure, the scope 104 displays the 3D image using a lenticular display 116. According to examples of the present disclosure, the lenticular display 116 includes a 2D array of display pixels 118 and an array of lenses 120. As used herein, the term “display pixels,” and their equivalents, can refer to light-emitting elements of a display screen. For example, the display pixels 118 may include light-emitting diodes (LEDs), organic LEDs (OLEDs), cold cathode fluorescent lamps, quantum dot (QD) display elements, thin-film transistors, or other types of light-emitting elements. The display pixels 118 are configured to emit light through at least some of the lenses 120.
The array of lenses 120 is disposed (e.g., overlaid) on the display pixels 118, such that the lenses 120 are disposed between the eye 114 of the user 112 and the display pixels 118. In various implementations, the lenses 120 are configured to transmit and/or refract light. The lenses 120 include a material transmissive to visible light. For example, the lenses 120 include glass, quartz, sapphire, or a transparent plastic. In some cases, the lenses 120 are spherical lenses. In various cases, the lenses 120 include Fresnel lenses.
As shown in
In various implementations, the user 112 can adjust the lenticular display 116 using an input device, such as a dial 126 on the side of the scope 104. For example, the dial 126 can be used to power on the scope 104, to adjust the brightness of the lenticular display 116, to adjust the brightness of the light source used to illuminate the surface 106, or the like.
The scope 104, in some examples, may display the 2D image, the depth image, the 3D image, or a combination thereof, on the lenticular display 116. In various cases, the scope 104, upon detecting an input signal via the dial 126, may cause the lenticular display 116 to switch between displaying the 2D image and the 3D image. For example, the user 112 may prefer to view the 2D image and operate the dial 126 accordingly.
In some implementations, the lenticular display 116 is associated with a predetermined viewing distance and viewing angles. For example, the lenses in the lenticular display 116 are configured to output the 3D image at a particular distance from the scope 104 at predetermined viewing angles, such that the user 112 can perceive the 3D image at the particular distance and in a position corresponding to the viewing angles. According to various examples, the scope 104 may be configured to detect whether the user 112 is disposed at the particular distance from the scope 104. For example, the scope 104 includes a distance sensor that is directed toward the eye 114 and that can detect a distance between the lenticular display 116 and the eye 114. Examples of the distance sensor include an ultrasound transducer, an infrared distance sensor, or a camera. For instance, the distance sensor is configured to generate a signal indicative of the distance between the lenticular display 116 and the eye 114 by detecting the time that an ultrasound pulse takes to travel from the scope 104 to the user 112 and back again; by analyzing a reflection of an infrared pattern from the user 112; or by analyzing an image of the face of the user 112 that depicts the eye 114. In various implementations, the scope 104 detects a direction at which the eye 114 is disposed relative to a surface of the lenticular display 116. For instance, the scope 104 includes a camera configured to capture an image of the eye 114, at least one processor within the scope 104 is configured to identify a pupil of the eye 114 depicted in the image, and the processor(s) detect the position of the eye 114 based on the identified depiction of the pupil.
In some examples, the scope 104 compares the detected distance to a predetermined viewing distance. In some cases, the scope 104 compares the position of the eye 114 to a predetermined viewing position range, as defined by the predetermined viewing angles. If the scope 104 determines that the detected distance is different than the predetermined viewing distance (e.g., outside of a predetermined range including the predetermined viewing distance), then the scope 104 may perform one or more remedial actions. Further, if the scope 104 determines that the position of the eye 114 is outside of the predetermined viewing position corresponding to the predetermined viewing angles, the scope 104 may perform the remedial action(s). In some examples, the one or more remedial actions include outputting an alert to the user 112 recommending that the user 112 adjust their position. In some cases, the alert is output as a visual alert on the lenticular display 116 and/or via a light integrated with the housing of the scope 104. In some implementations, the alert is output as an audible alert by a speaker within the scope 104. The alert, for instance, indicates a recommendation to increase or decrease the distance between the user 112 and the scope 104. In some implementations, the one or more remedial actions include dimming or deactivating the lenticular display 116 until the scope 104 detects that the distance between the lenticular display 116 and the eye 114 (or the position of the eye 114) has entered the predetermined range. In some cases, the one or more remedial actions include adjusting the lenses 120 to change the predetermined range. For instance, the scope 104 may be configured to alter the targeted viewing distance by tilting the lenses 120, moving the array of lenses 120 closer or farther from the display pixels 118, or the like. In some cases, the scope 104 includes one or more actuators configured to adjust the position of the lenses 120. In some examples, the scope 104 controls the display pixels 118 based on the detected distance (or viewing angle) between the lenticular display 116 and the eye 114.
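As a concrete illustration of the distance check and remedial actions described above, the following Python sketch compares a measured eye-to-display distance against a predetermined range and selects a response. The nominal 35 cm viewing distance, the tolerance, and the alert strings are illustrative assumptions, not values specified by this disclosure.

```python
# Assumed nominal viewing distance and tolerance (illustrative values only).
NOMINAL_VIEWING_DISTANCE_CM = 35.0
TOLERANCE_CM = 5.0

def remedial_action(measured_distance_cm):
    """Return a suggested remedial action given a measured eye-to-display
    distance, e.g., from an ultrasound or infrared distance sensor."""
    low = NOMINAL_VIEWING_DISTANCE_CM - TOLERANCE_CM
    high = NOMINAL_VIEWING_DISTANCE_CM + TOLERANCE_CM
    if measured_distance_cm < low:
        return "alert: move farther from the display"
    if measured_distance_cm > high:
        return "alert: move closer to the display"
    return "ok: within the predetermined viewing range"

print(remedial_action(28.0))   # -> "alert: move farther from the display"
print(remedial_action(36.0))   # -> "ok: within the predetermined viewing range"
```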
The scope 104 described with reference to
Although not specifically illustrated in
In some implementations, the scope 104 is configured to communicate with one or more external devices. For example, the scope 104 is configured to transmit data to an external computing device (e.g., a mobile phone, a tablet computer, a server, etc.) and/or to receive data from the external computing device. In some implementations, the external computing device performs at least some of the analysis and/or data generation functionality described herein.
The lenticular display 202 includes an array of display pixels 206 and an array of lenses 208. In various implementations, the display pixels 206 are arranged as a 2D array. For instance,
The lenses 208, in various cases, are also arranged as a 2D array. For example, the lenticular display 202 includes multiple columns of the display pixels 206 arranged in a z-direction. In various cases, the lenses 208 are disposed between the display pixels 206 and a viewer, such as between the display pixels 206 and the right eye 210 and left eye 212 of the viewer. The lenses 208 overlap the display pixels 206 in the x-direction. For instance, the lenses 208 are configured to refract light emitted by the display pixels 206. Although
Despite the display pixels 206 being arranged in a 2D display screen, the lenticular display 202 may output the 3D image 204 to the right eye 210 and the left eye 212 of the viewer based on the refraction of the lenses 208. In various cases, the lenses 208 redirect the light output by the display pixels 206 in multiple directions. Even though the display pixels 206 are configured to output light in the x-direction, the lenses 208 may steer the light in various xy-directions that intersect the right eye 210 and left eye 212. By steering the light in these different directions, the right eye 210 perceives a different 2D image than the left eye 212. That is, a right image 214 received by the right eye 210 is different than a left image 216 received by the left eye 212. The right image 214, for instance, is a projection of the 3D image 204 onto the position of the right eye 210. The left image 216, for instance, is a projection of the 3D image 204 onto the position of the left eye 212. As a result, the viewer may perceive the 3D image 204 three-dimensionally by viewing the lenticular display 202.
In various implementations, the right eye 210 and/or the left eye 212 may be separated from the lenticular display 202 by a viewing distance 218. For instance, the viewing distance 218 may be a distance between the lenses 208 and a lens of the right eye 210 or the left eye 212. In various implementations, the light output by the display pixels 206 and the lenses 208 are optimized for the viewing distance 218.
In various implementations, a light source 310 emits low coherence light 312. In some implementations, the light source 310 includes one or more LEDs, a xenon-based light source, a halogen bulb, a tungsten filament, or any combination thereof configured to generate the low coherence light 312.
An incident mirror 314, in various cases, reflects the low coherence light 312 to a beam splitter 316. The incident mirror 314, for instance, is a digital micro mirror device (DMMD) or a digital light processing (DLP) mirror. In various cases, the beam splitter 316 is configured to divide the low coherence light 312 into a reference beam 318 and an interference beam 320. The beam splitter 316 directs the reference beam 318 to a deformable reference mirror 322. The deformable reference mirror 322 reflects the reference beam 318 to the imager 304. In contrast, the beam splitter 316 directs the interference beam 320 to the subject 302. A reflection of the interference beam 320 from the subject 302 is received by the beam splitter 316, which may direct the reflection to the imager 304. Collectively, the reflection of the reference beam 318 from the deformable reference mirror 322 and the reflection of the interference beam 320 generate an interference pattern 324 that is detected by the imager 304.
In various implementations, the imager 304 includes an array of photosensors configured to detect the interference pattern 324. In various implementations, the imager 304 is configured to generate the depth image 308 based on the interference pattern 324.
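One way to sketch this conversion, under the assumption of a spectral-domain OCT arrangement in which each lateral pixel records an interference spectrum over evenly spaced wavenumbers (the disclosure does not fix the reconstruction method), is to Fourier-transform the spectrum at each lateral position and take the location of the magnitude peak as the surface depth:

```python
import numpy as np

def depth_from_interference(spectra, depth_per_bin=1.0):
    """Recover one depth value per lateral position from detected
    interference data. `spectra` has shape (H, W, K): one interference
    spectrum of K samples, taken over evenly spaced wavenumbers, for each
    lateral pixel of the imager."""
    # Remove the DC term, then Fourier-transform along the spectral axis;
    # the magnitude peak corresponds to the optical path difference, i.e.,
    # the depth of the reflecting surface at that lateral position.
    ac = spectra - spectra.mean(axis=-1, keepdims=True)
    profile = np.abs(np.fft.rfft(ac, axis=-1))
    profile[..., 0] = 0.0                     # suppress any residual DC peak
    surface_bin = profile.argmax(axis=-1)     # shape (H, W)
    return surface_bin * depth_per_bin        # depth in units of depth_per_bin

spectra = np.random.rand(32, 32, 256)             # stand-in interference data
depth_image = depth_from_interference(spectra)    # shape (32, 32)
```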
Further, the imager 304 may detect at least a portion of the reflection of the interference beam 320 from the subject 302. Based on the resulting image of this reflection, the imager 304 may generate a 2D image 306 of the subject 302. For example, based on the intensity and/or frequency of the reflection of the interference beam 320 detected by the imager 304, the imager 304 may generate a 2D image of the subject 302. The 2D image 306 may depict the subject 302 from a single perspective.
In various implementations, a 3D image generator 326 generates a 3D image of the subject 302 based on the 2D image 306 and the depth image 308. According to some cases, the 3D image generator 326 is implemented by one or more processors configured to execute instructions. For instance, the 3D image generator 326 is configured to generate an array of voxels representing the subject 302 in 3D space. In various cases, the 3D image generator 326 performs ray tracing, shading, and tessellation to generate the 3D image of the subject 302. In various implementations, the 3D image generator 326 further performs point-of-view (POV) display segmentation to encode the 3D image for display via a lenticular display.
In some cases, the environment 300 includes one or more lenses 328 configured to refract low-coherence light 312, the reference beam 318, the reflection of the reference beam 318 from the deformable reference mirror 322, the interference beam 320, the reflection of the interference beam 320 from the subject 302, or any combination thereof. For example, the lens(es) 328, in various cases, are disposed between the light source 310 and the incident mirror 314, between the beam splitter 316 and the subject 302, between the beam splitter 316 and the deformable reference mirror 322, between the beam splitter 316 and the imager 304, or any combination thereof. In some cases, the position(s) of the lens(es) 328 can be modified via one or more actuators (not illustrated), such as in order to adjust the focus and/or resolution of the 2D image 306 and/or the depth image 308.
According to various implementations, the shape of a reflective surface of the deformable reference mirror 322 is adjustable. For example, an actuator 330 is physically coupled to the deformable reference mirror 322. In various cases, the actuator 330 is configured to alter a curvature and/or position of the deformable reference mirror 322. For example, if the deformable reference mirror 322 is in a concave shape, the reflection of the reference beam 318 converges. Alternatively, if the deformable reference mirror 322 is in a convex shape, the reflection of the reference beam 318 diverges.
In various implementations, the curvature of the deformable reference mirror 322 is optimized according to the interference pattern 324. In various cases, the curvature of the deformable reference mirror 322 corresponds to a depth of the subject 302 represented by the interference pattern 324. For example, if the reference beam 318 and the interference beam 320 are too mismatched, the interference pattern 324 cannot be adequately converted into the depth image 308. Changing the curvature of the deformable reference mirror 322 can correct the depth alignment of the interference pattern 324.
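The curvature adjustment can be pictured as a feedback loop that searches for the curvature setting giving the strongest interference fringes. The Python sketch below assumes hypothetical `capture_pattern()` and `set_curvature()` callables standing in for the imager 304 and the actuator 330, along with a Michelson-style contrast metric; none of these interfaces are defined by the disclosure.

```python
def fringe_contrast(pattern):
    """Michelson-style contrast of a detected interference pattern
    (a NumPy array of detector intensities)."""
    return (pattern.max() - pattern.min()) / (pattern.max() + pattern.min() + 1e-9)

def tune_mirror(capture_pattern, set_curvature, start=0.0, step=0.05, iters=20):
    """Hill-climb the curvature setting: keep moving in whichever direction
    improves fringe contrast, and halve the step when neither direction helps."""
    curvature = start
    set_curvature(curvature)
    best = fringe_contrast(capture_pattern())
    for _ in range(iters):
        improved = False
        for direction in (+1.0, -1.0):
            candidate = curvature + direction * step
            set_curvature(candidate)
            score = fringe_contrast(capture_pattern())
            if score > best:
                best, curvature, improved = score, candidate, True
                break
        if not improved:
            step *= 0.5
        set_curvature(curvature)   # leave the mirror at the best setting so far
    return curvature
```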
In various implementations, the light source 310 includes at least two elements configured to output different types of light. One element may be configured to output the low-coherence light 312 or even coherent light, which may be optimized for generating the interference pattern 324 and the depth image 308. Another element may be configured to output a broader-spectrum light, such as white light, that is configured to illuminate the subject 302 in order to obtain the 2D image 306. In some implementations, the light source 310 includes a light-emitting element and a filter wheel that selectively passes coherent or low-coherence light (e.g., when obtaining the depth image 308) or white light (e.g., when obtaining the 2D image 306) at different times.
As shown, the depth pixels 402 have a higher resolution at a periphery of the depth image 400, and may have a lower resolution at a center of the depth image 400. For example, the depth image 400 may depict the center of the FOV with a lower pixel density than a periphery of the FOV. In various cases, this resolution distribution may be suitable for imaging the interior surface of an orifice, such as an ear. Because a center of the FOV of an orifice is a hole with relatively high depth and minimal diagnostic value, the depth image 400 may be generated to have relatively high resolution depicting a periphery of the FOV and relatively low resolution depicting a center of the FOV.
In various implementations, the scope 500 includes an adapter 502 configured to be inserted into an orifice of a subject, such as an ear canal. In some cases, the adapter 502 is disposable and removably coupled to the rest of the scope 500. The scope 500 includes a handle 504 configured to be grasped and held by the hand of a user, such as a clinician operating the scope 500. Further, the scope 500 includes a display 506. The display 506, in some cases, is circular. In various implementations, the display 506 is a lenticular display. For example, the display 506 includes a single 2D screen that outputs the 3D image via an array of lenses overlapping the 2D screen.
In various implementations, the display 600 has a uniform resolution (e.g., each display pixel in the display 600 has a consistent width and height). The resolution of the display pixels in the display 600 corresponds to a maximum resolution of the FOV. However, in cases where the display 600 is a lenticular display, a single location in the depicted image may be represented by multiple display pixels, which can lower the image resolution. Thus, the more depth information the display 600 portrays, the less image resolution it portrays. In various cases, the first zone 602 depicts the 3D image at a high image resolution, with minimal depth information. In contrast, the third zone 606 depicts the 3D image at low image resolution but with high depth information.
In various implementations, the depth pixels 708 are used to define a polygonal mesh 714. For example, the depth pixels 708, in some cases, define vertices of the polygonal mesh 714. Alternatively, each depth pixel 708 defines a face (e.g., a center of the face) of the polygonal mesh 714. In the example illustrated in
The 2D image 704, in various implementations, includes an array of image pixels 716. In various implementations, the image pixels 716 are defined according to the intensity and/or frequency of light reflected from positions along the surface being imaged. In various cases, the 2D image 704 can be defined along a plane that is parallel to a sensing face of the imager(s). For example, if the imager(s) include a 2D photosensor array defined in a plane along a first direction and a second direction, then the 2D image 704 may be defined along a parallel plane.
In various implementations, the 3D image of the surface is generated by ray tracing. For example, rays may be defined from the focal point 710 through the image pixels 716 and onto the faces of the polygonal mesh 714 defined by the depth pixels 708. Thus, the image pixels 716 are projected onto the polygonal mesh 714. In various implementations, voxels in the 3D surface 706 are defined based on nearby and/or overlapping vertices of the depth image 702 and/or values of the image pixels 716 that project near the voxels. In some cases, a first example voxel in the 3D surface 706 is located at a position in 3D space that overlaps the first depth pixel 708′. Thus, the first example voxel is located along the surface being imaged. The value of the first example voxel (e.g., a color of the first example voxel) is determined based on a value of a first image pixel 716′, which is projected onto the first depth pixel 708′ and the first example voxel in the 3D surface 706. For instance, the projection of the first image pixel 716′ is directly onto the first example voxel, such that the value of the first example voxel is equivalent to the value of the first image pixel 716′. In some cases, the vertices of the polygonal mesh 714 are likewise assigned values based on the values of the image pixels 716.
According to some cases, not all of the image pixels 716 project perfectly onto the positions corresponding to the depth pixels 708. For example, a second image pixel 716″ projects onto a position between a second depth pixel 708″ and a third depth pixel 708″′ in the depth image 702. In various implementations, the value of the second image pixel 716″ is used to define the value of a voxel located between the second depth pixel 708″ and the third depth pixel 708″′ in the 3D space.
In various implementations, the spatial resolution of the image pixels 716 may be different than that of the depth pixels 708 at various regions within 3D space. In some cases, a value of an example face of the polygonal mesh 714 is defined by linearly interpolating at least two of the image pixels 716. For example, if two neighboring image pixels 716 project onto two separate voxels of the polygonal mesh 714, wherein another voxel is disposed between the two separate voxels of the polygonal mesh 714, then a value of the middle voxel can be defined based on a combination of the values of the two neighboring image pixels 716. For instance, the combination may be a linear interpolation of the values of the two neighboring image pixels 716, such as a mean of the values.
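A minimal numeric sketch of this interpolation is shown below (Python/NumPy). The RGB values and the single in-between location are made-up examples; only the weighted-combination step reflects the description above.

```python
import numpy as np

def fill_between(value_a, value_b, n_between):
    """Fill the values of n_between locations lying between two projected
    image pixels by linear interpolation of the two pixel values."""
    # Interpolation weights exclude the endpoints themselves.
    weights = np.linspace(0.0, 1.0, n_between + 2)[1:-1, None]
    return (1.0 - weights) * np.asarray(value_a) + weights * np.asarray(value_b)

# One location sits between a red and a blue projected pixel; its value is the mean.
print(fill_between([255, 0, 0], [0, 0, 255], n_between=1))   # [[127.5   0.  127.5]]
```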
By projecting the image pixels 716 onto the polygonal mesh 714, the voxels of the 3D surface 706 may be defined. In various implementations, the 3D surface 706 is generated by performing tessellation on the polygonal mesh 714. For example, vertices are added to the polygonal mesh 714 in order to smooth the polygonal mesh 714 into the 3D surface 706. For example, the polygonal mesh 714 is converted into the 3D surface 706 using at least one tessellation algorithm. In some cases, the 3D surface 706 is defined as a polygonal shape with a greater number of faces than the polygonal mesh 714. In various implementations, the projection of the image pixels 716 is performed after the tessellation, such that the projection is performed on voxels defining the 3D surface 706 rather than the depth pixels 708. However, implementations are not so limited.
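The tessellation step can be sketched as one pass of edge-midpoint subdivision (Python/NumPy). This is only an illustration under the assumption that refinement is performed by inserting midpoint vertices; a smoothing scheme such as Loop subdivision, which also repositions existing vertices, could be substituted to produce a smoother surface.

```python
import numpy as np

def subdivide_once(vertices, faces):
    """Insert a midpoint vertex on each triangle edge and split every face
    into four smaller faces, increasing the number of faces in the mesh."""
    vertices = list(map(np.asarray, vertices))
    midpoint_cache = {}
    new_faces = []

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_cache:
            vertices.append((vertices[i] + vertices[j]) / 2.0)
            midpoint_cache[key] = len(vertices) - 1
        return midpoint_cache[key]

    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.asarray(vertices), np.asarray(new_faces)

# One triangle becomes four.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
verts2, faces2 = subdivide_once(verts, [(0, 1, 2)])   # len(faces2) == 4
```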
At 802, the entity identifies a 2D image of a surface. For example, the entity receives and/or generates the 2D image that is indicative of light reflected from the surface. In some cases, the 2D image indicates the intensity and/or color of the light reflected from the surface. In various implementations, the entity includes an imager configured to generate the 2D image of the surface. For instance, the entity may emit light onto the surface, detect the light reflected from the surface, and generate the 2D image based on the detected light. The 2D image may be represented by multiple image pixels. According to various cases, a spatial resolution of the 2D image decreases radially from a center of the FOV of the 2D image.
At 804, the entity identifies a depth image of the surface. In various implementations, the entity receives and/or generates data indicative of a distance between a sensor and the surface. In some cases, the entity generates the depth image using ultrasound and/or OCT imaging. For instance, the entity may detect an interference pattern using the same imager used to detect the 2D image. According to various cases, a spatial resolution of the depth image increases radially from a center of the FOV of the depth image. In some cases, the depth image has a different relative spatial resolution than the 2D image.
At 806, the entity generates a 3D image of the surface based on the 2D image and the depth image. In various implementations, the entity defines locations along the surface in 3D space using the depth image. In some cases, the entity generates a 3D binary image, wherein the nonzero values of the voxels in the 3D binary image correspond to the locations along the surface. For instance, the entity converts the radial data in the depth image to 3D coordinates. In some cases, voxels disposed between neighboring depths in the depth image are defined by geometrically interpolating the neighboring depths in 3D space.
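For instance, the conversion from radial depth data to a 3D binary image might look like the following sketch (Python/NumPy), under the assumption that each depth pixel is a range measurement along a ray through a small angular field of view; the grid size, field of view, and scaling are illustrative.

```python
import numpy as np

def radial_depth_to_binary_image(ranges, grid=(64, 64, 64), max_range=40.0,
                                 fov_deg=30.0):
    """Convert radial depth samples into a 3D binary image: each depth pixel
    is treated as a range along a ray whose direction is set by its position
    in the field of view, and voxels hit by a surface point are set to 1."""
    h, w = ranges.shape
    half = np.deg2rad(fov_deg) / 2.0
    az = np.linspace(-half, half, w)            # azimuth per column
    el = np.linspace(-half, half, h)            # elevation per row
    el, az = np.meshgrid(el, az, indexing="ij")

    # Spherical -> Cartesian, with z along the viewing axis.
    x = ranges * np.cos(el) * np.sin(az)
    y = ranges * np.sin(el)
    z = ranges * np.cos(el) * np.cos(az)

    gx, gy, gz = grid
    volume = np.zeros(grid, dtype=np.uint8)
    ix = np.clip(((x / max_range + 0.5) * gx).astype(int), 0, gx - 1)
    iy = np.clip(((y / max_range + 0.5) * gy).astype(int), 0, gy - 1)
    iz = np.clip((z / max_range * gz).astype(int), 0, gz - 1)
    volume[ix, iy, iz] = 1                      # nonzero voxels lie on the surface
    return volume

binary_image = radial_depth_to_binary_image(np.random.uniform(5, 35, size=(32, 32)))
```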
The entity, in some cases, colorizes and/or texturizes the 3D binary image using the 2D image. For example, the entity defines values of the nonzero voxels based on the image pixels in the 2D image of the surface. In various implementations, the entity generates the 3D image by ray tracing the values of the image pixels in the 2D image onto the shape of the surface (e.g., as defined in the 3D binary image). In some cases, the entity performs linear interpolation to define the values of one or more nonzero voxels disposed between rays traced between neighboring image pixels in the 2D image and projected onto the nonzero values of the 3D binary image. In some implementations, the entity performs tessellation to generate the 3D image.
At 808, the entity outputs the 3D image on a 3D display. In some implementations, the entity outputs the 3D image via a lenticular display. For instance, the entity controls a single 2D display pixel array and/or an array of lenses overlapping the pixel array to direct the 3D image in two directions corresponding to two eyes of a user. For instance, the entity causes the apparent image perceived by a first eye of the user to be a 2D projection of the 3D image from a first angle, and the apparent image perceived by a second eye of the user to be a 2D projection of the 3D image from a second angle, such that the perceived images are in parallax. In various implementations, the parallax causes the user to perceive the 2D projections in three dimensions. Accordingly, the entity can visually output the 3D image to a user using a single 2D array of display pixels. In some implementations, the entity outputs the 2D projections onto two separate displays that are respectively viewed by the eyes of the user.
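For the single-panel lenticular case described above, the point-of-view segmentation can be sketched as column interleaving of the two projections (Python/NumPy). The two-view layout, the column ordering, and the image sizes are assumptions for illustration; practical lenticular panels often interleave more than two views per lens.

```python
import numpy as np

def interleave_two_views(left, right):
    """Alternate columns of the left-eye and right-eye projections so that,
    behind each lens, one column of display pixels is steered toward each
    eye of the viewer."""
    h, w, c = left.shape
    panel = np.empty((h, w * 2, c), dtype=left.dtype)
    panel[:, 0::2] = left       # columns refracted toward the left eye
    panel[:, 1::2] = right      # columns refracted toward the right eye
    return panel

# Example: interleave two 240x320 projections into one 240x640 panel image.
left = np.zeros((240, 320, 3), dtype=np.uint8)
right = np.full((240, 320, 3), 255, dtype=np.uint8)
panel_image = interleave_two_views(left, right)
```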
At 902, the entity generates a 3D polygon mesh of a surface based on a depth image of the surface. In various implementations, the depth pixels define a surface in the 3D space. Various polygons are defined in the 3D space, wherein the depth pixels are located at vertices of the polygons. Thus, the surface is approximated as a collection of flat polygonal surfaces (e.g., in a polygonal mesh). In various implementations, the 3D polygon mesh is defined as a binary 3D image in a 3D space. Therefore, nonzero voxels in the binary 3D image are assigned at the defined surface. Zero value voxels are defined elsewhere in the binary 3D image.
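A sketch of building the polygon mesh from a grid of depth pixels is shown below (Python/NumPy; the pixel pitch and the two-triangles-per-cell connectivity are illustrative assumptions). Each depth pixel becomes a vertex, and neighboring pixels are connected into triangular faces.

```python
import numpy as np

def depth_grid_to_mesh(depth, pixel_pitch=1.0):
    """Treat each depth pixel as a vertex of a polygonal mesh and connect
    neighboring pixels into two triangles per grid cell."""
    h, w = depth.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    vertices = np.stack(
        [xs.ravel() * pixel_pitch, ys.ravel() * pixel_pitch, depth.ravel()], axis=1
    )                                            # shape (h*w, 3)

    faces = []
    for r in range(h - 1):
        for c in range(w - 1):
            i = r * w + c                        # vertex index of the cell corner
            faces.append((i, i + 1, i + w))          # upper-left triangle
            faces.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return vertices, np.asarray(faces)

vertices, faces = depth_grid_to_mesh(np.random.rand(16, 16))
```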
At 904, the entity aligns a 2D image of the surface with the 3D polygon mesh using ray tracing. At 906, the entity defines values of the 3D polygon mesh by projecting values of the 2D image onto the 3D polygon mesh. In various implementations, values of individual image pixels in the 2D image are projected onto the 3D polygon mesh. In various cases, if neighboring image pixels of the 2D image project onto the 3D polygon mesh at different locations that are separated by a vertex of the 3D polygon mesh, the value (e.g., color) of the vertex of the 3D polygon mesh is defined based on a linear interpolation of the values of the projected image pixels.
At 908, the entity generates a 3D image of the surface by smoothing the 3D polygon mesh. In various implementations, tessellation is performed on the colorized 3D polygon mesh in order to smooth the surface. The resultant smoothed image is defined as a 3D image of the surface.
As illustrated, the device(s) 1000 comprise a memory 1004. In various embodiments, the memory 1004 is volatile (including a component such as Random Access Memory (RAM)), non-volatile (including a component such as Read Only Memory (ROM), flash memory, etc.) or some combination of the two.
The memory 1004 may include various components, such as at least one 2D image generator 1006, at least one depth image generator 1008, a 3D image generator 1010, a POV segmenter 1012, and the like. Any of the 2D image generator 1006, the depth image generator 1008, the 3D image generator 1010, and the POV segmenter 1012 can comprise methods, threads, processes, applications, or any other sort of executable instructions. The 2D image generator 1006, the depth image generator 1008, the 3D image generator 1010, and the POV segmenter 1012 and various other elements stored in the memory 1004 can also include files and databases. In various cases, the 2D image generator 1006 includes instructions for generating a 2D image based on light detected by an image sensor. The depth image generator 1008, in various cases, includes instructions for generating a depth image based on an interference pattern detected by an image sensor and/or ultrasound detected by a transducer. In various implementations, the 3D image generator 1010 includes instructions for generating a 3D image based on a 2D image and a depth image of the same scene. In some examples, the POV segmenter 1012 includes instructions for outputting the 3D image on a lenticular display.
The memory 1004 may include various instructions (e.g., instructions in the 2D image generator 1006, depth image generator 1008, 3D image generator 1010, and/or POV segmenter 1012), which can be executed by at least one processor 1014 to perform operations. In some embodiments, the processor(s) 1014 includes a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or both CPU and GPU, or other processing unit or component known in the art.
The device(s) 1000 can also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
The device(s) 1000 also can include input device(s) 1022, such as a keypad, a cursor control, a touch-sensitive display, voice input device, etc., and output device(s) 1024 such as a display, speakers, printers, etc. In particular implementations, a user can provide input to the device(s) 1000 via a user interface associated with the input device(s) 1022 and/or the output device(s) 1024. In various implementations, the input device(s) 1022 include a distance sensor configured to detect a distance between a user and the device(s) 1000, an ultrasound transducer, an OCT imaging system, or a combination thereof. According to various cases, the input device(s) 1022 include a single imager. In various implementations, the output device(s) 1024 include a lenticular display.
As illustrated in
In some implementations, the transceiver(s) 1016 can be used to communicate between various functions, components, modules, or the like, that are comprised in the device(s) 1000. For instance, the transceivers 1016 may facilitate communications between the 2D image generator 1006, the depth image generator 1008, the 3D image generator 1010, and/or the POV segmenter 1012.
In some instances, one or more components may be referred to herein as “configured to,” “configurable to,” “operable/operative to,” “adapted/adaptable,” “able to,” “conformable/conformed to,” etc. Those skilled in the art will recognize that such terms (e.g., “configured to”) can generally encompass active-state components and/or inactive-state components and/or standby-state components, unless context requires otherwise.
As used herein, the term “based on” can be used synonymously with “based, at least in part, on” and “based at least partly on.”
As used herein, the terms “comprises/comprising/comprised” and “includes/including/included,” and their equivalents, can be used interchangeably. An apparatus, system, or method that “comprises A, B, and C” includes A, B, and C, but also can include other components (e.g., D) as well. That is, the apparatus, system, or method is not limited to components A, B, and C.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described.
This application claims priority to U.S. Provisional App. No. 63/434,013, which is titled “Depth Rendering Scope” and was filed on Dec. 20, 2022, and is incorporated by reference herein in its entirety.