This disclosure relates generally to tracking the motion of one or more objects relative to visible skin on a subject by means of computer vision from a freely movable camera and, in non-limiting embodiments, to systems and methods for tracking an ultrasound probe relative to a subject's skin and body and, in other non-limiting embodiments, to systems and methods for tracking a head-mounted display by means of cameras viewing a subject.
Ultrasound is a widely used clinical imaging modality for monitoring anatomical and physiological characteristics. Ultrasound combines several advantages, including low cost, real-time operation, a small size that makes it easy to use and transport, and a lack of ionizing radiation. These properties make ultrasound an ideal tool for medical image-guided interventions. However, unlike computed tomography (CT) and magnetic resonance imaging (MRI), which provide innate three-dimensional (3D) anatomical models, ultrasound suffers from a lack of contextual correlations due to changing and unrecorded probe locations, which makes it challenging to apply in certain clinical environments.
Existing methods for tracking ultrasound probes relative to human skin involve mounting cameras directly on the ultrasound probes. Such arrangements require specialized hardware and calibration within an operating room. Because the ultrasound probe must be in contact with the skin, previous algorithms have relied on the camera being at a fixed distance from the skin. Such methods do not allow the camera looking at the skin to be moved separately from the ultrasound probe or to be used both near and far from the skin.
A challenge for ultrasound in clinical applications stems from the lack of a stable, anatomic coordinate system. One approach uses Scale-Invariant Feature Transform (SIFT) feature tracking on images taken by a low-cost camera mounted on an ultrasound probe, together with simultaneous localization and mapping (SLAM) for 3D reconstruction. While this method is cost effective, SIFT often fails to track natural skin features, resulting in high cumulative error. Other methods involve manually attaching known markers to the body. However, tracking tissue deformation requires a dense set of tracked points, and attaching or inking large numbers of artificial markers on a patient is not desirable. For example, many artificial markers protrude from or cover the skin in a manner that can get in the way of the clinician, and artificial markers do not usually persist across months or years as would be desirable for longitudinal patient monitoring. Another method uses a commercial clinical 3D scanning system to acquire a preoperative 3D patient model, which aids in determining the location and orientation of the probe and the patient, but that method also mounted a camera directly on the ultrasound probe and was not usable with a mobile camera, such as a smartphone camera or head-mounted display (HMD). Another method uses phase-only correlation (POC) tracking to robustly find subtle features with sub-pixel precision on the human body, but this POC method uses a camera mounted on the probe and is unable to track features when the camera is moved toward or away from the patient because of the resulting changes in scale and rotation.
According to non-limiting embodiments or aspects, provided is a method for determining the pose of an object relative to a subject, comprising: capturing, with at least one computing device, a sequence of images with a stationary or movable camera unit arranged in a room, the sequence of images comprising a subject and an object moving relative to the subject; and determining, with at least one computing device, the pose of the object with respect to the subject in at least one image of the sequence of images based on computing or using a prior surface model of the subject, a surface model of the object, and an optical model of the camera unit. In non-limiting embodiments or aspects, the at least one computing device and the camera unit are arranged in a mobile device. In non-limiting embodiments or aspects, the object being tracked may be the at least one camera unit itself or at least one object physically connected to the at least one camera unit. In non-limiting embodiments or aspects, the subject may be a medical patient. In other non-limiting embodiments or aspects, the subject may not be a patient. In non-limiting embodiments or aspects, the object(s) may be tracked for non-medical purposes, including but not limited to utilitarian or entertainment purposes. In other non-limiting embodiments or aspects, an animal or other subject with skin-like features may take the place of the subject.
In non-limiting embodiments or aspects, determining the pose of the object includes determining the skin deformation of the subject. In non-limiting embodiments or aspects, determining the pose of the object comprises: generating a projection of the surface model of the subject through the optical model of the camera unit; and matching the at least one image to the projection.
According to non-limiting embodiments or aspects, provided is a system for determining the pose of an object relative to a subject, comprising: a camera unit; a data storage device comprising a surface model of a subject, a surface model of an object, and an optical model of the camera unit; and at least one computing device programmed or configured to: capture a sequence of images with the camera unit while the camera unit is stationary and arranged in a room, the sequence of images comprising the subject and the object moving relative to the subject; and determine the pose of the object with respect to the subject in at least one image of the sequence of images based on a surface model of the subject, a surface model of the object, and an optical model of the camera unit.
In non-limiting embodiments or aspects, the at least one computing device and the camera unit are arranged in a mobile device. In non-limiting embodiments or aspects, determining the pose of the object includes determining the skin deformation of the subject. In non-limiting embodiments or aspects, determining the pose of the object comprises: generating a projection of the surface model of the subject through the optical model of the camera unit; and matching the at least one image to the projection.
According to non-limiting embodiments or aspects, provided is a system for determining the pose of an object relative to a subject, the system comprising: a camera not attached to the object able to view the object and the surface of the subject; a computer containing 3D surface models of the subject and the object, and an optical model of the camera; wherein: the computer determines the optimal 3D camera pose relative to the surface model of the subject for which the camera image of the subject best matches the surface model of the subject projected through the optical model of the camera; the computer uses the camera pose thus determined to find the optimal 3D object pose relative to the subject for which the camera image of the object best matches the surface model of the object projected through the optical model of the camera. In non-limiting embodiments or aspects, the camera is in a smartphone or tablet. In non-limiting embodiments or aspects, the object is a surgical tool. In other non-limiting embodiments or aspects, the camera is head mounted, including a camera incorporated into a head-mounted display.
In non-limiting embodiments or aspects, the object is an ultrasound probe. In non-limiting embodiments or aspects, the object is a clinician's hand or finger. In non-limiting embodiments or aspects, at least one of the surface model of the subject and the surface model of the object is derived from a set of images from a multi-camera system. In non-limiting embodiments or aspects, at least one of the surface model of the subject and the surface model of the object is derived from a temporal sequence of camera images. In non-limiting embodiments or aspects, the optical model of the camera is derived from a calibration of the camera prior to the run-time operation of the system. In non-limiting embodiments or aspects, the optical model of the camera is derived during the run-time operation of the system.
In non-limiting embodiments or aspects, an inertial navigation system is incorporated into the object to provide additional information about object pose. In non-limiting embodiments or aspects, an inertial navigation system is incorporated into the camera to provide additional information about camera pose. In non-limiting embodiments or aspects, the inertial navigation system provides orientation and the video image provides translation for the camera pose. In non-limiting embodiments or aspects, inverse rendering of one or both of the surface models is used to find its optimal 3D pose. In non-limiting embodiments or aspects, a means is provided to guide the operator to move the object to a desired pose relative to the subject. In non-limiting embodiments or aspects, the operator is guided to move the object to an identical pose relative to the subject as was determined at a previous time. In non-limiting embodiments or aspects, the means to guide the operator makes use of the real-time determination of the present object pose. In non-limiting embodiments or aspects, the means to guide the operator identifies when a desired pose has been accomplished. In non-limiting embodiments or aspects, the operator is guided to move the object by selective activation of lights attached to the object. In non-limiting embodiments or aspects, the operator is guided to move the object by audio cues. In non-limiting embodiments or aspects, the operator is guided to move the object by tactile cues. In non-limiting embodiments or aspects, the operator is guided to move the object by a graphical display. In non-limiting embodiments or aspects, the graphical display contains a rendering of the object in the desired pose relative to the subject. In non-limiting embodiments or aspects, the object is virtual, comprising a single target point on the surface of the subject. In non-limiting embodiments or aspects, the object is virtual, comprising a one-dimensional line intersecting the surface of the subject at a single target point in a particular direction relative to the surface.
Further embodiments or aspects are set forth in the following numbered clauses:
Clause 1: A system for determining a pose of an object relative to a subject with a skin or skin-like surface, the system comprising: a camera not attached to the object and arranged to view the object and a surface of the subject; and a computing device in communication with the camera and comprising a three-dimensional (3D) surface model of the subject, a 3D surface model of the object, and an optical model of the camera, the computing device configured to: determine an optimal 3D camera pose relative to the 3D surface model of the subject for which an image of the subject captured by the camera matches the 3D surface model of the subject projected through the optical model of the camera; and determine an optimal 3D object pose relative to the subject for which an image of the object matches the 3D surface model of the object projected through the optical model of the camera.
Clause 2: The system of clause 1, wherein the camera is arranged in a smartphone or tablet.
Clause 3: The system of clauses 1 or 2, wherein the object is at least one of the following: a surgical tool, an ultrasound probe, a clinician's hand or finger, or any combination thereof.
Clause 4: The system of any of clauses 1-3, wherein at least one of the 3D surface model of the subject and the 3D surface model of the object is derived from a set of images from a multi-camera system.
Clause 5: The system of any of clauses 1-4, wherein at least one of the 3D surface model of the subject and the 3D surface model of the object is derived from a temporal sequence of camera images.
Clause 6: The system of any of clauses 1-5, wherein the optical model of the camera is derived from a calibration of the camera prior to a run-time operation of the system.
Clause 7: The system of any of clauses 1-6, wherein the optical model of the camera is derived during a run-time operation of the system.
Clause 8: The system of any of clauses 1-7, further comprising an inertial navigation system incorporated into the object and configured to output data associated with the optimal 3D object pose.
Clause 9: The system of any of clauses 1-8, further comprising an inertial navigation system incorporated into the camera and configured to output data associated with the optimal 3D camera pose.
Clause 10: The system of any of clauses 1-9, wherein the inertial navigation system provides orientation data and a video image provides translation for the optimal 3D camera pose.
Clause 11: The system of any of clauses 1-10, wherein determining at least one of the optimal 3D camera pose and the optimal 3D object pose is based on an inverse rendering of at least one of the 3D surface model of the subject and the 3D surface model of the object.
Clause 12: The system of any of clauses 1-11, further comprising a guide configured to guide an operator to move the object to a desired pose relative to the subject.
Clause 13: The system of any of clauses 1-12, wherein the operator is guided to move the object to an identical pose relative to the subject that was determined at a previous time.
Clause 14: The system of any of clauses 1-13, wherein the guide is configured to guide the operator based on a real-time determination of a present object pose.
Clause 15: The system of any of clauses 1-14, wherein the guide identifies to the operator when a desired pose has been accomplished.
Clause 16: The system of any of clauses 1-15, further comprising lights attached to the object, wherein the operator is guided to move the object by selective activation of the lights.
Clause 17: The system of any of clauses 1-16, wherein the guide is configured to guide the operator based on audio cues.
Clause 18: The system of any of clauses 1-17, wherein the guide is configured to guide the operator based on tactile cues.
Clause 19: The system of any of clauses 1-18, wherein the guide is displayed on a graphical display.
Clause 20: The system of any of clauses 1-19, wherein the graphical display comprises a rendering of the object in the desired pose relative to the subject.
Clause 21: The system of any of clauses 1-20, wherein the object is a virtual object comprising a single target point on the surface of the subject.
Clause 22: The system of any of clauses 1-21, wherein the object is a virtual object comprising a one-dimensional line intersecting the surface of the subject at a single target point in a particular direction relative to the surface.
Clause 23: A method for determining a pose of an object relative to a subject, comprising: capturing, with at least one computing device, a sequence of images with a stationary or movable camera unit arranged in a room, the sequence of images comprising the subject and an object moving relative to the subject; and determining, with at least one computing device, the pose of the object with respect to the subject in at least one image of the sequence of images based on computing or using a prior surface model of the subject, a surface model of the object, and an optical model of the stationary or movable camera unit.
Clause 24: The method of clause 23, wherein the at least one computing device and the stationary or movable camera unit are arranged in a mobile device.
Clause 25: The method of clauses 23 or 24, wherein determining the pose of the object includes determining a skin deformation of the subject.
Clause 26: The method of any of clauses 23-25, wherein determining the pose of the object comprises: generating a projection of the surface model of the subject through the optical model of the stationary or movable camera unit; and matching at least one image to the projection.
Clause 27: A system for determining a pose of an object relative to a subject, comprising: a camera unit; a data storage device comprising a surface model of a subject, a surface model of an object, and an optical model of the camera unit; and at least one computing device programmed or configured to: capture a sequence of images with the camera unit while the camera unit is stationary and arranged in a room, the sequence of images comprising the subject and the object moving relative to the subject; and determine the pose of the object with respect to the subject in at least one image of the sequence of images based on a surface model of the subject, a surface model of the object, and an optical model of the camera unit.
Clause 28: The system of clause 27, wherein the at least one computing device and the camera unit are arranged in a mobile device.
Clause 29: The system of clauses 27 or 28, wherein determining the pose of the object includes determining skin deformation of the subject.
Clause 30: The system of any of clauses 27-29, wherein determining the pose of the object comprises: generating a projection of the surface model of the subject through the optical model of the camera unit; and matching the at least one image to the projection.
Clause 31: The method of any of clauses 23-26, wherein the object comprises the stationary or movable camera unit or at least one object physically connected to the stationary or movable camera unit.
Clause 32: A computer program product comprising at least one non-transitory computer-readable medium including program instructions that, when executed by at least one processor, cause the at least one processor to perform the methods of any of clauses 23-26 and 31.
Clause 33: The system of any of clauses 1-22 and 27-30, wherein the subject is a medical patient.
Clause 34: The method of any of clauses 23-26, wherein determining the pose of the object comprises tracking a feature on a skin surface of the subject by: identifying an image patch including the feature of an image from the sequence of images; building an image pyramid based on the image patch, the image pyramid comprising scaled versions of the image patch; and matching an image patch from a next image from the sequence of images to an image patch from the image pyramid.
Clause 35: The system of any of clauses 27-30, wherein the at least one computing device is programmed or configured to determine the pose of the object by tracking a feature on a skin surface of the subject by: identifying an image patch including the feature of an image from the sequence of images; building an image pyramid based on the image patch, the image pyramid comprising scaled versions of the image patch; and matching an image patch from a next image from the sequence of images to an image patch from the image pyramid.
Clause 36: A method for tracking a feature on a skin surface of a subject, comprising: detecting, with at least one computing device, feature points on an image of a sequence of images captured of the skin surface of the subject; identifying, with the at least one computing device, an image patch of the image including at least one feature point; building, with the at least one computing device, an image pyramid based on the image patch, the image pyramid comprising scaled versions of the image patch; matching, with the at least one computing device, an image patch from a next image from the sequence of images to an image patch from the image pyramid; and calculating, with the at least one computing device, a shift value for the next image based on matching the image patch from the next image to the image patch from the image pyramid.
Clause 37: The method of clause 36, further comprising: transforming the image patch of the image into a mathematical function; and extracting phase information from the image patch of the image, wherein matching the image patch is based on the phase information.
These and other features and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structures and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention.
Additional advantages and details are explained in greater detail below with reference to the non-limiting, exemplary embodiments that are illustrated in the accompanying drawings, in which:
It is to be understood that the embodiments may assume various alternative variations and step sequences, except where expressly specified to the contrary. It is also to be understood that the specific devices and processes described in the following specification are simply exemplary embodiments or aspects of the disclosure. Hence, specific dimensions and other physical characteristics related to the embodiments or aspects disclosed herein are not to be considered as limiting. No aspect, component, element, structure, act, step, function, instruction, and/or the like used herein should be construed as critical or essential unless explicitly described as such. Also, as used herein, the articles “a” and “an” are intended to include one or more items and may be used interchangeably with “one or more” and “at least one.” Also, as used herein, the terms “has,” “have,” “having,” or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based at least partially on” unless explicitly stated otherwise.
As used herein, the term “computing device” may refer to one or more electronic devices configured to process data. A computing device may, in some examples, include the necessary components to receive, process, and output data, such as a processor, a display, a memory, an input device, a network interface, and/or the like. A computing device may be a mobile device. A computing device may also be a desktop computer or other form of non-mobile computer. In non-limiting embodiments, a computing device may include an artificial intelligence (AI) accelerator, including an application-specific integrated circuit (ASIC) neural engine such as Apple's M1® “Neural Engine” or Google's TENSORFLOW® processing unit. In non-limiting embodiments, a computing device may be comprised of a plurality of individual circuits.
As used herein, the term “subject” may refer to a person (e.g., a human body), an animal, a medical patient, and/or the like. A subject may have a skin or skin-like surface.
Non-limiting embodiments described herein utilize a camera detached and separate from an ultrasound probe or other object (e.g., clinical tool) to track the probe (or other object) by analyzing a sequence of images (e.g., frames of video) captured of a subject's skin and the features thereon. The camera may, as an example, be part of a mobile device that is mounted in the room or held by a user. The camera may also be part of a head-mounted display (HMD). Although non-limiting embodiments are described herein with respect to ultrasound probes, it will be appreciated that such non-limiting embodiments may be implemented to track the position of any object relative to a subject based on the subject's skin features (e.g., blemishes, spots, wrinkles, deformations, and/or other parameters). For example, non-limiting embodiments may track the position of a clinical tool such as a scalpel, a needle, a clinician's hand or finger, and/or the like.
Non-limiting embodiments may be implemented with a smartphone, using both the camera unit of the smartphone and the internal processing capabilities of the smartphone. A graphical user interface of the smartphone may direct a clinician, based on the tracked object, to move the object to a desired pose (e.g., position, orientation, and/or location with respect to the subject) based on a target or to avoid a critical structure, to repeat a previously-used pose, and/or to train an individual how to utilize a tool such as an ultrasound probe. In such non-limiting embodiments, no special instrumentation is needed other than a smartphone with a built-in camera and a software application installed. Other mobile devices may also be used, such as tablet computers, laptop computers, and/or any other mobile device including a camera.
In non-limiting embodiments, a data storage device stores three-dimensional (3D) surface models of a subject (e.g., patient) and an object (e.g., ultrasound probe). The data storage device may also store an optical model of a camera unit. The data storage device may be internal to the computing device or, in other non-limiting embodiments, external to and in communication with the computing device over a network. The computing device may determine a closest match between the subject depicted in an image from the camera and the surface model of the subject projected through the optical model of the camera. The computing device may determine an optimal pose of the camera unit relative to the surface model of the subject. The computing device may determine a closest match between the object depicted in the image and the surface model of the object projected through the optical model of the camera. The computing device may determine an optimal pose for the object relative to the subject.
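By way of illustration only, the following is a minimal sketch (in Python, with NumPy and OpenCV) of projecting surface-model points through a pinhole optical model and scoring a candidate pose against detected 2D features; the function names, the Rodrigues pose parameterization, and the mean-distance cost are assumptions made for the sketch and not necessarily the parameterization or matching cost used in a given embodiment.

```python
import numpy as np
import cv2

def project_points(points_3d, K, R, t):
    """Project 3D surface-model points through a pinhole optical model.

    points_3d: (N, 3) points in the model frame; K: (3, 3) intrinsic matrix;
    R, t: rotation and translation from the model frame to the camera frame.
    Returns (N, 2) pixel coordinates.
    """
    cam = points_3d @ R.T + t        # model frame -> camera frame
    uvw = cam @ K.T                  # apply the intrinsic (optical) model
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide

def pose_cost(pose_vec, points_3d, observed_2d, K):
    """Mean distance between observed 2D features and projected model points.

    pose_vec holds 3 Rodrigues rotation parameters followed by 3 translation
    parameters; minimizing this cost over pose_vec gives the camera (or object)
    pose for which the image best matches the projected surface model.
    """
    R, _ = cv2.Rodrigues(pose_vec[:3])
    err = project_points(points_3d, K, R, pose_vec[3:]) - observed_2d
    return float(np.linalg.norm(err, axis=1).mean())
```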
Non-limiting embodiments provide for skin-feature tracking usable in an anatomic simultaneous localization and mapping (SLAM) algorithm and localization of clinical tools relative to the subject's body. By tracking features from small patches of a subject's skin, the unique and deformable nature of the skin is contrasted to objects (such as medical tools and components thereof) that are typically rigid and can be tracked with 3D geometry, allowing for accurate tracking of objects relative to a subject in real-time. The use of feature tracking in a video taken by a camera (e.g., such as a camera in a smartphone or tablet) and anatomic SLAM of the camera motion relative to the skin surface allows for accurate and computationally efficient camera-based tracking of clinical tool(s) relative to reconstructed 3D features from the subject. In non-limiting examples, the systems and methods described herein may be used for freehand smartphone-camera based tracking of natural skin features relative to tools. In some examples, robust performance may be achieved with the use of a phase-only correlation (POC) modified to uniquely fit the freehand tracking scenario, where the distance between the camera and the subject varies over time.
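As a concrete illustration of phase-only correlation applied to a pair of skin patches, the following self-contained sketch (Python/NumPy) recovers the integer-pixel shift between two same-size grayscale patches; the helper name and the normalization epsilon are illustrative, and a practical implementation would add sub-pixel peak interpolation and the scale handling discussed below.

```python
import numpy as np

def phase_only_correlation(patch_a, patch_b):
    """Estimate the 2D translation between two same-size grayscale patches.

    Returns ((dy, dx), peak) such that patch_b is approximately patch_a shifted
    by (dy, dx); peak is the correlation maximum, usable as a match confidence.
    """
    spec = np.fft.fft2(patch_b) * np.conj(np.fft.fft2(patch_a))
    spec /= np.abs(spec) + 1e-12          # keep only the phase information
    poc = np.fft.ifft2(spec).real         # phase-only correlation surface
    peak_idx = np.unravel_index(np.argmax(poc), poc.shape)
    h, w = patch_a.shape
    dy = peak_idx[0] - h if peak_idx[0] > h // 2 else peak_idx[0]  # unwrap FFT wrap-around
    dx = peak_idx[1] - w if peak_idx[1] > w // 2 else peak_idx[1]
    return (dy, dx), poc[peak_idx]
```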
In non-limiting embodiments, the computing device 104 also includes a graphical user interface (GUI) 108 which may visually direct the user of the computing device 104 to guide the object 102 to a particular pose. For example, a visual guide may be generated on the GUI 108 to direct a clinician to a particular area of the skin surface. The computing device 104 may also guide the user with audio. The system 1000 includes a data storage device 110 that may be internal or external to the computing device 104 and includes, for example, an optical model of the camera, a 3D model of the subject and/or subject's skin surface, a 3D model of the object (e.g., clinical tool) to be used, and/or other like data used by the system 1000. The 3D models of the subject, subject's skin surface, and/or object may be represented in various ways, including but not limited to 3D point clouds.
In non-limiting embodiments, natural skin features may be tracked using POC to enable accurate 3D ultrasound tracking relative to skin. The system 1000 may allow accurate two-dimensional (2D) and 3D tracking of natural skin features from the perspective of a free-hand-held smartphone camera, including captured video that includes uncontrolled hand motion, distant view of skin features (few pixels per feature), lower overall image quality from small smartphone cameras, and/or the like. The system 1000 may enable reliable feature tracking across a range of camera distances and working around physical limitations of smartphone cameras.
In non-limiting embodiments, at least one camera unit may be attached to an HMD. In some examples, the HMD may contain an augmented-reality or virtual-reality display that shows objects inside or relative to the subject's skin, such that the objects appear to move with the skin. In non-limiting embodiments or aspects, the HMD may show medical images or drawings at their correct location in-situ inside the subject's body, such that the images move with the subject's skin. In non-limiting embodiments or aspects, a camera attached to an HMD may simultaneously track an ultrasound probe along with the subject's skin, and the HMD could show the operator current and/or previous images (and/or content derived from the images) in their correct location inside the subject's body (or at any desired location in 3D space that moves with the subject's skin on the subject's body), whether the subject's body remains still, moves, or is deformed.
In non-limiting embodiments, the object 102 may be a virtual object including a one-dimensional (1D) line intersecting the surface of the subject at a single target point in a particular direction relative to the surface.
Referring now to
At step 302 of
In non-limiting embodiments, prior to capturing the sequence of images at step 300 with the camera, an optical model may be generated for the camera through which other data may be projected. An optical model may be based on an intrinsic matrix obtained by calibrating the camera. As an example, several images of a checkerboard may be captured from different viewpoints and feature points may be detected on the checkerboard corners. The known positions of the corners may then be used to estimate the camera intrinsic parameters, such as focal length, camera center, and distortion coefficients. In some examples, the camera calibration toolbox provided by MATLAB may be utilized. In non-limiting embodiments, the optical model may be predefined for a camera. The resulting optical model may include a data structure including the camera parameters and may be kept static during the tracking process. In some examples, to prepare for tracking, preprocessing may be performed to convert color images to grayscale and then enhance the appearance of skin features using contrast limited adaptive histogram equalization (CLAHE) to find better spatial frequency components for feature detection and tracking.
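One possible realization of this calibration and preprocessing step, sketched below with OpenCV in place of the MATLAB toolbox, is offered only as an illustration; the checkerboard dimensions, square size, and CLAHE settings are placeholders rather than values prescribed by any embodiment.

```python
import cv2
import numpy as np

def calibrate_camera(checkerboard_images, board_size=(9, 6), square_mm=10.0):
    """Estimate the intrinsic optical model (focal length, camera center, and
    distortion coefficients) from several checkerboard views."""
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_mm
    obj_pts, img_pts = [], []
    for img in checkerboard_images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:                      # keep only views in which the board is detected
            obj_pts.append(objp)
            img_pts.append(corners)
    image_size = checkerboard_images[0].shape[1::-1]   # (width, height)
    _, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, image_size, None, None)
    return K, dist                     # intrinsic matrix and distortion coefficients

def preprocess_frame(frame_bgr):
    """Convert a color frame to grayscale and enhance skin features with CLAHE."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY))
```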
While processing the sequence of images, a Structure from Motion (SfM) process may be performed locally on every newly obtained set Si, and the locally computed 3D positions may be used to initialize the global set S = {S0, S1, S2, . . . }. Once a new set is obtained and added to the global set, a Bundle Adjustment process may be used to refine the overall 3D scheme. Through this process, it is possible to simultaneously update and refine the 3D feature points and camera motions while reading in new frames from the camera. In non-limiting embodiments, rather than using an existing SfM process, which includes a normalized five-point algorithm and random sample consensus, a modified process is performed to minimize re-projection error in order to compute structure from motion for every several (e.g., five (5)) frames. First, re-projection error is defined by the Euclidean distance ∥x−x_rep∥₂, where x is a tracked feature point and x_rep is the point obtained by projecting the corresponding 3D point back to the image using the calculated projection matrix (e.g., optical model) of the camera. After obtaining the initialized 3D points, camera projection matrix (including Intrinsic Matrix and Extrinsic Matrix), and corresponding 2D features in a set, the re-projection error is minimized. In this latter stage, the Intrinsic Matrix is held fixed and the system repeatedly updates the 3D points and the camera Extrinsic Matrix.
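As a single-view illustration of this refinement stage, the sketch below minimizes re-projection error over the camera extrinsics and the 3D points while holding the Intrinsic Matrix fixed, using SciPy's least-squares solver; the parameterization and function names are assumptions, and a full implementation would jointly refine all cameras and points in each set via Bundle Adjustment.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def refine_extrinsics_and_points(rvec, tvec, points_3d, features_2d, K):
    """Minimize the re-projection error ||x - x_rep||_2 over the camera
    Extrinsic Matrix (Rodrigues rvec, tvec) and the 3D points, with the
    Intrinsic Matrix K held fixed.

    points_3d: (N, 3) initialized 3D points; features_2d: (N, 2) tracked features.
    """
    n = len(points_3d)

    def residuals(params):
        rv, tv = params[:3], params[3:6]
        pts = params[6:].reshape(n, 3)
        proj, _ = cv2.projectPoints(pts, rv, tv, K, None)   # x_rep for every point
        return (proj.reshape(n, 2) - features_2d).ravel()

    x0 = np.hstack([np.ravel(rvec), np.ravel(tvec), np.ravel(points_3d)])
    sol = least_squares(residuals, x0)
    return sol.x[:3], sol.x[3:6], sol.x[6:].reshape(n, 3)
```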
For higher robustness, an additional constraint may be set in some non-limiting embodiments that new feature points must persist across at least two (2) consecutive sets before they are added to the 3D model (e.g., point cloud) of the subject. Higher reconstruction quality may be achieved by setting larger constraint thresholds. Due to the use of SfM, the resulting 3D model of the arm and the camera trajectory are only recovered up to a scale factor. In some examples, the 3D positions may be adjusted to fit into real-world coordinates. In some examples, a calibrated object (such as a ruler or a small, flat fiducial marker (e.g., an AprilTag)) may be placed on the subject's skin during a first set of frames of the video.
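For instance, if two reconstructed points correspond to marks whose physical separation is known (e.g., ruler ticks or the corners of an AprilTag of known size), the metric scale may be recovered with a helper such as the illustrative one below and applied to the entire point cloud and camera trajectory; the 100 mm separation in the usage comment is a made-up value.

```python
import numpy as np

def metric_scale(model_pt_a, model_pt_b, known_length_mm):
    """Factor converting arbitrary SfM units to millimeters, computed from two
    reconstructed 3D points whose true separation (known_length_mm) comes from a
    calibrated object placed on the skin during the first frames."""
    d = np.linalg.norm(np.asarray(model_pt_a, float) - np.asarray(model_pt_b, float))
    return known_length_mm / d

# Example usage (illustrative values): rescale the reconstruction to millimeters.
# scale = metric_scale(p_ruler_0, p_ruler_1, 100.0)
# points_mm = scale * points_model
```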
With continued reference to
In non-limiting embodiments, fiducial markers (e.g., an AprilTag) may be placed on objects (e.g., clinical tools). In this manner, the fiducial marker(s) may be used to accurately track the 3D position of the object during use. The fiducial markers may also be masked while tracking skin surface features. After reconstructing the 3D skin surface during a first portion of the video, as described herein, one or more objects may be introduced. The computing device may continue to execute SfM and Bundle Adjustment algorithms while the object (e.g., a moving ultrasound probe) moves with respect to the skin surface, to accommodate the hand-held movement of the camera and possible skin deformation or subject motion. Continuous tracking of both the skin features and the objects relative to the moving camera allows consistent tracking of the objects relative to the skin features. In some examples, this feature tracking approach may also find POC features on the objects themselves, which may confuse 3D reconstruction of the skin surface. This problem is addressed by first detecting the fiducial marker on the object and then masking out the object from the images (e.g., based on a known geometry) before performing feature detection.
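A minimal sketch of this mask-out step is given below, assuming the opencv-contrib aruco module's AprilTag dictionary (exact aruco API names vary across OpenCV versions); the fixed margin is an illustrative stand-in for masking based on the tool's actual known geometry.

```python
import cv2
import numpy as np

def mask_out_tagged_object(gray, margin_px=40):
    """Detect AprilTag-style fiducials on the tool and black out a region around
    them so that skin-feature detection is not confused by the tool."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_APRILTAG_36h11)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    keep = np.full(gray.shape, 255, np.uint8)   # white = keep for feature detection
    if ids is not None:
        for quad in corners:                    # one (1, 4, 2) corner array per detected tag
            cv2.fillPoly(keep, [quad.reshape(-1, 2).astype(np.int32)], 0)
        # erode the "keep" mask so the masked-out region extends beyond the tag itself
        keep = cv2.erode(keep, np.ones((margin_px, margin_px), np.uint8))
    return cv2.bitwise_and(gray, gray, mask=keep)
```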
Referring now to
At step 504 of
At step 510 of
In non-limiting embodiments, a wide field of view of the camera may introduce spurious objects that should not be tracked (e.g., an additional appendage, tool, or the like). This may be addressed in non-limiting embodiments by automatically identifying which pixels correspond to the subject using human pose tracking and semantic segmentation, as an example. In non-limiting embodiments, a color background (e.g., a blue screen) may be used to isolate the subject's skin by masking the color image. After masking the color image, the image may be converted to grayscale for further processing. In non-limiting embodiments, masks may be applied to mask known objects (e.g., an ultrasound probe).
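One possible color-keying sketch for the blue-screen case is shown below (Python/OpenCV); the HSV thresholds are illustrative and would be tuned to the actual backdrop and lighting.

```python
import cv2

def isolate_skin_on_blue_background(frame_bgr):
    """Mask out a blue-screen background so only the subject remains, then
    convert to grayscale for subsequent feature detection."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    blue = cv2.inRange(hsv, (90, 60, 40), (140, 255, 255))   # assumed blue range
    skin_only = cv2.bitwise_and(frame_bgr, frame_bgr, mask=cv2.bitwise_not(blue))
    return cv2.cvtColor(skin_only, cv2.COLOR_BGR2GRAY)
```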
In non-limiting embodiments, motion blur may be reduced or eliminated by forcing a short shutter speed. For example, the camera may be configured to operate at 120 frames per second (fps), of which every 20th frame (or other interval) is preserved to end up at a target frame rate (e.g., 6 fps). The target frame rate may be desirable because SfM requires some degree of motion within each of the sets Si, which is achieved by setting a lower frame rate resulting in a 0.8 second duration for each Si. The 3D model reconstruction may be updated every several (e.g., four (4)) captured frames (with 6 fps, the fifth and first frames of consecutive sets overlap), and the 3D skin feature tracking may be updated every two-thirds second, as an example. In non-limiting embodiments, rotational invariance may be integrated into the POC tracking of skin features.
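A scale-aware variant of the patch matching (in the spirit of clauses 34-37 above) is sketched below: scaled versions of a tracked skin patch are precomputed as an image pyramid, and the patch from the next frame is matched against each level by its phase-only correlation peak. The scale factors, resampling scheme, and helper names are illustrative, and rotation handling is omitted.

```python
import cv2
import numpy as np

def _poc_peak(a, b):
    """Peak of the phase-only correlation surface between two same-size patches."""
    spec = np.fft.fft2(b) * np.conj(np.fft.fft2(a))
    return np.fft.ifft2(spec / (np.abs(spec) + 1e-12)).real.max()

def build_patch_pyramid(patch, scales=(0.6, 0.8, 1.0, 1.25, 1.6)):
    """Build scaled versions of a tracked patch, each resampled back to the
    original size, so matching tolerates the handheld camera moving toward or
    away from the subject."""
    h, w = patch.shape
    levels = []
    for s in scales:
        scaled = cv2.resize(patch, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
        sh, sw = scaled.shape
        if s >= 1.0:                                  # crop the center back to (h, w)
            y0, x0 = (sh - h) // 2, (sw - w) // 2
            level = scaled[y0:y0 + h, x0:x0 + w]
        else:                                         # pad the smaller patch onto a canvas
            level = np.zeros((h, w), patch.dtype)
            y0, x0 = (h - sh) // 2, (w - sw) // 2
            level[y0:y0 + sh, x0:x0 + sw] = scaled
        levels.append((s, level))
    return levels

def best_pyramid_match(next_patch, pyramid):
    """Return the scale of the pyramid level that correlates most strongly with
    the patch extracted from the next frame."""
    return max(pyramid, key=lambda level: _poc_peak(level[1], next_patch))[0]
```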
Referring now to
With continued reference to
Device 900 may perform one or more processes described herein. Device 900 may perform these processes based on processor 904 executing software instructions stored by a computer-readable medium, such as memory 906 and/or storage component 908. A computer-readable medium may include any non-transitory memory device. A memory device includes memory space located inside of a single physical storage device or memory space spread across multiple physical storage devices. Software instructions may be read into memory 906 and/or storage component 908 from another computer-readable medium or from another device via communication interface 914. When executed, software instructions stored in memory 906 and/or storage component 908 may cause processor 904 to perform one or more processes described herein. Additionally, or alternatively, hardwired circuitry may be used in place of or in combination with software instructions to perform one or more processes described herein. Thus, embodiments described herein are not limited to any specific combination of hardware circuitry and software. The term “programmed or configured,” as used herein, refers to an arrangement of software, hardware circuitry, or any combination thereof on one or more devices.
Although embodiments have been described in detail for the purpose of illustration, it is to be understood that such detail is solely for that purpose and that the disclosure is not limited to the disclosed embodiments, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the appended claims. For example, it is to be understood that the present disclosure contemplates that, to the extent possible, one or more features of any embodiment can be combined with one or more features of any other embodiment.
This application claims priority to U.S. Provisional Patent Application No. 63/156,521, filed Mar. 4, 2021, the disclosure of which is incorporated herein by reference in its entirety.
This invention was made with Government support under 1R01EY021641 awarded by the National Institutes of Health, and W81XWH-14-1-0370 and W81XWH-14-1-0371 awarded by the Department of Defense. The Government has certain rights in the invention.