This invention pertains to acoustic (e.g., ultrasound) imaging, and in particular a system, device and method which may generate a three dimensional acoustic image by compounding a series of two dimensional acoustic images via deep learning.
Acoustic (e.g., ultrasound) imaging systems are increasingly being employed in a variety of applications and contexts.
Acoustic imaging is inherently based on hand-held acoustic probe motion and positioning, thus lacking the absolute three dimensional (3D) reference frame and anatomical context of other modalities such as computed tomography (CT) and magnetic resonance imaging (MRI). This makes interpreting the acoustic images (which are typically two dimensional (2D)) in three dimensions challenging. In addition, it is often desirable to have 3D views of structures, but 3D acoustic imaging is relatively expensive and less commonly used.
In order to obtain 3D volumetric acoustic images from a series of 2D acoustic images, one needs to know the relative position and orientation (herein together referred to as the "pose") of all of the 2D acoustic images with respect to each other. In the past, when these 2D acoustic images were obtained from a hand-held 2D acoustic probe, spatial tracking of the probe has been required in order to obtain the relative pose for each 2D acoustic image and "reconstruct" a 3D volumetric acoustic image from the sequence of individual, spatially localized, 2D images.
Until now, this has required additional hardware, such as optical or electromagnetic (EM) tracking systems, and involved additional work steps and time to set up and calibrate the system, adding expense and time to the imaging procedure. In order to obtain a registration between acoustic and another imaging modality, for example, it is required to identify common fiducials, common anatomical landmarks, or perform a registration based on image contents, all of which can be challenging, time consuming, and prone to error. A tracking system also typically puts constraints on how the acoustic probe can be used, e.g. by limiting the range of motion. Fully “internal” tracking systems, e.g. based on inertial sensors, exist but are limited in accuracy, suffer from long-term drift, and do not provide an absolute coordinate reference needed to relate or register the acoustic image information to image data obtained via other modalities.
These barriers have significantly impeded the adoption of 3D acoustic imaging in clinical settings.
Accordingly, it would be desirable to provide a system and a method which can address these challenges. In particular, it would be desirable to provide a system and method which can compound a series of 2D acoustic images which were acquired without spatial tracking, to produce a 3D acoustic image.
In one aspect of this disclosure, a system comprises: an acoustic probe and an acoustic imaging instrument. The acoustic probe has an array of acoustic transducer elements, and the acoustic probe is not associated with any tracking device. The acoustic probe is configured to transmit one or more acoustic signals to a region of interest (ROI) in a subject and is further configured to receive acoustic echoes from the region of interest. The acoustic imaging instrument is connected to the acoustic probe, and comprises an instrument communication interface and a processing system. The instrument communication interface is configured to provide transmit signals to at least some of the acoustic transducer elements to cause the array of acoustic transducer elements to transmit the one or more acoustic signals to the ROI in the subject, and further configured to receive one or more image signals from the acoustic probe produced from the acoustic echoes from the region of interest. The processing system includes memory, and is configured to: acquire a series of two dimensional acoustic images of the ROI in the subject from the image signals received from the acoustic probe without spatial tracking of the acoustic probe; and predict a pose for each of the two dimensional acoustic images of the ROI in the subject, with respect to a standardized three dimensional coordinate system, based on a plurality of previously-obtained two dimensional acoustic images of corresponding ROIs in a plurality of other subjects which were obtained with spatial tracking. In certain embodiments, the two dimensional acoustic images are applied to a convolutional neural network (CNN) which has been trained using the plurality of previously-obtained two dimensional acoustic images of corresponding ROIs in the plurality of other subjects which were obtained with spatial tracking. The predicted pose for each of the two dimensional acoustic images of the ROI in the subject with respect to the standardized three dimensional coordinate system may then be used to produce a three dimensional acoustic image of the ROI in the subject from the series of two dimensional acoustic images of the ROI of the subject.
In some embodiments, the system further comprises a display device, and the system is configured to display on the display device a representation of the three dimensional acoustic image of the ROI in the subject.
In some embodiments, the system is configured to use the predicted poses to display on a display device a plurality of the two dimensional acoustic images relative to each other in the ROI.
In some embodiments, the system is configured to: access a three dimensional reference image obtained using a different imaging modality than acoustic imaging; register the three dimensional acoustic image to the three dimensional reference image; and display on a display device the three dimensional acoustic image and the three dimensional reference image, registered with each other.
In some versions of these embodiments, the system is configured to superimpose the three dimensional acoustic image and the three dimensional reference image with each other on the display device.
In some embodiments, the ROI in the subject includes a reference structure, and the system is configured to: segment the reference structure in the three dimensional acoustic image of the ROI of the subject; register the segmented reference structure to a generic statistical model of the reference structure; and display on the display device at least one of the two dimensional images of the ROI in the subject relative to the generic statistical model of the reference structure.
In some embodiments, the system is configured to: generate one or more cut-plane views from the three dimensional acoustic image which are not coplanar with any of the two dimensional images of the ROI in the subject, and display on a display device the one or more cut-plane views.
In another aspect of this disclosure, a method comprises: employing an acoustic probe to acquire a series of two dimensional acoustic images of a region of interest (ROI) in a subject without spatial tracking of the acoustic probe; predicting (1030) a pose for each of the two dimensional acoustic images of the ROI in the subject with respect to a standardized three dimensional coordinate system (500) based on a plurality of previously-obtained two dimensional acoustic images of corresponding ROIs in a plurality of other subjects which were obtained with spatial tracking; and using the predicted pose for each of the two dimensional acoustic images of the ROI in the subject with respect to the standardized three dimensional coordinate system to produce a three dimensional acoustic image of the ROI in the subject from the series of two dimensional acoustic images of the ROI of the subject.
In some embodiments, the pose may be predicted by applying two dimensional acoustic images to a convolutional neural network which has been trained using a plurality of previously-obtained two dimensional acoustic images of corresponding ROIs in a plurality of other subjects which were obtained with spatial tracking; the convolutional neural network predicting a pose for each of the two dimensional acoustic images of the ROI in the subject with respect to a standardized three dimensional coordinate system.
In some embodiments, the method further comprises displaying on a display device a representation of the three dimensional acoustic image of the ROI in the subject.
In some embodiments, the method further comprises using the predicted poses to display on the display device a plurality of the two dimensional acoustic images relative to each other in the ROI.
In some embodiments, the method further comprises: accessing a three dimensional reference image obtained using a different imaging modality than acoustic imaging; registering the three dimensional acoustic image to the three dimensional reference image; and displaying on the display device the three dimensional acoustic image and the three dimensional reference image, registered with each other.
In some embodiments, the method further comprises superimposing the three dimensional acoustic image and the three dimensional reference image with each other on the display device.
In some embodiments, the ROI in the subject includes a reference structure, and the method further comprises: segmenting the reference structure in the three dimensional acoustic image of the ROI of the subject; registering the segmented reference structure to a generic statistical model of the reference structure; and displaying on a display device at least one of the two dimensional images of the ROI in the subject relative to the generic statistical model of the reference structure.
In some embodiments, the method further comprises: generating one or more cut-plane views from the three dimensional acoustic image which are not coplanar with any of the two dimensional images of the ROI in the subject, and displaying on the display device the one or more cut-plane views.
In yet another aspect of the disclosure, a method comprises: obtaining a plurality of series of spatially tracked two dimensional acoustic images of a region of interest (ROI) in a corresponding plurality of subjects; for each series of spatially tracked two dimensional acoustic images, constructing a three dimensional volumetric acoustic image of the ROI in the corresponding subject; segmenting a reference structure within each of the three dimensional volumetric acoustic images of the ROI; defining a corresponding acoustic image three dimensional coordinate system for each of the three dimensional volumetric acoustic images, based on the segmentation; defining a standardized three dimensional coordinate system for the ROI; determining for each of the spatially tracked two dimensional acoustic images of the ROI in the plurality of series its actual pose in the standardized three dimensional coordinate system, using: a pose of the spatially tracked two dimensional acoustic image in the acoustic image three dimensional coordinate system corresponding to the spatially tracked two dimensional acoustic image, and a coordinate system transformation from the corresponding acoustic image three dimensional coordinate system to the standardized three dimensional coordinate system; providing, to a convolutional neural network, the spatially tracked two dimensional acoustic images of the ROI from the plurality of series, wherein the convolutional neural network generates a predicted pose in the standardized three dimensional coordinate system for each of the provided spatially tracked two dimensional acoustic images; and performing an optimization process on the convolutional neural network to minimize differences between the predicted poses and the actual poses for all of the provided spatially tracked two dimensional acoustic images.
In some embodiments, the reference structure is an organ, and segmenting the reference structure in each of the three dimensional volumetric acoustic images of the ROI comprises segmenting the organ in the three dimensional volumetric acoustic image.
In some embodiments, defining the standardized three dimensional coordinate system for the ROI comprises: defining an origin for the standardized three dimensional coordinate system at a centroid of the segmented organ; and defining three mutually orthogonal axes of the standardized three dimensional coordinate system to be aligned with axial, coronal, and sagittal planes of the organ.
In some embodiments, defining the standardized three dimensional coordinate system for the ROI comprises selecting an origin and three mutually orthogonal axes for the standardized three dimensional coordinate system based on a priori knowledge about the reference structure.
In some embodiments, the provided spatially tracked two dimensional acoustic images are randomly selected from the plurality of series of spatially tracked two dimensional acoustic images of the ROI in the corresponding plurality of subjects.
In some embodiments, obtaining the series of spatially tracked two dimensional acoustic images of the ROI in the subject comprises receiving one or more imaging signals from an acoustic probe in conjunction with receiving an inertial measurement signal from an inertial measurement unit which spatially tracks movement of the acoustic probe while it provides the one or more imaging signals.
Acoustic imaging system 200 may be employed in a method of fusing acoustic images obtained in the absence of any tracking devices or systems. In some embodiments, acoustic imaging system 200 may utilize images obtained via other imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), cone beam computed tomography (CBCT), etc. Elements of acoustic imaging system 200 may be constructed utilizing hardware (i.e., circuitry), software, or a combination of hardware and software.
Processing system 30 includes a processor 300 connected to one or more external memory devices by an external bus 316.
Processor 300 may be any suitable processor type including, but not limited to, a microprocessor, a microcontroller, a digital signal processor (DSP), a field programmable gate array (FPGA) where the FPGA has been programmed to form a processor, a graphics processing unit (GPU), an application specific integrated circuit (ASIC) where the ASIC has been designed to form a processor, or a combination thereof.
Processor 300 may include one or more cores 302. The core 302 may include one or more arithmetic logic units (ALU) 304. In some embodiments, the core 302 may include a floating point logic unit (FPLU) 306 and/or a digital signal processing unit (DSPU) 308 in addition to or instead of the ALU 304.
Processor 300 may include one or more registers 312 communicatively coupled to the core 302. The registers 312 may be implemented using dedicated logic gate circuits (e.g., flip-flops) and/or any memory technology. In some embodiments the registers 312 may be implemented using static memory. The registers 312 may provide data, instructions and addresses to the core 302.
In some embodiments, processor 300 may include one or more levels of cache memory 310 communicatively coupled to the core 302. The cache memory 310 may provide computer-readable instructions to the core 302 for execution. The cache memory 310 may provide data for processing by the core 302. In some embodiments, the computer-readable instructions may have been provided to the cache memory 310 by a local memory, for example, local memory attached to the external bus 316. The cache memory 310 may be implemented with any suitable cache memory type, for example, metal-oxide semiconductor (MOS) memory such as static random access memory (SRAM), dynamic random access memory (DRAM), and/or any other suitable memory technology.
Processor 300 may include a controller 314, which may control input to the processor 300 from other processors and/or components included in a system (e.g., user interface 214).
Registers 312 and the cache 310 may communicate with controller 314 and core 302 via internal connections 320A, 320B, 320C and 320D. Internal connections may be implemented as a bus, multiplexor, crossbar switch, and/or any other suitable connection technology.
Inputs and outputs for processor 300 may be provided via a bus 316, which may include one or more conductive lines. The bus 316 may be communicatively coupled to one or more components of processor 300, for example the controller 314, cache 310, and/or register 312.
Bus 316 may be coupled to one or more external memories. The external memories may include Read Only Memory (ROM) 332. ROM 332 may be a masked ROM, Electronically Programmable Read Only Memory (EPROM) or any other suitable technology. The external memory may include Random Access Memory (RAM) 333. RAM 333 may be a static RAM, battery backed up static RAM, Dynamic RAM (DRAM) or any other suitable technology. The external memory may include Electrically Erasable Programmable Read Only Memory (EEPROM) 335. The external memory may include Flash memory 334. The external memory may include a magnetic storage device such as disc 336. In some embodiments, the external memories may be included in a system, such as acoustic imaging system 200.
It should be understood that in various embodiments, acoustic imaging system 200 may be configured differently than described below.
In various embodiments, processor 212 may include various combinations of a microprocessor (and associated memory), a digital signal processor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), digital circuits and/or analog circuits. Memory (e.g., nonvolatile memory), associated with processor 212, may store therein computer-readable instructions which cause a microprocessor of processor 212 to execute an algorithm to control acoustic imaging system 200 to perform one or more operations or methods which are described in greater detail below. In some embodiments, a microprocessor may execute an operating system. In some embodiments, a microprocessor may execute instructions which present a user of acoustic imaging system 200 with a graphical user interface (GUI) via user interface 214 and display device 216.
In various embodiments, user interface 214 may include any combination of a keyboard, keypad, mouse, trackball, stylus/touch pen, joystick, microphone, speaker, touchscreen, one or more switches, one or more knobs, one or more buttons, one or more lights, etc. In some embodiments, a microprocessor of processor 212 may execute a software algorithm which provides voice recognition of a user's commands via a microphone of user interface 214.
Display device 216 may comprise a display screen of any convenient technology (e.g., liquid crystal display). In some embodiments the display screen may be a touchscreen device, also forming part of user interface 214.
Beneficially, as described below, acoustic imaging instrument 210 may cause at least some of acoustic transducer elements 422 of acoustic probe 220 to transmit an acoustic probe signal 295 to an area of interest 290 in a subject.
Also, at least some of acoustic transducer elements 422 of acoustic probe 220 receive acoustic echoes from area of interest 290 in response to acoustic probe signal 295 and convert the received acoustic echoes to one or more electrical signals representing an acoustic image of area of interest 290, in particular a two dimensional (2D) acoustic image. These electrical signals may be processed further by acoustic probe 220 and communicated by a probe communication interface 428 of acoustic probe 220 to receive unit 215 of acoustic imaging instrument 210.
Receive unit 215 is configured to receive the one or more acoustic image signals from acoustic probe 220 via probe communication interface 428 and to process the acoustic image signal(s) to produce acoustic image data from which 2D acoustic images may be produced. In some embodiments, receive unit 215 may include various circuits as are known in the art, such as one or more amplifiers, one or more A/D conversion circuits, and a phasing addition circuit, for example. The amplifiers may be circuits for amplifying the acoustic image signals at amplification factors for the individual paths corresponding to the transducer elements 422. The A/D conversion circuits may be circuits for performing analog-to-digital (A/D) conversion on the amplified acoustic image signals. The phasing addition circuit adjusts the time phases of the amplified and A/D-converted acoustic image signals by applying delay times to the individual paths respectively corresponding to the transducer elements 422, and generates acoustic data by adding the adjusted received signals (phasing addition). The acoustic data may be stored in memory associated with acoustic imaging instrument 210.
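By way of illustration only, the following is a minimal NumPy sketch of the phasing addition (delay-and-sum) step described above, assuming digitized per-channel echo signals and precomputed, non-negative integer sample delays; the function and variable names are illustrative and do not correspond to any particular circuit of receive unit 215.

```python
import numpy as np

def phasing_addition(rf, delays_samples):
    """Delay-and-sum one frame of per-channel RF data.

    rf             : (n_channels, n_samples) array of digitized echo signals
    delays_samples : (n_channels,) non-negative integer delay per channel
    Returns the summed (beamformed) line of length n_samples.
    """
    n_channels, n_samples = rf.shape
    summed = np.zeros(n_samples)
    for ch in range(n_channels):
        d = int(delays_samples[ch])
        # Shift channel ch by its delay, zero-padding the start of the line.
        shifted = np.zeros(n_samples)
        shifted[d:] = rf[ch, :n_samples - d]
        summed += shifted
    return summed
```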
Processor 212 may reconstruct the acoustic data received from receive unit 215 into a 2D acoustic image corresponding to an acoustic image plane which intercepts area of interest 290, and may subsequently cause display device 216 to display this 2D acoustic image.
The reconstructed 2D acoustic image may, for example, be an ultrasound brightness-mode ("B-mode") image, otherwise known as a "2D mode" image, a "C-mode" image, or a Doppler mode image, or indeed any acoustic image.
In various embodiments, processing system 212 may include a processor (e.g., processor 300) which may execute software in one or more modules for performing one or more algorithms or methods as described in greater detail below.
Of course it is understood that acoustic imaging instrument 210 may include a number of other elements not shown in the figures.
In some embodiments, acoustic imaging instrument 210 also receives an inertial measurement signal from an inertial measurement unit (IMU) included in or associated with acoustic probe 220. The inertial measurement signal may indicate an orientation or pose of acoustic probe 220. The inertial measurement unit may include a hardware circuit, a hardware sensor, or a microelectromechanical systems (MEMS) device. The inertial measurement unit may also include a processor, such as processor 300, running software in conjunction with a hardware sensor or MEMS device.
In other embodiments, acoustic imaging instrument 210 does not receive any inertial measurement signal, but may determine a relative orientation or pose of acoustic probe 220 as described in greater detail below.
Acoustic probe 220 includes an array of acoustic transducer elements 422, a beamformer 424, a signal processor 426, and a probe communication interface 428.
In some embodiments, particularly in the case of an embodiment of acoustic probe 220 and acoustic imaging system 200 which is used in a training phase of a process or method as described in greater detail below, acoustic probe 220 may include or be associated with a tracking device or system (e.g., IMU 421, an optical tracker, or an electromagnetic (EM) tracker) which spatially tracks acoustic probe 220 while the 2D acoustic images are acquired.
In other embodiments, particularly in the case of an embodiment of acoustic probe 220 and acoustic imaging system 200 which is used in an application phase of a process or method as described in greater detail below, acoustic probe 220 does not include and is not associated with any tracking device or system.
Disclosed in greater detail below are arrangements based on acoustic imaging systems such as acoustic imaging system 200 which may be employed in a method of processing a series of 2D acoustic images, obtained in the absence of any tracking devices or systems, and generating therefrom a 3D acoustic image.
In some embodiments, these arrangements include what is referred to herein as a “training framework” and what is referred to herein as an “application framework.”
The training framework may execute a training process, as described in greater detail below.
The application framework may execute an application process, as described in greater detail below.
In some embodiments, the training framework may be established in a factory or laboratory setting, and training data obtained thereby may be stored on a data storage device, such as any of the external memories discussed above.
In some embodiments, the application framework may be defined in a clinical setting wherein an embodiment of acoustic imaging system 200 which does not include or utilize IMU 421 or other tracking device or system is used by a physician or clinician to obtain acoustic images of a subject or patient. In various embodiments, the data storage device which stores the optimized parameters for the convolutional neural network 600 may be included in, or connected either directly or via a computer network (including in some embodiments the Internet) to, an embodiment of acoustic imaging system 200 which executes the application framework. In some embodiments, optimized parameters for the convolutional neural network 600 may be "hardwired" into the convolutional neural network 600 of acoustic imaging system 200.
Summaries of embodiments of the training framework and the application framework will now be provided, followed by more detailed descriptions thereof.
In some embodiments, the following operations may be performed within the training framework.
While the example above uses a particular reference structure, it should be understood that the standardized 3D coordinate system could also have an origin at a vessel bifurcation and an axis oriented along one or two vessels; it could also have an origin at a distinguishable anatomical landmark, such as a bony structure, etc. Anything that can be manually or automatically defined in the 3D acoustic image of the ROI and related to the 3D acoustic image can be employed to define the standardized 3D coordinate system.
Convolutional neural network 600 may be trained batch-wise on the task of regressing the rigid transformation given an input 2D ultrasound image.
During training, the data input to convolutional neural network 600 is a 2D ultrasound image together with the ground truth pose of that 2D acoustic image with respect to a standardized 3D coordinate system; that is, the input to the training framework is pairs or tuples of (2D acoustic image, ground truth pose). The input to the CNN is the image and the output is a prediction of the pose. The optimizer in the training framework modifies the CNN's parameters so that the prediction for each image approximates the corresponding known ground truth in an optimal way (e.g., minimizing the sum of absolute differences of the pose parameters between prediction and ground truth). In operation after training, convolutional neural network 600 takes a currently produced 2D acoustic image of a subject and predicts the rigid transformation to yield a predicted pose for the 2D acoustic image in the standardized 3D coordinate system.
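As a purely illustrative sketch of the optimization criterion mentioned above, the following PyTorch snippet computes the sum of absolute differences between predicted and ground-truth pose parameters for one (image, ground truth pose) pair; the six-parameter pose encoding and the name `cnn` are assumptions for the sketch, not the actual configuration of convolutional neural network 600.

```python
import torch

def pose_loss(pred_pose, gt_pose):
    """Sum of absolute differences of the pose parameters (prediction vs. ground truth)."""
    return torch.abs(pred_pose - gt_pose).sum()

# Illustrative use for a single training pair (image, ground-truth pose):
# pred_pose = cnn(image.unsqueeze(0))   # (1, 6): e.g., three translations, three rotations
# loss = pose_loss(pred_pose, gt_pose)  # the quantity the optimizer seeks to minimize
# loss.backward()                       # propagate errors to update the CNN parameters
```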
Accordingly, the training framework automatically generates a training dataset of 2D acoustic images of a region or organ of interest, and corresponding actual poses of those 2D acoustic images in the standardized 3D coordinate system. The training framework then uses the training dataset to train a neural network (e.g., convolutional neural network 600), optimizing the neural network's ability to predict poses for other 2D acoustic images of the region (or, e.g., organ) of interest.
In some embodiments, the following operations may be performed within the application framework.
In some embodiments, the 3D acoustic volume reconstruction can then be used to, e.g., make measurements of an organ in all three dimensions, obtain arbitrary "cut plane" views (also known as multi-planar reconstructions, or MPRs) of an organ, or register the 3D acoustic volume reconstruction with a model of an organ or with another image obtained with a different imaging modality (e.g., computed tomography (CT) or magnetic resonance imaging (MRI)).
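The following is a minimal sketch of extracting such an arbitrary cut-plane view (MPR) from a reconstructed volume, assuming the volume is available as a NumPy array indexed (z, y, x) and that the plane is defined by an origin and two orthonormal in-plane direction vectors in voxel units; the names and conventions are illustrative.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def cut_plane(volume, origin, u, v, size=(256, 256), spacing=1.0):
    """Sample an arbitrary cut plane (MPR) from a (Z, Y, X) volume.

    origin : (3,) voxel coordinates (z, y, x) of one corner of the plane
    u, v   : (3,) orthonormal in-plane direction vectors, also in (z, y, x) order
    """
    origin = np.asarray(origin, dtype=float)
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    rows, cols = size
    r = np.arange(rows)[:, None] * spacing          # distances along u
    c = np.arange(cols)[None, :] * spacing          # distances along v
    # Voxel coordinates of every sample on the plane: origin + r*u + c*v.
    coords = (origin[:, None, None]
              + u[:, None, None] * r[None, :, :]
              + v[:, None, None] * c[None, :, :])
    # Linear interpolation of the volume at those coordinates.
    return map_coordinates(volume, coords, order=1, mode='constant', cval=0.0)
```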
Various components of systems implementing the training framework and the application framework will now be described in greater detail.
Some embodiments of the training framework utilize a training dataset, a dataset processing controller (DPC), and a network training controller (NTC). In some embodiments, the DPC and/or the NTC may comprise a processing system such as processing system 30 described above.
The training dataset consists of a collection of spatially tracked 2D acoustic image sweeps over a specific part of the anatomy (e.g., an organ) in a subject population (beneficially a population of at least twenty subjects). Beneficially, the subject population exhibits variations in age, size of the anatomy, pathology, etc. 3D acoustic volumes are reconstructed from the 2D acoustic images using methods which are known in the art (e.g., as disclosed by Huang et al., cited above). The acoustic probe (e.g., acoustic probe 220) which is used with an acoustic imaging system (e.g., acoustic imaging system 200) to obtain the spatially tracked 2D acoustic image sweeps can be tracked using one of the position measurement systems known in the art, such as optical tracking devices or systems, EM tracking devices or systems, IMU-based tracking, etc. Based on the spatial tracking of the acoustic probe while acquiring the 2D acoustic images, the transformation describing the pose of each 2D acoustic image Si relative to the reconstructed 3D acoustic volume, T2DUS_to_3DUS, is known.
The DPC is configured to: load a single case from the training dataset; segment the area of interest or organ of interest from the 3D acoustic images; based on the segmented mask, create a mesh using, e.g., a marching cubes algorithm that is known in the art; and based on the mesh, define a standardized 3D coordinate system.
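A minimal sketch of one way the DPC's mesh and coordinate-system steps could be realized is given below, assuming a binary segmentation mask and using scikit-image's marching cubes plus a principal-component analysis of the mesh vertices to pick an origin (the centroid) and three mutually orthogonal axes; the actual axis convention (e.g., alignment with anatomical planes) may differ, and the names are illustrative.

```python
import numpy as np
from skimage.measure import marching_cubes

def standardized_coordinate_system(mask):
    """Derive an origin and three mutually orthogonal axes from a binary 3D mask.

    mask : (Z, Y, X) array, nonzero inside the segmented reference structure.
    Returns (origin, axes), where axes is a 3x3 matrix whose rows are unit vectors.
    """
    verts, faces, normals, values = marching_cubes(mask.astype(np.float32), level=0.5)
    origin = verts.mean(axis=0)                      # centroid of the surface mesh
    centered = verts - origin
    cov = centered.T @ centered / len(centered)      # covariance of the mesh vertices
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    axes = eigvecs.T[::-1]                           # rows: principal axes, longest first
    return origin, axes
```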
Optionally the DPC may preprocess one or more 2D acoustic images, for example by cropping the 2D acoustic image to a relevant rectangular region of interest.
The DPC may also compute the actual pose Ti of each (potentially pre-processed) 2D acoustic image relative to the standardized 3D coordinate system using the equation:
Ti = T3DUS_to_standardized * Ttracking_to_3DUS * T2DUS_to_tracking,
where T2DUS_to_tracking is the pose of the (potentially cropped) 2D acoustic image in tracking space, Ttracking_to_3DUS is the pose of the 3D acoustic image in the tracking space, and T3DUS_to_standardized is the pose of the 3D acoustic image in the standardized (segmentation-based) 3D coordinate system, as described above.
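Expressed with 4x4 homogeneous matrices, the composition above is a single chain of matrix products; the following NumPy sketch is illustrative, including the assumption that a pixel of Si lies in the z = 0 plane of its own image coordinate system (in millimeters).

```python
import numpy as np

def actual_pose(T_3dus_to_standardized, T_tracking_to_3dus, T_2dus_to_tracking):
    """Compose Ti = T3DUS_to_standardized * Ttracking_to_3DUS * T2DUS_to_tracking (4x4 matrices)."""
    return T_3dus_to_standardized @ T_tracking_to_3dus @ T_2dus_to_tracking

def image_point_to_standardized(Ti, px_mm, py_mm):
    """Map an in-plane point of 2D image Si (in mm, image assumed to lie in its z = 0 plane)
    into the standardized 3D coordinate system."""
    p = np.array([px_mm, py_mm, 0.0, 1.0])
    return (Ti @ p)[:3]
```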
At the end of these operations, a large set of 2-tuples di may be provided:
di = (Si, Ti),
where Si is an input ultrasound image and Ti is a rigid transformation describing the position and orientation (herein referred to as the "actual pose") of the ultrasound image Si in the standardized 3D coordinate system. The DPC provides this set of 2-tuples di to the NTC.
The NTC is configured to: receive the set of 2-tuples from the DPC, and batch-wise train the CNN using sets of the provided 2-tuples, that is, to optimize the parameters/weights of the CNN to minimize differences between the predicted poses of the 2D acoustic images, which are output by the CNN, and the actual poses of all of the spatially tracked 2D acoustic images for all of the subjects, which are obtained as described above. The NTC may comprise a processing system such as processing system 30 described above.
Thus, the output of the training framework may be an optimized set of parameters/weights for the CNN which maximizes the accuracy with which the CNN predicts unknown poses of 2D acoustic images which are input to it.
Some embodiments of the application framework utilize: an acoustic imaging system (e.g., acoustic imaging system 200); a pose prediction controller (PPC); and a multi-modality imaging controller (MMIC). In some embodiments, the PPC and/or the MMIC may comprise a processing system such as processing system 30 described above.
In some embodiments, the acoustic imaging system may include the PPC and/or the multi-modality imaging controller as part of a processing system (e.g., processing system 212) of the acoustic imaging system.
The acoustic imaging system preferably acquires a sequence of 2D acoustic images of a region of interest, which may include an organ of interest, in the human body. The acoustic imaging system employs an acoustic probe, which in some embodiments may be a hand-held transrectal ultrasound (TRUS) or transthoracic echocardiography (TTE) transducer. Whatever acoustic probe is employed, it does not include and is not associated with any tracking device, such as an IMU, EM tracker, optical tracker, etc. In other words, the acoustic imaging system does not acquire any tracking, location, orientation, or pose information for the acoustic probe as the acoustic probe is used to gather acoustic image data for the 2D acoustic images.
The PPC includes a deep neural network, for example a convolutional neural network (CNN) consisting of a single intermediate layer or a plurality of intermediate layers followed by a final regression layer.
The PPC is configured to provide the CNN with an input 2D acoustic image, and to obtain from the CNN as an output the predicted pose of the 2D acoustic image in the standardized coordinate system.
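For illustration, a small PyTorch network with a few intermediate convolutional layers and a final regression layer producing six rigid-pose parameters might look as follows; the layer sizes and the six-parameter encoding are assumptions for the sketch and are not the actual architecture of the CNN used by the PPC.

```python
import torch
import torch.nn as nn

class PoseRegressionCNN(nn.Module):
    """Intermediate convolutional layers followed by a regression head (6 pose parameters)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.regressor = nn.Linear(64, 6)    # e.g., tx, ty, tz and three rotation angles

    def forward(self, x):                    # x: (batch, 1, H, W) 2D acoustic images
        f = self.features(x).flatten(1)      # (batch, 64) pooled feature vector
        return self.regressor(f)             # (batch, 6) predicted pose parameters

# Illustrative use: predict the pose of one 2D acoustic image.
# pose = PoseRegressionCNN()(torch.randn(1, 1, 224, 224))   # -> tensor of shape (1, 6)
```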
A volume reconstruction controller (VRC) is configured to reconstruct a 3D acoustic image of the ROI or a reference structure (e.g., an organ) in the ROI from the sequence of 2D acoustic images and their poses predicted by the convolutional neural network, using methods known in the art as described above.
Some embodiments of the application framework include an intraoperative acoustic imaging modality, the VRC and a display such as display device 216.
The intraoperative acoustic imaging modality may include a 2D acoustic probe as described above and may acquire a sequence or sweep of 2D acoustic images of an ROI in real time, without spatial tracking, and send the 2D acoustic images to the VRC.
The VRC may receive the 2D acoustic images and provide them to a trained convolutional neural network (CNN) which predicts a rigid transformation that describes the pose (position and orientation) of each 2D acoustic image with respect to a standardized 3D coordinate system. The VRC may use the 2D acoustic images and their corresponding poses, provided by the trained CNN, to reconstruct a 3D acoustic image of the ROI via a volume compounding controller (VCC), using methods known in the art as described above. The VRC and the VCC may comprise a processing system such as processing system 30 described above.
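A highly simplified, nearest-voxel sketch of such volume compounding is shown below, assuming each predicted pose is available as a 4x4 matrix mapping image coordinates (in millimeters, with the image in its own z = 0 plane) into the standardized coordinate system; practical implementations typically interpolate and fill gaps more carefully, and all names are illustrative.

```python
import numpy as np

def compound_volume(images, poses, grid_shape, grid_origin_mm, voxel_size_mm, pixel_size_mm):
    """Scatter posed 2D acoustic images into a (Z, Y, X) voxel grid, averaging overlaps.

    images : list of (H, W) arrays of pixel intensities
    poses  : list of 4x4 matrices mapping image coordinates (mm) to the standardized system (mm)
    """
    volume = np.zeros(grid_shape, dtype=np.float32)
    counts = np.zeros(grid_shape, dtype=np.float32)
    grid_origin_mm = np.asarray(grid_origin_mm, dtype=float)
    dims = np.array(grid_shape)                                    # (Z, Y, X)
    for img, T in zip(images, poses):
        H, W = img.shape
        ys, xs = np.mgrid[0:H, 0:W]
        # Homogeneous image-plane coordinates (x, y, z=0, 1) in millimeters.
        pts = np.stack([xs * pixel_size_mm, ys * pixel_size_mm,
                        np.zeros_like(xs, dtype=float), np.ones_like(xs, dtype=float)], axis=0)
        world = (T @ pts.reshape(4, -1))[:3]                       # (3, H*W) standardized (x, y, z)
        idx = np.round((world - grid_origin_mm[:, None]) / voxel_size_mm).astype(int)
        idx = idx[::-1]                                            # reorder rows to (z, y, x)
        ok = np.all((idx >= 0) & (idx < dims[:, None]), axis=0)    # keep in-grid samples only
        iz, iy, ix = idx[0, ok], idx[1, ok], idx[2, ok]
        np.add.at(volume, (iz, iy, ix), img.reshape(-1)[ok])       # accumulate intensities
        np.add.at(counts, (iz, iy, ix), 1.0)
    return volume / np.maximum(counts, 1.0)                       # average where slices overlap
```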
The display device 216 may display the 3D volumetric acoustic image to a user, for example in conjunction with an acoustic imaging system such as acoustic imaging system 200, allowing the user and/or the system to, for example: visualize and verify the reconstruction; perform volumetric measurements; plan a procedure; register the 3D acoustic image with 3D images obtained using other imaging modalities (e.g., CT or MRI) for improved diagnosis or guidance of therapy; display in real time the positioning of the 2D acoustic images on the reconstructed 3D acoustic image; and/or provide feedback to the user regarding the 2D acoustic images relative to a standardized coordinate system shown within the reconstructed 3D acoustic image volume.
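As one illustrative way to perform such a multi-modality registration, the following SimpleITK sketch rigidly registers a reconstructed 3D acoustic volume to a CT (or MRI) volume using a mutual-information metric; the file names, metric, and optimizer settings are assumptions for the sketch rather than a prescribed workflow.

```python
import SimpleITK as sitk

fixed = sitk.ReadImage("reference_ct.mha", sitk.sitkFloat32)       # e.g., CT or MRI volume
moving = sitk.ReadImage("reconstructed_us.mha", sitk.sitkFloat32)  # 3D acoustic reconstruction

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=32)   # suits multi-modality data
reg.SetOptimizerAsRegularStepGradientDescent(learningRate=1.0, minStep=1e-4,
                                             numberOfIterations=200)
reg.SetInterpolator(sitk.sitkLinear)
reg.SetInitialTransform(sitk.CenteredTransformInitializer(
    fixed, moving, sitk.Euler3DTransform(),
    sitk.CenteredTransformInitializerFilter.GEOMETRY))

rigid = reg.Execute(fixed, moving)                                  # estimated rigid transform
aligned = sitk.Resample(moving, fixed, rigid, sitk.sitkLinear, 0.0) # ultrasound resampled onto CT grid
```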
The corresponding figure shows an image 810 on its left hand side and, for comparison, an image 820 on its right hand side.
The pose predictions for the sequence of two dimensional acoustic images may be used to construct a three dimensional acoustic image of a volume in the region of interest, which can be used, e.g., to: perform volumetric measurements; and/or to create extended three dimensional acoustic imaging fields of view to show entire organs or other structures which are too large to be captured in a single two dimensional or three dimensional acoustic image.
An operation 905 includes defining a standardized three dimensional coordinate system for a region of interest (ROI) in a subject's body. The ROI may include a reference structure having a known shape and orientation in the body, for example an organ, a bone, a joint, one or more blood vessels, etc. In some embodiments, the standardized three dimensional coordinate system for the ROI may be defined by selecting an origin and three mutually orthogonal axes for the standardized three dimensional coordinate system based on a priori knowledge about an abstract reference structure (e.g., an abstract organ, such as a liver) in the ROI. Operation 905 may be performed using methods described above.
An operation 910 includes selecting a first subject for the subsequent operations 915 through 940. Here the first subject may be selected in any convenient way, for example randomly, as the order in which subjects are selected is irrelevant to the method.
An operation 915 includes obtaining a series of spatially tracked two dimensional acoustic images of the ROI in the subject using a tracking device, such as an EM or optical tracker.
An operation 920 includes constructing a three dimensional acoustic image of the ROI in the subject from the series of spatially tracked two dimensional acoustic images of the ROI, wherein the three dimensional acoustic image of the ROI in the subject is in an acoustic image three dimensional coordinate system.
An operation 925 includes segmenting a reference structure in the three dimensional volumetric image of the ROI in the subject. The reference structure has a known shape and orientation in the body, and may be, for example, an organ, a bone, a joint, one or more blood vessels, etc.
An operation 930 includes defining an acoustic image three dimensional coordinate system from the three dimensional volumetric acoustic image of the ROI in the subject, based on the segmentation of the acoustic images of the actual reference structure (e.g., an actual organ) in the subject in operation 925.
An operation 935 includes determining, for each of the spatially tracked two dimensional acoustic images (obtained in operation 915) of the ROI in the subject its actual pose in the standardized three dimensional coordinate system (defined in operation 905) using: a pose of the spatially tracked two dimensional acoustic image in the acoustic image three dimensional coordinate system (defined in operation 930) corresponding to the spatially tracked two dimensional acoustic image, and a coordinate system transformation from the corresponding acoustic image three dimensional coordinate system to the standardized three dimensional coordinate system.
An operation 940 includes determining whether the current subject is the last subject. If the current subject is not the last subject, then the process returns to operation 915, and operations 915 through 940 are performed for the next subject. If the current subject is the last subject, then the process proceeds to operation 945. An operation 945 includes performing an optimization process on a convolutional neural network (CNN) by providing the spatially tracked two dimensional acoustic images to the CNN and adjusting parameters of the CNN to minimize differences between predicted poses generated by the CNN for the spatially tracked two dimensional acoustic images and the actual poses of the spatially tracked two dimensional acoustic images. Beneficially, operation 945 may be performed "batch-wise," i.e., by sequentially taking random subsets (e.g., 16 or 32) of the groups of images across a plurality of subjects and feeding them as inputs to the CNN for the next optimization step. For example, if 20 spatially tracked two dimensional acoustic images were obtained in operation 915 for each of 20 different subjects, that would produce a total of 400 spatially tracked two dimensional acoustic images, and each batch might be only, e.g., 16 or 32 of those 400 spatially tracked two dimensional acoustic images. During the training process, parameters of the CNN may be continually updated by propagating errors between predicted and ground truth values for the poses given an input image that is fed to the CNN.
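A minimal sketch of such batch-wise optimization over the pooled tracked images of all subjects (e.g., 400 images in the example above, in random subsets of 32) might look as follows; `pose_net` stands for any CNN mapping a 2D image to six pose parameters and is an assumption for the sketch.

```python
import torch

def train_batchwise(pose_net, images, gt_poses, batch_size=32, epochs=10, lr=1e-4):
    """Batch-wise optimization over the pooled training images of all subjects.

    images   : (N, 1, H, W) tensor of spatially tracked 2D acoustic images
    gt_poses : (N, 6) tensor of their actual poses in the standardized coordinate system
    """
    optimizer = torch.optim.Adam(pose_net.parameters(), lr=lr)
    n = images.shape[0]
    for _ in range(epochs):
        order = torch.randperm(n)                         # random subsets across all subjects
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            optimizer.zero_grad()
            pred = pose_net(images[idx])                  # predicted poses for the batch
            loss = torch.abs(pred - gt_poses[idx]).sum()  # error vs. ground truth poses
            loss.backward()                               # propagate errors through the CNN
            optimizer.step()                              # update the CNN parameters
    return pose_net
```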
An operation 1010 includes employing an acoustic probe to acquire a series of two dimensional acoustic images of a region of interest (ROI) in a subject without spatial tracking of the acoustic probe.
An operation 1020 includes applying the two dimensional acoustic images to a convolutional neural network which has been trained using a plurality of previously-obtained two dimensional acoustic images of corresponding ROIs in a plurality of other subjects which were obtained with spatial tracking.
An operation 1030 includes the convolutional neural network predicting a pose for each of the two dimensional acoustic images of the ROI in the subject with respect to a standardized three dimensional coordinate system.
An operation 1040 includes using the predicted pose for each of the two dimensional acoustic images of the ROI in the subject with respect to the standardized three dimensional coordinate system to produce a three dimensional acoustic image of the ROI in the subject from the series of two dimensional acoustic images of the ROI of the subject.
While preferred embodiments are disclosed in detail herein, many variations are possible which remain within the concept and scope of the invention. Features and elements from various embodiments described herein can be combined to produce other embodiments within the scope of the invention. Such variations would become clear to one of ordinary skill in the art after inspection of the specification, drawings and claims herein. The invention therefore is not to be restricted except within the scope of the appended claims.
Filing Document: PCT/EP2020/061111; Filing Date: April 22, 2020; Country: WO; Kind: 00.
Related Application: U.S. Provisional Application No. 62/838,379, filed April 2019.