In research laboratories that work with small model organisms, microscopic imaging of small living organisms presents special challenges. One challenge is maintaining organisms within the viewing area of the microscope, which is typically quite small for high-resolution microscopes (on the order of <1 square centimeter), throughout the duration of an experiment. For example, a standard microscope imaging at 5 μm per pixel resolution can typically observe a field-of-view of a few square centimeters at most, while a microscope imaging at 1 μm per pixel resolution can observe a field-of-view of only several square millimeters. This area is typically not sufficient to observe a small model organism, such as Drosophila, zebrafish larvae, or medaka, an invertebrate such as an ant, spider, or cricket, or another organism such as a slime mold, as it freely moves in an unconstrained manner. It is also insufficient for observing multiple such organisms interacting. Observing such unconstrained movement and interaction is helpful for improving our understanding of organism behavior, for observing such behaviors at high resolution, in neuroscience for studying social interaction, and in toxicology and pharmacology for observing the effect of drugs and toxins on such natural behavior and social interaction.
There are a number of ways in which researchers approach this problem. One way is to sedate the animal while preserving the heartbeat and other physiological processes that a researcher might be interested in imaging. Physically constraining the organism is another option, for example, by embedding it in agar, gluing it to a head mount or surface, or otherwise fixing it in place. At such high imaging resolutions, it is possible to apply software to automatically examine heart function, for example. These methods naturally modify the organism's behavior and are thus unsuitable for observing natural behaviors during free movement.
Using an imaging setup with a large field of view allows the organisms to move freely within an arena, but there is generally a tradeoff between field-of-view (FOV) and optical resolution, due to the reduced magnification. That is, a lens that captures a large FOV typically does so at reduced resolution. The tradeoff between resolution and FOV is encapsulated by the space-bandwidth product (SBP) of an imaging system such as a microscope, which is the total number of resolved pixels per snapshot. Standard microscopes typically have an SBP with an upper limit of 50 megapixels. Despite the loss of resolution, numerous systems have been developed that are able to track the trajectories of multiple organisms simultaneously, but at the cost of not being able to image each of them at high resolution. This includes low-resolution (worse than 25 μm per pixel) tracking of fruit flies, C. elegans, and zebrafish larvae, for example.
Software also exists to track organisms such as zebrafish at low resolution, but it typically records only the location of each organism, as enabled by a few current products that image at low resolution. This type of software has been described in several patents. For example, software exists to examine specific morphological features of zebrafish, to store this data, and to compare such image data to standard template images to gain insight. Software also exists to estimate position and velocity from video recordings at 8 frames per second, or to record the 3D position of organisms such as fish. There are also systems that use two cameras to image zebrafish constrained in capillary tubes, and to image organisms such as rodents that are physically tagged. Devices have also been suggested that use two cameras to jointly capture bright-field and fluorescence images. Alternatively, systems have used projected light patterns to assist with calculating physical quantities about moving organisms such as fish.
If the experiment calls for both high optical resolution and the ability for organisms to move freely in an arena, many researchers turn to tracking technologies to keep the target organism within view of the imaging unit. This often requires elaborate mechanical contraptions that move either the optical components or the sample itself in response to the motion of the organism. A major disadvantage of these mechanical tracking systems is that they can only track one organism at a time, because at any given time different organisms can be in different locations and moving in different directions. Therefore, there are many potential model organism assays that are very difficult, if not impossible, to carry out with current technologies. There is a need for a microscope system that is able to track and image, simultaneously, multiple unconstrained organisms over a large arena at high optical resolution, and we present that technology here.
To address this problem, the present invention presents a method for organism tracking that operates across many micro-cameras that are tiled together to image a large field-of-view in parallel. The micro-camera array microscope (MCAM) breaks the standard tradeoff between resolution and field-of-view to simultaneously obtain 5-20 μm per pixel image resolution over a 100 cm² (or more) field-of-view. In other words, it offers an SBP of hundreds to thousands of megapixels (i.e., a gigapixel SBP), which unlocks the ability to record video of multiple freely moving organisms at high resolution and thus track their individual behaviors within an optical system that includes no scanning or moving parts. Unlike other existing microscope tracking methods, our technology 1) works across many individual microscopes in parallel and 2) is optimized to automatically process and effectively compress (in a lossless manner) large amounts of image and video data.
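As a non-limiting illustration of how a tiled micro-camera array scales the space-bandwidth product, the following Python sketch estimates the aggregate FOV and SBP from a camera count, per-camera FOV, overlap fraction, and pixel resolution. The numeric values (54 cameras, 2.5 cm² per camera, 20% overlap, 5 μm per pixel) are assumptions chosen only for illustration and are not the disclosed design parameters.

```python
# Illustrative sketch (not the actual MCAM firmware): estimating the aggregate
# field-of-view and space-bandwidth product (SBP) of a tiled micro-camera array.
# All parameter values below are assumptions chosen only for illustration.

def array_fov_and_sbp(n_cameras, per_camera_fov_cm2, overlap_fraction, pixel_size_um):
    """Return (total FOV in cm^2, SBP in pixels) for a camera array.

    overlap_fraction is the fraction of each camera's FOV shared with neighbors,
    so the unique area contributed per camera is reduced accordingly.
    """
    unique_fov_cm2 = n_cameras * per_camera_fov_cm2 * (1.0 - overlap_fraction)
    pixels_per_cm2 = (1e4 / pixel_size_um) ** 2   # (pixels per cm along one axis), squared
    sbp = unique_fov_cm2 * pixels_per_cm2
    return unique_fov_cm2, sbp

# Example: 54 micro-cameras, 2.5 cm^2 each, 20% overlap, 5 um per pixel.
fov, sbp = array_fov_and_sbp(54, 2.5, 0.20, 5.0)
print(f"FOV ~ {fov:.0f} cm^2, SBP ~ {sbp/1e6:.0f} megapixels")
```

Under these assumed values the sketch reports roughly 100 cm² and several hundred megapixels per snapshot, consistent with the gigapixel-class SBP described above.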
There are also prior technologies to track objects from image data outside of the microscopic world. For example, pedestrians are commonly tracked across multiple security cameras, or across multiple cameras in autonomous vehicles. A few patents describe such tracking methods. The current invention differs from such existing technologies in several key regards: 1) as a microscope tracking technology, it can accurately track moving objects, such as living organisms, in all 3 dimensions; 2) the on-microscope processor and computer are located near one another, allowing for data transmission at significantly higher speeds than alternative technologies; 3) the tracking software is unique to microscopic imaging, given that the scenes are well-controlled (i.e., with minimal clutter, and the user can select what the sample and background are) and defocus must be accounted for; and 4) the desired output from microscope tracking (e.g., of model organisms for drug discovery experiments) is generally different than with cameras (e.g., enabling cars to avoid humans), and thus the processing pipeline is quite unique.
In some embodiments, the present invention discloses a multi-aperture microscope technology that offers the ability to track in real-time and image multiple independent small model organisms over a large area. The technology includes an organized array of micro-cameras which, together, capture image data of a group of organisms distributed over a wide arena. A first processor (e.g., in the form of a field-programmable gate array (FPGA)) aggregates and streams video data from all micro-cameras simultaneously to a second processor (e.g., within a nearby desktop computer). It is then possible to run an organism tracking algorithm, which is able to compute per-organism position coordinates, produce cropped video footage of each organism, and automatically measure key morphological statistics for each organism in the imaging area. These computational methods can be distributed across the first processor and the second processor. The technology can conduct small animal tracking and imaging with no moving parts and is immune to the performance tradeoffs faced by other microscope technologies developed so far.
The multi-aperture microscope technology can also be used for 3D tracking of organisms, with a depth range limited by the thickness of the sample. The presence of multiple cameras capturing images of the sample, configured with 50% or more overlap, can allow the 3D tracking of the organisms through photogrammetry.
In some embodiments, the present invention discloses systems and methods to track freely moving objects, such as model organisms, over a large imaging area and at high spatial resolution in real-time, while also jointly providing high-resolution video and automated morphological analysis on a per-organism level. The system is based upon an imaging hardware unit, computational hardware, and jointly designed software.
In some embodiments, the present invention discloses a microscope technology that offers the ability to track in real-time and image multiple independent small model organisms over a large area. The technology can conduct small animal tracking and imaging with no moving parts and is immune to the performance tradeoffs faced by other microscope technologies developed so far.
The microscope can include multiple cameras, such as micro-cameras, e.g., cameras having a small form factor, arranged in an array. The cameras in the microscope can be organized in an array that can capture images of a group of organisms distributed over a wide arena, e.g., on or in a sample.
The cameras can be configured to have overlapped fields of view on the sample, which can allow stitching the images across neighboring cameras for a seamless view of the sample. The overlapped fields of view between adjacent cameras can be less than or equal to 50%, e.g., some areas of the sample can be imaged only by one camera and not by its neighbors. The overlapped fields of view can be 5%, 10%, 20%, 30%, or 40%.
The cameras can be configured to have 50% or more overlapped fields of view on the sample, which can allow sample depth analysis, in addition to the stitching ability, for example, through a photogrammetry process such as photometric stereo. For 50% or more overlapped fields of view, all areas of the sample can be imaged by at least two cameras, which can allow depth analysis of organisms detected in the images, for example, through the image disparity between the captured images of the same feature.
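The following is a minimal small-angle sketch of how such a disparity-based depth estimate might look, assuming two overlapping cameras viewing the sample from a working distance much larger than their spacing; under that assumption a height change of a feature shifts its apparent position between the two views by roughly the height change times (baseline / working distance). The function name and all numeric values are illustrative assumptions, not the disclosed photogrammetry pipeline.

```python
# Minimal sketch, not the MCAM's actual photogrammetry code: a small-angle estimate
# of a feature's height offset from the disparity of the same feature seen by two
# overlapping micro-cameras.  All numeric values are illustrative assumptions.

def height_from_disparity(disparity_px, pixel_size_at_sample_mm,
                          baseline_mm, working_distance_mm):
    """Estimate the height offset (mm) of a feature from its cross-camera disparity."""
    disparity_mm = disparity_px * pixel_size_at_sample_mm  # disparity in sample-plane units
    return disparity_mm * working_distance_mm / baseline_mm

# Example: 12 px disparity, 10 um effective pixel size at the sample,
# 13.5 mm camera spacing, 100 mm working distance.
dh = height_from_disparity(12, 0.010, 13.5, 100.0)
print(f"estimated height offset: {dh:.2f} mm")
```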
The microscope can include one or more light sources, which can be disposed above the sample, below the sample, or both above and below the sample. The light sources can be configured to provide one or more illumination patterns to the sample. For example, the light sources can be configured to provide bright-field or dark-field illumination to the cameras. The multiple illumination patterns can also allow depth analysis through multiple images captured by the same camera under multiple illumination patterns.
The microscope can include one or more moving mechanisms configured to move the individual cameras, the camera array, the light sources, or the sample. For example, each camera can have a sensor adjustment mechanism configured to move the image sensor of the camera, an optical lens adjustment mechanism configured to move the optical lens of the camera, an objective lens adjustment mechanism configured to move the objective lens of the camera, and a camera adjustment mechanism configured to move the camera. The camera array can be coupled to a stage moving mechanism configured to move the camera array with respect to the sample. The sample can be disposed on a sample support, which can be coupled to a support moving mechanism configured to move the sample support in one or more directions, such as in a direction toward or away from the cameras, or in directions parallel to the sample for repositioning the sample under the cameras, such as for scanning the sample.
The microscope can include one or more excitation sources configured to affect the organisms in the sample, such as to provide excitation or disturbance to the organisms. The excitation sources can generate a local or a global disturbance, e.g., a disturbance limited to a small area of the sample or a disturbance applicable to the whole sample. The disturbance can be continuous or pulsed, and can include periodic pulses or one or more discrete pulses.
The excitation sources can generate an acoustic signal, e.g., a sound or ultrasound; a radiation signal, e.g., visible light, IR light, UV light, or polarized light; a radiation pattern, e.g., an image generated by an LCD screen; a vibration signal that can vibrate the whole sample or only one or more local areas of the sample; an injection from an injector that can inject a stimulant, such as a chemical, or a radiation excitation component, e.g., a fluorescent excitation source, into the sample; an olfactory signal; or a mechanical disturbance or stimulus from a manipulator.
The microscope can include one or more processors configured to process the data from the images captured by the cameras. For example, the processors can be configured to run an organism tracking algorithm, which is able to compute per-organism position coordinates, produce cropped video footage of each organism, and automatically measure key morphological statistics for each organism in the imaging area. The computational process can be performed on a main processor, or can be distributed across multiple processors.
The processors can include only a main processor, which can be configured to accept image data from the cameras, such as being configured to serially accept the multiple parallel image data streams from the multiple cameras. The processors can include a pre-processor, such as a Field Programmable Gate Array (FPGA), in addition to the main processor. The pre-processor can be configured to accept the multiple parallel image streams from the cameras, process the multiple image streams in parallel, and then serially send the results to the main processor for additional analysis. The processors can include multiple pre-processors, with each pre-processor coupled to a camera output for pre-processing that camera's image data right after image capture. Outputs from the multiple pre-processors can be sent serially to the main processor for additional analysis. The conversion of multiple parallel data streams to a serial data stream can be performed by electronic devices, such as an FPGA, which can aggregate and stream video data from all cameras, or from all pre-processors coupled to the cameras, simultaneously to the main processor, which can be a processor of a data processing system such as a nearby desktop computer.
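The following Python sketch is a conceptual stand-in for this parallel-to-serial aggregation, not the actual FPGA firmware: several camera streams (modeled here as simple generators) are interleaved into one serial stream that a main-processor loop consumes. The function names and stream contents are hypothetical.

```python
# Conceptual sketch (not the actual FPGA firmware) of the parallel-to-serial role
# described above: multiple camera streams are read in parallel and interleaved
# into a single serial stream for the main processor.  Camera sources here are
# simple Python iterators standing in for real sensor interfaces.

from itertools import count

def fake_camera(cam_id):
    """Stand-in for a micro-camera: yields (camera id, frame index, frame payload)."""
    for frame_idx in count():
        yield (cam_id, frame_idx, f"frame-data-{cam_id}-{frame_idx}")

def serialize_streams(camera_streams, n_frames_per_camera):
    """Round-robin the parallel camera streams into one serial stream,
    as a pre-processor or FPGA-like aggregator might."""
    for _ in range(n_frames_per_camera):
        for stream in camera_streams:
            yield next(stream)   # one frame per camera per round

# Example: 4 cameras, 2 frames each, sent serially to a 'main processor' loop.
streams = [fake_camera(i) for i in range(4)]
for cam_id, frame_idx, payload in serialize_streams(streams, 2):
    print(f"main processor received camera {cam_id}, frame {frame_idx}")
```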
The microscope can include a controller configured to control the cameras, the light sources, the excitation sources, and the moving mechanisms, for example, to set the image parameters for the cameras, the radiation parameters for the light sources, and the excitation parameters for the excitation sources. The controller can be configured to control the moving mechanisms for moving the cameras or the sample support, for example, to change the amount of overlapped field of view between adjacent cameras. The controller can also be configured to accept inputs, such as external inputs from an operator or from a memory device, to provide camera parameters such as distances between cameras or the magnification of the cameras, light parameters such as the wavelengths of the light sources or the locations of the light sources with respect to the cameras, and sample support parameters such as positions of the sample support relative to the cameras and the light sources. The controller can also be configured to accept inputs related to the organisms to be tracked, such as sizes and shapes of the organisms or possible types and identification of the organisms.
In some embodiments, “controller”, “processor”, and “pre-processor” are electronic devices, and can be used interchangeably in the specification, with the distinction between these components based on the context. For example, a pre-processor and a processor can be the same device type, with the difference being the position of the pre-processor in the data path, e.g., the pre-processor is configured to process data before sending it to the processor for further processing. An electronic device can be configured to function as a controller or a processor, e.g., a controller can be used to control devices, such as cameras, and at the same time can be used to process data. A processor can be used to process data, and at the same time can be used to control devices, such as cameras.
Thus, a controller can be a processor or a pre-processor, a processor can be a pre-processor or a controller, and a pre-processor can be a controller or a processor.
Unique properties of this tracking system include its ability to enable measurement of unconstrained small model organisms by imaging their behavior, morphological properties, and biochemical variations at high resolution, in real-time, over a large field of view, and with no moving parts. The array of micro-cameras affords this technology multiple advantages over other tracking systems. First, it expands the field of view of the system without sacrificing resolution. Second, it enables the tracking of multiple organisms simultaneously, which other mechanically based tracking technologies generally cannot achieve. Third, it allows both full field-of-view imaging at full optical resolution but at low frame rates, in which the target organisms and their surroundings are recorded, and targeted imaging, in which only the tracked organisms are visible and much higher acquisition frame rates can be achieved. These features allow for a high level of versatility that enables a wide range of research and commercial applications.
There are a multitude of applications which the tracking technology presented here enables or can advance in the future. One such application is studying neural correlates of behavior in small model organisms moving naturally and freely within a standard petri dish or other media that are larger than the field of view of a typical micrometer-resolution microscope. Example model organisms that can be tracked include zebrafish larvae, Drosophila (fruit fly), C. elegans, ants and other small invertebrates, small fish such as Danionella translucida and medaka (Oryzias latipes), and small rodents such as mice and rats. Observing and quantifying group dynamics of small model organisms is made possible by this technology, with the added advantage of being able to image at high resolution and high frame rate. Furthermore, being able to track and crop to target organisms can significantly reduce the volume of data that is acquired, as all the extraneous image data of the peripheral surroundings can be left out early in the video acquisition pipeline.
Micro-Camera Array Microscope (MCAM) System
In some embodiments, the present invention discloses a system having parallel image data acquisition, e.g., cameras, across an array of multiple separate image sensors and associated lenses, which can allow the image acquisition of a large sample, limited only by the number of cameras in the camera array. The cameras can be micro-cameras having small form factors assembled on a camera board, with a data transfer cable coupled to a nearby computer system. With the small size and short transfer cable, fast data acquisition for a large sample can be achieved.
In some embodiments, the system having parallel image data acquisition can include a computational microscope system of a micro-camera array microscope (MCAM) system. Details about the MCAM system can be found in patent application Ser. No. 16/066,065, filed on Jun. 26, 2018; and in patent application Ser. No. 17/092,177, filed on Nov. 6, 2020, entitled “Methods to detect image features from variably-illuminated images”; hereby incorporated by reference in their entirety, and briefly described below.
The MCAM system 100 can include multiple cameras 110, which can form a camera array, and one or more illumination sources disposed above 121 and below 122 for microscopic imaging. The light sources can be visible light sources, infrared light sources, ultraviolet light sources, fluorescent light sources, or polarized light sources, such as light emitting diodes (LEDs) or lasers with appropriate wavelengths and filters. The illumination system can be placed below 122 or above 121 the sample, to provide transmissive or reflective light to the micro cameras.
The MCAM system can use multiple micro-cameras 110 to capture light from multiple sample areas, with each micro-camera capturing light from a sample area onto a digital image sensor, such as a charge-coupled device (CCD), complementary metal-oxide-semiconductor (CMOS) pixel array, or single-photon avalanche diode (SPAD) array.
In some embodiments, the illumination system can provide the sample with different illumination configurations, which can allow the micro-cameras to capture images of the sample with light incident upon the sample at different angles, spatial patterns, and wavelengths. The illumination angle and wavelength are important degrees of freedom that impact specimen feature appearance. For example, by slightly changing the incident illumination angle, a standard image can be converted from a bright field image into a phase-contrast-type image or a dark field image, where the intensity relationship between the specimen and background is completely reversed. The illumination system thus can be controlled to provide an optimum illumination pattern to the sample.
Alternatively, by providing the sample with different illumination light angles, spatial patterns, and wavelengths, both intensity and phase information of the imaged optical field can be recorded, which can allow the reconstruction of an image, for example, with more information or higher resolution, such as a measure of sample depth, spectral (e.g., color) properties, or the optical phase at the sample plane.
In some embodiments, the MCAM system can include one or more excitation sources 130, which can be configured to provide excitation energy to the sample, e.g., to disturb the organisms in the sample. The excitation sources can be local, e.g., the excitation energy is confined to one or more areas of the sample. The excitation sources can be global, e.g., the excitation energy is provided to the whole sample, e.g., to all areas of the sample. The excitation energy can be provided continuously or in separate pulses. The pulses can be periodic, or can include bursts of energy pulses. The excitation sources can include an acoustic signal, a radiation signal, a radiation pattern, a vibration signal, an injector that can inject a stimulant such as a chemical or a radiation excitation component, an olfactory signal, or a manipulator for generating a mechanical disturbance or stimulus to the sample.
The MCAM system 100 can include a controller 140 for controlling the cameras 110, the illumination sources 121 and 122, and the excitation sources 130, and for processing the images. For example, the controller 140 can include a central processing unit or processor 142, which can couple to camera and light controllers for controlling the camera units, such as to tell the cameras when to capture images, and for controlling the illumination sources, such as to tell the illumination sources when to be activated and which illumination sources to be activated. The central processing unit 142 can be coupled with the camera units to obtain the image data captured by the camera units. The data can be stored in memory 143, can be processed by the central processing unit to be stored in a post processing dataset 144, and can be displayed on a display 145 or sent to final storage. The controller can optionally include a pre-processing unit or pre-processor 141, e.g., another processing unit or another processor, in addition to the central processing unit, for processing the image data from the cameras before sending it to the central processing unit.
The post process data set 144 can include the coordinates of the objects or organisms detected in the sample, image frames of the objects cropped to contain only the objects, cropped object image video, and other information. The post process data set 144 can also include detailed per-organism analysis, such as fluorescent neural activity video, heartbeat video, behavior classification, and other information.
The filters for the light sources can change the characteristics of the emitted light, so that the sample can have the specific light property provided by the filters. For example, a fluorescent filter can allow the light sources to emit fluorescent excitation energy to the sample, causing the organisms in the sample to respond and emit fluorescent signals. A polarized filter, such as a circular polarized filter, can allow the light sources to emit circular-polarized light.
The MCAM system can include excitation sources 230 for exciting objects or organisms 250 in the sample. The excitation sources can be separate excitation sources, or can be incorporated into the light sources, for example, by filters 212, such as polarized filters or fluorescent excitation filters.
The MCAM system can include moving mechanisms configured to move the cameras or the sample. A moving mechanism 213 can be coupled to the camera array to move the camera array relative to the sample, such as toward or away from the sample. Another moving mechanism 223 can be coupled to a sample support to move the sample relative to the cameras, such as toward or away from the cameras. The moving mechanism 223 can also be configured to move the sample support in a lateral direction, for example, for scanning the sample. For example, the specimen can also be placed on a 3D motorized stage, whose position can be controlled via software on the computer to bring the specimen into appropriate focus and lateral position.
In some embodiments, the fields of view of the cameras can be adjusted to vary the overlapping area, such as between non-overlapping FOVs, FOVs overlapping less than 50%, and FOVs overlapping more than 50%. The adjustment can be performed by changing the magnification of the cameras or the focus distance to the sample areas.
The FOV of the cameras can be non-overlapping, for example, to observe samples with discrete areas such as well plates. The FOV of the cameras can overlap 50% or less in one or two lateral directions, such as the x and y directions, such that less than half of the points on the object plane for one camera are also captured by one or more other cameras in the array. This permits stitching of the images to form a complete representation of the sample.
The FOV of the cameras can overlap 50% or more in one or two lateral directions, such that half or more of the points on the object plane for one camera are also captured by one or more other cameras in the array, and every point on the sample is seen by at least two cameras. This permits depth calculation for the object positions, for example, through photogrammetry or photometric stereo.
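To make the overlap geometry concrete, the following sketch computes the fraction of one camera's FOV shared with its neighbor from the per-camera FOV width and the camera pitch projected onto the sample. The numeric values are assumptions for illustration only.

```python
# Illustrative sketch of the overlap geometry described above: the fraction of one
# camera's field of view that is shared with its neighbor, given the camera pitch
# (center-to-center spacing projected onto the sample) and the per-camera FOV width.
# All values are assumptions for illustration.

def overlap_fraction(fov_width_mm, camera_pitch_mm):
    """Fraction of a camera's FOV (along one lateral direction) also seen by its neighbor."""
    shared = fov_width_mm - camera_pitch_mm
    return max(0.0, shared / fov_width_mm)

# Example: 19.3 mm FOV width per camera and 13.5 mm camera pitch give roughly 30%
# overlap, suitable for stitching; a pitch below half the FOV width would give more
# than 50% overlap, so every sample point is seen by at least two cameras for depth.
print(f"overlap: {overlap_fraction(19.3, 13.5):.0%}")
```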
The process module 341 can be an FPGA-based module (e.g., a module containing a processing chipset, such as an FPGA, or another chipset such as an ASIC, an ASSP, or an SoC), which can be configured to receive image data from the multiple camera units, e.g., through data streams 315. The FPGA-based module 341 can include a shallow buffer, for example, to store incoming data from the data streams 315. The FPGA-based module can be configured to send sensor configuration data to the camera array, for example, to provide image parameters to the image sensors of the camera units. The sensor configuration can be received from a computational unit having a processor 342 and a memory 343. For example, the processor can send configuration and settings to the FPGA-based module, with the configuration and settings including setting information for the FPGA-based module and the configurations for the image sensors. The FPGA-based module can communicate 316 with the computational unit using direct memory access (DMA) to pass data directly to the memory 343, through a high speed link such as PCIe. The FPGA-based module can communicate with a control module, which can be configured to control lighting, motion, and sample handling for the microscope system. The computational unit 342 can also communicate directly with the control module. The computational unit 342 can communicate with storage or network devices (not shown). The system can include peripheral devices, such as stages, illumination units, or other equipment necessary to ensure adequate imaging conditions.
An imaging system can include an array of cameras 310 focused on a large sample 320 under the illumination of an array of light sources 321 and 322. Image parameters 317 can be inputted to the camera array 310, for example, to control focus mechanisms for focusing or for changing the magnification of the individual cameras. A motion mechanism, e.g., a movable camera stage 313, can be used to adjust the positions of the camera array, such as tipping, tilting, or translating the camera array, or for changing the overlap amounts between cameras. A motion mechanism, e.g., a movable sample holder 323, can be used to adjust the positions of the sample, such as tipping, tilting, translating, or curving the sample. The movable sample holder can also be used for advancing the sample or the sample holder in discrete steps for capturing scanning image data of the sample. An excitation module 330 can be used to provide excitation to the organisms in the sample 320.
A data processing system 340 can be used to control the elements of the imaging system. The data processing system 340 can be configured to receive inputs 318, such as data related to features of interest to be detected and analyzed on the sample. The data processing system 340 can be configured to receive data from the camera array 310, and to transfer the data to a data processing processor 341 or 342 for processing. The data processing system 340 can be configured to transfer the data to a second data processing processor 342 for analysis. The data processing system 340 can include a controller 346 to control the camera array, the illumination source, and the sample holder to provide suitable conditions for image captures, such as providing variably illuminated radiation patterns to the sample, repositioning the cameras, the camera array, the sample, or the sample holder for focusing or scanning operations.
In some embodiments, the data processing system is a desktop computer. This desktop computer can be attached to a monitor for visual analysis of recorded MCAM video and/or MCAM statistics. The desktop computer can also be networked to transmit recorded video data and/or MCAM statistics and is also used to control the image and video acquisition parameters of the MCAM instrument (exposure time, frame rate, number of micro-cameras to record video from, etc.) via electronic signal.
The imaging system 300, such as a camera array microscope, based on a set of more than one compact, high-resolution imaging system, can efficiently acquire image data from across a large sample by recording optical information from different sample areas in parallel. When necessary, additional image data can be acquired by physically scanning the sample with respect to the array and acquiring a sequence of image snapshots.
The imaging system can be used to obtain image and video data from the sample. The data can be analyzed to detect organisms for tracking. In addition, the data can be analyzed to classify the organisms, e.g., using the features on the organisms to classify the organisms into different organism categories or organism identification.
In some embodiments, the present invention discloses methods to track freely moving objects, such as model organisms, over a large imaging area and at high spatial resolution in real-time, while also jointly providing high-resolution video and automated morphological analysis on a per-organism level.
A sample can be placed on a sample support in an MCAM system, under, above, or to a side of the cameras. Freely moving objects can be observed in the sample for imaging and analysis. The sample can be an arena, for example, having a glass or plastic flat surface with surrounding walls. Alternatively, the sample can have the form of a 6, 12, 24, 48, 54, 96, or more well-plate. The sample can contain model organisms, such as fruit flies, ants, or C. elegans, along with other materials of interest. The sample can also contain water, in which aquatic model organisms such as zebrafish are placed for subsequent investigation and analysis.
The sample can be subjected to one or more excitation sources, which can be placed surrounding the sample area to manipulate the organisms. For example, the excitation sources can include micro-injectors to inject various model organisms with certain biochemical material, or to insert specific chemicals, toxins or other biochemical material into the sample area. Micro-manipulators may also be used to manipulate, stimulate, perturb or otherwise change the model organisms or their surrounding area. Equipment such as voice coils or LCD screens may also be used to stimulate the visual, auditory, olfactory or other sensory systems of the model organisms within the sample area.
Since the MCAM has multiple cameras, the object can freely move between cameras, e.g., the object can exit the field of view of one camera and enter the field of view of a neighboring camera. With some amount of overlap in the fields of view of neighboring cameras, the processor can maintain a precise measurement of the location of each organism as it crosses image sensor boundaries, by maintaining a precise measurement in both cameras and by communicating, from the first camera to the second camera, the information that the object is moving toward the second camera.
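The following sketch illustrates one way such a cross-camera hand-off could be organized, assuming each organism's position is kept in global sample coordinates and the set of cameras whose FOV contains it is re-evaluated every frame; it is a conceptual sketch, not the disclosed tracking code, and the grid geometry values are assumptions.

```python
# Conceptual sketch (not the disclosed tracking code) of the cross-camera hand-off
# described above: each organism's position is kept in global sample coordinates, and
# the set of cameras whose FOV contains it is re-evaluated every frame, so the track
# persists as the organism crosses a sensor boundary.  Geometry values are assumptions.

def cameras_seeing(point_xy_mm, camera_grid, camera_pitch_mm, fov_width_mm):
    """Return indices (row, col) of every camera whose FOV contains the point."""
    x, y = point_xy_mm
    visible = []
    for row in range(camera_grid[0]):
        for col in range(camera_grid[1]):
            cx, cy = col * camera_pitch_mm, row * camera_pitch_mm  # camera center on sample
            if abs(x - cx) <= fov_width_mm / 2 and abs(y - cy) <= fov_width_mm / 2:
                visible.append((row, col))
    return visible

# Example: a 6x9 array with 13.5 mm pitch and 19.3 mm FOV per camera.  An organism
# near a boundary is reported by two cameras, so its track is handed off rather than
# lost when it exits the first camera's view.
print(cameras_seeing((6.9, 0.0), (6, 9), 13.5, 19.3))
```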
Subsequent analysis can be applied to this image frame set, including measuring one or more morphological properties of each organism of interest. Considering the case of imaging zebrafish larvae, some morphological features of interest include 3D organism position, eye direction, eye size, gaze direction, body length, tail curvature, mouth state, pectoral fin angle, pigmentation coverage, and heart shape, for example. After measuring these quantities, additional subsequent analysis can be automatically executed, for example, unsupervised classification of behavior, tracking eye position movement as a function of time, tail movement or heartbeat as a function of time, or fluorescence variations within the brain as a function of time.
A feature or object detection algorithm can be used to locate and detect moving objects 550 and 550* within subsequently captured image frames from the cameras. The same or different object detection algorithms can be used to detect different objects of interest. The object detection algorithm can also be employed to ignore other objects, such as debris 563 or other features of the medium in which the organisms are contained during the imaging process.
The object detection algorithm may also be used to identify multiple objects across multiple image frames acquired as a function of time, to enable object tracking as a function of time. The object detection algorithm can be used to locate and draw bounding boxes around the objects 550 and 550*. The sequences of bounding boxes for the objects associated with different time points can be used to form video frames 561 and 561* with the objects centered within each frame.
In some embodiments, the bounding boxes can be employed to produce cropped image segments per frame, wherein only the pixels within each bounding box area are saved and utilized for additional processing. These cropped image segments can subsequently be spatially aligned to create a centered organism video for each organism of interest. Additional analysis 558 can be performed, including examining the size, speed, morphological features, fluorescent activity, and numerical assessment of other important biochemical information that is transmitted optically to the MCAM sensor.
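A minimal NumPy sketch of this per-organism cropping and centering step is shown below; the function and parameter names are illustrative, not those of the disclosed software.

```python
# Minimal sketch, assuming NumPy image arrays, of the per-organism cropping described
# above: the pixels inside each bounding box are extracted and pasted into the center
# of a fixed-size frame, producing an organism-centered video stream.

import numpy as np

def crop_and_center(frame, bbox, out_size=(256, 256)):
    """frame: 2D (grayscale) image; bbox: (x, y, width, height) in pixels."""
    x, y, w, h = bbox
    crop = frame[y:y + h, x:x + w]
    out = np.zeros(out_size, dtype=frame.dtype)
    oy = (out_size[0] - crop.shape[0]) // 2
    ox = (out_size[1] - crop.shape[1]) // 2
    out[oy:oy + crop.shape[0], ox:ox + crop.shape[1]] = crop
    return out

# Example: center a 40x60 px detection from a 3000x4000 px micro-camera frame.
frame = np.random.randint(0, 255, (3000, 4000), dtype=np.uint8)
centered = crop_and_center(frame, (1200, 800, 40, 60))  # (x, y, w, h)
print(centered.shape)  # (256, 256)
```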
In some embodiments, the desired outputs from the MCAM video object tracking include a set of coordinates that each define the 2D or 3D location and bounding box encompassing an object of interest within the MCAM field of view as a function of time. To enable rapid computation of object locations, information related to the MCAM system can be pre-determined, such as being measured or calibrated before the MCAM operation, and stored in the memory of the pre-processors coupled to the cameras, of the main processor receiving data from the cameras or from the pre-processors, or of both. The information about the mechanical and optical configuration of the MCAM can be accessed by the pre-processors or by the main processor to enable them to convert the measured image data into actual, quantitative values regarding the detected target organisms. The actual data can allow the processors to determine whether the detected objects are the target organisms, non-target organisms, or pieces of debris, for example, by comparing the calculated actual dimensions to the target organism dimensions. The MCAM-related information can take the form of a look-up table, a list of variables with associated values, or any other type of numeric array.
For example, the MCAM-related information can include the distances between the cameras, the distances between the cameras and the sample being imaged (or the average distance from all cameras to the sample), the focal length of the lenses of the cameras, the distance between the lens and the image sensor in each camera (or the average distance for all cameras), the sensor dimensions, such as in pixels, of each camera, the pixel pitch of each camera, and the number of rows and columns of pixel overlap between neighboring cameras.
The MCAM-related information can allow the processors to calculate data useful in the detection or analysis of the target organisms. For example, the useful data can include the local locations within an individual camera, or the global locations within the entire MCAM imaging area, in either pixel or spatial coordinates, of each organism being tracked; the identification of the camera in which each target organism is detected; the sizes, shapes, or dimensions of each organism; the average or maximum spatial distance between any two organisms; or other system information that can assist with the computational process of object localization.
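The sketch below illustrates how such pre-measured calibration information might be used to convert a detection's local pixel coordinates within one camera into global sample coordinates; the calibration dictionary keys and all numeric values are assumptions for illustration only, not the disclosed calibration format.

```python
# Illustrative sketch of using pre-measured calibration information to convert a
# detection's local pixel coordinates (within one micro-camera) into global sample
# coordinates in millimeters.  Keys and numeric values are assumptions.

CALIBRATION = {
    "pixel_pitch_mm": 0.0011,       # physical pixel size on the sensor
    "magnification": 0.18,          # sample-to-sensor magnification
    "camera_pitch_mm": 13.5,        # center-to-center camera spacing
    "sensor_shape_px": (3000, 4000) # rows, columns per sensor
}

def local_to_global(cam_row, cam_col, px_row, px_col, cal=CALIBRATION):
    """Map (camera index, pixel position) to (x, y) on the sample in mm."""
    # effective pixel size projected onto the sample plane
    px_mm = cal["pixel_pitch_mm"] / cal["magnification"]
    rows, cols = cal["sensor_shape_px"]
    # offset from the camera's optical center, in sample millimeters
    dx = (px_col - cols / 2) * px_mm
    dy = (px_row - rows / 2) * px_mm
    # add the camera's position within the array
    return cam_col * cal["camera_pitch_mm"] + dx, cam_row * cal["camera_pitch_mm"] + dy

# Example: a detection at pixel (1500, 2600) in camera (row 2, column 3).
print(local_to_global(2, 3, 1500, 2600))
```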
Operation 701 inputs data of objects to be analyzed and tracked, including object shapes, dimensions and characteristics, object types, or object identification. The object data can be pre-stored in memory, and the input can include selecting an organism to be tracked from a list of organisms presented to an operator of the MCAM system.
In some embodiments, the MCAM organism tracking includes detecting the organisms of interest in the MCAM imaging area. To enable rapid detection of objects within the MCAM field-of-view, it can be beneficial to store information about the features of the object types to be tracked. The object feature information can take the form of a look-up table, a list of variables with associated values, or any other type of numeric array.
The object feature information can include measurements of the objects, such as the sizes, shapes, and dimensions of the objects, which can enable the processors to distinguish debris from the target organisms among the detected objects. As discussed above, the MCAM-related information can enable actual measurements of the detected objects, and a comparison with the object measurements in the object feature information can allow the processors to accept detected objects as the target organisms and to reject detected objects not conforming to the measurements of the target organisms.
The object feature information can include detection characteristics, which can enable the processors to select an optimum detection algorithm and to perform the selected algorithm. The detection characteristics can include specific features of the target organisms, e.g., the features that enable the recognition and identification of the target organisms. For example, for frame-to-frame detection, a threshold value between change and no change can be stored. Similarly, for edge detection, a threshold value between edge and no edge can be stored. For projection detection, a threshold value between detection and no detection can be stored. Fitted curves for the projection detection can also be stored. For an object detection convolutional neural network, the detection characteristics can include feature detection convolution filters, wavelet filters, or other filters that are object or organism specific.
The object feature information can also take the form of a pre-trained neural network, such as a convolutional neural network (CNN), which is designed for object detection, object segmentation, object tracking, or a related task. The CNN weights can be pre-set or pre-determined via supervised learning approaches, using either MCAM image data or similar image data of examples of the desired object types to be tracked. Some weights can also be left undetermined, e.g., the weight values can be left un-initialized, to be optimized at a later time using new image data acquired during or after image capture.
Operation 702 optionally provides excitation to the sample, e.g., to provide an influence, a disturbance, an excitation, or in general, something to affect or having an effect on the organisms. The excitation can be applied to the organisms, such as a fluorescent excitation signal configured to excite the organisms. The excitation can be applied to the sample medium, e.g., to the gaseous or liquid environment in which the organisms are located. The medium excitation can include a vibration or a disturbance of the medium. Multiple excitations can be applied to the sample, either in parallel, e.g., at the same time, or in sequence, e.g., one after the other.
In some embodiments, the excitation source can provide energy or a signal configured to have an effect on the organism directly, or indirectly through the sample medium. Thus, the excitation source can be any source configured to provide an effect on the organisms. The excitation energy or signal can be any signal carrying an energy to be provided to the organisms, or to be provided to the medium in a way that generates an effect on the organisms.
The excitation can include a global excitation to the entire sample, e.g., either to the whole sample or only to the area of the sample whose images can be captured by the cameras of the MCAM. The global excitation can be provided to the top surface of the sample, or also to the depth of the sample. For example, an acoustic source can deliver a sound to the sample, mostly at the surface. A radiation source can generate light covering the whole sample, which can penetrate the sample surface. A medium vibration source can provide a disturbance to the whole sample, including the depth of the sample.
The excitation can include one or more local excitations to one or more areas of the sample. For example, a same type of excitation energy can be applied to multiple areas of the sample. Alternatively, different types of excitation energy can be applied to different areas of the sample. The multiple excitation energies can be applied in parallel or in sequence.
The local excitation energy can be applied to the surface of the sample, to the depth of the sample, or to both the surface and the depth of the sample. The local excitation source can include a focus mechanism to limit the excitation energy to a local area. For example, a focused acoustic source can direct a beam of sound to the sample. A focused radiation source can direct a beam of light to an area of the sample, which can penetrate the sample surface. A vibration source can provide a disturbance to a local area of the sample, whose effect can gradually diminish farther away from the excitation center.
The excitation signal can include a uniform excitation, a patterned excitation, a continuous excitation, a periodic excitation, or a pulsed excitation. For example, a radiation source can provide a uniform light to the sample. A display screen, such as an LCD or OLED screen, can provide a patterned light, e.g., space-varying light, which can be time-constant or time-varying. The light from the radiation source can be continuous, e.g., a time-constant light. The light from the radiation source can be periodic, e.g., a cyclic light. The light from the radiation source can include one or more pulses, either a series of periodic pulses or a number of discrete pulses. In addition, the excitation signal can be time-constant or time-varying.
The excitation can include a noise, a sound, an audio effect, a light, a visual effect, an olfactory effect, a vibration, a mechanical manipulation, a chemical or biochemical injection, or a fluorescence excitation. For example, a mechanical manipulation source in the form of a stirrer can be used to stir the gaseous or liquid medium of the sample. An injection source can be in the form of a pipette, which can be used to provide droplets of a chemical or a biochemical to the gaseous or liquid medium of the sample.
Operation 703 captures images from the cameras and sends them to one or more processors, with each camera's data sent to a pre-processor, or with all camera data streams sent to a central processor. The MCAM can have one main processor, for example, a central processing unit of a data processing system such as a computer. The images captured from the multiple cameras of the MCAM system can then be sent to the main processor, such as through a parallel-to-serial device, which can be configured to accept multiple image data streams from the cameras to form a serial data stream to send to the main processor. The parallel-to-serial device can be an FPGA, or any electronic device configured to sequentially output data from multiple input streams.
The MCAM can have multiple processors, such as one or more pre-processors in addition to the main processor. The pre-processors can be coupled to the cameras, for example, with each pre-processor coupled to a camera, to pre-process the image data from the cameras before sending it to the main processor through the parallel-to-serial device. A major advantage of the pre-processors is the ability to process in parallel, e.g., image data from all cameras can be processed, for a portion or all of the needed processing, at the same time, instead of in sequence as would occur if the image data from the cameras were sent to the main processor for processing.
The pre-processors can be configured to only screen the input data to be sent to the main processor, such as to turn off, e.g., not send data from, cameras for which a screening analysis of the images shows no object. The screening analysis is a quick analysis, with the goal of determining whether or not there is an object. The screening analysis is faster than the object detection process, since the object detection process also provides the coordinates and dimensions of the objects. After the screening process at the pre-processors, only image data from cameras that show the presence of an object is sent to the main processor.
In some embodiments, the screening analysis can be assisted by image data from previous analysis. For example, if a partial object is detected in one camera, it is likely that a neighboring camera contains the remaining portion of the object. The screening analysis can include a determination of no frame-to-frame change between the currently captured image and the background image or a previously captured image, for example, by comparing the sum of the pixel differences between the two images with a threshold value. The screening analysis can include a determination of local area change, by applying a convolution filter.
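A minimal NumPy sketch of this frame-difference screening step is shown below, assuming a fixed threshold on the summed absolute pixel change; the threshold value and array sizes are assumptions, not disclosed parameters.

```python
# Minimal sketch, assuming NumPy frames, of the pre-processor screening step
# described above: a camera's current frame is compared against a background (or
# previous) frame, and the camera is only forwarded to the main processor when the
# summed pixel change exceeds a threshold.  The threshold value is an assumption.

import numpy as np

def frame_has_object(current, background, threshold=1e5):
    """Quick screen: True if the total absolute frame-to-frame change is large."""
    diff = np.abs(current.astype(np.int32) - background.astype(np.int32))
    return diff.sum() > threshold

# Example: forward only the cameras whose frames changed since the background frame.
background = np.zeros((3000, 4000), dtype=np.uint8)
frames = {cam_id: np.zeros((3000, 4000), dtype=np.uint8) for cam_id in range(4)}
frames[2][1000:1060, 2000:2040] = 200   # a moving organism appears in camera 2
to_send = [cam_id for cam_id, f in frames.items() if frame_has_object(f, background)]
print(to_send)  # [2]
```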
The pre-processors can be configured to share the workload with the main processor, such as performing the object detection process at individual cameras without cross-camera work. Since the object detection process can be performed on each camera, e.g., on the image captured by each camera, the images can be pre-processed at the pre-processors before sending the results, e.g., the detected objects, to the main processor. For this configuration, each pre-processor is coupled to only one camera, without cross-camera connection. After the object detection process without cross-camera analysis at the pre-processors, image segments containing whole or partial objects are sent to the main processor. The cross-camera work can be performed at the main processor to generate object location coordinates and sizes of bounding boxes containing the whole objects.
The pre-processors can be configured to share the workload with the main processor, such as performing the object detection process together with the cross-camera work. After the objects are detected at the individual cameras, cross-camera data can be used to merge objects detected in multiple neighboring cameras, and to remove redundancy caused by the overlap between neighboring cameras. Thus, the pre-processors can send object location coordinates and sizes of bounding boxes containing the objects, together with the image data within the bounding boxes, to the main processor.
Operation 706 detects objects or partial objects in the captured images from individual cameras, using input information related to the target organisms, such as threshold or fitting curve values for the detection algorithms, or specific features of the target organisms such as feature filters for CNN object detection.
The detection process can include an edge detection (2D or line, monochrome or color), a projection detection, or a neural network detection (2D or 3D). The detection process can be performed at the main processor, if not performed at the pre-processors or if there are no pre-processors. The detection process can detect a whole object if the object is within the field of view of the camera. The detection process can detect a partial object, e.g., a portion of the object, if the object is shared between the fields of view of multiple neighboring cameras. Outputs of the detection process can include the image segments surrounding the objects or the partial objects.
Operation 707 merges or removes duplicate objects across neighboring cameras, including removing duplicated complete or partial objects in the overlapped captured images, and merging objects spanning multiple cameras. The merge and remove process can be performed at the main processor, if not performed at the pre-processors or if there are no pre-processors.
The merge process can merge partial objects from neighboring cameras, due to the main processor's ability to access cross-camera data. For example, for a partial object, the partial objects from neighboring cameras are evaluated to determine whether they belong to the same object.
The duplicate removal process can remove objects or portions of objects that are duplicated, e.g., appearing in more than one camera at the overlapped areas between the cameras. For example, detected objects at the overlapped area are optionally transformed so that the objects are of the same size and orientation. Afterward, the objects in multiple cameras are compared to remove the duplicated portions. Outputs of the merge and remove process can include the location coordinates and the sizes (e.g., width and height) of the objects, together with the image data within the bounding boxes surrounding the objects.
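One conceptual way to implement this merge-and-remove step, once all detections are expressed in global sample coordinates, is to treat boxes from neighboring cameras that overlap heavily (intersection-over-union above a threshold) as the same organism and merge them into a single enclosing box. The sketch below is illustrative only, and the IoU threshold is an assumption.

```python
# Conceptual sketch, not the disclosed merge code: boxes mapped into global sample
# coordinates are merged when their intersection-over-union exceeds a threshold,
# removing the redundancy caused by the overlap between neighboring cameras.

def iou(box_a, box_b):
    """Boxes are (x, y, width, height) in global coordinates."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def merge_duplicates(boxes, iou_threshold=0.3):
    merged = []
    for box in boxes:
        for i, kept in enumerate(merged):
            if iou(box, kept) > iou_threshold:
                # replace the kept box with the box enclosing both detections
                x = min(box[0], kept[0]); y = min(box[1], kept[1])
                x2 = max(box[0] + box[2], kept[0] + kept[2])
                y2 = max(box[1] + box[3], kept[1] + kept[3])
                merged[i] = (x, y, x2 - x, y2 - y)
                break
        else:
            merged.append(box)
    return merged

# Example: the same larva detected by two overlapping cameras yields one merged box.
print(merge_duplicates([(40.0, 27.0, 4.0, 1.2), (40.5, 27.1, 3.6, 1.1)]))
```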
Operation 708 determines characteristics, including dimensions and shapes, of the detected objects, compares the characteristics with the input object data, and rejects detected objects not meeting the input object data. The detected objects are compared with the stored input data of the target organisms to reject detected objects showing discrepancies with the target organisms. Outputs of the process can include the detected target organisms, e.g., the locations and bounding boxes of the detected objects that have been screened to make sure that they are the target organisms.
Operation 710 forms bounding boxes and locations for the objects meeting the characteristics of the input object data. The bounding boxes and locations can be used to form tracking data of the objects.
Operation 711 transforms the objects in the bounding boxes, including centering the objects and translating, rotating, skewing, enlarging, or reducing the objects to conform to a same size and orientation. This process can allow a uniform analysis, since the detected objects are all of the same size and orientation.
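The sketch below, assuming SciPy is available, illustrates one possible normalization of this kind: each cropped organism image is rotated so its body axis is horizontal and rescaled to a common size. The body-axis angle is treated here as an input parameter supplied by an earlier detection step, and the output size is an assumption.

```python
# Minimal sketch, assuming SciPy, of the normalization in operation 711: rotate each
# cropped organism to a canonical orientation and rescale it to a common size.

import numpy as np
from scipy import ndimage

def normalize_crop(crop, body_angle_deg, out_size=(128, 256)):
    """Rotate the organism to a canonical orientation and resize to out_size."""
    # rotate so the measured body axis becomes horizontal
    rotated = ndimage.rotate(crop, -body_angle_deg, reshape=True, order=1)
    # rescale to a common height and width so all organisms can be compared directly
    zoom_factors = (out_size[0] / rotated.shape[0], out_size[1] / rotated.shape[1])
    return ndimage.zoom(rotated, zoom_factors, order=1)

# Example: a 60x40 px crop of a larva whose body axis is oriented at 35 degrees.
crop = np.random.rand(60, 40)
print(normalize_crop(crop, 35.0).shape)  # (128, 256)
```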
Operation 712 generates classification scores and categorizes the objects, including classifying the objects into different categories or identifying the objects, based on statistical data, for example, through a convolutional neural network.
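An illustrative sketch of such a classifier is shown below, assuming PyTorch: a small convolutional network takes a normalized organism crop and outputs one score per category. The architecture, class count, and (untrained) weights are placeholders, not the disclosed network.

```python
# Illustrative sketch, assuming PyTorch, of a small convolutional classifier of the
# kind mentioned in operation 712: it maps a normalized organism crop to per-class
# classification scores.  Architecture and class count are placeholders.

import torch
import torch.nn as nn

class OrganismClassifier(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes)
        )

    def forward(self, x):                      # x: (batch, 1, H, W) normalized crops
        return self.head(self.features(x))     # raw per-class scores (logits)

# Example: classification scores for one 128x256 normalized crop.
model = OrganismClassifier()
scores = torch.softmax(model(torch.rand(1, 1, 128, 256)), dim=1)
print(scores)
```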
Operation 713 analyzes the objects in detail.
Operation 714 repeats the above operations as a function of time for tracking.
Operation 715 forms tracking data including movements and other actions of the objects.
The MCAM system optionally includes one or more excitation sources configured to provide one or more excitations to the sample, with each excitation including a local excitation to an area of the sample or a global excitation to the whole sample. The excitation includes a continuous excitation, a periodic excitation, or a pulsed excitation. The excitation includes a noise, a sound, an audio effect, a light, a visual effect, an olfactory effect, a vibration, a mechanical manipulation, a chemical or biochemical injection, or a fluorescence excitation.
The controller can be configured to store pre-measured calibration information, e.g., information related to the MCAM system, to determine locations of the objects detected from the captured images with respect to individual cameras or with respect to the sample, to determine identifications of the multiple cameras in which the objects are detected, to determine sizes of the objects, and to determine spatial distances between objects.
The pre-measured calibration information includes camera data (focal lengths, distances between lenses and sensors, pixel sizes and pitches, magnification data, filter data and configurations, pixel row and column overlap, distances between cameras, distance to the sample), light source data (distances between light sources, distance between light sources and the sample), and sample stage data.
The controller can be configured to accept inputs related to the objects being tracked, e.g., object feature information, with the inputs including at least object shapes, dimensions and characteristics, object types, object identification, threshold values and fitted curves for object screening processes such as frame-to-frame change detection, edge detection, and projection detection, and feature filters for CNN processes.
The controller can be configured to detect objects or partial objects in the captured images from individual cameras, with the detection process including an edge detection (2D or line, monochrome or color), a projection detection, or a neural network detection (2D or 3D). The controller can be configured to merge or remove duplicate objects across neighbor cameras, including removing duplicated complete or partial objects at the overlapped captured image, and merging objects spanning across multiple cameras.
The controller can be configured to determine characteristics, including dimensions and shapes, of the detected objects, to compare the characteristics with the input object data, and to reject detected objects not meeting the input object data. The controller can be configured to form bounding boxes and locations for the objects meeting the characteristics of the input object data.
The controller can be configured to transform the objects in the bounding boxes, including centering the objects and translating, rotating, skewing, enlarging, or reducing the objects to conform to a same size and orientation. The controller can be configured to generate classification scores and categorize the objects, including classifying the objects into different categories or identifying the objects, based on statistical data. The controller can be configured to analyze the objects in detail. The controller can be configured to form tracking data including movements of the objects.
Other configurations can be used, such as one global excitation source, one local excitation source, or multiple local excitation sources. The excitation sources can be disposed above, below, or at a side of the sample.
In some embodiments, the light sources 1021 and/or 1022 can be configured to function as the excitation source, or the excitation sources can be placed at or near the light sources. For example, a fluorescent filter can be disposed on a light source to provide fluorescence excitation energy to the sample.
An excitation source 1030B1, such as an LED, can emit a radiation signal, such as visible, infrared, or ultraviolet light, to all areas of the sample, e.g., functioning as a global radiation excitation source. An excitation source 1030B2 can emit focused radiation to an area of the sample, e.g., functioning as a local radiation excitation source.
The excitation energy 1031B can be periodically pulsed, e.g., the excitation source provides periodic pulses of excitation energy to the sample. The excitation energy can be constant or can be varied, such as changing pitches, duty cycles, on times, off times, a gradually increased excitation energy, or a gradually decreased excitation energy.
The excitation energy 1031C can be one or more pulses, e.g., the excitation source provides one or more pulses of excitation energy to the sample. The excitation energy can be constant or can be varied, such as changing pitches, duty cycles, on times, off times, a gradually increased excitation energy, or a gradually decreased excitation energy.
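As an illustration of how a controller might deliver such a varying pulse train, the following minimal Python sketch assumes a hypothetical set_excitation(level) call (not part of any specific MCAM interface) that sets the excitation output; the on times, off times, and the linear amplitude ramp are arbitrary example parameters.

```python
import time

def run_pulse_train(set_excitation, n_pulses=10, on_time=0.05, off_time=0.45,
                    start_level=0.2, end_level=1.0):
    """Deliver a pulse train whose amplitude ramps from start_level to end_level."""
    for i in range(n_pulses):
        # linearly interpolate the excitation energy across the train
        level = start_level + (end_level - start_level) * i / max(n_pulses - 1, 1)
        set_excitation(level)      # pulse on
        time.sleep(on_time)
        set_excitation(0.0)        # pulse off
        time.sleep(off_time)
```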
In addition, the excitation sources can include injectors or micro-injectors to inject various model organisms with certain biochemical material, or to insert specific chemicals, toxins or other biochemical material into the specimen arena. The excitation sources can include manipulators or micro-manipulators, which can be used to manipulate, stimulate, perturb or otherwise change the model organisms or their surrounding area. The excitation sources can include equipment such as voice coils or LCD screens, which can be used to stimulate the visual, auditory, olfactory or other sensory systems of the model organisms within the specimen plane.
The excitation sources can be placed surrounding the specimen or sample holder to manipulate the specimen, the sample, the medium, or the organisms in the sample. The excitation sources can be electronically controlled by a controller or a processor, such as a desktop computer.
In some embodiments, an object detection process can be used on the captured images to detect the presence of the objects, such as the organisms in the sample. The object detection process can include a feature extraction process, which can reduce the image data into a group of features that can be used to determine if an object is present in the image.
The feature extraction process can be used to detect shapes or edges in an image. A general and basic approach to finding features is to find unique keypoints, e.g., the pixel coordinates of distinctive features, in each image. Each feature can then be identified as a set of pixel coordinates and a box size surrounding the feature. For example, the feature detection can look for areas of an image that contain high amounts of information and are therefore likely to contain the features of interest.
In some embodiments, the object detection method can include an edge detection algorithm, e.g., finding the edge features of the object to detect the object. The object detection method can be combined with centroid-finding algorithms and/or inpainting algorithms to assist with robust object detection.
The image data is sent, pixel by pixel and row by row, from the cameras, e.g., from the image sensors of the cameras, to the processor, either to a main processor or to a processor coupled to each camera. The edge detection algorithm can process the image data after the image or a portion of the image is received by the processor. For example, the edge detection algorithm can process the image data row by row, e.g., processing each row after its data is received, or pixel by pixel, e.g., processing each pixel as the pixels arrive. Alternatively, the edge detection algorithm can process the image data after the whole image is received. The edge detection algorithm works by looking for rapid changes in image brightness or contrast in the image data. Example edge detection methods include the application of a Canny filter or a set of other asymmetric convolutional filters.
If the cameras capture images in monochrome, the edge detection algorithm can look at the brightness differences among nearby pixels. If the cameras capture images in color, brightness differences within color channels can be calculated. The rate of change in brightness/contrast (change in brightness divided by the number of nearby pixels) that signifies an edge can be a registered parameter of the edge detection algorithm that can be configured for tuning purposes. In addition, the coordinates of the edges detected in each row of an image are stored, and reset for a new frame, so that the location of each new edge detected can be compared to those of neighboring pixel rows to determine whether it is part of the same object.
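A minimal sketch of this row-by-row brightness-difference approach is shown below; the threshold argument plays the role of the configurable rate-of-change parameter described above, and the function name and interface are illustrative rather than part of the described system.

```python
import numpy as np

def detect_edges_in_row(row, prev_row=None, threshold=30):
    """Flag edge pixels in one streamed image row from brightness changes.

    `row` is a 1D array of monochrome pixel values; `prev_row` (if given)
    lets changes against the neighboring row also be detected.
    """
    row = row.astype(np.int32)
    horizontal = np.abs(np.diff(row)) > threshold           # change along the row
    edges = np.zeros(row.shape, dtype=bool)
    edges[1:] |= horizontal
    if prev_row is not None:
        vertical = np.abs(row - prev_row.astype(np.int32)) > threshold
        edges |= vertical                                    # change between rows
    return np.flatnonzero(edges)                             # edge pixel columns
```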
After an object is detected, information about the detected object, such as its shape, its dimensions, or the aspect ratios of various dimensions, is compared to that of the targeted objects. This comparison step allows the removal of detected objects which are not the objects of interest.
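The comparison step can be as simple as a size and aspect-ratio gate, as in the following illustrative sketch, where the bounds stand in for the user-supplied object feature inputs; the function name and (x, y, width, height) box format are assumptions for illustration.

```python
def reject_non_target(detections, min_area, max_area, min_aspect, max_aspect):
    """Keep only detections whose size and aspect ratio match the target organism."""
    kept = []
    for (x, y, w, h) in detections:
        area = w * h
        aspect = w / h if h else float("inf")
        if min_area <= area <= max_area and min_aspect <= aspect <= max_aspect:
            kept.append((x, y, w, h))       # plausible organism, keep it
    return kept                             # debris and artifacts are dropped
```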
Further, a detected object in one frame can be compared with, or associated with, detected objects in images from neighbor cameras to merge the same object appearing in multiple cameras. For example, the duplicated portion of the object can be removed based on the overlapped image that shows the same object in different cameras.
The algorithm can consider movements of the object, such as the object can exit the field of view of one camera and enter the field of view of a neighbor camera. Given that there is some amount of overlap in the fields of view of neighbor cameras, the processor can maintain an accurate measurement of the location of each organism as the organism crosses image sensor boundaries, for example, by maintaining a precise measurement in both cameras, and communicating information that the object is moving towards a second camera from a first camera.
In some embodiments, the object detection method can include a projection method or algorithm. In the projection method, image pixels are summed along each row streaming in. The image pixels are also accumulated along each column. A predefined threshold can be used to identify rows and columns with intensity values that deviate from standard values stored in a look-up table created during an MCAM calibration process, from which an object may be localized along a row. The process can be completed along the rows and columns of pixels to localize objects along two coordinates. The projection method can be extended to use fitted curves instead of threshold values and can be combined with centroid-finding algorithms and inpainting algorithms to assist with robust object detection.
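The following sketch illustrates the projection idea with NumPy on a full frame for clarity (a streaming implementation would accumulate the sums row by row as the data arrives); the background projection arrays stand in for the calibration look-up table, and the threshold value is an arbitrary example.

```python
import numpy as np

def localize_by_projection(frame, row_background, col_background, threshold=500):
    """Localize candidate objects from row/column intensity projections.

    `row_background` and `col_background` hold the expected projection values
    from a calibration look-up table; rows/columns deviating by more than
    `threshold` are flagged as containing an object.
    """
    row_sums = frame.sum(axis=1).astype(np.int64)
    col_sums = frame.sum(axis=0).astype(np.int64)
    object_rows = np.flatnonzero(np.abs(row_sums - row_background) > threshold)
    object_cols = np.flatnonzero(np.abs(col_sums - col_background) > threshold)
    return object_rows, object_cols    # intersecting ranges give 2D locations
```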
In some embodiments, a neural network such as a convolutional neural network (CNN) can be employed for object detection. Example convolutional neural networks include the YOLO series and Faster R-CNN series of algorithms that can be implemented at high speed. Other object detection approaches include Fast R-CNN, Histogram of Oriented Gradients (HOG), Region-based Convolutional Neural Networks (R-CNN), Region-based Fully Convolutional Network (R-FCN), Single Shot Detector (SSD), and Spatial Pyramid Pooling (SPP-net).
The CNN process can be applied to the captured images, such as to each camera's image data in parallel, to create bounding box coordinates for detected objects for each camera. The bounding box coordinates can be aggregated, using inter-camera overlaps to reduce double-counting of objects and to merge objects having portions in multiple neighbor cameras. The object detection CNN algorithms can additionally report a classification score for each object, in addition to the location and bounding box width/height. The classification score can be used to categorize each object. Categorizations include unique identification of individual objects or unique identification of object type.
After the feature extraction process, the pooling layers can be subjected to a classification process, which can include a flattening process to form fully connected nodes 1364 and a prediction output 1365. The prediction output can include a probability distribution matching the detected objects to the target organisms, and thus can be used to classify the detected objects.
The object detection algorithm can also be used to uniquely identify multiple objects across multiple image frames acquired as a function of time, to enable object tracking as a function of time. For example, the object detection algorithm is used to locate and draw bounding boxes around detected objects. The bounding boxes are then aggregated to form moving videos of the objects.
The objects and partial objects from the center camera and the neighbor cameras can be processed together to merge the objects and partial objects across the cameras using the overlap areas between adjacent cameras. For example, the debris objects 1463 and 1463* are detected in the overlap area of the center camera and a neighbor camera. The two objects 1463 and 1463* are merged, e.g., one object is removed to form a single object 1453*. The partial objects 1450* are detected in the center camera, the neighbor camera, and the overlap area between the two cameras. The two objects 1450* are merged, e.g., the overlap portion of the object in the overlap area is removed to form a single object 1453.
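One possible way to realize this cross-camera merging is to map each camera's boxes into global sample coordinates using calibrated camera offsets and fuse boxes that overlap strongly, as in the sketch below; the offsets, the (x, y, width, height) box format, and the IoU threshold are illustrative assumptions rather than the exact procedure tied to the figure.

```python
def to_global(box, cam_offset):
    """Map a (x, y, w, h) box from camera-pixel to global sample coordinates
    using a calibrated camera offset (from the MCAM calibration data)."""
    x, y, w, h = box
    ox, oy = cam_offset
    return (x + ox, y + oy, w, h)

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def merge_across_cameras(per_camera_boxes, offsets, iou_threshold=0.3):
    """Merge detections that appear in the overlap region of neighbor cameras.

    `per_camera_boxes` maps camera id -> list of (x, y, w, h); duplicates whose
    global-coordinate boxes overlap strongly are fused into their union box.
    """
    global_boxes = [to_global(b, offsets[cam])
                    for cam, boxes in per_camera_boxes.items() for b in boxes]
    merged = []
    for box in global_boxes:
        for i, kept in enumerate(merged):
            if iou(box, kept) > iou_threshold:
                # union of the two boxes replaces the duplicate pair
                x = min(box[0], kept[0]); y = min(box[1], kept[1])
                x2 = max(box[0] + box[2], kept[0] + kept[2])
                y2 = max(box[1] + box[3], kept[1] + kept[3])
                merged[i] = (x, y, x2 - x, y2 - y)
                break
        else:
            merged.append(box)
    return merged
```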
In some embodiments, bounding boxes 1456 can be drawn around the merged objects. The bounding boxes can be employed to produce cropped image segments per frame, wherein only the pixels within each bounding box area are saved and utilized for additional processing. These cropped image segments can subsequently be spatially aligned to create a centered organism video for each organism of interest. Further, per-organism analysis can be performed to provide detailed information for each organism.
The imaging system can capture images of a sample, and can process the image data to obtain a statistical measure of one or more features of objects of interest within the sample.
Operation 1451 performs an image capture process for the imaging system. Operation 1452 performs an object detection process on the captured image. The image areas containing the detected objects can be cropped out from the captured images to form bounding boxes around each object.
In some embodiments, an image captured from each camera can be split into one or more smaller segments, so that the smaller segments can be fed into a supervised machine learning algorithm, such as a deep neural network, that has been trained with prior acquired data for the object detection task. The output of the object detection can be a set of pixel coordinates and box sizes, with each pair of pixel coordinates and two box sizes representing an object. The object detection process can include a rejection of detected objects that do not meet the characteristics of the target objects.
Operation 1457 performs analysis and classification on the bounding boxes of objects. For example, the image data for each bounding box can be passed through a supervised machine learning algorithm, such as a deep convolutional neural network (CNN), for the task of machine learning-based image analysis. The deep CNN can be trained with prior data to classify each object into one of several categories.
Operation 1458 generates a decision based on a statistical analysis of the objects. After the image classification task, the set of all classification scores may be further combined via a statistical approach (e.g., by computing their mean, median, mode, or some other metric) or a machine learning-based approach (such as multiple instance learning, which would consist of an additional classification-type step on the compiled set of class scores).
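A minimal sketch of the statistical pooling option is shown below, assuming the per-object scores are collected into a NumPy array; the multiple-instance-learning variant would replace the pooling step with an additional learned classifier.

```python
import numpy as np

def combine_scores(class_scores, method="mean"):
    """Combine per-object classification scores into a sample-level decision.

    `class_scores` is an (n_objects, n_classes) array of per-object scores.
    Returns the decided class index and the pooled score vector.
    """
    if method == "mean":
        pooled = class_scores.mean(axis=0)
    elif method == "median":
        pooled = np.median(class_scores, axis=0)
    else:
        raise ValueError("unsupported pooling method")
    return int(np.argmax(pooled)), pooled
```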
In some embodiments, the MCAM system can include a main processor, such as a central processing unit of a desktop computer, which is coupled to the cameras to receive the image data from the image sensors of the cameras. The processor can include a control module, e.g., a controller, for controlling the elements of the MCAM system, such as controlling the camera, the light source, or the excitation source parameters. In some embodiments, the MCAM system can include a controller for controlling the MCAM elements. The controller can include a main processor, such as a central processing unit of a desktop computer or a data processing system.
A parallel to serial data conversion device can be disposed between the main processor and the cameras, for converting the multiple parallel image data streams from the cameras to a serial data image stream to the memory of the processor. The parallel to serial data conversion device can be an FPGA, or any other electronic device configured to perform the parallel to serial conversion.
In operation, after each of the cameras acquires an image, the image data from each camera are sent, in parallel, to the FPGA. The FPGA then sequentially outputs the image data as a serial data stream to the processor to be processed, or to the memory of the processor. The parallel to serial conversion, e.g., in the FPGA, can be performed sequentially on each image or on portions of each image. For example, image data from camera 1 is sent first to the processor, followed by the image data from camera 2, and so on. Alternatively, a portion of the image data from camera 1 is sent, followed by a portion of the image data from camera 2, and so on.
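The two serialization orders (whole frames versus interleaved per-camera portions) can be illustrated with a simple Python generator; this is only a software analogy for the FPGA behavior, and the data layout is an assumption for illustration.

```python
def serialize_frames(camera_frames, chunk_rows=None):
    """Yield image data from many cameras as one serial stream.

    `camera_frames` maps camera id -> 2D array. With `chunk_rows` set, the
    stream interleaves row chunks from each camera instead of whole frames,
    mimicking the two serialization options described above.
    """
    if chunk_rows is None:
        for cam_id, frame in camera_frames.items():
            yield cam_id, frame                                  # whole frame per camera
    else:
        n_rows = max(f.shape[0] for f in camera_frames.values())
        for start in range(0, n_rows, chunk_rows):
            for cam_id, frame in camera_frames.items():
                yield cam_id, frame[start:start + chunk_rows]    # portion per camera
```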
An object detection algorithm, and subsequently an object tracking and analyzing algorithm, can be applied to the image data stored in the memory, including an edge detection algorithm, a projection algorithm, a centroid-finding algorithm, a neural network such as a convolutional neural network, or an inpainting algorithm. For example, the object detection is first performed to find the objects of interest, e.g., after removing unsuitable objects. The image data can then be cropped to form bounding boxes, e.g., regions of interest. The bounding boxes can be centered upon each object of interest, and specific objects can be correlated as a function of time for tracking. Data from the bounding boxes are saved to the memory after processing.
Using the main processor, advanced processing algorithms can be run on a GPU or CPU, including algorithms that are not fast enough or flexible enough to run on the FPGA. Advantages of this configuration include the ability to reduce saved data for subsequent per-organism analysis. This is especially relevant for MCAM video, which typically streams 50-100 camera frames (10 million pixels each) at 10 frames per second, for 5-10 gigabytes of data per second.
In some embodiments, the cameras can include micro-camera packages, which can include multiple camera sensors and optical components assembled on a board 1614, such as on a Printed Circuit Board (PCB).
In operation, the processor can process the image data from the cameras in sequence, e.g., one after the other. The detected objects can be subjected to an across-camera analysis to merge objects and to remove duplicated objects across the cameras. The objects in bounding boxes can then be analyzed, such as by motion tracking and object analysis.
Operation 1707 detects objects or partial objects in the captured images in sequence. Operation 1708 merges or removes duplicate objects across neighbor cameras. Operation 1710 determines characteristics to reject detected objects not meeting the input object data. Operation 1711 forms bounding boxes and locations for the objects meeting the characteristics of the input object data. Operation 1712 transforms objects in bounding boxes. Operation 1714 analyzes the objects in detail. Operation 1715 repeats the process as a function of time for tracking. Operation 1716 forms tracking data including movements and other actions of the objects.
The controller can be configured to accept inputs related to the objects being tracked, with the inputs including at least object shapes, dimensions and characteristics, object types, object identification.
The controller can be configured to detect objects or partial objects in the captured images from individual cameras. The controller can be configured to merge or remove duplicate objects across neighbor cameras. The controller can be configured to determine characteristics to reject detected objects not meeting the input object data.
The controller can be configured to form bounding boxes and locations for the objects meeting the characteristics of the input object data. The controller can be configured to transform objects in bounding boxes. The controller can be configured to analyze the objects. The controller can be configured to form tracking data including movements of the objects.
In some embodiments, the present invention discloses methods to capture microscopy images from multiple image sensors and transfer them to a central processing unit with minimum delay in the image transfer. A benefit of the MCAM system is the ability to rapidly record high-resolution microscopy imagery over a very large field of view using a multitude of micro-cameras. Further, the MCAM system architecture can include complete or partial parallel processing for each image data captured from the cameras, which can negate the disadvantage of serially processing the image data from the multiple cameras.
In some embodiments, the cameras can include micro-camera packages, which can include multiple camera sensors and optical components assembled on a printed circuit board.
The multiple pre-processors can be integrated to the cameras, or can be a separate device. For example, the multiple pre-processors can be a separate FPGA (or any other electronic device) coupled between the cameras and the parallel to serial device 1867. Alternatively, the multiple pre-processors can be integrated to the parallel to serial device 1867, e.g., an FPGA can be configured to perform the multiple pre-processor functions and the parallel to serial function.
In operation, the pre-processors 1841 and the main processor 1842 can share the data analysis, including detecting objects, merging objects, removing debris, boxing objects, tracking object movements, and analyzing objects. The division of labor between the processors 1841 and 1842 can vary, from most of the analysis being performed on the main processor 1842 to most of the analysis being performed on the pre-processors 1841.
In some embodiments, the pre-processors are configured to perform a quick analysis that screens the image data to determine if there are objects in the image data from each camera. Only image data from the cameras detecting objects are sent to the processor 1842 for analysis. Thus, the processor 1842 is configured to perform the same analysis operations as it would without the pre-processors. A main benefit is the reduction of image data, since only the image data having objects are sent to the processor 1842.
Thus, for samples with few objects, the screening operation of the pre-processors 1841 can be beneficial, since it significantly reduces the amount of data that the processor 1842 needs to process. For example, in an MCAM system having 100 cameras and one or two organisms, the number of cameras having an object ranges from 2 (each object is in the middle of one camera's FOV), to 3 (one object is in the middle of one camera's FOV and one object spans 2 cameras), to 4 (each object spans 2 cameras), up to 5, 6, or 8 (two objects spanning up to 4 cameras each), for a reduction ratio between 2/100 and 8/100.
The quick screening operation can include a determination of no frame-to-frame change across the cameras, up to some threshold, as frames stream in from each camera. The pre-processors can store previous frames from the cameras in memory, and the algorithm running on the pre-processors can compare new frames with the stored frame, such as computing the energy of the difference between new frames acquired from each camera and the previously stored frames from the same camera. Alternatively, the comparison can be performed on the background images, obtained and stored from a calibration process.
Further, the quick screening operation can include a detection of any finite area with significant deviation, up to some threshold, with respect to the background, by computing a spatial gradient via convolution and examining the total energy. The spatial gradient can additionally be implemented across one or more previously captured frames, and the change in the gradient over time (e.g., from frame to frame) can be used as an indication of whether there is a frame-to-frame change as a function of time for one or more micro-cameras within the array.
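A minimal sketch of such a quick screen is shown below: it computes the energy of the difference between a new frame and a stored reference (a previous frame or a calibration background) and the energy of the spatial gradient of that difference; the threshold values are placeholders that would be set during calibration.

```python
import numpy as np

def frame_has_activity(frame, reference, diff_threshold, grad_threshold):
    """Quick per-camera screen: does this frame differ from its reference?

    `reference` is the previously stored frame or the calibration background.
    Both the difference energy and the spatial-gradient energy of the
    difference are compared against configurable thresholds.
    """
    diff = frame.astype(np.float64) - reference.astype(np.float64)
    diff_energy = np.sum(diff ** 2)
    # spatial gradient of the difference image via finite differences
    gy, gx = np.gradient(diff)
    grad_energy = np.sum(gx ** 2 + gy ** 2)
    return diff_energy > diff_threshold or grad_energy > grad_threshold
```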
Subsequently, the MCAM can turn off the cameras that contain limited or no frame-to-frame change as a function of time, or limited or no significant deviation identified via computing the spatial gradient and examining the total energy across each frame acquired as a function of time. By "turning off" cameras that exhibit no frame-to-frame change over time, it is meant that no data from such cameras will be passed along from Processor 1 to Processor 2 for subsequent processing. The power does not necessarily need to be removed; instead, the data can be ignored during the "turned off" state. This approach reduces the total data overhead sent from Processor 1 to Processor 2 and can subsequently yield higher image frame rates.
The pre-processors can then send the remaining frames, which do have frame-to-frame change or deviation of the spatial gradient energy as compared to some threshold (e.g., from cameras that have not been turned off), to the main processor for analysis. The main processor can be located on a nearby computer, either the same computer used to control the MCAM imaging system, or a separate computer dedicated to image processing.
Cameras from a camera array can capture images from a sample. After the images are captured, a pre-processing module in each camera can pre-process the data of the captured image, such as detecting the presence or absence of objects. The image data from cameras showing objects are sent to the parallel to serial device 1967, to form a serial data stream to the memory 1943, for example, by direct memory access.
Operation 2105 processes the captured image data from each camera in parallel to find cameras whose captured image data do not contain any objects, e.g., to determine an absence or presence of an object, such as data containing only background image data, showing no frame-to-frame change, showing no area with significant deviation with respect to the background, or otherwise detecting no object. This process is configured to detect the presence of an object, e.g., whether or not there is an object, rather than to detect the object itself, e.g., its location. The object-presence detection process can therefore be faster than the object detection process; for example, it can simply observe changes in intensity in local areas or in the whole frame, without the need for a detailed analysis to find the object.
Operation 2106 sends the captured image data from the cameras detecting objects to a central processor. For example, the captured data can be sent to a parallel to serial device, such as an FPGA, which can stream the multiple image streams from the cameras to the central processor. The parallel to serial device can be input controllable, e.g., the device can determine which input image streams are to be used for forming the serial data stream. The input-controlled behavior of the parallel to serial device can be achieved by programming the FPGA, which can allow the FPGA to only send image data from cameras detecting the presence of an object.
Thus, the one or more processors coupled to the cameras can be used for screening the image data from the cameras, so that only the image data having objects are processed.
Operation 2107 detects objects or partial objects in the captured images in sequence. Operation 2108 merges or removes duplicate objects across neighbor cameras. Operation 2110 determines characteristics to reject detected objects not meeting the input object data. Operation 2111 forms bounding boxes and locations for the objects meeting the characteristics of the input object data. Operation 2112 transforms objects in bounding boxes. Operation 2114 analyzes the objects in detail. Operation 2115 repeats the process as a function of time for tracking. Operation 2116 forms tracking data including movements and other actions of the objects.
The multiple cameras are disposed on a board, with each camera coupled to a processor or a device configured to detect a presence or an absence of objects in the captured images, such as only containing background image data, showing no frame-to-frame change, showing no area with significant deviation with respect to the background, or detecting no object.
The processors are coupled to the controller to deliver the captured image data from the cameras detecting the presence of an object in a serial data stream. The processors can be disposed on the board, or disposed on a separate element. The processors can be configured to be multiple separate devices, with each device coupled to a camera. Alternatively, the processors can be configured to be in one or more devices, with each device including one or more of the processors for coupling to one or more cameras. Alternatively, the processors can be configured to be in a single device, with the single device coupled to the multiple cameras.
The controller can be configured to form tracking data including movements of objects detected from the captured images sent to the controller.
In some embodiments, the pre-processors are configured to detect objects or partial objects in the captured images and to form bounding boxes around the detected objects or partial objects. The bounding box data are then sent to the processor 2342 for analysis. Image data without objects are not sent. In addition, areas around the objects in image data having objects are also not sent. The processor 2342 is configured to perform cross-camera analysis on the bounding boxes, together with tracking and analysis of the objects.
The processor is configured to form bounding boxes around the detected objects or partial objects. The processors are coupled to the controller to deliver the bounding boxes in a serial data stream. The processors are disposed on the board, or disposed on a separate element. The processors can be configured to be multiple separate components, with each component coupled to a camera. Alternatively, the processors can be configured to be in one or more components, with each component including one or more of the processors for coupling to one or more cameras. Alternatively, the processors can be configured to be in a single component, with the single component coupled to the multiple cameras.
The controller can be configured to merge or remove duplicate bounding boxes across neighbor cameras to form composite bounding boxes. The controller can be configured to determine characteristics, including dimensions and shapes, of detected objects, to compare the characteristics with the input object data, and to reject detected objects not meeting the input object data. The controller can be configured to accept composite bounding boxes for the objects meeting the characteristics of the input object data. The controller can be configured to form tracking data including movements of objects detected from the captured images sent to the controller.
In some embodiments, the pre-processors are configured to detect objects in the captured images and to form bounding boxes around the detected objects. Since the pre-processors are also connected to neighbor cameras, the cross-camera analysis to merge objects and to remove duplicates can be performed at the pre-processors to form bounding boxes around the detected objects. The bounding box data are then sent to the processor 2642 for analysis. Image data without objects are not sent. In addition, areas around the objects in image data having objects are also not sent. The processor 2642 is configured to perform tracking and analysis of the objects.
The bounding box data are sent to the FPGA 2667, which performs a parallel to serial conversion. The serial data stream is then sent to the processor, e.g., to a memory of the processor, for analysis, including 2652 tracking and analyzing the objects.
There are two primary advantages of implementing object detection and tracking on the pre-processors. First is the data reduction at standard frame rates. In some embodiments, a full image from one sensor has between 10-20 million pixels. If object detection is implemented and only 3 objects are identified, each taking up 10,000 pixels, then the total number of pixels that will be passed to the computer is just 30,000, which is much less than the full frame. Second, data reduction is possible with increased frame rate. Many image sensors allow for a region of interest within the sensor's imaging area to be specified for acquisition, which reduces the amount of data the sensor needs to output to the FPGA per frame. This, therefore, allows the sensor to output video streams at increased frame rates. The FPGA, which is tracking model organisms in real-time, would be able to configure the sensors to output only the range of pixels that encompass the tracked organism, thereby increasing the frame rate of video acquisition as well as reducing the amount of data the FPGA needs to analyze to detect and track the model organisms in the next detection cycle. This would not only help the tracking algorithm detect model organisms in the image data more quickly and accurately, but it would provide for the user high frame-rate video footage, which is useful, if not necessary, for studies investigating motor function, neural activity, and behavior among other things.
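One way the FPGA could derive the sensor region of interest from the previous detection cycle is sketched below; the (x, y, width, height) box format and the margin are illustrative assumptions, and the resulting ROI would be written into whatever region-of-interest registers the particular sensor exposes.

```python
def roi_from_tracks(boxes, margin, sensor_width, sensor_height):
    """Compute the sensor region of interest that encloses all tracked boxes.

    `boxes` are (x, y, w, h) detections from the previous cycle; `margin`
    pads the ROI so a moving organism stays inside it until the next cycle.
    """
    if not boxes:
        return 0, 0, sensor_width, sensor_height        # fall back to full frame
    x0 = max(0, min(x for x, y, w, h in boxes) - margin)
    y0 = max(0, min(y for x, y, w, h in boxes) - margin)
    x1 = min(sensor_width, max(x + w for x, y, w, h in boxes) + margin)
    y1 = min(sensor_height, max(y + h for x, y, w, h in boxes) + margin)
    return x0, y0, x1 - x0, y1 - y0                      # x, y, width, height
```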
In some embodiments, the frame rate of the cameras can be controlled, for example by a main processor or a controller, based on the data transfer rate of the parallel to serial device. A high camera frame rate can be set when the data transfer rate is low, which is related to the number of detected organisms in the sample. The optimum frame rate can be the frame rate corresponding to the maximum data transfer rate of the parallel to serial device, e.g., the FPGA between the pre-processors and the main processor.
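The relationship between the link's data transfer rate, the number of pixels actually streamed, and the achievable frame rate can be expressed with simple arithmetic, as in the sketch below; one byte per pixel is assumed, and sensor readout limits would cap the result in practice.

```python
def max_frame_rate(link_bandwidth_bytes_per_s, active_pixels, bytes_per_pixel=1):
    """Highest frame rate the parallel-to-serial link can sustain.

    `active_pixels` is the total pixel count actually streamed per frame set
    (full frames, or just the ROIs/bounding boxes after screening).
    """
    bytes_per_frame = active_pixels * bytes_per_pixel
    return link_bandwidth_bytes_per_s / bytes_per_frame

# Example: on a 5 GB/s link, 100 full 10-Mpixel frames allow only about
# 5 full-array snapshots per second, while a few small bounding boxes allow
# frame rates far beyond what the sensors themselves can deliver.
```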
The MCAM system further includes one or more processors coupled to the multiple cameras, with each processor coupled to a first camera and to second cameras neighboring the first camera. Each processor is configured to detect objects or partial objects in the captured images from the first camera. Each processor is configured to merge the detected partial objects or remove the objects or the partial objects duplicated with the second cameras. Each processor is optionally configured to reject objects, of the objects or the merged objects, not meeting characteristics of the input object data. Each processor is configured to form bounding boxes around the non-rejected objects and merged objects.
The processors are coupled to the controller to deliver the bounding boxes in a serial data stream. The multiple cameras and the one or more processors are disposed on a board. Alternatively, the multiple cameras are disposed on a board and the one or more processors are disposed on a separate element.
The processors can be configured to be multiple separate components, with each component coupled to a camera. Alternatively, the processors can be configured to be in one or more components, with each component including one or more of the processors for coupling to one or more cameras. Alternatively, the processors can be configured to be in a single component, with the single component coupled to the multiple cameras.
The controller can be optionally configured to reject objects, of the objects or the merged objects, not meeting characteristics of the input object data. The controller can be configured to form bounding boxes around the non-rejected objects and merged objects. The controller can be configured to form tracking data including movements of objects detected from the captured images sent to the controller.
In some embodiments, the present invention discloses a microscope technology that offers the ability to track and image organisms in large areas in 2D or 3D. The technology includes multiple cameras having overlapped fields of view, which can be utilized for depth determination using stereoscopy or photogrammetry. For example, tuning the microscope to a large amount of field of view overlap, such as at least 50% in one direction, can enable the MCAM system to perform 3D object tracking and 3D organism behavior analysis across a finite depth range, which is useful in certain applications of model organism behavioral study.
After tuning to larger than 50% overlap, all areas of the sample are overlapped by two or more cameras. In the overlap areas, optical information about points within the specimen plane is captured by two or more cameras. Such redundant information can be used by stereoscopic and/or photogrammetry methods to obtain an estimate of object depth and/or an object depth map, which can be combined with the 2D information that is captured about object position and morphology.
With the larger than 50% overlap, all areas in the sample are captured by the cameras in two or more images. The captured images can be processed to obtain 3D positions of the objects, for example, by inputting the image data into a 3D object detection convolutional neural network (CNN), which employs stereoscopic or photogrammetric principles in its feature kernels.
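For intuition, a simplified pinhole-stereo relation can convert the disparity of an object matched between two overlapping cameras into a depth estimate, as sketched below; this is only the textbook relation, not the CNN-based 3D detection described above, and the parameter names are illustrative.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
    """Estimate object depth from the disparity seen by two overlapping cameras.

    Standard stereo relation z = f * B / d, with the focal length in pixels,
    the baseline (inter-camera spacing) in mm, and the disparity in pixels.
    """
    if disparity_px == 0:
        return float("inf")          # no measurable disparity, depth unresolved
    return focal_length_px * baseline_mm / disparity_px
```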
In some embodiments, an MCAM system can be tuned to have 50% or larger overlap field of view for 3D object tracking. The overlap amount can be changed by changing the magnification of the cameras or the fields of view of the cameras. For example, decreasing the magnification of each camera can increase the inter-camera overlap for the MCAM system.
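The dependence of overlap on magnification can be sketched with a simple relation: the per-camera field of view at the sample is roughly the sensor width divided by the magnification, so the overlap fraction with a neighbor spaced by the camera pitch is one minus the pitch over the FOV. The sketch below uses illustrative example numbers, not parameters of any particular MCAM build.

```python
def overlap_fraction(sensor_width_mm, magnification, camera_pitch_mm):
    """Fraction of one camera's field of view shared with its neighbor.

    The per-camera FOV at the sample is sensor_width / magnification, so
    lowering the magnification widens the FOV and raises the overlap.
    """
    fov_mm = sensor_width_mm / magnification
    return max(0.0, 1.0 - camera_pitch_mm / fov_mm)

# Example: a 5 mm sensor at 0.5x magnification gives a 10 mm FOV; with a
# 4 mm camera pitch the overlap is 1 - 4/10 = 60%, above the 50% needed
# for 3D tracking.
```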
In some embodiments, the MCAM can include a mechanism for toggling between 2D tracking and 3D tracking, for example, by changing the magnification of the cameras to obtain either less than 50% or 50% or more FOV overlap.
In some embodiments, the MCAM can be tuned to 50% or more FOV overlap for both 2D and 3D object tracking. The 50% or more FOV overlap can enable 3D object tracking, and does not affect the ability of the MCAM for 2D object tracking.
With less than 50% FOV overlap, the MCAM is tuned for 2D object tracking, e.g., maximizing the sample area or maximizing the magnification of the sample.
With more than 50% FOV overlap, the MCAM is tuned for 3D object tracking. The MCAM can be set to more than 50% FOV overlap for both 2D and 3D object tracking.
In some embodiments, the MCAM can include a mechanism for toggling between 2D tracking and 3D tracking, for example, by adjusting the camera stage, the sensor adjustment mechanism, the lens adjustment mechanism, or the optic adjustment mechanism.
Changing at least a characteristic of individual cameras includes changing a magnification or a field of view of individual cameras in the MCAM. Changing at least a characteristic of a camera stage includes moving the camera stage relative to the sample support. Changing at least a characteristic of a sample support includes moving the sample support relative to the camera stage.
The optional pre-process includes finding excluded cameras whose captured image data do not contain any objects, such as data containing only background image data, showing no frame-to-frame change, or detecting no object. Alternatively, the optional pre-process includes detecting objects and forming bounding boxes in 3 dimensions around the detected objects. Operation 3405 sends the image data captured by the cameras, or optionally pre-processed by the pre-processors, to the central processor.
The image sending includes sending image data captured by the cameras, as a serial data stream to a memory of the central processor to be processed. Alternatively, the image sending includes sending image data captured by the cameras excluding image data from the excluded cameras. Alternatively, the image sending includes sending bounding box image data pre-processed by the pre-processors, as a serial data stream to a memory of the central processor.
Operation 3406 detects objects or partial objects in the captured images from individual cameras in 3 dimensions. Operation 3407 merges or removes duplicate objects across neighbor cameras. Operation 3408 determines characteristics of detected objects, compares the characteristics with the input object data, and rejects detected objects not meeting the input object data. Operation 3410 forms bounding boxes and locations in 3 dimensions for the objects meeting the characteristics of the input object data. Operation 3413 analyzes the objects. Operation 3414 repeats the process as a function of time for tracking. Operation 3415 forms tracking data including movements and other actions of the objects.
The controller can be configured to accept inputs related to the objects being tracked, with the inputs including at least object shapes, dimensions and characteristics, object types, object identification.
The controller can be configured to detect objects or partial objects in the captured images from individual cameras in 3 dimensions. The controller can be configured to merge or remove duplicate objects across neighbor cameras. The controller can be configured to determine characteristics to reject detected objects not meeting the input object data. The controller can be configured to form bounding boxes and locations in 3 dimensions for the objects meeting the characteristics of the input object data.
The controller can be configured to transform objects in bounding boxes, including centering the objects and translating, rotating, skewing, enlarging, or reducing the objects to conform to a same size and orientation. The controller can be configured to analyze the objects. The controller can be configured to form tracking data including movements of the objects.
The present patent application claims priority from U.S. Provisional Patent Application Ser. No. 63/230,472, filed on Aug. 6, 2021, entitled "System and method to simultaneously track multiple organisms at high resolution", of the same inventors, hereby incorporated by reference in its entirety.