This application claims priority to European Application 23155841.2, which was filed on Feb. 9, 2023. The content of this earlier filed application is incorporated by reference herein in its entirety.
Examples relate to a method, system, and computer program for adjusting a first and a second machine-learning model, to a method, system, and computer program for processing a set of images, and to an imaging system.
In biomedical research, biological processes are monitored using imaging systems, such as microscopes. In many research methodologies, hypotheses are tested by observing such biological processes over time, and using the observations to conclude whether the observations match the respective hypotheses. In general, this monitoring of the biological processes can be improved, and, to a degree, automated, by evaluating the observations with the help of an image analysis workflow, which is used to analyze various aspects of the image data showing the biological process. However, designing such an image analysis workflow may require a substantial amount of skill from the operator of the respective imaging system.
There may be a desire for providing an improved concept for configuring an image analysis workflow of an imaging system being used for the observation of biological processes.
This desire is addressed by the subject-matter of the independent claims.
Various examples of the present disclosure are based on the finding that, when a hypothesis to be confirmed is known, machine-learning can be used to tune the image analysis workflow such that it is capable of providing the information required for testing the hypothesis. For this purpose, two machine-learning models are trained together: a first machine-learning model to configure the image analysis workflow (or to implement the image analysis workflow), and a second machine-learning model to predict the hypothesis being tested. By training the first and second machine-learning model end-to-end, the output of the first machine-learning model converges towards an image analysis workflow that is suitable for predicting the hypothesis, and that is thus also suitable for evaluating the hypothesis. Thus, the image analysis workflow can be configured or implemented without requiring manual adjustments by the operator, which facilitates the process, in particular for less experienced operators.
Some aspects of the present disclosure relate to a method for adjusting a first and a second machine-learning model. The method comprises inputting a set of images representing a biological process into the first machine-learning model. The first machine-learning model is trained to perform an image analysis workflow or to generate parameters for parametrizing an image analysis workflow. The method comprises inputting an output of the image analysis workflow into the second machine-learning model. The second machine-learning model is trained to output a prediction of a hypothesis being evaluated using the biological process. The method comprises calculating a loss function based on a difference between the prediction of the hypothesis being evaluated using the biological process and an actual hypothesis being evaluated using the biological process. The method comprises adjusting the first and/or second machine-learning model based on the result of the loss function. As outlined above, the resulting image analysis workflow converges towards an image analysis workflow that is suitable for predicting the hypothesis, without requiring manual adjustments by the operator, which facilitates the process, in particular for less experienced operators.
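Purely as a non-limiting illustration, one such adjustment step may be sketched as follows in Python, assuming PyTorch-style components and a differentiable image analysis workflow (so that gradients can propagate end-to-end; otherwise, reinforcement learning may be used, as discussed further below). All names are illustrative placeholders, not elements of the claimed method:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of a single adjustment step. Assumes `workflow` is
# differentiable so that the loss can be backpropagated through both models.
def adjustment_step(first_model, workflow, second_model, optimizer,
                    images, actual_hypothesis):
    parameters = first_model(images)                # parameters for the workflow
    workflow_output = workflow(images, parameters)  # parametrized image analysis workflow
    prediction = second_model(workflow_output)      # predicted hypothesis (encoded)
    loss = F.cross_entropy(prediction, actual_hypothesis)  # difference to actual hypothesis
    optimizer.zero_grad()
    loss.backward()   # adjusts the first and second model together, end-to-end
    optimizer.step()
    return float(loss)
```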
In various examples, the goal of the proposed procedure is to identify a configuration or implementation of the image analysis pipeline that is suitable for confirming or disproving the hypothesis. Accordingly, adjustment of the first and/or second machine-learning model may continue until the hypothesis predicted by the second machine-learning model matches the actual hypothesis that is to be confirmed. In other words, the first and/or second machine-learning model may be adjusted until the prediction of the hypothesis matches the actual hypothesis according to a matching criterion.
While the proposed scheme can be applied from scratch to an arbitrary set of images and corresponding hypothesis, convergence may be sped up by training the machine-learning model(s) first in a training phase, and then just performing minor adjustments in an application phase. In some examples, the method may be performed over a plurality of iterations using a plurality of sets of images as training input images and a plurality of corresponding actual hypotheses for comparison with the hypotheses predicted by the second machine-learning model to train the first and/or second machine-learning model. This may occur in the training phase, to obtain machine-learning models that can be broadly applied to different sets of images and hypotheses. In the application phase, the adjustment may be continued, albeit with the set of images and hypothesis for which an image analysis workflow is to be established.
As an alternative to the training of the machine-learning model(s) being part of the proposed method, the training of the machine-learning model(s) may be performed a priori, e.g., by a different entity, such as the manufacturer of an imaging system on which the proposed method is to be applied. For example, the first and second machine-learning models may be pre-trained machine-learning models, which are adjusted in the field.
The proposed concept works by applying machine-learning, e.g., supervised learning, on an entire pipeline comprising the first machine-learning model, the second machine-learning model, and, if not already implemented by the first machine-learning model, the image analysis workflow. During training, the loss function (or reward function) is based on the difference between the output of the pipeline and the actual hypothesis. As there is no separate way to evaluate the output of the first machine-learning model, both machine-learning models may be trained together in an end-to-end manner. In other words, the first and second machine-learning model may be adjusted and/or trained together in an end-to-end manner.
As outlined above, there are two broad implementation categories for implementing the first machine-learning model. In a first implementation category, the first machine-learning model is used to parametrize a separate image analysis workflow, while, in the second implementation category, the first machine-learning model includes the image analysis workflow. In the first implementation category, the first machine-learning model may be trained to generate parameters for parametrizing the image analysis workflow. In this case, the method comprises processing the set of images using the image analysis workflow. Moreover, the image analysis workflow is parametrized based on an output of the first machine-learning model.
In the present context, parametrizing the image analysis workflow is not (necessarily) restricted to the generation of parameters for a fixed set of image processing/analysis steps. It may also mean selecting the image processing/analysis steps being used, in addition to parameters being used to parametrize the selected image processing/analysis steps. For example, the first machine-learning model may be trained to select at least one of a use of one or more image processing steps, one or more numerical parameters of one or more image processing steps, and one or more categorical parameters of one or more image processing steps for the image analysis workflow. In particular, the image analysis workflow may comprise at least one of one or more deterministic image processing steps and one or more machine-learning-based image processing steps. These imaging processing steps may be selected and/or parametrized by the first machine-learning model.
While, during adjustment/training of the machine-learning model, the output of the second machine-learning model is used to calculate the loss function, during application of the pipeline, the output of the image analysis workflow may be the desired output, to enable a manual or automated evaluation of the hypothesis. Accordingly, the method may comprise providing the output of the image analysis workflow.
In some examples, it may be beneficial for the performance of the second machine-learning model if the second machine-learning model does not only receive the output of the image analysis workflow (which might not contain the images), but also the images (or processed versions thereof). For example, the set of images or a processed version of the set of images may be used as further input to the second machine-learning model. This may speed up convergence of the machine-learning model(s) during training and/or adjustment of the machine-learning model(s).
As the name implies, a biological process involves some kind of transformation (i.e., development). If such a biological process is to be monitored using an imaging system, this transformation is also discernible from the images documenting the biological process. Accordingly, the set of images may comprise a sequence of images showing a development of the biological process over time.
Machine learning is a (mostly) automated process, which is based on the determination of a loss function (or reward function), which represents the quality of the transformations applied to the machine-learning model during adjustment/training. In the present case, to be able to calculate such a loss function, the respective hypotheses may be codified in a formal representation, such that the two hypotheses can be compared to calculate the loss function. For example, the second machine-learning model may be trained to output a formal representation of the prediction of the hypothesis. The loss function may be calculated based on a comparison between the formal representation of the prediction of the hypothesis and a formal representation of the actual hypothesis. The formal representation may enable or facilitate the automated calculation of the loss function.
During training, the hypotheses being used may be taken from the training corpus comprising the plurality of sets of images. However, to prepare such a corpus, or during application of the proposed concept on a new set of images and hypothesis, the hypothesis may be derived from a user input. For example, the method may comprise processing user input to generate the formal representation of the actual hypothesis. For example, this may be part of a wizard for configuring the image analysis workflow, which may start by taking the set of images using an imaging device (such as a microscope) and by inputting the hypothesis.
The specification of a formal representation is a highly complex task. However, the structured nature of the formal representation defines a template that can be filled efficiently by filling in the blanks of the template. This can be done by recognizing the content of the blanks in the user input. For example, the user input may comprise one of spoken text and unstructured written text. The method may comprise processing the user input using natural language processing. This may greatly facilitate inputting the hypothesis. Alternatively, the user input may comprise structured input.
In the following, details are given on an example implementation of such a formal representation of the hypothesis, starting with the templates of the hypothesis. For example, the respective formal representation may represent at least one of a relation between two entities, a relation between two entities being dependent on a condition, a cell fate being dependent on a condition, a cell type distribution being dependent on a condition, a two-dimensional or three-dimensional geometry being dependent on a condition, and an entity distribution of a non-numerable entity being dependent on a condition. For example, in case of the formal representation representing a relation between two entities, the “blanks” in the template may include the type of relation, a first entity, and a second entity. Similarly, in case of the formal representation representing a cell fate being dependent on a condition, the “blanks” in the template may include the cell fate and the condition, etc.
In the following, examples are given for “entity”, “relation” and “condition”. For example, an entity may be of type chemical, protein, nucleotide, carbohydrate, lipid, drug or disease. For example, a relation may comprise one of a first entity acting as activator for a second entity, a first entity acting as inhibitor for a second entity, a first entity acting as antagonist for a second entity, a first entity acting as upregulator for a second entity, a first entity acting as downregulator for a second entity, a first entity acting as substrate of a second entity, and a first entity being a product of a second entity. Additionally, or alternatively, a relation may be one of a chemical-protein relation, a drug-drug interaction, a gene-disease interaction, and a participant-intervention-comparator-outcome relation. For example, a condition may be one of a perturbant concentration condition, a culture condition, a co-culture condition, a cell composition condition and a proximity condition. By defining suitable lists of templates, entities, relations and conditions, the respective hypotheses may be easily constructed, e.g., using natural language processing, and in particular using named entity recognition.
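Purely by way of illustration, such a template-based formal representation may be sketched as follows in Python; the concrete class layout and the example values are assumptions made for this sketch:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Illustrative vocabulary following the relation list above.
class Relation(Enum):
    ACTIVATOR = "activator"
    INHIBITOR = "inhibitor"
    ANTAGONIST = "antagonist"
    UPREGULATOR = "upregulator"
    DOWNREGULATOR = "downregulator"
    SUBSTRATE_OF = "substrate of"
    PRODUCT_OF = "product of"

@dataclass
class Entity:
    name: str         # e.g., the name of a protein or chemical
    entity_type: str  # one of: chemical, protein, nucleotide, carbohydrate, lipid, drug, disease

@dataclass
class Hypothesis:
    """Template: a relation between two entities, optionally dependent on a condition."""
    first_entity: Entity
    relation: Relation
    second_entity: Entity
    condition: Optional[str] = None  # e.g., a perturbant concentration or culture condition

# Filling in the "blanks" of the template (values are fictitious):
h = Hypothesis(Entity("compound X", "chemical"), Relation.ACTIVATOR,
               Entity("protein Y", "protein"),
               condition="perturbant concentration of 10 uM")
```

In a complete implementation, the entity names recognized in the user input (e.g., via named entity recognition) would be matched against such vocabularies to fill the template.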
In some examples, the training and/or adjustment of the machine-learning model(s) may be done by one system, while the resulting image processing workflow is used on another system. Some aspects of the present disclosure relate to a method for processing a set of images representing a biological process. The method comprises inputting the set of images representing the biological process into a machine-learning model. The machine-learning model is trained, according to the above method, to perform an image analysis workflow or to generate parameters for parametrizing an image analysis workflow. The method comprises processing the set of images using the image analysis workflow. The method comprises providing an output of the image analysis workflow.
As outlined above, in some cases, the output of the image processing workflow is used (directly) to evaluate the hypothesis. In some examples, in addition to the output of the processing workflow, or instead of it, the output of the second machine-learning model (i.e., the predicted hypothesis) may be used as part of the evaluation of the image processing workflow. For example, the method may comprise inputting the output of the image analysis workflow into a second machine-learning model, the second machine-learning model being trained to output a prediction of a hypothesis being evaluated using the biological process. The method may comprise providing the prediction of the hypothesis being evaluated using the biological process.
Another aspect of the present disclosure relates to a system comprising one or more processors and one or more storage devices. The system is configured to perform at least one of the above methods. In general, such a system may be placed in different locations, i.e., training and application of the proposed concept may be performed in different locations. For example, the system may be one of a server, a cloud computing node, a workstation computer and an embedded device.
An aspect of the present disclosure relates to a computer program with a program code for performing one of the above methods when the computer program is run on a processor.
Another aspect of the present disclosure relates to an imaging system comprising the above system and a scientific imaging device, such as a microscope. For example, the scientific imaging device is configured to generate the set of images. For example, the system may be co-located with the scientific imaging device. In other words, the proposed concept may be applied locally, at the scientific imaging device.
Some examples of apparatuses and/or methods will be described in the following by way of example only, and with reference to the accompanying figures.
Various examples will now be described more fully with reference to the accompanying drawings in which some examples are illustrated. In the figures, the thicknesses of lines, layers and/or regions may be exaggerated for clarity.
Various examples of the present disclosure relate to a concept (e.g., a method) for selecting an appropriate image analysis workflow based on user input, e.g., based on a user input of a scientific hypothesis.
The present disclosure relates to biomedical imaging, and in particular to a system which automatically selects and configures an image analysis workflow which is improved or optimized to predict information supporting a user-defined hypothesis. In the following, it is discussed how a hypothesis can be formulated such that it can be answered by biomedical images which have been transformed and from which information has been extracted, and how a mapping from image to information and from information to hypothesis can be found by means of image analysis and machine learning. For example, the present concept may be used to select, configure, and modify an image analysis workflow in response to a user-defined hypothesis.
In other systems, selecting and configuring an image analysis workflow is typically done manually by the user. In such systems, the user needs a lot of technical expertise in image analysis as well as microscopy in order to pick or create a suitable image analysis workflow. The hypothesis has to be formulated precisely and then an experiment has to be designed and an image analysis workflow has to be created. The proposed concept assists with the latter task.
In the proposed concept, a backwards path is used from the goal of supporting or disproving a given scientific hypothesis. It is discussed how to build and train a pipeline for extracting relevant information from images which support this hypothesis. Selection and configuration of the image analysis workflow is based on learned parameters and can therefore be automated, given suitable biomedical images, a clearly formulated hypothesis and the possibility of finding a mapping between the two.
The proposed concept comprises two aspects: configuring and training an image analysis workflow, and inputting and formulating a hypothesis by the user (optional). For the former (configuring and training an image analysis workflow), the image analysis workflow may be configured and/or trained to improve or optimize the extraction of image information suitable for observation of the hypothesis in question, using images from user input, an imaging device or a data repository. For the latter (inputting and formulating a hypothesis by the user), at least one of a voice interface which translates utterances into unstructured text, a standard text input by means of which the user can input the hypothesis as unstructured text, or an expert interface where the user selects observables, entities and relations from a guided dialog may be used.
In the following, the two tasks are described separately: how the system selects a suitable workflow based on this hypothesis and microscope images, and how the user formulates a scientific hypothesis.
First, the configuration of a suitable image analysis workflow is discussed. This is done with the help of a pair of machine-learning models.
The method of
Machine learning generally refers to algorithms and statistical models that computer systems may use to perform a specific task without using explicit instructions, instead relying on models and inference. For example, in machine-learning, instead of a rule-based transformation of data, a transformation of data may be used, that is inferred from an analysis of historical and/or training data. For example, in a popular example of machine-learning, the content of images may be analyzed using a machine-learning model or using a machine-learning algorithm. In order for the machine-learning model to analyze the content of an image, the machine-learning model may be trained using training images as input and training content information as output. By training the machine-learning model with a large number of training images and/or training sequences (e.g. words or sentences) and associated training content information (e.g. labels or annotations), the machine-learning model “learns” to recognize the content of the images, so the content of images that are not included in the training data can be recognized using the machine-learning model. The same principle may be used for other kinds of sensor data, or more generally, data, as well: By training a machine-learning model using training sensor data and a desired output, the machine-learning model “learns” a transformation between the sensor data and the output, which can be used to provide an output based on non-training sensor data provided to the machine-learning model. The provided data (e.g. sensor data, meta data and/or image data) may be preprocessed to obtain a feature vector, which is used as input to the machine-learning model.
In many cases, machine-learning models may be trained using training input data. The examples specified above use a training method called “supervised learning”. In supervised learning, the machine-learning model is trained using a plurality of training samples, wherein each sample may comprise a plurality of input data values, and a plurality of desired output values, i.e. each training sample is associated with a desired output value. By specifying both training samples and desired output values, the machine-learning model “learns” which output value to provide based on an input sample that is similar to the samples provided during the training. Apart from supervised learning, semi-supervised learning may be used. In semi-supervised learning, some of the training samples lack a corresponding desired output value. Supervised learning may be based on a supervised learning algorithm (e.g. a classification algorithm, a regression algorithm or a similarity learning algorithm). Classification algorithms may be used when the outputs are restricted to a limited set of values (categorical variables), i.e. the input is classified to one of the limited set of values. Regression algorithms may be used when the outputs may have any numerical value (within a range). Similarity learning algorithms may be similar to both classification and regression algorithms but are based on learning from examples using a similarity function that measures how similar or related two objects are. Apart from supervised or semi-supervised learning, unsupervised learning may be used to train the machine-learning model. In unsupervised learning, (only) input data might be supplied and an unsupervised learning algorithm may be used to find structure in the input data (e.g. by grouping or clustering the input data, finding commonalities in the data). Clustering is the assignment of input data comprising a plurality of input values into subsets (clusters) so that input values within the same cluster are similar according to one or more (pre-defined) similarity criteria, while being dissimilar to input values that are included in other clusters.
Reinforcement learning is a third group of machine-learning algorithms. In other words, reinforcement learning may be used to train the machine-learning model. In reinforcement learning, one or more software actors (called “software agents”) are trained to take actions in an environment. Based on the taken actions, a reward is calculated. Reinforcement learning is based on training the one or more software agents to choose the actions such, that the cumulative reward is increased, leading to software agents that become better at the task they are given (as evidenced by increasing rewards).
In the present context, a more complex setting is used, in which two machine-learning models are used together as part of a pipeline, with an image analysis workflow inserted between the two machine-learning models (if the image analysis workflow is not already contained in the first machine-learning model). The set of images 230 representing a biological process is input into the pipeline by inputting the set of images into the first machine-learning model. On the other end of the pipeline, the predicted hypothesis is output by the second machine-learning model. If the two machine-learning models were trained individually, a suitable loss function would have to be defined for evaluating the quality of the resulting image analysis workflow. However, without knowledge about the purpose of the image analysis workflow, the quality of the image analysis workflow might not be easily evaluated automatically. As a result, an expert might have to judge the quality of the image analysis workflow manually, or at least specify a desired image analysis workflow manually, resulting in a highly labor-intensive training of the first machine-learning model. In the proposed concept, the second machine-learning model is used, in a way, to evaluate the quality of the image analysis workflow: if the second machine-learning model can predict, from the output of the image analysis workflow, which hypothesis is being tested, the image analysis workflow has yielded suitable information for confirming or disproving the hypothesis. Thus, if the second machine-learning model predicts the same (or a highly similar) hypothesis as the actual hypothesis being evaluated using the biological process, then the output of the image analysis workflow is, in all likelihood, also suitable for confirming or disproving the hypothesis. In effect, the entire pipeline is trained in an end-to-end manner to reduce the difference between the predicted hypothesis and the actual hypothesis, with the result that the image analysis workflow, which is part of the pipeline, is configured or implemented in a way that is suitable for the hypothesis at hand. In other words, the first and second machine-learning model are adjusted and/or trained together in an end-to-end manner. For this purpose, supervised learning may be used (with the sets of images (and optionally the actual hypotheses) as training inputs, the actual hypotheses as desired outputs, and the loss function). Alternatively, reinforcement learning may be used, with the loss function being used to calculate the reward in this case.
The task for the training is to configure an image analysis workflow such that it extracts information which can help to observe a particular hypothesis (see also
Input images 230 are loaded by the user 200, recorded by an imaging device 210 or loaded/streamed from a data repository 220. As shown in
The first image analysis model 240 (i.e., the first machine-learning model) is trained to process the images. In some examples, which will be elaborated in the following, the first image analysis model is trained to predict an image analysis workflow and initial parameters 250 for it (i.e., to generate parameters 250 for parametrizing an image analysis workflow 260). Alternatively, the image analysis workflow can be fully replaced by a deep learning model, in which case the model 240 does not need to predict any workflow parameters but predicts the desired information (i.e., the output of the image processing workflow) 270 directly, i.e., implements the image analysis workflow. Accordingly, the first image analysis model 240 may be trained to perform the image analysis workflow.
In the former case, the first image analysis model 240 (i.e., the first machine-learning model) is trained to generate parameters 250 for parametrizing the image analysis workflow. In this case, the image analysis workflow comprises a plurality of image processing and analysis steps. For example, within the image analysis workflow, an arbitrary sequence of image processing and/or analysis steps may be used. In particular, the image processing workflow comprises a plurality of image processing or analysis steps that are executed sequentially, i.e., at least two image processing/analysis steps. To give an example: the image analysis workflow may comprise one or several general-purpose image processing steps, such as debayering, denoising, sharpening, contrast adjustment, bandpass filtering etc. In addition, one or more image analysis steps may be applied on the processed image data. For example, in one branch of the image analysis workflow, an image processing step may be applied to isolate a wavelength range (e.g., range of colors) that is indicative of a specific chemical, protein or disease. In a subsequent image analysis step, image segmentation may be performed to delimit the area in which the specific chemical, protein or disease occurs. Finally, a further image analysis step may be applied to estimate the proportion of the overall area in which the specific chemical, protein or disease occurs, or fitting may be used to output information on a 2D or 3D geometry of the occurrence of the chemical, protein or disease. In another branch, an image processing step may be applied to isolate a wavelength range (e.g., range of colors) that is indicative of a cell fate. In a subsequent image analysis step, a trained classifier may be used to output the cell fate based on the isolated wavelength range. In another branch, image segmentation or image classification may be used to determine presence and/or location of an entity.
Such image processing steps may be implemented using different techniques: for example, denoising may be performed using a (one-pass) deterministic filter (e.g., removing outliers based on the content of adjacent pixels of the same color/channel), using an iterative (i.e., multi-pass) deterministic filter (e.g., gradually reducing differences based on the content of adjacent pixels), using a (one-pass) machine-learning-based filter (e.g., passing the image once through a machine-learning model being trained to reduce noise) or using an iterative machine-learning-based filter (e.g., based on a generative adversarial network or based on reinforcement learning). In other words, the image processing workflow may comprise at least one of one or more deterministic image processing steps, and one or more machine-learning-based image processing steps.
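Purely as a non-limiting illustration, one of the branches outlined above (denoising, threshold-based segmentation, and estimation of the occupied area proportion) may be sketched as follows, e.g., using scikit-image; the concrete steps and parameters are examples only and would, in the proposed concept, be selected and parametrized by the first machine-learning model:

```python
import numpy as np
from skimage import filters, measure

def example_branch(channel: np.ndarray) -> float:
    """Estimate the proportion of the image area occupied by an entity."""
    denoised = filters.median(channel)             # deterministic, one-pass denoising
    threshold = filters.threshold_otsu(denoised)   # global threshold for segmentation
    mask = denoised > threshold                    # delimit the area of occurrence
    labels = measure.label(mask)                   # connected-component labeling
    occupied = sum(region.area for region in measure.regionprops(labels))
    return occupied / channel.size                 # proportion of the overall area
```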
In the case of the first machine-learning model generating the parameters for configuring the image analysis workflow, the generation of the parameters might not only include the generation of parameters for a fixed set of image processing/analysis steps, but rather also the selection of the image processing/analysis steps. For example, the first machine-learning model may be trained to select at least one of a use of one or more image processing steps, one or more numerical parameters of one or more image processing steps, and one or more categorical parameters of one or more image processing steps for the image analysis workflow. In this case, the output of the first machine-learning model 240 may be encoded as follows: Every image analysis workflow comprising or consisting of one or more image processing/analysis steps (each representing an image transformation) can be stored with a unique identifier (UID). This UID can be suitably encoded to become part of a vector, for example using a sparse encoding or a one-hot encoding, that is denoted ωi in
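A minimal sketch of this encoding is given below; the UID registry and the parameter layout are assumptions made for illustration:

```python
import numpy as np

# Assumed registry of image analysis workflows, each with a unique identifier (UID).
WORKFLOW_UIDS = ["denoise+threshold+area", "bandpass+segmentation", "classifier-branch"]

def encode_workflow(uid: str, parameters: list[float]) -> np.ndarray:
    """One-hot encode the workflow UID and concatenate its numerical parameters."""
    one_hot = np.zeros(len(WORKFLOW_UIDS))
    one_hot[WORKFLOW_UIDS.index(uid)] = 1.0
    return np.concatenate([one_hot, np.asarray(parameters, dtype=float)])

omega = encode_workflow("denoise+threshold+area", [0.5, 3.0])  # the vector denoted ω_i
```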
The image analysis workflow 260 comprises or consists of one or more deterministic image analysis steps or one or more machine learning models being used to predict the desired information 270 (such as a classification, segmentation etc.) from the images 230. When such a separate image analysis workflow is used, the method may comprise processing 120 the set of images using the image analysis workflow, with the image analysis workflow being parametrized based on an output of the first machine-learning model.
The output 270 of the image analysis workflow, e.g., along with the images 230, is the input to the second machine learning model 280 which outputs a hypothesis. Accordingly, the method of
The output of the image analysis workflow, regardless of whether a separate image analysis workflow is used or whether the image analysis workflow is part of the first machine-learning model, may contain different types of information. In general, the output of the image analysis workflow may comprise a set of information characterizing the set of images. For example, the output of the image analysis may comprise one or more of numerical data characterizing the set of images (e.g., a respective concentration of one or more entities, or a proportion of an overall area taken by an entity etc.), binary data characterizing the set of images (e.g., presence or absence of an entity, a condition being true or false etc.), categorical data characterizing the set of images (e.g., a cell fate being one of dead, alive, migratory, static etc.), or spatial data (e.g., coordinates of bounding boxes, or a segmentation map). The output of the image analysis workflow may be provided as an embedding (e.g., as a vector) characterizing the set of images. In some examples, the output of the image analysis workflow may comprise image data, e.g., cropped and/or processed portions of the set of images.
In addition to the output by the image analysis workflow, the set of images or a processed version of the set of images may be used as further input to the second machine-learning model. In other words, the second machine-learning model 280 may accept both the information predicted by the workflow 270 and the set of images 230 as input. The information output by the image analysis workflow and the images can also be preprocessed together. For example, if the information output by the image analysis workflow comprises bounding boxes, these can be used to crop objects from the images 230 and use only those as input to 280. Similarly, if the information 270 comprises binary semantic or instance segmentation maps or probability maps from a segmentation workflow, these can be multiplied with the original inputs 230 to crop or modulate them, and the modulated signal can be used as input to 280.
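For illustration, the two joint preprocessing variants mentioned above may be sketched as follows (coordinate conventions and names are assumptions of this sketch):

```python
import numpy as np

def crop_by_bounding_box(image: np.ndarray, box: tuple) -> np.ndarray:
    """Crop an object predicted by the workflow; only the crop is fed to model 280."""
    y0, x0, y1, x1 = box
    return image[y0:y1, x0:x1]

def modulate_by_probability_map(image: np.ndarray, prob_map: np.ndarray) -> np.ndarray:
    """Multiply a segmentation/probability map with the input to gate the signal."""
    return image * prob_map
```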
The second machine-learning model is then trained to output a prediction 290 of a hypothesis being evaluated using the biological process. As the predicted hypothesis is compared with the actual hypothesis when determining the loss function, the hypotheses may be provided in a format that a) can be efficiently output by the second machine-learning model, and that b) can be objectively compared with the actual hypothesis. In particular, the second machine-learning model may be trained to output a formal representation of the prediction of the hypothesis, with the loss function being calculated based on a comparison between the formal representation of the prediction of the hypothesis and a formal representation of the actual hypothesis. For example, as will be described in the following, the hypothesis may be output using the same encoding (i.e., formal representation) as a natural language processing model 330 (shown in
(At least) during application of the proposed concept, the method may comprise processing 150 user input 300, 350, 360 (as shown in
The formal representation discussed above is based on a template scheme, where the overall structure of the template comprises “blanks” where the respective information is to be inserted. In the following, first the different items that fill in the blanks are introduced: entities, relations, conditions, cell fates and distributions.
In the following, some definitions are given for formulating a formal definition of a hypothesis. In a hypothesis, an entity E can be any kind of biological or medical entity, informally any proper noun which is a technical term in biology or medicine. In particular, these can be of type protein, chemical, drug, disease. In summary, an entity may be of type chemical, protein, nucleotide, carbohydrate, lipid, drug and disease.
A relation R can exist between two entities, optionally given a condition. Common relations include chemical-protein relations. They can be one of activator, inhibitor, agonist, antagonist, upregulator, downregulator, substrate of, product of. Other relations include drug-drug interactions and gene-disease associations. In evidence-based medicine, e.g., in a corpus like EBM PICO, word-level annotations are provided for Participant, Intervention, Comparator and Outcome. Thus, a relation that includes Participant, Intervention, Comparator and Outcome enables investigating the empirical results of evidence-based medicine at the microscopic level. In summary, a relation may comprise one of a first entity acting as activator for a second entity, a first entity acting as inhibitor for a second entity, a first entity acting as antagonist for a second entity, a first entity acting as upregulator for a second entity, a first entity acting as downregulator for a second entity, a first entity acting as substrate of a second entity, and a first entity being a product of a second entity, and/or a relation may be one of a chemical-protein relation, a drug-drug interaction, a gene-disease interaction, and a participant-intervention-comparator-outcome relation.
A condition K is an optional part of a hypothesis. Some entities or relations are only observable under a given condition with a certain probability. The condition may also be NULL (i.e., non-existent) in which case the probability to observe an entity or relation is 1.0. Conditions can be a function of time. Conditions can, for example, be of type NULL. Conditions can, for example, be of type perturbant concentration (chemical, drug, low molecular weight compound). This condition is time variant. Conditions can, for example, be of type culture conditions (pH (a value representing the acidity or alkalinity of a solution), CO2 (carbon dioxide) partial pressure, H2O (water) saturation of atmosphere, temperature, nutrients). This condition is time variant. Conditions can, for example, be of type co-culture: Simultaneous culturing of >1 cell type in a culture vessel, dish or vial. The relative abundance of cell types in 2D or 3D culture, organoids, spheroids can be expressed as a vector and is the co-culture condition. This condition is possibly time variant. Conditions can, for example, be of type cell composition: A tissue may contain several cell types defined by genetic traits or morphological, proteomic or metabolomic phenotypes. The relative abundance of a particular cell type in tissue or organs is the cell composition. This condition is unlikely to be time variant. Conditions can, for example, be of type proximity: Interactions (relations) can occur as a function of distance between two entities. Proximity can be expressed as a class (colocalizes/does not colocalize to a microscopically resolvable structure, such as a cell organelle, histological marker, tissue), or as a real-valued number or vector according to some distance metric (e.g., a p-norm such as the L1 norm, L2 norm, Lp norm, or L∞ norm). Proximity may be time variant. In summary, a condition may be one of a perturbant concentration condition, a culture condition, a co-culture condition, a cell composition condition, and a proximity condition. Multiple conditions may co-occur or be met simultaneously to make a particular observation.
A cell fate F indicates whether cells proliferate (i.e. undergo mitosis) or die (undergo necrosis in a tissue or apoptosis (controlled cell death) or unspecified cell death on the cell level). Other cell fates include migratory vs. static.
A distribution D(x) is a spatial map of cell types, i.e. the entirety of all locations of a cell type (or other enumerable entities which have object instances) in a culture, tissue or organ. It can also represent a population of non-resolvable entities (such as chemicals, dyes, proteins, nucleotides, carbohydrates, lipids). In the latter case D(x) represents a probability map to encounter a particular entity E in location x, which is proportional to the concentration of E at x. In all cases above x is a coordinate in R^n where n can comprise 2 to 3 spatial coordinates, a channel (emission wavelength or window) coordinate, an excitation wavelength, fluorescence lifetime and derivative values, a vibrational spectral property (e.g. wavenumber) as well as other properties which can be spatially resolved, such as multi-photon excited fluorescence, second or third harmonics, polarization or other physical properties of electromagnetic radiation which are recorded by an imaging device. Optionally, distributions can be a function of space and time, i.e., D(x,t).
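As a minimal sketch (assuming the recorded intensity of a channel is proportional to the concentration of the entity), such a probability map may be derived as follows:

```python
import numpy as np

def to_probability_map(channel: np.ndarray) -> np.ndarray:
    """Normalize a non-negative intensity channel into a probability map D(x)."""
    intensity = np.clip(channel.astype(float), 0.0, None)
    total = intensity.sum()
    return intensity / total if total > 0 else intensity  # sums to 1 if non-empty
```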
Given the definitions above, the hypothesis can be formulated as a probability to observe a particular event, e.g., as one of P(E1; R; E2), P((E1; R; E2)|K), P(F|K), P(DT|K), or P(Dp|K), as summarized further below.
Any observable, such as a distribution D, can depend on joint conditions of one or more conditions, one or more relations, and the presence of one or more entities. So, there can be cases such as P(Dp|(E1; R; E2), K1, K2, . . . , Kn), where one observes a distribution of, e.g., a protein E2 given that a chemical E1 is in a relation with E2 (such as being an activator thereof) and particular culture conditions (such as temperature, or presence of E1 in the culture vessel at a particular concentration).
In summary, the respective formal representation may represent at least one of a relation between two entities (E1; R; E2), a relation between two entities being dependent on a condition ((E1; R; E2)|K), a cell fate being dependent on a condition (F|K), a cell type distribution being dependent on a condition (DT|K), a two-dimensional or three-dimensional geometry being dependent on a condition, and an entity distribution of a non-numerable entity being dependent on a condition (Dp|K). The hypothesis may relate to a probability of the above-referenced options, e.g., P(E1; R; E2) etc.
During application of the proposed concept, and, in some cases, during generation of the training corpus, the user can input a hypothesis in one of three ways: a voice interface, unstructured text, or structured input.
In some examples of the proposed disclosure, for the NLP model 330 in
In general, the specification of the hypothesis may be done on the same system that is also being used to apply the proposed concept. However, as a further input modality, a mobile device may be used, such as a smartphone. For example, the specification of the hypothesis by the user may be performed on a mobile device. The hypothesis 340 output by that pipeline may then be sent to the system which runs the information extraction pipeline 240-280.
During training and application, the above-mentioned hypotheses are used to determine how to adjust the first and second machine-learning model. As shown in
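Purely as an illustration of such a loss, assuming the formal representation is encoded as one class index per template “blank” (template type, first entity, relation, second entity, condition), a per-blank cross-entropy may be summed as follows; the encoding itself is an assumption of this sketch:

```python
import torch
import torch.nn.functional as F

def hypothesis_loss(predicted_logits: list, actual_indices: list) -> torch.Tensor:
    """Sum one cross-entropy term per 'blank' of the hypothesis template."""
    return sum(F.cross_entropy(p, a) for p, a in zip(predicted_logits, actual_indices))
```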
As outlined above, the adjustment of the first and second machine-learning model may occur in two stages/phases: a training stage/phase and an application stage/phase. In the training stage/phase, the method may be performed over a plurality of iterations using a plurality of sets of images as training input images and a plurality of corresponding actual hypotheses for comparison with the hypotheses predicted by the second machine-learning model to train the first and/or second machine-learning model. For example, a corpus of training data (comprising the plurality of sets of images and the plurality of corresponding actual hypotheses) may be used for training.
In the following, the application of the trained system for selecting the workflow is discussed. Once the pipeline has been trained/adjusted, applying it means a forward pass through it comprising stages 240-270. For example, the output may be the desired information which the pipeline was trained to match to a hypothesis. In other words, the method may comprise providing 130 the output of the image analysis workflow. In addition, the method may comprise providing the output of the second machine-learning model (for comparison), or the output of the first machine-learning model (e.g., to store the configuration of the image analysis workflow for later use), or the first machine-learning model (to apply the first machine-learning model to further sets of images). In the application phase, the first and/or second machine-learning model may be fine-tuned to the set of images at hand. In this case, the first and second machine-learning models are pre-trained machine-learning models, which are adjusted in the field. In some examples, both training and application may be performed on the same system. In some other examples, however, different systems may be used for training and application.
In an alternative implementation, the whole pipeline 240-280 may be replaced by one deep neural network which accepts images as input and predicts the encoded hypothesis 290 directly. It can be trained end-to-end using backpropagation, e.g., as shown for the combination of models 240 and 280. In this case, the first and second machine-learning model may be part of a single machine-learning model, with the output of the image analysis workflow being a) inherently contained within the single machine-learning model, and b) provided at an output of the single machine-learning model.
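As a non-limiting sketch of this alternative, a single network mapping images directly to the encoded hypothesis could look as follows; the architecture and the encoding size (here 10) are assumptions of this sketch:

```python
import torch.nn as nn

single_model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # collapse spatial dimensions
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 10),                       # assumed size of the hypothesis encoding
)
```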
Machine-learning algorithms are usually based on a machine-learning model. In other words, the term “machine-learning algorithm” may denote a set of instructions that may be used to create, train or use a machine-learning model. The term “machine-learning model” may denote a data structure and/or set of rules that represents the learned knowledge (e.g. based on the training performed by the machine-learning algorithm). In embodiments, the usage of a machine-learning algorithm may imply the usage of an underlying machine-learning model (or of a plurality of underlying machine-learning models). The usage of a machine-learning model may imply that the machine-learning model and/or the data structure/set of rules that is the machine-learning model is trained by a machine-learning algorithm.
For example, the above-mentioned machine-learning models may be artificial neural networks (ANNs). ANNs are systems that are inspired by biological neural networks, such as can be found in a retina or a brain. ANNs comprise a plurality of interconnected nodes and a plurality of connections, so-called edges, between the nodes. There are usually three types of nodes, input nodes that receive input values, hidden nodes that are (only) connected to other nodes, and output nodes that provide output values. Each node may represent an artificial neuron. Each edge may transmit information from one node to another. The output of a node may be defined as a (non-linear) function of its inputs (e.g. of the sum of its inputs). The inputs of a node may be used in the function based on a “weight” of the edge or of the node that provides the input. The weight of nodes and/or of edges may be adjusted in the learning process. In other words, the training of an artificial neural network may comprise adjusting the weights of the nodes and/or edges of the artificial neural network, i.e. to achieve a desired output for a given input. For example, the machine-learning models may be deep neural networks, i.e., artificial neural networks comprising at least one hidden layer.
Alternatively, the respective machine-learning models may be support vector machines, random forest models or gradient boosting models. Support vector machines (i.e. support vector networks) are supervised learning models with associated learning algorithms that may be used to analyze data (e.g. in classification or regression analysis). Support vector machines may be trained by providing an input with a plurality of training input values that belong to one of two categories. The support vector machine may be trained to assign a new input value to one of the two categories. Alternatively, the respective machine-learning models may be Bayesian networks, which are probabilistic directed acyclic graphical models. A Bayesian network may represent a set of random variables and their conditional dependencies using a directed acyclic graph. Alternatively, the machine-learning models may be based on a genetic algorithm, which is a search algorithm and heuristic technique that mimics the process of natural selection.
More details and aspects of the method for adjusting a first and/or second machine-learning model are mentioned in connection with the proposed concept, or one or more examples described above or below (e.g.,
In connection with
In some examples, it may be useful not only to obtain the output of the image analysis workflow, but also a predicted hypothesis being provided by a second machine-learning model. For example, the method may comprise inputting 440 the output of the image analysis workflow into a second machine-learning model, the second machine-learning model being trained to output a prediction of a hypothesis being evaluated using the biological process. The method may comprise providing 450 the prediction of the hypothesis being evaluated using the biological process.
More details and aspects of the method for processing a set of images representing a biological process are mentioned in connection with the proposed concept, or one or more examples described above or below (e.g.,
Such a system may be used to perform various tasks. For example, the system may be configured to perform the method shown in connection with
In various examples, the system 510 is used together with the scientific imaging device 520 of the imaging system. In particular, the system 510 may be co-located with the scientific imaging device 520. Alternatively, the system 510 may be part of a server (e.g., cloud node), and be coupled to the scientific imaging device 520 via a computer network (e.g., via the internet). In general, the scientific imaging device may be configured to generate the set of images being processed. As is evident, the system may be implemented differently, depending on which aspects of the above methods are being performed by the system. For example, the system may be one of a server, a cloud computing node, a workstation computer and an embedded device. For example, a server, cloud computing node or workstation computer may be used primarily for the training of the machine-learning models, and for a subsequent adjustment of the machine-learning models to the set of images at hand. An embedded device, i.e., a system that is co-located with the scientific imaging device, may perform the processing of the set of images using the image analysis workflow, and, if powerful enough, the adjustment of the machine-learning models to the set of images at hand.
The one or more interfaces 512 of the system 510 may correspond to one or more inputs and/or outputs for receiving and/or transmitting information, which may be in digital (bit) values according to a specified code, within a module, between modules or between modules of different entities. For example, the one or more interfaces 512 may comprise interface circuitry configured to receive and/or transmit information. The one or more processors 514 of the system 510 may be implemented using one or more processing units, one or more processing devices, any means for processing, such as a processor, a computer or a programmable hardware component being operable with accordingly adapted software. In other words, the described function of the one or more processors 514 may also be implemented in software, which is then executed on one or more programmable hardware components. Such hardware components may comprise a general-purpose processor, a Digital Signal Processor (DSP), a micro-controller, etc. The one or more storage devices 516 of the system 510 may comprise at least one element of the group of a computer readable storage medium, such as a magnetic or optical storage medium, e.g., a hard disk drive, a flash memory, a floppy disk, Random Access Memory (RAM), Programmable Read Only Memory (PROM), Erasable Programmable Read Only Memory (EPROM), an Electronically Erasable Programmable Read Only Memory (EEPROM), or a network storage.
More details and aspects of the system and imaging system are mentioned in connection with the proposed concept, or one or more examples described above or below (e.g.,
As used herein the term “and/or” includes any and all combinations of one or more of the associated listed items and may be abbreviated as “/”.
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
Some embodiments relate to an imaging device (e.g., microscope) or imaging system comprising a system as described in connection with one or more of the
The computer system 620 may be a local computer device (e.g. personal computer, laptop, tablet computer or mobile phone) with one or more processors and one or more storage devices or may be a distributed computer system (e.g. a cloud computing system with one or more processors and one or more storage devices distributed at various locations, for example, at a local client and/or one or more remote server farms and/or data centers). The computer system 620 may comprise any circuit or combination of circuits. In one embodiment, the computer system 620 may include one or more processors which can be of any type. As used herein, processor may mean any type of computational circuit, such as but not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a graphics processor, a digital signal processor (DSP), a multi-core processor, a field programmable gate array (FPGA), for example, of a microscope or a microscope component (e.g. camera) or any other type of processor or processing circuit. Other types of circuits that may be included in the computer system 620 may be a custom circuit, an application-specific integrated circuit (ASIC), or the like, such as, for example, one or more circuits (such as a communication circuit) for use in wireless devices like mobile telephones, tablet computers, laptop computers, two-way radios, and similar electronic systems. The computer system 620 may include one or more storage devices, which may include one or more memory elements suitable to the particular application, such as a main memory in the form of random access memory (RAM), one or more hard drives, and/or one or more drives that handle removable media such as compact disks (CD), flash memory cards, digital video disk (DVD), and the like. The computer system 620 may also include a display device, one or more speakers, and a keyboard and/or controller, which can include a mouse, trackball, touch screen, voice-recognition device, or any other device that permits a system user to input information into and receive information from the computer system 620.
Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a processor, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a non-transitory storage medium such as a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the present invention is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the present invention is, therefore, a storage medium (or a data carrier, or a computer-readable medium) comprising, stored thereon, the computer program for performing one of the methods described herein when it is performed by a processor. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory. A further embodiment of the present invention is an apparatus as described herein comprising a processor and the storage medium.
A further embodiment of the invention is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
A further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or adapted to, perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
Embodiments may be based on using a machine-learning model or machine-learning algorithm.
Furthermore, some techniques may be applied to some of the machine-learning algorithms. For example, feature learning may be used. In other words, the machine-learning model may at least partially be trained using feature learning, and/or the machine-learning algorithm may comprise a feature learning component. Feature learning algorithms, which may be called representation learning algorithms, may preserve the information in their input but also transform it in a way that makes it useful, often as a pre-processing step before performing classification or predictions. Feature learning may be based on principal components analysis or cluster analysis, for example.
In some examples, anomaly detection (i.e. outlier detection) may be used, which is aimed at providing an identification of input values that raise suspicions by differing significantly from the majority of input or training data. In other words, the machine-learning model may at least partially be trained using anomaly detection, and/or the machine-learning algorithm may comprise an anomaly detection component.
In some examples, the machine-learning algorithm may use a decision tree as a predictive model. In other words, the machine-learning model may be based on a decision tree. In a decision tree, observations about an item (e.g. a set of input values) may be represented by the branches of the decision tree, and an output value corresponding to the item may be represented by the leaves of the decision tree. Decision trees may support both discrete values and continuous values as output values. If discrete values are used, the decision tree may be denoted a classification tree; if continuous values are used, the decision tree may be denoted a regression tree.
Association rules are a further technique that may be used in machine-learning algorithms. In other words, the machine-learning model may be based on one or more association rules. Association rules are created by identifying relationships between variables in large amounts of data. The machine-learning algorithm may identify and/or utilize one or more relational rules that represent the knowledge that is derived from the data. The rules may e.g. be used to store, manipulate or apply the knowledge.
In the present concept, various acronyms and terms are used, which are shortly summarized in the following. A DNN is a deep neural network, which can involve any algorithm, such as MLP (Multi-Layer Perceptron), CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), or Transformer (a neural network mainly based on the attention mechanism). A target neural network is a deep neural network being trained as described in the present concept. An image is a digital image, for example with dimensions XY (i.e., two lateral dimensions X and Y), XYZ (i.e., a depth dimension Z in addition to the two lateral dimensions X and Y), XY+T (XY+Time), XYZ+C (XYZ+Channel), XYZ+T (XYZ+Time), XYZCT (XYZ+Channel+Time), or XYZCT+other modalities. In other words, a 2D or nD digital image (tensor) with n ∈ N. An (image processing) workflow refers to the sequential execution of multiple image processing or image analysis steps, where the output of the i-th step is passed to the input of the (i+1)-th step.
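This definition of a workflow may be sketched as follows (a minimal illustration):

```python
def run_workflow(steps, image):
    """Sequentially apply steps; the output of step i is the input of step i+1."""
    for step in steps:
        image = step(image)
    return image
```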