The present disclosure generally relates to molecule state and parameter estimation, and in particular, to a system and associated method for single molecule Förster Resonance Energy Transfer (smFRET) analysis from single photon arrivals leveraging Bayesian non-parametrics.
smFRET is a widely used technique for studying kinetics of molecular complexes. However, smFRET data analysis methods currently require specifying a priori the dimensionality of the underlying physical model (the exact number of kinetic parameters). Such approaches are inherently limiting given the typically unknown number of physical configurations a molecular complex may assume.
It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.
The present patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Corresponding reference characters indicate corresponding elements among the view of the drawings. The headings used in the figures do not limit the scope of the claims.
A method outlined herein for smFRET analysis leveraging Bayesian Nonparametrics includes: accessing observation data indicative of interactions between a donor molecule and an acceptor molecule of a macromolecular complex; iteratively sampling a set of probability values associated with values of a set of model parameters from a measurement model; and jointly inferring, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data. The set of model parameters can include: a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes; and a set of observed system states of the macromolecular complex.
The observation data can include time-series detector data indicative of photons having a wavelength within a first wavelength range captured at a first channel of a detector device and photons having a wavelength within a second wavelength range captured at a second channel of the detector device over time.
The method can further include: applying a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite.
For continuous illumination (the macromolecular complex having been subjected to a continuous illumination scheme), the method can further include: iteratively sampling the set of transition rates using a Metropolis-Hastings sampling scheme; and determining the set of observed system states by sampling, for each load of a plurality of loads representing a hypothetical system state, a load value from a Bernoulli formulation of a conditional posterior probability distribution of the measurement model that jointly incorporates the set of transition rates and the plurality of loads.
For pulsed illumination (where the observation data further including photon arrival times corresponding with a plurality of illumination pulses applied to the macromolecular complex, where transitions between system states occur at a beginning of each illumination pulse), the method can further include: initializing a chain of transition probability values for a set of transition probabilities associated with a set of hypothetical system states from a Dirichlet process prior; sampling the transition probability values of the set of transition probabilities by directly sampling from a posterior probability distribution using a Gibbs sampling scheme; and selecting one or more hypothetical system states for inclusion in the set of observed system states based on the transition probability values.
The method can further include sampling a set of photophysical transition rates of the set of model parameters using a Metropolis-Hastings sampling scheme; and sampling a set of system state trajectories using a forward backward sampling procedure.
For sampling transition rates using Metropolis-Hastings, the method can include: initializing a chain of samples for each transition rate by drawing random values from a corresponding prior distribution for the transition rates; drawing, for an iteration of a plurality of iterations, a new value for the set of transition rates from a proposal distribution; determining, for the iteration, an acceptance probability associated with the new value; and accepting or rejecting the new value for inclusion in the set of transition rates based on the acceptance probability.
In a further aspect, a system outlined herein for smFRET analysis leveraging Bayesian Nonparametrics under continuous illumination includes: a processor in communication with a memory, the memory including instructions executable by the processor to: access observation data indicative of interactions between a donor molecule and an acceptor molecule of a macromolecular complex, the macromolecular complex having been subjected to a continuous illumination scheme; and iteratively sample a set of probability values associated with values of a set of model parameters from a measurement model. The set of model parameters can include a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes. Further, the set of model parameters can include a load value for each load of a plurality of loads, the load value being sampled from a Bernoulli formulation of a conditional posterior probability distribution of the measurement model that jointly incorporates the set of transition rates and the plurality of loads, each load of the plurality of loads representing a hypothetical system state.
The memory can further include instructions executable by the processor to: determine a set of observed system states of the macromolecular complex based on load values for each load of the plurality of loads; and jointly infer, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data.
The observation data can include time-series detector data indicative of photons having a wavelength within a first wavelength range captured at a first channel of a detector device and photons having a wavelength within a second wavelength range captured at a second channel of the detector device over time.
The memory can further include instructions executable by the processor to: apply a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite. This can include iteratively sampling the set of transition rates using a Metropolis-Hastings sampling scheme.
Load values being a first value indicate that a corresponding hypothetical system state is active and should be included in the set of observed system states. Load values being a second value indicate that a corresponding hypothetical system state is inactive and should be excluded from the set of observed system states.
In a further aspect, a system outlined herein for smFRET analysis leveraging Bayesian Nonparametrics under pulsed illumination includes: a processor in communication with a memory, the memory including instructions executable by the processor to: access observation data indicative of interactions between a donor molecule and an acceptor molecule of a macromolecular complex; and iteratively sample a set of probability values associated with values of a set of model parameters from a measurement model. The set of model parameters can include: a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes; and a set of transition probability values for a set of transition probabilities associated with a set of hypothetical system states, sampled from a posterior probability distribution using a Gibbs sampling scheme.
The memory can further include instructions executable by the processor to: select one or more hypothetical system states for inclusion in a set of observed system states of the macromolecular complex based on the set of transition probability values; and jointly infer, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data.
The observation data can include time-series detector data indicative of photons having a wavelength within a first wavelength range captured at a first channel of a detector device and photons having a wavelength within a second wavelength range captured at a second channel of the detector device over time. In particular, the observation data can further include photon arrival times corresponding with a plurality of illumination pulses applied to the macromolecular complex, where transitions between system states occur at a beginning of each illumination pulse.
The memory can further include instructions executable by the processor to: apply a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite.
The memory can further include instructions executable by the processor to: initialize a chain of transition probability values for the set of transition probabilities associated with the set of hypothetical system states from a Dirichlet process prior; and sample a set of system state trajectories using a forward backward sampling procedure.
In a further aspect, a non-transitory computer readable media includes instructions encoded thereon that are executable by a processor to: access observation data indicative of interactions between a donor molecule and an acceptor molecule of a macromolecular complex, the macromolecular complex having been subjected to a continuous illumination scheme; and iteratively sample a set of probability values associated with values of a set of model parameters from a measurement model. The set of model parameters can include a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes. Further, the set of model parameters can include a load value for each load of a plurality of loads, the load value being sampled from a Bernoulli formulation of a conditional posterior probability distribution of the measurement model that jointly incorporates the set of transition rates and the plurality of loads, each load of the plurality of loads representing a hypothetical system state.
The non-transitory computer readable media can further include instructions executable by a processor to: determine a set of observed system states of the macromolecular complex based on load values for each load of the plurality of loads; and jointly infer, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data.
The observation data can include time-series detector data indicative of photons having a wavelength within a first wavelength range captured at a first channel of a detector device and photons having a wavelength within a second wavelength range captured at a second channel of the detector device over time.
The non-transitory computer readable media can further include instructions executable by a processor to: apply a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite. This can include iteratively sampling the set of transition rates using a Metropolis-Hastings sampling scheme.
Load values being a first value indicate that a corresponding hypothetical system state is active and should be included in the set of observed system states. Load values being a second value indicate that a corresponding hypothetical system state is inactive and should be excluded from the set of observed system states.
In a further aspect, a non-transitory computer readable media includes instructions encoded thereon that are executable by a processor to: access observation data indicative of interactions between a donor molecule and an acceptor molecule of a macromolecular complex; and iteratively sample a set of probability values associated with values of a set of model parameters from a measurement model. The set of model parameters can include: a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes; and a set of transition probability values for a set of transition probabilities associated with a set of hypothetical system states, sampled from a posterior probability distribution using a Gibbs sampling scheme.
The non-transitory computer readable media can further include instructions executable by a processor to: select one or more hypothetical system states for inclusion in a set of observed system states of the macromolecular complex based on the set of transition probability values; and jointly infer, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data.
The observation data can include time-series detector data indicative of photons having a wavelength within a first wavelength range captured at a first channel of a detector device and photons having a wavelength within a second wavelength range captured at a second channel of the detector device over time. In particular, the observation data can further include photon arrival times corresponding with a plurality of illumination pulses applied to the macromolecular complex, where transitions between system states occur at a beginning of each illumination pulse.
The non-transitory computer readable media can further include instructions executable by a processor to: apply a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite.
The non-transitory computer readable media can further include instructions executable by a processor to: initialize a chain of transition probability values for the set of transition probabilities associated with the set of hypothetical system states from a Dirichlet process prior; and sample a set of system state trajectories using a forward backward sampling procedure.
Various embodiments of a unified conceptual framework for single molecule Förster Resonance Energy Transfer (smFRET) analysis of macromolecular systems from single photon arrivals leveraging Bayesian nonparametrics, BNP-FRET. This unified framework addresses the following key physical complexities of an smFRET experiment, including:
These physical features necessarily demand a novel framework that extends beyond existing tools. The present disclosure outlines a second order hidden Markov model (HMM) and Bayesian nonparametrics (BNP) on account of items 1, 2 and 5 on the list, respectively.
This method finds and quantifies states in a molecule by looking at data from FRET experiments. The method uses new combination of tools from BNP in order to analyze the data. This method overcomes the model selection problems by treating the number of states as a parameter to be inferred. Furthermore this method gives more accurate results in a more transparent way.
Förster Resonance Energy Transfer (FRET) has served as a spectroscopic ruler to study motion of components of a macromolecular system at the nanometer scale, and has revealed insight into intra- and intermolecular dynamics of proteins, nucleic acids, and their interactions. In particular, single molecule FRET (smFRET) experiments have been used to determine the pore size and opening mechanism of ion channels sensitive to mechanical stress in the membrane, the intermediate stages of protein folding, and the chromatin interactions modulated by the helper protein HP1α involved in allowing genetic transcription when the chromatin is tightly packed.
A typical FRET experiment involves labeling molecules of interest of the macromolecular system with donor and acceptor dyes such that the donor may transfer energy to the acceptor via dipole-dipole interaction when separated by distances of 2-10 nm. The donor, in turn, is illuminated by a laser in continuous or pulsating fashion to record a trace of photon arrival times, brightness values, and detection channels.
Upon excitation by the laser, a donor may emit a photon itself or transfer its energy non-radiatively to the acceptor which eventually relaxes to emit a photon of a different color. The distance dependence of this energy transfer is the key to using smFRET as a molecular ruler. Furthermore, this property directly manifests itself in the form of higher fraction of photons detected in the acceptor channel when the dyes are closer together.
The aim of smFRET is to capture on-the-fly changes of the macromolecular system in distance. However, this is often confounded by several sources of stochasticity which unavoidably obscure direct interpretation. These include:
Taken together, these problems necessarily contribute to uncertainty in the number of distinct system states visited by a labeled macromolecular system over the course of an experiment.
This disclosure delves into greater detail into items 2 and 4. In particular, item 2 pertains to questions of crosstalk, detector efficiency, dead time, dark current, and instrument response function (IRF) introducing uncertainty in excited photophysical state lifetime assessments.
Item 4 refers to a collection of effects including limited quantum yield and variable brightness due to blinking of dyes caused by nonradiative pathways, photobleaching or permanent deactivation of the dyes, spectral overlap between the donor and acceptor dyes which may result in direct excitation of the acceptors or leaking of photons into the incorrect channel, or a donor-acceptor pair's relative misalignment or positioning resulting in false signals and inaccurate characterization of the separation between labeled molecules.
Though the goal has always remained to analyze the rawest form of data, the reality of these noise properties has traditionally led to the development of approximate binned photon analyses. Binning is either achieved by directly binning photon arrivals when using single photon detectors or by integrating intensity over a few pixels when using widefield detectors.
While binned data analyses can be used to determine the number and connectivity of system states—by computing average FRET efficiencies over bin time windows and using them in turn to construct FRET efficiency histograms—they come at the cost of averaging dynamics existing below a time bin not otherwise easily accessible. They also eliminate information afforded by, say, the excited photophysical state lifetime in the case of pulsed illumination.
While histogram analyses are suited to infer static molecular properties, dynamics over binned time traces have also been extracted by supplementing these techniques with a hidden Markov model (HMM) treatment.
Using HMMs, histogram analyses techniques immediately face the difficulty of an unknown number of system states visited. Therefore, they require the number of system states as an input to deduce the kinetics between the hypothesized system states.
What is more, the histogram analysis' accuracy is determined by bin sizes where large bins may result in averaging of dynamics. Moreover, increasing bin size may lead to estimation of an excess number of system states. This artifact arises when a system appears to artificially spend more time in the system states below the bin size. To address these challenges, continuous time trajectories below the bin size must be inferred through, for example, the use of Markov jump processes while retaining a binned, i.e., discrete measurement model.
When single photon data is available, and to avoid the binning issues inherent to HMM analysis-and leverage, for example, direct noise properties of detectors known for single photon arrivals (e.g., IRF), and knowledge of excited photophysical state lifetimes available from pulsed illumination—single photon arrivals can be directly employed in the analysis. This, naturally, comes with added computational cost.
Often to help mitigate computational costs, further approximations on the system dynamics are often invoked such as assuming system dynamics to be much slower than FRET label excitation and relaxation rates. This approximation helps decouple photophysical and system (molecular) dynamics.
What is more, as they exist, the rigor of direct photon arrival analysis methods are further compromised to help reduce computational cost by treating detector features and background as preprocessing steps. In doing so, simultaneous and self-consistent inference of kinetics and other molecular features becomes unattainable.
Finally, all methods, whether relying on the analysis of binned photons or single photon arrival, suffer from the “model selection problem”. That is, the problem associated with identifying the number of system states are warranted by the data. More precisely, the problem associated with propagating the uncertainty introduced by items 1-4 into a probability over the models (i.e., system states). Existing methods for system state identification only provide partial reprieve.
For example, while FRET histograms identify peaks to intuit the number of system states, these peaks may provide unreliable estimates for a number of reasons:
To address the model selection problem, overfitting penalization criteria (such as the Bayesian information criterion or BIC) or variational Bayesian approaches have been employed. Often, these model selection methods assume implicit properties of the system. For example, the BIC requires the assumption of weak independence between measurements (i.e., ideally independent identically distributed measurements and thus no Markov dynamics in state space) and a unique maximum in the likelihood, both of are violated in smFRET data. Furthermore, BIC and other such methods provide point estimates rather than full probabilities over system states ignoring uncertainty from items 1-4 propagated into model selection.
As such, there is a need to learn distributions over system states and kinetics of a macromolecular system warranted by the data and whose breadth is dictated by the sources of uncertainty discussed above. More specifically, to address model selection and build joint distributions over system states and their kinetics, the number of system states is treated as a random variable just as the current community treats smFRET kinetics as random variables. An objective of the present disclosure is therefore to obtain distributions over all unknowns (including system states and kinetics) while accounting for items 1-4. Furthermore, this must be achieved in a computationally efficient way avoiding, altogether, the draconian assumptions of existing in single photon analysis methods. In other words, there is a need to do more (by learning joint distributions over the number of system states alongside everything else) and it needs to cost less.
When learning distributions over unknowns, then it is convenient to operate within a Bayesian paradigm. Also, if the model (i.e., the number of system states) is unknown, then the model must further specialize to the Bayesian nonparametric (BNP) paradigm. BNPs directly address the model selection problem concurrently and self-consistently while learning the associated model's parameters and output full distributions over the number of system states and the other parameters.
In this disclosure, a description is provided of single photon smFRET analysis within the BNP paradigm addressing noise sources discussed above (items 1-4). In addition, specialized computational schemes are presented for both continuous and pulsed illumination for it to “cost less”.
Indeed, mitigating computational cost becomes critical especially with the added complexity of working within the BNP paradigm, warranting a detailed treatment of continuous and pulsed illumination analyses in two companion manuscripts.
To complement this theoretical framework, a suite of programs is also developed called BNP-FRET written in the compiled language Julia for high performance. These freely available programs allow for comprehensive analysis of single photon smFRET time traces on immobilized molecules obtained with a wide variety of experimental setups.
Section 2 presents a forward model which can be applied in practicality through an inverse strategy discussed herein. The inverse strategy in section 3 enables learning full posteriors within the BNP paradigm. Finally, multiple examples are presented by applying the method to simulated data sets across different parameter regimes. Experimental data are treated in two subsequent companion manuscripts.
The present novel concepts of the disclosure include information from the following publications:
Conventions and terms used herein are defined as follows:
Data collected from typical smFRET experiments analyzed by BNP-FRET is discussed here. In such experiments, donor and acceptor dyes labeling a macromolecular system can be excited using either continuous illumination, or pulsed illumination where short laser pulses arrive at regular time intervals. Moreover, acceptors can also be excited by nonradiative transfer of energy from an excited donor to a nearby acceptor.
Upon relaxation, both donor and acceptor can emit photons collected by single photon detectors. These detectors record the set of photon arrival times and detection channels. Arrival times are denoted by:
and detection channels with:
for a total number of K photons. In the equations above, Tstart and Tend are experiment's start and end times. Further, the strategy used to index the detected photons above is independent of the illumination setup used.
Throughout the experiment, photon detection rates from the donor and acceptor dyes vary as the distance between them changes, due to the kinetics of the macromolecular system. In cases where the distances form an approximately finite set, the macromolecular system can be treated as exploring a discrete system state space. The acquired FRET traces can then be analyzed to estimate the transition rates between these system states assuming a known model (i.e., known number of system states). This assumption of knowing the model a priori is lifted by providing nonparametric models that consider quantity of system states as a parameter to be learned.
The computing device 101 can access the observation data indicative of interactions between the donor molecule and the acceptor molecule of the macromolecular complex 10, and can apply a measurement model (e.g., measurement model 102A for continuous illumination or measurement model 102B for pulsed illumination) which can infer values of various model parameters based on the observations. These model parameters can include transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes, as well as observed system states of the macromolecular complex (as opposed to current methods which require a priori specification of a quantity of system states). In particular, each version of the measurement model applies a Markov Chain Monte Carlo sampling scheme to iteratively sample probability values associated with values of the set of model parameters of the measurement model, the measurement model being constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation descriptive of an evolution of the composite.
Based on the outcome of the measurement model, the computing device 101 can jointly infer, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with the highest likelihood of observing the observation data.
Referring briefly to
Given the observation data, the measurement model 102A iteratively samples the set of transition rates using a Metropolis-Hastings sampling scheme. This includes initializing a chain of samples using a Gamma prior and sampling transition rates by proposing new samples from a proposal distribution and accepting or rejecting the samples based on an acceptance probability. Details about this step are further provided in Sections 3.1 and 3.2.1.
For inferring a set of observed system states of the macromolecular complex, the measurement model 102A determines the set of observed system states using a Bernoulli process prior, where a set of hypothetical system states are each represented by a load of a plurality of loads. In view of the transition rates, the measurement model 102A samples load values for the plurality of loads directly from a Bernoulli form of a posterior probability distribution further outlined in Section 3.2. For load values that are found to have a first value (e.g., where a load value is equal to 1), the associated hypothetical system state can be included in the set of observed system states. Otherwise, for load values that are found to have a second value (e.g., where a load value is equal to 0), the associated hypothetical system state is excluded from the set of observed system states.
Referring briefly to
Given the observation data, the measurement model 102B iteratively samples the set of transition rates using a Metropolis-Hastings sampling scheme similar to that of the measurement model 102A of
For inferring a set of observed system states of the macromolecular complex, the measurement model 102B determines the set of observed system states using an Infinite Hidden Markov Model (iHMM) and Dirichlet process prior, where a set of hypothetical system states are each represented by a transition probability outlined in Section 3.2.2. In contrast with the measurement model 102A of
To derive the likelihood, consider the stochastic evolution of an idealized system, transitioning through a discrete set of total Mσ system states, {σ1, . . . , σM
Assuming Markovianity (memorylessness) of transitions among superstates, the probability of finding the composite in a specific superstate at a given instant evolves according to the master equation:
where the row vector ρ(t) of length Mϕ has elements coinciding with probabilities for finding the system-FRET composite in a given superstate at time t. More explicitly, defining the photophysical portion of the probability vector ρ(t) corresponding to system state σi as
ρ(t) can be written as
Furthermore, in the master equation above, G is the generator matrix of size Mϕ×Mϕ populated by all transition rates λϕ
Each diagonal element of the generator matrix G corresponds to self-transitions and is equal to the negative sum of the remaining transition rates within the corresponding row. That is,
This results in zero row-sums, assuring that ρ(t) remains normalized at all times as described later in more detail (see Eq. 5). Furthermore, for simplicity, the present disclosure assumes no simultaneous transitions among system states and photophysical states as such events are rare (although the incorporation of these events in the model may be accommodated by expanding the superstate space). This assumption results in λ(ψ
where the matrices on the diagonal Gσ
For later convenience, the system transition rates λσ→σ
Moreover, the explicit forms of Gσ
where the * along the diagonal represents the negative row-sum of the remaining elements, λex is the excitation rate, λd and λa are the donor and acceptor relaxation rates, respectively, and λdirect is direct excitation of the acceptor by a laser, and λσ
where the ratio on the right hand side represents the fraction of FRET transitions among all competing transitions out of an excited donor, that is, the fraction of emitted acceptor photons among total emitted photons.
With the generator matrix at hand, the present disclosure now outlines solutions to the master equation of Eq. 1. Due to its linearity, the master equation accommodates the following analytical solution:
illustrating how the probability vector ρ(t) arises from the propagation of the initial probability vector at time t0 by the exponential of the generator matrix (the propagator matrix Π(t−t0)). The exponential maps the transition rates λϕ
After describing the generator matrix and deriving the solution to the master equation, the present disclosure continues by explaining how to incorporate observations into a likelihood.
In the absence of observations, any transition among the set of superstates are unconstrained. However, when monitoring the system using suitable detectors, observations rule out specific transitions at the observation time. For example, ignoring background for now, the detection of a photon from a FRET label identifies a transition from an excited photo-physical state to a lower energy photophysical state of that label. On the other hand, no photon detected during a time period indicates the absence of radiative transitions or the failure of detectors to register such transition. Consequently, even periods without photon detections are informative in the presence of a detector. In other words, observations from a single photon smFRET experiment are continuous in that they are defined at every point in time.
Additionally, since smFRET traces report radiative transitions of the FRET labels at photon arrival times, uncertainty remains about the occurrences of unmonitored transitions (e.g., between system states). Put differently, smFRET traces (observations) only partially specify superstates at any given time.
Now, to compute the likelihood for such smFRET traces, those probabilities must be summed over all trajectories across superstates (superstate trajectories) consistent with a given set of observations. Assuming the system ends in superstate ϕi at Tend, this sum over all possible trajectories can be very generally given by the element of the propagated vector ρ(Tend) corresponding to superstate ϕi. Therefore, a general likelihood may be written as
However, as the final superstate at time Tend is usually unknown, it is important to marginalize (sum) over the final superstate to obtain the following likelihood
where all elements of the vector ρnorm are set to 1 as a means to sum the probabilities in vector ρ(Tend). In the following sections, the present disclosure describes how to obtain concrete forms for these general likelihoods.
For pedagogical reasons, it is helpful to first look at the trivial case where a system-FRET composite evolves but no observations are made (due to a lack, say, of detection channels). In this case, all allowed superstate trajectories are possible between the start time of the experiment, Tstart, and end, Tend. This is because the superstate cannot be specified or otherwise restricted at any given time by observations previously explained. Consequently, the probability vector ρ(t) remains normalized throughout the experiment as no superstate trajectory is excluded. As such, the likelihood is given by summing over probabilities associated to the entire set of trajectories, that is,
where {e1, . . . eK} are the emission times of all emitted photons, not recorded due to lack of detection channels and thus not appearing on the right hand side of the expression.
In what follows, the present disclosure describes how the probability vector ρ(t) does not remain normalized as it evolves to ρ(Tend) when detectors partially collapse knowledge of the occupied superstate during the experiment. This results in a likelihood smaller than one. This is applied for the conceptually simpler case of continuous illumination for now.
To compute the likelihood when single photon detectors are present, start by defining a measurement model where the observation at a given time is dictated by ongoing transitions and detector features (e.g., crosstalk, detector efficiency). As discussed in more detail later, if the evolution of a system can be described by defining its states at a discrete time points and these states are not directly observed, and thus hidden, then this measurement model adopts the form of a hidden Markov Model (HMM). Here, Markovianity arises when a given hidden state only depends on its immediate preceding hidden state. In such HMMs, an observation at a given time is directly derived from the concurrent hidden state.
As an example of an HMM, for binned smFRET traces, an observation is often approximated to depend only on the current hidden state. However, contrary to such a naive HMM, an observation in a single photon setup in a given time period depends on the current superstate and the immediate previous superstate. This naturally enforces a second order structure on the HMM where each observed random variable depends on two super-states as demonstrated herein. A similar HMM structure was noted previously to model a fluorophore's photo switching behavior.
Now, in order to address this observation model, the experiment's time duration can be divided into N windows of equal size, ϵ=(Tend−Tstart)/N. The continuum limit ϵ→0 will eventually be applied to recover the original system as described by the master equation. In addition, the present disclosure also contemplates summing over all possible transitions between superstates within each window. These windows are marked by the times (see
where the n-th window is given by (tn−1, tn) with t0=Tstart and tN=Tend. Corresponding to each time window, observations include:
where wn=∅ if no photons are detected and w={(Tn(1), cn(1)), (Tn(2), cn(2)), . . . } otherwise, with the j-th photon in a window being recorded by the channel cn(j) at time Tn(j). Note here that observations in a time window, being a continuous quantity, allow for multiple photon arrivals or none at all.
As mentioned earlier, each of these observations originate from the evolution of the superstate. Therefore, superstates occupied at the beginning of each window are defined as
where an is the superstate at the beginning of the n-th time window as shown in
After describing the issue in computing the likelihood due to the second order structure of the HMM, the present disclosure now describe a solution to this problem. As such, to simplify the likelihood calculation, the present disclosure temporarily introduces superstates bn at the end of n-th window separated from superstate an+1 at the beginning of (n+1)-th window by a short time τ as shown in
which now permits a recursive strategy for likelihood calculation as described in the next section. Furthermore, the τ→0 limit can eventually be applied to obtain the likelihood of the original HMM with the second order structure.
The analysis outlined above provides the means to compute the terminal probability vector ρend=ρ(Tend) by evolving the initial vector ρstart=ρ(Tstart). This is most conveniently achieved by recursively marginalizing (summing) over all superstates in Eq. 9 backwards in time, starting from the last superstate aN+1 as follows
where N+1(aN+1) are elements of the vector N+1 of length Mϕ, commonly known as a filter. Moving backwards in time, the filter at the beginning of the nth time window, n (an+1), is related to the filter at the end of the nth window, n (bn), due to Markovianity, as follows
or in matrix notation as
where p(an+1|bn) are the elements of the transition probability matrix {tilde over (Π)}n described in the next section. Again due to Markovianity, the filter at the end of the nth window, n (bn), is related to the filter at the beginning of the same time window, n (an), as
or in matrix notation as
where the terms p(wn|an, bn, G)p(bn|an, G) populate the transition probability matrix Πn(r) described in the next section. Here, the superscript (r) denotes that elements of this matrix include observation probabilities. Note that the last filter in the recursion formula, 1, is equal to starting probability vector ρstart itself.
To derive the different terms in the recursive filter formulas, note that the transition probabilities p(an|bn−1, G) and p(bn|an, G) do not involve observations. As such, the full propagator can be used as follows
respectively. On the other hand, the term p(wn|an, bn, G) includes observations which results in modification to the propagator by ruling out a subset of transitions. For instance, observation of a photon momentarily eliminates all nonradiative transitions. The modifications now required can be structured into a matrix Dn of the same size as the propagator with elements (Dn)a
relating the modified propagator (termed reduced propagator and distinguished by the superscript (r) hereafter) (Πn(r))a
where the symbol ⊙ represents element-by-element product of matrices. Here, however, the detection matrices cannot yet be computed analytically as the observations wn allow for an arbitrary number of transitions within the finite time window (tn−1, tn). However, they become manageable in the limit that the time windows become vanishingly small, as demonstrated further herein.
2.3.5 Likelihood for the HMM with Second Order Structure
Now, inserting the matrix expressions for filters of Eq. 13 into the recursive formula likelihood Eq. 10:
where, in the second step, a row vector of ones, ρnorm, is added at the end to sum over all elements. Here, the superscript T denotes matrix transpose. As such, the structure of the likelihood above amounts to propagation of the initial probability vector ρstart to the final probability vector ρ(Tend) via multiple propagators corresponding to N time windows.
where is the identity matrix. In this limit, the likelihood for the HMM with a second order structure can be recovered as
Note here that the final probability vector ρ(Tend) is not normalized to one upon propagation due to the presence of reduced propagators corresponding to observations. More precisely, the reduced propagators restrict the superstates evolution to only a subset of trajectories over a time window ϵ in agreement with the observation over this window. This, in turn, results in a probability vector whose elements sum to less than one. That is,
Up until now, the finite size of the time window e allowed for an arbitrary number of transitions per time window (tn−1, tn), which hinders the computation of an exact form for the detection matrices. Here, the continuum limit is taken, as the time windows become vanishingly small (that is, ϵ→0 as N→∞). Thus, no more than one transition is permitted per window. This enables full specification of the detection matrices Dn.
To derive the detection matrices, first assume ideal detectors with 100% efficiency (detector effects are examined in subsequent sections, see Sec. 2.4). In such cases, the absence of photon detections during a time window, while detectors are active, indicates that only nonradiative transitions took place. Thus, only nonradiative transitions have nonzero probabilities in the detection matrices. As such, for evolution from superstate an to bn, the elements of the nonradiative detection matrix, Dnon, are given by
On the other hand, when the k-th photon is recorded in a time window, only elements corresponding to radiative transitions are nonzero in the detection matrix denoted by Dkrad as
Here, note that the radiative detection matrices have zeros along their diagonals, since self-transitions are nonradiative.
The reduced propagators corresponding to the nonradiative and radiative detection matrices, Dnon and Dkrad, can now be defined using the Taylor approximation limϵ→0Πn=+ϵG+(ϵ2) as
In the equations above, Gnon=G⊙Dnon and Gkrad⊙Dkrad, where the symbol ⊙ represents an element-by-element product of the matrices. Furthermore, the product between the identity matrix and Dkrad above vanishes in the radiative propagator due to zeros along the diagonals of Dkrad.
With the asymptotic forms of the reduced propagators in Eq. 14 now defined in the last subsection, all the ingredients needed to arrive at the final form of the likelihood have been identified.
To do so, begin by considering the period right after the detection of the (k−1)th photon until the detection of the k-th photon. For this time period, the nonradiative propagators in Eq. 14 can now be easily merged into a single propagator Πknon=exp((Tk→Tk−1)Gnon), as the commutative arguments of the exponentials can be readily added. Furthermore, at the end of this interphoton period, the radiative propagator Πk(r)rad marks the arrival of the k-th photon. The product of these two propagators
now governs the stochastic evolution of the system-FRET composite during that interphoton period.
Inserting Eq.20 for each interphoton period into the likelihood for the HMM with second order structure in Eq. 14, the desired likelihood is outlined:
2.3.8 Effect of Binning Single Photon smFRET Data
When considering binned FRET data, the time period of an experiment (Tend−Tstart) is typically divided into a finite number (=N) of equally sized (=ϵ) time windows (bins), and the photon counts (intensities) in each bin are recorded in the detection channels. This is in contrast to single photon analysis where individual photon arrival times are recorded. To arrive at the likelihood for such binned data, the present disclosure starts with the single photon likelihood derived in Eq. 15 where c is not infinitesimally small, that is,
where
or in the matrix notation
where Dn is the detection matrix introduced in Sec. 2.3.4.
Next, it is necessary to sum over all superstate trajectories that may give rise to the recorded photon counts (observations) in each bin. However, such a sum is challenging to compute analytically. Here, the present disclosure will only show likelihood computation under commonly applied approximations/assumptions when analyzing binned smFRET data, which are: 1) bin size e is much smaller than typical times spent in a system state or, in other words, for a system transition rate λσ
The first assumption is based on realistic situations where system kinetics (at seconds timescale) are many orders of magnitude slower than the photophysical transitions (at nanoseconds timescale). This timescale separation allows us to simplify the propagator calculation in Eq. 23. To see that, the system transition rates can be separated from photo-physical transition rates in the generator matrix as
where ⊙ denotes a tensor product, Gσ is the portion of generator matrix G containing only system transition rates previously defined in Eq. 3, and Gψ is the portion containing only photophysical transition rates, that is,
where Gσ
Now plugging Eq. 24 into the full propagator Π=exp(ϵG) and applying the famous Zassenhaus formula for matrix exponentials:
where the square brackets represent the commutator of the constituting matrices and the last term represents the remaining exponentials involving higher order commutators. Furthermore, the commutator [Gσ⊗, Gψ] results in a very sparse matrix given by
where
Now, the propagator calculation in Eq. 27 simplifies if the commutator ϵ2[Gσ⊗, Gψ]→0, implying that either the bin size ϵ is very small such that ϵλσ
where the last equality follows from the block diagonal form of Gσ ⊗ given in Eq. 25 and Πσ=exp(ϵGσ) is the system transition probability matrix (propagator) given as
Moreover, Πψ=exp(ϵGψ) is the photophysical transition probability matrix (propagator) as
where the elements are given as Πσ
After deriving the full propagator Π for the time period ϵ (bin) under the first assumption, the present disclosure now proceeds to incorporate observations during this period via detection matrices Dn to compute the reduced propagator of Eq. 23. To do so, the second assumption of relatively slower excitation rate λex can be applied. This assumption implies that interphoton periods are dominated by the time spent in the ground state of the FRET pair and are distributed according to a single exponential distribution, Exponential(λex). Consequently, the total photon counts per bin follow a Poisson distribution, Poisson(ϵλex), independent of the photophysical portion of the photophysical trajectory taken from superstate an to an+1.
Now, the first and the second assumptions imply that the observation during the n-th bin only depends on the system state sn (or the associated FRET rate λs
Using these approximations, the reduced propagator in Eq. 23 can now be written as
Next, to compute the likelihood for the n-th bin, it is necessary to sum over all possible superstate trajectories within this bin as
where ρn is a normalized row vector populated by probabilities of finding the system-FRET composite in the possible superstates at the beginning of the n-th bin. Furthermore, portions of ρn corresponding to system state σi as are written as ρσ
following the convention in Sec. 2.3 and using n to now represent time tn.
Moreover, since each row of Πσ
where
is defined as the probability of the system to occupy system state σi. The previous equation can also be written in matrix form as
where ρnσ is a row vector of length Mσ (number of system states) populated by ρσ
Therefore, these new propagators can be multiplied for each bin to approximate the likelihood of Eq. 22 as
where ρstartσ is a row vector, similar to ρnσ, populated by probabilities of being in a given system state at the beginning of an experiment.
To conclude, the two assumptions regarding system kinetics and excitation rate enable significant reduction of the dimensions of the propagators. This, in turn, leads to much lowered expense for likelihood computation. However, cheaper computation comes at the expense of requiring large number of photon detections or excitation rate per bin to accurately determine FRET efficiencies (identify system states) due to marginalization over photophysics in each bin. Such high excitation rates lead to faster photobleaching and increased photo-toxicity, and thereby much shorter experiment durations. As discussed in Sec. 2.5.1, this problem can be mitigated by using pulsed illumination, where the likelihood takes a similar form as Eq. 38, but FRET efficiencies can be accurately estimated from the measured microtimes.
The analysis in the previous section assumed idealized detectors to illustrate basic ideas on detection matrices. However, realistic FRET experiments must typically account for detector nonidealities. For instance, many emitted photons may simply go undetected when the detection efficiency of single photon detectors, i.e., the probability of an incident photon being successfully registered, is less than one due to inherent nonlinearities associated with the electronics or the use of filters in cases of polarized fluorescent emission. Additionally, donor photons may be detected in the channel reserved for acceptor photons or vice-versa due to emission spectrum overlap. This phenomenon, commonly known as crosstalk, crossover, or bleedthrough, can significantly affect the determination of quantities such as transition rates and FRET efficiencies. Other effects adding noise to fluorescence signal include dark current (false signal in the absence of incident photons), dead time (the time a detector takes to relax back into its active mode after a photon detection), and timing jitter or IRF (stochastic delay in the output signal after a detector receives a photon). This section of the present disclosure describes the incorporation of all such effects into the model except dark current and background emissions which require more careful treatment and will be discussed in Sec. 2.6.
Noise sources such as crosstalk and detection efficiency necessarily result in photon detection being treated as a stochastic process. Both crosstalk and detection efficiency can be included into the propagators in both cases by substituting the zeros and ones, appearing in the ideal radiative and nonradiative detection matrices (Eqs. 16-17), with probabilities between zero and one. In such a way, the resulting propagators obtained from these detection matrices, in turn, incorporate into the likelihood the effects of crosstalk and detection efficiency into the model.
Here, in the presence of crosstalk, for clarity, a superscript is added to the radiative detection matrix of Eq. 17 for the k-th photon, Dkrad−ct. The elements of this detection matrix for the an→bn transition, when a photon intended for channel j is registered in channel i reads
where ϕji is the probability for this event (upon transition from superstate an to bn). Further, detector efficiencies can also be accounted for in these probabilities in order to represent the combined effects of crosstalk, arising from spectral overlap, and absence of detection channel registration. Upon doing so,
(for cases where i and j can be both the same or different) can be recovered, as not all emitted photons can be accounted for by the detection channels.
This new detection matrix above results in the following modification to the radiative propagator of Eq. 19 for the k-th photon
The second equality above follows by recognizing that the identity matrix multiplied, element-wise, by Dkrad−ct is zero. By definition, Gkrad−ctis the remaining nonzero product.
On the other hand, for time periods when no photons are detected, the nonradiative detection matrices in Eq. 16 becomes
where the sum gives the probability of the photon intended for channel i to be registered in any channel. The nonradiative propagator of Eq. 18 for an infinitesimal period of size e in the presence of crosstalk and inefficient detectors is now:
where Gnon−ct=G⊙Dnon−ct. With the propagators incorporating crosstalk and detection efficiency now defined, the evolution during an interphoton period between the (k−1)-th photon and the k-th photon of size (Tk−Tk−1) is now governed by the product:
where the nonradiative propagators in Eq. 39 have now been merged into a single propagator Πknon−ct=exp((Tk−Tk−1)Gnon−ct) following the same procedure as Eq. 20.
Finally, inserting Eq. 40 for each interphoton period into the likelihood of Eq. 14, the final likelihood incorporating crosstalk and detection efficiency can be arrived at:
After incorporating crosstalk and detector efficiencies into the model, the calibration of the crosstalk probabilities/detection efficiencies ϕij is briefly outlined here. To calibrate these parameters, two samples, one containing only donor dyes and another containing only acceptor dyes, are individually excited with a laser under the same power to determine the number of donor photons ndiraw and number of acceptor photons nairaw detected in channel i.
From photon counts recorded for the donor only sample, assuming ideal detectors with 100% efficiency, the crosstalk probabilities for donor photons going to channel i, ϕdi, can be computed using the photon count ratios as ϕdi=ndiraw/ndem where ndem is the absolute number of emitted donor photons. Similarly, crosstalk probabilities for acceptor photons going to channel i, ϕai, can be estimated as ϕai=nairaw/naem where naem is the absolute number of emitted acceptor photons. In the matrix form, these crosstalk factors for a two-detector setup can be written as
Using this matrix, for the donor only sample:
and similarly for the acceptor only sample:
and naem experimentally, and therefore the crosstalk factors in A can only be determined up to multiplicative factors of ndem and naem.
Since scaling the photon counts in an smFRET trace by an overall constant does not affect the FRET efficiency estimates (determined by photon count ratios) and escape rates (determined by changes in FRET efficiency), crosstalk factors up to a constant are required as in the last equation.
For this reason, one possible solution toward determining the matrix elements of A up to one multiplicative constant is to first tune dye concentrations such that the ratio ndem/naem=1, which can be accomplished experimentally. This enables to write the crosstalk factors in the matrix form up to a constant as follows:
It is common to set the multiplicative factor in Eq. 44 by the total donor photons counts Σjndjraw to give:
Note that from the convention adopted here, ϕd1+ϕd2=1.
Furthermore, in situations where realistic detectors affect the raw counts, the matrix elements of A as computed above automatically incorporate the effects of detector inefficiencies including the fact that
Additionally, the matrix A can be further generalized to account for more than two detectors by appropriately expanding the size of the matrix dimensions to coincide with the number of detectors. Calibration of the matrix elements then follows the same procedure as above.
Now, in performing single photon FRET analysis, the elements of A can be used directly in constructing the measurement matrix. However, it is also common to compute the matrix elements of A from what is termed the route correction matrix (RCM) typically used in binned photon analysis. The RCM is defined as the inverse of A to obtain corrected counts ndem and naem up to a proportionality constant as
Typically, a detection channel i becomes inactive (dead) after the detection of a photon for a period δi as specified by the manufacturer. Consequently, radiative transitions associated with that channel cannot be monitored during that period.
To incorporate this detector dead period into the likelihood model, an inter-photon period between the (k−1)-th and k-th photon is broken into two intervals: the first interval with an inactive detector and the second interval with an active detector. Assuming that the (k−1)-th photon is detected in the ith channel, the first interval is thus δik long. As such, the detection matrix for this interval can be defined as:
Next, corresponding to this detection matrix, the propagator can be written as follows:
which evolves the superstate during the detector dead time. This propagator can now be used to incorporate the detector dead time into Eq. 20 to represent the evolution during the period between the (k−1)-th and k-th photons as:
where ΠknonΠk(r)rad describes the evolution when the detector is active.
Finally, inserting Eq. 47 for each interphoton period into the likelihood for the HMM with a second order structure in Eq. 14, the following likelihood that includes detector dead time is as follows:
To provide an explicit example on the effect of the detector dead time on the likelihood, the present disclosure takes a detour for pedagogical reasons. In this context, consider a very simple case of one detection channel (dead time δ) observing a fluorophore with two photophysical states, ground (ψ1) and excited (ψ2), illuminated by a laser. The data in this case includes only photon arrival times:
{T1, Ts, T3, . . . , TK}. (49)
The generator matrix containing the photophysical transition rates for this setup is:
where the * along the diagonal represents the negative row-sum of the remaining elements, λex is the excitation rate, and λd is the donor relaxation rate.
Here, all transitions are possible during detector dead times as there are no observations. As such, the dead time propagators in the likelihood (Eq. 48) are simply expressed as exponentials of the full generator matrix, that is, Πki
As demonstrated herein, these dead times, similar to detector inefficiencies, simply increase the uncertainty over parameters to be learned, such as kinetics, by virtue of providing less information. By contrast, background emissions and cross talk, provide false information. However, the net effect is the same: all noise sources increasing uncertainty.
Due to various sources of noise impacting the detection timing electronics (also known as jitter), the time elapsed between photon arrival and detection is itself a hidden (latent) random variable. Under continuous illumination, this stochastic delay in time is sampled from a probability density, ƒ(τ), termed the detection instrument response function (IRF). To incorporate the detection IRF into the likelihood of Eq. 48, the propagators are convoluted with ƒ(τ) as follows:
where dead time propagators Πki
To facilitate the computation of this likelihood, the systems outlined herein leverage the fact that typical acquisition devices record at discrete (but very small) time intervals. For instance, a setup with the smallest acquisition time of 16 ps and a detection IRF distribution that is approximately 100 ps wide will have the detection IRF spread over, roughly, six acquisition periods. This allows each convolution integral to be discretized over the six acquisition intervals and computed in parallel, thereby avoiding extra real computational time associated to this convolution other than the overhead associated with parallelization.
After discussing detector effects, the present disclosure further considers different illumination features. For simplicity alone, the likelihood computation until now assumed continuous illumination with a uniform intensity. More precisely, the element λex of the generator matrix in Eq. 4 was assumed to be time-independent. Here, the formulation is generalized. Further, the present disclosure shows how other illumination setups (such as pulsed illumination and alternating laser excitation, ALEX) can be incorporated into the likelihood by simply assigning a time dependence to the excitation rate λex(t).
Here, consider an smFRET experiment where the FRET pair is illuminated using a laser for a very short period of time (a pulse), δpulse, at regular intervals of size τ; see
where Qn, with n=1, . . . , N, denotes the propagator evolving the superstate during the n-th interpulse period between the (n−1)-th and the n-th pulse.
To derive the structure of Qn during the n-th interpulse period, Qn can be broken into two portions: 1) pulse with nonzero laser intensity where the evolution of the FRET pair is described by the propagator Πnpulse introduced shortly; 2) dark period with zero illumination intensity where the evolution of the FRET pair is described by the propagator Πndark introduced shortly. Furthermore, depending on whether a photon is detected or not over the n-th interpulse period the propagators Πnpulse and Πndark assume different forms.
First, when no photons are detected:
where the integration over the pulse period now involves a time dependent Gnon due to temporal variations in λex(t). The integral in Eq. 52 is sometimes termed the excitation IRF though this convention will not be used here. For this reason, when referring to IRF herein, detection IRF alone is implied. Additionally, Gdark is the same as Gnon except for the excitation rate that is now set to zero due to lack of illumination. Finally, the propagator for an interpulse period with no photon detection can now be written as
On the other hand, if a photon is detected sometime after a pulse (as in
where μn is the photon arrival time measured with respect to the n-th pulse (also termed microtime) as shown in in
Moreover, the propagator can be constructed for situations where a photon is detected during a pulse itself in a similar fashion. Here, the propagator Πndark remains the same as in Eq. 53 but Πnpulse must now be modified to include the radiative generator matrix Gnrad as:
The propagators derived so far in this section assumed ideal detectors. The present disclosure now describes a procedure to incorporate the IRF into this formulation. This is especially significant for accurate estimation of fluorophores' lifetimes, which is commonly done in pulse illumination smFRET experiments. To incorporate the IRF, the same procedure is followed as in Sec. 2.4.3 convolution is introduced between the IRF function ƒ(ϵ) and propagators above involving photon detections. That is, when there is a photon detected during the dark period, the propagator Πndark can be modified as:
while the Πnpulse stays the same as in Eq. 52. Here, ϵn is the stochastic delay in photon detection resulting from the IRF as shown in in
Moreover, when there is a photon detected during a pulse, the propagator Πnpulse of Eq. 56 can be modified in a similar fashion to accommodate the IRF, while the propagator Πndark remains the same as in Eq. 53.
The propagators Qn presented in this section involve integrals over large generator matrices that are analytically intractable and computationally expensive when considering large pulse numbers. Therefore, a strategy similar to Sec. 2.3.8 for binned likelihood can be applied to approximate these propagators.
To reduce the complexity of the calculations, one may start by making realistic approximations. Given the timescale separation between the interpulse period (typically tens of nanoseconds) and the system kinetics (typically of seconds timescale) in a pulsed illumination experiment, it is possible to approximate the system state trajectory as being constant during an inter-pulse period. In essence, rather than treating the system state trajectory as a continuous time process, the trajectory can be discretized such that system transitions only occur at the beginning of each interpulse period. This enables separation of the photophysical part of the generator matrix Gψ in Eq. 4 from the portion describing the evolution of the system under study Gσ given in Eq. 3. Here, by contrast to the likelihood shown in Sec. 2.5.1 for pulsed illumination, the photophysical and system likelihood portions can be independently computed, as described below.
To derive the likelihood, the system state propagator during an interpulse period can be written as:
Furthermore, observations must be incorporated into these propagators by multiplying each system transition probability in Πσ, πσ
where ⊙ again represents the element-by-element product. Here, the elements of Dnσ depend on the photophysical portion of the generator matrix Gψ. Note here that propagator matrix dimensions are now Mσ×Mσ making them computationally less expensive than in the continuous illumination case. Finally, the likelihood for the pulsed illuminated smFRET data with these new propagators reads:
which, similar to the case of binned likelihood under continuous illumination (see Sec. 2.3.8), sums over all possible system state trajectories.
This likelihood can be used later to put forward an inverse model to learn transition probabilities (elements of œσ) and photophysical transition rates appearing in Gψ.
Here, consider background photons registered by detectors from sources other than the labeled system under study. The majority of background photons comprise ambient photons, photons from the illumination laser entering the detectors, and dark current (false photons registered by detectors).
Due to the uniform laser intensity in the continuous illumination case, considered in this section, all background photons can be modeled using a single distribution from which waiting times are drawn. Often, such distributions are assumed (or verified) to be Exponential with fixed rates for each detection channel. Here, the waiting time distribution for background photons arising from both origins is modeled as a single Exponential, as is often the most common case. However, in the pulsed illumination case, laser source and the two other sources of background require different treatments due to nonuniform laser intensity. That is, the ambient photons and dark current are still modeled by an Exponential distribution though it is often further approximated as a Uniform distribution given that interpulse period if much shorter than the average background waiting time.
Now, the background can be incorporated into the likelihood under continuous illumination. This is achieved, as mentioned earlier, by assuming an Exponential distribution for the background, which effectively introduces new photophysical transitions into the model. As such, these transitions may be incorporated by expanding the full generator matrix G (described in Sec. 2.3) appearing in the likelihood, thereby leaving the structure of the likelihood itself intact, c.f., Eq. 21.
To be clear, in constructing the new generator matrix, the background in each detection channel is treated as if originating from fictitious independent emitters with constant emission rates (exponential waiting time). Furthermore, it is assumed that an emitter corresponding to channel i is a two state system with photophysical states denoted by:
Here, each transition to the other state coincides with a photon emission with rate λibg. As such, the corresponding background generator matrix for channel i can now be written as:
Since the background emitters for each channel are independent of each other, the expanded generator matrix G for the combined setup (system-FRET composite plus background) can now be computed. This can be achieved by combining the system-FRET composite state space and the background state spaces for all of the total C detection channels using Kronecker sums as
where the symbol ⊕ denotes the matrix Kronecker sum, and Gno−bg represents previously shown generator matrices without any background transition rates.
The propagators needed to compute the likelihood can now be obtained by exponentiating the expanded generator matrix above as
where the symbol ® denotes the matrix Kronecker product (tensor product).
Furthermore, the same detection matrices defined earlier to include only nonradiative transitions or only radiative transitions, and their generalization with crosstalk and detection efficiency, can be used to obtain nonradiative and radiative propagators, as shown in Sec. 2.3.6.
Consequently, as mentioned earlier, by contrast to incorporating the effects of dead time or IRF, addition of background sources do not entail any changes in the basic structure (arrangement of propagators) of the likelihood appearing in Eq. 21.
As demonstrated for background in the previous section, to incorporate new photophysical transitions such as fluorophore blinking and photobleaching into the likelihood, the full generator matrix G must be modified. This can again be accomplished by adding extra photophysical states, relaxing nonradiatively, to the fluorophore model. These photophysical states can have long or short lifetimes depending on the specific photophysical phenomenon at hand. For example, donor photobleaching can be included by introducing a third donor photophysical state into the matrix of Eq. 4 without any escape transitions as follows:
where ψ1 is the lowest energy combined photophysical state for the FRET labels, ψ2 represents the excited donor, ψ3 represents the excited acceptor, and ψ4 represents a photobleached donor, respectively. Additionally, λd and λa denote donor and acceptor relaxation rates, respectively, λbleach represents permanent loss of emission from the donor (photobleaching), and λσ
Fluorophore blinking can be implemented similarly, except with a nonzero escape rate out of the new photophysical state, allowing the fluorophore to resume emission after some time. Here, assuming that the fluorophore cannot transition into the blinking photophysical state from the donor ground state results in the following generator matrix:
So far, this discussion has ignored direct excitation of acceptor dyes in the likelihood model. This effect can also be incorporated into the likelihood by assigning a nonzero value to the transition rate λψ
Other photophysical phenomena can also be incorporated into the likelihood by following the same procedure as above. Finally, just as when adding background, the structure of the likelihood (arrangement of the propagators) when treating photophysics (including adding the effect of direct acceptor excitation) stays the same as in Eq. 21.
The previous subsections described how to compute the likelihood, which is the sum of probabilities over all possible superstate trajectories that could give rise to the observations made by a detector, as demonstrated in Sec. 2. Here, the present disclosure demonstrates how one such superstate trajectory can be simulated to produce synthetic photon arrival data using the Gillespie algorithm, as described in the next section followed by the addition of detector artefacts. The generated data can be used to test the BNP-FRET sampler.
The Gillespie algorithm generates two sets of random variables. The times at which super-states change (indexed 1 through N). These times occur anywhere along a continuous time grid. The next set of random variables are the states associated to the superstate preceding the time at which the superstate changes.
The sequence of superstates is designated as:
where bn∈{ϕ1, ϕ2, . . . , ϕM
Now, to generate the superstate sequence above, the first superstate, b1, is randomly drawn from the set of possible superstates given their corresponding probabilities. Next, second superstate b2 of the sequence is drawn using the set of transition rates out of the first state with self-transitions excluded by construction. Now, after choosing b2, the holding time h1 (the time spent in b1) is generated from the Exponential distribution with rate constant associated with transitions b1→b2. Finally, the two previous steps are repeated to sequentially generate the full sequence of superstates along with the corresponding holding times.
More formally, a trajectory is generated, by first sampling the initial superstate as
where ρstart is the initial probability vector and the Categorical distribution is the generalization of the Bernoulli distribution for more than two possible outcomes. The remaining superstates can now be sampled as:
where
is the escape rate for the superstate bn and rates for self-transitions are zero. The above equation reads as follows: “the superstate bn+1 is drawn (sampled) from a Categorical distribution given the superstate bn and the generator matrix G”.
Once the n-th superstate bn is chosen, the holding time hn (the time spent in bn) is sampled as follows
Finally, with ideal detectors, the detection channel ck is assigned deterministically to the k-th photon emitted at time Tkem, which can be computed by summing all the holding times preceding the corresponding radiative transition.
Furthermore, in the presence of detection effects, such as crosstalk, detection efficiency, and IRF, another layer of stochasticity originating from the measurement model must be added to the stochastic output of the Gillespie simulation. That is, the detection channel and detection times are stochastically assigned to an emitted photon, as described below.
In the presence of crosstalk and inefficient detectors, the detection channel for the k-th photon emitted upon a radiative transition is selected as:
where p∅k, p1k and p2k, respectively, denote the probability of the photon going undetected, being detected in channel 1 and channel 2.
Moreover, in the presence of the IRF, a stochastic delay ϵk, sampled from a probability distribution f(ϵ), is assigned to the absolute photon emission time Tkem. This results in the detection time, Tk=Tkem+ϵk, as registered by the timing hardware.
Additionally, when photophysical effects (such as blinking and photobleaching) and background are present, a superstate trajectory can be generated following the same procedure as above using the generator matrices G incorporating these effects as described in the previous sections.
Finally, the desired smFRET trace is obtained (see
This section builds on the forward model outlined above to show how to obtain parameters of interest for single photon smFRET analysis within the BNP paradigm.
Now, armed with the likelihood for different experimental setups and a means by which to generate synthetic data (or having experimental data at hand), the parameters of interest can be learned. Assuming precalibrated detector parameters, these include transition rates that enter the generator matrix G, and elements of ρstart that denote probabilities for the macromolecular system-FRET composite to be in a given superstate at the beginning of the experiment. However, accurate estimation of the unknowns requires an inverse strategy capable of dealing with all existing sources of uncertainty in the problem such as photon's stochasticity and detector noise. This naturally leads to adoption of a Bayesian inference framework where Monte Carlo methods can be applied to learn distributions over the parameters.
The distribution of interest over the unknown parameters to be learned is termed the posterior. The posterior is proportional to the product of the likelihood and prior distributions using Bayes' rule as follows:
where the last term p(G, ρstart) is the joint prior distribution over G and ρstart defined over the same domains as the parameters. The prior is often selected on the basis of computational convenience. The influence of the prior distribution on the posterior diminishes as more data is incorporated through the likelihood. Furthermore, the constant of proportionality is the inverse of the absolute probability of the collected data, 1/p(w), and can be safely ignored as generation of Monte Carlo samples only involves ratios of posterior distributions or likelihoods.
Additionally, the ϵK factor in the likelihood first derived in Eq. 21 can be absorbed into the proportionality constant as it does not depend on any of the parameters of interest, resulting in the following expression for the posterior (in the absence of detector dead time and IRF for simplicity)
Next, assuming a priori that different transition rates are independent of each other and initial probabilities, the prior can be simplified as follows:
where the Dirichlet prior distribution over initial probabilities is selected as this prior is conveniently defined over a domain where the probability vectors, drawn from it, sum to unity. That is,
where the Dirichlet distribution is a multivariate generalization of the Beta distribution and ζ is a vector of the same size as the superstate space. Typically parameters of the prior are termed hyperparameters and as such ζ collects as many hyperparameters as its size.
Additionally, Gamma prior distributions are selected for individual rates. That is,
guaranteeing positive values. Here, α and λref (a reference rate parameter) are hyperparameters of the Gamma prior. For simplicity, these hyperparameters are usually chosen (with appropriate units) such that the prior distributions are very broad, minimizing their influence on the posterior.
Furthermore, to reduce computational cost, the number of parameters to be learned can be reduced by reasonably assuming the macromolecular system was at steady-state immediately preceding the time at which the experiment began. That is, instead of sampling ρstart from the posterior, ρstart can be computed by solving the time-independent master equation,
Therefore, the posterior in Eq. 63 now reduces to
In the following subsections, a parametric inverse strategy is provided, i.e., assuming a known number of system states, for sampling parameters from the posterior distribution in Eq. 67 using Monte Carlo methods. Next, this inverse strategy is generalized to a nonparametric case to enable deduction of the number of system states.
3.1 Parametric Sampler: BNP-FRET with Fixed Number of System States
Now with the posterior, Eq. 67, at hand and assuming steady-state ρstart, a sampling scheme is illustrated to deduce the transition rates of the generator matrix G.
As the posterior of Eq. 67 does not assume a standard form amenable to analytical calculations, numerical samples of the transition rates within G can be iteratively drawn using Markov Chain Monte Carlo (MCMC) techniques. Specifically, a Gibbs algorithm can be adopted to sequentially and separately, generate samples for individual transition rates at each MCMC iteration. To do so, the posterior of Eq. 67 can be written using the chain rule as follows:
where the backslash after G indicates exclusion of the following rate parameters and w denotes the set of observations as introduced in Sec. 2.3.2. Here, the first term on the right hand side is the conditional posterior for the individual rate λϕ
Just as with the posterior over all parameters, this conditional posterior shown above does not take a closed form allowing for direct sampling.
As such, the Metropolis-Hastings (MH) algorithm can be applied to draw samples from this conditional posterior, where new samples are drawn from a proposal distribution q and accepted with probability
where the asterisk represents the proposed rate values from the proposal distribution q.
To construct an MCMC chain of samples, the process begins by initializing the chain for each transition rate λϕ
A computationally convenient choice for the proposal is a Normal distribution leading to a simpler acceptance probability in Eq. 70. This is due to its symmetry resulting in q (λϕ
where κ=1 is an auxiliary parameter in the same units as λϕ
The variable transformation above now requires introduction of Jacobian factors in the acceptance probability as follows
where the derivative terms represent the Jacobian and the proposal distributions are canceled by virtue of using a symmetric Normal distribution.
The acceptance probability above depends on the difference of the current and proposed values for a given transitions rate. In other words, smaller differences between the current and proposed values often lead to larger acceptance probabilities. This difference is determined by the covariance of the Normal proposal distribution σ2 which needs to be tuned for each rate individually to achieve optimal performance of the BNP-FRET sampler, or very approximately, one-fourth acceptance rate for the proposals.
This whole algorithm can now be summarized in the following pseudocode:
After describing the inverse strategy for a known number of system states (i.e., parametric inference), more realistic scenarios are considered where the number of system states may be unknown which, in turn, leads to an unknown number of superstates (i.e., nonparametric inference). In the following subsections, the BNP framework is first provided for continuous illumination and then for pulsed illumination. Such BNP frameworks introduced herein, eventually, provide distributions over the number of system states simultaneously, and self-consistently, with other model parameters.
3.2.1—Bernoulli process for Continuous Illumination
The number of system states is often unknown and cannot a priori be set by hand. Therefore, in order to learn the states warranted by the data, the Bayesian nonparametric paradigm can be considered. That is, an infinite-dimensional version of the generator matrix in Eq. 2 is first defined; each of its elements can be multiplied by a Bernoulli random variable bi (also termed loads). These loads, indexed by i, enable turning on/off portions of the generator matrix associated with transitions between specific system states (including self-transitions). The nonparametric generator matrix can be written as follows:
where a load value of 1 represents an “active” system state, while “inactive” system states (not warranted by the data) get a load value of 0. Here, there are two loads associated to every transition because there is a pair of states corresponding to each transition. Within this formalism, the number of active loads is the number of system states estimated by the BNP-FRET sampler. As before, * represents negative row-sums.
The full set of loads, b={b1, b2, . . . , b∞}, now become quantities to be learnt. In order to leverage Bayesian inference methods to learn the loads, the previously defined posterior distribution (Eq. 67) now reads as follows
where the prior p(b) is Bernoulli while the remaining prior, p(G), can be assumed to be the same as in Eq. 66.
As in the case of the parametric BNP-FRET sampler presented in Sec. 3.1 of the present disclosure, samples can be generated from this nonparametric posterior employing a similar Gibbs algorithm. To do so, the MCMC chains of loads and rates can first be initialized by taking random values from their priors. Next, to construct the MCMC chains, samples can be iteratively drawn from the posterior in two steps: 1) sequentially sample all rates using the MH algorithm; then 2) loads by direct sampling, one-by-one from their corresponding conditional posteriors. Here, step (1) can be identical to the parametric case of Sec. 3.1 of the present disclosure. The following elaborates on step (2).
To sample the i-th load, the corresponding conditional posterior reads:
where the backslash after b indicates exclusion of the following load and Mσmax and γ are hyperparameters.
Here γ sets the a priori expected number of system states.
A note on the interpretation of Mσmax is in order. When dealing with nonparametrics, an infinite set of loads and priors for these loads called Bernoulli process priors are nominally considered. Samplers for such process priors are available though inefficient means. However, for computational convenience, it is possible to introduce a large albeit finite number of loads set to Mσmax. It can be shown that parameter inference are unaffected by this choice of cutoff when setting the success probability to
as in the Bernoulli distribution of Eq. 72. This is because such a choice forces the mean (expected number of system states) of the full prior on loads Πip(bi) to be finite (=γ).
Since the conditional posterior in the equation above must be discrete and describes probabilities of the load being either active or inactive, it must itself follow a Bernoulli distribution with updated parameters
where
The Bernoulli form of this posterior allows direct sampling of the loads.
This method can be applied to synthetic and experimental data.
3.2.2—iHMM Methods for Pulsed Illumination
Under pulsed illumination, the Bernoulli process prior described earlier for continuous illumination can in principle be used as is to estimate the number of system states and the transition rates. However, in this section, a computationally cheaper inference strategy is provided that is applicable to the simplified likelihood of Eq. 60 assuming system state transitions occurring only at the beginning of each pulse. The reduction in computational expense is achieved by directly learning the elements of the propagator Πσ of Eq. 58, identical for all interpulse periods. In doing so, transition probabilities for the system states can be learned instead of learning rates, however it is still advantageous to continue learning rates for photophysical states. This avoids expensive matrix exponentials for potentially large system state numbers required for computing the propagators under continuous illumination.
Now, to infer the transition probabilities in Πσ, the dimensions of which are unknown owing to an unknown number of system states, as well as transition rates among the photophysical states (elements of Gψ in Eq. 4), and initial probabilities, suitable priors can be placed on these parameters yielding the following posterior:
where the joint prior is written as a product prior over ρstart, Gψ, and Πσ. Next, for ρstart and Gσ, the same priors as in Eqs. 65-66 can be applied. However, as the number of system states is unknown, Πσ requires special treatment. To learn Πσ, it is convenient to adopt the infinite HMM (iHMM) due to the discrete nature of system state transitions over time.
As the name suggests, the iHMM leverages infinite system state spaces (Mσ→∞ in Eq. 3) similar to the Bernoulli process prior of Sec. 3.2.1 of the present disclosure. However, unlike the Bernoulli process, all system states remain permanently active. The primary goal of an iHMM is then to infer transition probabilities between system states, some of which, not warranted by the data, remain very small and set by the (nonparametric) prior discussed herein. Thus the effective number of system states can be enumerated from the most frequently visited system states over the course of a learned trajectory.
Within this iHMM framework, an infinite dimensional version of the Dirichlet prior, termed the Dirichlet process prior, is selected as priors over each row of the propagator Π_σ. That is,
where πm is the mth row of Πσ. Here, the hyperparameters of the Dirichlet process prior include the concentration parameter a which determines the sparsity of the πm and the hyper parameter β which is a probability vector, also known as base distribution. Together αβ are related to the ζ introduced earlier for the (finite) Dirichlet distribution of Eq. 65.
Next, as the base distribution itself is unknown and all transitions out of each state should be likely to revisit the same set of states, the same base distribution can be placed on all Dirichlet process priors placed on the rows of the propagator. To sample this unique base, a Dirichlet process prior is selected, that is,
β=DirichletProcess(ξγ),
where ξ=1 and γ is a vector of hyperparameters of size Mσ.
Now, to deduce the unknown parameters, samples must be drawn from the posterior in Eq. 73. However, due to the nonanalytical form of the posterior, the posterior cannot be jointly sampled. Thus, as before, a Gibbs sampling strategy can be adopted to sequentially and separately draw samples for each parameter. Here, the Gibbs sampling step is illustrated for the transition probabilities πm. The Gibbs steps for the remaining parameters are similar to Sec. 3.2.1 of the present disclosure.
Similar to the Bernoulli process prior, there are two common approaches to draw samples within the iHMM framework: slice sampling using the exact Dirichlet process prior and finite truncation. Just as before for the case of continuous illumination, the Dirichlet process prior is truncated to a finite Dirichlet distribution and fix its dimensionality to a finite (albeit large) number set to Mσmax to improve the sampling. It can then be shown that for large enough Mσmax the number of system states visited becomes independent of Mσmax.
As before, to numerically sample the transition probabilities πmfrom the full posterior in Eq. 73 through MCMC, initial samples are selected from the priors:
where elements of γ are selected as 1/Mσmax to ascribe similar weights across the state space a priori.
As shown in Sec. 2.5.1, the likelihood typically takes the following generic form
where Qi are matrices whose exact form depends on which effects are incorporated into the likelihood. Computing this last expression would typically lead to underflow as likelihood values quickly drop below floating-point precision.
For this reason, it is convenient to introduce the logarithm of this likelihood. In order to derive the logarithm of the likelihood of Eq. 75, the likelihood is re-written as a product of multiple terms as follows:
where ρi are the normalized probability vectors given by the following recursive formula:
Now, using the recursion relation above, the log-likelihood can be written as:
where const is a constant.
Note that ρstartρnormT=1. The pseudocode to compute the log-likelihood is as follows
Results using the BNP-FRET sampler described above are presented in Section 4 of Saurabh et al. 2022a. Experimental results were undertaken to benchmark the parametric (i.e., fixed number of system states) version of the sampler using synthetic data, while the two subsequent manuscripts focus on the nonparametric (i.e., unknown number of system states) analysis of experimental data. Saurabh et al. 2022b expands upon experimentation conducted for continuous illumination, while Safar et al. expands upon experimentation conducted for pulsed illumination.
For simplicity alone, this discussion begins by analyzing data from an idealized macromolecular system with two system states using different photon budgets and excitation rates. Next, more realistic examples are considered incorporating the following, one at a time: 1) crosstalk and detection efficiency; 2) background emission; 3) IRF; and then 4) end with a brief discussion on the unknown number of system states. This disclosure demonstrates when these features become relevant, as well as the overall robustness and generality of the BNP-FRET sampler.
For now, assume continuous illumination for all parametric examples and use the following priors for the analyses. The prior used for the FRET rates are:
and following prior is used over the system transition rates:
As discussed earlier in Sec. 3.1 of the present disclosure, it is more convenient to work within logarithmic space where the following proposal distributions are used to update the parameter values:
where * denotes proposed rates and where it is understood that all rates appearing in the logarithm have been divided through by a unit constant in order for the argument of the logarithm to remain dimensionless.
For efficient exploration of the parameter space by the BNP-FRET sampler and upon extensive experimentation with acceptance ratios, it is prudent to alternate between two sets of variances, {σex2=10−5, σFRET2=0.01, σsys2=0.1} and {σex2=10−5, σFRET2=0.5, σsys2=5.0} to generate an MCMC chain. This ensures that samples of different orders of magnitude are proposed.
As an intuitive guide, the more data there is, the sharper the posterior is expected to be over the rates and thus, the smaller both variances should be in proposal distributions.
For the sake of completeness, relevant aspects of the methods presented in Section 3.2.1 of the present disclosure are presented, including the likelihood needed in Bayesian inference, and parametric and nonparametric Markov Chain Monte Carlo (MCMC) samplers.
An smFRET experiment involves at least two single photon detectors collecting information on stochastic arrival times. These arrival times are denoted with:
in detection channels:
for a total number of K photons. In this representation above, Tstart and Tend are the experiment's start and end times, respectively.
Using this dataset, parameters governing a macromolecular system's kinetics can be inferred. That is, the number of system states Mσ and the associated transition rates λσ
as described in Sec. 2.3. Here, ρ(t) is a row vector populated by probabilities for finding the composite in a given superstate at time t.
In estimating these parameters, all sources of uncertainty present in the experiment must be accounted for, such as shot noise and detector electronics. Therefore, working in the Bayesian paradigm can be helpful, where the parameters are learned by sampling from probability distributions over these parameters termed posteriors. Such posteriors are proportional to the product of the likelihood, which is the probability of the collected data w given the physical model, and prior distributions over the parameters as follows
where w constitutes the set of all observations, including photon arrival times and detection channels.
To construct the posterior, the likelihood:
is derived in Sec. 2.3. Here, Πknon and Gknon are the non-radiative and radiative propagators, respectively. Furthermore, ρstart is computed by solving the master equation assuming the macromolecular system was at steady-state immediately preceding the time at which the experiment began. That is, solving:
Next, assuming that the transition rates are independent of each other, the associated prior can be written as:
where Gamma prior distributions over individual rates are selected. That is:
to guarantee positive values. Here, ϕi represents one of the Mϕ superstates of the macromolecular system-FRET composite collecting both the system states and photophysical states. Furthermore, α and λref are parameters of the Gamma prior.
In what follows, the inference procedure in 4.1.1 of the present disclosure assumes that the number of system states are known and provides an inverse strategy that uses the posterior above to learn only transition rates. Next, in 4.1,2 of the present disclosure, the model is applied to a nonparametric case accommodating more practical situations with unknown system state numbers by assuming an infinite dimensional system state space and making the existence of each system state itself a random variable.
Now, with the posterior defined, a sampling scheme is provided to learn distributions over all parameters of interest, namely, transitions rates populating G and the number of system states. However, the posterior in Eq. B2 does not assume a form amenable to analytical calculations. Therefore, Markov Chain Monte Carlo (MCMC) techniques can be applied to draw numerical samples.
Particularly convenient here is the Gibbs algorithm that sequentially and separately generates samples for individual transition rates in each MCMC iteration. This requires first writing the posterior in Eq. B2 using the chain rule as follows:
where the backslash after G indicates exclusion of the subsequent rate parameter. Furthermore, the first term on the right hand side is the conditional posterior for the individual rate λϕ
Since the conditional posterior above does not take a closed form that allows for direct sampling, the Metropolis-Hastings (MH) step can be applied where new samples are drawn from a proposal distribution q and accepted with probability:
where the asterisk denotes proposed rate values from the proposal distribution q.
Now, to generate an MCMC chain of samples, the chains for all transition rates λϕ
In the MH step, a convenient choice for the proposal is a Normal distribution leading to a simpler formula for the acceptance probability in Eq. B6. This is due to its symmetry resulting in q(λϕ
where κ=1 is an auxiliary parameter in the same units as λϕ
The transformation above requires introduction of Jacobian factors in the acceptance probability as follows:
where the derivatives represent the Jacobian and the proposal distributions are canceled by virtue of using a Normal distribution.
The acceptance probability above depends on the difference of the current and proposed values for a given transition rate. This difference is determined by the covariance of the Normal proposal distribution σ2 which needs to be tuned for each rate individually to achieve an optimum performance of the BNP-FRET sampler, or equivalently approximately one-third acceptance rate for the proposals.
In one case, where the smFRET traces analyzed contain about 105 photons, it is prudent to make the sampler alternate between two sets of variances at every MCMC iteration, {σex2=10−5, σFRET2=0.01, σsys2=0.1} and {σex2=10−5, σFRET2=0.5, σsys2=5.0}, for the excitation rates, FRET rates, and system transition rates. This ensures that the sampler is quickly able to explore values at different orders of magnitude.
Intuitively, these covariance values in the proposal distributions above would ideally scale with the relative widths of the conditional posteriors for these parameters (in log-space) if the approximate width could be estimated. Since posterior widths depend on the amount of data used, an increase in the number of photons available in the analysis would require a correspondingly smaller variance.
Here, aspects of the inference procedure described in Sec. 3.1 & 3.2.1 of the present disclosure are recited for ease of reference.
In realistic situations, the system state space's dimensionality is usually unknown as molecules under study may exhibit complex and unexpected behaviors across conditions and timescales. Consequently, the dimensionality Mϕ of the generator matrix G is also unknown, and must be determined by adopting a BNP framework.
Such a framework starts with assuming an infinite set of system states and placing a binary weight, termed load, on each system state such that if it is warranted by the data, the value of the load is realized to one. Put differently, a Bernoulli prior must be placed on each candidate state (of which there are formally an infinite number). In practice, distributions over Bernoulli random variables bi can be learned that activate/deactivate different portions of the full generator matrix as (see Sec. 3.2.1):
where active loads are set to 1, while inactive loads are set to 0. Furthermore, * represents negative row-sums. Finally, the number of active loads provides an estimate of the number of system states warranted by a given dataset.
As new variables to be learned have been introduced, the posterior of Eq. B2 is updated to incorporate the full set of loads, b={b1, b2, . . . , b∞}, as follows:
assuming that all parameters of interest are independent of each other.
As in the parametric sampler presented in the previous subsection, samples can be generated from the nonparametric posterior above using Gibbs sampling. That is, the MCMC chains for loads and rates can be initialized by drawing random samples from their priors. Next, to construct the chains, samples can be iteratively drawn from the posterior in two steps: 1) sequentially sample all rates using the MH procedure; then 2) loads by direct sampling, from their corresponding conditional posteriors (as described in Sec. 3.2.1). Since step (1) is similar to the parametric case, the following focuses on step (2).
To generate samples for load bi, the corresponding conditional posterior is given by:
where the backslash after b indicates exclusion of the following load. Hyperparameters Mσmax, the maximum allowed number of system states used in computations, and γ, the expected number of system states, can be set based on simple visual inspection of the smFRET traces.
Now, the conditional posterior in the equation above is discrete and describes the probability for the load to be either active or inactive, that is, it is itself a Bernoulli distribution as follows:
where:
The simple form of this posterior is amenable to direct sampling. In the end, the chain of generated samples can be used for subsequent statistical analysis.
Saurabh et al. (2022b) shows results demonstrating the robustness of the BNP-FRET sampler by investigating the effects of excitation rate on the distributions over transitions rates and system state numbers. The BNP-FRET sampler's performance is illustrated on synthetic data, and is then applied to estimate the number of system states along with associated escape rates from publicly available experimental data for a complex involving instrinsically disorder proteins (ACTR-NCBD). Results are compared with reported literature values.
In this section the adaptation of the general formalism described in Section 3 to the pulsed illumination case is illustrated. Next, a specialized inference procedure is presented.
As before, a molecular complex labeled is considered with a donor-acceptor FRET pair. As the molecular complex transitions through its Mσ system states indexed by σ1:M
Such photophysical transitions lead to photon emissions that may be detected in either donor or acceptor channels. The set of N observations, e.g., photon arrival times, from N pulses are recorded as:
Here, each individual measurement is a pair wn=(μnd, μna), where μnd and μna are the recorded arrival times (also known as microtimes) after the n-th pulse in both donor and acceptor channels, respectively. In cases where there is no photon detection, absent microtimes are denoted with μnd=∅ and μna=∅ for donor and acceptor channels, respectively.
As is clear from
where ρstart is the initial probability vector for the macromolecular system-FRET composite to be in one of M(=Mψ×Mσ) superstates, and ρnorm is a vector that sums the elements of the propagated probability vector. Here, Qn is the transition probability matrix between pulses n and n+1, characterizing macromolecular system-FRET composite transitions among superstates.
The propagators Qn above adopt different forms depending on whether a photon is detected or not during the associated period. Their most general forms are derived in Sec. 2.5.1. However, these propagators involve computationally expensive integrals and thus a few approximations are made here as follows: 1) assuming that the system state remains the same over an interpulse period since typical system kinetic timescales (typically 1 ms or more) are much longer than interpulse periods (˜100 ns); 2) assuming the interpulse period (˜100 ns) is longer than the donor and acceptor lifetimes (˜a few ns) such that they relax to the ground state before the next pulse. Furthermore, a specialized sampling scheme is demonstrated under these physically motivated approximations.
The immediate implications of assumption (1) are that the macromolecular system transitions may now, to a good approximation, only occur at the beginning of each pulse. Consequently, the evolution of the FRET pair between two consecutive pulses is now exclusively photophysical as the system state remains the same during interpulse times. As such, the macromolecular system now evolves in equally spaced discrete time steps of size τ where the system state trajectory can be written as:
where sn is the system state between pulses n and n+1. The stochastic evolution of the system states in such discrete steps is then determined by the transition probability matrix designated by Πσ. For example, in the simplest case of a molecular complex with two system states σ1:2, this matrix is computed as follows:
where the matrix in the exponential contains transition rates among the system states and the * represents the negative row sum.
Next, by assumption 2, further suppose that the fluorophores always start in the ground state at the beginning of every pulse. As a result, pulses are treated independently and the probability of observation wn can be written as:
where ρground denotes the probability vector when the FRET pair is in the ground state at the beginning of each pulse, Gψ is the generator matrix with only photophysical transition rates, and Qnψ(sn) is the photophysical propagator for the n-th interpulse period.
The observation probabilities of Eq. C4 are further organized into a newly defined detection matrix Dnσ with its elements given by (Dnσ)s
Now, using the matrix Dnσ, the reduced propagators for each interpulse period are defined as:
where ⊙ denotes element-by-element product.
Finally, using these simplified propagators, the likelihood for an smFRET trace under pulsed illumination can be written as:
as also introduced in the Sec. 2.5.1. This form of the likelihood is advantageous in that it allows empty pulses to be computed as a simple product, greatly reducing computational cost.
In the following, a parametric inference procedure assuming a given number of system states is illustrated. Next, the procedure is generalized to the nonparametric case to deduce the number of system states along with the rest of parameters.
Now, with the likelihood at hand, the posterior can be constructed as follows:
where the prior assumes that the unknown parameters, including the initial probability vector, ρstart, the photophysical transition rates in the generator matrix Gψ, and the transition probabilities among system states in propagator Πσ, are independent. Here, the set of unknowns can be sampled using the above posterior with the Gibbs sampling procedure as described in Sec. 3.2.2. However, a computationally more convenient inference procedure that allows direct sampling is accomplished by writing the posterior as a marginalization (sum) over state trajectories as follows:
where s1:N={s1, s2 . . . , sN} denotes a system state trajectory. Now, the nonmarginal posterior:
can be used to sample the trajectory s1:N which, in turn, allows direct sampling of the elements of propagator Πσ described shortly. For priors on ρstart and rates in Gψ, we, respectively, use a Dirichlet and Gamma distributions similar to Eq. 65-66. The system state trajectory s1:N is sampled by recursively sampling the states using a forward filtering backward sampling algorithm.
Finally, for each row in the propagator Πσ, a Dirichlet prior is used:
where Mσ is the number of system states and πm denotes the m-th row of the propagator. Here, the hyperparameters α and β are, respectively, the concentration parameter and a vector of length Mσ described in Sec. 3.2.2. Now, samples can be directly generated for the transition probability vectors πm of length Mσ via prior-likelihood conjugacy:
where the vector nm collects the number of times each transition out of system state σm occurs obtained using the system state trajectory.
After constructing the posterior, inferences can be made on the parameters by drawing samples from the posterior. However, as the resulting posterior has a nonanalytical form, it cannot be directly sampled. Therefore, a Markov chain Monte Carlo sampling (MCMC) procedure is developed to draw samples from the posterior.
The MCMC procedure follows a Gibbs sampling technique sweeping through updates of the set of parameters in the following order: 1) photophysical transition rates including donor relaxation rates λd (inverse of donor lifetime), acceptor relaxation rate λa (inverse of acceptor lifetime), FRET rates for each system state, and excitation rate (inverse of excitation probability πex) using MH; 2) transition probabilities between system states, π1:M
The smFRET data analysis method illustrated above assumes a given number of system states, Mσ. However, in many applications the number of system states is not specified a priori. Here, a generalization of the parametric method is provided to address this shortcoming and estimate the number of system states simultaneously along with other unknown parameters.
The previously-introduced parametric posterior can be modified as follows. First, suppose an infinite number of system states (Mσ→∞) for the likelihood introduced previously and learn the transition matrix Πσ. The number of system states can then be interpreted as those appreciably visited over the course of the trajectory.
To incorporate this infinite system state space into the inference strategy, the iHMM from the BNP repertoire can be leveraged, placing a hierarchical Dirichlet process prior over the infinite set of system states as described in Sec. 3.2.2 of the present disclosure. However, dealing with an infinite number of random variables, though feasible, is not computationally efficient and this infinite value can be approximated with a large number Mσmax, reducing the hierarchical Dirichlet process prior to:
Here, β denotes the base probability vector of length Mσmax serving itself as a prior on the probability transition matrix Πσ, and πm is the m-th row of Πσ. Moreover, γ is a positive scalar hyperparameter of the Dirichlet process prior often chosen to be one. As such, identical weights can be ascribed across the state space a priori for computational convenience.
Now, equipped with the nonparametric posterior, inferences can be made simultaneously on transition probabilities, excited state escape rates, and the remaining parameters. To do so, the Gibbs sampling scheme detailed in Sec. 3.2.2 of the present disclosure can be employed, except that the system state trajectory s1:N must also be sampled.
Safar et al. shows results for validation of the systems and methods discussed herein as applied to pulsed illumination. The main objective of the methods described herein is to learn full distributions over:
To sample from distributions over these parameters, the BNP-FRET sampler requires input data comprised of photon arrival time traces from both donor and acceptor channels as well as a set of precalibrated input parameters including: camera effects such as crosstalk matrix and detection efficiency (see Sec. 2.4 and Example V); background emission (see Sec. 2.6); and the IRF (see Sec. 2.5).
The results shown in Safar et al. indicate that the method samples posteriors over a set of parameters employing realistic synthetic data generated using the Gillespie algorithm to simulate system transitions and photophysical transitions while incorporating detector artefacts such as crosstalk (see Sec. 2.8).
Experimental results show that the method works for the simplest case of slow transitions compared to the interpulse period (25˜ns) with two system states using synthetic data. Next, more challenging synthetic data is discussed with three system states and higher transition rates. The nonparametric algorithm correctly infers system transition probabilities and thus the number of system states.
After demonstrating the performance of the method using synthetic data, experimental data is applied to investigate the kinetics of HJs under different MgCL2 concentrations in buffer.
Device 101 comprises one or more network interfaces 110 (e.g., wired, wireless, PLC, etc.), at least one processor 120, and a memory 140 interconnected by a system bus 150, as well as a power supply 160 (e.g., battery, plug-in, etc.). Device 101 can also include or otherwise communicate with a display device 130 to communicate results of the methods outlined herein.
Network interface(s) 110 include the mechanical, electrical, and signaling circuitry for communicating data over the communication links coupled to a communication network. Network interfaces 110 are configured to transmit and/or receive data using a variety of different communication protocols. As illustrated, the box representing network interfaces 110 is shown for simplicity, and it is appreciated that such interfaces may represent different types of network connections such as wireless and wired (physical) connections. Network interfaces 110 are shown separately from power supply 160, however it is appreciated that the interfaces that support PLC protocols may communicate through power supply 160 and/or may be an integral component coupled to power supply 160.
Memory 140 includes a plurality of storage locations that are addressable by processor 120 and network interfaces 110 for storing software programs and data structures associated with the embodiments described herein. In some embodiments, device 101 may have limited memory or no memory (e.g., no memory for storage other than for programs/processes operating on the device and associated caches). Memory 140 can include instructions executable by the processor 120 that, when executed by the processor 120, cause the processor 120 to implement aspects of the system and the methods outlined herein.
Processor 120 comprises hardware elements or logic adapted to execute the software programs (e.g., instructions) and manipulate data structures 145. An operating system 142, portions of which are typically resident in memory 140 and executed by the processor, functionally organizes device 101 by, inter alia, invoking operations in support of software processes and/or services executing on the device. These software processes and/or services may include BNP-FRET processes/services 190, which can include aspects of methods and implementations of various modules described herein. Note that while BNP-FRET processes/services 190 is illustrated in centralized memory 140, alternative embodiments provide for the process to be operated within the network interfaces 110, such as a component of a MAC layer, and/or as part of a distributed computing network environment.
It will be apparent to those skilled in the art that other processor and memory types, including various computer-readable media, may be used to store and execute program instructions pertaining to the techniques described herein. Also, while the description illustrates various processes, it is expressly contemplated that various processes may be embodied as modules or engines configured to operate in accordance with the techniques herein (e.g., according to the functionality of a similar process). In this context, the term module and engine may be interchangeable. In general, the term module or engine refers to model or an organization of interrelated software components/functions. Further, while the BNP-FRET processes/services 190 is shown as a standalone process, those skilled in the art will appreciate that this process may be executed as a routine or module within other processes.
Referring to
Step 204 of method 200 includes iteratively sampling (e.g., by a Markov Chain Monte Carlo (MCMC) sampling procedure) a set of probability values associated with values of a set of model parameters from a measurement model, including: a set of transition rates among superstates of a composite representing interactions between the macromolecular complex and FRET dyes; and a set of observed system states of the macromolecular complex. The measurement model is constructed using a nonparametric Bayesian formulation of a system-FRET Master Equation (e.g., Eq. (1)) descriptive of an evolution of the composite.
Step 206 of method 200 includes jointly inferring, based on the set of probability values and the observation data, a set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data.
Notably, step 204 may be divided into sub-steps. For continuous illumination, step 204 may be performed using method 300 outlined in
Referring to
Step 312 of method 300 can include determining a set of observed system states of the macromolecular complex based on load values for each load of a plurality of loads. Step 312 can encompass steps 314 and 316. Step 314 can include initializing load values for a plurality of loads using a Bernoulli process prior (see Eqs. (71) and (72)) each load representing a hypothetical system state. Step 316 can include sampling, for each load of the plurality of loads, a load value from a Bernoulli formulation (see Bernoulli formulation of Eq. (72) in Section 3.2.1) of a conditional posterior probability distribution of the measurement model that jointly incorporates the set of transition rates and the plurality of loads. Because the measurement model incorporates the set of transition rates to sample load values, steps 302-310 should be applied prior to step 316. If the load value for a load is a first value, this indicates that a corresponding hypothetical system state is active and is included in the set of observed system states; likewise, if the load value for a load is a second value, this indicates that a corresponding hypothetical system state is inactive and is to be excluded from the set of observed system states.
Following conclusion of method 300, step 206 of method 200 may be applied to infer the set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data (e.g., that are “warranted by the data”).
Referring to
Step 402 of method 400 includes initializing a chain of transition probability values for a set of transition probabilities associated with a set of hypothetical system states from a Dirichlet process prior. Step 404 of method 400 includes sampling the transition probability values of the set of transition probabilities by directly sampling from a posterior probability distribution using a Gibbs sampling scheme. Step 406 of method 400 includes selecting one or more hypothetical system states for inclusion in the set of observed system states based on the transition probability values. A high transition probability value (e.g., the “most visited” transition probabilities) means that the associated hypothetical system state should be included within the set of observed system states, and a low transition probability value means the associated system state should be excluded from in the set of observed system states.
Step 408 of method 400 includes sampling a set of photophysical transition rates of the set of model parameters using a Metropolis-Hastings sampling scheme. Step 408 can be applied similarly to steps 304-310 of
Following conclusion of method 400, step 206 of method 200 may be applied to infer the set of most probable values of the set of model parameters that are associated with a highest likelihood of observing the observation data (e.g., that are “warranted by the data”).
It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto.
This is a U.S. Non-Provisional Patent Application that claims benefit to U.S. Provisional Patent Application Ser. No. 63/454,590 filed 24 Mar. 2023, which is herein incorporated by reference in its entirety.
This invention was made with government support under R01 GM130745 and R01 GM134426 awarded by the National Institutes of Health. The government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
63454590 | Mar 2023 | US |