Bayesian inference of accurate population sizes and FRET efficiencies from single diffusing biomolecules
It is of significant biophysical interest to obtain accurate intramolecular distance information and population sizes from single-molecule Förster resonance energy transfer (smFRET) data obtained from biomolecules in solution. Experimental methods of increasing cost and complexity are being developed to improve the accuracy and precision of data collection. However, the analysis of smFRET data sets currently relies on simplistic and often arbitrary methods for the selection and denoising of fluorescent bursts. Although these methods are satisfactory for the analysis of simple, low-noise systems with intermediate FRET efficiencies, they display systematic inaccuracies when applied to more complex systems. We have developed an inference method for the analysis of smFRET data from solution studies based on rigorous model-based Bayesian techniques. We implement a Markov chain Monte Carlo (MCMC) algorithm that simultaneously estimates population sizes and intramolecular distance information directly from a raw smFRET data set, with no intermediate event selection or denoising steps. Here, we present both our parametric model of the smFRET process and the algorithm developed for data analysis. We test the algorithm using a combination of simulated data sets and data from dual-labeled DNA molecules. We demonstrate that our model-based method systematically outperforms threshold-based techniques in accurately inferring both population sizes and intramolecular distances.
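To make the idea of jointly inferring population sizes and FRET efficiencies by MCMC concrete, the following is a minimal illustrative sketch, not the model or algorithm presented in this work. It assumes a deliberately simplified setting: bursts are already identified, there is no background or crosstalk, and the acceptor photon count in each burst is binomial given the total count and the FRET efficiency of one of two populations. The function names (`log_posterior`, `metropolis`, `simulate`), the flat priors, the random-walk Metropolis proposal, and all numerical settings are assumptions made for illustration only.

```python
import numpy as np


def log_posterior(theta, acceptor, total):
    """Unnormalised log posterior of a two-population binomial mixture.

    theta = (w, e1, e2): fraction of population 1 and the two FRET
    efficiencies. Flat priors on (0, 1) are an assumption of this sketch;
    the constraint e1 < e2 prevents label switching.
    """
    w, e1, e2 = theta
    if not (0.0 < w < 1.0 and 0.0 < e1 < e2 < 1.0):
        return -np.inf
    donor = total - acceptor
    # Per-burst log likelihood under each population; the binomial
    # coefficient is constant in theta and is therefore dropped.
    ll1 = np.log(w) + acceptor * np.log(e1) + donor * np.log1p(-e1)
    ll2 = np.log1p(-w) + acceptor * np.log(e2) + donor * np.log1p(-e2)
    return np.sum(np.logaddexp(ll1, ll2))


def metropolis(acceptor, total, n_steps=20000, step=0.005, seed=0):
    """Random-walk Metropolis sampler over (w, e1, e2)."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.5, 0.3, 0.7])  # arbitrary valid starting point
    logp = log_posterior(theta, acceptor, total)
    samples = np.empty((n_steps, 3))
    for i in range(n_steps):
        proposal = theta + rng.normal(0.0, step, size=3)
        logp_new = log_posterior(proposal, acceptor, total)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        samples[i] = theta
    return samples


def simulate(n_bursts=2000, w=0.3, e1=0.2, e2=0.8, seed=1):
    """Simulate idealised bursts: Poisson totals, binomial acceptor counts."""
    rng = np.random.default_rng(seed)
    total = rng.poisson(40, size=n_bursts) + 1  # at least one photon per burst
    in_pop1 = rng.uniform(size=n_bursts) < w
    efficiency = np.where(in_pop1, e1, e2)
    acceptor = rng.binomial(total, efficiency)
    return acceptor, total


if __name__ == "__main__":
    acceptor, total = simulate()
    samples = metropolis(acceptor, total)
    burn_in = len(samples) // 2  # discard the first half as burn-in
    print("posterior means (w, e1, e2):", samples[burn_in:].mean(axis=0))
```

In this toy setting the posterior means can be compared directly with the simulation inputs. A full treatment of raw solution smFRET data would also need to account for background counts, diffusion through the confocal volume, and the absence of any prior burst selection, none of which this sketch attempts.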