The Bayesian Data-Analysis Software Package
The programs that run the various Bayesian analyses, the server software,
were developed at Washington University by Dr. G. Larry Bretthorst,
and the Java language client interface was developed by Dr. Karen Marutyan.
The combination of the server and client software is called the
"Bayesian Data-Analysis Toolbox" software.
However, this name is slightly misleading because this software can
analyze data from many different sources, not just NMR data.
Additionally, unlike the previous interface to this software, this new
interface does not require the user to have access to any specialized NMR software,
i.e., this interface is completely independent of Varian's VnmrJ,
although the interface can load and process data from a Varian spectrometer.
The software contains a series of programs, which we call packages, and these packages implement
various calculations using
Bayesian probability theory.
Most of these calculations are implemented using
Markov chain Monte Carlo.
All of the programs except Bayes Analyze are capable of fully using
multiple CPUs if you have them.
The various packages implemented by this software are
described here in the order they occur on the package menu in the interface.
The hyperlinks contained in the following descriptions will download
the chapter of the user manual being discussed.
Here is the current list of packages:
-
The Exponential package estimates
the decay rate constants and amplitudes of signals known to be decaying exponentially.
It does this whether the number of exponentials is known or unknown.
In both cases the input to this package can come from ASCII files, from a peak pick or
from Bayes Analyze files.
In all cases one or more input data sets can be processed, and the package looks
for exponentially decaying signals that are common to the multiple data sets,
while allowing each exponential to have differing initial conditions in each data set.
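As an illustration only, the shared-signal idea can be sketched as a sum of exponentials whose decay rate constants are common to all data sets while the amplitudes differ per data set. The function name, the data set labels and all numerical values below are hypothetical, and reading "initial conditions" as per-data-set amplitudes is an assumption; this is not the package's own code.

```python
import math

def multi_exponential(t, rates, amplitudes):
    """Sum of decaying exponentials: sum_j A_j * exp(-k_j * t)."""
    return sum(a * math.exp(-k * t) for k, a in zip(rates, amplitudes))

# Hypothetical values: two data sets share the decay rate constants
# but each has its own amplitudes.
shared_rates = [0.5, 3.0]
amplitudes_by_set = {"set_1": [10.0, 4.0], "set_2": [7.0, 1.5]}

times = [0.0, 0.5, 1.0, 2.0]
signals = {
    name: [multi_exponential(t, shared_rates, amps) for t in times]
    for name, amps in amplitudes_by_set.items()
}
```

At time zero each simulated signal equals the sum of that data set's amplitudes, which is a quick sanity check on a model of this form.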
-
The Inversion Recovery package performs a special type of exponential
analysis that is very common in NMR. In this problem the NMR signal starts
at a negative value and recovers to a positive value. The inversion recovery model
differs from an exponential plus a constant model only in that the model is typically
formulated so that the two amplitudes represent the initial (time equal to zero) and equilibrium amplitudes;
thus the amplitudes are linear combinations of the amplitudes that would be estimated by
an exponential plus a constant model.
As a side note, this package is really a special case of the Enter Ascii package described below.
We call these special cases preloaded Enter Ascii models because the interface preloads
the inversion recovery model from the system models and thus simplifies what the user
must do to run this inversion recovery model.
This package can analyze multiple data sets jointly to look for common parameters.
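The relationship between the two parameterizations can be made concrete with a small sketch. It assumes the usual single-exponential form of the inversion recovery signal; the function names and numerical values are hypothetical and are not taken from the package.

```python
import math

def inversion_recovery(t, m0, m_eq, rate):
    """Signal written in terms of the initial (m0) and equilibrium (m_eq) amplitudes."""
    return m_eq + (m0 - m_eq) * math.exp(-rate * t)

def exponential_plus_constant(t, amplitude, constant, rate):
    """The same curve written as A * exp(-rate * t) + C."""
    return amplitude * math.exp(-rate * t) + constant

def to_exponential_plus_constant(m0, m_eq):
    """Linear map between the two sets of amplitudes: A = m0 - m_eq, C = m_eq."""
    return m0 - m_eq, m_eq

# Hypothetical values: the signal starts negative and recovers to a positive value.
m0, m_eq, rate = -1.0, 1.0, 0.8
amplitude, constant = to_exponential_plus_constant(m0, m_eq)
```

Both functions trace the same curve; only the meaning of the two amplitudes changes.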
-
The Diffusion Tensor package analyzes
NMR diffusion measurements using one, two or three diffusion tensor models with or without a constant.
These tensor models can use either "b" values or "g" (gradient) values for the abscissa, and
the "b" values can be either 3D vectors or "b" matrices.
Thus this package processes 18 different diffusion tensor models. Because McMC packages compute
the probability for the model using thermodynamic integration, this package has the ability to
do some simple model selection.
As with most packages, multiple ASCII data sets can be analyzed jointly to look for common
diffusion tensor parameters.
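One way to arrive at the count of 18, assuming the options listed above combine freely (three tensor counts, with or without a constant, and three abscissa types), is to enumerate the combinations. The labels below are hypothetical and are not the names used in the interface.

```python
from itertools import product

tensor_counts = (1, 2, 3)                              # one, two or three tensors
constant_options = (False, True)                       # without or with a constant
abscissa_types = ("b-vector", "b-matrix", "g-value")   # hypothetical labels

# Every combination of the three independent choices.
models = list(product(tensor_counts, constant_options, abscissa_types))
```

Under this reading the list has 3 x 2 x 3 = 18 entries.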
-
The Enter Ascii Model package allows the user to define a model and then use Bayesian probability
theory to analyze that model.
To create a simple model, the user must copy the example model and then write the
Fortran or C code necessary to evaluate the model.
In addition, the user must create a file that describes the model parameters
and the prior probabilities for those parameters.
The user has the option to load a user generated model or a system model, a model written by us.
When the package is run, the data along with this model are sent to the server.
The server compiles this model on the fly and creates a dynamic load library.
Because of this, to use this package, the server must have either a Fortran or a C compiler installed on it.
The Enter Ascii program is then run and the model dynamically loaded and used
to analyze the data.
As with most packages that require ASCII data, this package can analyze
multiple data sets jointly to look for common parameters.
Finally, the models used in the Enter Ascii package are the same models used to analyze images.
So one can use the Enter Ascii package to analyze a few pixels from an image and
then proceed to analyze an entire image; see Analyze Image Pixels for more on this.
-
The Enter Ascii Model Selection
package
utilizes
the models generated for Enter Ascii to do model selection.
After setting up a number of rival models using Enter Ascii, one can then proceed to this package.
Here one can load up to 10 different models and then use this package to compute the posterior
probability for the models.
Because this is a new package, the manual pages are not
yet available for this package.
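As a rough sketch of the final step of such a model selection, posterior probabilities for rival models can be obtained by normalizing prior times evidence over the models. The example assumes the log evidence of each model is already available (for instance from thermodynamic integration) and that the models are equally probable a priori unless stated otherwise. The function name and the numbers are hypothetical.

```python
import math

def posterior_model_probabilities(log_evidences, log_priors=None):
    """Normalize prior * evidence over the rival models, working in logs."""
    n = len(log_evidences)
    if log_priors is None:
        log_priors = [-math.log(n)] * n      # assumption: uniform prior over models
    log_joint = [le + lp for le, lp in zip(log_evidences, log_priors)]
    shift = max(log_joint)                   # subtract the maximum to avoid underflow
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log evidences for three rival models.
log_evidences = [-1052.3, -1047.9, -1049.1]
probabilities = posterior_model_probabilities(log_evidences)
```

Working with differences of logarithms keeps the calculation stable even when the evidences themselves are far too small to represent as floating point numbers.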
-
The Test Ascii Model package supports the Ascii Model packages
by giving you a facility for testing models to ensure they are doing their calculations
correctly. This package allows you to load a model and then it
will thoroughly test the model by evaluating the model 10,000 times using parameters
sampled from the priors. In the process of evaluating the model, the package will
catch any arithmetic errors that occur and it will show you where the invalid
arithmetic occurred. The outputs from the model include a peak posterior probability
estimate of the model and plots of the model signal as a function of the
parameter samples.
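The testing idea can be sketched as follows: draw parameters from the priors, evaluate the model, and record where invalid arithmetic occurs. This is a minimal illustration and not the package's implementation. It assumes uniform priors, and the model, the parameter names and the prior ranges are all hypothetical.

```python
import math
import random

def example_model(t, amplitude, rate):
    """Hypothetical user model: a single decaying exponential."""
    return amplitude * math.exp(-rate * t)

def sample_uniform_priors(priors, rng):
    """Draw one value per parameter from its (low, high) uniform prior."""
    return {name: rng.uniform(low, high) for name, (low, high) in priors.items()}

def stress_test(model, priors, abscissa, n_trials=10_000, seed=0):
    """Evaluate the model for many prior samples and collect arithmetic failures."""
    rng = random.Random(seed)
    failures = []
    for trial in range(n_trials):
        params = sample_uniform_priors(priors, rng)
        for t in abscissa:
            try:
                value = model(t, **params)
            except (ArithmeticError, ValueError) as exc:
                # Overflow, division by zero, or a math domain error.
                failures.append((trial, t, params, repr(exc)))
                break
            if not math.isfinite(value):
                failures.append((trial, t, params, "non-finite value"))
                break
    return failures

priors = {"amplitude": (0.0, 100.0), "rate": (0.0, 10.0)}
failures = stress_test(example_model, priors, abscissa=[0.0, 1.0, 5.0])
```

Each recorded failure carries the trial number, the abscissa value and the sampled parameters, so the offending region of parameter space can be located.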
-
The Magnetization Transfer (two sites)
package
solves the Bloch-McConnell equations to obtain the exchange rate constants
for two site magnetization exchange.
Input to this package is usually the peak amplitudes or intensities from
two inversion recovery time courses in which the exchanging
peaks are selectively inverted.
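For orientation, one common textbook form of the two site equations for the longitudinal magnetization is given below. The symbols are assumptions chosen for this illustration ($M_A$, $M_B$ for the magnetization at the two sites, $M_A^0$, $M_B^0$ for their equilibrium values, $R_{1A}$, $R_{1B}$ for the relaxation rate constants, $k_{AB}$, $k_{BA}$ for the exchange rate constants), and the exact parameterization used by the package may differ.

```latex
\begin{aligned}
\frac{dM_A}{dt} &= -R_{1A}\,\bigl(M_A - M_A^0\bigr) - k_{AB}\,M_A + k_{BA}\,M_B \\
\frac{dM_B}{dt} &= -R_{1B}\,\bigl(M_B - M_B^0\bigr) + k_{AB}\,M_A - k_{BA}\,M_B
\end{aligned}
```

At equilibrium the exchange terms balance, so $k_{AB}\,M_A^0 = k_{BA}\,M_B^0$.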
-
The Magnetization Transfer Kinetics
package
is a magnetization transfer package
that solves the Bloch-McConnell equations at multiple temperatures and concentrations to derive
the entropy and enthalpy of the exchange process.
Input to this package is the same as for the two site magnetization transfer package with
multiple temperature and concentration measurements.
-
The Big Magnetization Transfer
package
solves the magnetization transfer problem
when one of the sites can be considered
infinite compared to the other.
-
The Bayes Analyze
package
is a time domain frequency estimation
package that is fully
capable of determining the number of resonances
in an FID and estimating the resonance parameters.
This package can analyze single FIDs, or it can run multiple FIDs and look
for frequencies common to these FIDs.
Input to this package can come from different sources and appropriate
data conversions are carried out when the data are loaded.
-
The Big Peak/Little Peak
package
analyzes
time domain FID data in which a single peak (the big peak)
may be many orders of magnitude larger in intensity than
the metabolic peaks (the little peaks) of interest.
The Big Peak/Little Peak package solves this
problem by treating the big peak as a nuisance and then
using Bayesian probability theory to account for the big peak while
simultaneously estimating the frequencies, decay rate constants and
amplitudes of the resonances of interest.
-
The Find Resonances
package
analyzes
NMR FID data looking for resonances. The program is a model selection program that is attempting
to determine the number of resonances in the data and estimate the parameters associated with
those resonances.
This package uses Markov chain Monte Carlo simulations to determine the posterior probability
for the number of resonances in the data.
This package essentially solves the same problem as the Bayes Analyze package described above.
However, because it uses McMC the calculations are much slower than those in Bayes Analyze,
but they are much more thorough, often giving much better resolution than Bayes Analyze.
Because this is a new package, the manual pages are not yet available.
-
The Metabolite
package
analyzes data from a given NMR sample, for example a C13 FID
of Glutamate.
The intensities of the Glutamate resonances are related to each other through a metabolic model.
This model can be very simple or very complex and, with help from us, it can be user defined.
The metabolic model relates the intensities of the resonances in the model to a series
of metabolic parameters, typically fractional rates that relate how much of a compound
went through a certain chemical reaction.
The resonances in a metabolic model are described in a metabolite file
and the metabolic model itself is encoded in a Fortran or C routine.
The Metabolite package
reads the resonance and the metabolic models and then
uses Bayesian probability theory to estimate the metabolic parameters
as well as the parameters associated with the resonances, i.e., the frequencies
and decay rate constants.
-
The Behrens-Fisher
package
solves the
classical medical testing problem:
given two experiments that consist of repeated measurements of the
same quantity, where in the second experiment one has changed
some experimental parameter, determine if the experiments are the
same or if they differ.
For more information on this calculation
see On the Difference in Means.
-
The Errors in Variables
package
solves
the errors in variables problem.
In this problem one has a data set that has uncertainty in both the X and Y variables.
These errors may be known or unknown, so this package solves four different
errors in variables problems. In the name the "given" refers to the fact
that the program solves this problem given the order of the polynomial to fit.
The input data are described in the manual.
-
The Polynomial Models package fits polynomials
of either a given or an unknown order to the input data.
When the order is given, a polynomial of that order is analyzed using
Bayesian probability theory to determine the appropriate coefficients.
When the order is specified as unknown, Bayesian probability theory
is used to compute the posterior probability for the order of the polynomial.
The input data are two column ASCII and this package does not process multiple
data sets.
-
The MaxEnt Histograms, density estimation
package,
is an ASCII package that takes as
its input a two column ASCII file. Column one is just a data point number and column
two is a sample from the unknown density function.
The program models the density function as a Maximum Entropy moment
distribution having an unknown number of Lagrange multipliers. So the
parameters are Lagrange multipliers and the unknown number of them.
The program does a Markov chain Monte Carlo simulation with simulated annealing
where the number of multipliers is one more parameter in the simulation.
Outputs include the posterior probability for the number of multipliers,
the posterior probabilities for the multipliers, scatter plots and the
polynomials used in the calculations.
-
The Binned Histogram
package
is a new histogramming package. In the
previous release of the software, there was a MaxEnt histogramming
package that infers histograms that are functionally
Maximum Entropy moment distributions. As such the program
is inferring the moments and the number of moments needed
to represent the input samples from an unknown density.
This procedure works well for compact distributions, but fails
badly when the distribution of samples is multimodal. In order
to estimate density functions when the samples are multimodal we
added a histogramming package that infers what can only
be called binned histograms. These histograms can represent
any distribution, they have error bars on the number of counts
in the bins, and the user can indicate if the histograms are to be
smoothed or not.
-
A Kernel density estimation package has been added to the list of packages.
This is a true density estimation package that attempts to estimate
a density function by expanding it on a set of kernels. There
are nine
different kernel types and the package attempts to determine what superposition
of kernels best describes the density function.
Because this is a Bayesian estimation of the density function,
the estimated density function comes with uncertainty estimates.
Note that I have not yet written the manual pages for this package.
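To illustrate what expanding a density on a set of kernels means, here is a deliberately simplified, non-Bayesian sketch: a fixed-bandwidth Gaussian kernel density estimate with one kernel per sample. It does not select among kernel types or produce uncertainty estimates as the package does. The samples and the bandwidth are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(x, samples, bandwidth):
    """Average of kernels centered on the samples, scaled by the bandwidth."""
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)

# Hypothetical bimodal samples and bandwidth.
samples = [-2.1, -1.9, -2.0, 1.8, 2.0, 2.2]
bandwidth = 0.4
density_between_modes = kernel_density(0.0, samples, bandwidth)
density_at_mode = kernel_density(2.0, samples, bandwidth)
```

Even this simple estimator represents the two modes separately, with the density between the modes far below the density at either mode.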
-
The Linear Phasing
package
produces linearly phased images.
In spin echo MRI most images can be phased (absorption mode images) by calculating two
first order phases and one zero order phase. Bayes Phase computes these phases and
then applies them to the images. The resulting images are then available for further
processing by the Analyze Image Pixels package. For more on this calculation
see Automatic phasing of MR images. Part I: Linearly varying phase.
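Applying one zero order phase and two first order phases can be sketched as follows. The example covers only the correction step, not the Bayesian estimation of the phases, and it assumes the phase varies linearly with the row and column indices. All names and values are hypothetical.

```python
import cmath

def linear_phase(row, col, zero_order, first_order_row, first_order_col):
    """Phase (radians) that varies linearly across the image."""
    return zero_order + first_order_row * row + first_order_col * col

def apply_phase_correction(image, zero_order, first_order_row, first_order_col):
    """Rotate every complex pixel by minus the linear phase at its position."""
    return [
        [
            pixel * cmath.exp(-1j * linear_phase(r, c, zero_order,
                                                 first_order_row, first_order_col))
            for c, pixel in enumerate(row_pixels)
        ]
        for r, row_pixels in enumerate(image)
    ]

# Hypothetical test image: real, positive magnitudes with a known linear phase applied.
true_magnitudes = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
phases = (0.3, 0.05, -0.02)
measured = [
    [m * cmath.exp(1j * linear_phase(r, c, *phases)) for c, m in enumerate(row)]
    for r, row in enumerate(true_magnitudes)
]
phased = apply_phase_correction(measured, *phases)
```

After the correction the imaginary parts vanish to rounding error and the real parts reproduce the original magnitudes, which is what an absorption mode image looks like in this idealized case.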
-
The Nonlinear Phasing
package
phases images that are varying in
a nonlinear fashion.
This package can be used to produce absorption mode images for gradient echo MR images
or any other image in which the phase is varying in an unpredictable fashion.
For more information on this calculation
see Automatic phasing of MR images. Part II: Voxel-wise phase estimation.
-
The Image Pixels
package
loads a predefined model
and then uses that model to analyze images on a pixel by pixel basis.
Models can be loaded from the system directory, and these predefined models perform a number
of common calculations in MRI such as exponential analysis with one or more exponentials
with or without a constant, and diffusion tensor analysis. Additionally, users can copy and then edit an example
model to create models of their own. These models can be loaded from the user's home directory
and then used to analyze the image.
-
The Image Pixels package includes an option for finding the peak of the posterior
probability. When this option is selected, a different program is actually run by
the package. This program is a searching algorithm that looks for the peak in the
posterior probability for the parameters in the model.
The Chapter on the Bayes Analyze
package
has an extensive discussion of Levenberg-Marquardt and Newton-Raphson.
These peak parameter estimates
are then used to generate maps of the various parameters appearing in the model.
Because this program is a searching routine rather than an McMC routine, it is very
fast and can give you good results using any ASCII model in a small fraction of the
time needed to run the Markov chain Monte Carlo simulations.
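A one dimensional Newton-Raphson search for the peak of a log posterior can be sketched as below. This illustrates the general technique only and is not the program's algorithm. The example posterior, a decay rate with log posterior n*log(k) - k*S whose peak is at k = n/S, and all values are hypothetical.

```python
def newton_raphson_peak(gradient, curvature, start, tolerance=1e-12, max_iterations=50):
    """Iterate x <- x - g(x) / h(x) until the step size falls below the tolerance."""
    x = start
    for _ in range(max_iterations):
        step = gradient(x) / curvature(x)
        x -= step
        if abs(step) < tolerance:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical log posterior: log p(k) = n * log(k) - k * S, with peak at k = n / S.
n, S = 20.0, 8.0
gradient = lambda k: n / k - S        # first derivative of the log posterior
curvature = lambda k: -n / (k * k)    # second derivative (negative near the peak)

peak = newton_raphson_peak(gradient, curvature, start=1.0)
```

The search reaches the peak in a handful of iterations, which is why a peak search of this kind is so much faster than a full Markov chain Monte Carlo simulation.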
-
The Image Pixels Model Selection
package
extends the concepts in Analyze Image Pixels to model selection.
In this package one can load a number of different models and then use Bayesian probability
theory to determine which model best accounts for the data.
The models in use here are the same models mentioned in both Analyze Image Pixels and the Enter Ascii packages.
However, here because the models can have different parameterizations, the output images
are constructed from the derived parameters. For more on this package and how to use it see
the user manual.
In addition to the discussions of the various packages,
the user manual also contains discussions of the various
file formats used by the Bayes Analysis software.
These include a discussion of the four dimensional floating point format,
sometimes called 4dfp, and a discussion of the Ascii File formats used.
The manual also covers the directory organization, how to install the software,
a description of the interface, a tutorial on Markov chain Monte Carlo
with thermodynamic integration, outlier detection,
and, finally, a detailed description of how to write and build
your own models.
This site is being maintained by:
Larry Bretthorst
Dept. of Chemistry and Radiology
Washington University
St. Louis MO 63130
Phone: 314 362-9994