The Protein Data Bank (PDB) contains more than 32,000 experimentally solved protein structures. The Structural Classification of Proteins (SCOP) database, a manual classification of these structures, cannot keep pace with the rapid growth of the PDB. We will provide an automatic classification of proteins which reflects the manual classification. We use structurally derived features to cluster groups of related proteins. Each cluster has a maximal set of shared features, or fingerprint.

Given a group of proteins and a target protein, the LGA algorithm[1] creates one structural alignment of each protein to the target. We use a Gaussian mixture model to cluster the proteins according to the structural regions they share with the target. Taking each of the proteins in turn as the target yields an ensemble of clusterings, that is, multiple partitions of the same set of proteins. Discrepancies are resolved by grouping together proteins that clustered together across many of the partitions.
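The consensus step described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes each partition is given as a mapping from protein identifier to cluster label, counts how often each pair of proteins shares a cluster, and merges pairs whose co-clustering fraction exceeds a hypothetical `threshold`. The identifiers `p1` to `p4`, the function names, and the single-linkage merging rule are assumptions made only for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Count, for each pair of items, how many partitions place them in the same cluster."""
    counts = {}
    for labels in partitions:
        for a, b in combinations(sorted(labels), 2):
            if labels[a] == labels[b]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


def consensus_clusters(partitions, threshold=0.5):
    """Merge items that share a cluster in more than `threshold` of the partitions."""
    items = sorted(partitions[0])
    parent = {item: item for item in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in co_association(partitions).items():
        if n / len(partitions) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


# Hypothetical example: three partitions of the same four proteins.
partitions = [
    {"p1": 0, "p2": 0, "p3": 1, "p4": 1},
    {"p1": 0, "p2": 0, "p3": 0, "p4": 1},
    {"p1": 1, "p2": 1, "p3": 0, "p4": 0},
]
print(consensus_clusters(partitions))
```

Here `p1` and `p2` share a cluster in all three partitions and `p3` and `p4` in two of three, so both pairs are merged, while the single co-occurrence of `p3` with `p1` and `p2` falls below the threshold.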

The test data consist of PDB structures with resolutions ranging from 0.54 Å (X-ray structures) to greater than 15 Å (electron microscopy). Despite this noise, our clustering method is robust: it detects relationships at the level of the SCOP superfamily with 88% accuracy and a low false positive rate.

Future work involves predicting the family and superfamily to which a new structure belongs. Comparison with the structural fingerprint determines whether a new structure belongs to the cluster. Initial results indicate that the fingerprint derived directly from a SCOP family can predict membership with almost complete accuracy.

We will present and compare computational and statistical results of several approximation methods for a general class of inverse problems in which the underlying dynamics are described by partial differential equations and the unknown quantity of interest is a probability distribution. We are interested in determining the distribution P* from a given family P(Q) that gives the best fit of the underlying model to some given data. In general, this optimization problem involves both an infinite-dimensional state space and an infinite-dimensional parameter space, so computationally efficient approximation methods are needed. In choosing these methods, we want the finite-dimensional sets P^M(Q) to converge to P(Q) in some sense; for these results we use the Prohorov metric, which characterizes weak star convergence of measures. The approximation methods that we will present are applicable to a variety of inverse problems, including Type I problems, in which individual longitudinal data are available for members of the population, and Type II problems, in which only aggregate or population-level longitudinal data are available. The results that we will present are for problems of Type II, where the population of interest is a size-structured mosquitofish population.

Biologists are using mosquitofish in place of chemical pesticides to control mosquito populations in rice fields, in an effort to protect the environment. Although mosquitofish have been used this way for some time, the growth of mosquitofish populations is not completely understood. To determine the optimal number of mosquitofish to use for control purposes, biologists would like a mathematical model that accurately predicts the growth and decline of mosquitofish populations. We will present the Sinko-Streifer population model modified as in the Growth Rate Distribution model of Banks-Botsford-Kappel-Wang. We will also present and compare computational and statistical results of a delta function approximation method, a spline-based approximation method, and a parameterized ordinary least squares (OLS) formulation. The latter uses an a priori probability distribution in the inverse problem for estimation of distributions of growth rates in size-structured mosquitofish populations. The approximation methods are tested with experimental data collected from rice fields.

An idealized model is presented that can represent the main features of the tropical atmosphere. The model uses a truncated vertical structure with only two vertical modes. It is numerically integrated to a steady state that represents the climatology of the tropical atmosphere. The numerical model is non-oscillatory, so discontinuous flows can be represented well, and the numerical model can represent a balanced state of the atmosphere to machine precision.

The Magnetothermal Instability (MTI) is a plasma instability that occurs in dilute, magnetized astrophysical plasmas. The driving force is anisotropic thermal transport that occurs along magnetic field lines. This instability is likely to be applicable to the X-ray emitting gas that makes up the intracluster medium in clusters of galaxies as well as to the atmospheres of magnetized neutron stars. In this poster we show the results of 2D and 3D magnetohydrodynamic simulations of this instability with a version of the MHD code ATHENA. We demonstrate verification of the simulation by comparison to linear WKB theory. We then follow the instability into the nonlinear regime in order to measure saturation and transport properties.

The yeast *Saccharomyces cerevisiae* has become an essential model for the study of fundamental aspects of eukaryotic cell behavior. As research in systems biology moves steadily toward the development of a quantitative description of biological networks, there is an increasing need for a yeast-specific single-cell assay capable of capturing the dynamics of cellular processes. Recent advances in microscopy and microfluidic devices promise to facilitate the acquisition of long image sequences monitoring large populations of cells. However, manual extraction of single-cell information from these images is prohibitively time-consuming. Here, we present a software package for the automated extraction of single-cell trajectories from a series of fluorescence microscopy images. In automatic mode, image segmentation is over 95% accurate, and a manual correction option allows the user to correct any errors. Coupled with advanced imaging technologies, this software should greatly aid quantitative modelers of both native and synthetic gene circuits by facilitating the long-term observation of dynamical properties of gene regulation in *S. cerevisiae*.

We investigate the effect of different coarse-graining techniques on the observed properties of micellar systems, including their impact on critical micelle concentration and average aggregation number. Based on molecular thermodynamic arguments, we argue that an effective coarse-graining technique in micellar systems is one that preserves the surfactant tail free energy of transfer between aqueous and melt phases.

We report a multi-scale modeling study of the transport and orientation properties of single-stranded DNA molecules in a nanoscale channel. A novel nanotechnology concept has been proposed that offers the possibility of unprecedented speed in the detection of DNA sequences. The proposed device consists of a detection gate approximately two to five nanometers wide placed between two nonconductive plates. The DNA molecules in aqueous solution contained between the plates will be driven by an electric field through the detection gate. Individual base pairs within the DNA sequence are determined experimentally by examining the variations in the tunneling conductance of the gate. We are conducting large-scale molecular dynamics simulations to study the transport and orientation of the DNA segment and the orientation of each base pair as it passes through the nanogate. Electric fields are applied along both vertical and horizontal directions to control the motion of DNA segments and the orientations of the base pairs. Molecular dynamics is used to determine ideal gate widths, optimal electric fields to be applied, and the ideal solvent environment. Recent efforts include the implementation of charge dynamics to properly represent the metal/non-metal interactions. Additionally, an alternative to the use of an electric field as a controlling mechanism is examined. Results from these molecular dynamics simulations are presented. In the broader project, the molecular dynamics simulations are combined with ab initio calculations of differences in tunneling electron transport across the nanoscale gap as different nucleotides pass through the gap; these calculations are combined with experimental fabrication of the actual device. Both the computational and experimental projects are supported by complementary NIH grants.

The weighted histogram analysis method (WHAM) is an intuitive and surprisingly accurate method for extracting work information from molecular dynamics simulations. One advantage of this method is its foundation in umbrella sampling and its ability to combine several simulations without worrying about overlap issues. This method, along with several other established protocols, was applied to driving solvated ions through a silicon dioxide pore.
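To make the idea of combining several umbrella-sampling simulations concrete, the following is a minimal sketch of the standard self-consistent WHAM iteration on binned data. It is not the code used in this work: the function name `wham`, the harmonic bias centers, and the synthetic histograms are all assumptions chosen for illustration, and energies are expressed in units of kT.

```python
import math


def wham(counts, bias, n_iter=2000, tol=1e-12):
    """Iterate the WHAM equations on binned data.

    counts[i][b]: histogram count of window i in bin b
    bias[i][b]:   bias energy of window i at bin b, in units of kT
    Returns (p, f): unbiased bin probabilities and per-window free energies (kT).
    """
    n_win, n_bins = len(counts), len(counts[0])
    totals = [sum(row) for row in counts]  # samples per window
    pooled = [sum(counts[i][b] for i in range(n_win)) for b in range(n_bins)]
    f = [0.0] * n_win
    for _ in range(n_iter):
        # Unbiased probability estimate given the current free energies.
        p = []
        for b in range(n_bins):
            denom = sum(totals[i] * math.exp(f[i] - bias[i][b]) for i in range(n_win))
            p.append(pooled[b] / denom)
        norm = sum(p)
        p = [x / norm for x in p]
        # Free energy of each window given the current probability estimate.
        new_f = [
            -math.log(sum(p[b] * math.exp(-bias[i][b]) for b in range(n_bins)))
            for i in range(n_win)
        ]
        shift = max(abs(a - b) for a, b in zip(new_f, f))
        f = new_f
        if shift < tol:
            break
    return p, f


# Hypothetical test case: two harmonic bias windows over five bins, with
# idealized (noise-free) histograms generated from a known distribution.
true_p = [0.1, 0.2, 0.4, 0.2, 0.1]
bias = [[0.5 * (b - c) ** 2 for b in range(5)] for c in (1.0, 3.0)]
counts = []
for row in bias:
    weights = [tp * math.exp(-u) for tp, u in zip(true_p, row)]
    z = sum(weights)
    counts.append([1000.0 * w / z for w in weights])

p, f = wham(counts, bias)
print([round(x, 4) for x in p])
```

Because the synthetic histograms are exact, the iteration should recover `true_p`; with real, noisy histograms the estimate depends on how well the sampled windows cover the coordinate of interest.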

We argue that the functional quality of a biochemical signaling pathway or a regulatory circuit should be measured in terms of the amount of information (in bits) between the copy numbers of the input and the output signaling molecules that is attainable by the circuit. Treating stochastic effects by the linear noise (semiclassical) expansion around a deterministic solution of a biochemical dynamical system (which we verify by direct Gillespie simulations), we systematically analyze this mutual information in many small biochemical circuits, including various feedback loops, that can be built out of 3 chemical species coupled by Hill-type interactions as a function of ~15 chemical kinetics parameters. We study this information for a certain distribution of the input signals and maximize it over biologically realistic ranges of the parameters. Surprisingly, all the circuits manage to attain almost the maximum information possible (which we calculate analytically) for the given mean molecular copy numbers and integration times. Additionally, these high information solutions are robust to rather large fluctuations in the parameters. These findings suggest potential explanations for the “cross-talk” paradox and other molecular information processing phenomena. Furthermore, they lead us to question an assumption behind many recent publications that naturally occurring biochemical networks are special in their information processing properties.
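As a small illustration of the quantity being maximized, the sketch below computes the mutual information in bits between a discrete input and output from their joint probability table. It only illustrates the definition: the analysis described above uses the linear noise approximation rather than explicit tables, and the function name and the two-state examples are assumptions.

```python
import math


def mutual_information_bits(joint):
    """Mutual information I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of the input
    py = [sum(col) for col in zip(*joint)]    # marginal of the output
    info = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0.0:  # zero-probability cells contribute nothing
                info += pxy * math.log2(pxy / (px[x] * py[y]))
    return info


# Hypothetical two-state examples.
perfect = [[0.5, 0.0], [0.0, 0.5]]        # output copies the input
independent = [[0.25, 0.25], [0.25, 0.25]]  # output ignores the input
print(mutual_information_bits(perfect))      # 1.0
print(mutual_information_bits(independent))  # 0.0
```

A circuit whose output perfectly tracks a two-state input transmits one bit, and one whose output is independent of the input transmits none; noisy circuits fall in between.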

A typical neuron in the human brain receives thousands of synaptic inputs from neighboring cells and may direct its output signals to just as many targets. An important long-term goal in the field of neuroscience is to understand how this complex local connectivity shapes network dynamics and supports information processing in the brain. Toward this end, recent technical advances in serial section electron microscopy have provided us with a very high resolution 3D window onto neuronal ultrastructure and synaptic connectivity throughout large volumes of fixed tissue. Image stacks produced by high-throughput techniques such as serial block-face scanning EM (SBF-SEM) are so large, however, that manual tracing of all anatomical features in a typical data set would take years. For this reason, automated 3D reconstruction is imperative. Currently, I am developing robust algorithms for edge detection and segmentation of electron micrographs, as well as algorithms for registration and merging of traced objects across sections in SBF-SEM data. In the near term, my goal is to demonstrate accurate and exhaustive 3D reconstruction of anatomical features within small sample volumes. Future work will apply this technique in a high-performance computing context by simultaneously reconstructing and then reassembling many sub-volumes from a large data set. Through this work, I hope to demonstrate a robust methodology for automated 3D reconstruction of neurons from serial EM data sets, which may be used to gain important new insights into the structure and function of neural systems.