DNA and RNA Structure

218. Molecular dynamics of protein-RNA interactions: the recognition of an RNA stem-loop by a Staufen double-stranded RNA-binding domain
219. Numerical analysis of RNA structurisation process.
220. Estimation of the Amount of A-DNA and Z-DNA in Sequenced Chromosomes
221. Feature selection to discriminate five dominating bacteria typically observed in Effluent Treatment Plant



218. Molecular dynamics of protein-RNA interactions: the recognition of an RNA stem-loop by a Staufen double-stranded RNA-binding domain
Tiziana Castrignanò, Giovanni Chillemi, CASPUR (Italian Interuniversities Consortium for Supercomputing Applications);
Gabriele Varani, MRC Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, UK;
Alessandro Desideri, University of Rome "Tor Vergata", Via della Ricerca Scientifica, 00133 Rome, Italy;
Tiziana.Castrignano@caspur.it
Short Abstract:

In this work we report 2 ns molecular dynamics simulations of three molecular systems: (1) the complex between a double-stranded RNA-binding domain (dsRBD) and an RNA stem-loop from Drosophila; (2) the free dsRBD protein; (3) the free RNA stem-loop. Analysis of the trajectories highlights the regions involved in the recognition process.

One Page Abstract:

RNA-protein interactions play a central role in a wide range of biological processes. One of the most common RNA-binding motifs is the double-stranded RNA-binding domain (dsRBD), found in many eukaryotic and prokaryotic proteins involved in RNA processing, maturation and localization. In this work we report 2 ns molecular dynamics simulations in aqueous solution of three molecular systems: (1) the complex between the third dsRBD (dsRBD3) from Drosophila Staufen and an RNA stem-loop; (2) the free dsRBD3 protein; (3) the free RNA stem-loop. All systems were simulated using the AMBER force field, with the Ewald summation method used to treat electrostatic interactions. Analysis of the trajectories allowed us to highlight the regions involved in the recognition process and to compare the flexibility of the free and bound macromolecules. These data are also compared with the experimental NMR structures and with experimental mutational studies to provide a description of the residues crucial for binding affinity and specificity.
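The comparison of flexibility between free and bound molecules is often made through the per-atom root-mean-square fluctuation (RMSF). The abstract does not say which measure was used, so the following is only a minimal sketch under that assumption. The function name `rmsf` and the two toy trajectories are hypothetical and are not data from the study; frames are assumed to be already superimposed on a common reference.

```python
import math

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, assumed to be already superimposed on a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Mean position of atom a over all frames.
        mean = [sum(frame[a][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Hypothetical two-atom, two-frame toy trajectories (illustration only).
free = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]]
bound = [[(0.0, 0.0, 0.0), (1.9, 0.0, 0.0)],
         [(0.0, 0.0, 0.0), (2.1, 0.0, 0.0)]]

rmsf_free = rmsf(free)
rmsf_bound = rmsf(bound)
# Positive values mark atoms that are less mobile in the bound state.
delta = [f - b for f, b in zip(rmsf_free, rmsf_bound)]
```

In this toy case the second atom fluctuates less in the bound trajectory, which is the kind of difference such a comparison would bring out.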


219. Numerical analysis of RNA structurisation process.
Ekaterina Kozyreva, State Research Institute of Genetics and Selection of Industrial Microorganisms;
T. M. Eneev, N. N. Kozlov, E. I. Kugushev, Keldysh Institute of Applied Mathematics RAS;
D. I. Sabitov, Moscow State University;
Katya.Kozyreva@moscow.att.com
Short Abstract:

The problem of RNA structure formation during transcription is considered. We describe a mathematical model of RNA secondary structure formation and a fundamentally new algorithm, based on an oriented, dynamically changing mathematical graph, together with its computer implementation.

One Page Abstract:

The problem of RNA structure formation during transcription is considered. Both a mathematical model of RNA secondary structure formation and a fundamentally new algorithm, with a corresponding computer implementation, are described. The growth of the RNA chain is associated with a set of discrete internal structural gaps leading from the unstructured state to the final configuration, in which the molecule folds into a locally stable structure. The ordered set of these gaps simulates the elongation of the RNA and the formation of its structure at the current step of transcription. Applying this so-called consecutive approach significantly improves the accuracy of RNA secondary structure prediction and supports the assumption that RNA transcription is a discontinuous process. The set of inter-structural gaps is represented by an oriented, dynamically changing mathematical graph. Each vertex of the graph corresponds to the set of possible secondary structures of the RNA transcript at the current step of chain growth. Each edge is assigned dG, the free energy increment of a permissible structural transition from the unstructured to the structured state at that step of the simulation. A path on this graph characterizes the step-by-step formation of RNA secondary structure. A new computer implementation, based on consecutive accumulation and storage of transition paths on the structural graph, reduces the cost of RNA secondary structure calculations by up to two orders of magnitude in comparison with traditional approaches; a calculation averages about 12 hours per molecule on the MVS-1000 complex. This development makes it possible to carry out a comprehensive set of computational experiments on RNA molecules of more than 150 nucleotides. Fifty RNase P RNA molecules with known secondary structure were tested with the proposed approach.
It was found that the accuracy of secondary structure prediction grows rapidly as the period of transcription increases from 1 to 20 nucleotides per step. Three notable peaks in the number of RNA molecules with more than 50% of base pairs correctly predicted are observed for chain growth rates between 20 and 60 nucleotides per step. It can be assumed that the period of transcription lies within these limits. All three peaks in the range of 20 to 60 nucleotides correspond to values of the transcription period T that are multiples of the length of one helical turn of A-form RNA. For values of T above 60 nucleotides, the prediction accuracy curve shows a declining trend.
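The graph formulation can be illustrated with a small layered directed graph in which each layer holds the candidate structures at one chain-growth step and each edge carries a dG increment. The sketch below picks the path with the lowest summed dG by dynamic programming. That selection rule is an assumption made for illustration, since the abstract does not state how paths are ranked, and the function `best_path`, the vertex names and the dG values are all hypothetical.

```python
def best_path(layers, dg):
    """Return (total_dG, path) with the lowest summed dG through a layered graph.

    layers: list of lists of vertex names, one list per chain-growth step.
    dg: dict mapping (u, v) -> free energy increment of a permitted transition.
    """
    # best[v] = (lowest cumulative dG needed to reach v, path taken to v)
    best = {v: (0.0, [v]) for v in layers[0]}
    for prev, curr in zip(layers, layers[1:]):
        nxt = {}
        for v in curr:
            candidates = [
                (best[u][0] + dg[(u, v)], best[u][1] + [v])
                for u in prev
                if u in best and (u, v) in dg
            ]
            if candidates:
                nxt[v] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])

# Hypothetical toy graph: three growth steps, dG values chosen for illustration.
layers = [["open"], ["hairpin_A", "open_2"], ["hairpin_A+B", "hairpin_C"]]
dg = {
    ("open", "hairpin_A"): -3.0,
    ("open", "open_2"): 0.0,
    ("hairpin_A", "hairpin_A+B"): -2.5,
    ("hairpin_A", "hairpin_C"): 1.0,
    ("open_2", "hairpin_C"): -4.0,
}

total, path = best_path(layers, dg)
```

Keeping only the best path into each vertex at every step means earlier steps are never recomputed, which is in the spirit of the accumulation and storage of transition paths described above, although the authors' actual data structures are not given.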


220. Estimation of the Amount of A-DNA and Z-DNA in Sequenced Chromosomes
David Ussery, Dikeos Mario Soumpasis, Hans Henrik Stærfeldt, Peder Worning, Anders Krogh, CBS, DTU;
dave@cbs.dtu.dk
Short Abstract:

We have examined sequenced chromosomes for stretches of purines (R) or pyrimidines (Y) capable of forming A-DNA and alternating YR stretches which could form Z-DNA. Out of more than 500 sequenced chromosomes from eukaryotes, prokaryotes, and viruses, the majority have more A-DNA and Z-DNA than expected for a random sequence.

One Page Abstract:

We have examined sequenced chromosomes for stretches of purines (R) or pyrimidines (Y) capable of forming A-DNA, and for alternating YR stretches that could form left-handed Z-DNA. Since A-DNA helices can readily form with stretches of 5 purines in a row, we use the fraction of each genome contained in purine (or pyrimidine) tracts of 5 bp or longer as a measure of the A-DNA content. By this criterion, a random sequence would be expected to contain about 18.75% A-DNA. The more than 500 sequenced chromosomes examined contained an average of 25% A-DNA, with a low of 10% and a high of 40%. In the majority of cases (84% of the chromosomes, which contain 98% of the total DNA), there is more A-DNA than would be expected from a random sequence. The percentage of a chromosome capable of forming Z-DNA is estimated by looking for alternating pyrimidine-purine stretches (YR)n at least 10 bp long. Under this assumption, the expected value is calculated to be about 0.6% of the genome. Although the average over all chromosomes (1.04%) was higher than expected, in many prokaryotic genomes such tracts are less frequent than expected, whilst in eukaryotic chromosomes alternating YR tracts are more common than anticipated. Overall, about a third of the chromosomes (197 of 589, mainly viral) had fewer alternating purine/pyrimidine stretches than expected, whilst the remaining two thirds (including all eukaryotic chromosomes) had more; in some protozoan chromosomes, YR stretches of 10 bp or longer make up more than 6% of the length (i.e., more than 10x the expected value). The localisation of A-DNA and Z-DNA regions within chromosomes and possible biological roles for these alternative DNA conformations are discussed.
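The A-DNA criterion above can be sketched in a few lines. The function names `tract_fraction` and `expected_fraction` are hypothetical, the input is assumed to contain only A, C, G and T, and the expected value assumes an infinite random sequence in which purines and pyrimidines are equally likely; under those assumptions the closed form reproduces the 18.75% quoted above. The sketch covers only the purine/pyrimidine tract measure, not the Z-DNA estimate.

```python
import re

PURINES = "AG"

def tract_fraction(seq, min_len=5):
    """Fraction of bases lying in purine-only or pyrimidine-only runs >= min_len.

    Assumes seq contains only A, C, G and T; any other symbol would be
    counted as a pyrimidine by this simple mapping.
    """
    # Reduce the sequence to a two-letter R/Y alphabet.
    ry = "".join("R" if base in PURINES else "Y" for base in seq.upper())
    pattern = r"R{%d,}|Y{%d,}" % (min_len, min_len)
    covered = sum(len(m.group(0)) for m in re.finditer(pattern, ry))
    return covered / len(ry)

def expected_fraction(min_len=5):
    """Expected tract fraction for an infinite random sequence, P(R) = P(Y) = 1/2.

    A base lies in a run of exactly L bases with probability L / 2**(L + 1);
    summing over all L >= min_len gives (min_len + 1) / 2**min_len.
    """
    return (min_len + 1) / 2 ** min_len

# Toy sequence: RRRRR YYYYY RR YY, so 10 of its 14 bases lie in tracts >= 5.
observed = tract_fraction("AAGGACTCTTAGCT")
expected = expected_fraction()  # 6 / 32 = 0.1875
```

Comparing `observed` with `expected` for a whole chromosome gives the kind of excess or deficit reported in the abstract.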


221. Feature selection to discriminate five dominating bacteria typically observed in Effluent Treatment Plant
Hemant J. Purohit, D. V. Raje, R. N. Singh, National Environmental Engineering Research Institute;
hemantdrd@hotmail.com
Short Abstract:

Six dinucleotide features were selected from 16S rRNA sequences using a stepwise algorithm to discriminate five dominating bacterial groups from effluent treatment plants. Two linear composites were obtained that discriminate the training set of sequences with 91% accuracy; they were validated on a test set with almost the same predictive accuracy.

One Page Abstract:

Defining a microbial community and identifying bacteria, at least at the genus level, is a first step in predicting the behavior of any biological treatment system. In effluent treatment plants, the most dominant and typically observed bacterial groups are Pseudomonas, Moraxella, Acinetobacter, Burkholderia and Alcaligenes. Even though genetically close, these bacteria may be distinguished from each other on the basis of their nucleotide compositions. Our interest lies in selecting features from 16S rDNA sequences that could be used to develop a tracking tool. Twenty sequences from each of the above groups were retrieved from GenBank. A feature space comprising likelihood estimates of dinucleotides was defined on the sequence data. A stepwise feature selection method was used, which selected six of the sixteen features as having significant variability across the sequences. Multiple group discriminant analysis was carried out to test the efficacy of the selected features in segregating the sequences into their respective groups. Two linear composites, as functions of these features, could discriminate the training set of sequences with 91% accuracy and were validated on a test set with almost the same predictive accuracy. This confirmed the relevance of the selected features for the classification. These features, independently or in combination, might generate genus-specific patterns that could be used to develop PCR protocols and thereby a tracking tool. The program for determining the likelihood estimates is available from the corresponding author. The rest of the analysis was carried out using SPSS software, which is commercially available.
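The sixteen-dimensional dinucleotide feature space might be built as in the sketch below. It is not the authors' program: it assumes that the "likelihood estimate" of a dinucleotide is its maximum-likelihood (relative frequency) estimate over overlapping windows, and the function name `dinucleotide_frequencies` and the toy sequence are hypothetical.

```python
from itertools import product

# The 16 possible dinucleotides over the alphabet A, C, G, T.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_frequencies(seq):
    """Relative frequency of each of the 16 dinucleotides (overlapping windows)."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCS, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip windows containing ambiguous bases such as N
            counts[pair] += 1
            total += 1
    return {d: (counts[d] / total if total else 0.0) for d in DINUCS}

# Toy sequence (illustration only): windows are AC, CG, GT, TA, AC.
features = dinucleotide_frequencies("ACGTAC")
```

Each sequence then yields one 16-element feature vector, from which a stepwise selection procedure could retain a subset such as the six features reported above.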