Key points: The RNA alignments were founded on superimposition of the crystal structures of all proteins with resolved structures. Whenever sequences were added, initially by ClustalW profile-profile fits, the compilation was edited to conform with that fit and with any other known information concerning protein or RNA conservation. The RNA alignments (*.rna) maintain the open reading frames from which the aligned polyproteins (*.p123) were translated. In addition, they respect known 2D and 3D RNA structural motifs, including 5' and 3' stem elements, 5' pseudoknots, 5' IRES, poly(C) tracts, 3' polyadenylation sites, and internal cre elements. The protein alignments respect proteolytic cleavage sites, enzyme active sites, core protein structures (helices, sheets, turns), and mapped antigenic sites.
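As an illustration of what "maintaining the open reading frame" means in practice, here is a minimal Python sketch that translates one gapped RNA row into its aligned protein row. It is not the procedure used to build these files. It assumes that the row starts in frame, that gaps are written as "-" and fall on whole codons, and that the standard genetic code applies. The function name translate_aligned is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_aligned(rna):
    """Translate an in-frame, gapped RNA row into a gapped protein row.

    Assumption: gaps are '-' and always span whole codons ('---').
    Codons containing ambiguity codes are reported as 'X'.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input too
    if len(rna) % 3 != 0:
        raise ValueError("row length is not a multiple of three")
    protein = []
    for i in range(0, len(rna), 3):
        codon = rna[i:i + 3]
        if codon == "---":
            protein.append("-")          # whole-codon gap keeps the frame
        elif "-" in codon:
            raise ValueError(f"gap breaks the reading frame at column {i + 1}")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


if __name__ == "__main__":
    print(translate_aligned("AUGGCU---UAA"))
```

A row in which a gap splits a codon raises an error, which is one simple way to check that an edited RNA alignment still preserves the frame.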
File Format: Many older links access text files in MSF format (Wisconsin Package). Newer links include *.fas and *.meg files. To access the collective directory of files, CLICK HERE. The individual sequences are usually referenced by their GenBank accession numbers. Except for the HRV, conversion tables listing strain designations are in the respective ReadMe files (PDF format).
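For readers who want to load one of the newer files programmatically, the following Python sketch reads FASTA-formatted alignment text into a dictionary keyed by the first word of each header line. It assumes that the *.fas files are plain FASTA and that the first header word is the sequence identifier (for example a GenBank accession number); neither assumption is confirmed here. The function name and the sample identifiers are placeholders.

```python
def read_fasta_alignment(text):
    """Parse FASTA-formatted alignment text into {identifier: aligned_row}.

    Assumption: the identifier is the first whitespace-delimited word
    after '>' on each header line.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line without an identifier")
            name = fields[0]
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line)
    aligned = {n: "".join(parts) for n, parts in records.items()}
    # Every row of an alignment must have the same number of columns.
    if len({len(row) for row in aligned.values()}) > 1:
        raise ValueError("rows differ in length; input is not an alignment")
    return aligned


if __name__ == "__main__":
    sample = ">SEQ_A placeholder strain\nAUG-\nGCU\n>SEQ_B\nAUGG\nGCU\n"
    for ident, row in read_fasta_alignment(sample).items():
        print(ident, row)
```

The equal-length check is a quick sanity test that a downloaded file is an alignment and not a set of unaligned sequences.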
What's Available? In 2005, updated RNA and protein alignments were posted for: Cardioviruses (whole genus), FMDVs (select isolates representing the whole species), and FMDV types A, O, C, Asia, and SAT (all available complete genomes). Enterovirus versions date from 2002. Note that for the family alignments, only picorna.p1 and picorna.2CP3 files can be formed. The remainder of the polyproteins and the 5' and 3' RNA elements are not (always) homologous among the genera, so they cannot be usefully aligned. Rhinovirus alignments and trees are from 2009 (Science) or 2013 (Virology).