The effect of RNA substitution models on viroid and RNA virus phylogenies.
Patiño Galindo J., González Candelas F., Pybus OG.
Many viroids and RNA viruses have genomes that exhibit secondary structure, with paired nucleotides forming stems and loops. Such structures violate a key assumption of most methods of phylogenetic reconstruction, namely that sequence change is independent among sites. However, phylogenetic analyses of these transmissible agents rarely use evolutionary models that account for RNA secondary structure. Here we assess the effect of using RNA-specific nucleotide substitution models on the phylogenetic inference of viroids and RNA viruses. We obtained data sets comprising full-genome nucleotide sequences from 6 viroid and 10 single-stranded RNA virus species. For each alignment, we inferred consensus RNA secondary structures, then evaluated different DNA and RNA substitution models. We used model selection to choose the best-fitting model and to evaluate the estimated Bayesian phylogenies. Further, for each data set we generated and compared Robinson-Foulds (RF) statistics in order to test whether the distributions of trees generated under alternative models are notably different from each other. In all alignments, the best-fitting model was one that considers RNA secondary structure: RNA models that allow a non-zero rate of double substitution (RNA16A, RNA16C) fitted best for both viral and viroid data sets. In 14 of 16 data sets, the use of an RNA-specific model led to significantly longer tree lengths, but in only 3 cases did it have a significant effect on the RF statistics. In conclusion, using an RNA-specific model when undertaking phylogenetic inference of viroids and RNA viruses can provide a better model fit than standard approaches, and model choice can significantly affect branch length estimates.
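As an illustration of the Robinson-Foulds statistic mentioned above, the following minimal sketch computes the RF distance between two unrooted tree topologies as the number of non-trivial bipartitions present in one tree but not the other. It is not the software used in the study, which the abstract does not name. The nested-tuple tree representation and the function names (`leaves`, `splits`, `rf_distance`) are assumptions made for this example only.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical tree representation: a leaf is a string, an internal node
# is a tuple of subtrees. Branch lengths are ignored (topology only).
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result: FrozenSet[str] = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split is always encoded the same way
    regardless of where the tree happens to be rooted.
    """
    all_leaves = leaves(tree)
    reference = min(all_leaves)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade: FrozenSet[str] = frozenset()
        for child in node:
            clade |= visit(child)
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(a) ^ splits(b))


# Two five-taxon topologies that disagree on one internal branch.
tree_dna = (("A", "B"), "C", ("D", "E"))
tree_rna = (("A", "C"), "B", ("D", "E"))
print(rf_distance(tree_dna, tree_rna))
```

In a Bayesian setting, a function of this kind would be applied to pairs of trees sampled from the posterior distributions obtained under the two models being compared, yielding distributions of RF values rather than a single number.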