Using the repeat-finding algorithm FT-Rep, we have identified 154 pentatricopeptide repeat (PPR) proteins in nine fully sequenced genomes from green algae (with a total of 1201 repeats) and grouped them into 47 orthologous groups. All data are available in a database, PPRdb, accessible online at http://giavap-genomes.ibpc.fr/ppr. Based on phylogenetic trees generated from the repeats, we propose evolutionary scenarios for PPR proteins. Two PPRs are clearly conserved in the entire green lineage: MRL1 is a stabilization factor for the rbcL mRNA, while HCF152 binds in plants to the psbH-petB intergenic region. MCA1 (the stabilization factor for the petA mRNA) and PPR7 (a short PPR also acting on chloroplast mRNAs) are conserved across the entire Chlorophyta. The other PPRs are clade-specific, with evidence for gene losses, duplications, and horizontal transfer. In some PPR proteins, an additional domain found at the C terminus provides clues as to possible functions. PPR19 and PPR26 possess a methyltransferase_4 domain, suggesting involvement in RNA guanosine methylation. PPR18 contains a C-terminal CBS domain, similar to the CBSPPR1 protein found in nucleoids. PPR16, PPR29, PPR37, and PPR38 harbor an SmR (small MutS-related) domain similar to that found in the land plant proteins pTAC2, GUN1, and SVR7. The PPR-cyclins PPR3, PPR4, and PPR6 additionally contain a cyclin domain C-terminal to their SmR domain. PPR31 is an unusual PPR-cyclin containing at its N terminus an OctotricoPeptide Repeat (OPR) and a RAP domain. We consider the possibility that PPR proteins with an SmR domain can introduce single-stranded nicks in the plastid chromosome.
Keywords: chloroplast; cyclin; evolution; green algae; mitochondrion; pentatricopeptide repeat; small MutS-related; tRNA methyltransferase.