Production Details

Type: Conference Paper
Group: Bibliographic Production
Description: PROSDOCIMI, F.; LINARD, B.; PONTAROTTI, P.; POCH, O.; THOMPSON, J. The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs. In: X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011, Florianópolis. Anais do X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011.
Author: Francisco Prosdocimi
Year: 2011

Additional Information

Year Held: 2011
Year of Work: 2011
Event City: Florianópolis
Event Classification: INTERNATIONAL
Description and Additional Information: High-throughput genomics technologies are providing huge amounts of raw genomic sequence data. As a consequence, it is now feasible to compare sequences from hundreds of diverse organisms, to perform detailed studies of the evolutionary patterns and forces that shaped extant proteins, and to understand the effect of the genetic changes that are responsible for the phenotypic differences between organisms. The shift to the genome scale in evolutionary biology has led to many interesting but often conflicting studies. Some recent work has suggested that at least part of the conflict may be due to errors in the initial data, i.e., the set of gene sequences for each organism. Most protein sequences are now predicted by bioinformatics programs, and a number of quality issues have been raised concerning their accuracy, due to either DNA sequencing errors or badly predicted coding regions, particularly in eukaryotes. Here, we investigated the potential impact of these errors on evolutionary studies and specifically on the identification of important genetic events in the evolutionary history of a protein. We focused on the detection of a specific event: asymmetric evolution after duplication (AED), which has been the subject of controversy recently. Using the well-studied human genome as a reference, we established a reliable set of 688 duplicated genes in 13 complete vertebrate genomes, where significantly different evolutionary rates can be observed. We estimated the rates at which protein sequence errors occur in these genomes and are then accumulated in the higher-level analyses. We showed that the majority of the putative AED events (57%) are in fact false positives due to putatively badly predicted genes. Using a GO enrichment analysis, we demonstrated that the false positives are sufficient to mask the true functional significance of these events.
Initial errors are accumulated at each level of the evolutionary analysis, generating artificially high rates of unusual event
Science Outreach: NO
Language: Portuguese
Medium: PRINT
Nature: ABSTRACT
Event Name: X-meeting 2011 -- Brazilian International Conference in Bioinformatics
Event Country: Brazil
Relevance: NO
Title of Proceedings: Anais do VII X-meeting
Title of Work: The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs