Production Details

Type	Conference Paper
Group	Bibliographic Production
Description	PROSDOCIMI, F ; LINARD, B. ; PONTAROTTI, P. ; POCH, O. ; Thompson, J. The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs. In: X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011, Florianópolis. Anais do X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011.
Author	Francisco Prosdocimi
Year	2011

Additional Information

Year Held	2011
Year of Work	2011
Event City	Florianópolis
Event Classification	INTERNATIONAL
Description and Additional Information	High-throughput genomics technologies are providing huge amounts of raw genomic sequence data. As a consequence, it is now feasible to compare sequences from hundreds of diverse organisms, to perform detailed studies of the evolutionary patterns and forces that shaped extant proteins, and to understand the effect of the genetic changes that are responsible for the phenotypic differences between organisms. The shift to the genome scale in evolutionary biology has led to many interesting, but often conflicting, studies. Some recent work has suggested that at least part of the conflict may be due to errors in the initial data, i.e. the set of gene sequences for each organism. Most protein sequences are now predicted by bioinformatics programs, and a number of quality issues have been raised concerning their accuracy, due to either DNA sequencing errors or badly predicted coding regions, particularly in eukaryotes. Here, we investigated the potential impact of these errors on evolutionary studies and specifically on the identification of important genetic events in the evolutionary history of a protein. We focused on the detection of a specific event: asymmetric evolution after duplication (AED), which has been the subject of controversy recently. Using the well-studied human genome as a reference, we established a reliable set of 688 duplicated genes in 13 complete vertebrate genomes, where significantly different evolutionary rates can be observed. We estimated the rates at which protein sequence errors occur in these genomes and are then accumulated in the higher-level analyses. We showed that the majority of the putative AED events (57%) are in fact false positives due to putative badly predicted genes. Using a GO enrichment analysis, we demonstrated that the false positives are sufficient to mask the true functional significance of these events. Initial errors are accumulated at each level of the evolutionary analysis, generating artificially high rates of unusual event
Science Outreach	NO
Language	Portuguese
Publication Medium	PRINT
Nature	ABSTRACT
Event Name	X-meeting 2011 -- Brazilian International Conference in Bioinformatics
Event Country	Brazil
Relevance	NO
Title of Proceedings	Anais do VII X-meeting
Title of Work	The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs