Production Details
Type | Conference Paper |
Group | Bibliographic Production |
Description | PROSDOCIMI, F ; LINARD, B. ; PONTAROTTI, P. ; POCH, O. ; Thompson, J. The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs. In: X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011, Florianópolis. Anais do X-meeting 2011 -- Brazilian International Conference in Bioinformatics, 2011. v. . p. -. |
Author | Francisco Prosdocimi |
Year | 2011 |
Additional Information
Year Held | 2011 |
Year of Work | 2011 |
Event City | Florianópolis |
Event Classification | INTERNATIONAL |
Description and Additional Information | High-throughput genomics technologies are providing huge amounts of raw genomic sequence data. As a consequence, it is now feasible to compare sequences from hundreds of diverse organisms, to perform detailed studies of the evolutionary patterns and forces that shaped extant proteins, and to understand the effect of the genetic changes that are responsible for the phenotypic differences between organisms. The shift to the genome scale in evolutionary biology has led to many interesting but often conflicting studies. Some recent work has suggested that at least part of the conflict may be due to errors in the initial data, i.e., the set of gene sequences for each organism. Most protein sequences are now predicted by bioinformatics programs, and a number of quality issues have been raised concerning their accuracy, due to either DNA sequencing errors or badly predicted coding regions, particularly in eukaryotes. Here, we investigated the potential impact of these errors on evolutionary studies and specifically on the identification of important genetic events in the evolutionary history of a protein. We focused on the detection of a specific event: asymmetric evolution after duplication (AED), which has recently been the subject of controversy. Using the well-studied human genome as a reference, we established a reliable set of 688 duplicated genes in 13 complete vertebrate genomes, where significantly different evolutionary rates can be observed. We estimated the rates at which protein sequence errors occur in these genomes and are then accumulated in the higher-level analyses. We showed that the majority of the putative AED events (57%) are in fact false positives due to putative badly predicted genes. Using a GO enrichment analysis, we demonstrated that the false positives are sufficient to mask the true functional significance of these events.
Initial errors are accumulated at each level of the evolutionary analysis, generating artificially high rates of unusual event |
Science Outreach | NO |
Language | Portuguese |
Publication Medium | PRINT |
Nature | ABSTRACT |
Event Name | X-meeting 2011 -- Brazilian International Conference in Bioinformatics |
Event Country | Brazil |
Relevance | NO |
Proceedings Title | Anais do VII X-meeting |
Title of Work | The influence of bad gene predictions to evolutionary analysis: a case study on the evolutionary fate of paralogs |