Identification of gene 3′ ends by automated EST cluster analysis
- Publication type:
- Journal article
- Metadata:
- Authors
- Enrique M Muro
- Robert Herrington
- Salima Janmohamed
- Catherine Frelin
- Miguel A Andrade-Navarro
- Norman N Iscove
- Author URL
- https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=fis-test-1&SrcAuth=WosAPI&KeyUT=WOS:000261995600050&DestLinkType=FullRecord&DestApp=WOS_CPL
- DOI
- 10.1073/pnas.0807813105
- eISSN
- 1091-6490
- External identifiers
- Clarivate Analytics Document Solution ID: 388BP
- PubMed Identifier: 19095794
- ISSN
- 0027-8424
- Issue
- 51
- Journal
- PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
- Keywords
- 3′ UTR
- gene prediction
- alternative polyadenylation
- transcriptome
- transcript probe design
- Pagination
- 20286 - 20290
- Publication date
- 2008
- Status
- Published
- Title
- Identification of gene 3′ ends by automated EST cluster analysis
- Sub types
- Article
- Volume
- 105
Data source: Web of Science (Lite)
- Other metadata sources:
- Abstract
- The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3′ ends of transcripts essential for understanding their regulation. Here we show that 22–52% of sequences in commonly used human and murine “full-length” transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine- and 86,410 human-candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3′ ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3–4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3′ ends in existing collections, enumeration of potential alternative 3′ polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
- Authors
- Enrique M Muro
- Robert Herrington
- Salima Janmohamed
- Catherine Frelin
- Miguel A Andrade-Navarro
- Norman N Iscove
- DOI
- 10.1073/pnas.0807813105
- eISSN
- 1091-6490
- ISSN
- 0027-8424
- Issue
- 51
- Journal
- Proceedings of the National Academy of Sciences
- Language
- en
- Online publication date
- 2008
- Pagination
- 20286 - 20290
- Publication date
- 2008
- Status
- Published
- Publisher
- Proceedings of the National Academy of Sciences
- Publisher URL
- http://dx.doi.org/10.1073/pnas.0807813105
- Date of data capture
- 2022
- Title
- Identification of gene 3′ ends by automated EST cluster analysis
- Volume
- 105
Data source: Crossref
- Abstract
- The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3' ends of transcripts essential for understanding their regulation. Here we show that 22-52% of sequences in commonly used human and murine "full-length" transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine- and 86,410 human-candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3' ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3-4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3' ends in existing collections, enumeration of potential alternative 3' polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
- Addresses
- Ottawa Health Research Institute, 501 Smyth Road, Ottawa, ON, Canada K1H 8L6.
- Authors
- Enrique M Muro
- Robert Herrington
- Salima Janmohamed
- Catherine Frelin
- Miguel A Andrade-Navarro
- Norman N Iscove
- DOI
- 10.1073/pnas.0807813105
- eISSN
- 1091-6490
- External identifiers
- PubMed Identifier: 19095794
- PubMed Central ID: PMC2629301
- Open access
- false
- ISSN
- 0027-8424
- Issue
- 51
- Journal
- Proceedings of the National Academy of Sciences of the United States of America
- Keywords
- Animals
- Humans
- Mice
- Poly A
- Cluster Analysis
- Methods
- Polyadenylation
- 3' Flanking Region
- Expressed Sequence Tags
- Language
- eng
- Medium
- Print-Electronic
- Online publication date
- 2008
- Pagination
- 20286 - 20290
- Publication date
- 2008
- Status
- Published
- Date of data capture
- 2008
- Title
- Identification of gene 3' ends by automated EST cluster analysis.
- Sub types
- Research Support, Non-U.S. Gov't
- research-article
- Journal Article
- Volume
- 105
Files
- http://www.pnas.org/content/105/51/20286.full.pdf
- http://www.pnas.org/cgi/reprint/105/51/20286.pdf
- https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/19095794/pdf/?tool=EBI
- https://europepmc.org/articles/PMC2629301?pdf=render
Data source: Europe PubMed Central
- Abstract
- The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3' ends of transcripts essential for understanding their regulation. Here we show that 22-52% of sequences in commonly used human and murine "full-length" transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine- and 86,410 human-candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3' ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3-4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3' ends in existing collections, enumeration of potential alternative 3' polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
- Authors
- Enrique M Muro
- Robert Herrington
- Salima Janmohamed
- Catherine Frelin
- Miguel A Andrade-Navarro
- Norman N Iscove
- Author URL
- https://www.ncbi.nlm.nih.gov/pubmed/19095794
- DOI
- 10.1073/pnas.0807813105
- eISSN
- 1091-6490
- External identifiers
- PubMed Central ID: PMC2629301
- Issue
- 51
- Journal
- Proc Natl Acad Sci U S A
- Keywords
- 3' Flanking Region
- Animals
- Cluster Analysis
- Expressed Sequence Tags
- Humans
- Methods
- Mice
- Poly A
- Polyadenylation
- Language
- eng
- Country
- United States
- Pagination
- 20286 - 20290
- PII
- 0807813105
- Publication date
- 2008
- Status
- Published
- Date the record was made public
- 2009
- Title
- Identification of gene 3' ends by automated EST cluster analysis.
- Sub types
- Journal Article
- Research Support, Non-U.S. Gov't
- Volume
- 105
Data source: PubMed
- Relationships:
- Property of