Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape
- Publication type:
- Journal article
- Metadata:
- Authors
- Pablo Mier
- Miguel A Andrade-Navarro
- Author URL
- https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=fis-test-1&SrcAuth=WosAPI&KeyUT=WOS:000874658200014&DestLinkType=FullRecord&DestApp=WOS_CPL
- DOI
- 10.1016/j.csbj.2022.09.011
- External identifiers
- Clarivate Analytics Document Solution ID: 5R7AF
- PubMed Identifier: 36249567
- ISSN
- 2001-0370
- Journal
- Computational and Structural Biotechnology Journal
- Keywords
- Protein sequence analysis
- Low complexity regions
- Linear motifs
- polyXY
- Pagination
- 5516 - 5523
- Publication date
- 2022
- Status
- Published
- Title
- Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape
- Sub types
- Article
- Journal volume
- 20
Data source: Web of Science (Lite)
- Other metadata sources:
- Authors
- Pablo Mier
- Miguel A Andrade-Navarro
- DOI
- 10.1016/j.csbj.2022.09.011
- ISSN
- 2001-0370
- Journal
- Computational and Structural Biotechnology Journal
- Language
- en
- Pagination
- 5516 - 5523
- Publication date
- 2022
- Status
- Published
- Publisher
- Elsevier BV
- Publisher URL
- http://dx.doi.org/10.1016/j.csbj.2022.09.011
- Date of data collection
- 2024
- Title
- Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape
- Journal volume
- 20
Data source: Crossref
- Abstract
- Low complexity regions (LCRs) differ in amino acid composition from the background provided by the corresponding proteomes. The simplest LCRs are homorepeats (or polyX), regions composed mostly of one amino acid type. Extensive research has been done to characterize homorepeats, and their taxonomic, functional and structural features depend on the amino acid type and sequence context. The next step beyond homorepeats in the study of LCRs is the regions composed of two types of amino acids, which we call polyXY. We classify polyXY into three categories based on the arrangement of the two amino acid types 'X' and 'Y': direpeats (e.g. 'XYXYXY'), joined (e.g. 'XXXYYY') and shuffled (e.g. 'XYYXXY'). We developed a script to search for polyXY, and located them in a comprehensive set of 20,340 reference proteomes. These results are available in a dedicated web server called XYs, in which users can also submit their own protein datasets to detect polyXY. We studied the distribution of polyXY types by amino acid pair XY and category, and show that polyXY in Eukaryota are mainly located within intrinsically disordered regions. Our study provides a first step towards the characterization of polyXY as protein motifs.
- Addresses
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany.
- Authors
- Pablo Mier
- Miguel A Andrade-Navarro
- DOI
- 10.1016/j.csbj.2022.09.011
- eISSN
- 2001-0370
- External identifiers
- PubMed Identifier: 36249567
- PubMed Central ID: PMC9550522
- Open access
- true
- ISSN
- 2001-0370
- Journal
- Computational and Structural Biotechnology Journal
- Language
- eng
- Medium
- Electronic-eCollection
- Online publication date
- 2022
- Open access status
- Open Access
- Pagination
- 5516 - 5523
- Publication date
- 2022
- Status
- Published
- Publisher licence
- CC BY
- Date of data collection
- 2022
- Title
- Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape.
- Sub types
- research-article
- Journal Article
- Journal volume
- 20
Files
https://europepmc.org/articles/PMC9550522?pdf=render
Data source: Europe PubMed Central
- Date of acceptance
- 2022
- Authors
- Pablo Mier
- Miguel A Andrade-Navarro
- Author URL
- https://www.ncbi.nlm.nih.gov/pubmed/36249567
- DOI
- 10.1016/j.csbj.2022.09.011
- External identifiers
- PubMed Central ID: PMC9550522
- ISSN
- 2001-0370
- Journal
- Comput Struct Biotechnol J
- Keywords
- Linear motifs
- Low complexity regions
- Protein sequence analysis
- polyXY
- Language
- eng
- Country
- Netherlands
- Pagination
- 5516 - 5523
- PII
- S2001-0370(22)00416-0
- Publication date
- 2022
- Status
- Published online
- Title
- Regions with two amino acids in protein sequences: A step forward from homorepeats into the low complexity landscape.
- Sub types
- Journal Article
- Journal volume
- 20
Data source: PubMed
- Relationships:
- Property of