Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning
- Publication type:
- Journal article
- Metadata:
-
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schoellhorn
- Author URL
- https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=fis-test-1&SrcAuth=WosAPI&KeyUT=WOS:000531241900001&DestLinkType=FullRecord&DestApp=WOS_CPL
- DOI
- 10.3389/fbioe.2020.00260
- External identifiers
- Clarivate Analytics Document Solution ID: LL0IU
- PubMed Identifier: 32351945
- ISSN
- 2296-4185
- Journal
- FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY
- Keywords
- gait classification
- data selection
- data processing
- ground reaction force
- multi-layer perceptron
- convolutional neural network
- support vector machine
- random forest classifier
- Article number
- ARTN 260
- Publication date
- 2020
- Status
- Published
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning
- Sub types
- Article
- Journal volume
- 8
Data source: Web of Science (Lite)
- Other metadata sources:
-
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- DOI
- 10.3389/fbioe.2020.00260
- eISSN
- 2296-4185
- Journal
- Frontiers in Bioengineering and Biotechnology
- Online publication date
- 2020
- Status
- Published online
- Publisher
- Frontiers Media SA
- Publisher URL
- http://dx.doi.org/10.3389/fbioe.2020.00260
- Date of data capture
- 2020
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning
- Journal volume
- 8
Data source: Crossref
- Abstract
- Human movements are characterized by highly non-linear and multi-dimensional interactions within the motor system. Therefore, the future of human movement analysis requires procedures that enhance the classification of movement patterns into relevant groups and support practitioners in their decisions. In this regard, the use of data-driven techniques seems to be particularly suitable to generate classification models. Recently, an increasing emphasis on machine-learning applications has led to a significant contribution, e.g., in increasing the classification performance. In order to ensure the generalizability of the machine-learning models, different data preprocessing steps are usually carried out to process the measured raw data before the classifications. In the past, various methods have been used for each of these preprocessing steps. However, there are hardly any standard procedures or rather systematic comparisons of these different methods and their impact on the classification performance. Therefore, the aim of this analysis is to compare different combinations of commonly applied data preprocessing steps and test their effects on the classification performance of gait patterns. A publicly available dataset on intra-individual changes of gait patterns was used for this analysis. Forty-two healthy participants performed 6 sessions of 15 gait trials for 1 day. For each trial, two force plates recorded the three-dimensional ground reaction forces (GRFs). The data was preprocessed with the following steps: GRF filtering, time derivative, time normalization, data reduction, weight normalization and data scaling. Subsequently, combinations of all methods from each preprocessing step were analyzed by comparing their prediction performance in a six-session classification using Support Vector Machines, Random Forest Classifiers, Multi-Layer Perceptrons, and Convolutional Neural Networks. 
The results indicate that filtering GRF data and a supervised data reduction (e.g., using Principal Components Analysis) lead to increased prediction performance of the machine-learning classifiers. Interestingly, the weight normalization and the number of data points (above a certain minimum) in the time normalization do not have a substantial effect. In conclusion, the present results provide first domain-specific recommendations for commonly applied data preprocessing methods and might help to build more comparable and more robust classification models based on machine learning that are suitable for a practical application.
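The preprocessing steps named in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the moving-average smoother stands in for whatever GRF filter the authors used, 101 points is an arbitrary choice for the time normalization, PCA via SVD is one possible data reduction, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def smooth(signal, window=5):
    """Moving-average low-pass stand-in (assumption, not the study's filter)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")

def time_normalize(signal, n_points=101):
    """Resample one curve to a fixed number of points by linear interpolation."""
    old = np.linspace(0.0, 1.0, len(signal))
    new = np.linspace(0.0, 1.0, n_points)
    return np.interp(new, old, signal)

def preprocess_trial(grf, body_weight_n, n_points=101):
    """grf: array (n_samples, 3) of forces in newtons for one trial."""
    channels = []
    for axis in range(grf.shape[1]):
        curve = smooth(grf[:, axis])             # filtering
        curve = time_normalize(curve, n_points)  # time normalization
        curve = curve / body_weight_n            # weight normalization
        channels.append(curve)
        channels.append(np.gradient(curve))      # time derivative
    return np.concatenate(channels)

def zscore(features):
    """Scale each feature to zero mean and unit variance.
    In a real evaluation, fit the statistics on training data only."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0
    return (features - mean) / std

def pca_reduce(features, n_components):
    """Project onto the leading principal components (PCA via SVD)."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in data: 20 trials of varying length, 3 force axes.
rng = np.random.default_rng(0)
trials = [rng.normal(size=(rng.integers(80, 120), 3)) * 100 + 700
          for _ in range(20)]
X = np.stack([preprocess_trial(t, body_weight_n=700.0) for t in trials])
X_scaled = zscore(X)
X_reduced = pca_reduce(X_scaled, n_components=5)
```

The resulting `X_reduced` matrix (one row per trial) is the kind of input that could then be passed to any of the classifiers listed in the abstract; the order of the steps here is one plausible arrangement, not necessarily the one evaluated in the study.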
- Addresses
- Department of Training and Movement Science, Institute of Sport Science, Johannes Gutenberg-University, Mainz, Germany.
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- DOI
- 10.3389/fbioe.2020.00260
- eISSN
- 2296-4185
- External identifiers
- PubMed Identifier: 32351945
- PubMed Central ID: PMC7174559
- Open access
- true
- ISSN
- 2296-4185
- Journal
- Frontiers in bioengineering and biotechnology
- Language
- eng
- Medium
- Electronic-eCollection
- Online publication date
- 2020
- Open access status
- Open Access
- Pagination
- 260
- Publication date
- 2020
- Status
- Published
- Publisher licence
- CC BY
- Date of data capture
- 2020
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning.
- Sub types
- research-article
- Journal Article
- Journal volume
- 8
Files
https://www.frontiersin.org/articles/10.3389/fbioe.2020.00260/pdf https://europepmc.org/articles/PMC7174559?pdf=render
Data source: Europe PubMed Central
- Date of acceptance
- 2020
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- Author URL
- https://www.ncbi.nlm.nih.gov/pubmed/32351945
- DOI
- 10.3389/fbioe.2020.00260
- External identifiers
- PubMed Central ID: PMC7174559
- ISSN
- 2296-4185
- Journal
- Front Bioeng Biotechnol
- Keywords
- convolutional neural network
- data processing
- data selection
- gait classification
- ground reaction force
- multi-layer perceptron
- random forest classifier
- support vector machine
- Language
- eng
- Country
- Switzerland
- Pagination
- 260
- Publication date
- 2020
- Status
- Published online
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning.
- Sub types
- Journal Article
- Journal volume
- 8
Data source: PubMed
- Author's licence
- CC-BY
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- Hosting institution
- Universitätsbibliothek Mainz
- Collections
- JGU-Publikationen
- Resource version
- Published version
- DOI
- 10.3389/fbioe.2020.00260
- Funding acknowledgements
- DFG, Open Access-Publizieren Universität Mainz / Universitätsmedizin Mainz
- File(s) embargoed
- false
- Open access
- true
- ISSN
- 2296-4185
- Journal
- Frontiers in Bioengineering and Biotechnology
- Keywords
- 796 Sport
- 796 Athletic and outdoor sports and games
- Language
- eng
- Open access status
- Open Access
- Pagination
- Art. 260
- Publication date
- 2020
- Public URL
- https://openscience.ub.uni-mainz.de/handle/20.500.12030/5265
- Publisher
- Frontiers Media
- Publisher URL
- https://doi.org/10.3389/fbioe.2020.00260
- Date of data capture
- 2020
- Date the record was made public
- 2020
- Access
- Public
- Title
- Systematic comparison of the influence of different data preprocessing methods on the performance of gait classifications using machine learning
- Journal volume
- 8
Files
burdack_johannes-systematic_com-20201027094017196.pdf
Data source: OPENSCIENCE.UB
- Abstract
- Human movements are characterized by highly non-linear and multi-dimensional interactions within the motor system. Recently, an increasing emphasis on machine-learning applications has led to a significant contribution to the field of gait analysis, e.g., in increasing the classification performance. In order to ensure the generalizability of the machine-learning models, different data preprocessing steps are usually carried out to process the measured raw data before the classifications. In the past, various methods have been used for each of these preprocessing steps. However, there are hardly any standard procedures or rather systematic comparisons of these different methods and their impact on the classification performance. Therefore, the aim of this analysis is to compare different combinations of commonly applied data preprocessing steps and test their effects on the classification performance of gait patterns. A publicly available dataset on intra-individual changes of gait patterns was used for this analysis. Forty-two healthy participants performed 6 sessions of 15 gait trials for 1 day. For each trial, two force plates recorded the 3D ground reaction forces (GRFs). The data was preprocessed with the following steps: GRF filtering, time derivative, time normalization, data reduction, weight normalization and data scaling. Subsequently, combinations of all methods from each preprocessing step were analyzed by comparing their prediction performance in a six-session classification using Support Vector Machines, Random Forest Classifiers, Multi-Layer Perceptrons, and Convolutional Neural Networks. In conclusion, the present results provide first domain-specific recommendations for commonly applied data preprocessing methods and might help to build more comparable and more robust classification models based on machine learning that are suitable for a practical application.
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- Author URL
- http://arxiv.org/abs/1911.04335v2
- Journal
- Front. Bioeng. Biotechnol.
- Keywords
- cs.LG
- stat.ML
- Notes
- 12 pages, 3 figures, 4 tables
- Pagination
- 260
- Publication date
- 2019
- Publisher URL
- http://dx.doi.org/10.3389/fbioe.2020.00260
- Date of data capture
- 2019
- Date the record was made public
- 2019
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning
- Journal volume
- 8
Files
1911.04335v2.pdf
Data source: arXiv
- Authors
- Johannes Burdack
- Fabian Horst
- Sven Giesselbach
- Ibrahim Hassan
- Sabrina Daffner
- Wolfgang I Schöllhorn
- DOI
- 10.3389/fbioe.2020.00260
- ISSN
- 2296-4185
- Journal
- Frontiers in bioengineering and biotechnology
- Notes
- file: http://www.ncbi.nlm.nih.gov/pubmed/32351945
- Pagination
- 260
- Publication date
- 2020
- Date of data capture
- 2021
- Title
- Systematic Comparison of the Influence of Different Data Preprocessing Methods on the Performance of Gait Classifications Using Machine Learning
- Sub types
- article
- Journal volume
- 8
Data source: Manual
- Relationships:
- Owned by