Optimization of machine learning techniques for the determination of clinical parameters in dried human serum samples from FTIR spectroscopic data

Palumbo D.; Giorni A.; Minocchi R.; Amendola R.
2022-01-01

Abstract

Machine learning techniques are versatile tools that can be applied to a wide variety of fields. Here, numerous machine learning regression methods are compared for the analysis of FTIR spectra of human serum samples, in order to support and validate the use of vibrational spectroscopy for quantifying clinical parameters and identifying pathologies or altered states. To this end, we systematically analysed the prediction of six clinical parameters: Triglycerides, Cholesterol, HDL Cholesterol, Urea, Glucose and Total Proteins. Prediction is excellent with Partial Least Squares regression (PLSR), Neural Networks (NN) and Support Vector regression (SVR), particularly for Triglycerides, Cholesterol, HDL Cholesterol and Urea, while it is less accurate for Glucose and Total Proteins. The ensemble regression algorithms, specifically Boosting (BOOST) and Bootstrap Aggregation (BAG) applied to these base learners and to Decision Trees (DT) and Random Forest (RF), do not significantly improve on the base learner results. The comparison also shows superior performance for linear regression and when the entire infrared spectrum is used, without the need to select spectral features. These results are a step towards standardizing the FTIR data analysis methodology to optimize the prediction of clinical parameters. Coupled with the development of portable spectrometers, faster detectors and powerful light sources, FTIR spectroscopy can replace standard clinical testing procedures, making them faster, simpler and less expensive.
2022
Clinical parameters prediction
FTIR
Human serum
Machine learning
Regression


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12079/70247