Optimization of machine learning techniques for the determination of clinical parameters in dried human serum samples from FTIR spectroscopic data


Machine learning techniques are versatile tools that can be applied to a wide variety of fields. Here, we compare numerous machine learning regression methods for the analysis of FTIR spectra of human serum samples, in order to support and validate the use of vibrational spectroscopy for quantifying clinical parameters and identifying pathologies or altered states. To this end, we systematically analysed the prediction of six clinical parameters: Triglycerides, Cholesterol, HDL Cholesterol, Urea, Glucose and Total Proteins. Prediction is excellent with Partial Least Squares regression (PLSR), Neural Networks (NN) and Support Vector regression (SVR), particularly for Triglycerides, Cholesterol, HDL Cholesterol and Urea, while it is less accurate for Glucose and Total Proteins. Ensemble regression algorithms, specifically Boosting (BOOST) and Bootstrap Aggregation (BAG) applied to these base learners and to Decision Trees (DT), and Random Forest (RF), do not significantly improve on the base learner results. The comparison also shows superior performance for linear regression and when the entire infrared spectrum is used, without the need to select spectral features. These results are a step towards standardizing FTIR data analysis methodology to optimize the prediction of clinical parameters. Coupled with the development of portable spectrometers, faster detectors and powerful light sources, FTIR spectroscopy can replace standard clinical testing procedures by making them faster, simpler and cheaper.


Palumbo D.; Giorni A.; Minocchi R.; Amendola R.; Cestelli Guidi M.

2022-01-01



Year: 2022

Keywords: Clinical parameters prediction; FTIR; Human serum; Machine learning; Regression

Publication type: 1.1 Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12079/70247
