CognitiveNet: Enriching Foundation Models with Emotions and Awareness

Chinnici M.;
2023-01-01

Abstract

Foundation models are attracting considerable interest for their capacity to solve many downstream tasks without fine-tuning parameters on specific datasets. These models can also connect visual and linguistic representations through image-text contrastive learning. Such abilities allow an artificial agent to act similarly to a human, but important cognitive processes have yet to be incorporated into the learning process. The present study takes a step toward more human-like artificial intelligence by introducing CognitiveNet, a learnable architecture that integrates foundation models. Building on recent studies in the field of Artificial Consciousness, a hierarchy of cognitive layers was modeled and pre-trained to estimate the emotional content of images. With CLIP as the backbone model, the architecture produced significant concordant emotional activity. Furthermore, the proposed model surpasses the accuracy of CLIP in classifying the CIFAR-10 and CIFAR-100 datasets through supervised optimization, suggesting that CognitiveNet is a promising solution for classification tasks through online meta-learning.
2023
artificial emotion
awareness
computational consciousness
computer vision
foundation models
meta-learning
online learning

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12079/78187