e-cienciaDatos, Repositorio de Datos del Consorcio Madroño
doi:10.21950/LBNLGA
Dataset. 2024

ART-GENEVALGPT

  • D'Haro Enríquez, Luis Fernando
  • Gil Martín, Manuel
  • Luna Jiménez, Cristina
  • Esteban Romero, Sergio
  • Estecha Garitagoitia, Marcos
  • Bellver Soler, Jaime
  • Fernández Martínez, Fernando

Description of the project

ASTOUND is an EIC-funded project (No. 101071191) under the HORIZON-EIC-2021-PATHFINDERCHALLENGES-01 call.

The aim of the project is to develop an artificially conscious AI based on the Attention Schema Theory (AST) proposed by Michael Graziano. This theory proposes that consciousness arises from the brain's ability to create and maintain a simplified model of its own processing, in particular of how it focuses attention on certain aspects of its internal and external environment.

The project entails creating an AI system capable of exhibiting consciousness-like behaviours by implementing principles from the AST. This involves constructing a model that simulates attentional processes, allowing the AI to prioritise and focus on relevant information while disregarding irrelevant stimuli.

The ASTOUND project will provide an Integrative Approach for Awareness Engineering to establish consciousness in machines, targeting the following goals:

  • Develop an AI architecture for Artificial Consciousness based on the Attention Schema Theory (AST) through an internal model of the state of attention.
  • Implement the proposed architecture in a contextually aware virtual agent and demonstrate improved performance thanks to the Attention Schema, for instance through coherent discussion, self-regulation, short- and long-term memory, and personalisation capabilities.
  • Define novel ways to measure the presence and level of consciousness in both humans and machines.

Description of the dataset

The dataset includes synthetic dialogues in the art domain that can be used for training a chatbot to discuss artworks within a museum setting. Leveraging Large Language Models (LLMs), particularly ChatGPT, the dataset comprises over 13,000 dialogues generated using prompt-engineering techniques. The dialogues cover a wide range of user and chatbot behaviours, including expert guidance, tutoring, and handling toxic user interactions.

The ArtEmis dataset serves as a basis, containing emotion attributions and explanations for artworks sourced from the WikiArt website. From this dataset, 800 artworks were selected based on consensus among human annotators regarding elicited emotions, ensuring balanced representation across different emotions. However, because the selection prioritised emotional balance, the distribution of art styles is imbalanced.
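The selection step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the agreement threshold, the per-emotion quota, the function names and the toy annotations are all assumptions made for illustration. It keeps an artwork only when one emotion has a strict majority among its annotators, then caps the number of artworks per emotion.

```python
from collections import Counter, defaultdict


def majority_emotion(labels, min_agreement=0.5):
    """Return the most common label if its share exceeds min_agreement, else None.

    min_agreement is a hypothetical threshold, not a value from the dataset.
    """
    if not labels:
        return None
    emotion, count = Counter(labels).most_common(1)[0]
    return emotion if count / len(labels) > min_agreement else None


def select_balanced(annotations, per_emotion):
    """Pick up to per_emotion artworks for each emotion that reached consensus."""
    buckets = defaultdict(list)
    for artwork_id, labels in annotations.items():
        emotion = majority_emotion(labels)
        if emotion is not None:
            buckets[emotion].append(artwork_id)
    # Sort ids so the selection is deterministic.
    return {emotion: sorted(ids)[:per_emotion] for emotion, ids in buckets.items()}


# Toy annotations (hypothetical ids and labels, not taken from ArtEmis).
toy_annotations = {
    "artwork_a": ["awe", "awe", "awe", "fear"],
    "artwork_b": ["fear", "fear", "sadness"],
    "artwork_c": ["awe", "fear", "sadness", "amusement"],  # no consensus
    "artwork_d": ["awe", "awe", "contentment"],
    "artwork_e": ["awe", "awe", "awe"],
}
selection = select_balanced(toy_annotations, per_emotion=2)
```

Capping each emotion at the same quota balances the emotions but does nothing to balance art styles, which is consistent with the imbalance noted above.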

Each dialogue is uniquely identified using a "DIALOGUE_ID", encoding information about the artwork discussed, emotions, chatbot behaviour, and more. The dataset is structured into multiple files for efficient navigation and analysis, including metadata, prompts, dialogues, and metrics.

The generated dialogues were evaluated objectively, focusing on profile discrimination, anthropic behaviour detection, and toxicity evaluation. Various syntactic and semantic metrics were used to assess dialogue quality, along with sentiment and subjectivity analysis. Tools such as the MS Azure Content Moderator API, the Detoxify library, and LlamaGuard supported the toxicity evaluation.
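As a rough illustration of turn-level toxicity screening, the sketch below flags dialogue turns whose score reaches a threshold. The keyword scorer is a toy stand-in so the example runs without external models; it is not how the tools named above work, and the threshold, function names and sample dialogue are assumptions. A real scorer, for example one wrapping the Detoxify library, could be passed in through the `scorer` parameter instead.

```python
def keyword_scorer(text):
    """Toy stand-in for a classifier: 1.0 if a listed word appears, else 0.0."""
    flagged_words = {"idiot", "stupid"}  # hypothetical word list
    words = {w.strip(".,!?").lower() for w in text.split()}
    return 1.0 if words & flagged_words else 0.0


def flag_toxic_turns(turns, scorer=keyword_scorer, threshold=0.5):
    """Return the indices of turns whose score is at or above threshold."""
    return [i for i, turn in enumerate(turns) if scorer(turn) >= threshold]


# Hypothetical dialogue, not taken from the dataset.
dialogue = [
    "What can you tell me about this painting?",
    "It is an Impressionist landscape.",
    "That is a stupid answer!",
]
toxic_turns = flag_toxic_turns(dialogue)
```

Keeping the scorer pluggable makes it easy to compare several detectors on the same dialogues, which is one plausible reason to use more than one tool.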

The concluding remarks on the dataset highlight the need for further work on handling biases, enhancing toxicity detection, and incorporating multimodal information and contextual awareness. Future efforts will focus on expanding the dataset with additional tasks and improving chatbot capabilities for diverse scenarios.


Project: EC/HE/101071191
DOI: https://doi.org/10.21950/LBNLGA