Dataset. 2023

Sound and music recommendation with knowledge graphs [dataset]

CORA.Repositori de Dades de Recerca
doi:10.34810/data444
  • Oramas, Sergio
  • Ostuni, Vito Claudio
  • Vigliensoni, Gabriel
Music Recommendation Dataset (KGRec-music)

Number of items: 8,640. Number of users: 5,199. Number of item-user interactions: 751,531.

All the data comes from the songfacts.com and last.fm websites. Items are songs, each described by a textual description extracted from songfacts.com and by tags from last.fm.

Files and folders in the dataset:
  • /descriptions: one file per item, containing the textual description of the item. The file name is the item id plus the ".txt" extension.
  • /tags: one file per item, containing the tags of the item separated by spaces. Multiword tags are joined with "-". The file name is the item id plus the ".txt" extension. Not all items have tags: 401 items have none.
  • implicit_lf_dataset.txt: the interactions between users and items. There is one line per interaction (a user who listened to a song in this case). Fields are separated by tabs and each line ends with a newline: user_id \t sound_id \t 1 \n, where sound_id holds the id of the item (a song).

Sound Recommendation Dataset (KGRec-sound)

Number of items: 21,552. Number of users: 20,000. Number of item-user interactions: 2,117,698.

All the data comes from Freesound.org. Items are sounds, each described by a textual description and by tags created by the sound creator at upload time.

Files and folders in the dataset:
  • /descriptions: one file per item, containing the textual description of the item. The file name is the item id plus the ".txt" extension.
  • /tags: one file per item, containing the tags of the item separated by spaces. The file name is the item id plus the ".txt" extension.
  • downloads_fs_dataset.txt: the interactions between users and items. There is one line per interaction (a user who downloaded a sound in this case). Fields are separated by tabs and each line ends with a newline: user_id \t sound_id \t 1 \n.
Two different datasets are provided, each with users, items, implicit feedback interactions between users and items, item tags, and item text descriptions: one for Music Recommendation (KGRec-music) and the other for Sound Recommendation (KGRec-sound).
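
The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not code shipped with the dataset: the function names (parse_interactions, parse_tags, load_tags), the UTF-8 encoding, and the choice to replace "-" with a space in multiword tags are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def parse_interactions(lines):
    """Group tab-separated 'user_id, item_id, 1' lines by user.

    Returns a dict mapping each user id to the set of item ids
    that user interacted with. Ids are kept as strings.
    """
    by_user = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip malformed lines rather than guessing
        user_id, item_id = parts[0], parts[1]
        by_user[user_id].add(item_id)
    return dict(by_user)


def parse_tags(text, multiword_sep=None):
    """Split the body of a tags file on whitespace.

    For KGRec-music, pass multiword_sep="-" to turn 'hard-rock'
    into 'hard rock'. The description does not state a multiword
    convention for KGRec-sound, so hyphens are left alone by default.
    """
    tags = text.split()
    if multiword_sep:
        tags = [tag.replace(multiword_sep, " ") for tag in tags]
    return tags


def load_tags(tags_dir, multiword_sep=None):
    """Read every '<item_id>.txt' file in tags_dir into {item_id: [tags]}.

    Assumes UTF-8 files (an assumption, not stated in the description).
    """
    result = {}
    for path in Path(tags_dir).glob("*.txt"):
        text = path.read_text(encoding="utf-8")
        result[path.stem] = parse_tags(text, multiword_sep)
    return result
```

For example, parse_interactions(open("implicit_lf_dataset.txt", encoding="utf-8")) would yield one entry per user, and load_tags("tags", multiword_sep="-") would cover the KGRec-music tags folder. Items that have no tags (401 in KGRec-music) would simply be missing from the result or map to an empty list, depending on whether an empty file exists for them.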
 
DOI: https://doi.org/10.34810/data444

Recercat. Dipósit de la Recerca de Catalunya
oai:recercat.cat:2072/336257
Scientific article.

SOUND AND MUSIC RECOMMENDATION WITH KNOWLEDGE GRAPHS

  • Oramas, Sergio
  • Ostuni, Vito Claudio
  • Di Noia, Tommaso
  • Serra, Xavier
  • Di Sciascio, Eugenio
The Web has moved, slowly but steadily, from a collection of documents towards a collection of structured data. Knowledge graphs have emerged as a way of representing the knowledge encoded in such data, as well as a tool for reasoning over it in order to extract new and implicit information. Knowledge graphs are currently used, for example, to explain search results, to explore knowledge spaces, to semantically enrich textual documents, or to feed knowledge-intensive applications such as recommender systems. In this work, we describe how to create and exploit a knowledge graph to supply a hybrid recommendation engine with information that builds on top of a collection of documents describing musical and sound items. Tags and textual descriptions are exploited to extract and link entities to external graphs such as WordNet and DBpedia, which are in turn used to semantically enrich the initial data. By means of the knowledge graph we build, recommendations are computed using a feature combination hybrid approach. Two explicit graph feature mappings are formulated to obtain meaningful item feature representations able to capture the knowledge embedded in the graph. Those content features are further combined with additional collaborative information derived from implicit user feedback. An extensive evaluation on historical data is performed over two different datasets: a dataset of sounds composed of tags, textual descriptions, and users’ download information gathered from Freesound.org, and a dataset of songs that mixes song textual descriptions with tags and users’ listening habits extracted from Songfacts.com and Last.fm, respectively. Results show significant improvements with respect to state-of-the-art collaborative algorithms in both datasets. In addition, we show how the semantic expansion of the initial descriptions helps in achieving much better recommendation quality in terms of aggregated diversity and novelty.



