Can emojis be used to capture meaning?

Description

This project investigates whether emojis can serve as a new modality for representing English text. We propose training a transformer model on the ELCo dataset, which pairs English phrases with emoji sequences, to predict emoji sequences from text. The goal is to evaluate whether the emoji space, consisting of 3,790 Unicode emojis, can capture the semantic richness of the English language. We also explore whether emojis can be treated as a low-resource modality whose small, fixed vocabulary could give language models a more efficient way to learn and represent concepts.
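
One way the evaluation step could be sketched is to score predicted emoji sequences against reference sequences. The snippet below is a minimal illustration, not the project's actual method: the function names `emoji_f1` and `evaluate` are hypothetical, the choice of exact match plus an order-insensitive multiset F1 is an assumption, and the example pairs are made up rather than drawn from ELCo. Each sequence is assumed to be already split into a list of individual emoji strings.

```python
from collections import Counter


def emoji_f1(predicted, reference):
    """Order-insensitive multiset F1 between two emoji sequences."""
    if not predicted or not reference:
        return 0.0
    # Counter intersection keeps the minimum count of each shared emoji.
    overlap = sum((Counter(predicted) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs):
    """Return (exact-match rate, mean F1) over (predicted, reference) pairs."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    exact = sum(pred == ref for pred, ref in pairs) / len(pairs)
    mean_f1 = sum(emoji_f1(pred, ref) for pred, ref in pairs) / len(pairs)
    return exact, mean_f1


# Made-up illustrative pairs (assumption: NOT taken from the ELCo dataset).
pairs = [
    (["🔥", "🚒"], ["🔥", "🚒"]),  # identical sequences
    (["💧", "☔"], ["☔"]),         # one extra predicted emoji
    (["🐶"], ["🐱"]),              # no overlap
]

exact_rate, mean_f1 = evaluate(pairs)
print(f"exact match: {exact_rate:.3f}, mean F1: {mean_f1:.3f}")
```

Exact match is strict and penalizes any deviation, while the multiset F1 gives partial credit when a prediction shares some emojis with the reference; reporting both is one plausible way to see how much of a phrase's meaning the predicted emojis recover.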