16: NLP model on the News Headlines Dataset For Sarcasm Detection

poster

Description

Sarcasm is an element of speech that is particularly hard for neurodivergent individuals to decipher. To help, we built a model that detects sarcasm in text. Along the way, we explored augmenting our dataset by swapping synonyms and extending it using LLMs. We experimented with feature engineering techniques such as Word2Vec (W2V) and TF, and tried several models before settling on DistilBERT. Overall, we achieved a macro-F1 score of 0.94.
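To make two of the ideas above concrete, here is a minimal sketch of synonym-swap augmentation and of how a macro-F1 score is computed. It is an illustration under stated assumptions, not our actual pipeline: the `SYNONYMS` table and the helper names `swap_synonyms` and `macro_f1` are hypothetical, and a real setup would draw synonyms from a proper lexical resource.

```python
import random

# Hypothetical synonym table, for illustration only; the synonym source
# used in the project is not specified here.
SYNONYMS = {
    "big": ["large", "huge"],
    "says": ["claims", "states"],
}


def swap_synonyms(headline, synonyms, rng):
    """Return a copy of the headline where every word found in the
    synonym table is replaced by a randomly chosen synonym."""
    words = headline.split()
    return " ".join(
        rng.choice(synonyms[w]) if w in synonyms else w for w in words
    )


def macro_f1(y_true, y_pred):
    """Macro-F1: the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    rng = random.Random(0)
    print(swap_synonyms("man says big thing", SYNONYMS, rng))
    # Toy labels: 1 = sarcastic, 0 = not sarcastic
    print(round(macro_f1([1, 1, 0, 0], [1, 0, 0, 0]), 4))
```

Because macro-F1 averages the per-class scores without weighting by class size, a model cannot score well by doing well on only one of the two classes.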

The experiment doesn’t end here: we’ve built a telebot to collect more labelled data from users, so that we can keep growing our corpus and improving the accuracy of our model.