16: NLP model on the News Headlines Dataset For Sarcasm Detection

Description

Sarcasm is an element of speech that is particularly hard for neurodivergent individuals to decipher. To help, we built a model that detects sarcasm in text. Along the way, we explored augmenting our dataset by swapping synonyms and extending it using LLMs. We experimented with feature engineering techniques such as W2V and TF, and with several models, before settling on DistilBERT. Overall, we achieved a macro-F1 score of 0.94.

The experiment doesn’t end here: we’ve built a telebot to collect more labelled data from users, so that we can keep growing our corpus and improving the accuracy of our model.
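
For readers unfamiliar with the metric reported above, macro-F1 is the unweighted mean of the per-class F1 scores, so the sarcastic and non-sarcastic classes count equally regardless of how many headlines fall into each. The snippet below is a minimal pure-Python sketch of that calculation. The function name `macro_f1` and the toy labels are our own illustrative assumptions, not the project's evaluation code or data.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels invented for illustration: 1 = sarcastic, 0 = not sarcastic.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
print(macro_f1(y_true, y_pred))
```

In practice, a library routine such as scikit-learn's `f1_score` with `average="macro"` would typically be used instead of a hand-written version.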

Project Members

goh jun yi
fong yih jie
daniel kok
eugene chia
lin chieh

Media Links

View Homepage View Poster