Real-Time NLP Detection Engine (HateBlocker)

HateBlocker pairs a Chrome extension (content scripts written in JavaScript, HTML, and CSS) with a FastAPI backend to score text on web pages and hide hateful or offensive content, much like an ad blocker for toxic language. Training used the Davidson Hate Speech and Offensive Language dataset (~24,800 labeled tweets); the feature set stacks 5,000-dimensional TF-IDF vectors with linguistic metadata (tweet length, punctuation, hashtags). On the three-way hateful / offensive / neither task, BERT sequence classification reached 91.5% accuracy and XGBoost reached 90.72%, with SVM, logistic regression, random forest, and naive Bayes serving as baselines.
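
The feature stacking described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names `metadata_features` and `stack_features` are hypothetical, the TF-IDF row is assumed to be computed elsewhere (it would have 5,000 entries rather than the two shown here), and the exact definitions of "length" and "punctuation" are assumptions.

```python
import re
import string
from typing import List


def metadata_features(text: str) -> List[float]:
    """Return [character length, punctuation count, hashtag count].

    Assumption: length is measured in characters, and '#' is counted
    as punctuation as well as the start of a hashtag.
    """
    length = len(text)
    punctuation = sum(1 for ch in text if ch in string.punctuation)
    hashtags = len(re.findall(r"#\w+", text))
    return [float(length), float(punctuation), float(hashtags)]


def stack_features(tfidf_row: List[float], text: str) -> List[float]:
    """Append the metadata features to an already computed TF-IDF row."""
    return list(tfidf_row) + metadata_features(text)


if __name__ == "__main__":
    # Toy two-entry TF-IDF row standing in for the 5,000-dimensional vector.
    print(stack_features([0.0, 0.5], "So #tired of this!!"))
```

Appending the metadata as extra columns lets a single classifier see both the lexical and the stylistic signal. In practice the metadata columns would likely need scaling, since raw character counts are much larger than TF-IDF weights.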

Technologies Used