VRAJ PATEL
Machine Learning

Automated Interview Scoring Engine

Multimodal pipeline fusing text embeddings and 100+ audio features to predict interview scores. MAE < 0.4, Pearson r > 0.70 with human raters.

Designed a multimodal regression pipeline that fuses dense text embeddings (sentence-transformers/all-mpnet-base-v2) with 100+ prosodic audio features such as pitch variance, speech rate, and pause density. Used cosine similarity across the turns of multi-turn dialogues to model conversational coherence. Evaluated the model with 5-fold cross-validation to estimate out-of-sample prediction error, achieving MAE < 0.4 on normalized scores and a Pearson correlation of r > 0.70 with human rater scores. Applied SHAP explainability to identify which linguistic and tonal features most strongly predicted candidate ratings across competency dimensions.
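The fusion and evaluation steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the Ridge regressor stands in for whatever model the project trained, and the helper names (`turn_coherence`, `cross_validate`) are hypothetical. The 768-dimensional text vectors match the output size of all-mpnet-base-v2, but random numbers are used here instead of real embeddings.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def turn_coherence(turn_embeddings: np.ndarray) -> float:
    """Mean cosine similarity between consecutive dialogue turns."""
    a, b = turn_embeddings[:-1], turn_embeddings[1:]
    dots = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(dots / norms))


def cross_validate(X: np.ndarray, y: np.ndarray, n_splits: int = 5, seed: int = 0):
    """Return (MAE, Pearson r) computed on out-of-fold predictions."""
    oof = np.zeros_like(y, dtype=float)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in kfold.split(X):
        # Assumption: a scaled Ridge regressor as a simple stand-in model.
        model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        model.fit(X[train_idx], y[train_idx])
        oof[test_idx] = model.predict(X[test_idx])
    mae = mean_absolute_error(y, oof)
    r = np.corrcoef(y, oof)[0, 1]
    return float(mae), float(r)


# Synthetic stand-ins for real interview data (all sizes are assumptions).
rng = np.random.default_rng(0)
n_interviews, n_turns, text_dim, audio_dim = 200, 6, 768, 100

turns = rng.normal(size=(n_interviews, n_turns, text_dim))  # per-turn text embeddings
text_emb = turns.mean(axis=1)                               # one vector per interview
audio = rng.normal(size=(n_interviews, audio_dim))          # prosodic features
coherence = np.array([turn_coherence(t) for t in turns])[:, None]

# Early fusion: concatenate all feature groups into one matrix.
X = np.hstack([text_emb, audio, coherence])

# Synthetic target driven by a few audio features plus noise.
y = audio[:, :3] @ np.array([0.5, -0.3, 0.2]) + 0.1 * rng.normal(size=n_interviews)

mae, r = cross_validate(X, y)
print(f"5-fold out-of-fold MAE: {mae:.3f}, Pearson r: {r:.3f}")
```

Scoring on out-of-fold predictions means every interview is evaluated by a model that never saw it during training, which is what makes the reported MAE and Pearson r an estimate of performance on unseen candidates.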

Technologies Used

SentenceTransformers, TensorFlow, scikit-learn, SHAP