VRAJ PATEL
Machine Learning

End-to-End Disease Prediction Pipeline

Diabetes risk modeling on 400K+ BRFSS survey responses. Compared SVM, random forest, and neural networks, with a focus on class imbalance and the recall-precision tradeoff.

The project models diabetes risk on BRFSS 2015 tabular survey data (400,000+ U.S. respondents). It compares SVM, decision trees, random forests, logistic regression, and neural networks tuned with Keras Tuner, paying explicit attention to class imbalance and to the recall-precision tradeoff that matters for screening. After tuning, accuracy was about 75.3% for logistic regression, 74.8% for SVM, and 74.3% for the decision tree. GenHealth, BMI, age, high blood pressure, and income surfaced as consistently important risk factors across models.
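To make the imbalance and recall-precision points concrete, here is a minimal sketch in plain Python. It is not the project's code: the labels and scores are made-up toy values (not BRFSS data), and the helper names `accuracy` and `precision_recall` are hypothetical. It shows why accuracy alone can mislead on an imbalanced target, and how moving the decision threshold trades precision for recall.

```python
from typing import List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(1 for y, p in zip(y_true, y_pred) if y == p) / len(y_true)


def precision_recall(
    y_true: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall for the positive class at a score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (assumption, for illustration only): 3 positives among 10 people.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.40, 0.55, 0.90]

# A model that always predicts "no diabetes" still looks accurate,
# even though it finds none of the positive cases.
baseline_acc = accuracy(y_true, [0] * len(y_true))
print(f"always-negative baseline accuracy: {baseline_acc:.2f}")

# Lower thresholds flag more people: recall rises, precision falls.
for threshold in (0.3, 0.5, 0.7):
    precision, recall = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

For a screening use case, a lower threshold is often preferable because a missed case (false negative) usually costs more than a follow-up test for a false positive. The right operating point depends on those costs, which this sketch does not model.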

Technologies Used

TensorFlow, Keras Tuner, Scikit-Learn, Pandas, NumPy