Experience
My professional background spans ML research, backend engineering, and
systems infrastructure across academia, defense, and industry.
Research Assistant (Data Engineering & ML)
March 2023 – Present
Studio Lab - CU Boulder → Princeton University
- Political Data Pipeline (ETL): Engineered a cloud-hybrid pipeline to ingest 40+ GB of election speeches (Modi & Gandhi, 2014/2019) and archive 40+ years (1981–2024) of Lok Sabha debates, creating the largest unified dataset for Indian political linguistic analysis
- ML & Cloud Optimization: Architected an automated AWS workflow (S3, Transcribe, Translate) using Boto3, implementing Custom Language Models (CLM) to recognize niche political entities and reduce word error rate (WER)
- Web Scraping System: Developed a resumable Selenium crawler with SQLite state management to index the Parliament Digital Library, implementing logic to handle dynamic pagination and sync with OneDrive storage
- Unstructured Data Parsing: Designed a text extraction engine using PyMuPDF and FuzzyWuzzy (string matching) to structure thousands of raw PDF statements, mapping OCR text to standardized Ministry entities
Systems Integration Engineer
May 2025 – Present
University of Colorado Boulder - Institute of Behavioral Science
- Developing a full-stack ticket classification system using FastAPI and PostgreSQL, architecting a microservices-based solution to automate incident tagging via a fine-tuned BERT transformer model
- Developed an interactive analytics dashboard using React (Vite) and Recharts, utilizing complex state management to visualize historical data and identify critical operational trends, such as pinpointing peak ticket volume (Tuesdays at 10 AM) to proactively optimize staffing schedules
- Engineered a PowerShell automation tool to recursively scan the IBS OU under Colorado AD and purge group memberships, reducing per-user offboarding time by 93% (15 mins to <1 min)
- Enforced Secure Compute compliance standards across 50+ endpoints by implementing Windows Autopilot and Jamf Pro enrollment workflows
Undergraduate Research Assistant - Satellite Telemetry Data Analysis
August 2024 – May 2025
The Data Mine – Purdue University @ L3Harris
- Presented ML cybersecurity research findings at the Data Mine of the Rockies Symposium to stakeholders from the US Space Force, Lockheed Martin, CrowdStrike, and L3Harris
- Trained Isolation Forest and LSTM models on synthetically generated satellite telemetry data from the NASA Operational Simulator for Small Satellites (NOS3) to detect and classify cyberattack anomalies in real time
- Leveraged Wireshark for network traffic analysis to identify victim devices and indicators of compromise within simulated satellite communication channels
- Mapped potential space-cyber threats against the SPARTA Matrix and MITRE ATT&CK frameworks, generating synthetic telemetry scenarios to validate model robustness across attack vectors
Lead Technical Research Assistant
March 2023 – May 2025
University of Colorado Boulder - Institute of Behavioral Science
- Computer Vision Pipeline (PyTorch/ResNet50): Collaborated with PhD researchers in weekly Agile sprints to architect a visual bias analysis pipeline for NYT COVID-19 imagery. Engineered a custom preprocessing workflow using OpenCV and scikit-image to perform Z-score normalization and channel-wise intensity rescaling
- Unsupervised Learning: Applied transfer learning by deploying a truncated ResNet50 model to extract high-dimensional feature embeddings, which were fed into K-Means clustering to uncover latent patterns in media datasets without relying on labeled data
- ETL Architecture: Engineered a resilient data pipeline to aggregate 15+ years of legislative data. Built a custom Selenium and BeautifulSoup scraper to navigate dynamic DOM elements, implementing JSON checkpointing to preserve data integrity during long-running jobs
- API Optimization: Developed a Python wrapper for the LegiScan API with in-memory caching and rate-limit handling, reducing redundant network requests by 40% during bulk data ingestion