Arnav Goel
Data Science · UC San Diego · Graduating June 2027
About
Senior in the BS Data Science program at UC San Diego (minor in Entrepreneurship & Innovation). My work sits where machine learning meets systems that ship: Flask microservices for a 50,000-patient hospital platform at ADA, quantitative research at Triton Quantitative Trading, and the full-stack platform for Gondilal Saraf, my family's 150-year-old jewelry business.
Education
UC San Diego
Sep 2022 — Jun 2027
BS Data Science · Minor in Entrepreneurship & Innovation (Rady School)
UC GPA 3.911 · Major GPA 3.860 · Minor GPA 3.950. Key coursework: CSE 150A (AI: Probabilistic Models), CSE 151A (ML: Learning Algorithms), CSE 151B (Deep Learning, Summer 2026), CSE 158R (Recommender Systems & Web Mining), LIGN 167 (Deep Learning for NLP), DSC 80, DSC 100, COGS 108, MATH 183, MATH 189, MGT 127R (AI & Technology Strategy).
Delhi Public School, R. K. Puram
Apr 2020 — Jun 2022
High School · Engineering Science (CBSE)
96.75% · Mathematics Society + PhySoc leadership.
Experience
Member · Triton Quantitative Trading
Apr 2025 — Present
UC San Diego
- Systematic-strategy research on tick-level market data in Python, pandas, NumPy.
- Statistical modelling, time-series analysis, and signal processing; backtests across multi-year historical data with feature engineering and parameter tuning.
- Weekly research discussions on risk management, market structure, and advanced quantitative methods.
Python · pandas · NumPy · Time-Series · Backtesting
SWE Intern · ADA
Oct 2024 — Dec 2024
Bengaluru, India
- Spearheaded backend development for a multi-region patient-management platform targeting 50,000+ patients across 120+ US, Japanese, and South Korean hospitals.
- Built and containerised Flask microservices with Docker Compose — cut onboarding and setup time by 60%.
- Developed the Nurse Panel Backend API; engineered supporting data models for onboarding, scheduling, and real-time shift tracking.
- Optimised PostgreSQL queries, automated daily ETL pipelines in Python, and architected scalable MDM tables.
- Established full-stack observability with OpenTelemetry (logs, traces, metrics) — 90% faster issue detection.
Flask · Docker · PostgreSQL · Python · OpenTelemetry · ETL · REST APIs
Cloud Engineering Intern · Espire Infolabs
Jul 2024 — Sep 2024
Hybrid
- Built KPI dashboards and real-time monitoring for enterprise digital-transformation clients.
- Integrated Azure Monitor, Log Analytics, and Workbooks, reducing manual reporting by 40%.
- Deployed AWS Lambda + CloudWatch alerting pipelines, cutting incident detection time by 30%.
Azure Monitor · AWS Lambda · CloudWatch · Dashboards
Algorithmic Trading Intern · AGS
Jun 2023 — Sep 2023
On-site
- Designed high-frequency arbitrage strategies on tick-level market data.
- Built scalable Python backtesting pipelines over 10+ years of HFT data.
- Boosted predictive accuracy by 15% via feature engineering, hyperparameter tuning, and time-series normalisation.
- Ran volatility-adjusted Monte Carlo simulations; improved overall Sharpe ratio by 12%.
Python · pandas · NumPy · Monte Carlo · Backtesting
Digital Platform Lead · Gondilal Saraf
2022 — Present
Banda, India · Remote
- Run the full-stack platform for my family's 150-year-old jewelry business (since 1873).
- Bilingual Hindi/English storefront with live gold rates, AR virtual try-on, and AI-generated product descriptions via Gemini 2.0 Flash.
- Admin ERP with POS, inventory, customer database, and barcode generation — 15 Prisma models, 26 API routes, 85 vitest tests.
- Apply data-science techniques (demand forecasting, segmentation) to a century-old traditional industry.
Full-Stack Development · Next.js · PostgreSQL · Prisma · Gemini AI · Business Strategy
Selected Projects
Solo — Chrome/Firefox/Safari extension
Cross-site video-sync extension shipping on the Chrome Web Store. WebSocket relay with heartbeat drift correction (< 0.5 s), host-mode enforcement, site-specific player adapters, 59 vitest tests + Puppeteer e2e.
Solo — full-stack platform
Three surfaces (public site + bilingual catalogue + admin ERP) in one Next.js 15 codebase. Image pipeline: Photoroom → Sharp → Gemini 2.0 Flash → Replicate SDXL. 15 Prisma models, 26 API routes, 85 tests.
Solo — AI health companion
Rant-first health logging where Claude extracts structured symptoms, medications, and mood from free-form text. Lab-report PDF parser with Zod-validated JSON output and manual fallback form. 15 Prisma models.
Solo — data-science case study
VADER sentiment on 500 YouTube comments. Net Sentiment Score +28.6 pp (2× industry benchmark), 100% hashtag discipline across a 50-video catalog, 8-chart Excel dashboard + executive-summary PDF.
With Paulina Pelayo
Analysed 1,534 major U.S. power outages (2000–2016). Random Forest with engineered features reached RMSE 6,189 min and R² 0.220, with fairness checks across weather vs. non-weather outages (p ≈ 0.007 on price-tier hypothesis test).
Team of 5 — ethics lead
Winter 2026 final project on early-season predictors of MLB playoff qualification (2015–2023), using FanGraphs data pulled via pybaseball. Proposal → EDA → final analysis across four notebooks. I led the ethics section.
Skills
Proficient
Python (pandas, NumPy) · SQL / PostgreSQL · TypeScript / React / Next.js · Flask / REST APIs · Jupyter · EDA · data cleaning · Git / GitHub
Comfortable
scikit-learn (Random Forest, permutation tests, NMAR reasoning) · TensorFlow / Keras (CNNs, transfer learning) · Matplotlib · Seaborn · Recharts · Time-series analysis · backtesting · Prisma ORM · relational schema design · Docker · containerised microservices · WebSockets · real-time sync · Swift / SwiftUI · Claude / Gemini / OpenAI APIs · Chrome Manifest V3 extensions
Familiar
PyTorch · Bayesian networks · Markov models · RL (CSE 150A) · Recommender systems · web mining (CSE 158R) · VADER · NLP fundamentals · Azure Monitor · Log Analytics · AWS Lambda · CloudWatch · OpenTelemetry (logs, traces, metrics) · Monte Carlo simulation · C (systems programming) · Linear algebra · differential equations
Certifications
Neural Networks and Deep Learning
DeepLearning.AI · Jul 2024
Machine Learning Specialization
Stanford University · Jul 2024
Google AI Essentials
Google · Jul 2024
Google Data Analytics Professional Certificate
Google · Jul 2024
Legal name: Arnav Goel; also goes by Yash Goel (email: yashgoel0304@gmail.com). F-1 international student; would require H-1B sponsorship after the STEM OPT extension. Open to new-grad Applied Scientist / ML Engineer / Data Scientist / SWE roles from summer 2027 onward.