Arnav Goel

Data Science · UC San Diego · Graduating June 2027

About

Senior in the BS Data Science program at UC San Diego (minor in Entrepreneurship & Innovation). My work sits where machine learning meets production systems: Flask microservices for a 50,000-patient hospital platform at ADA, quantitative research at Triton Quantitative Trading, and the full-stack platform for Gondilal Saraf, my family's 150-year-old jewelry business.

Education

UC San Diego

Sep 2022 — Jun 2027

BS Data Science · Minor in Entrepreneurship & Innovation (Rady School)

UC GPA 3.911 · Major GPA 3.860 · Minor GPA 3.950. Key coursework: CSE 150A (AI: Probabilistic Models), CSE 151A (ML: Learning Algorithms), CSE 151B (Deep Learning, Summer 2026), CSE 158R (Recommender Systems & Web Mining), LIGN 167 (Deep Learning for NLP), DSC 80, DSC 100, COGS 108, MATH 183, MATH 189, MGT 127R (AI & Technology Strategy).

Delhi Public School, R. K. Puram

Apr 2020 — Jun 2022

High School · Engineering Science (CBSE)

96.75% · Mathematics Society + PhySoc leadership.

Experience

Member · Triton Quantitative Trading

Apr 2025 — Present

UC San Diego

  • Systematic-strategy research on tick-level market data in Python, pandas, NumPy.
  • Statistical modelling, time-series analysis, and signal processing; backtests across multi-year historical data with feature engineering and parameter tuning.
  • Weekly research discussions on risk management, market structure, and advanced quantitative methods.

Python · pandas · NumPy · Time-Series · Backtesting

SWE Intern · ADA

Oct 2024 — Dec 2024

Bengaluru, India

  • Spearheaded backend development for a multi-region patient-management platform targeting 50,000+ patients across 120+ US, Japanese, and South Korean hospitals.
  • Built and containerised Flask microservices with Docker Compose, cutting onboarding and setup time by 60%.
  • Developed the Nurse Panel Backend API; engineered supporting data models for onboarding, scheduling, and real-time shift tracking.
  • Optimised PostgreSQL queries, automated daily ETL pipelines in Python, and architected scalable MDM tables.
  • Established full-stack observability with OpenTelemetry (logs, traces, metrics), making issue detection 90% faster.

Flask · Docker · PostgreSQL · Python · OpenTelemetry · ETL · REST APIs

Cloud Engineering Intern · Espire Infolabs

Jul 2024 — Sep 2024

Hybrid

  • Built KPI dashboards and real-time monitoring for enterprise digital-transformation clients.
  • Integrated Azure Monitor, Log Analytics, and Workbooks, reducing manual reporting by 40%.
  • Deployed AWS Lambda + CloudWatch alerting pipelines, cutting incident detection time by 30%.

Azure Monitor · AWS Lambda · CloudWatch · Dashboards

Algorithmic Trading Intern · AGS

Jun 2023 — Sep 2023

On-site

  • Designed high-frequency arbitrage strategies on tick-level market data.
  • Built scalable Python backtesting pipelines over 10+ years of HFT data.
  • Boosted predictive accuracy by 15% via feature engineering, hyperparameter tuning, and time-series normalisation.
  • Ran volatility-adjusted Monte Carlo simulations; improved overall Sharpe ratio by 12%.

Python · pandas · NumPy · Monte Carlo · Backtesting

Digital Platform Lead · Gondilal Saraf

2022 — Present

Banda, India · Remote

  • Run the full-stack platform for my family's 150-year-old jewelry business (since 1873).
  • Bilingual Hindi/English storefront with live gold rates, AR virtual try-on, and AI-generated product descriptions via Gemini 2.0 Flash.
  • Admin ERP with POS, inventory, customer database, and barcode generation — 15 Prisma models, 26 API routes, 85 vitest tests.
  • Apply data-science techniques (demand forecasting, segmentation) to a century-old traditional industry.

Full-Stack Development · Next.js · PostgreSQL · Prisma · Gemini AI · Business Strategy

Selected Projects

Watch Together

Solo — Chrome/Firefox/Safari extension

Cross-site video-sync extension shipping on the Chrome Web Store. WebSocket relay with heartbeat drift correction (<0.5s), host-mode enforcement, site-specific player adapters, 59 vitest tests + Puppeteer e2e.

Gondilal Saraf

Solo — full-stack platform

Three surfaces (public site + bilingual catalogue + admin ERP) in one Next.js 15 codebase. Image pipeline: Photoroom → Sharp → Gemini 2.0 Flash → Replicate SDXL. 15 Prisma models, 26 API routes, 85 tests.

PCOD Tracker

Solo — AI health companion

Rant-first health logging where Claude extracts structured symptoms, medications, and mood from free-form text. Lab-report PDF parser with Zod-validated JSON output and manual fallback form. 15 Prisma models.

Red Bull YouTube Sentiment Analytics

Solo — data-science case study

VADER sentiment on 500 YouTube comments. Net Sentiment Score +28.6 pp (2× industry benchmark), 100% hashtag discipline across a 50-video catalog, 8-chart Excel dashboard + executive-summary PDF.

U.S. Power Outages (DSC 80)

With Paulina Pelayo

Analysed 1,534 major U.S. power outages (2000–2016). Random Forest with engineered features reached RMSE 6,189 min and R² 0.220, with fairness checks across weather vs. non-weather outages (p ≈ 0.007 on price-tier hypothesis test).

MLB Playoff Prediction (COGS 108)

Team of 5 — ethics lead

Winter 2026 final project on early-season predictors of MLB playoff qualification (2015–2023), using FanGraphs data pulled via pybaseball. The work ran from proposal through EDA to final analysis across four notebooks. I led the ethics section.

Skills

Proficient

Python (pandas, NumPy) · SQL / PostgreSQL · TypeScript / React / Next.js · Flask / REST APIs · Jupyter · EDA · data cleaning · Git / GitHub

Comfortable

scikit-learn (Random Forest, permutation tests, NMAR reasoning) · TensorFlow / Keras (CNNs, transfer learning) · Matplotlib · Seaborn · Recharts · Time-series analysis · backtesting · Prisma ORM · relational schema design · Docker · containerised microservices · WebSockets · real-time sync · Swift / SwiftUI · Claude / Gemini / OpenAI APIs · Chrome Manifest V3 extensions

Familiar

PyTorch · Bayesian networks · Markov models · RL (CSE 150A) · Recommender systems · web mining (CSE 158R) · VADER · NLP fundamentals · Azure Monitor · Log Analytics · AWS Lambda · CloudWatch · OpenTelemetry (logs, traces, metrics) · Monte Carlo simulation · C (systems programming) · Linear algebra · differential equations

Certifications

Neural Networks and Deep Learning

DeepLearning.AI · Jul 2024

Machine Learning Specialization

Stanford University · Jul 2024

Google AI Essentials

Google · Jul 2024

Google Data Analytics Professional Certificate

Google · Jul 2024

Legal name: Arnav Goel. Also goes by Yash Goel (email: yashgoel0304@gmail.com). F-1 international student; would require H-1B sponsorship after the STEM OPT extension. Considering new-grad Applied Scientist / ML Engineer / Data Scientist / SWE roles for summer 2027 onward.