Featured Project Case Studies

Deep dives into challenge, role, process, and outcomes for selected portfolio projects.

NBA Shot Data Engineering Package

Role: Data engineer and modeling workflow builder

Challenge

Create a repeatable multi-source shot-data pipeline that supports stable downstream modeling and archetype analysis without schema drift.

Process

  • Ingested and merged 1.1M+ shots across five seasons from multiple sources
  • Applied schema contracts and validation before feature generation
  • Implemented rerunnable SQLite load/upsert behavior for repeatable refreshes
  • Engineered xFG, residual, and SDI features for GAM and clustering workflows
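The rerunnable SQLite load step can be sketched as an idempotent upsert, so a refresh updates existing rows instead of duplicating them. Table and column names below are illustrative, not the package's actual schema:

```python
import sqlite3

def upsert_shots(conn, rows):
    """Idempotently load shot rows; reruns update rather than duplicate.
    Schema here is a stand-in for the real shot table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS shots (
               shot_id TEXT PRIMARY KEY,
               season  TEXT,
               made    INTEGER,
               x       REAL,
               y       REAL
           )"""
    )
    conn.executemany(
        """INSERT INTO shots (shot_id, season, made, x, y)
           VALUES (:shot_id, :season, :made, :x, :y)
           ON CONFLICT(shot_id) DO UPDATE SET
               season = excluded.season,
               made   = excluded.made,
               x      = excluded.x,
               y      = excluded.y""",
        rows,
    )
    conn.commit()
```

Because the primary key absorbs conflicts, running the same refresh twice leaves the table unchanged, which is what makes five-season backfills safe to repeat.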

Outcomes

  • xFG model reached 62.7% holdout accuracy
  • Generated stable feature layers for shot-quality and value analysis
  • Delivered interpretable archetype visuals and residual diagnostics

AI Multitool Assistant

Role: Full-stack developer and AI integration lead

Challenge

Unify chat, PDF Q&A, and real-time external tools into one secure workflow with consistent authentication and data isolation.

Process

  • Split system into React/Vite frontend and Django REST backend
  • Implemented JWT access and refresh token lifecycle
  • Integrated Gemini via a LlamaIndex ReAct agent with tool orchestration
  • Built PDF index persistence and lazy query loading

Outcomes

  • One unified user workflow for chat, documents, and tools
  • Composable API surface for fast feature iteration
  • Reliable user-scoped data separation across notes and history

ESPN NBA Data Pipeline

Role: R package developer and data pipeline architect

Challenge

Build a robust R package that can ingest and validate 1,000+ games per season with graceful failure handling.

Process

  • Implemented parallel collection wrappers with retry logic
  • Created manifest checks for missing, postponed, and canceled games
  • Enforced schema maps and type-casting contracts
  • Built idempotent upsert tables and migration hooks
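The retry behavior in the collection wrappers follows a standard backoff pattern. The package itself is written in R; this Python sketch only illustrates the idea, and the parameter names are assumptions:

```python
import time

def with_retry(fn, attempts=3, base_delay=0.0):
    """Retry a flaky collection call, backing off between attempts.
    Raises the last error once attempts are exhausted."""
    last_err = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as err:
            last_err = err
            time.sleep(base_delay * (2 ** i))   # exponential backoff between tries
    raise last_err
```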

Outcomes

  • Produced four stable parsed core tables
  • Established 13 fixture-based tests without live API dependency
  • Enabled repeatable incremental updates without duplicate rows

NAU Course Catalog Scraper

Role: Scraper and curriculum-analysis pipeline developer

Challenge

Audit AI and ethics course coverage where catalog data is distributed across term pages, prefix lists, and inconsistent page structures.

Process

  • Extracted prefixes from a PDF and maintained reusable prefix lists
  • Built Selenium crawl flow with resumability and empty-prefix logging
  • Separated analysis into precision, broad recall, and ethics scripts
  • Exported cleaned CSV outputs and report-ready summaries
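The resumability and empty-prefix logging in the crawl flow can be sketched as a checkpointed loop. The Selenium page visit itself is abstracted behind `fetch`, and the checkpoint file and key names are illustrative:

```python
import json
from pathlib import Path

def crawl_prefixes(prefixes, fetch, checkpoint="crawl_state.json"):
    """Resume-safe crawl loop: completed and empty prefixes are checkpointed
    to disk after each prefix, so a restart skips finished work."""
    path = Path(checkpoint)
    state = json.loads(path.read_text()) if path.exists() else {"done": [], "empty": []}
    for prefix in prefixes:
        if prefix in state["done"]:
            continue                       # already scraped on a previous run
        records = fetch(prefix)
        if not records:
            state["empty"].append(prefix)  # log prefixes with no course pages
        state["done"].append(prefix)
        path.write_text(json.dumps(state))
    return state
```

Writing the checkpoint after every prefix means a crash mid-crawl loses at most one prefix of work.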

Outcomes

  • Scraped 12,944 records across 150+ prefixes and two terms
  • Created transparent precision-vs-recall reporting in R Markdown
  • Delivered inspectable stage-by-stage transformation outputs

NBA Win Probability Models

Role: Statistical modeler and feature engineer

Challenge

Model home-team win probability across 23 seasons while preventing temporal leakage that would inflate metrics.

Process

  • Built lagged rolling features across 3/5/10-game windows
  • Engineered possession metrics and matchup differential features
  • Compared ridge, elastic net, logistic, and random forest models
  • Integrated Elo variants and calibrated holdout thresholds
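The leakage guard in the lagged rolling features comes down to one detail: shift each team's series by one game before rolling, so a game's feature never sees that game's own result. A minimal pandas sketch, with illustrative column names:

```python
import pandas as pd

def lagged_rolling_mean(games, col, window):
    """Rolling mean of `col` over each team's previous `window` games.
    `games` must be sorted chronologically within each team."""
    return (
        games.groupby("team")[col]
             .transform(lambda s: s.shift(1).rolling(window, min_periods=1).mean())
    )
```

Dropping the `shift(1)` is the classic temporal-leakage bug: the current game's score leaks into its own predictor and inflates holdout metrics.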

Outcomes

  • Best ridge model achieved 67% accuracy and 0.684 AUC
  • Maintained chronological train/test integrity
  • Documented reproducible model comparison workflow

Deep Learning Projects

Role: Deep learning practitioner and experiment tracker

Challenge

Deliver four progressive deep-learning projects while preserving reproducibility and making results auditable.

Process

  • Implemented ANN baseline workflows and regression tests
  • Built HPO model-factory patterns with W&B experiment tracking
  • Developed CNN interpretation through filter visualization
  • Implemented RNN sequence pipelines with a custom tokenizer
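The shared-seed discipline behind the reproducibility goal can be sketched as a single seeding utility called at the top of every run. The actual projects also seed the deep-learning framework's RNG; only the framework-agnostic part is shown, and the helper names are assumptions:

```python
import random
import numpy as np

def set_seed(seed):
    """Seed the RNGs shared across experiments so reruns are comparable."""
    random.seed(seed)
    np.random.seed(seed)

def sample_run(seed):
    """Tiny stand-in for a training run: seed, then draw some random numbers."""
    set_seed(seed)
    return np.random.rand(3).round(4).tolist()
```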

Outcomes

  • Completed ANN, HPO, CNN, and RNN progression
  • Established reproducible training utilities and shared seeds
  • Produced tracked experiment outputs for comparison and reporting

NBA Analytics Platform

Role: Full-stack developer and security hardening engineer

Challenge

Ship a production-ready freemium analytics platform with secure auth, billing, and AI-style Q&A without exposing raw SQL or sensitive internals.

Process

  • Built Fastify API modules for auth, routes, billing, and Q&A
  • Implemented regex-intent Q&A templates to avoid user SQL execution
  • Added Stripe checkout and webhook subscription management
  • Hardened with JWT policy, Zod validation, rate limits, and Redis caching
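The regex-intent Q&A design never executes user text as SQL: a question either matches a known intent, whose fixed template runs with bound parameters, or it runs nothing at all. The backend is Fastify/Node; this Python sketch shows the routing idea with illustrative patterns and templates:

```python
import re

# Each intent maps a question pattern to a fixed, parameterized query template;
# user text only ever fills bound parameters, never raw SQL.
INTENTS = [
    (re.compile(r"top (\d+) scorers", re.I),
     "SELECT player, pts FROM season_stats ORDER BY pts DESC LIMIT ?"),
    (re.compile(r"record for the ([a-z ]+)", re.I),
     "SELECT wins, losses FROM standings WHERE team = ?"),
]

def route_question(question):
    """Return (sql_template, params) for a recognized question, else None."""
    for pattern, template in INTENTS:
        match = pattern.search(question)
        if match:
            return template, list(match.groups())
    return None
```

Anything outside the intent table, including injection attempts, falls through to `None` and is answered without touching the database.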

Outcomes

  • Reached 69 passing tests across 25 files
  • Achieved A+ security posture in project audits
  • Delivered reliable freemium controls and optimized dashboard UX

NAU Capstone: Sports Expected Points Analysis

Role: Lead analyst and modeling researcher

Challenge

Design a multi-sport expected points framework that explains shot quality and value while remaining interpretable to both technical and non-technical audiences.

Process

  • Built xFG and xG workflows with points-over-expected (POE) and residual diagnostics
  • Ran GAM tensor-surface models with PDP interpretation
  • Clustered player shot profiles via GMM and PCA
  • Combined salary collection with POE per million value rankings
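The POE-per-million value ranking is a simple normalization: points over expected divided by salary in millions, then sorted. A minimal sketch, with illustrative field names and made-up numbers:

```python
def poe_per_million(players):
    """Rank players by points-over-expected per $1M of salary."""
    ranked = [
        {**p, "poe_per_m": p["poe"] / (p["salary"] / 1_000_000)}
        for p in players
        if p["salary"] > 0            # skip rows without salary data
    ]
    return sorted(ranked, key=lambda p: p["poe_per_m"], reverse=True)
```

Normalizing by salary is what surfaces value contracts: a modest-POE player on a cheap deal can outrank a star on a max contract.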

Outcomes

  • Produced interpretable cross-sport shot-quality metrics
  • Built reusable scripts for calibration, clustering, and value analysis
  • Delivered an interactive Streamlit capstone dashboard

NBA Dockerized Scrape Pipeline

Role: Data engineer and ingestion pipeline architect

Challenge

Build a Dockerized ingestion system that can rerun safely at scale while preserving auditability and failure tracing.

Process

  • Parallelized HTTP fetches with concurrent.futures
  • Mapped variable ESPN JSON schemas into relational structures
  • Implemented chunked ON CONFLICT upserts via SQLAlchemy
  • Automated operations through Makefile commands and Docker Compose
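The parallel fetch and failure-tracing steps above can be sketched with `concurrent.futures`: each game is fetched on a worker thread, and failures are recorded per game id instead of aborting the run. The HTTP call itself is abstracted behind `fetch_one`, and the failure-record shape is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all(game_ids, fetch_one, max_workers=8):
    """Fetch many game payloads concurrently, recording failures by game id
    so a later rerun can retry just those."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, gid): gid for gid in game_ids}
        for fut in as_completed(futures):
            gid = futures[fut]
            try:
                results[gid] = fut.result()
            except Exception as err:
                failures[gid] = str(err)  # traced instead of aborting the run
    return results, failures
```

Keeping the failure map per stage is what gives the pipeline its ingest-failure observability: a rerun targets exactly the ids that failed, which pairs naturally with the idempotent upserts downstream.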

Outcomes

  • Established idempotent PostgreSQL ingestion behavior
  • Created failure-stage observability through ingest-failure tracking
  • Enabled repeatable local and containerized execution workflows

Stuxnet Cyberwarfare Analysis

Role: Cybersecurity researcher and technical writer

Challenge

Translate highly technical malware behavior into a rigorous, evidence-based analysis that remains readable and academically structured.

Process

  • Mapped propagation and exploit chain details across four zero-days
  • Analyzed certificate abuse and PLC hardware-targeting logic
  • Documented attack-vector mechanics and sensor spoofing behavior
  • Structured findings in publication-style R Markdown and LaTeX

Outcomes

  • Delivered a comprehensive Stuxnet technical report
  • Connected exploit mechanics to real-world infrastructure impact
  • Showcased cybersecurity depth aligned with the cybersecurity minor focus