NBA Shot Data Engineering Package
Role: Data engineer and modeling workflow builder
Challenge
Create a repeatable multi-source shot-data pipeline that supports stable downstream modeling and archetype analysis without schema drift.
Process
- Ingested and merged 1.1M+ shots across five seasons from multiple sources
- Applied schema contracts and validation before feature generation
- Implemented rerunnable SQLite load/upsert behavior for repeatable refreshes
- Engineered xFG, residual, and SDI features for GAM and clustering workflows
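The rerunnable SQLite load described above can be sketched as follows; the table and column names here are illustrative, not the project's actual schema:

```python
import sqlite3

def upsert_shots(conn: sqlite3.Connection, rows: list) -> None:
    """Insert shot rows; reruns update in place instead of duplicating."""
    conn.executemany(
        """
        INSERT INTO shots (shot_id, season, made, distance_ft)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(shot_id) DO UPDATE SET
            season = excluded.season,
            made = excluded.made,
            distance_ft = excluded.distance_ft
        """,
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shots (shot_id TEXT PRIMARY KEY, season TEXT, "
    "made INTEGER, distance_ft REAL)"
)
rows = [("s1", "2021-22", 1, 23.5), ("s2", "2021-22", 0, 4.0)]
upsert_shots(conn, rows)
upsert_shots(conn, rows)  # rerun: rows are updated, not duplicated
count = conn.execute("SELECT COUNT(*) FROM shots").fetchone()[0]
```

Keying the upsert on the shot's primary key is what makes repeated refreshes safe: a second load of the same source file leaves the row count unchanged.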
Outcomes
- xFG model reached 62.7% holdout accuracy
- Generated stable feature layers for shot-quality and value analysis
- Delivered interpretable archetype visuals and residual diagnostics
AI Multitool Assistant
Role: Full-stack developer and AI integration lead
Challenge
Unify chat, PDF Q&A, and real-time external tools into one secure workflow with consistent authentication and data isolation.
Process
- Split system into React/Vite frontend and Django REST backend
- Implemented JWT access and refresh token lifecycle
- Integrated Gemini through a LlamaIndex ReAct agent for tool orchestration
- Built PDF index persistence and lazy query loading
Outcomes
- Unified chat, documents, and tools into a single user workflow
- Composable API surface for fast feature iteration
- Reliable user-scoped data separation across notes and history
ESPN NBA Data Pipeline
Role: R package developer and data pipeline architect
Challenge
Build a robust R package that can ingest and validate 1,000+ games per season with graceful failure handling.
Process
- Implemented parallel collection wrappers with retry logic
- Created manifest checks for missing, postponed, and canceled games
- Enforced schema maps and type-casting contracts
- Built idempotent upsert tables and migration hooks
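The package itself is written in R; purely for illustration, the retry-plus-manifest pattern behind the first two bullets looks roughly like this in Python (all function and field names hypothetical):

```python
import time

def fetch_with_retry(fetch, game_id, retries=3, backoff=0.01):
    """Call fetch(game_id), retrying transient failures with exponential backoff."""
    for attempt in range(retries):
        try:
            return fetch(game_id)
        except ConnectionError:
            if attempt == retries - 1:
                raise  # exhausted retries: surface the failure
            time.sleep(backoff * 2 ** attempt)

def check_manifest(expected_ids, parsed_rows):
    """Flag games that are expected but missing from the parsed output."""
    seen = {row["game_id"] for row in parsed_rows}
    return sorted(set(expected_ids) - seen)

# Simulate an endpoint that fails twice before succeeding
calls = {"n": 0}
def flaky(game_id):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return {"game_id": game_id}

row = fetch_with_retry(flaky, "401307856")
missing = check_manifest(["401307856", "401307857"], [row])
```

The manifest check is what turns postponed or canceled games into an explicit report instead of silent gaps in the tables.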
Outcomes
- Produced four stable parsed core tables
- Established 13 fixture-based tests with no live-API dependency
- Enabled repeatable incremental updates without duplicate rows
NAU Course Catalog Scraper
Role: Scraper and curriculum-analysis pipeline developer
Challenge
Audit AI and ethics course coverage where catalog data is distributed across term pages, prefix lists, and inconsistent page structures.
Process
- Extracted prefixes from PDF and maintained reusable prefix lists
- Built Selenium crawl flow with resumability and empty-prefix logging
- Separated analysis into precision, broad recall, and ethics scripts
- Exported cleaned CSV outputs and report-ready summaries
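The resumability and empty-prefix logging can be sketched independently of Selenium by checkpointing completed prefixes to disk; the fetch function below stands in for the real page scrape, and all names are illustrative:

```python
import json
import tempfile
from pathlib import Path

def crawl_prefixes(prefixes, fetch, checkpoint: Path):
    """Crawl course prefixes, skipping any already recorded in the checkpoint."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    records, empty = [], []
    for prefix in prefixes:
        if prefix in done:
            continue  # resumed run: already scraped
        rows = fetch(prefix)  # would wrap a Selenium page scrape in practice
        if not rows:
            empty.append(prefix)  # log prefixes that returned no courses
        records.extend(rows)
        done.add(prefix)
        checkpoint.write_text(json.dumps(sorted(done)))  # persist after each prefix
    return records, empty

ckpt = Path(tempfile.mkdtemp()) / "done.json"
fake_catalog = {"CS": [{"course": "CS 105"}], "INF": [], "EE": [{"course": "EE 188"}]}
records, empty = crawl_prefixes(["CS", "INF", "EE"], fake_catalog.get, ckpt)
# a second run re-reads the checkpoint and skips all three prefixes
records2, _ = crawl_prefixes(["CS", "INF", "EE"], fake_catalog.get, ckpt)
```

Writing the checkpoint after every prefix, rather than at the end, is what lets a crashed crawl resume mid-run without re-scraping finished prefixes.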
Outcomes
- Scraped 12,944 records across 150+ prefixes and two terms
- Created transparent precision-vs-recall reporting in R Markdown
- Delivered inspectable stage-by-stage transformation outputs
NBA Win Probability Models
Role: Statistical modeler and feature engineer
Challenge
Model home-team win probability across 23 seasons while preventing temporal leakage that would inflate metrics.
Process
- Built lagged rolling features across 3/5/10-game windows
- Engineered possession metrics and matchup differential features
- Compared ridge, elastic net, logistic, and random forest models
- Integrated Elo variants and calibrated holdout thresholds
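The leakage-prevention idea behind the lagged windows is that only games strictly before game i may feed its features. A minimal sketch of that lag-then-roll construction:

```python
def lagged_rolling_mean(values, window):
    """Rolling mean over the previous `window` games, excluding the current one."""
    feats = []
    for i in range(len(values)):
        prior = values[max(0, i - window):i]  # strictly before game i
        feats.append(sum(prior) / len(prior) if prior else None)
    return feats

# win/loss results for one team in chronological order
results = [1, 0, 1, 1, 0, 1]
form_3 = lagged_rolling_mean(results, 3)  # 3-game form entering each game
```

The first game gets no feature (nothing precedes it), and game i never sees its own outcome, which is exactly what keeps the chronological holdout honest.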
Outcomes
- Best ridge model achieved 67% accuracy and 0.684 AUC
- Maintained chronological train/test integrity
- Documented reproducible model comparison workflow
Deep Learning Projects
Role: Deep learning practitioner and experiment tracker
Challenge
Deliver four progressive deep-learning projects while preserving reproducibility and making results auditable.
Process
- Implemented ANN baseline workflows and regression tests
- Built HPO model-factory patterns with W&B experiment tracking
- Developed CNN interpretation through filter visualization
- Implemented RNN sequence pipelines with custom tokenizer
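The shared-seed utility mentioned under Outcomes can be sketched as a single entry point that seeds every RNG in play, guarding the optional libraries (the helper name is illustrative):

```python
import random

def set_seeds(seed: int) -> None:
    """Seed every random-number source in use so reruns reproduce results."""
    random.seed(seed)
    try:  # seed NumPy / PyTorch only if they are installed
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass

set_seeds(42)
a = [random.random() for _ in range(3)]
set_seeds(42)
b = [random.random() for _ in range(3)]  # identical draws after reseeding
```

Centralizing seeding in one call avoids the common failure mode where one library is seeded and another is not, making runs only partially reproducible.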
Outcomes
- Completed ANN, HPO, CNN, and RNN progression
- Established reproducible training utilities and shared seeds
- Produced tracked experiment outputs for comparison and reporting
NBA Analytics Platform
Role: Full-stack developer and security hardening engineer
Challenge
Ship a production-ready freemium analytics platform with secure auth, billing, and AI-style Q&A without exposing raw SQL or sensitive internals.
Process
- Built Fastify API modules for auth, routes, billing, and Q&A
- Implemented regex-intent Q&A templates to avoid user SQL execution
- Added Stripe checkout and webhook subscription management
- Hardened with JWT policy, Zod validation, rate limits, and Redis caching
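The regex-intent Q&A approach replaces free-form query execution with a fixed set of parameterized templates. The actual service is Node/Fastify; this Python sketch shows only the pattern, with hypothetical intents and table names:

```python
import re

# Each intent maps a user-question pattern to a fixed, parameterized SQL
# template, so user text never reaches the database as executable SQL.
INTENTS = [
    (re.compile(r"top (\d+) scorers", re.I),
     "SELECT player, pts FROM leaders ORDER BY pts DESC LIMIT ?"),
    (re.compile(r"record for (?:the )?(\w+)", re.I),
     "SELECT wins, losses FROM standings WHERE team = ?"),
]

def route_question(question: str):
    """Return (sql_template, params) for a recognized question, else None."""
    for pattern, template in INTENTS:
        m = pattern.search(question)
        if m:
            return template, m.groups()
    return None

routed = route_question("Who are the top 5 scorers?")
unrouted = route_question("DROP TABLE standings")  # unrecognized: rejected
```

Anything that matches no intent is refused outright, which is how the API answers questions without ever exposing raw SQL execution.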
Outcomes
- Reached 69 passing tests across 25 files
- Achieved A+ security posture in project audits
- Delivered reliable freemium controls and optimized dashboard UX
NAU Capstone: Sports Expected Points Analysis
Role: Lead analyst and modeling researcher
Challenge
Design a multi-sport expected points framework that explains shot quality and value while remaining interpretable to both technical and non-technical audiences.
Process
- Built xFG and xG workflows with POE and residual diagnostics
- Ran GAM tensor-surface models with PDP interpretation
- Clustered player shot profiles via GMM and PCA
- Combined salary collection with POE-per-million value rankings
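The POE-per-million ranking is simple arithmetic: points over expected divided by salary in millions. A minimal sketch with made-up players and salaries:

```python
def poe_per_million(players):
    """Rank players by points over expected per $1M of salary."""
    ranked = [
        {**p, "poe_per_m": p["poe"] / (p["salary"] / 1_000_000)}
        for p in players
        if p["salary"] > 0  # guard against division by zero
    ]
    return sorted(ranked, key=lambda p: p["poe_per_m"], reverse=True)

players = [
    {"name": "A", "poe": 45.0, "salary": 30_000_000},
    {"name": "B", "poe": 20.0, "salary": 5_000_000},
]
ranking = poe_per_million(players)
```

Note the metric deliberately rewards efficiency relative to cost: player B produces less total POE than A but four times more per salary dollar, so B ranks first.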
Outcomes
- Produced interpretable cross-sport shot-quality metrics
- Built reusable scripts for calibration, clustering, and value analysis
- Delivered an interactive Streamlit capstone dashboard
NBA Dockerized Scrape Pipeline
Role: Data engineer and ingestion pipeline architect
Challenge
Build a Dockerized ingestion system that can rerun safely at scale while preserving auditability and failure tracing.
Process
- Parallelized HTTP fetches with concurrent.futures
- Mapped variable ESPN JSON schemas into relational structures
- Implemented chunked ON CONFLICT upserts via SQLAlchemy
- Automated operations through Makefile commands and Docker Compose
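The parallel-fetch and failure-tracing steps above can be sketched with `concurrent.futures`: failures are captured per game with their stage rather than aborting the run (the fetch function and stage label are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all(game_ids, fetch, max_workers=8):
    """Fetch games in parallel; record which stage failed instead of aborting."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, gid): gid for gid in game_ids}
        for fut in as_completed(futures):
            gid = futures[fut]
            try:
                results[gid] = fut.result()
            except Exception as exc:
                failures[gid] = {"stage": "fetch", "error": str(exc)}
    return results, failures

def fake_fetch(gid):
    if gid == "bad":
        raise ValueError("HTTP 500")
    return {"game_id": gid}

results, failures = fetch_all(["g1", "g2", "bad"], fake_fetch, max_workers=2)
```

Persisting the `failures` map (here just returned) is what gives the pipeline its failure-stage observability: a rerun can target only the games that failed, and the idempotent upserts make that rerun safe.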
Outcomes
- Established idempotent PostgreSQL ingestion behavior
- Created failure-stage observability through ingest-failure tracking
- Enabled repeatable local and containerized execution workflows
Stuxnet Cyberwarfare Analysis
Role: Cybersecurity researcher and technical writer
Challenge
Translate highly technical malware behavior into a rigorous, evidence-based analysis that remains readable and academically structured.
Process
- Mapped propagation and exploit chain details across four zero-days
- Analyzed certificate abuse and PLC hardware-targeting logic
- Documented attack-vector mechanics and sensor spoofing behavior
- Structured findings in publication-style R Markdown and LaTeX
Outcomes
- Delivered a comprehensive Stuxnet technical report
- Connected exploit mechanics to real-world infrastructure impact
- Showcased cybersecurity depth aligned with the cybersecurity minor