Builds Across AI, Data Engineering & Applied Analytics

Ten showcased projects spanning full-stack AI systems, data engineering, statistical modeling, and cybersecurity analysis.

NBA Shot Data Engineering Package

spatialSportsR

Multi-source NBA shot data engineering package with cleaning, schema validation, and rerunnable SQLite upserts that power GAM modeling and shot archetype analysis across five NBA seasons.

62.7% xFG Holdout
1.1M+ Training Shots
~136K Evaluation Shots
467 Players
5 Seasons

Key Features

  • ESPN + NBA Stats ingestion with parallel collection and exponential backoff
  • Schema validation and rerunnable SQLite upsert behavior
  • xFG logistic regression + GAM tensor surface modeling
  • Shot Difficulty Index feature engineering
  • GMM player archetypes with BIC selection and PCA projection
R, Python, scikit-learn, PyGAM, SQLite, Streamlit
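
The rerunnable upsert behavior can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the package's actual code: the `shots` table, its columns, and the `upsert_shots` helper are hypothetical names, and the composite key is an assumption about how a shot might be identified.

```python
import sqlite3

# Hypothetical table and columns, chosen only for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS shots (
    game_id   TEXT    NOT NULL,
    event_id  INTEGER NOT NULL,
    player    TEXT    NOT NULL,
    made      INTEGER NOT NULL,
    PRIMARY KEY (game_id, event_id)
)
"""

# ON CONFLICT ... DO UPDATE needs SQLite 3.24 or newer.
UPSERT = """
INSERT INTO shots (game_id, event_id, player, made)
VALUES (?, ?, ?, ?)
ON CONFLICT (game_id, event_id) DO UPDATE SET
    player = excluded.player,
    made   = excluded.made
"""

def upsert_shots(conn, rows):
    """Insert new shots and overwrite existing rows that share a key."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, rows)

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

rows = [("g1", 1, "Player A", 1), ("g1", 2, "Player B", 0)]
upsert_shots(conn, rows)
upsert_shots(conn, rows)                        # rerun: no duplicate rows
upsert_shots(conn, [("g1", 2, "Player B", 1)])  # corrected record replaces the old one

row_count = conn.execute("SELECT COUNT(*) FROM shots").fetchone()[0]
corrected = conn.execute(
    "SELECT made FROM shots WHERE game_id = 'g1' AND event_id = 2"
).fetchone()[0]
print(row_count, corrected)  # 2 1
```

Because the key decides whether a row is inserted or updated, running the same load twice leaves the table unchanged, which is what makes the pipeline safe to rerun.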

AI Multitool Assistant

ai-multitool

Full-stack AI web application with chat, PDF Q&A, and real-time tools. Two-service architecture with JWT authentication and Gemini 2.5 Flash via a LlamaIndex ReAct agent.

5 API Tools
7+ REST Endpoints
2-service Architecture

Key Features

  • Gemini 2.5 Flash + LlamaIndex ReAct orchestration
  • RAG pipeline for PDF parsing, indexing, and querying
  • Five real-time tools: stocks, crypto, weather, news, market
  • JWT access and refresh token lifecycle
  • User-scoped data isolation for notes, chat, and PDF indexes
Django REST, React 18, Vite, JWT, Gemini, LlamaIndex

ESPN NBA Data Pipeline

nba-data

R package for end-to-end ESPN NBA data collection, parsing, validation, and storage. Handles 1,000+ games per season with parallel collection and exponential backoff retry logic.

1,000+ Games/Season
13 Test Files
4 Core Tables
5 Retry Attempts

Key Features

  • Phase-based pipeline: collection, inventory, parsing, quality
  • Parallel collection via future::multisession
  • Schema enforcement with explicit type casting
  • SQLite upserts with schema versioning via nbadata_meta
  • Fixture-based tests with optional live-gated integration tests
R, httr2, DBI, RSQLite, future, testthat
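
The retry logic can be illustrated with a short Python sketch. The package itself is written in R, so this is a translation of the idea rather than its code. The `retry_with_backoff` name, the one-second base delay, and the 10% jitter are assumptions; only the five-attempt limit comes from the stats above.

```python
import random
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**(attempt - 1) and try again.

    The last failure is re-raised once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            # Small random jitter (an assumption) keeps parallel workers
            # from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))

# Demo with a simulated flaky call; delays are recorded instead of slept.
delays = []
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated failure")
    return "ok"

result = retry_with_backoff(flaky, sleep=delays.append)
print(result, len(delays))  # ok 2
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.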

NAU Course Catalog Scraper

nau-scraper

Automated web scraping and AI curriculum analysis pipeline. Scraped 12,944 course records across Fall 2025 and Spring 2026 with a three-tier classification system.

12,944 Courses
150+ Prefixes
2 Terms
3 Analysis Tiers

Key Features

  • PDF prefix extraction with pdfplumber
  • Selenium browser automation with resumable crawl states
  • Three analysis scripts: high-precision, broad recall, and ethics
  • Fuzzy string matching for typo-tolerant detection
  • CSV exports and R Markdown reporting for precision/recall tradeoffs
Python, Selenium, pdfplumber, pandas, thefuzz, R Markdown
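
Typo-tolerant keyword detection can be sketched as follows. The project lists thefuzz; this example swaps in the standard library's difflib.SequenceMatcher so it runs without extra installs. The `fuzzy_contains` helper, the 0.85 threshold, and the sample course titles are all hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_contains(text, keyword, threshold=0.85):
    """Return True if any run of words in text is close to keyword.

    The text is scanned with a sliding window as many words long as
    the keyword, so a misspelled phrase can still match.
    """
    words = text.lower().split()
    n = len(keyword.split())
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if similarity(window, keyword) >= threshold:
            return True
    return False

# Hypothetical course titles: the first contains a misspelling.
typo_hit = fuzzy_contains("Intro to Artifical Intelligence", "artificial intelligence")
non_match = fuzzy_contains("Survey of Art History", "artificial intelligence")
```

Raising the threshold favors precision and lowering it favors recall, which mirrors the tradeoff between the high-precision and broad-recall scripts.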

NBA Win Probability Models

nba-modeling

Leakage-aware predictive modeling for NBA home-team win probability using rolling window features across the 2002-2025 seasons.

67% Best Accuracy
0.684 Best AUC
59.2% Baseline
2002-2025 Seasons

Key Features

  • Rolling 3/5/10-game windows with lag-1 leakage prevention
  • Advanced stats and matchup differential feature engineering
  • Ridge, elastic net, logistic, and random forest model comparison
  • Elo features with season-reset and all-time variants
  • Chronological train/test split preserving temporal order
R, glmnet, pROC, randomForest, slider, hoopR
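
The lag-1 idea behind the rolling windows is that a game's feature may only use games played before it. The project is written in R; the plain-Python sketch below shows the same principle. The `lagged_rolling_mean` name, the handling of partial windows, and the sample scores are assumptions.

```python
from collections import deque

def lagged_rolling_mean(values, window):
    """For each game i, average up to `window` games strictly before i.

    The first game has no history, so its feature is None. Early games
    use a partial window (an assumption; dropping them is another option).
    """
    history = deque(maxlen=window)
    out = []
    for v in values:
        out.append(sum(history) / len(history) if history else None)
        history.append(v)  # the current game is visible only to later rows
    return out

# Hypothetical points scored by one team in five consecutive games.
points = [100, 110, 90, 120, 105]
features = lagged_rolling_mean(points, window=3)
# features[3] is 100.0: the mean of games 0-2, excluding game 3 itself.
```

Appending the current value only after computing the feature is what prevents the outcome of a game from leaking into its own predictors.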

Deep Learning Projects

deep-learning

Four progressive deep learning projects covering ANN baselines, hyperparameter optimization, CNN interpretation, and RNN sequence modeling.

4 Projects
PyTorch Framework
W&B Tracking

Key Features

  • ANN project with custom training loops
  • Grid/random hyperparameter optimization with W&B
  • CNN filter visualization workflow
  • RNN modeling with a custom tokenizer
  • Reproducible shared infrastructure across project modules
PyTorch, TorchVision, scikit-learn, Weights & Biases, pytest

NBA Analytics Platform

nba-app

Freemium NBA analytics platform with dashboards, AI-style Q&A over NBA data, Stripe billing, and production-grade security hardening.

Monorepo Architecture
Fastify API Backend
Next.js 14 Frontend
PostgreSQL + Redis Data Layer

Key Features

  • Fastify + TypeScript backend with intent-driven Q&A
  • Next.js 14 App Router frontend with feature gating
  • Stripe checkout and webhook subscription flow
  • Security stack: JWT, Zod safeParse, hashing, rate limiting
  • Redis caching with Lua scripts and performance optimizations
TypeScript, Fastify, Next.js, Prisma, PostgreSQL, Redis
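
Rate limiting, one part of the security stack, can be illustrated with a fixed-window counter. The platform is written in TypeScript and the source does not say which algorithm or storage it uses, so this in-memory Python sketch is a generic stand-in. The `FixedWindowLimiter` class and its limits are hypothetical.

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counters = {}  # key -> (window_index, count)

    def allow(self, key):
        current = int(self.clock() // self.window)
        index, count = self.counters.get(key, (current, 0))
        if index != current:
            index, count = current, 0  # a new window resets the count
        if count >= self.limit:
            self.counters[key] = (index, count)
            return False
        self.counters[key] = (index, count + 1)
        return True

# Demo with a fake clock so no real waiting is needed.
now = [0.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
first_window = [limiter.allow("user-1") for _ in range(3)]
now[0] = 61.0  # move into the next window
after_reset = limiter.allow("user-1")
print(first_window, after_reset)  # [True, True, False] True
```

In a multi-process deployment the counter would need shared storage and an atomic check-and-increment, which a plain dictionary does not provide.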

NAU Capstone: Sports Expected Points Analysis

nau-capstone

Research capstone analyzing expected points/goals across NBA and NHL with shot-quality modeling, calibration diagnostics, and player value analysis.

467 Players Profiled
3 Data Sources
NBA + NHL Sports

Key Features

  • xFG logistic regression and POE computation pipeline
  • GAM tensor surface modeling with partial dependence diagnostics
  • GMM archetypes with BIC selection and PCA projection
  • Shot Difficulty Index and residual analysis
  • POE per million salary value rankings
Python, scikit-learn, PyGAM, Streamlit, plotly, matplotlib
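
The value metric can be sketched in a few lines. This assumes POE means points over expected: for each shot, the difference between the outcome and the model's xFG probability, weighted by the shot's point value. The source does not define it, so that definition, the function names, and the sample numbers are all illustrative.

```python
def points_over_expected(shots):
    """Sum (made - xFG) * point value over an iterable of shots.

    Each shot is a tuple (made, xfg_probability, point_value), where
    made is 1 or 0. This definition of POE is an assumption.
    """
    return sum((made - xfg) * value for made, xfg, value in shots)

def poe_per_million(poe, salary_usd):
    """Scale POE by salary expressed in millions of dollars."""
    return poe / (salary_usd / 1_000_000)

# Hypothetical shots: a made three, a missed two, and a made two.
shots = [(1, 0.40, 3), (0, 0.55, 2), (1, 0.65, 2)]
poe = points_over_expected(shots)        # 1.8 - 1.1 + 0.7, about 1.4
value = poe_per_million(poe, 2_000_000)  # about 0.7 POE per $1M
```

A positive POE means the player scored more than the shot-quality model expected from those attempts.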

NBA Dockerized Scrape Pipeline

scrape-pipeline

Production-grade Python/Docker pipeline for high-volume NBA data ingestion into PostgreSQL with idempotent upsert behavior.

PostgreSQL 16 Database
Makefile Automation
Dockerized Architecture

Key Features

  • concurrent.futures parallel HTTP fetching
  • Chunked SQLAlchemy ON CONFLICT DO UPDATE upserts
  • Complex ESPN JSON normalization to relational tables
  • Failure-stage tracking through nba_ingest_failures
  • Makefile automation for ingest/reset/up/down
Python, Docker, PostgreSQL, SQLAlchemy, pandas, Makefile
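
Parallel fetching with failure-stage tracking can be sketched with concurrent.futures. The `fetch` and `parse` functions are hypothetical stand-ins (no network calls are made), and the failure records are kept in a list here rather than written to the nba_ingest_failures table, whose real columns are not shown in the source.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch(game_id):
    """Hypothetical stand-in for an HTTP request; one ID fails on purpose."""
    if game_id == "g3":
        raise ConnectionError("simulated timeout")
    return {"game_id": game_id, "home_score": 100}

def parse(payload):
    """Flatten a payload into a row tuple."""
    return (payload["game_id"], payload["home_score"])

def ingest(game_ids, max_workers=4):
    """Fetch in parallel; record which stage each failure happened in."""
    rows, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, g): g for g in game_ids}
        for fut in as_completed(futures):
            game_id = futures[fut]
            try:
                payload = fut.result()
            except Exception as exc:
                failures.append({"game_id": game_id, "stage": "fetch", "error": str(exc)})
                continue
            try:
                rows.append(parse(payload))
            except Exception as exc:
                failures.append({"game_id": game_id, "stage": "parse", "error": str(exc)})
    return rows, failures

rows, failures = ingest(["g1", "g2", "g3"])
```

Recording the stage alongside the ID lets a later run retry only the failed games and shows whether the problem was the request or the payload.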

Stuxnet Cyberwarfare Analysis

stuxnet-analysis

Technical cybersecurity analysis of Stuxnet covering propagation, zero-days, PLC targeting, sensor spoofing, and critical infrastructure impact.

4 Zero-Days
Operation Olympic Games Attribution
Rmd + PDF Output

Key Features

  • Analysis of four simultaneous zero-day exploits
  • Code-signing abuse via stolen certificates
  • Siemens S7-315 PLC hardware targeting and fingerprinting
  • Man-in-the-middle PLC manipulation and sensor spoofing
  • Impact comparison against WannaCry and NotPetya
R Markdown, LaTeX, ICS/SCADA security, Exploit analysis