projects

Selected systems, hardware, robotics, and AI infrastructure projects.

Baymax

TreeHacks 2024 Robotics Computer Vision

1st Place Grand Prize ($10,000) at TreeHacks 2024. AI-powered robotic arm for elderly assistance using computer-vision perception and custom inverse kinematics.

Sentinel

TreeHacks 2026 LLM Agents FPGA

Best Hardware Prize at TreeHacks 2026. Agentic firmware engineering system that reads datasheets, generates bare-metal firmware, and validates execution on physical hardware with FPGA-in-the-loop.

Bare-Metal Distributed LLM Inference

ARM Raspberry Pi LLM Inference

Bare-metal Raspberry Pi runtime with GPIO, UART, bootloading, multicore dispatch, caches, and virtual memory, running a 15M-parameter language model directly on hardware with a 132x speedup in tokens per second from low-level optimizations.

RRAM Research

IEEE SISPAD RRAM Modeling

First-author research on statistical modeling and optimization of resistive RAM devices, presented at IEEE SISPAD 2023 in Kobe, Japan.

Lifetime-Aware Heterogeneous Memory Fabrics for AI Accelerators

PyTorch ONNX ZigZag MIQP

Built a PyTorch/ONNX-to-ZigZag workload characterization and heterogeneous memory-network synthesis pipeline that extracts tensor-level traffic behavior and maps differentiated weight/activation flows across SRAM and emerging-memory fabrics to optimize accelerator EDP.

SIMD Matrix-Multiplication Accelerator

VLSI SystemVerilog Accelerators

Architected and synthesized a multi-precision SIMD accelerator for INT8/INT16/INT32 matrix multiplication, verified against a functional simulator and optimized for PPA through datapath reuse, retiming, clock gating, and PE-array scaling.

KVStream

HLS LLM Inference FPGA

Won $10,000 at the Anthropic x Etched x Cognition x Mercor hackathon. Long-context decode attention datapath with streaming KV-cache access, online softmax state, Skip-Softmax value-path gating, RTL cosimulation, Vivado synthesis, and block-parallel modeling. It shows how decode-side attention can be accelerated with hardware built around memory streaming and reduction instead of dense GEMM.
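To illustrate the online softmax state mentioned above, here is a minimal Python sketch. It is not the project's HLS code. It reads one key/value pair at a time, as a streaming KV-cache access would, and keeps only a running maximum, a running normalizer, and an output accumulator. All function and variable names are hypothetical, and Skip-Softmax gating is not modeled.

```python
import math


def streaming_attention(q, keys, values):
    """Single-query decode attention computed in one pass over the KV cache."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")      # largest score seen so far
    running_sum = 0.0                # softmax normalizer relative to running_max
    acc = [0.0] * len(values[0])     # unnormalized weighted sum of values

    for k, v in zip(keys, values):
        score = scale * sum(qi * ki for qi, ki in zip(q, k))
        new_max = max(running_max, score)
        # Rescale the state accumulated so far to the new maximum.
        correction = math.exp(running_max - new_max)
        weight = math.exp(score - new_max)
        running_sum = running_sum * correction + weight
        acc = [a * correction + weight * vi for a, vi in zip(acc, v)]
        running_max = new_max

    return [a / running_sum for a in acc]


def reference_attention(q, keys, values):
    """Two-pass softmax attention, used only to check the streaming version."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]
```

Because the state is a fixed handful of scalars plus one output vector, the loop body maps naturally onto a streaming reduction rather than a dense matrix multiply.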

Automated Laboratory Instrumentation Network

Embedded Systems STM32 PID Control

Built a distributed embedded control system for automating physical laboratory instrumentation: a host computer orchestrated Ethernet-connected thermal and rotational devices, with STM32-based feedback loops regulating hot-plate temperature and centrifuge speed in real time.
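The feedback loop described above can be sketched as a discrete PID controller. This is a minimal Python illustration, not the STM32 firmware: the gains, the first-order thermal model, and every name and constant below are assumptions chosen only to make the example self-contained.

```python
class PID:
    """Discrete PID controller with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        output = self.kp * error + self.ki * candidate + self.kd * derivative
        clamped = min(max(output, self.out_min), self.out_max)
        # Anti-windup: only commit the integral while the output is unsaturated.
        if clamped == output:
            self.integral = candidate
        return clamped


def simulate_hot_plate(setpoint_c=100.0, seconds=300.0, dt=0.1):
    """Run the controller against an assumed first-order hot-plate model."""
    ambient_c = 25.0    # assumed ambient temperature
    heat_rate = 2.0     # assumed heating in degrees C per second at full duty
    loss_rate = 0.01    # assumed heat-loss coefficient per second
    pid = PID(kp=0.2, ki=0.02, kd=0.0, out_min=0.0, out_max=1.0)
    temp_c = ambient_c
    for _ in range(int(seconds / dt)):
        duty = pid.update(setpoint_c, temp_c, dt)   # heater duty cycle in [0, 1]
        temp_c += (heat_rate * duty - loss_rate * (temp_c - ambient_c)) * dt
    return temp_c
```

Calling `simulate_hot_plate()` runs the loop for a simulated five minutes and returns the final plate temperature. A centrifuge speed loop would have the same structure with a different plant model and gains.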

Embodied AI Fitness System for Aging Adults

TypeScript Raspberry Pi Pose Estimation

Built a venture-backed AI fitness product for Japan’s aging population, turning any television into a personalized exercise coach through edge vision, pose-based rep tracking, and an accessible remote-control-first experience.

Bare-Metal Raspberry Pi Operating System (CS 140E)

C ARMv6 Raspberry Pi Embedded Systems

Implemented a from-scratch operating system on ARM-based Raspberry Pi hardware, including bootloading, GPIO/UART drivers, interrupt handling, preemptive threads, virtual memory, page tables, wireless device networking, and a read-only FAT32 filesystem.

Bare-Metal Runtime and Hardware Instrumentation Stack (CS 240LX)

C ARMv6 Raspberry Pi JIT DMA I2C

Built a bare-metal systems stack for physical-computing systems, combining dynamic code generation, custom memory instrumentation, PMU profiling, and DMA/I²C-driven sensor-actuator control on Raspberry Pi hardware.

AI-Powered FPGA Music Synthesizer

Verilog FPGA CNN VGA

Built a real-time FPGA music system with ROM-based song sequencing, fixed-point wavetable audio synthesis, and VGA waveform rendering; implemented a pipelined MLP in Verilog to classify the emotional character of synthesized songs.
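Fixed-point wavetable synthesis is commonly built from a phase accumulator that indexes a sample ROM. The Python sketch below illustrates that general technique. The 32-bit accumulator, 256-entry sine table, and 16-bit sample width are assumptions, not the parameters of the Verilog design.

```python
import math

TABLE_BITS = 8                        # assumed: 256-entry wavetable
TABLE_SIZE = 1 << TABLE_BITS
PHASE_BITS = 32                       # assumed: 32-bit phase accumulator
PHASE_MASK = (1 << PHASE_BITS) - 1

# One sine cycle stored as signed 16-bit integers, as a ROM would hold it.
SINE_TABLE = [round(32767 * math.sin(2 * math.pi * i / TABLE_SIZE))
              for i in range(TABLE_SIZE)]


def phase_increment(freq_hz, sample_rate_hz):
    """Integer step added to the accumulator once per output sample."""
    return round(freq_hz * (1 << PHASE_BITS) / sample_rate_hz) & PHASE_MASK


def synthesize(freq_hz, sample_rate_hz, num_samples):
    """Generate samples using only integer adds, masks, shifts, and lookups."""
    step = phase_increment(freq_hz, sample_rate_hz)
    phase = 0
    samples = []
    for _ in range(num_samples):
        # The top TABLE_BITS bits of the phase select the table entry.
        samples.append(SINE_TABLE[phase >> (PHASE_BITS - TABLE_BITS)])
        phase = (phase + step) & PHASE_MASK   # wraps like a hardware register
    return samples
```

Changing pitch only changes the step value, which is why this structure pairs well with a ROM-based note sequencer.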

Adaptive Test-Time Compute for Preference-Tuned LLMs

PyTorch RL Test-Time Compute DPO

Developed adaptive inference strategies for a preference-tuned Qwen-2.5 LLM, implementing best-of-N, beam search, and chain-of-thought evaluation pipelines and improving SFT quality through corrected conversation preprocessing and context-length tuning. The resulting heuristic selector outperformed every fixed decoding strategy in reward score.

Bare-Metal Pi 3B Golf TrackMan

ARM Raspberry Pi Radar

TrackMan-style golf ball tracker built on a bare-metal Raspberry Pi 3B, with DWC2 USB host support, hub traversal, UART bridge control, radar configuration, and binary frame parsing, streaming TI mmWave detections directly from hardware.

AIMI Chest X-Ray Project

Medical Imaging Computer Vision

Computer-vision pipeline for detecting and boxing central venous catheters, endotracheal tubes, and chest tubes in chest X-ray data.

Parallel Alpha

Systems Parallelism Research

Uses advanced statistical methods, temporal knowledge graphs, parallelized analytics, MCP, and swarms of agents to extract robust financial relationships from messy, real-world data and deliver curated visualizations and insights.

RoutER, A Multi-Agent Medical Consultant

Multi-Agent Twilio

Multilingual AI triage network that routed patients to nearby hospitals based on wait times, hosted a Twilio phone agent, and displayed real-time patient info in an ER dashboard.

Self-Driving Car Prototype @ UCSD COSMOS

Autonomy Robotics Computer Vision

Built a self-driving car prototype at UCSD COSMOS, combining perception, control, and embedded robotics for autonomous navigation.

iOS Apps

Swift SwiftUI Firebase

Built iOS apps across AR capture, nightlife booking, and climate tracking, using SwiftUI, Firebase, ARKit, TestFlight, and production mobile workflows.

LED Matrix Pong with Joystick Control

Arduino PCB

Designed and soldered an 8x8 LED matrix PCB with joystick input and pMOS driver circuitry, using time-division multiplexing for responsive Pong gameplay.
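Time-division multiplexing on an LED matrix means driving one row at a time and cycling through the rows fast enough that the eye sees a steady image. The Python sketch below models one refresh pass. The row/column orientation, the bit ordering, and the active-low gate polarity for the pMOS row drivers are assumptions, and the function names are hypothetical.

```python
ROWS = 8
COLS = 8


def scan_frame(framebuffer):
    """Yield (row_select, column_bits) for each row time slot of one refresh."""
    for row_index, row in enumerate(framebuffer):
        row_select = 1 << row_index          # exactly one row enabled per slot
        column_bits = 0
        for col_index, pixel in enumerate(row):
            if pixel:
                column_bits |= 1 << col_index
        yield row_select, column_bits


def active_low_gates(row_select):
    """Assumed polarity: a pMOS high-side switch conducts when its gate is low."""
    return ~row_select & ((1 << ROWS) - 1)


def blank_frame():
    """Return an 8x8 framebuffer with every LED off."""
    return [[0] * COLS for _ in range(ROWS)]
```

On a microcontroller the same loop would typically run from a timer interrupt, writing each pair of values to the row and column pins for one slot before moving to the next row.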