Pranesh Nikhar's personal site.

🧠 LLM from Scratch β€” GPT-Style Transformer in PyTorch

A complete GPT-style decoder-only transformer built from scratch in PyTorch, without the HuggingFace transformers library. Includes a BPE tokenizer, training loop, FastAPI server, and React chat frontend.

🎯 What It Does

LLM from Scratch is a complete GPT-style decoder-only transformer language model (~5.2M parameters) built entirely in PyTorch without using the HuggingFace transformers library. It includes a BPE tokenizer trained from scratch, a training loop with modern optimizer settings, a FastAPI REST API for serving, and a React chat frontend.

$ python train.py
Step 1000 | loss 3.42 | lr 1.2e-4 | 12.5 it/s
Step 2000 | loss 2.89 | lr 8.5e-5 | 12.5 it/s
...
Generated: "shall I compare thee to a summer's day..."

🧱 Tech Stack

Component   Technology
Backend     Python 3.10+, PyTorch 2+, NumPy
API         FastAPI + uvicorn
Frontend    React 19, Vite 8, Tailwind CSS 4, Axios
Tokenizer   Custom BPE (byte-level, 5000 merges)

Beyond PyTorch itself, the transformer implementation relies on no model or tokenizer libraries: no transformers, no SentencePiece, no tiktoken.


🏗️ Architecture

BACKEND/
├── tokenizer.py         # BPE tokenizer (byte-level, configurable merges)
├── attention.py         # Multi-head causal self-attention (optional flash attn)
├── transformer_block.py # Pre-norm: LN → Attn → residual → LN → FFN → residual
├── model.py             # GPT: embeddings → stacked blocks → final LN → LM head
├── dataset.py           # Sliding-window dataset (predict shifted-by-1)
├── train.py             # AdamW, cosine LR, gradient accumulation, clipping
├── generate.py          # CLI text generation
├── server.py            # FastAPI endpoints: /generate, /health
└── config.py            # Hyperparameters + device detection

FRONTEND/
├── App.jsx              # Main React app
├── Chat.jsx             # Chat interface
└── api.js               # Axios API client

🧬 Model Architecture

Parameter         Value
Embedding dim     192
Layers            6
Attention heads   6 (head dim 32)
FFN hidden dim    768
Total params      ~5.2M
Vocabulary        5000 BPE tokens
Context length    256 tokens

The architecture follows the GPT-2 pattern:

  • Token embeddings + learned positional embeddings
  • Pre-norm transformer blocks (LayerNorm before attention and FFN)
  • Causal multi-head self-attention with optional Flash Attention
  • Feed-forward with GELU activation
  • Final LayerNorm followed by linear LM head
  • Weight tying between embedding and LM head
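
The causal masking named in the list above can be illustrated without PyTorch. The following is a plain-Python, single-head sketch for intuition only. It is not the project's attention.py, and the function names `softmax` and `causal_attention` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head causal attention.

    q, k, v are lists of vectors, one vector per sequence position.
    Position t only attends to positions 0..t, which is the causal mask.
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against the current and earlier positions only.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out
```

Because position 0 sees only itself, its output equals its own value vector, and changing later positions never affects earlier outputs. A real implementation would batch this as matrix operations over several heads.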

📝 BPE Tokenizer

The BPE tokenizer is implemented from scratch with:

  • Byte-level encoding (handles any UTF-8 input)
  • Configurable number of merge operations (default: 5000)
  • Trainable on any text corpus
  • Serialization via pickle for reuse after training

This avoids any dependency on HuggingFace tokenizers or tiktoken, while still getting reasonable tokenization quality for the Shakespeare domain.
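
As a rough illustration of byte-level BPE, the sketch below learns merges by repeatedly fusing the most frequent adjacent pair of token ids. It is a minimal version written for this explanation, not the project's tokenizer.py. The function names and the early stop when no pair repeats are assumptions.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes (ids 0..255)."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, kept in learned order
    for new_id in range(256, 256 + num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # assumption: stop once no pair repeats
            break
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return merges


def encode(text, merges):
    """Apply the learned merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand token ids back to bytes, then decode as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

Starting from bytes is what lets the tokenizer handle any UTF-8 input: an unseen character simply falls back to its individual bytes.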


🏋️ Training Setup

Setting                 Value
Optimizer               AdamW
Learning rate           Cosine schedule with linear warmup
Weight decay            Applied to weights only (not biases/norms)
Gradient accumulation   4 steps
Gradient clipping       1.0
Batch size              64 sequences
Device                  Auto-detect (CUDA / MPS / CPU)
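
A cosine schedule with linear warmup, as listed in the table, can be sketched as a pure function of the step number. The function name `lr_at` and all of its parameters are illustrative assumptions. The project's actual values and warmup length are not stated here.

```python
import math


def lr_at(step, max_lr, min_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup, then cosine decay to `min_lr`."""
    if step < warmup_steps:
        # Linear ramp from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        return min_lr
    # progress runs from 0 (end of warmup) to 1 (end of training).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In a training loop, a function like this would typically be called once per optimizer step and the result written into each optimizer parameter group.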

The dataset (dataset.py) implements sliding-window chunking: each training example is a contiguous 256-token window, and the target is the same window shifted by one token (next-token prediction).
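
The shifted-by-one windowing can be sketched on a plain list of token ids. This is an illustration rather than the contents of dataset.py. The `stride` parameter is an assumption, since the step between windows is not stated.

```python
def sliding_windows(token_ids, context_len, stride=1):
    """Yield (inputs, targets) pairs where targets are inputs shifted by one token."""
    # Each window needs context_len + 1 tokens: the last one is only ever a target.
    for start in range(0, len(token_ids) - context_len, stride):
        chunk = token_ids[start:start + context_len + 1]
        yield chunk[:-1], chunk[1:]
```

For example, with `token_ids = list(range(10))`, `context_len=4`, and `stride=4`, the first pair is `([0, 1, 2, 3], [1, 2, 3, 4])`: at every position the target is the token that follows the input.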


🌐 Serving

The FastAPI server exposes two endpoints:

GET  /health    # Health check
POST /generate  # Generate text: {prompt, max_tokens, temperature, top_k, top_p}
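
A client call to /generate might look like the sketch below, using only the standard library. The field names come from the endpoint listing above. The JSON body encoding, the default values, the helper name `build_generate_request`, and the base URL are assumptions, and the response schema is not documented here.

```python
import json
import urllib.request


def build_generate_request(prompt, max_tokens=100, temperature=0.8,
                           top_k=50, top_p=0.95,
                           base_url="http://localhost:8000"):
    """Build a POST request for /generate (defaults are illustrative only)."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
    }
    return urllib.request.Request(
        base_url + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending it requires the server to be running:
# with urllib.request.urlopen(build_generate_request("To be or not to be")) as resp:
#     print(json.load(resp))
```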

Generation parameters include temperature, top-k sampling, and top-p (nucleus) sampling, all implemented manually in model.py.
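
To make the three sampling controls concrete, here is a plain-Python sketch that applies temperature, then top-k, then top-p to a list of logits. It is one common ordering, not necessarily the one used in model.py. The function name `sample_next` and its signature are hypothetical.

```python
import math
import random


def sample_next(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token id from `logits` (temperature must be > 0)."""
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    # Token ids sorted from most to least likely.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    # Top-k: keep only the k highest-scoring tokens.
    if top_k is not None:
        order = order[:top_k]
    # Softmax over the surviving tokens (max-subtracted for stability).
    peak = scaled[order[0]]
    exps = [math.exp(scaled[i] - peak) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        cumulative, keep = 0.0, 0
        for p in probs:
            cumulative += p
            keep += 1
            if cumulative >= top_p:
                break
        order, probs = order[:keep], probs[:keep]
    # random.choices renormalizes the remaining weights.
    return rng.choices(order, weights=probs, k=1)[0]
```

With `top_k=1` the function reduces to greedy decoding, since only the highest-scoring token survives the filter.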

The React frontend provides a ChatGPT-like chat interface styled with Tailwind CSS 4.


🚀 Quick Start

# Train
cd BACKEND
python train.py  # Trains on Shakespeare, saves to checkpoints/

# Serve
python server.py
# → API running at http://localhost:8000

# Frontend (separate terminal)
cd FRONTEND
npm install
npm run dev
# → UI at http://localhost:5173

# Or generate from the CLI (run from BACKEND/, where generate.py lives)
python generate.py --prompt "To be or not to be" --temperature 0.8

💡 Why It's Interesting

This project is a complete LLM implementation from the ground up, with no black boxes and no library magic. Every component is hand-coded: the transformer blocks, the attention mechanism, the BPE tokenizer, the training loop with modern optimizer settings, and the sampling strategies. It's a useful reference for anyone who wants to understand how GPT-style models work under the hood, from tokenization to generation.
