About This Project
Langsplain is an interactive educational tool that helps you understand how modern Large Language Models (LLMs) work under the hood.
What You'll Learn
- Architecture: tokenization, embeddings, attention, FFN/MoE, and output projection
- Training: data preparation, loss optimization, backpropagation, and post-training alignment
- Inference: prefill, KV cache, sampling, decode loops, and stop conditions
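To give a flavor of the attention step listed above, here is a minimal sketch of scaled dot-product attention in plain JavaScript. The dimensions and helper names (`softmax`, `attention`) are illustrative assumptions, not this project's actual code:

```javascript
// Numerically stable softmax over an array of scores.
function softmax(xs) {
  const m = Math.max(...xs);
  const exps = xs.map(x => Math.exp(x - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Scaled dot-product attention: each query row is a weighted
// mix of the value rows, weighted by softmax(q . k / sqrt(d)).
function attention(Q, K, V) {
  const d = K[0].length;
  return Q.map(q => {
    const scores = K.map(k =>
      k.reduce((acc, kj, j) => acc + q[j] * kj, 0) / Math.sqrt(d)
    );
    const weights = softmax(scores);
    // Weighted sum of value vectors.
    return V[0].map((_, j) =>
      weights.reduce((acc, w, i) => acc + w * V[i][j], 0)
    );
  });
}
```

A real transformer first projects the input through learned Q/K/V weight matrices and runs many such heads in parallel; this sketch shows only the core mixing step.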
Interactive Features
- Guided Tour: A step-by-step walkthrough of the Architecture, Training, and Inference sections
- Section-Specific Diagrams: Click any component to open detailed explanations
- Attention Demo: Visualize how tokens attend to each other
- MoE Demo: See how routing works in Mixture-of-Experts models
- Gradient Demo: Step through optimization on a 2D loss surface
- Loss Demo: Watch cross-entropy and perplexity change during training
- Sampling Demo: Explore temperature, top-k, and top-p effects
- KV Cache Demo: Compare cached vs uncached generation cost
- Generation Demo: Step through prefill + autoregressive decode
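As a rough illustration of what the sampling demo explores, the sketch below applies temperature scaling and top-k filtering to raw logits. The function name and option shapes are assumptions for this example, not the demo's actual implementation:

```javascript
// Turn raw logits into a sampling distribution.
// temperature < 1 sharpens the distribution, > 1 flattens it;
// topK keeps only the k largest logits and masks out the rest.
function sampleDistribution(logits, { temperature = 1.0, topK = logits.length } = {}) {
  const scaled = logits.map(l => l / temperature);
  const kth = [...scaled].sort((a, b) => b - a)[topK - 1];
  const masked = scaled.map(l => (l >= kth ? l : -Infinity));
  // Numerically stable softmax over the surviving logits.
  const m = Math.max(...masked);
  const exps = masked.map(l => Math.exp(l - m));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum); // probabilities over the vocabulary
}
```

Top-p (nucleus) sampling works similarly, except the cutoff is the smallest set of tokens whose cumulative probability exceeds p rather than a fixed count k.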
Further Learning
- Attention Is All You Need - The original transformer paper
- The Illustrated Transformer - Visual guide by Jay Alammar
- Switch Transformers - MoE at scale
- Neural Networks: Zero to Hero - Andrej Karpathy's course
Technical Notes
This visualization uses simplified, toy-sized models for demonstration purposes. Real LLMs have much larger hidden dimensions (e.g., 4096-8192 vs. our 64) and many more layers (32-96 vs. our 3). The attention patterns shown are computed on actual (tiny) weights, but they won't match production model behavior.
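To see how large that size gap is, a common back-of-envelope estimate puts roughly 12 * layers * d^2 weights in a transformer's attention and FFN blocks (embeddings excluded). The helper below is hypothetical, written just to compare the two scales mentioned above:

```javascript
// Rough transformer parameter count: ~12 * layers * d^2
// (attention + FFN weights only; embeddings and norms excluded).
// approxParams is a hypothetical helper, not part of this project.
const approxParams = (layers, d) => 12 * layers * d * d;

console.log(approxParams(3, 64));    // 147456 (the toy model, ~147K)
console.log(approxParams(32, 4096)); // 6442450944 (production scale, ~6.4B)
```

So the toy model here is tens of thousands of times smaller than even a modest production LLM, which is why its attention patterns shouldn't be read as representative.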
Credits
Built with vanilla JavaScript, D3.js for visualizations, and Anime.js for animations. No framework dependencies - just clean, educational code.