Trending:
AI & Machine Learning

Karpathy's 3.5-hour LLM tutorial distills ChatGPT training into 63 key questions

Former OpenAI researcher Andrej Karpathy's February 2025 tutorial breaks down how LLMs are built, from tokenization through RLHF; the material has since been condensed into 63 Q&As covering pre-training, post-training, and practical deployment considerations. Worth watching if you're evaluating LLMs for enterprise applications.

Andrej Karpathy's "Deep Dive into LLMs like ChatGPT" has become essential viewing for technical leaders evaluating language models. The 3.5-hour tutorial, released February 2025, covers the full training pipeline: pre-training on web-scraped data, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).

The 63 condensed Q&As pull out the practical insights. Tokenizer design is one example of a key technical decision: GPT-4's vocabulary contains 100,277 tokens. Too few tokens create unwieldy sequence lengths; too many bloat the embedding and output layers. Common Crawl data requires heavy filtering for malware, PII, and language relevance before training. The Byte Pair Encoding (BPE) algorithm compresses frequent byte sequences into single tokens, balancing sequence length against vocabulary size.
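To make that trade-off concrete, here is a minimal Python sketch of the BPE merge loop Karpathy describes; the tiny corpus, function names, and merge count are illustrative, not taken from the tutorial.

```python
from collections import Counter

def most_frequent_pair(seqs):
    """Count adjacent token pairs across all sequences; return the top pair."""
    pairs = Counter()
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(seq, pair, new_id):
    """Replace every occurrence of `pair` in `seq` with the new token id."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

# Start from raw UTF-8 bytes (ids 0-255). Each merge shortens sequences
# but grows the vocabulary by one token -- the core BPE trade-off.
corpus = [list("the theme of the thesis".encode("utf-8"))]
base_vocab, n_merges = 256, 10      # 10 merges, purely illustrative
next_id, merges = base_vocab, {}
while next_id < base_vocab + n_merges:
    pair = most_frequent_pair(corpus)
    if pair is None:
        break
    merges[pair] = next_id
    corpus = [merge_pair(seq, pair, next_id) for seq in corpus]
    next_id += 1

print(len(merges), "merges learned; sequence length now:", len(corpus[0]))
```

Production tokenizers like GPT-4's run this process over terabytes of text until the vocabulary hits its target size, but the mechanism is the same loop.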

Karpathy, now at Eureka Labs after stints at OpenAI and Tesla AI, emphasizes LLM limitations alongside capabilities. Models predict tokens statistically, not through reasoning. They lack persistent memory between sessions. Hallucinations persist despite RLHF refinement. Security risks include jailbreaks and prompt injection.
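The "statistical prediction, not reasoning" point comes down to the model sampling each next token from a probability distribution over its vocabulary. A toy illustration in Python (the candidate tokens and logit values here are invented for the example):

```python
import math, random

def softmax(logits, temperature=1.0):
    """Convert raw model scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to four candidate tokens
# after the prompt "The capital of France is" -- made-up numbers.
candidates = ["Paris", "Lyon", "a", "the"]
logits = [6.0, 2.5, 1.0, 0.5]

probs = softmax(logits)
next_token = random.choices(candidates, weights=probs, k=1)[0]
print(dict(zip(candidates, [round(p, 3) for p in probs])), "->", next_token)
```

Because generation is a weighted draw rather than a lookup of facts, a fluent-sounding wrong answer is always possible; this is the mechanical root of hallucination.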

For enterprise architects, the tutorial demystifies model selection criteria. Understanding tokenization helps in evaluating context windows. Knowing RLHF's role clarifies why models still confidently produce incorrect outputs. The follow-up "How I use LLMs" (February 27, 2 hours, 2.15M views) covers practical applications: custom GPTs, code generation via Cursor, and audio and video processing.
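On the context-window point, a quick sanity check is to count tokens before sending a prompt. This sketch assumes OpenAI's open-source tiktoken library, whose cl100k_base encoding matches the GPT-4-era vocabulary discussed above; the helper function and the limit values are illustrative defaults, not fixed API parameters.

```python
# Assumes `pip install tiktoken`; cl100k_base is the GPT-4-era encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def fits_in_context(text: str, context_window: int = 8192,
                    reserve_for_output: int = 1024) -> bool:
    """Check whether a prompt leaves headroom for the model's response."""
    n_tokens = len(enc.encode(text))
    return n_tokens + reserve_for_output <= context_window

document = "Q3 roadmap review notes. " * 500   # stand-in for a real prompt
print(fits_in_context(document))
```

Reserving output headroom matters because the context window bounds prompt and completion tokens combined, not the prompt alone.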

The original tutorial spawned multiple summary formats as viewers found the content valuable but lengthy. The Q&A condensation approach (watch once for an overview, then rewatch slowly while documenting each concept) proved effective for technical learning.

His channel (1.23M subscribers, 17 videos) has built credibility since his 2023 introductory video. The 2025 content reflects the rapid evolution of production LLM deployment. Worth noting: Karpathy points to scaling laws as the driver of LLM improvement but doesn't quantify market projections. His focus stays on technical fundamentals that outlast hype cycles.