Demystifying GPU Terminology: A Compiler Engineer’s Field Guide

SIMD, warps, occupancy, coalescing, shared memory, spills, and matrix units, mapped to real compiler decisions.

Table of Contents

TL;DR
1. Execution Model, Decoded
   SIMT vs SIMD (why is it confusing?)
   Warps/Wavefronts
   CTA (Cooperative Thread Array) / Workgroup
   Occupancy (It’s not a religion)
2. Memory Hierarchy (where performance is won and lost)
   Coalesced Access (the golden rule)
   Shared Memory (on-chip scratchpad)
   Spills (the invisible tax)
3. Math Units: Matrix Engines, Precision, and Shapes
4. Scheduling and Latency Hiding
   Warp Scheduling
   Divergence and Predication
5. Vendor Term Crosswalk
6. Checklists you will actually use
7. Quick Reference Cheat Sheet

GPUs aren’t mysterious, just picky. Most performance cliffs are not about the math; they’re about how warps step, how memory is fetched, and how often registers spill. This post decodes the jargon and, to be candid, it is me “spilling” my notes, trying to explain myself. ...

November 9, 2025 · 5 min

Booting Up: A Verbose Debug Build of Life and Compilers

Compiler passes, ML systems, and life: debug logs from the path between code and silicon.

“If life had a compiler, I’d probably still be tuning the optimization flags.”

Welcome to Tiled Thoughts, my verbose debug build. I’m Samarth, a compiler engineer at Qualcomm. My work revolves around building efficient ML systems, contributing to open-source compiler infrastructures like LLVM and MLIR, and exploring the intersection of programming languages and machine learning. This blog is where I log the things that don’t quite fit into a Git commit message: reflections, experiments, and observations tiled across compilers, ML systems, open-source, and education. ...

November 4, 2025 · 1 min