TLDR: The Q, K, V Matrices
Date: 2025-11-26
Source: https://arpitbhayani.me/blogs/qkv-matrices
Overview
At the core of the attention mechanism in LLMs are three matrices: Query, Key, and Value. These matrices are how transformers actually pay attention to different parts of the input. In this write-up, we will go through the construction of these matrices from the ground up.
Key Points
- Why Q, K, V Matrices Matter: When we read a sentence like “The cat sat on the mat because it was comfortable,” our brain automatically knows that “it” refers to “the mat” and not “the cat.” This is attention in action.
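- The Intuition: Think of the attention mechanism as a database lookup. The Query is what a token is searching for, the Keys are the labels each token is indexed under, and the Values are the content that gets returned. Unlike an exact-match lookup, attention is soft: every key matches to some degree, and the result is a weighted blend of all the values.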
- Attention Pipeline: Before going deeper, the post summarizes the whole flow of self-attention as one sequence. The author covers this flow in detail in an earlier post, How LLM Inference Works.
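
To make the lookup analogy concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration, not the original post's code. The helper names (`matmul`, `softmax`, `self_attention`, `random_matrix`) are hypothetical, and the random projection matrices stand in for weights a real model would learn. Masking and multiple heads are left out.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x is (tokens x d_model); w_q, w_k, w_v are (d_model x d_k) projections.
    Returns the attended output and the attention weights.
    """
    q = matmul(x, w_q)  # what each token is looking for
    k = matmul(x, w_k)  # what each token advertises
    v = matmul(x, w_v)  # what each token hands back when matched
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose K
    scores = matmul(q, k_t)  # query-key similarity for every token pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights  # weighted blend of values


def random_matrix(rows, cols, rng):
    """Placeholder for learned parameters (an assumption for this sketch)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x = random_matrix(3, 4, rng)    # 3 tokens, embedding size 4
w_q = random_matrix(4, 2, rng)  # project down to d_k = 2
w_k = random_matrix(4, 2, rng)
w_v = random_matrix(4, 2, rng)

output, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the value vectors. That is the "soft lookup": no single key wins outright, and each token's result mixes information from all tokens in proportion to how well their keys match its query.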