Context Window
The context window is the maximum amount of text, combining input (prompt) and output (completion), that a large language model (LLM) can process or 'remember' during a single interaction. This capacity is measured in tokens, where a token corresponds to a word or a sub-word fragment; common English words are often a single token, while rarer words are split into several.
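To make the budget concrete, here is a minimal sketch of checking whether a prompt fits a context window, using the open-source tiktoken tokenizer. The encoding name and the 8,192-token limit are assumptions for illustration; the actual tokenizer and window size depend on the model.

```python
# Sketch: counting tokens to check whether a prompt fits a context window.
# The encoding name and the 8,192-token limit are illustrative assumptions;
# real values vary by model.
import tiktoken

CONTEXT_WINDOW = 8_192          # assumed total limit for this example
MAX_COMPLETION_TOKENS = 1_024   # room reserved for the model's output

enc = tiktoken.get_encoding("cl100k_base")

def fits_in_window(prompt: str) -> bool:
    """Return True if the prompt plus reserved completion space fits."""
    prompt_tokens = len(enc.encode(prompt))
    return prompt_tokens + MAX_COMPLETION_TOKENS <= CONTEXT_WINDOW

print(fits_in_window("The context window refers to the maximum amount of text..."))
```

Reserving space for the completion matters because the prompt and the model's output share the same window.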
The size of the context window directly dictates the complexity and scope of tasks an LLM can handle. A larger window allows the model to maintain coherence over much longer conversations, analyze extensive documents, and reference earlier parts of a complex input without losing track of the overall thread.
Transformer-based models, which power most modern LLMs, attend over every token within this fixed limit via self-attention; the window size is set by how many positions the model was trained to handle. When the input exceeds the context window, text must be truncated, most commonly by dropping the oldest messages or the beginning of the document, so the model effectively 'forgets' how the conversation or document started.
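A minimal sketch of that truncation strategy follows, keeping only the most recent chat messages that fit an assumed token budget. Token counts are approximated by whitespace splitting purely for illustration; a real system would use the model's own tokenizer as in the earlier example.

```python
# Sketch: dropping the oldest chat messages so the remaining history fits a
# hypothetical token budget. The budget and the whitespace-based token count
# are illustrative assumptions, not any particular model's behavior.
from collections import deque

TOKEN_BUDGET = 4_096  # assumed allowance for conversation history

def count_tokens(text: str) -> int:
    return len(text.split())  # rough stand-in for a real tokenizer

def truncate_history(messages: list[str]) -> list[str]:
    """Keep the most recent messages whose combined size fits the budget."""
    kept: deque[str] = deque()
    total = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if total + cost > TOKEN_BUDGET:
            break                       # everything older is 'forgotten'
        kept.appendleft(message)
        total += cost
    return list(kept)
```

Dropping whole messages from the oldest end is only one policy; applications may instead summarize earlier turns to preserve some of that context in compressed form.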