The life cycle of an AI model
A Large Language Model (LLM) undergoes two main phases:
- The training phase
  - The model is trained on large data sets
- The usage phase
  - The model can be used to generate answers
  - The model cannot learn anymore

Training an LLM
During training, the model processes vast amounts of text data using a technique called “next token prediction.” The model learns statistical relationships between words and concepts by repeatedly predicting what word should come next in a sequence.
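A drastically simplified sketch of this idea, assuming a toy corpus and using plain bigram counting (a real LLM uses a neural network over long contexts, not a lookup table):

```python
from collections import Counter, defaultdict

# Toy illustration of "next token prediction": count which token
# follows which in a tiny corpus, then turn counts into probabilities.
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

# P(next | current): normalize each row of counts.
probs = {
    tok: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
    for tok, followers in counts.items()
}

print(probs["the"])  # "cat" is the most likely token after "the"
```

Even this crude table captures the core statistical relationship: after seeing "the", the model considers "cat" more likely than "mat", because that pattern occurred more often in the data.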
Using an LLM
During the usage phase (also known as inference), the model generates responses by sampling from the probability distributions it learned during training. When you ask about "Artificial Intelligence", the model assigns a much higher probability to related terms like "machine learning" than to unrelated ones like "banana cake".

A "Hi" from the user makes the model likely to answer with a greeting, so it responds with "Hello". It then generates the next most likely word based on "Hi" and "Hello". This process repeats until the model decides the request has been sufficiently answered.
The generation process works token by token:
- User sends: "Hi"
- Model predicts high probability for greeting tokens like "Hello"
- Model then predicts the next token based on "Hi Hello"
- This continues until the model generates an end-of-sequence token
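The steps above can be sketched as a loop. The probability table below is invented for illustration, and for simplicity each prediction only looks at the last token; a real LLM conditions on the entire sequence so far:

```python
import random

# Hand-written stand-in for a trained model's output distribution.
# "<eos>" is the end-of-sequence token that stops generation.
NEXT_TOKEN_PROBS = {
    "Hi": {"Hello": 0.9, "Hey": 0.1},
    "Hello": {"there": 0.7, "!": 0.3},
    "there": {"<eos>": 1.0},
    "Hey": {"<eos>": 1.0},
    "!": {"<eos>": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> list[str]:
    tokens = [prompt]
    for _ in range(max_tokens):
        # Sample the next token from the learned distribution.
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        nxt = random.choices(list(dist), weights=list(dist.values()))[0]
        if nxt == "<eos>":
            break  # the model decided the answer is complete
        tokens.append(nxt)
    return tokens

print(generate("Hi"))  # e.g. ['Hi', 'Hello', 'there']
```

Note that generation is probabilistic: running the loop twice can yield different answers, which is why the same question can get different responses from the same model.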
Influencing the output of a response
Since models cannot learn after deployment, how do they remember previous messages or incorporate new information? The answer lies in the context window.
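A minimal sketch of this mechanism, assuming a hypothetical `fake_model` function in place of a real LLM call: the model itself is stateless, and the application re-sends the entire conversation inside the context window on every turn.

```python
# The model's apparent "memory" lives in the application, not the model.
history: list[str] = []

def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM: reports how much context it saw.
    return f"(saw {len(prompt.splitlines())} line(s) of context)"

def chat_turn(user_message: str) -> str:
    history.append(f"User: {user_message}")
    prompt = "\n".join(history)   # the full context window contents
    reply = fake_model(prompt)
    history.append(f"Assistant: {reply}")
    return reply

chat_turn("Hi")
chat_turn("What did I just say?")
print(history)
```

On the second turn the stand-in model "sees" three lines of context, because the first exchange was included in the prompt; this is exactly how a deployed LLM appears to remember earlier messages.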