Training phase
In modern AI work, training happens in three main stages.
The first is pre-training. This stage ingests raw data: a truly massive corpus, on the order of terabytes to petabytes of text drawn from much of the public internet. The data is fed through a large neural network trained to predict the next token, and the result is a base model. The base model is genuinely capable, in that it knows language structure and general facts, but it is not yet optimized for our specific problems.
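The core idea of pre-training, learning to predict what comes next from raw text alone, can be illustrated with a toy stand-in. The sketch below is a hypothetical bigram "model" that merely counts successor frequencies; real pre-training optimizes a neural network over the same next-token objective, but the principle of learning structure purely from unlabeled data is the same.

```python
from collections import Counter, defaultdict

def pretrain_bigram(corpus_tokens):
    # Count how often each token follows each other token:
    # a crude, count-based stand-in for next-token prediction.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    # The "base model": maps each token to its most likely successor.
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

# Raw, unlabeled text is the only input; no targets are provided.
tokens = "the cat sat on the mat the cat ran".split()
base_model = pretrain_bigram(tokens)
# base_model["the"] is "cat": "cat" follows "the" twice, "mat" once.
```

Even this trivial model has picked up a statistical regularity of its "corpus" without any labeled examples, which is the essence of the pre-training stage.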
Next, the base model is fine-tuned. For this, we use known input data paired with the target output we want. The model makes a prediction on each input, the loss (the error between prediction and target) is computed, and the network's weights are adjusted to minimize that loss. This gives us a fine-tuned model suited to specific, targeted tasks.
Finally, the model goes through an alignment phase, often using Reinforcement Learning from Human Feedback (RLHF). In RLHF, the model generates several different outputs for a single prompt. A human labeler or a Reward Model (RM), which acts like an internal "grader program", then ranks or scores those outputs by how helpful and safe they are. This preference feedback is used to further train the model, bringing it closer to the desired target behavior.
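The sample-then-rank step at the heart of RLHF can be sketched as follows. The reward function here is a made-up heuristic standing in for a learned reward model, and the generator is a canned list of responses; in a real pipeline the ranking would feed a policy-gradient update (e.g. PPO) rather than just selecting a winner.

```python
import itertools

def reward_model(output):
    # Hypothetical "grader program": a toy heuristic that rewards
    # polite wording and penalizes rambling. A real RM is a trained
    # network scoring helpfulness and safety.
    score = 2.0 if ("glad" in output or "please" in output) else 0.0
    score -= 0.1 * len(output.split())  # shorter answers score higher
    return score

def rlhf_rank(prompt, generate, n_samples=4):
    # Sample several candidate outputs for one prompt, then rank them
    # by reward; the ranking is the feedback signal used for training.
    candidates = [generate(prompt) for _ in range(n_samples)]
    ranked = sorted(candidates, key=reward_model, reverse=True)
    return ranked[0], ranked[-1]  # best and worst, for illustration

# Deterministic stand-in for the model's sampler.
canned = itertools.cycle([
    "I am glad to help.",
    "No idea, figure it out yourself.",
])
best, worst = rlhf_rank("How do I reset my password?", lambda p: next(canned))
```

The ranked pairs (`best` preferred over `worst`) are exactly the kind of comparison data a reward model is trained on, and which the policy is then optimized against.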