Tunable Parameters


    Tunable parameters in large language models (LLMs) are the learned numerical values (weights and biases) that the model adjusts during training to minimize error and improve performance on language tasks.

    • Also called trainable parameters.
    • Include weights and biases in layers such as:
      • Attention layers (e.g., query, key, value, and output projection matrices)
      • Feedforward (MLP) layers
      • Embedding layers
    • Typically number in the billions in modern LLMs (e.g., GPT-3 has 175 billion).
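    The parameter counts above can be sketched with a back-of-the-envelope calculation. The function below is a hypothetical simplification: it counts only the dominant weight matrices (attention projections, MLP layers, and the token embedding table) and ignores biases and layer-norm parameters, using an assumed GPT-3-like configuration.

    ```python
    def transformer_param_count(n_layers: int, d_model: int, d_ff: int, vocab_size: int) -> int:
        """Rough count of tunable parameters in a decoder-only transformer.

        Simplified: counts only the large weight matrices, not biases
        or layer-norm parameters.
        """
        attn = 4 * d_model * d_model   # query, key, value, and output projections
        mlp = 2 * d_model * d_ff       # up- and down-projection matrices
        per_layer = attn + mlp
        embed = vocab_size * d_model   # token embedding table
        return n_layers * per_layer + embed

    # Assumed GPT-3-like configuration: 96 layers, d_model=12288,
    # d_ff = 4 * d_model, ~50k-token vocabulary.
    total = transformer_param_count(96, 12288, 4 * 12288, 50257)
    print(f"{total / 1e9:.1f}B parameters")  # → 174.6B parameters
    ```

    Even this rough count lands near the widely cited 175-billion figure, showing that attention and MLP weight matrices dominate the total.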

    Tunable parameters store the knowledge the model gains from training data. During training:

    1. The model makes a prediction.

    2. A loss function measures the error.

    3. Backpropagation computes how each tunable parameter contributed to the error, and an optimizer (e.g., gradient descent) adjusts the parameters to reduce it.
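    The three steps above can be sketched as a minimal gradient-descent loop. This toy example fits a single tunable parameter w to the target function y = 3x from one training example; the gradient is computed by hand rather than by automatic differentiation, but the predict / measure / adjust cycle is the same.

    ```python
    def train(steps: int = 100, lr: float = 0.1) -> float:
        """Fit y = 3x with one tunable parameter via gradient descent."""
        w = 0.0                  # the tunable parameter, initialized at zero
        x, y_true = 2.0, 6.0     # one training example (target slope is 3)
        for _ in range(steps):
            y_pred = w * x                    # 1. the model makes a prediction
            loss = (y_pred - y_true) ** 2     # 2. a loss function measures the error
            grad = 2 * (y_pred - y_true) * x  # 3. gradient of the loss w.r.t. w
            w -= lr * grad                    #    adjust w to reduce the error
        return w

    print(round(train(), 4))  # → 3.0
    ```

    After training, w has converged to 3.0: the knowledge "multiply by three" is now stored in the tunable parameter, just as an LLM's knowledge is stored in its billions of weights.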