← back

XGBoost Objective Function Calculation

#347 · Machine Learning · Medium

⊣ Solve on deep-ml.com

Problem

Implement the XGBoost objective function calculation. Given predictions and true labels, compute the gradient and hessian for a specified loss function (squared error or logistic), then calculate the optimal leaf weight and the gain for a potential split.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
import numpy as np
from typing import Dict

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))

def compute_gradients(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    objective: str = "squared_error"
) -> Dict[str, np.ndarray]:
    if objective == "squared_error":
        grad = y_pred - y_true
        hess = np.ones_like(y_true)
    elif objective == "logistic":
        p = sigmoid(y_pred)
        grad = p - y_true
        hess = p * (1 - p)
    else:
        raise ValueError(f"Unknown objective: {objective}")

    return {"gradient": grad, "hessian": hess}

def compute_leaf_weight(
    gradient: np.ndarray,
    hessian: np.ndarray,
    reg_lambda: float = 1.0
) -> float:
    return -float(np.sum(gradient) / (np.sum(hessian) + reg_lambda))

def compute_split_gain(
    gradient_left: np.ndarray,
    hessian_left: np.ndarray,
    gradient_right: np.ndarray,
    hessian_right: np.ndarray,
    reg_lambda: float = 1.0,
    gamma: float = 0.0
) -> float:
    def score(g, h):
        return (np.sum(g) ** 2) / (np.sum(h) + reg_lambda)

    gain = 0.5 * (
        score(gradient_left, hessian_left) +
        score(gradient_right, hessian_right) -
        score(
            np.concatenate([gradient_left, gradient_right]),
            np.concatenate([hessian_left, hessian_right])
        )
    ) - gamma

    return float(gain)

Explanation

  1. Gradients: For squared error, gradient = pred - true, hessian = 1. For logistic loss, gradient = sigmoid(pred) - true, hessian = sigmoid(pred) * (1 - sigmoid(pred)).
  2. Leaf weight: The optimal leaf value is -sum(gradient) / (sum(hessian) + lambda), where lambda is L2 regularization.
  3. Split gain: Measures the improvement from splitting. Gain = 0.5 * (score_left + score_right - score_parent) - gamma, where score = sum(g)^2 / (sum(h) + lambda). Gamma is the minimum gain threshold.
  4. A positive gain means the split is beneficial.

Complexity

  • Time: O(n) where n is the number of samples in the node
  • Space: O(n) for gradient and hessian arrays