← back

Partial Derivatives of Multivariable Functions

#215 · Calculus · Medium

⊣ Solve on deep-ml.com

Problem

Compute Partial Derivatives of multivariable functions. Given a function f(x1, x2, ..., xn), compute the partial derivative with respect to each variable, both numerically and return the gradient vector.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
import numpy as np

def partial_derivative(f, x: list, var_index: int,
                       epsilon: float = 1e-7) -> float:
    """
    Compute partial derivative df/dx_i at point x.
    f: function taking a list/array of inputs
    x: point at which to evaluate
    var_index: index of the variable to differentiate with respect to
    """
    x = list(x)
    x_plus = x.copy()
    x_minus = x.copy()
    x_plus[var_index] += epsilon
    x_minus[var_index] -= epsilon
    return (f(x_plus) - f(x_minus)) / (2 * epsilon)

def gradient(f, x: list, epsilon: float = 1e-7) -> list:
    """Compute the gradient vector [df/dx_0, df/dx_1, ..., df/dx_n]."""
    return [partial_derivative(f, x, i, epsilon) for i in range(len(x))]

def hessian(f, x: list, epsilon: float = 1e-5) -> list:
    """Compute the Hessian matrix of second partial derivatives."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            x_pp = list(x)
            x_pm = list(x)
            x_mp = list(x)
            x_mm = list(x)
            x_pp[i] += epsilon
            x_pp[j] += epsilon
            x_pm[i] += epsilon
            x_pm[j] -= epsilon
            x_mp[i] -= epsilon
            x_mp[j] += epsilon
            x_mm[i] -= epsilon
            x_mm[j] -= epsilon
            H[i][j] = (f(x_pp) - f(x_pm) - f(x_mp) + f(x_mm)) / (
                4 * epsilon ** 2)
    return H

def directional_derivative(f, x: list, direction: list,
                            epsilon: float = 1e-7) -> float:
    """Compute the directional derivative along a unit vector."""
    d = np.array(direction, dtype=float)
    d = d / np.linalg.norm(d)
    grad = np.array(gradient(f, x, epsilon))
    return float(np.dot(grad, d))

Explanation

  1. Partial derivative: Fix all variables except one and differentiate with respect to that variable using central differences.
  2. Gradient: The vector of all partial derivatives. Points in the direction of steepest ascent.
  3. Hessian: Matrix of second partial derivatives, computed using the finite-difference formula for mixed partials: d^2f/(dx_i dx_j) = (f(x+ei+ej) - f(x+ei-ej) - f(x-ei+ej) + f(x-ei-ej)) / (4 * eps^2).
  4. Directional derivative: The dot product of the gradient with a unit direction vector, giving the rate of change of f in that direction.

Complexity

  • Time: O(n) for the gradient (2n function evaluations), O(n^2) for the Hessian (4n^2 evaluations)
  • Space: O(n) for the gradient, O(n^2) for the Hessian