Feature Scaling Implementation

#16 · Machine Learning · Easy

Problem

Implement feature scaling (standardization) for a dataset. Transform each feature so it has zero mean and unit variance.

Solution

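def feature_scaling(data: list[list[float]]) -> tuple[list[list[float]], list[float], list[float]]:
    """Standardize each column of data; return (scaled, means, stds)."""
    # Guard against an empty dataset so data[0] below cannot raise IndexError.
    if not data:
        return ([], [], [])
    n = len(data)
    num_features = len(data[0])
    means = []
    stds = []
    # First pass: per-feature mean and population standard deviation.
    for j in range(num_features):
        col = [data[i][j] for i in range(n)]
        mean = sum(col) / n
        # Population variance: divide by n, not n - 1.
        variance = sum((x - mean) ** 2 for x in col) / n
        std = variance ** 0.5
        means.append(mean)
        stds.append(std)
    # Second pass: apply z = (x - mean) / std to every value.
    scaled = []
    for i in range(n):
        row = []
        for j in range(num_features):
            if stds[j] != 0:
                row.append((data[i][j] - means[j]) / stds[j])
            else:
                # Constant feature: avoid division by zero and emit 0.0.
                row.append(0.0)
        scaled.append(row)
    return (scaled, means, stds)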

Explanation

  1. For each feature (column), calculate the mean and the population standard deviation (dividing by n, not n - 1).
  2. Transform each value using z = (x - mean) / std.
  3. If the standard deviation is zero (constant feature), set the scaled value to 0 to avoid dividing by zero.
  4. Return the scaled data along with the means and standard deviations for later use.
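
The steps above can be checked by hand on a single column. The sketch below is illustrative only: standardize_column is a hypothetical helper (not part of the solution) that applies the same formulas to one feature, first to [1, 3, 5] and then to a constant column.

```python
import math

def standardize_column(col):
    """Z-score one column using the population standard deviation (divide by n)."""
    n = len(col)
    mean = sum(col) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
    # Constant column: std is 0, so every scaled value is set to 0.0.
    scaled = [(x - mean) / std if std != 0 else 0.0 for x in col]
    return scaled, mean, std

# Column [1, 3, 5]: mean = 3, variance = (4 + 0 + 4) / 3 = 8/3, std = sqrt(8/3).
scaled, mean, std = standardize_column([1.0, 3.0, 5.0])

# Constant column: mean = 7, std = 0, so the zero-std branch is taken.
const_scaled, const_mean, const_std = standardize_column([7.0, 7.0, 7.0])
```

Here the middle value equals the mean, so it scales to 0, while the outer values scale to about -1.2247 and 1.2247 (that is, -2 and 2 divided by sqrt(8/3)).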

Complexity

  • Time: O(n * f), where n is the number of samples and f is the number of features
  • Space: O(n * f) for the scaled data, plus O(f) for the means and standard deviations
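
The means and standard deviations are returned "for later use"; one plausible use is scaling new rows with the statistics computed earlier. The sketch below assumes that workflow. apply_scaling is a hypothetical helper and the statistics are made-up illustrative values, not output from the solution.

```python
def apply_scaling(rows, means, stds):
    """Scale new rows with previously computed per-feature means and stds."""
    return [
        # Same convention as the solution: a zero std maps the value to 0.0.
        [(x - m) / s if s != 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Illustrative statistics, as they might come back from an earlier call.
train_means = [3.0, 10.0]
train_stds = [2.0, 0.0]  # the second feature was constant in the earlier data

new_rows = [[5.0, 10.0], [1.0, 12.0]]
scaled_new = apply_scaling(new_rows, train_means, train_stds)
```

Under this convention a feature that was constant earlier always maps to 0.0, even when a new row carries a different value for it (12.0 above). Whether that is the desired behavior depends on the application.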