Implement K-Fold Cross-Validation

#18 · Machine Learning · Medium

⊣ Solve on deep-ml.com

Problem

Implement K-Fold Cross-Validation from scratch. Split a dataset into K folds, and for each fold, use it as the validation set while the remaining folds form the training set.

Solution

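def cross_validation_split(data: list, k: int) -> list:
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    # The first `remainder` folds take one extra item, so no item is left out.
    fold_size, remainder = divmod(len(data), k)
    folds = []
    start = 0
    for i in range(k):
        end = start + fold_size + (1 if i < remainder else 0)
        validation = data[start:end]          # contiguous slice for fold i
        training = data[:start] + data[end:]  # everything outside that slice
        folds.append([training, validation])
        start = end
    return folds

Explanation

  1. Compute the base fold size and the remainder with divmod(len(data), k). The first `remainder` folds take one extra item, so every item appears in exactly one validation set even when the length is not divisible by K.
  2. For each fold index i, the validation set is the contiguous slice data[start:end], where end is start + fold_size, plus one if i < remainder.
  3. The training set is everything before and after that slice, concatenated.
  4. Return a list of [training, validation] pairs.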
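To show how the returned pairs might be consumed, here is a minimal sketch that scores a baseline "model" predicting the training mean. The helper cross_validated_mse and the sample targets are hypothetical, and the compact cross_validation_split inside the block is only a stand-in so the example runs on its own; it assumes the data length is divisible by k.

```python
from statistics import mean


def cross_validation_split(data: list, k: int) -> list:
    # Compact stand-in for the solution above so this sketch runs on its own;
    # it assumes len(data) is divisible by k.
    fold_size = len(data) // k
    return [
        [data[:i * fold_size] + data[(i + 1) * fold_size:],
         data[i * fold_size:(i + 1) * fold_size]]
        for i in range(k)
    ]


def cross_validated_mse(targets: list, k: int) -> float:
    # Hypothetical helper: "fit" a baseline that predicts the training mean,
    # then average its squared error over the k validation folds.
    fold_errors = []
    for training, validation in cross_validation_split(targets, k):
        prediction = mean(training)
        fold_errors.append(mean((y - prediction) ** 2 for y in validation))
    return mean(fold_errors)


targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
splits = cross_validation_split(targets, 3)
score = cross_validated_mse(targets, 3)
```

Each item is held out exactly once, so the averaged score reflects performance on data the baseline never saw during its "fit".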

Complexity

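  • Time: O(k * n), where n is the dataset size: each of the k iterations copies all n items when slicing out the training and validation lists.
  • Space: O(k * n), because all k [training, validation] pairs are stored in the returned list.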
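If holding every fold in memory at once is a concern, one possible variation is to yield the pairs lazily, so only one [training, validation] copy exists at a time. This is a sketch, not part of the original solution: the name iter_cross_validation_splits is hypothetical, and it assumes the data length is divisible by k. Total time is still O(k * n), but peak extra space drops to O(n) as long as the caller does not collect the pairs.

```python
from typing import Iterator, Tuple


def iter_cross_validation_splits(data: list, k: int) -> Iterator[Tuple[list, list]]:
    # Hypothetical lazy variant: the same contiguous folds, yielded one at a
    # time. Assumes len(data) is divisible by k.
    fold_size = len(data) // k
    for i in range(k):
        start = i * fold_size
        end = start + fold_size
        yield data[:start] + data[end:], data[start:end]


# Collected into a list here only to inspect the result; in real use, loop
# over the generator directly to keep one fold in memory at a time.
pairs = list(iter_cross_validation_splits([1, 2, 3, 4, 5, 6], 3))
```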