Implement the Mish activation function: mish(x) = x * tanh(softplus(x)) where softplus(x) = ln(1 + e^x).
Apply the Mish formula element-wise to a list of values. Guard against overflow in e^x for large positive inputs.
import math

def mish(x: float) -> float:
    # For large x, tanh(softplus(x)) -> 1, so mish(x) ~ x.
    # Returning early also avoids overflow in math.exp(x).
    if x > 20.0:
        return x
    softplus = math.log(1.0 + math.exp(x))
    return x * math.tanh(softplus)

def mish_list(values: list[float]) -> list[float]:
    # Apply mish element-wise and round each result to 6 decimal places.
    return [round(mish(v), 6) for v in values]

Compute softplus(x) = ln(1 + e^x).
Apply tanh to the softplus result.
Multiply by x: mish(x) = x * tanh(softplus(x)).
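
The solution above handles large inputs with a cutoff at x > 20. As an alternative sketch, not part of the original solution, softplus can be rewritten with the identity softplus(x) = max(x, 0) + ln(1 + e^(-|x|)), which never calls exp on a large positive argument and so needs no threshold. The names softplus_stable and mish_stable are hypothetical, introduced only for this illustration.

```python
import math

def softplus_stable(x: float) -> float:
    # Hypothetical helper (not in the original solution).
    # Uses softplus(x) = max(x, 0) + log1p(exp(-|x|)); the argument to
    # exp is always <= 0, so it can underflow to 0.0 but never overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish_stable(x: float) -> float:
    # mish(x) = x * tanh(softplus(x)), with the overflow-free softplus.
    return x * math.tanh(softplus_stable(x))

def mish_list_stable(values: list[float]) -> list[float]:
    # Element-wise application, rounded to 6 decimals like mish_list.
    return [round(mish_stable(v), 6) for v in values]
```

With this form, an input such as 1000.0 returns 1000.0 without raising OverflowError, whereas a direct math.exp(1000.0) would raise. For moderate inputs it should agree with the thresholded version to within rounding.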