Automatic differentiation for financial options

19 minute read

Hi, all! Today I’m going to talk about automatic differentiation (AD) applied to computing the greeks of financial options. I’ll use the JAX library to compute the greeks of a European option by means of its reverse-mode automatic differentiation functionality.

First, I’ll define the analytical formulas for the price of a European call option under the Black-Scholes model. After that, I’ll briefly explain how reverse-mode automatic differentiation works given a computational graph of functions and arithmetic operations. Then I’ll translate those formulas into Python code and compute the greeks with JAX’s grad function. Finally, I’ll compare the AD greeks with the analytical ones in a series of plots.

Figure: reverse-mode automatic differentiation computational graph. Source: Wikipedia.

European options: a little theory

We’ll use the Black-Scholes model to compute the price and analytical greeks of a European option. The model makes some unrealistic assumptions about how markets and their variables behave, but for the purposes of this post it is fine. The model starts from a partial differential equation (PDE) for the price $V$ of an option:

$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r - q) S \frac{\partial V}{\partial S} - r V = 0$$

which comes together with the following boundary conditions for call options:

$$V(0, t) = 0, \qquad V(S, t) \sim S e^{-q(T - t)} \ \text{as}\ S \to \infty, \qquad V(S, T) = \max(S - K,\, 0)$$

After a change of variables, it can be shown that the solution to the equation above (for a call priced at $t = 0$ with maturity $T$) is:

$$C(S_0, T) = S_0 e^{-qT} N(d) - K e^{-rT} N(d_1)$$

with

$$d = \frac{\ln(S_0 / K) + \left(r - q + \frac{\sigma^2}{2}\right) T}{\sigma \sqrt{T}}, \qquad d_1 = d - \sigma \sqrt{T},$$

where $N$ denotes the standard normal CDF and $N'$ (used below) its density.

The call option can be thought of as the difference of two binary options: the first term is an asset-or-nothing call and the second a cash-or-nothing call.
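Spelled out, with the two pieces of the formula above labeled:

$$C = \underbrace{S_0 e^{-qT} N(d)}_{\text{asset-or-nothing call}} - \underbrace{K e^{-rT} N(d_1)}_{\text{cash-or-nothing call paying } K}$$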

The greeks below are written for $q = 0$, matching both the analytical implementations and the numerical example later in the post. $\Delta$, the derivative of the price w.r.t. $S_0$, is defined as:

$$\Delta = \frac{\partial C}{\partial S_0} = N(d)$$

Applying the derivative operator w.r.t. $S_0$ once more, we obtain $\Gamma$:

$$\Gamma = \frac{\partial^2 C}{\partial S_0^2} = \frac{N'(d)}{S_0\, \sigma \sqrt{T}}$$

$\nu$, the derivative w.r.t. the volatility $\sigma$, can be written as:

$$\nu = \frac{\partial C}{\partial \sigma} = S_0\, N'(d)\, \sqrt{T}$$

Moreover, differentiating w.r.t. $T$ and flipping the sign (by convention, $\theta$ measures the sensitivity to the passage of time), we get $\theta$:

$$\theta = -\frac{\partial C}{\partial T} = -\frac{S_0\, N'(d)\, \sigma}{2 \sqrt{T}} - r K e^{-rT} N(d_1)$$

Finally, $\rho$ is defined as:

$$\rho = \frac{\partial C}{\partial r} = K T e^{-rT} N(d_1)$$

Reverse automatic differentiation

Derivatives play a crucial role in areas such as mathematics, deep learning, and quantitative finance. In machine learning they are used together with the backpropagation algorithm to train neural networks, while in quantitative finance they are used to hedge complex financial options by means of their sensitivities: the greeks.

AD refers to a family of techniques that let a computer evaluate the derivative of a function $f : R^n \rightarrow R^m$. These methods rely on the fact that any mathematical expression a program evaluates can be decomposed into simpler operations and functions. Among the main advantages of AD: it works for any function whose components support AD, and it introduces no truncation error, unlike finite-difference methods (only the usual floating-point rounding remains).

The chain rule plays a crucial role in AD: by applying it repeatedly to those operations and functions, the computer can obtain any derivative contained in the computational graph. There are two main flavors of automatic differentiation: reverse and forward. In reverse mode the chain rule is applied from the outside to the inside of the function composition, while in forward mode it is applied from the inside to the outside. Let’s see this with an example. Suppose the function $f$ is defined as the following composition:

$$f(x) = f_3(f_2(f_1(x)))$$

where we can rename:

$$w_0 = x, \qquad w_1 = f_1(w_0), \qquad w_2 = f_2(w_1), \qquad y = f_3(w_2)$$

If we compute the derivative of $f$ in reverse mode, we group the chain rule from the output inward:

$$\frac{dy}{dx} = \left( \left( \frac{dy}{dw_2} \right) \frac{dw_2}{dw_1} \right) \frac{dw_1}{dw_0}$$

which is efficient in terms of the number of operations when $n \gg m$, i.e. when the function has many inputs and few outputs (at the cost of storing the intermediate values of the graph). Conversely, forward-mode automatic differentiation works better when the number of inputs is lower than the number of outputs. Since an option price is a single scalar output depending on several inputs, reverse-mode autodiff is well suited to computing the greeks of financial derivatives.
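As a minimal, self-contained sketch of this trade-off (a toy function, not the pricer used below), JAX exposes both modes through jacrev and jacfwd:

import jax
import jax.numpy as jnp

# A scalar-valued function of several inputs, like an option price.
def f_toy(x):
    return jnp.sum(jnp.sin(x) ** 2)

x = jnp.arange(1., 6.)             # n = 5 inputs, m = 1 output

g_rev = jax.jacrev(f_toy)(x)       # reverse mode: one backward pass yields the full gradient
g_fwd = jax.jacfwd(f_toy)(x)       # forward mode: one pass per input here
print(jnp.allclose(g_rev, g_fwd))  # True: same values, different cost profile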

European call option: a Python implementation

First, we import the libraries we’ll need: JAX and Matplotlib.

import jax as j
import jax.numpy as jnp
import jax.scipy as jsp
import matplotlib.pyplot as plt
from matplotlib import cm

Translating the formulas defined above into Python code gives us:

def d(S0, K, r, q, σ, T):
    # the d term of the pricing formula above (conventionally called d1)
    return (jnp.log(S0 / K) + (r - q + σ ** 2 / 2) * T) / (σ * jnp.sqrt(T))

def d1(S0, K, r, q, σ, T):
    # d1 = d - σ√T (conventionally called d2)
    return d(S0, K, r, q, σ, T) - σ * jnp.sqrt(T)

def call(S0, K, r, q, σ, T):
    # Black-Scholes price of a European call with continuous dividend yield q
    x = d(S0, K, r, q, σ, T)
    x1 = d1(S0, K, r, q, σ, T)
    
    N = jsp.stats.norm.cdf
    
    return S0 * jnp.exp(- q * T) * N(x) - K * jnp.exp(- r * T) * N(x1)

where we used the NumPy-like API of the JAX library for the mathematical and statistical functions. Using this API lets us apply automatic differentiation simply by calling the grad function.
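For instance (a toy one-liner, unrelated to the pricer), differentiating $x^3$ takes a single call:

df = j.grad(lambda x: x ** 3)  # reverse-mode derivative of a scalar function
print(df(2.0))                 # 12.0 = 3 * 2 ** 2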

Before applying grad to the call function defined above, we define the analytical greeks (i.e. $\Delta$, $\Gamma$, $\nu$, $\theta$, $\rho$) of a European call option, in the $q = 0$ form given earlier:

def Δ_analytical(S0, K, r, q, σ, T):
    x = d(S0, K, r, q, σ, T)
    
    N = jsp.stats.norm.cdf
    
    return N(x)

def Γ_analytical(S0, K, r, q, σ, T):
    x = d(S0, K, r, q, σ, T)
    
    Np = jsp.stats.norm.pdf
    
    return Np(x) / (S0 * σ * jnp.sqrt(T))

def ν_analytical(S0, K, r, q, σ, T):
    x = d(S0, K, r, q, σ, T)
    
    Np = jsp.stats.norm.pdf
    
    return Np(x) * S0 * jnp.sqrt(T)

def θ_analytical(S0, K, r, q, σ, T):
    x = d(S0, K, r, q, σ, T)
    x1 = d1(S0, K, r, q, σ, T)
    
    N = jsp.stats.norm.cdf
    Np = jsp.stats.norm.pdf
    
    return - (Np(x) * S0 * σ) / (2 * jnp.sqrt(T)) - r * K * jnp.exp(- r * T) * N(x1)

def ρ_analytical(S0, K, r, q, σ, T):
    x1 = d1(S0, K, r, q, σ, T)
    
    N = jsp.stats.norm.cdf
    
    return K * T * N(x1) * jnp.exp(- r * T)

To keep the calculations fast, we vectorize the d, d1 and call functions and compile them with jit:

d = j.jit(jnp.vectorize(d))
d1 = j.jit(jnp.vectorize(d1))
call = j.jit(jnp.vectorize(call))

The same is done for the greeks; for the AD ones, grad’s argnums argument selects which input we differentiate with respect to:

Δ_analytical = j.jit(jnp.vectorize(Δ_analytical))
Γ_analytical = j.jit(jnp.vectorize(Γ_analytical))
ν_analytical = j.jit(jnp.vectorize(ν_analytical))
ρ_analytical = j.jit(jnp.vectorize(ρ_analytical))
θ_analytical = j.jit(jnp.vectorize(θ_analytical))

Δ = j.jit(jnp.vectorize(j.grad(call)))             # ∂C/∂S0 (argnums defaults to 0)
Γ = j.jit(jnp.vectorize(j.grad(Δ)))                # ∂²C/∂S0²: differentiate Δ again
ν = j.jit(jnp.vectorize(j.grad(call, argnums=4)))  # ∂C/∂σ
ρ = j.jit(jnp.vectorize(j.grad(call, argnums=2)))  # ∂C/∂r
θ = j.jit(jnp.vectorize(j.grad(call, argnums=5)))  # ∂C/∂T (conventional θ is its negative)

A concrete case

We set the input parameters of the Black-Scholes model as:

S0 = 100.
K = 110.
r = 0.03
q = 0.0
sigma = 0.2
T = 2.

Pricing with the parameters defined above yields:

print(call(S0, K, r, q, sigma, T))
9.739836
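As an extra sanity check (not part of the original comparison), we can verify put-call parity, $C - P = S_0 e^{-qT} - K e^{-rT}$. The put pricer below is a hypothetical helper written only for this check:

def put(S0, K, r, q, σ, T):
    # hypothetical helper: Black-Scholes put, mirroring the call above
    x = d(S0, K, r, q, σ, T)
    x1 = d1(S0, K, r, q, σ, T)
    N = jsp.stats.norm.cdf
    return K * jnp.exp(- r * T) * N(-x1) - S0 * jnp.exp(- q * T) * N(-x)

lhs = call(S0, K, r, q, sigma, T) - put(S0, K, r, q, sigma, T)
rhs = S0 * jnp.exp(- q * T) - K * jnp.exp(- r * T)
print(jnp.allclose(lhs, rhs))  # should print True up to float32 tolerance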

Fortunately, the analytical greeks match the AD ones to within floating-point precision. The only apparent mismatch is the sign of the last pair: grad with argnums=5 computes $\partial C / \partial T$, whereas the analytical $\theta$ is $-\partial C / \partial T$:

print(Δ(S0, K, r, q, sigma, T), Δ_analytical(S0, K, r, q, sigma, T))
print(Γ(S0, K, r, q, sigma, T), Γ_analytical(S0, K, r, q, sigma, T))
print(ν(S0, K, r, q, sigma, T), ν_analytical(S0, K, r, q, sigma, T))
print(ρ(S0, K, r, q, sigma, T), ρ_analytical(S0, K, r, q, sigma, T))
print(θ(S0, K, r, q, sigma, T), θ_analytical(S0, K, r, q, sigma, T))
0.5066145 0.5066146
0.014102802 0.014102802
56.4112 56.411205
81.843216 81.84325
4.048208 -4.0482087
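A central finite difference (a quick extra cross-check, not in the original post) agrees with both columns up to $O(\varepsilon^2)$ and float32 noise:

eps = 1e-2  # float32 limits how small the step can usefully be
fd_Δ = (call(S0 + eps, K, r, q, sigma, T) - call(S0 - eps, K, r, q, sigma, T)) / (2 * eps)
print(fd_Δ)  # ≈ 0.5066, matching Δ above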

We now define the helper function we’ll use to plot the greeks and the option price:

def plot_3d(x, y, zs, x_labels, y_labels, z_labels, titles, size):
    plt.style.use('seaborn')  # on Matplotlib >= 3.6, this style is named 'seaborn-v0_8'
    
    fig, ax = plt.subplots(1, 2,  figsize=(13, 10), subplot_kw={"projection": "3d"})
    
    for i in range(2):
        ax[i].set_title(titles[i], fontsize=size)
        
        ax[i].set_xlabel(x_labels[i], fontsize=size)
        ax[i].set_ylabel(y_labels[i], fontsize=size)
        ax[i].set_zlabel(z_labels[i], fontsize=size)

        ax[i].tick_params(axis='x', labelsize=size)
        ax[i].tick_params(axis='y', labelsize=size)
        ax[i].tick_params(axis='z', labelsize=size)
       
        ax[i].xaxis.labelpad=20
        ax[i].yaxis.labelpad=20
        ax[i].zaxis.labelpad=20

        ax[i].plot_surface(x, y, zs[i], cmap=cm.coolwarm, linewidth=0, antialiased=True)
    
    plt.show()

First, we plot the European option price as a function of $S_0$ and $T$:

s0 = jnp.linspace(60., 120., 100)
Ts = jnp.linspace(1., 5., 100)
X, Y = jnp.meshgrid(s0, Ts)

fig, ax = plt.subplots(figsize=(13, 10), subplot_kw={"projection": "3d"})

price = call(X, K, r, q, sigma, Y) 

ax.set_xlabel(r"$S_0$", fontsize=15)
ax.set_ylabel("T", fontsize=15)
ax.set_zlabel("V", fontsize=15)

ax.tick_params(axis='x', labelsize=15)
ax.tick_params(axis='y', labelsize=15)
ax.tick_params(axis='z', labelsize=15)

ax.plot_surface(X, Y, price, cmap=cm.coolwarm, linewidth=0, antialiased=True)

ax.xaxis.labelpad=30
ax.yaxis.labelpad=30
ax.zaxis.labelpad=30

plt.tight_layout()
plt.show()

Figure: European call price surface as a function of $S_0$ and $T$.

To compare the results, we plot the analytical greeks against the AD ones. For $\Delta$ we have:

zs = [Δ(X, K, r, q, sigma, Y), Δ_analytical(X, K, r, q, sigma, Y)]

plot_3d(X, Y, zs, [r"$S_0$", r"$S_0$"] , ["T", "T"], [r"$\Delta$", r"$\Delta$"], [r"AD $\Delta$", r"Analytical $\Delta$"], 15)

Figure: AD (left) vs. analytical (right) $\Delta$ surfaces.

Meanwhile, $\Gamma$, the derivative of $\Delta$ w.r.t. $S_0$, is:

zs = [Γ(X, K, r, q, sigma, Y), Γ_analytical(X, K, r, q, sigma, Y)]

plot_3d(X, Y, zs, [r"$S_0$", r"$S_0$"] , ["T", "T"], [r"$\Gamma$", r"$\Gamma$"], [r"AD $\Gamma$", r"Analytical $\Gamma$"], 15)

Figure: AD (left) vs. analytical (right) $\Gamma$ surfaces.

Finally, $\nu$, $\rho$ and $\theta$ are given by:

zs = [ν(X, K, r, q, sigma, Y), ν_analytical(X, K, r, q, sigma, Y)]

plot_3d(X, Y, zs, [r"$S_0$", r"$S_0$"] , ["T", "T"], [r"$\nu$", r"$\nu$"], [r"AD $\nu$", r"Analytical $\nu$"], 15)

Figure: AD (left) vs. analytical (right) $\nu$ surfaces.

zs = [ρ(X, K, r, q, sigma, Y), ρ_analytical(X, K, r, q, sigma, Y)]

plot_3d(X, Y, zs, [r"$S_0$", r"$S_0$"] , ["T", "T"], [r"$\rho$", r"$\rho$"], [r"AD $\rho$", r"Analytical $\rho$"], 15)

Figure: AD (left) vs. analytical (right) $\rho$ surfaces.

zs = [-θ(X, K, r, q, sigma, Y), θ_analytical(X, K, r, q, sigma, Y)]  # minus sign: grad returned ∂C/∂T = -θ

plot_3d(X, Y, zs, [r"$S_0$", r"$S_0$"] , ["T", "T"], [r"$\theta$", r"$\theta$"], [r"AD $\theta$", r"Analytical $\theta$"], 15)

Figure: AD (left) vs. analytical (right) $\theta$ surfaces.

Performance

Running the timeit magic command on both the AD and the analytical greeks gives:

%%timeit
Δ(S0, K, r, q, sigma, T)
Γ(S0, K, r, q, sigma, T) 
ν(S0, K, r, q, sigma, T) 
ρ(S0, K, r, q, sigma, T) 
θ(S0, K, r, q, sigma, T) 
51.8 µs ± 903 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
Δ_analytical(S0, K, r, q, sigma, T)
Γ_analytical(S0, K, r, q, sigma, T)
ν_analytical(S0, K, r, q, sigma, T)
ρ_analytical(S0, K, r, q, sigma, T)
θ_analytical(S0, K, r, q, sigma, T)
50.6 µs ± 708 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
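Since every function was wrapped in jnp.vectorize and jit, the same greeks also broadcast over whole arrays, as the plotting section already exploited. For example, with an illustrative strike ladder:

strikes = jnp.linspace(80., 140., 7)   # an arbitrary ladder, for illustration only
print(Δ(S0, strikes, r, q, sigma, T))  # Δ for all seven strikes in one call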

Conclusions

We defined the Black-Scholes model for pricing European options together with its analytical greeks. To that end we used the JAX library’s NumPy-like API, which made defining the AD greeks straightforward. Both groups of greeks (the analytical and the AD ones) yielded the same results, up to the sign convention for $\theta$, and the performance of the two approaches was nearly identical.