# Neural networks: Deriving the sigmoid derivative via chain and quotient rules

Deriving the derivative of the sigmoid function for neural networks

Hause Lin true
10-01-2019

Get source code for this RMarkdown script here.

## Consider being a patron and supporting my work?

Donate and become a patron: If you find value in what I do and have learned something from my site, please consider becoming a patron. It takes me many hours to research, learn, and put together tutorials. Your support really matters.

## Sigmoid function (aka logistic or inverse logit function)

The sigmoid function $$\sigma(x)=\frac{1}{1+e^{-x}}$$ is frequently used in neural networks because its derivative is very simple and computationally fast to calculate, making it great for backpropagation.

Let’s denote the sigmoid function as the following:

$\sigma(x)=\frac{1}{1+e^{-x}}$

Another way to express the sigmoid function:

$\sigma(x)=\frac{e^{x}}{e^{x}+1}$ You can easily derive the second equation from the first equation:

$\frac{1}{1+e^{-x}}= \frac{1}{1+e^{-x}} \frac{e^{x}}{e^{x}} =\frac{e^{x}}{e^{x}+1}$

Since $$\frac{e^x}{e^x} = 1$$, so in essence, we’re just multiplying $$\frac{1}{1+e^{-x}}$$ by 1.

## Sigmoid derivative

The derivative of the sigmoid function $$\sigma(x)$$ is the sigmoid function $$\sigma(x)$$ multiplied by $$1 - \sigma(x)$$.

$\sigma(x)=\frac{1}{1+e^{-x}}$

$\sigma'(x)=\frac{d}{dx}\sigma(x)=\sigma(x)(1-\sigma(x))$

Before we begin, here’s a reminder of how to find the derivatives of exponential functions.

$\frac{d}{dx}e^x = e^x$ $\frac{d}{dx}e^{-3x^2 + 2x} = (-6x + 2)e^{-3x^2 + 2x}$

## Sigmoid derivative via chain rule

Chain rule: $$\frac{d}{dx} \left[ f(g(x)) \right] = f'\left[g(x) \right] * g'(x)$$.

Example: Find the derivative of $$f(x) = (x^2 + 1)^3$$:

\begin{aligned} f'(x) &= 3(x^2 + 1)^{3-1} * 2x^{2-1}\\ &= 3(x^2 + 1)^2(2x) \\ &= 6x(x^2 + 1)^2 \end{aligned}

Line 2 of the sigmoid derivation below uses this rule.

\begin{aligned} \frac{d}{dx} \sigma(x) &= \frac{d}{dx} \left[ \frac{1}{1+e^{-x}} \right] =\frac{d}{dx}(1+e^{-x})^{-1} \\ &=-1*(1+e^{-x})^{-2}(-e^{-x}) \\ &=\frac{-e^{-x}}{-(1+e^{-x})^{2}} \\ &=\frac{e^{-x}}{(1+e^{-x})^{2}} \\ &=\frac{1}{1+e^{-x}} \frac{e^{-x}}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \frac{e^{-x} + (1 - 1)}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \frac{(1 + e^{-x}) - 1}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \left[ \frac{(1 + e^{-x})}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right] \\ &=\frac{1}{1+e^{-x}} \left[ 1 - \frac{1}{1+e^{-x}} \right] \\ &=\sigma(x) (1-\sigma(x)) \\ \end{aligned}

## Sigmoid derivative via quotient rule

Quotient rule: If $$f(x) = \frac{g(x)}{h(x)}$$, then $$f'(x) = \frac{g'(x)h(x) - h'(x)g(x)}{(h(x))^2}$$.

Example: Find the derivative of $$f(x) = \frac{3x}{1 + x}$$:

\begin{aligned} f'(x) &= \frac{(\frac{d}{dx}(3x))*(1+x) - (\frac{d}{dx}(1+x)) * (3x)} {(1+x)^2} \\ &= \frac{3(1 + x) - 1(3x)}{(1+x)^2} \\ &= \frac{3 + 3x - 3x}{(1+x)^2} \\ &= \frac{3}{(1+x)^2} \end{aligned}

Line 2 of the sigmoid derivation below uses this rule.

\begin{aligned} \frac{d}{dx} \sigma(x) &= \frac{d}{dx} \left[ \frac{1}{1+e^{-x}} \right] \\ &=\frac{(0)(1 + e^{-x}) - (-e^{-x})(1)}{(1 + e^{-x})^2} \\ &=\frac{e^{-x}}{(1 + e^{-x})^2} \\ &=\frac{1}{1+e^{-x}} \frac{e^{-x}}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \frac{e^{-x} + (1 - 1)}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \frac{(1 + e^{-x}) - 1}{1+e^{-x}} \\ &=\frac{1}{1+e^{-x}} \left[ \frac{(1 + e^{-x})}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right] \\ &=\frac{1}{1+e^{-x}} \left[ 1 - \frac{1}{1+e^{-x}} \right] \\ &=\sigma(x) (1-\sigma(x)) \\ \end{aligned}

## Support my work

### Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

### Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hauselin/rtutorialsite, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

### Citation

Lin (2019, Oct. 1). Data science: Neural networks: Deriving the sigmoid derivative via chain and quotient rules. Retrieved from https://hausetutorials.netlify.com/posts/2019-12-01-neural-networks-deriving-the-sigmoid-derivative/
@misc{lin2019neural,
}