The chain rule

Recall

We already know the derivative of simple functions. For example, we have seen that for the sine function we have

$$f(x)=\sin(x) \rightarrow f'(x)=\cos(x)$$

More difficult is the case where the argument of the sine function is another function, e.g.

$$f(x)=\sin(x^2)$$

or

$$f(x)=\sin(\ln(x))$$

Both are examples of a chain of functions (see the definition of composition). Take the second example: to determine the value of $f(0.5)$ you need to apply two functions, one after the other. First you feed $0.5$ into the first function (called the inner function) to calculate $\ln(0.5)\approx -0.693$, and then you feed this result into the second function (called the outer function) to calculate $\sin(-0.693)\approx -0.639$. In short:

$$0.5 \rightarrow \ln(0.5) \rightarrow \sin(\ln(0.5))$$

More generally, if we denote the inner function by $i$ and the outer function by $o$, then $f$ can be written as a composition of $o$ and $i$:

$$f(x)=o(i(x))$$

Thus we have the chain

$$x \rightarrow i(x) \rightarrow o(i(x))$$
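The chain can also be traced step by step in code. The following is a minimal Python sketch; the helper name `compose` is an illustrative assumption and not part of the text above. It builds $f(x)=\sin(\ln(x))$ from its inner and outer function and evaluates it at $0.5$.

```python
import math


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


# f(x) = sin(ln(x)): inner function ln, outer function sin
f = compose(math.sin, math.log)

inner_value = math.log(0.5)          # first link of the chain
outer_value = math.sin(inner_value)  # second link of the chain

print(round(inner_value, 3))  # -0.693
print(round(outer_value, 3))  # -0.639
print(f(0.5) == outer_value)  # True
```

The composed function `f` does nothing more than apply the two links in order.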

To determine the derivative of such a chain of functions, we must always identify the inner and outer functions first. Let's practise that.

Exercise 1
  1. Determine the inner and outer function of the following functions:

    1. $f(x)=\cos(x^2)$
    2. $f(x)=e^{3x}$
    3. $f(x)=\sin(-x)$
    4. $f(x)=\log_{3}(\sqrt{x})$
    5. $f(x)=\sqrt{1+x^2}$
    6. $f(x)=\frac{2}{x^2-2x+1}$
    7. $f(x)=(1-2x)^{10}$
    8. $f(x)=\left(\sin(x)\right)^2$
  2. Determine two different inner and outer functions for the function

    $f(x)=\sin(1-x^2)$
Solution
  1. We have
    1. $o(x)=\cos(x),\ i(x)=x^2$
    2. $o(x)=e^x,\ i(x)=3x$
    3. $o(x)=\sin(x),\ i(x)=-x$
    4. $o(x)=\log_3(x),\ i(x)=\sqrt{x}$
    5. $o(x)=\sqrt{x},\ i(x)=1+x^2$
    6. $o(x)=\frac{2}{x},\ i(x)=x^2-2x+1$
    7. $o(x)=x^{10},\ i(x)=1-2x$
    8. $o(x)=x^2,\ i(x)=\sin(x)$
  2. Solution 1: $o(x)=\sin(1-x),\ i(x)=x^2$. Solution 2: $o(x)=\sin(x),\ i(x)=1-x^2$

Note that there is sometimes more than one way to assign inner and outer functions. In general, the context makes clear which one to take.
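That two different decompositions describe the same function can be checked numerically. The sketch below is only an illustration; the names `o1`, `i1`, `o2`, `i2` and the sample points are assumptions chosen for this check.

```python
import math


# Decomposition 1: o(x) = sin(1 - x), i(x) = x^2
def o1(u):
    return math.sin(1 - u)


def i1(x):
    return x ** 2


# Decomposition 2: o(x) = sin(x), i(x) = 1 - x^2
def o2(u):
    return math.sin(u)


def i2(x):
    return 1 - x ** 2


for x in (-1.5, 0.0, 0.5, 2.0):
    # both chains evaluate f(x) = sin(1 - x^2)
    assert math.isclose(o1(i1(x)), o2(i2(x)))
    print(x, o1(i1(x)), o2(i2(x)))
```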

So how can we find the derivative of a chain of functions? Let's return to our example

$$f(x)=\underbrace{\sin}_{o}(\underbrace{\ln}_{i}(x))$$

Our first instinct is to set

$$f'(x)=\cos(\ln(x))$$

as we already know that

$$\sin(x) \overset{\prime}{\rightarrow} \cos(x)$$

This is not totally wrong, but also not totally right. Let's call this the naive approach.

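One way to see that the naive guess cannot be correct is to compare it with a numerically estimated slope. The following Python sketch does this at the point $x=2$; the point, the step size `h` and the helper names are assumptions made for illustration.

```python
import math


def f(x):
    return math.sin(math.log(x))


def naive_derivative(x):
    # the naive guess: differentiate only the outer function
    return math.cos(math.log(x))


def numerical_slope(func, x, h=1e-6):
    # central difference quotient as an estimate of the slope
    return (func(x + h) - func(x - h)) / (2 * h)


x = 2.0
print(round(naive_derivative(x), 3))    # 0.769
print(round(numerical_slope(f, x), 3))  # 0.385
```

The two numbers clearly differ, so the naive guess on its own does not give the slope of $f$.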

To get the correct derivative, we have to multiply the naive approach by a correction factor, and this correction factor is always the derivative of the inner function, in this case $i'(x)=1/x$. Thus, the correct derivative is

$$f'(x)=\underbrace{\cos(\ln(x))}_{\text{naive}}\cdot \underbrace{\frac{1}{x}}_{\text{correction}}$$

Let's summarise and generalise this.

Theorem 1

Assume we have a function $f$ that can be thought of as a chain (or composition) of two functions $o$ and $i$, that is, $f(x)=o(i(x))$. Then we can find the derivative of $f$ as follows:

Equation 1
$$f(x)=o(i(x)) \overset{\prime}{\rightarrow} f'(x)=\underbrace{o'(i(x))}_{\text{naive}}\cdot \underbrace{i'(x)}_{\text{correction}}$$

Chain rule to find the derivative of a composition of functions.

This is the so-called chain rule. In words:

  1. Form the "naive" derivative: $o \rightarrow o^\prime(i(x))$

  2. Apply the correction factor $i^\prime(x)$
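The two steps can be written down almost literally in code. The following is a minimal Python sketch, not a general differentiation tool: the function name `chain_rule` is hypothetical, and the caller has to supply $o'$, $i$ and $i'$ by hand.

```python
import math


def chain_rule(o_prime, i, i_prime):
    """Return f' for f(x) = o(i(x)), built as o'(i(x)) * i'(x)."""
    def f_prime(x):
        naive = o_prime(i(x))    # step 1: naive derivative
        correction = i_prime(x)  # step 2: correction factor
        return naive * correction
    return f_prime


# f(x) = sin(ln(x)) with o = sin (so o' = cos) and i = ln (so i' = 1/x)
f_prime = chain_rule(math.cos, math.log, lambda x: 1 / x)


def f(x):
    return math.sin(math.log(x))


# compare with a central difference quotient at x = 2
h = 1e-6
x = 2.0
numerical = (f(x + h) - f(x - h)) / (2 * h)
print(abs(f_prime(x) - numerical) < 1e-6)  # True
```

Passing the three ingredients separately keeps the roles of the naive derivative and the correction factor visible.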

Proof

We have $f(x)=o(i(x))$. We know that, for small $h$,

$$i(x+h)\approx i(x)+i'(x)\cdot h$$

The difference quotient of $f$ is therefore

$$\begin{array}{lll} f'(x) &\approx & \frac{f(x+h)-f(x)}{h}\\ &=& \frac{o(i(x+h))-o(i(x))}{h}\\ &\approx& \frac{o\left(i(x)+i'(x)\cdot h\right)-o(i(x))}{h} \end{array}$$

Now, we also know that

$$o(y+s)\approx o(y)+o'(y)\cdot s$$

With $y=i(x)$ and $s=i'(x)\cdot h$ we therefore get

$$o(\underbrace{i(x)}_{y}+\underbrace{i'(x)\cdot h}_{s})\approx o(\underbrace{i(x)}_{y})+o'(\underbrace{i(x)}_{y})\cdot \underbrace{i'(x)\cdot h}_{s}$$

Inserting this into the difference quotient above, we get

$$\begin{array}{lll} f'(x) &\approx & \frac{f(x+h)-f(x)}{h}\\ &=& \frac{o(i(x+h))-o(i(x))}{h}\\ &\approx& \frac{o\left(i(x)+i'(x)\cdot h\right)-o(i(x))}{h}\\ &\approx& \frac{o(i(x))+o'(i(x))\cdot i'(x)\cdot h-o(i(x))}{h}\\ &=& \frac{o'(i(x))\cdot i'(x)\cdot h}{h}\\ &=& o'(i(x))\cdot i'(x)\\ \end{array}$$

The approximations become exact in the limit $h\rightarrow 0$, so the result also holds for the derivative itself.
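The limit behaviour can be illustrated numerically. The sketch below assumes the running example $f(x)=\sin(\ln(x))$ at the point $x=2$ and a handful of step sizes chosen for illustration; it prints how far the difference quotient is from $o'(i(x))\cdot i'(x)$ as $h$ shrinks.

```python
import math


def f(x):
    return math.sin(math.log(x))


x = 2.0
exact = math.cos(math.log(x)) * (1 / x)  # o'(i(x)) * i'(x)

errors = []
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = (f(x + h) - f(x)) / h  # difference quotient of f
    errors.append(abs(quotient - exact))
    print(h, quotient, errors[-1])
```

The gap gets smaller with every reduction of `h`, which is what the proof suggests.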

Example 1

Determine $f'(1)$:

  1. $f(x)=\underbrace{\cos}_{o}(\underbrace{x^2}_{i(x)})$
    1. $o(x)=\cos(x) \rightarrow o'(x)=-\sin(x)$
    2. $i(x)=x^2 \rightarrow i'(x)=2x$
    3. $f'(x)=-\sin(x^2)\cdot 2x$
    4. $f'(1)=-\sin(1)\cdot 2 \approx -1.683$
  2. $f(x)=e^{-x}$
    1. $o(x)=e^x \rightarrow o'(x)=e^x$
    2. $i(x)=-x \rightarrow i'(x)=-1$
    3. $f'(x)=e^{-x}\cdot (-1)=-e^{-x}$
    4. $f'(1)=-e^{-1}=-1/e$
  3. $f(x)=\frac{2}{1+\sqrt{x}}$
    1. $o(x)=\frac{2}{x}=2x^{-1} \rightarrow o'(x)=-2x^{-2}$
    2. $i(x)=1+\sqrt{x}=1+x^{1/2} \rightarrow i'(x)=\frac{1}{2}x^{-1/2}$
    3. $f'(x)=-2(1+\sqrt{x})^{-2}\cdot \frac{1}{2}x^{-1/2} =-\frac{1}{\sqrt{x}}\cdot \frac{1}{\left(1+\sqrt{x}\right)^2}$
    4. $f'(1)=-\frac{1}{\sqrt{1}}\cdot \frac{1}{\left(1+\sqrt{1}\right)^2}=-\frac{1}{4}$
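The three values of $f'(1)$ can be cross-checked against a numerically estimated slope. This is only a sanity check under assumptions: the helper `slope` and the step size are illustrative choices, and a central difference quotient is an approximation, not a proof.

```python
import math


def slope(func, x, h=1e-6):
    # central difference quotient as an estimate of f'(x)
    return (func(x + h) - func(x - h)) / (2 * h)


# (function, value of f'(1) obtained with the chain rule)
examples = [
    (lambda x: math.cos(x ** 2), -2 * math.sin(1)),
    (lambda x: math.exp(-x), -1 / math.e),
    (lambda x: 2 / (1 + math.sqrt(x)), -1 / 4),
]

for func, expected in examples:
    print(round(slope(func, 1.0), 3), round(expected, 3))
```

Each printed pair should agree to the displayed three decimals.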
Exercise 2
  1. Use the methods indicated to determine the derivative of $f$.

    1. $f(x)=(x+2)^2$ (expanding, product rule, chain rule)
    2. $f(x)=(x^4)^2$ (power rule, product rule, chain rule)
    3. $f(x)=(3x)^2$ (power rule, product rule, chain rule)
  2. Determine $f^\prime(x)$:

    1. $f(x)=e^{(x^2)}$
    2. $f(x)=3\sin(2x)$
    3. $f(x)=\frac{2}{2x+1}$
    4. $f(x)=\ln(x^2-3x+1)$
    5. $f(x)=\frac{1}{\sqrt{1-x}}$
    6. $f(x)=(\ln(x))^2$
    7. $f(x)=(\ln(3x))^2$
    8. $f(x)=\sin(2x)\cos(3x)$
    9. $f(x)=(1+x^2)^{10}$
    10. $f(x)=(2x-1)^5(3x+1)^6$
  3. Consider a function written as a quotient of two other functions $u$ and $w$:

    $$f(x)=\frac{u(x)}{w(x)}$$

    Show with the help of the chain rule that the following is true:

    $$f'(x)=\frac{u'(x)w(x)-u(x)w'(x)}{w(x)^2}$$

    For obvious reasons, this is called the quotient rule.

Solution
  1. We have
    1. Expanding: $f(x)=(x+2)^2=x^2+4x+4\rightarrow f^\prime(x)=2x+4$. Chain rule: $f^\prime(x)=2(x+2)^1\cdot 1=2x+4$. Product rule: $f(x)=(x+2)(x+2) \rightarrow f'(x)=1\cdot(x+2)+(x+2)\cdot 1= 2(x+2)$.
    2. Power rule: $f(x)=(x^4)^2=x^8 \rightarrow f^\prime(x)=8x^7$. Chain rule: $f^\prime(x)=2(x^4)\cdot 4x^3 =8x^7$. Product rule: $f(x)=x^4\cdot x^4 \rightarrow f'(x)=4x^3 \cdot x^4+x^4\cdot 4x^3= 8x^7$.
    3. Power rule: $f(x)=(3x)^2=3^2 x^2 =9x^2\rightarrow f^\prime(x)=18x$. Chain rule: $f^\prime(x)=2(3x)^1 \cdot 3 =18x$. Product rule: $f(x)=3x\cdot 3x \rightarrow f'(x)=3\cdot 3x+3x\cdot 3=18x$.
  2. We have
    1. $f(x)=e^{(x^2)} \rightarrow f^\prime(x)=e^{(x^2)}\cdot 2x$
    2. $f(x)=3\sin(2x)\rightarrow f^\prime(x)=3\cos(2x)\cdot 2=6\cos(2x)$
    3. $f(x)=\frac{2}{2x+1}=2\cdot \underbrace{(2x+1)^{-1}}_{\text{apply chain rule}}\rightarrow f^\prime(x)=2\cdot(-1)(2x+1)^{-2}\cdot 2=-\frac{4}{(2x+1)^2}$
    4. $f(x)=\ln(x^2-3x+1)\rightarrow f^\prime(x)=\frac{1}{x^2-3x+1}\cdot(2x-3)=\frac{2x-3}{x^2-3x+1}$
    5. $f(x)=\frac{1}{\sqrt{1-x}}=(1-x)^{-1/2} \rightarrow f^\prime(x)=-\frac{1}{2}(1-x)^{-3/2}\cdot(-1)=\frac{1}{2}(1-x)^{-3/2}$
    6. $f(x)=(\ln(x))^2\rightarrow f^\prime(x)=2\ln(x)\cdot \frac{1}{x}=\frac{2\ln(x)}{x}$ (note that $i(x)=\ln(x)$)
    7. $f(x)=(\ln(3x))^2 \rightarrow f^\prime(x)=2\ln(3x)\cdot \frac{1}{3x}\cdot 3=\frac{2\ln(3x)}{x}$ (note that $i(x)=\ln(3x)$, so we need the chain rule again to find $i^\prime$). Of course you can also use the product rule.
    8. We have to use the product rule and the chain rule: $f(x)=\underbrace{\sin(2x)}_{u(x)}\cdot \underbrace{\cos(3x)}_{v(x)}$, so $f'(x)=\underbrace{\cos(2x)\cdot 2}_{u'(x)} \cdot \underbrace{\cos(3x)}_{v(x)}+\underbrace{\sin(2x)}_{u(x)}\cdot \underbrace{(-\sin(3x)\cdot 3)}_{v'(x)}=2\cos(2x)\cos(3x)-3\sin(2x)\sin(3x)$
    9. $f(x)=(1+x^2)^{10}$, we have $i(x)=1+x^2$ and $o(x)=x^{10}$, thus $f^\prime(x)=10(1+x^2)^9\cdot 2x=20x(1+x^2)^9$.
    10. $f(x)=(2x-1)^5 (3x+1)^6$, we have to use the product rule and the chain rule. $f(x)=u(x)\cdot v(x)$ with $u(x)=(2x-1)^5$ and $v(x)=(3x+1)^6$. Applying the chain rule to $u(x)$, we get $u^\prime(x)=5(2x-1)^4\cdot 2= 10(2x-1)^4$; applying the chain rule to $v(x)$, we get $v'(x)=6(3x+1)^5\cdot 3=18(3x+1)^5$. Thus, we have $$\begin{array}{lll}f'(x)&=&u'(x)v(x)+u(x)v'(x)\\ &=&10(2x-1)^4\cdot (3x+1)^6+ (2x-1)^5 \cdot 18(3x+1)^5\\ &=&10(2x-1)^4(3x+1)^6+ 18(2x-1)^5(3x+1)^5\\ \end{array}$$
  3. We have $f(x)=\frac{u(x)}{w(x)}=u(x)\frac{1}{w(x)} =u(x)\cdot \underbrace{(w(x))^{-1}}_{v(x)}$. Applying the product rule and the chain rule, we get $$\begin{array}{lll} f'(x) & = & u'(x)\cdot \underbrace{(w(x))^{-1}}_{v(x)} + u(x)\cdot \underbrace{\left(-1\cdot (w(x))^{-2}\cdot w'(x)\right)}_{v'(x)}\\ & = & \frac{u'(x)}{w(x)}-\frac{u(x)w'(x)}{w(x)^2}\\[1ex] & = & \frac{u'(x)w(x)}{w(x)^2}-\frac{u(x)w'(x)}{w(x)^2}\\[1ex] & = & \frac{u'(x)w(x)-u(x)w'(x)}{w(x)^2}\\ \end{array}$$
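The quotient rule can be spot-checked numerically in the same way as the chain rule. In the Python sketch below, the choice $u(x)=\sin(x)$, $w(x)=1+x^2$, the sample points and the helper names are all assumptions made for illustration.

```python
import math


def slope(func, x, h=1e-6):
    # central difference quotient as an estimate of f'(x)
    return (func(x + h) - func(x - h)) / (2 * h)


# hypothetical example: u(x) = sin(x), w(x) = 1 + x^2
def u(x):
    return math.sin(x)


def u_prime(x):
    return math.cos(x)


def w(x):
    return 1 + x ** 2


def w_prime(x):
    return 2 * x


def f(x):
    return u(x) / w(x)


def quotient_rule(x):
    # (u'(x) w(x) - u(x) w'(x)) / w(x)^2
    return (u_prime(x) * w(x) - u(x) * w_prime(x)) / w(x) ** 2


for x in (-1.0, 0.5, 2.0):
    assert abs(slope(f, x) - quotient_rule(x)) < 1e-6
    print(x, slope(f, x), quotient_rule(x))
```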