🔢 Derivatives & Differentiation Rules

The derivative measures the instantaneous rate of change of a function. Geometrically, it represents the slope of the tangent line to the graph of the function at a specific point.


🟢 1. The Definition of a Derivative

The derivative of $f(x)$ at $x$ is defined as the limit:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

If this limit exists, the function is said to be differentiable at $x$.
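
To make the limit concrete, here is a minimal sketch that evaluates the difference quotient for shrinking values of $h$. The function `square` and the point $x = 3$ are illustrative choices, not part of the definition; for this function the exact derivative at that point is 6.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def square(x):
    # Illustrative function; its exact derivative is 2x.
    return x ** 2


# As h shrinks, the quotient approaches the exact derivative 2 * 3 = 6.
for h in (1.0, 0.1, 0.01, 0.001):
    slope = difference_quotient(square, 3.0, h)
    print(f"h = {h:<6} quotient = {slope:.6f}")
```

For $x^2$ the quotient simplifies to $2x + h$, so each printed value differs from 6 by roughly $h$.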


🟡 2. Fundamental Rules of Differentiation

1. Basic Rules

  • Power Rule: $\frac{d}{dx}x^n = nx^{n-1}$.
  • Constant Rule: $\frac{d}{dx}c = 0$.
  • Constant Multiple Rule: $\frac{d}{dx}[c f(x)] = c f'(x)$.
  • Sum/Difference Rule: $(f \pm g)' = f' \pm g'$.
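
Taken together, these four rules are enough to differentiate any polynomial term by term. The sketch below assumes a polynomial is stored as a list of coefficients, where `coeffs[k]` multiplies $x^k$; this representation and the helper name `differentiate_polynomial` are illustrative choices.

```python
def differentiate_polynomial(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] for c0 + c1*x + c2*x^2 + ...

    Power rule: x^k becomes k * x^(k-1).
    Constant multiple rule: each coefficient is carried along.
    Sum rule: the terms are handled one at a time.
    Constant rule: the c0 term drops out.
    """
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return derivative or [0]


# 3x^4 - 2x^2 + 7  ->  12x^3 - 4x
print(differentiate_polynomial([7, 0, -2, 0, 3]))  # [0, -4, 0, 12]
```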

2. Product and Quotient Rules

  • Product Rule: $(fg)' = f'g + fg'$.
  • Quotient Rule: $\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$.
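
One way to sanity-check both rules is to compare them with a numerical derivative. The sketch below assumes $f(x) = \sin(x)$ and $g(x) = x^2 + 1$ as example functions (chosen so that $g$ is never zero) and uses a central difference as the numerical reference; the helper `numerical_derivative` is hypothetical.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Example functions and their known derivatives.
f, f_prime = math.sin, math.cos


def g(x):
    return x ** 2 + 1


def g_prime(x):
    return 2 * x


x = 1.3

# Values predicted by the product and quotient rules.
product_rule = f_prime(x) * g(x) + f(x) * g_prime(x)
quotient_rule = (f_prime(x) * g(x) - f(x) * g_prime(x)) / g(x) ** 2

# Numerical derivatives of f*g and f/g for comparison.
product_numeric = numerical_derivative(lambda t: f(t) * g(t), x)
quotient_numeric = numerical_derivative(lambda t: f(t) / g(t), x)

print(f"product:  rule = {product_rule:.8f}, numeric = {product_numeric:.8f}")
print(f"quotient: rule = {quotient_rule:.8f}, numeric = {quotient_numeric:.8f}")
```

The two columns should agree to several decimal places; a mismatch usually points to a sign error in the quotient rule's numerator.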

3. Transcendental Functions

  • $\frac{d}{dx} e^x = e^x$.
  • $\frac{d}{dx} \ln(x) = \frac{1}{x}$.
  • $\frac{d}{dx} \sin(x) = \cos(x)$.
  • $\frac{d}{dx} \cos(x) = -\sin(x)$.
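
As a short worked example, the sine and cosine derivatives combine with the quotient rule to give the derivative of $\tan(x)$, valid wherever $\cos(x) \neq 0$:

$$
\begin{aligned}
\frac{d}{dx}\tan(x) &= \frac{d}{dx}\left(\frac{\sin(x)}{\cos(x)}\right) \\
&= \frac{\cos(x)\cos(x) - \sin(x)\,(-\sin(x))}{\cos^2(x)} \\
&= \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)}
\end{aligned}
$$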

🔴 3. The Chain Rule

The Chain Rule is the formula for calculating the derivative of the composition of two or more functions:

$$\frac{d}{dx}[f(g(x))] = f'(g(x)) \cdot g'(x)$$

In Leibniz notation:

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} \text{ where } y = f(u) \text{ and } u = g(x)$$
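
A worked example may help here. Take $y = (3x^2 + 1)^5$ (an illustrative choice) and set $u = 3x^2 + 1$, so that $y = u^5$:

$$
\begin{aligned}
\frac{dy}{du} &= 5u^4, \qquad \frac{du}{dx} = 6x \\
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx} = 5(3x^2 + 1)^4 \cdot 6x = 30x\,(3x^2 + 1)^4
\end{aligned}
$$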

Application: Neural Networks (Backpropagation)

A neural network is a composition of functions: $\text{Loss}(\text{Layer}_n(\text{Layer}_{n-1}(\dots)))$.

  • To find how a weight in the first layer affects the final Loss, we apply the Chain Rule repeatedly from the output back to the input.
  • This is the core of “Backpropagation.”
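
The idea can be sketched on a deliberately tiny network with one input, one hidden unit, and one output. Everything in this example is an assumption made for illustration: the `tanh` activation, the squared-error loss, the scalar weights `w1` and `w2`, and the function names. Real networks use vectors and matrices, but the chain-rule bookkeeping is the same.

```python
import math


def forward(w1, w2, x, y):
    """Forward pass: x -> z1 = w1*x -> a1 = tanh(z1) -> y_hat = w2*a1 -> loss."""
    z1 = w1 * x
    a1 = math.tanh(z1)
    y_hat = w2 * a1
    loss = (y_hat - y) ** 2
    return loss, (a1, y_hat)


def backward(w1, w2, x, y):
    """Backward pass: apply the Chain Rule from the loss back to each weight."""
    _, (a1, y_hat) = forward(w1, w2, x, y)
    dloss_dyhat = 2 * (y_hat - y)          # d(loss)/d(y_hat)
    dloss_dw2 = dloss_dyhat * a1           # y_hat = w2 * a1
    dloss_da1 = dloss_dyhat * w2           # pass the gradient to the hidden unit
    dloss_dz1 = dloss_da1 * (1 - a1 ** 2)  # d tanh(z)/dz = 1 - tanh(z)^2
    dloss_dw1 = dloss_dz1 * x              # z1 = w1 * x
    return dloss_dw1, dloss_dw2


w1, w2, x, y = 0.5, -1.5, 2.0, 1.0
grad_w1, grad_w2 = backward(w1, w2, x, y)

# Compare with a central finite difference on the first-layer weight.
eps = 1e-6
numeric_w1 = (forward(w1 + eps, w2, x, y)[0] - forward(w1 - eps, w2, x, y)[0]) / (2 * eps)

print(f"dLoss/dw1 via chain rule:        {grad_w1:.8f}")
print(f"dLoss/dw1 via finite difference: {numeric_w1:.8f}")
```

Note that the gradient for `w1` reuses `dloss_dyhat` and `dloss_da1`, which were already computed for the later layer. Reusing these intermediate products is what makes the backward sweep efficient.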

🎯 4. Mean Value Theorem (MVT)

If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there exists at least one $c$ in $(a, b)$ such that:

$$f'(c) = \frac{f(b) - f(a)}{b - a}$$

Essentially, there is a point where the instantaneous rate of change equals the average rate of change over the interval.
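
The theorem only guarantees that such a $c$ exists, but for a specific function it can often be located numerically. The sketch below assumes $f(x) = x^3$ on $[0, 2]$ as an example and uses bisection on $f'(c)$ minus the average slope. Bisection is a choice made here, and it relies on that difference changing sign across the interval, which holds for this example but not for every function.

```python
def f(x):
    # Illustrative function for the Mean Value Theorem.
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


def find_mvt_point(a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the average slope of f over [a, b]."""
    average_slope = (f(b) - f(a)) / (b - a)

    def gap(c):
        return f_prime(c) - average_slope

    lo, hi = a, b
    # Assumes gap(a) and gap(b) have opposite signs (true for this example).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


c = find_mvt_point(0.0, 2.0)
print(f"c = {c:.6f}, f'(c) = {f_prime(c):.6f}, average slope = {(f(2.0) - f(0.0)) / 2.0:.6f}")
```

Here the average slope is 4, so the exact answer is $c = \sqrt{4/3} \approx 1.1547$, which lies inside $(0, 2)$ as the theorem requires.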


💡 Practical Example: Optimization

To find the minimum or maximum of a function, we look for critical points: values of $x$ where $f'(x) = 0$ or $f'(x)$ is undefined.

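def f(x):
    """Quadratic objective: f(x) = x^2 + 5x + 6."""
    return x**2 + 5*x + 6

def f_prime(x):
    """Derivative of f: f'(x) = 2x + 5."""
    return 2*x + 5

# Solve f'(x) = 0
# 2x + 5 = 0 => x = -2.5
critical_point = -2.5

# f''(x) = 2 > 0 everywhere, so this critical point is a minimum.
print(f"Critical point: {critical_point}")
print(f"f'(critical point): {f_prime(critical_point)}")
print(f"Minimum value: {f(critical_point)}")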
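
When $f'(x) = 0$ cannot be solved by hand, the derivative can still guide a numerical search. The sketch below uses gradient descent, a technique not covered above, on the same quadratic. The starting point, `learning_rate`, and `steps` are arbitrary illustrative values.

```python
def f_prime(x):
    """Derivative of f(x) = x^2 + 5x + 6."""
    return 2 * x + 5


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative to move toward a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * f_prime(x)
    return x


x_min = gradient_descent(start=0.0)
print(f"Approximate minimizer: {x_min:.6f}")
```

Each step shrinks the distance to $x = -2.5$ by a constant factor of 0.8 for this learning rate, so the iterate settles very close to the critical point found analytically.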

🚀 Key Takeaways

  • Derivatives measure change.
  • The Chain Rule allows differentiating nested functions.
  • Optimization depends on finding where the derivative is zero.