Derivatives & Rules
π’ Derivatives & Differentiation Rules
The derivative measures the instantaneous rate of change of a function. Geometrically, it represents the slope of the tangent line to the graph of the function at a specific point.
π’ 1. The Definition of a Derivative
The derivative of at is defined as the limit:
If this limit exists, the function is said to be differentiable at .
π‘ 2. Fundamental Rules of Differentiation
1. Basic Rules
- Power Rule: .
- Constant Rule: .
- Constant Multiple Rule: .
- Sum/Difference Rule: .
2. Product and Quotient Rules
- Product Rule: .
- Quotient Rule: .
3. Transcendental Functions
- .
- .
- .
- .
π΄ 3. The Chain Rule
The Chain Rule is the formula for calculating the derivative of the composition of two or more functions.
In Leibniz notation:
Application: Neural Networks (Backpropagation)
A Neural Network is a series of composite functions: .
- To find how a weight in the first layer affects the final Loss, we apply the Chain Rule repeatedly from the output back to the input.
- This is the core of βBackpropagation.β
π― 4. Mean Value Theorem (MVT)
If is continuous on and differentiable on , then there exists at least one in such that: Essentially, there is a point where the instantaneous rate of change equals the average rate of change over the interval.
π‘ Practical Example: Optimization
To find the minimum or maximum of a function, we look for critical points where or is undefined.
import numpy as np
def f(x):
return x**2 + 5*x + 6
def f_prime(x):
return 2*x + 5
# Solve f'(x) = 0
# 2x + 5 = 0 => x = -2.5
critical_point = -2.5
print(f"Critical point: {critical_point}")
print(f"Minimum value: {f(critical_point)}")π Key Takeaways
- Derivatives measure change.
- The Chain Rule allows differentiating nested functions.
- Optimization depends on finding where the derivative is zero.