Basic notations:
In the following descriptions, $x \in R^n$, $b \in R^n$, $X \in R^{n \times n}$, $A \in R^{n \times n}$, and $f(x) \in R$, $f(X) \in R$. We first fix the following basic notations:
Another way to illustrate the basic notations:
Chain rule:
An illustration of the chain rule:
As such,
and
Derivative of determinant:
It holds that $A A^{\ast} = |A| I$, where $A^{\ast}$ is the adjugate matrix of $A$; hence,
and
If $A$ is a non-singular matrix, the following equations hold:
If $A$ is also symmetric,
It then immediately follows that:
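As a numerical sanity check on the determinant derivatives discussed in this part, the sketch below compares finite-difference gradients with the standard closed forms $\partial |A| / \partial A = |A| A^{-T}$ and $\partial \ln |A| / \partial A = A^{-T}$ for non-singular $A$ (which reduces to $A^{-1}$ when $A$ is symmetric). The helper `numerical_grad` and the test matrix are assumptions made for illustration only, and the entries of $A$ are treated as independent variables, including in the symmetric case.

```python
import numpy as np

def numerical_grad(f, A, eps=1e-6):
    """Central finite-difference gradient of a scalar function f
    with respect to each entry of the matrix A (hypothetical helper)."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            G[i, j] = (f(A + E) - f(A - E)) / (2 * eps)
    return G

def logdet(M):
    # Assumes det(M) > 0 so that the logarithm is defined.
    return np.log(np.linalg.det(M))

# Illustrative non-symmetric matrix with det(A) = 24 > 0.
A = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [0.0, -1.0, 4.0]])

# d|A|/dA = |A| A^{-T}
det_grad_num = numerical_grad(np.linalg.det, A)
det_grad_closed = np.linalg.det(A) * np.linalg.inv(A).T

# d ln|A| / dA = A^{-T}
logdet_grad_num = numerical_grad(logdet, A)
logdet_grad_closed = np.linalg.inv(A).T

# Symmetric case: A^{-T} coincides with A^{-1}.
S = A @ A.T  # symmetric positive definite because A is non-singular
sym_grad_num = numerical_grad(logdet, S)
sym_grad_closed = np.linalg.inv(S)

print(np.allclose(det_grad_num, det_grad_closed, atol=1e-6))
print(np.allclose(logdet_grad_num, logdet_grad_closed, atol=1e-6))
print(np.allclose(sym_grad_num, sym_grad_closed, atol=1e-6))
```

Finite differences are used here only because they need no extra dependencies; an automatic-differentiation library would serve the same purpose.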
Some fundamental tricks:
- If $A \in S^n$, then $A = U \Sigma U^T$, where $U$ is an orthogonal matrix and $\Sigma$ is the diagonal matrix of eigenvalues of $A$. If $A$ is also positive semidefinite, then $A = A^{1/2} A^{1/2}$, where $A^{1/2} = U \Sigma^{1/2} U^T$;
- If $\| x \|_2 = 1$ and $\lambda \neq -1$, then $(I + \lambda x x^T)^{-1} = I - \frac{\lambda}{1 + \lambda} x x^T$;
- If $A \in S^n$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ with $\lambda_i > -1$, then $\ln |I + A| = \sum_{i = 1}^n \ln (1 + \lambda_i)$.
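
The three tricks above can be checked numerically. The following is a minimal sketch, assuming NumPy and a randomly generated positive semidefinite matrix; the variable names and the value of `lam` are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Trick 1: symmetric square root via the eigendecomposition.
# A is built as B B^T so that it is symmetric positive semidefinite.
B = rng.standard_normal((n, n))
A = B @ B.T
eigvals, U = np.linalg.eigh(A)          # A = U diag(eigvals) U^T, U orthogonal
eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off
A_half = U @ np.diag(np.sqrt(eigvals)) @ U.T
sqrt_ok = np.allclose(A_half @ A_half, A)

# Trick 2: inverse of a rank-one update with a unit vector x.
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                  # enforce ||x||_2 = 1
lam = 0.7                               # any value other than -1 works
lhs = np.linalg.inv(np.eye(n) + lam * np.outer(x, x))
rhs = np.eye(n) - lam / (1.0 + lam) * np.outer(x, x)
inverse_ok = np.allclose(lhs, rhs)

# Trick 3: ln|I + A| equals the sum of ln(1 + lambda_i).
sign, logdet_direct = np.linalg.slogdet(np.eye(n) + A)
logdet_from_eigs = np.sum(np.log1p(eigvals))
logdet_ok = bool(sign > 0 and np.isclose(logdet_direct, logdet_from_eigs))

print(sqrt_ok, inverse_ok, logdet_ok)
```

`np.linalg.slogdet` is used instead of taking the logarithm of `np.linalg.det` because it avoids overflow or underflow when the determinant is very large or very small.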