Equations
{$$ \tag{1} \sigma(z) = { 1 \over 1 + e^{-z}} $$} {$$ \tag{2} {\partial \sigma \over \partial z } = \sigma(z) (1 - \sigma(z)) $$}
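Equations (1) and (2) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the finite-difference helper are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Logistic function, equation (1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    """Closed-form derivative, equation (2)."""
    s = sigmoid(z)
    return s * (1.0 - s)

def numeric_derivative(f, z, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

# The closed form and the finite difference should agree closely.
for z in (-2.0, 0.0, 0.5, 3.0):
    assert abs(sigmoid_prime(z) - numeric_derivative(sigmoid, z)) < 1e-8
```

The derivative peaks at z = 0, where sigma(z) = 1/2 and sigma'(z) = 1/4.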
{$$ z^l = w^l a^{l-1} + b^l $$} {$$ \tag{3} a^l = \sigma(z^l) $$}
{$$ {\partial a^l \over \partial b^l} = \sigma^\prime(z^l) $$} {$$ {\partial a^l \over \partial w^l} = a^{l-1} \sigma^\prime(z^l) $$}
{$$ \tag{4} {v^\prime = v - \eta {\partial C \over \partial v } } $$}
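The update rule (4) can be illustrated on a toy problem. This is a sketch under stated assumptions: the one-dimensional cost C(v) = (v - 3)^2, the learning rate, and the step count are hypothetical choices for illustration, not taken from the notes.

```python
def gradient_step(v, grad, eta):
    """One application of equation (4): v' = v - eta * dC/dv."""
    return v - eta * grad

def toy_cost_grad(v):
    """Gradient of the hypothetical cost C(v) = (v - 3)^2."""
    return 2.0 * (v - 3.0)

v = 0.0      # starting point (assumed)
eta = 0.1    # learning rate (assumed)
for _ in range(100):
    v = gradient_step(v, toy_cost_grad(v), eta)
```

With these values each step shrinks the distance to the minimum at v = 3 by a factor of 0.8, so v converges towards 3.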
{$$ \tag{5} C(w,b) = \frac{1}{2n} \sum_x \| y(x) - a \|^2 $$}
{$$ \tag{6} \delta^l = \nabla C_a \odot \sigma^\prime(z^l) $$} {$$ \tag{7} \nabla C_{a^L} = a_j^L - y_j $$} {$$ \tag{8} \nabla C_{a^l} = (w^{l+1})^T \delta^{l+1} $$} {$$ \tag{9} {\partial C \over \partial b^l} = \delta^l $$} {$$ \tag{10} {\partial C \over \partial w^l} = a^{l-1} \delta^l $$}
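Equations (6) to (10) can be sketched for a single training example, with the quadratic cost (5) taken at n = 1. Everything concrete here is an assumption for illustration: the 2-3-1 layer sizes, the weight and bias values, and the function names. The term (w^{l+1})^T delta^{l+1} is passed back as the gradient with respect to the previous layer's activations, and sigma' is applied once per layer in the step for equation (6).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(weights, biases, x):
    """Return weighted inputs z^l and activations a^l (activations[0] is the input)."""
    activations, zs = [x], []
    for w, b in zip(weights, biases):
        z = [sum(w_jk * a_k for w_jk, a_k in zip(row, activations[-1])) + b_j
             for row, b_j in zip(w, b)]
        zs.append(z)
        activations.append([sigmoid(z_j) for z_j in z])
    return zs, activations

def cost(weights, biases, x, y):
    """Quadratic cost (5) for one example: 0.5 * ||y - a^L||^2."""
    _, activations = forward(weights, biases, x)
    return 0.5 * sum((y_j - a_j) ** 2 for y_j, a_j in zip(y, activations[-1]))

def backprop(weights, biases, x, y):
    """Gradients of the cost with respect to every weight and bias."""
    zs, activations = forward(weights, biases, x)
    grad_w = [None] * len(weights)
    grad_b = [None] * len(weights)
    # Equation (7): gradient with respect to the output activations.
    grad_a = [a_j - y_j for a_j, y_j in zip(activations[-1], y)]
    for l in range(len(weights) - 1, -1, -1):
        # Equation (6): delta = grad_a (Hadamard product) sigma'(z).
        delta = [g * sigmoid_prime(z_j) for g, z_j in zip(grad_a, zs[l])]
        grad_b[l] = delta                                    # equation (9)
        grad_w[l] = [[d_j * a_k for a_k in activations[l]]   # equation (10)
                     for d_j in delta]
        # Transposed weights times delta: gradient for the previous layer.
        grad_a = [sum(weights[l][j][k] * delta[j] for j in range(len(delta)))
                  for k in range(len(activations[l]))]
    return grad_w, grad_b

# Hypothetical 2-3-1 network and a single training pair.
weights = [[[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [[0.3, -0.1, 0.2]]]
biases = [[0.0, 0.1, -0.1], [0.05]]
x, y = [0.5, -1.0], [1.0]
grad_w, grad_b = backprop(weights, biases, x, y)
```

Comparing any entry of `grad_w` or `grad_b` against a finite difference of `cost` is a quick way to validate the equations.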
{$$ C = - [ y \ln (a) + (1-y) \ln(1-a) ] $$}

Derivation of C'(b), where {$ a = \sigma(w a^{l-1} + b) $} is the output activation and {$ \sigma^\prime = a(1-a) $} by equation (2)

{$$ C(b) = - D( b ) $$}
{$$ D(b) = y E(b) + (1-y) G(b) $$}
{$$ E(b) = \ln ( \sigma(w a^{l-1} + b) ) $$}
{$$ G(b) = \ln ( 1 - \sigma(w a^{l-1} + b) ) $$}
{$$ C'(b) = - D'(b) $$}
{$$ D'(b) = y E'(b) + (1-y) G'(b) $$}
{$$ E'(b) = (1/a) \cdot \sigma^\prime $$}
{$$ E'(b) = 1-a $$}
{$$ G'(b) = 1/(1-a) \cdot (- \sigma^\prime) $$}
{$$ G'(b) = -a $$}
{$$ D'(b) = y(1-a) + (1-y)(-a) = y - ya - a + ya = y - a $$}
{$$ C'(b) = a - y $$}
{$$ C'(w_{jk}^l) = a_k^{l-1}(a_j^l - y) $$}
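The result of the derivation, C'(b) = a - y and the matching weight gradient, can be checked for a single sigmoid neuron with the cross-entropy cost. This is a sketch only: the scalar inputs and the names `a_in`, `cost_of`, `grad_bias`, and `grad_weight` are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(a, y):
    """C = -[y ln(a) + (1 - y) ln(1 - a)] for one output a and target y."""
    return -(y * math.log(a) + (1.0 - y) * math.log(1.0 - a))

def cost_of(w, b, a_in, y):
    """Cost as a function of the neuron's weight and bias; a_in is its input activation."""
    return cross_entropy(sigmoid(w * a_in + b), y)

def grad_bias(w, b, a_in, y):
    """Closed form from the derivation: C'(b) = a - y."""
    return sigmoid(w * a_in + b) - y

def grad_weight(w, b, a_in, y):
    """Closed form for the weight: C'(w) = a_in * (a - y)."""
    return a_in * (sigmoid(w * a_in + b) - y)

# Hypothetical values for a quick finite-difference comparison.
w, b, a_in, y = 0.7, -0.3, 0.4, 1.0
h = 1e-6
numeric_db = (cost_of(w, b + h, a_in, y) - cost_of(w, b - h, a_in, y)) / (2 * h)
numeric_dw = (cost_of(w + h, b, a_in, y) - cost_of(w - h, b, a_in, y)) / (2 * h)
```

The sigma' factor cancels in the closed form, which is why the gradient does not vanish when the neuron saturates on a wrong answer.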
{$$ a_j^L = { e^{z_j^L} \over \sum_k e^{z_k^L} } $$}
{$$ C = - \ln( a_y^L ) $$}

Derivation of {$ {\partial C \over \partial b} $} in Softmax with Log-Likelihood

{$$ x_k = e^{z_k^L} $$}
{$$ x_k^\prime(b_k^L) = x_k $$}

{$y$} is the classification/activation we want; its target value is 1. {$x_y$} and {$x_m$} are kept separate for future derivations.

{$$ f = \sum_{j \neq y} x_j $$}
{$$ s = f + x_y $$}
{$$ a_y = {x_y \over s } $$}
{$$ D_y(z_y) = a_y^L = { e^{z_y} \over e^{z_y} + f} $$}
{$$ D_y^\prime(z_y) = { f e^{z_y} \over (f + e^{z_y})^2} $$}
{$$ C^\prime(b_y^L) = - { e^{z_y} + f \over e^{z_y} } \cdot { f e^{z_y} \over (e^{z_y} + f)^2 } = - { f \over e^{z_y} + f } $$}
{$$ C^\prime(b_y^L) = { x_y - s \over s} $$}
{$$ C^\prime(b_y^L) = a_y - 1 $$}
{$$ C^\prime(b_k^L) = a_k \quad k \neq y $$}
{$$ C^\prime(b_j^L) = a^L_j - y(x) $$}
{$$ C^\prime(w_{jk}^L) = a^{L-1}_k (a^L_j - y(x)) $$}
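The softmax result can be checked the same way. Because each z_j^L depends on b_j^L only through an additive term, the gradient with respect to b_j^L equals the gradient with respect to z_j^L. The sketch below is illustrative: the logits, the `target` index, and the function names are assumptions, and subtracting the maximum logit is a standard stability trick that is not part of the derivation (it leaves the softmax output unchanged).

```python
import math

def softmax(z):
    """a_j = exp(z_j) / sum_k exp(z_k), shifted by max(z) for numerical stability."""
    m = max(z)
    exps = [math.exp(z_k - m) for z_k in z]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood_cost(z, target):
    """C = -ln(a_y), where target is the index y of the correct class."""
    return -math.log(softmax(z)[target])

def grad_bias(z, target):
    """Closed form: a_j - 1 for the correct class, a_j for every other class."""
    a = softmax(z)
    return [a_j - (1.0 if j == target else 0.0) for j, a_j in enumerate(a)]

# Hypothetical output-layer weighted inputs and correct class.
z = [0.2, -1.0, 1.5]
target = 2
g = grad_bias(z, target)
```

The entries of `g` sum to zero because the softmax activations sum to 1 and exactly one target entry is 1.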
Description of variables in equations above
