$$\operatorname{vec}(\phi(x^l)) = M \operatorname{vec}(x^l).$$
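The relation vec(φ(x^l)) = M vec(x^l) can be made concrete on a tiny input. The following is a minimal sketch, not the book's implementation. It assumes stride 1, no padding, column-major vectorization, and the index convention p = i^{l+1} + H^{l+1} j^{l+1}, q = i + H j + H W d^l for the rows and columns of φ(x^l). The helper names `m`, `build_M`, and `im2col` and the toy sizes are hypothetical.

```python
def m(p, q, Hl, H, W):
    """Map an index pair (p, q) in phi(x) to the triplet (il, jl, dl) in x."""
    Ho = Hl - H + 1                                # output height (stride 1, no padding: assumed)
    io, jo = p % Ho, p // Ho                       # output location encoded by p
    i, j, dl = q % H, (q // H) % W, q // (H * W)   # offset inside the receptive field
    return io + i, jo + j, dl


def build_M(Hl, Wl, Dl, H, W):
    """Dense indicator matrix M such that vec(phi(x)) = M vec(x)."""
    P = (Hl - H + 1) * (Wl - W + 1)                # number of rows of phi(x)
    Q = H * W * Dl                                 # number of columns of phi(x)
    M = [[0] * (Hl * Wl * Dl) for _ in range(P * Q)]
    for q in range(Q):
        for p in range(P):
            il, jl, dl = m(p, q, Hl, H, W)
            # Row: position of (p, q) in vec(phi(x)); column: position of
            # (il, jl, dl) in vec(x). Both use column-major order.
            M[p + P * q][il + Hl * (jl + Wl * dl)] = 1
    return M


def im2col(x, H, W):
    """phi(x), built directly by copying patches: one row per output location."""
    Hl, Wl, Dl = len(x), len(x[0]), len(x[0][0])
    Ho, Wo = Hl - H + 1, Wl - W + 1
    return [[x[io + i][jo + j][d]
             for d in range(Dl) for j in range(W) for i in range(H)]
            for jo in range(Wo) for io in range(Ho)]


# Toy sizes (hypothetical): a 3 x 4 x 2 input and a 2 x 2 receptive field.
Hl, Wl, Dl, H, W = 3, 4, 2, 2, 2
x = [[[i + 10 * j + 100 * d for d in range(Dl)] for j in range(Wl)]
     for i in range(Hl)]

vec_x = [x[i][j][d] for d in range(Dl) for j in range(Wl) for i in range(Hl)]
phi = im2col(x, H, W)
vec_phi = [phi[p][q] for q in range(len(phi[0])) for p in range(len(phi))]

M = build_M(Hl, Wl, Dl, H, W)
M_vec_x = [sum(a * b for a, b in zip(row, vec_x)) for row in M]

print(M_vec_x == vec_phi)                  # both sides of the relation agree
print(all(sum(row) == 1 for row in M))     # each row of M holds a single 1
```

Each row of M contains exactly one 1, because every entry of φ(x^l) is a copy of a single entry of x^l. A column of M may contain several 1s, since one input entry can fall inside several receptive fields.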
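Supervision signal for the previous layer: In the $l$-th layer, we need to compute $\partial z / \partial \operatorname{vec}(x^l)$. For that, we reshape $x^l$ into a matrix $X \in \mathbb{R}^{(H^l W^l) \times D^l}$ and use these two forms, which are equivalent up to reshaping, interchangeably. By the chain rule, $\frac{\partial z}{\partial(\operatorname{vec}(x^l)^T)} = \frac{\partial z}{\partial(\operatorname{vec}(y)^T)} \, \frac{\partial \operatorname{vec}(y)}{\partial(\operatorname{vec}(x^l)^T)}$.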
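The chain rule above can be checked on a toy layer. The sketch below is illustrative only. It assumes the convolution is written as Y = φ(x^l) F with a filter matrix F of size (H W D^l) × D, stride 1, no padding, and column-major vectorization. The toy loss z = Σ C ⊙ Y, the matrix C (which plays the role of ∂z/∂Y), and all helper names are hypothetical. Integer data keeps every comparison exact.

```python
import random


def im2col(x, H, W):
    """phi(x): one row per output location, one column per receptive-field entry."""
    Hl, Wl, Dl = len(x), len(x[0]), len(x[0][0])
    Ho, Wo = Hl - H + 1, Wl - W + 1          # stride 1, no padding (assumed)
    return [[x[io + i][jo + j][d]
             for d in range(Dl) for j in range(W) for i in range(H)]
            for jo in range(Wo) for io in range(Ho)]


def conv(x, F, H, W):
    """Y = phi(x) F, with F of shape (H*W*Dl) x D."""
    D = len(F[0])
    return [[sum(row[q] * F[q][k] for q in range(len(F))) for k in range(D)]
            for row in im2col(x, H, W)]


def vec(A):
    """Column-major vectorization of a matrix stored as a list of rows."""
    return [A[r][c] for c in range(len(A[0])) for r in range(len(A))]


def unvec(v, Hl, Wl, Dl):
    """Inverse of the column-major vectorization of an Hl x Wl x Dl tensor."""
    return [[[v[i + Hl * (j + Wl * d)] for d in range(Dl)]
             for j in range(Wl)] for i in range(Hl)]


random.seed(0)
Hl, Wl, Dl, H, W, D = 3, 4, 2, 2, 2, 3       # toy sizes (hypothetical)
n = Hl * Wl * Dl
P = (Hl - H + 1) * (Wl - W + 1)
x_vec = [random.randint(-3, 3) for _ in range(n)]
F = [[random.randint(-3, 3) for _ in range(D)] for _ in range(H * W * Dl)]
C = [[random.randint(-3, 3) for _ in range(D)] for _ in range(P)]


def z(v):
    """Toy scalar loss z = sum(C * Y); its gradient with respect to Y is C."""
    Y = conv(unvec(v, Hl, Wl, Dl), F, H, W)
    return sum(c * y for c, y in zip(vec(C), vec(Y)))


# Jacobian d vec(y) / d vec(x)^T. The layer is linear in x, so column k is
# the layer's response to the k-th standard basis vector.
basis = [[1 if t == k else 0 for t in range(n)] for k in range(n)]
J_cols = [vec(conv(unvec(e, Hl, Wl, Dl), F, H, W)) for e in basis]

# Chain rule: dz/d vec(x)^T = (dz/d vec(y)^T) (d vec(y)/d vec(x)^T).
g_y = vec(C)
grad_chain = [sum(g * jc for g, jc in zip(g_y, col)) for col in J_cols]

# Reference: z is linear in x, so a unit forward difference gives each
# partial derivative exactly.
z0 = z(x_vec)
grad_diff = [z([a + b for a, b in zip(x_vec, e)]) - z0 for e in basis]

chain_rule_holds = grad_chain == grad_diff
print(chain_rule_holds)
```

Building the Jacobian explicitly is only practical at toy sizes. It is done here to mirror the two factors in the chain rule one to one.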
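In Eq. (3.101), $\frac{\partial z}{\partial Y} F^T \in \mathbb{R}^{(H^{l+1} W^{l+1}) \times (H W D^l)}$, and $\operatorname{vec}\left(\frac{\partial z}{\partial Y} F^T\right)$ is a vector in $\mathbb{R}^{H^{l+1} W^{l+1} H W D^l}$. At the same time, $M^T$ is an indicator matrix in $\mathbb{R}^{(H^l W^l D^l) \times (H^{l+1} W^{l+1} H W D^l)}$. To locate one element in $\operatorname{vec}(x^l)$ or one row in $M^T$, we need an index triplet $(i^l, j^l, d^l)$, with $0 \le i^l < H^l$, $0 \le j^l < W^l$, and $0 \le d^l < D^l$. Similarly, to locate a column in $M^T$ or an element in $\frac{\partial z}{\partial Y} F^T$, we need an index pair $(p, q)$, with $0 \le p < H^{l+1} W^{l+1}$ and $0 \le q < H W D^l$. Thus, the $(i^l, j^l, d^l)$-th entry of $\partial z / \partial \operatorname{vec}(x^l)$ is the inner product of two vectors: the row of $M^T$ (or the column of $M$) indexed by $(i^l, j^l, d^l)$, and $\operatorname{vec}\left(\frac{\partial z}{\partial Y} F^T\right)$. Since $M^T$ is an indicator matrix, in the row indexed by $(i^l, j^l, d^l)$ only the entries whose index $(p, q)$ satisfies $m(p, q) = (i^l, j^l, d^l)$ have the value 1, and all other entries are 0. Thus, the $(i^l, j^l, d^l)$-th entry of $\partial z / \partial \operatorname{vec}(x^l)$ equals the sum of the corresponding entries in $\operatorname{vec}\left(\frac{\partial z}{\partial Y} F^T\right)$. Therefore, we get the following succinct equation: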