柠檬少年 | Backpropagation Derivations for Each Layer of a Convolutional Neural Network


1. Forward propagation
Input layer:
Hidden layer (layer 2):
Layer 3:
Output layer:
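The layer labels above originally pointed at formula images that are not reproduced here. As a hedged reconstruction, the standard layer-wise forward pass they describe (with pre-activation z, activation a, and activation function σ; the exact notation is an assumption) is:

```latex
a^{(1)} = x, \qquad
z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}, \qquad
a^{(l)} = \sigma\!\left(z^{(l)}\right), \quad l = 2, \dots, L, \qquad
\hat{y} = a^{(L)}
```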
2. Backpropagation
Reference: the backpropagation algorithm.
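The derivation images for this section are likewise missing. Using the same notation as the forward pass above, the standard backpropagation recurrences (a hedged reconstruction, with J the loss and ⊙ elementwise product) are:

```latex
\delta^{(L)} = \nabla_{a} J \odot \sigma'\!\left(z^{(L)}\right), \qquad
\delta^{(l)} = \left(W^{(l+1)}\right)^{\!\top} \delta^{(l+1)} \odot \sigma'\!\left(z^{(l)}\right)
```

```latex
\frac{\partial J}{\partial W^{(l)}} = \delta^{(l)} \left(a^{(l-1)}\right)^{\!\top}, \qquad
\frac{\partial J}{\partial b^{(l)}} = \delta^{(l)}
```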
```python
import numpy as np

def conv_forward(X, W, b, stride=1, padding=1):
    # X: (n_x, d_x, h_x, w_x) input batch; W: (n_filters, d_filter, h_filter, w_filter)
    # b is expected to have shape (n_filters, 1) so it broadcasts over columns.
    n_filters, d_filter, h_filter, w_filter = W.shape
    n_x, d_x, h_x, w_x = X.shape

    h_out = (h_x - h_filter + 2 * padding) / stride + 1
    w_out = (w_x - w_filter + 2 * padding) / stride + 1
    if not h_out.is_integer() or not w_out.is_integer():
        raise Exception('Invalid output dimension!')
    h_out, w_out = int(h_out), int(w_out)

    # Unroll input patches into columns so convolution becomes one matrix product.
    X_col = im2col_indices(X, h_filter, w_filter, padding=padding, stride=stride)
    W_col = W.reshape(n_filters, -1)

    out = W_col @ X_col + b
    out = out.reshape(n_filters, h_out, w_out, n_x)
    out = out.transpose(3, 0, 1, 2)

    cache = (X, W, b, stride, padding, X_col)
    return out, cache

def conv_backward(dout, cache):
    X, W, b, stride, padding, X_col = cache
    n_filter, d_filter, h_filter, w_filter = W.shape

    # Bias gradient: sum the upstream gradient over batch and spatial axes.
    db = np.sum(dout, axis=(0, 2, 3))
    db = db.reshape(n_filter, -1)

    dout_reshaped = dout.transpose(1, 2, 3, 0).reshape(n_filter, -1)

    # Weight gradient: upstream gradient times the unrolled input patches.
    dW = dout_reshaped @ X_col.T
    dW = dW.reshape(W.shape)

    # Input gradient: propagate through the weights, then fold the
    # columns back into image form.
    W_reshape = W.reshape(n_filter, -1)
    dX_col = W_reshape.T @ dout_reshaped
    dX = col2im_indices(dX_col, X.shape, h_filter, w_filter,
                        padding=padding, stride=stride)

    return dX, dW, db
```
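`conv_forward` and `conv_backward` rely on `im2col_indices` and `col2im_indices`, which this excerpt does not define. A common NumPy implementation of these helpers, in the CS231n style (the exact signatures here are assumed to match what the listing expects), is:

```python
import numpy as np

def get_im2col_indices(x_shape, field_height, field_width, padding=1, stride=1):
    # Compute the (channel, row, col) index arrays that gather every
    # receptive-field patch of the padded input.
    N, C, H, W = x_shape
    out_height = (H + 2 * padding - field_height) // stride + 1
    out_width = (W + 2 * padding - field_width) // stride + 1

    i0 = np.repeat(np.arange(field_height), field_width)
    i0 = np.tile(i0, C)
    i1 = stride * np.repeat(np.arange(out_height), out_width)
    j0 = np.tile(np.arange(field_width), field_height * C)
    j1 = stride * np.tile(np.arange(out_width), out_height)
    i = i0.reshape(-1, 1) + i1.reshape(1, -1)
    j = j0.reshape(-1, 1) + j1.reshape(1, -1)
    k = np.repeat(np.arange(C), field_height * field_width).reshape(-1, 1)
    return (k, i, j)

def im2col_indices(x, field_height, field_width, padding=1, stride=1):
    # Stack every patch as a column: output shape
    # (C * field_height * field_width, N * out_height * out_width).
    p = padding
    x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')
    k, i, j = get_im2col_indices(x.shape, field_height, field_width,
                                 padding, stride)
    cols = x_padded[:, k, i, j]
    C = x.shape[1]
    cols = cols.transpose(1, 2, 0).reshape(field_height * field_width * C, -1)
    return cols

def col2im_indices(cols, x_shape, field_height=3, field_width=3,
                   padding=1, stride=1):
    # Inverse of im2col: scatter-add the columns back into image form,
    # accumulating gradients where patches overlap.
    N, C, H, W = x_shape
    H_padded, W_padded = H + 2 * padding, W + 2 * padding
    x_padded = np.zeros((N, C, H_padded, W_padded), dtype=cols.dtype)
    k, i, j = get_im2col_indices(x_shape, field_height, field_width,
                                 padding, stride)
    cols_reshaped = cols.reshape(C * field_height * field_width, -1, N)
    cols_reshaped = cols_reshaped.transpose(2, 0, 1)
    # np.add.at performs an unbuffered scatter-add over repeated indices.
    np.add.at(x_padded, (slice(None), k, i, j), cols_reshaped)
    if padding == 0:
        return x_padded
    return x_padded[:, :, padding:-padding, padding:-padding]
```

`col2im_indices` must use `np.add.at` rather than plain fancy-index assignment: when patches overlap, the same input pixel receives gradient from several columns, and those contributions have to accumulate.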