Given a single example \((x_i, y_i)\), where \(x_i\) is the image and \(y_i\) is the (integer) label, and using the shorthand \(s = f(x_i, W)\) for the vector of class scores, then:
the SVM loss has the form: \(L_i = \sum\limits_{j \neq y_i} \max(0, s_j - s_{y_i} + 1)\),
code format:

```python
import numpy as np

def L_i_vectorized(x, y, W):
    # scores for all classes at once: W has one row per class
    scores = W.dot(x)
    # half-vectorized hinge loss: margins for every class in one shot
    margins = np.maximum(0, scores - scores[y] + 1)
    # the correct class should not contribute to the loss
    margins[y] = 0
    loss_i = np.sum(margins)
    return loss_i
```
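As a quick sanity check, the function can be called on toy data; the shapes below (3 classes, 4-dimensional input) are made up for illustration:

```python
x = np.array([1.0, 2.0, -1.0, 0.5])   # hypothetical input vector
W = np.random.randn(3, 4) * 0.01      # small random weights, 3 classes
print(L_i_vectorized(x, y=0, W=W))    # per-example hinge loss
```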
The full loss averages over the dataset and adds a regularization term: \(L = \frac{1}{N}\sum\limits_{i=1}^{N}\sum\limits_{j \neq y_i}\max(0, f(x_i; W)_j - f(x_i; W)_{y_i} + 1) + \lambda R(W)\), where for L2 regularization \(R(W) = \sum\limits_k\sum\limits_l W_{k,l}^2\).
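A minimal sketch of this full loss, reusing L_i_vectorized from above; the layout of X (one example per column) and the names svm_loss and lam are assumptions for illustration, not from the original:

```python
def svm_loss(X, y, W, lam):
    # X: (D, N) data matrix, one example per column (assumed layout)
    # y: (N,) integer labels; W: (C, D) weights; lam: regularization strength
    N = X.shape[1]
    data_loss = sum(L_i_vectorized(X[:, i], y[i], W) for i in range(N)) / N
    reg_loss = lam * np.sum(W ** 2)   # L2 penalty: sum_k sum_l W_{k,l}^2
    return data_loss + reg_loss
```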
Why regularization helps, by example:
x = [1 1 1 1]
W1 = [1 0 0 0]
W2 = [0.25 0.25 0.25 0.25]
Without the regularization term, both weight vectors produce the same score (\(W_1 \cdot x = W_2 \cdot x = 1\)), even though by common sense \(W_2\) is better: it spreads its influence across all input dimensions instead of relying on a single one. Once the L2 regularization term is added, the two losses differ (\(R(W_1) = 1\) versus \(R(W_2) = 0.25\)), so the objective can tell them apart and prefers \(W_2\). A numeric check follows below.
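Working the numbers in NumPy (a small check, not part of the original text):

```python
x  = np.array([1.0, 1.0, 1.0, 1.0])
W1 = np.array([1.0, 0.0, 0.0, 0.0])
W2 = np.array([0.25, 0.25, 0.25, 0.25])
print(W1.dot(x), W2.dot(x))           # both 1.0: the data loss cannot tell them apart
print(np.sum(W1**2), np.sum(W2**2))   # 1.0 vs 0.25: the L2 penalty prefers W2
```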
Other Regularization Methods: Elastic net (L1 + L2), max norm regularization, Dropout (a sketch of the last one follows).
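As one concrete instance from that list, a minimal sketch of inverted dropout at training time; the keep probability p and the activations h are illustrative assumptions:

```python
def dropout_forward(h, p=0.5, train=True):
    # p: probability of keeping a unit (assumed value)
    if not train:
        return h                                  # inverted dropout is a no-op at test time
    mask = (np.random.rand(*h.shape) < p) / p     # drop units, rescale survivors by 1/p
    return h * mask
```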