Machine Learning Notes (3): Logistic Regression & Neural Networks. Exercise 3: One-vs-all & Neural Networks

    The logistic regression problems we met before had only two possible outputs, 0 or 1. What if there are many outputs, such as 0, 1, 2, 3, and so on? In this section we learn the one-vs-all idea to solve this problem.

    The idea is actually very simple. We already know how to solve the two-class case. For a multi-class problem, we first pick one class, label it 1, and label all the other classes 0; this turns the problem back into a two-class one, and we can train a classifier for this class. We then apply the same trick to each remaining class. With n classes we end up with n classifiers, and at prediction time we feed the data into all n classifiers; the one with the strongest response naturally picks out the corresponding class.
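The key building block is remapping the multi-class labels into binary ones for the class currently being trained. In Octave/MATLAB this is a one-line comparison; a minimal sketch, with values made up purely for illustration:

% y holds labels 1..num_labels, one per training example
y = [1; 3; 2; 3; 1];
c = 3;                 % the class we are currently training against
y_binary = (y == c)    % gives [0; 1; 0; 1; 0]

This same (y == c) trick appears in the oneVsAll training loop further down.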

In the assignment, plotting the data produces a grid of handwritten digits 0 through 9, so there are ten classes in total; our job is to recognize each digit.

    For this method, the basic form of the cost function and gradient is the same as before; but since we are now training a group of classifiers, pay attention to the dimensions in the matrix operations.
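For reference, the regularized cost and gradient we are implementing (identical to the previous exercise) are:

J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\Big[-y^{(i)}\log h_\theta(x^{(i)}) - (1-y^{(i)})\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)} + \frac{\lambda}{m}\theta_j \quad (j \ge 1)

with the \frac{\lambda}{m}\theta_j term dropped for j = 0, since the bias parameter is not regularized.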

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Hint: The computation of the cost function and gradients can be
%       efficiently vectorized. For example, consider the computation
%
%           sigmoid(X * theta)
%
%       Each row of the resulting matrix will contain the value of the
%       prediction for that example. You can make use of this to vectorize
%       the cost function and gradient computations. 
%
% Hint: When computing the gradient of the regularized cost function, 
%       there're many possible vectorized solutions, but one solution
%       looks like:
%           grad = (unregularized gradient for logistic regression)
%           temp = theta; 
%           temp(1) = 0;   % because we don't add anything for j = 0  
%           grad = grad + YOUR_CODE_HERE (using the temp variable)
%
h = sigmoid(X * theta);      % m x 1 vector of predictions h_theta(x)
% Regularized cost; theta(1) (the bias term) is excluded from the penalty
J = -1/m * (y' * log(h) + (1 - y)' * log(1 - h)) ...
    + lambda/(2*m) * (theta(2:end)' * theta(2:end));
theta_temp = theta;
theta_temp(1) = 0;           % do not regularize the bias parameter
grad = 1/m * X' * (h - y) + lambda/m * theta_temp;

This is the code to write in lrCostFunction.m.
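To sanity-check the function, you can call it on a tiny hand-made dataset before plugging it into the optimizer; the numbers below are arbitrary and only for illustration:

theta = [-2; -1; 1];
X = [1 8 1; 1 3 5; 1 4 9];   % bias column plus two features, 3 examples
y = [1; 0; 1];
lambda = 3;
[J, grad] = lrCostFunction(theta, X, y, lambda)

J should come out finite and grad should have the same size as theta; a NaN or a wrong shape means the vectorization is off.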

for c = 1:num_labels
    % Set initial theta (n = size(X, 2) is computed earlier in oneVsAll.m)
    initial_theta = zeros(n + 1, 1);

    % Set options for fmincg: use our analytic gradient, cap at 50 iterations
    options = optimset('GradObj', 'on', 'MaxIter', 50);

    % Run fmincg to obtain the optimal theta for class c.
    % (y == c) turns the multi-class labels into binary labels for this class.
    [theta] = ...
        fmincg (@(t)(lrCostFunction(t, X, (y == c), lambda)), ...
                initial_theta, options);
    all_theta(c, :) = theta';   % store this classifier's parameters as row c

end

This is the code to write in oneVsAll.m; it uses the fmincg function, a conjugate-gradient optimizer supplied with the assignment that handles large parameter vectors more efficiently than fminunc.
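Training all ten digit classifiers then reduces to a single call, roughly the way the ex3.m driver script does it (if I remember correctly, lambda = 0.1 is the value the script uses; note that in this dataset digit 0 is stored as label 10):

num_labels = 10;                 % digits, with '0' mapped to label 10
lambda = 0.1;                    % regularization strength
[all_theta] = oneVsAll(X, y, num_labels, lambda);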

With the two steps above we have trained the optimal Theta; finally, we can make a prediction.

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters (one-vs-all).
%               You should set p to a vector of predictions (from 1 to
%               num_labels).
%
% Hint: This code can be done all vectorized using the max function.
%       In particular, the max function can also return the index of the 
%       max element, for more information see 'help max'. If your examples 
%       are in rows, then, you can use max(A, [], 2) to obtain the max 
%       for each row.
%       

A = sigmoid(X * all_theta');   % each row: the ten classifiers' outputs for one example
[~, p] = max(A, [], 2);        % index of the row-wise maximum = predicted class

This is the code to write in predictOneVsAll.m.
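With the predictions in hand you can report the training-set accuracy the way ex3.m does (around 95% for this setup, if I recall the exercise correctly):

pred = predictOneVsAll(all_theta, X);
fprintf('Training Set Accuracy: %f\n', mean(double(pred == y)) * 100);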

The above is the whole learning process of the multi-class classifier.

Now we begin the neural network part.

A neural network imitates the way signals propagate between neurons in order to train on our dataset. In a neural network, the input does not go straight to the output; instead it passes through intermediate hidden layers, and the computation proceeds layer by layer until it reaches the last layer, which produces the output we want.


As the network diagram above shows, we therefore need to obtain the Theta for each layer.

In the assignment we are already given a neural network with known Theta. It has a single hidden layer, with weight matrices Theta1 and Theta2 (25 x 401 and 10 x 26 in the provided weights file, if I recall correctly); all we have to do is run the forward pass to make predictions.
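Written out, the forward pass that the code below implements is the following, where g is the sigmoid function and a bias unit is prepended to each layer's activations:

a^{(1)} = x
z^{(2)} = \Theta^{(1)} a^{(1)}, \quad a^{(2)} = g\big(z^{(2)}\big)
z^{(3)} = \Theta^{(2)} a^{(2)}, \quad h_\Theta(x) = a^{(3)} = g\big(z^{(3)}\big)

In the vectorized code all m examples are processed at once, one per row, which is why the products appear transposed (X * Theta1' instead of Theta1 * x).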

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned neural network. You should set p to a 
%               vector containing labels between 1 to num_labels.
%
% Hint: The max function might come in useful. In particular, the max
%       function can also return the index of the max element, for more
%       information see 'help max'. If your examples are in rows, then, you
%       can use max(A, [], 2) to obtain the max for each row.
%
X = [ones(m, 1) X];            % add the bias unit to the input layer
A2 = sigmoid(X * Theta1');     % hidden layer activations
A2 = [ones(m, 1) A2];          % add the bias unit to the hidden layer
A3 = sigmoid(A2 * Theta2');    % output layer activations, one column per class

[~, p] = max(A3, [], 2);       % predicted label = index of the largest output

The code is shown above; p is the vector of predicted labels.
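As before, ex3_nn.m evaluates these fixed weights on the training set (the exercise reports an accuracy of about 97.5%, if memory serves):

pred = predict(Theta1, Theta2, X);
fprintf('Training Set Accuracy: %f\n', mean(double(pred == y)) * 100);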
