> 2021-04-30 Information Digest

### Daily Machine Learning

#### Machine Learning EX2

##### Logistic Regression

```matlab
% Load Data
% The first two columns contain the exam scores and the third column
% contains the label.
data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);
```

##### 1.1 Visualizing the data

Before starting to implement any learning algorithm, it is always good to visualize the data if possible. The code below loads the data and displays it on a 2-dimensional plot by calling the function plotData. You will now complete the code in plotData so that it displays a figure like Figure 1, where the axes are the two exam scores, and the positive and negative examples are shown with different markers.

```matlab
% ----- plotData.m (function body) -----
% Find indices of positive and negative examples
pos = find(y == 1); neg = find(y == 0);
% Plot examples: + for admitted (y = 1), filled circle for not admitted (y = 0)
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
hold on;   % keep the first series when plotting the second
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);

% ----- driver script -----
% Plot the data with + indicating (y = 1) examples and o indicating (y = 0) examples
plotData(X, y);

% Labels and legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
```

##### 1.2 Implementation

###### 1.2.1 Warmup exercise: sigmoid function

Logistic regression hypothesis:
$$
h_\theta(x) = g(\theta^Tx),
$$
sigmoid function:
$$
g(z) = \frac{1}{1+e^{-z}}
$$

- sigmoid.m

```matlab
function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z (element-wise, so z may be
%   a scalar, vector, or matrix).
g = 1 ./ (1 + exp(-z));
end
```

Initialize the data:

```matlab
% Set up the data matrix appropriately
[m, n] = size(X);

% Add intercept term to X
X = [ones(m, 1) X];

% Initialize the fitting parameters
initial_theta = zeros(n + 1, 1);
```

Implementation of `costFunction.m` (both snippets live inside costFunction.m, where m = length(y) and grad has been initialized to zeros):

```matlab
% Implementation 1 (loop over parameters)
% c.f. https://github.com/mGalarnyk/datasciencecoursera/blob/master/Stanford_Machine_Learning/Week3/ex2/costFunction.m
J = (1/m) * sum((-y .* log(sigmoid(X*theta))) - ((1-y) .* log(1 - sigmoid(X*theta))));
for j = 1:length(theta)
    grad(j) = sum((sigmoid(X*theta) - y) .* X(:, j)) / m;
end

% Implementation 2 (vectorized)
% c.f. https://stackoverflow.com/questions/45653712/logistic-regression-cost-function
J = ((-y' * log(sigmoid(X*theta))) - ((1 - y)' * log(1 - sigmoid(X*theta)))) / m;
grad = ((sigmoid(X*theta) - y)' * X) / m;
```

##### 1.2.2 Cost function and gradient

```matlab
% Compute and display the initial cost and gradient
% (expected cost at the zero vector is log(2), approx 0.693)
[cost, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', cost);
disp('Gradient at initial theta (zeros):'); disp(grad);
>>   -0.1000
>>  -12.0092
>>  -11.2628

% Compute and display cost and gradient with non-zero theta
% (expected cost, approx: 0.218)
test_theta = [-24; 0.2; 0.2];
[cost, grad] = costFunction(test_theta, X, y);
fprintf('\nCost at non-zero test theta: %f\n', cost);
disp('Gradient at non-zero theta:'); disp(grad);
>>    0.0429
>>    2.5662
>>    2.6468
```

##### 1.2.3 Learning parameters using fminunc

In the previous assignment, you found the optimal parameters of a linear regression model by implementing gradient descent. You wrote a cost function and calculated its gradient, then took a gradient descent step accordingly. This time, instead of taking gradient descent steps, you will use a MATLAB built-in function called fminunc.
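For reference, the objective that `costFunction` computes and that fminunc will minimize can be written out as follows (standard notation for this exercise, added here for readability; it matches the vectorized implementation above):

$$
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]
$$

$$
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
$$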
```matlab
% Set options for fminunc
% (optimoptions requires the current option names shown here; the legacy
% equivalent optimset('GradObj', 'on', 'MaxIter', 400) also works)
options = optimoptions(@fminunc, 'Algorithm', 'quasi-newton', ...
                       'SpecifyObjectiveGradient', true, 'MaxIterations', 400);

% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
% (expected cost, approx: 0.203)
[theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta
fprintf('Cost at theta found by fminunc: %f\n', cost);
disp('theta:'); disp(theta);
>> theta:
>>  -25.1613
>>    0.2062
>>    0.2015

% Plot the decision boundary
plotDecisionBoundary(theta, X, y);

% Add some labels
hold on;
% Labels and legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;
```

##### 1.2.4 Evaluating logistic regression

After learning the parameters, you can use the model to predict whether a particular student will be admitted. For a student with an Exam 1 score of 45 and an Exam 2 score of 85, you should expect to see an admission probability of 0.776. Another way to evaluate the quality of the parameters we have found is to see how well the learned model predicts on our training set. In this part, your task is to complete the code in predict.m. The predict function will produce '1' or '0' predictions given a dataset and a learned parameter vector.

```matlab
% Predict probability for a student with score 45 on exam 1 and score 85 on exam 2
prob = sigmoid([1 45 85] * theta);
fprintf('For a student with scores 45 and 85, we predict an admission probability of %f\n\n', prob);

% Compute accuracy on our training set
% (expected accuracy, approx: 89.0)
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
```

- predict.m

```matlab
function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%
p = sigmoid(X * theta) >= 0.5;
% =========================================================================
end
```

##### 2. Regularized logistic regression

In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to ensure it is functioning correctly. Suppose you are the product manager of the factory and you have the test results for some microchips on two different tests. From these two tests, you would like to determine whether the microchips should be accepted or rejected. To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model.

One way to fit the data better is to create more features from each data point (a sketch of the feature-mapping function the exercise provides follows).
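The note refers to the course-provided mapFeature.m without listing it. The sketch below is a reconstruction from the description in the next paragraph (all polynomial terms of the two features up to degree six, plus a leading column of ones); the file actually shipped with the exercise may differ in minor details.

```matlab
function out = mapFeature(X1, X2)
%MAPFEATURE Map two input features to all polynomial terms up to degree 6.
%   out = MAPFEATURE(X1, X2) returns a matrix whose first column is all
%   ones (the intercept term), followed by X1, X2, X1.^2, X1.*X2, X2.^2,
%   ..., X2.^6 -- 28 columns in total.

degree = 6;
out = ones(size(X1(:,1)));                         % intercept column
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);    % append x1^(i-j) * x2^j
    end
end
end
```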
In the provided function mapFeature.m, we will map the features into all polynomial terms of $x_1$ and $x_2$ up to the sixth power.

```matlab
% The first two columns contain the X values and the third column
% contains the label (y).
data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

plotData(X, y);

% Put some labels
hold on;
% Labels and legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
% Specified in plot order
legend('y = 1', 'y = 0')
hold off;

% Add polynomial features
% Note that mapFeature also adds a column of ones for us, so the
% intercept term is handled
X = mapFeature(X(:,1), X(:,2));
```

While the feature mapping allows us to build a more expressive classifier, it is also more susceptible to overfitting. In the next parts of the exercise, you will implement regularized logistic regression to fit the data and also see for yourself how regularization can help combat the overfitting problem.

##### 2.3 Cost function and gradient

Now you will implement code to compute the cost function and gradient for regularized logistic regression.

```matlab
% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1
lambda = 1;

% Compute and display initial cost and gradient for regularized logistic regression
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

% Compute and display cost and gradient with all-ones theta and lambda = 10
test_theta = ones(size(X, 2), 1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);
fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');
```

- costFunctionReg.m

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
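%
% For reference (the standard regularized cost and gradient for this
% exercise, restated here as comments for readability; not part of the
% original template). With h = sigmoid(X*theta) and the intercept
% theta(1) left unregularized:
%   J       = (1/m) * sum(-y.*log(h) - (1-y).*log(1-h))
%             + (lambda/(2*m)) * sum(theta(2:end).^2)
%   grad(1) = (1/m) * sum((h - y) .* X(:,1))
%   grad(j) = (1/m) * sum((h - y) .* X(:,j)) + (lambda/m)*theta(j),  j >= 2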
J = (1/m) * sum((-y .* log(sigmoid(X*theta))) - ((1-y) .* log(1 - sigmoid(X*theta))));
J = J + (lambda/(2*m)) * sum(theta(2:end).^2);

grad(1) = sum((sigmoid(X*theta) - y) .* X(:,1)) / m;
for j = 2:length(theta)
    grad(j) = sum((sigmoid(X*theta) - y) .* X(:,j)) / m + (lambda * theta(j) / m);
end
% =============================================================
end
```

### Daily Algorithms

#### Implement a URL shortener with an alphanumeric string

[Implement a URL shortener with alphanumeric string](https://bhuwanupadhyay.github.io/posts/implement-a-url-shortener-with-alphanumeric-string/)

This problem was asked by Microsoft.

Implement a URL shortener with a six-character alphanumeric string.

##### Example

Implement a URL shortener with the following methods:

- `shorten(url)`, which shortens the url into a six-character alphanumeric string, such as `zLg6wl`.
- `restore(short)`, which expands the shortened string into the original url. If no such shortened string exists, return `null`.

What if we enter the same URL twice?

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class Solution {

    public static final int MAX = 6;
    private final Map<URL, URL> urlCache = new HashMap<>();
    private final AlphanumericRandomizer randomizer = new AlphanumericRandomizer();

    URL shorten(String url) {
        try {
            URL longUrl = URI.create(url).toURL();
            String shortValue = randomizer.next(MAX);
            URL shortUrl = new URL(longUrl.getProtocol(), longUrl.getHost(), "/" + shortValue);
            this.urlCache.put(shortUrl, longUrl);
            return shortUrl;
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    URL restore(URL shortValue) {
        // HashMap.get returns null when no such shortened URL exists
        return this.urlCache.get(shortValue);
    }

    /**
     * Alphanumeric characters are A to Z, a to z and 0 to 9.
     */
    static class AlphanumericRandomizer {

        private static final String CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        private final Random r;
        private final String alphaNumeric;

        public AlphanumericRandomizer() {
            r = new Random();
            this.alphaNumeric = CHARS + CHARS.toLowerCase() + "0123456789";
        }

        public String next(int size) {
            final int length = this.alphaNumeric.length();
            return IntStream.range(0, size)
                    .mapToObj(value -> r.nextInt(length))
                    .map(index -> String.valueOf(alphaNumeric.charAt(index)))
                    .collect(Collectors.joining());
        }
    }
}
```

As for entering the same URL twice: as written, `shorten` is not idempotent. Each call draws a fresh random string, so the same long URL ends up with multiple short URLs, and a random collision would silently overwrite an existing entry. Keeping a second map from long URL to short URL, consulted before generating a new string, would make repeated calls return the same short URL.

### Other Worthwhile Reads

### A Few Takeaways

- [Cognicull: Knowledge base for mathematics, natural science and engineering](https://cognicull.com/) | [HN](https://news.ycombinator.com/item?id=26981864)
- **Metacademy** - "Package Manager for Knowledge" - https://metacademy.org/
- **MathLingua** - a language for easily creating a collection of mathematical knowledge, including definitions, theorems, axioms, and conjectures, in a format designed to be easy and fun to read and write. - https://www.mathlingua.org/
- **Learn X in Y minutes** - https://learnxinyminutes.com/
- **Learn X by doing Y** - https://aquadzn.github.io/learn-x-by-doing-y/
- **Awesome Lists** - https://github.com/sindresorhus/awesome
- **ncatlab** - https://ncatlab.org/