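Machine Learning - Linear Regression
Univariate linear regression programming assignment - Python
## Andrew Ng's Machine Learning course, week 2 programming assignment: linear regression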
# https://www.coursera.org/learn/machine-learning/programming/8f3qT/linear-regression
Load the data
import pandas as pd
data = pd.read_csv('populations_profits.csv')
data.head()
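Extract the training data X, y
import numpy as np
# DataFrame.as_matrix was removed in pandas 1.0; double brackets keep the arrays 2-D
x = data[['populations']].to_numpy()
y = data[['profits']].to_numpy()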
Visualize the data
from matplotlib import pyplot as plt
plt.plot(x, y, 'rx')  # red crosses, one per training example
plt.show()
Train the Linear Regression model
from sklearn import linear_model
reg = linear_model.LinearRegression()
reg.fit(x, y)
print('coef: ', reg.coef_)
print('intercept: ', reg.intercept_)
coef: [[1.19303364]]
intercept: [-3.89578088]
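As a cross-check on what `LinearRegression` computes, the same kind of fit can be sketched with an ordinary least-squares solve in NumPy. This is a minimal sketch, not the course's or scikit-learn's implementation: the data below is synthetic (an assumption, since the CSV is not reproduced here), and the names `X` and `theta` are illustrative.

```python
import numpy as np

# Synthetic stand-in data (an assumption; not the assignment's CSV).
rng = np.random.default_rng(0)
x = rng.uniform(5.0, 25.0, size=(50, 1))
y = 1.2 * x - 4.0 + rng.normal(0.0, 1.0, size=(50, 1))

# Prepend a column of ones so the intercept is learned as theta[0].
X = np.hstack([np.ones((x.shape[0], 1)), x])

# Solve min ||X @ theta - y||^2; lstsq avoids explicitly inverting X.T @ X.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = theta.ravel()

print('coef: ', slope)
print('intercept: ', intercept)
```

On the real data, `slope` and `intercept` obtained this way should agree with `reg.coef_` and `reg.intercept_` up to floating-point error.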
Visualize the trained model
ypredict = reg.predict(x)
plt.plot(x, y, 'rx', x, ypredict, '-')  # data as red crosses, fitted line on top
plt.show()
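Training error of the model
from sklearn.metrics import mean_squared_error
# Halving the MSE gives the course's cost J = sum((prediction - y)^2) / (2m)
mean_squared_error(y, ypredict) / 2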
4.476971375975179
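The value above is half the mean squared error, which matches the squared-error cost used in the course, J = (1/(2m)) Σ (h(x_i) - y_i)². A minimal sketch of that formula in NumPy follows; the helper name `compute_cost` and the three-point example are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def compute_cost(y_true, y_pred):
    """Return J = sum((y_pred - y_true)^2) / (2 * m) for m examples."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    m = y_true.size
    return float(np.sum((y_pred - y_true) ** 2) / (2 * m))

# Tiny made-up example: the errors are 1, -1 and 2.
y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 1.0, 5.0]
cost = compute_cost(y_true, y_pred)
print(cost)  # (1 + 1 + 4) / (2 * 3) = 1.0
```

Calling `compute_cost(y, ypredict)` on the arrays from this section should reproduce the number printed above.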
Multivariate linear regression programming assignment - Python
Load the size-bedrooms-price data
import pandas as pd
data = pd.read_csv('size_bed_price.csv')
data.head()
Display some summary statistics for the data
data.describe()
data['size'].mean()
2000.6808510638298
Extract the training data X, y
# DataFrame.as_matrix was removed in pandas 1.0; to_numpy is its replacement
x = data[['size', 'bed']].to_numpy()
y = data[['price']].to_numpy()
Visualize the training data
from matplotlib import pyplot as plt
plt.figure(1)
# Left panel: price against house size
plt.subplot(121)
plt.plot(x[:, 0], y, 'rx')
plt.xlabel('size (sq ft)')
plt.ylabel('price')
# Right panel: price against number of bedrooms
plt.subplot(122)
plt.plot(x[:, 1], y, 'ro')
plt.xlabel('beds')
plt.ylabel('price')
plt.show()
Feature Scaling
from sklearn import preprocessing
# Standardize each column to zero mean and unit variance
x = preprocessing.scale(x)
plt.plot(x[:, 0], y, 'rx', x[:, 1], y, 'ro')
plt.ylabel('price')
plt.legend(['size-price', 'bed-price'])
plt.show()
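To make the scaling step concrete, here is a minimal NumPy sketch of column-wise standardization, which is what `preprocessing.scale` does with its default settings: subtract each column's mean and divide by its population standard deviation (`ddof=0`). The helper name `standardize` and the five sample rows are illustrative assumptions.

```python
import numpy as np

def standardize(features):
    """Scale each column to zero mean and unit variance (population std)."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)  # NumPy's default is ddof=0
    return (features - mean) / std, mean, std

# Illustrative [size, bed] rows (made up for this sketch).
raw = np.array([
    [2104.0, 3.0],
    [1600.0, 3.0],
    [2400.0, 3.0],
    [1416.0, 2.0],
    [3000.0, 4.0],
])
scaled, mean, std = standardize(raw)
print(scaled.mean(axis=0))  # approximately 0 in each column
print(scaled.std(axis=0))   # approximately 1 in each column
```

Returning `mean` and `std` alongside the scaled array is a deliberate choice in this sketch: any new sample has to be scaled with these same training statistics before it is passed to the model.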
Plot of the relationship between size and number of bedrooms
plt.plot(x[:, 0], x[:, 1], 'rx')
plt.xlabel('size')
plt.ylabel('bed')
plt.show()
Train the model
from sklearn import linear_model
reg = linear_model.LinearRegression()
reg.fit(x, y)
print('intercept: ', reg.intercept_)
print('coef: ', reg.coef_)
intercept: [340412.65957447]
coef: [[109447.79646964 -6578.35485416]]
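`LinearRegression` solves the least-squares problem directly, whereas the course assignment fits the same model with batch gradient descent. The following is a hedged sketch of that alternative on standardized features. The data is synthetic, and `gradient_descent`, `alpha` and `num_iters` are illustrative names and values, not taken from the notebook.

```python
import numpy as np

def gradient_descent(features, target, alpha=0.1, num_iters=1000):
    """Batch gradient descent on the squared-error cost; returns theta (intercept first)."""
    m = features.shape[0]
    Xb = np.hstack([np.ones((m, 1)), features])  # bias column for the intercept
    theta = np.zeros((Xb.shape[1], 1))
    for _ in range(num_iters):
        gradient = Xb.T @ (Xb @ theta - target) / m
        theta -= alpha * gradient
    return theta

# Synthetic [size, bed] -> price data (an assumption for this sketch).
rng = np.random.default_rng(1)
raw = rng.uniform([800.0, 1.0], [4500.0, 5.0], size=(40, 2))
target = (100.0 * raw[:, [0]] + 5000.0 * raw[:, [1]] + 50000.0
          + rng.normal(0.0, 1000.0, size=(40, 1)))

# Standardize first so a single learning rate suits both features.
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)
theta = gradient_descent(scaled, target)

# Reference solution from a direct least-squares solve.
Xb = np.hstack([np.ones((scaled.shape[0], 1)), scaled])
theta_ls, *_ = np.linalg.lstsq(Xb, target, rcond=None)
print(theta.ravel())
print(theta_ls.ravel())
```

With zero-mean features the fitted intercept equals the mean of the targets, which is why the intercept of a model trained on scaled features is on the scale of a typical price.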
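Predict the price of a house with a size of 1650 and 3 bedrooms
# Scale the new sample with the training mean and standard deviation.
# Note: pandas' std() defaults to ddof=1, while preprocessing.scale uses ddof=0,
# so these values differ slightly from the scaling applied to x; std(ddof=0) would match it.
s = (1650 - data['size'].mean()) / data['size'].std()
b = (3 - data['bed'].mean()) / data['bed'].std()
x_test = [[s, b]]
print('xx = ', x_test)
ypredict = reg.predict(x_test)
print('a 1650 sq-ft, 3 br house:', ypredict)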
xx = [[-0.4412732005944351, -0.2236751871685913]]
a 1650 sq-ft, 3 br house: [[293587.69488157]]
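One detail worth checking in this step: `preprocessing.scale` standardizes with the population standard deviation (`ddof=0`), while pandas' `std()` defaults to the sample standard deviation (`ddof=1`), so a new sample scaled with `data.std()` is not scaled exactly like the training data. The sketch below illustrates the effect on synthetic data, using plain NumPy least squares in place of scikit-learn. The data and the helpers `fit` and `predict` are assumptions made for illustration.

```python
import numpy as np

def fit(features, target):
    """Least-squares fit; returns theta with the intercept first."""
    Xb = np.hstack([np.ones((features.shape[0], 1)), features])
    theta, *_ = np.linalg.lstsq(Xb, target, rcond=None)
    return theta

def predict(theta, features):
    Xb = np.hstack([np.ones((features.shape[0], 1)), features])
    return Xb @ theta

# Synthetic [size, bed] -> price data (an assumption for this sketch).
rng = np.random.default_rng(2)
raw = rng.uniform([800.0, 1.0], [4500.0, 5.0], size=(40, 2))
price = (120.0 * raw[:, [0]] - 3000.0 * raw[:, [1]] + 80000.0
         + rng.normal(0.0, 5000.0, size=(40, 1)))

mean = raw.mean(axis=0)
std_train = raw.std(axis=0)           # ddof=0, the statistic used to scale training data
std_sample = raw.std(axis=0, ddof=1)  # the pandas default

theta_scaled = fit((raw - mean) / std_train, price)
theta_raw = fit(raw, price)

house = np.array([[1650.0, 3.0]])
pred_consistent = predict(theta_scaled, (house - mean) / std_train)
pred_mismatched = predict(theta_scaled, (house - mean) / std_sample)
pred_unscaled = predict(theta_raw, house)

print(pred_consistent, pred_unscaled)  # these two agree
print(pred_mismatched)                 # this one is slightly off
```

Scaling is an invertible linear change of variables, so a model trained on scaled features and queried with consistently scaled inputs gives the same prediction as a model trained on the raw features. That makes the unscaled fit a convenient sanity check.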