Day 15: Linear and Logistic regression with Scikit-Learn

Linear regression with Scikit-Learn

Import the libraries:

from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler
import numpy as np

Data features: in this example, each training example has 4 features, named below:

X_features = 'size(sqft)', 'bedrooms', 'floors', 'age'

Scale/normalize the training data:

scaler = StandardScaler()
X_norm = scaler.fit_transform(X_train) # X_train is the input data
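
To make the scaling step concrete, here is a minimal sketch of what StandardScaler does. The toy X_train values and the X_new sample are made up for illustration and are not the data used above. It shows that fit_transform is a per-feature z-score, and that new data should be scaled with transform() so it reuses the statistics learned from the training set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 5 houses x 4 features (size, bedrooms, floors, age)
X_train = np.array([
    [2104.0, 5.0, 1.0, 45.0],
    [1416.0, 3.0, 2.0, 40.0],
    [852.0, 2.0, 1.0, 35.0],
    [1534.0, 3.0, 2.0, 30.0],
    [1200.0, 3.0, 1.0, 20.0],
])

scaler = StandardScaler()
X_norm = scaler.fit_transform(X_train)  # learns mean/std per column, then scales

# The same z-score computed by hand from the learned statistics
X_manual = (X_train - scaler.mean_) / scaler.scale_

# A new (hypothetical) sample is scaled with transform(), not fit_transform(),
# so it uses the training mean and standard deviation
X_new = np.array([[1000.0, 2.0, 1.0, 25.0]])
X_new_norm = scaler.transform(X_new)

print(X_norm.mean(axis=0))  # each column is approximately 0
print(X_norm.std(axis=0))   # each column is approximately 1
```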

Create and fit the regression model:

sgdr = SGDRegressor(max_iter=1000)
sgdr.fit(X_norm, y_train)

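Note: SGDRegressor stands for Stochastic Gradient Descent Regressor. In class we learned about batch gradient descent, which uses all training examples for each parameter update; stochastic gradient descent instead updates the parameters one example at a time. scikit-learn also provides sklearn.linear_model.LinearRegression, which fits the same kind of linear model by solving the least-squares problem directly rather than by gradient descent.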
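
As a rough comparison, the sketch below fits both SGDRegressor and LinearRegression on the same synthetic data. The data, true_w, and the random seeds are assumptions made up for this example. LinearRegression solves the least-squares problem directly, while SGDRegressor iterates, so the two sets of parameters should be close but not identical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative only): 200 examples, 4 features, small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
true_w = np.array([3.0, -2.0, 0.5, 1.0])
y = X @ true_w + 10.0 + rng.normal(scale=0.1, size=200)

X_norm = StandardScaler().fit_transform(X)

# Direct least-squares solution
ols = LinearRegression().fit(X_norm, y)

# Iterative stochastic gradient descent (random_state fixed for repeatability)
sgd = SGDRegressor(max_iter=1000, random_state=0).fit(X_norm, y)

print("LinearRegression w:", ols.coef_, "b:", ols.intercept_)
print("SGDRegressor     w:", sgd.coef_, "b:", sgd.intercept_)
```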


View parameters:

b_norm = sgdr.intercept_
w_norm = sgdr.coef_

# model parameters based on the 4 features:
w_norm : [110.56 -21.27 -32.71 -37.97]
b_norm : [363.16]

Make predictions:

# make a prediction using sgdr.predict()
y_pred_sgd = sgdr.predict(X_norm)

# make a prediction using w, b
y_pred = np.dot(X_norm, w_norm) + b_norm
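
A quick way to convince yourself that the two prediction methods agree is to compare them numerically. This is a self-contained sketch on synthetic data (the data and seeds are assumptions, not the housing data from this post):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic training data (illustrative only)
rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 4))
y_train = X_train @ np.array([1.0, 2.0, -1.0, 0.5]) + 3.0

X_norm = StandardScaler().fit_transform(X_train)
sgdr = SGDRegressor(max_iter=1000, random_state=0).fit(X_norm, y_train)

# Prediction via the model's predict() method
y_pred_sgd = sgdr.predict(X_norm)

# Prediction computed by hand: X . w + b
# (intercept_ has shape (1,), so it broadcasts across all examples)
y_pred = np.dot(X_norm, sgdr.coef_) + sgdr.intercept_

print(np.allclose(y_pred, y_pred_sgd))
```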


Logistic regression with Scikit-Learn

You can also train a logistic regression model using scikit-learn:

from sklearn.linear_model import LogisticRegression

lr_model = LogisticRegression()
lr_model.fit(X, y)

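Note: to train with stochastic gradient descent instead, sklearn.linear_model.SGDClassifier may be used in place of LogisticRegression. Its default loss is the hinge loss (a linear SVM), so the logistic loss has to be selected explicitly to get a logistic regression model.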


You can see the predictions made by this model by calling its predict() method:

y_pred = lr_model.predict(X)

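You can calculate the accuracy of this model by calling its .score() method, which returns the fraction of examples classified correctly. Here it is evaluated on the same data the model was trained on, so it is the training accuracy: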

print(lr_model.score(X,y))
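
Training accuracy tends to be optimistic, so it can help to score the model on data it has not seen. The sketch below is one way to do that; the synthetic data, the 75/25 split, and the seeds are assumptions for illustration. It also shows that .score() is the mean of correct predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

# Hold out 25% of the examples for evaluation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

lr_model = LogisticRegression()
lr_model.fit(X_tr, y_tr)

# Accuracy computed by hand from the predictions on held-out data...
y_pred = lr_model.predict(X_te)
manual_acc = np.mean(y_pred == y_te)

# ...matches what .score() returns for the same data
test_acc = lr_model.score(X_te, y_te)
print(manual_acc, test_acc)
```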


