We have previously looked at gradient descent for linear regression. Today, we will look at gradient descent for logistic regression.

Recall that the purpose of running gradient descent is to find the values of the parameters **w** and **b**, and we do that by minimizing the cost function **J**.
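
For logistic regression, the cost **J** is the average of the logistic (cross-entropy) loss over the m training examples:

$$J(\mathbf{w},b) = -\frac{1}{m} \sum_{i=0}^{m-1} \left[\, y^{(i)} \log\!\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right) + \left(1 - y^{(i)}\right) \log\!\left(1 - f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right) \right]$$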

**Recall:**

**The model function we use in logistic regression:**
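
$$f_{\mathbf{w},b}(\mathbf{x}) = g(\mathbf{w} \cdot \mathbf{x} + b), \qquad \text{where} \qquad g(z) = \frac{1}{1 + e^{-z}}$$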

**The algorithm to minimize the cost function:**
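
$$\text{repeat until convergence:} \quad \begin{cases} w_j = w_j - \alpha \dfrac{\partial J(\mathbf{w},b)}{\partial w_j} & \text{for } j = 0 \ldots n-1 \\[1ex] b = b - \alpha \dfrac{\partial J(\mathbf{w},b)}{\partial b} \end{cases}$$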

**So, the gradient descent algorithm for logistic regression:**
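
With the partial derivatives written out (the expressions in square brackets are what `compute_gradient_logistic` computes below):

$$\text{repeat until convergence:} \quad \begin{cases} w_j = w_j - \alpha \left[ \dfrac{1}{m} \displaystyle\sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_j^{(i)} \right] & \text{for } j = 0 \ldots n-1 \\[1ex] b = b - \alpha \left[ \dfrac{1}{m} \displaystyle\sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) \right] \end{cases}$$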

##### Gradient descent implementation in Python

The gradient descent algorithm implementation has two components:

- The loop implementing the gradient descent algorithm (**gradient_descent**).
- The calculation of the partial derivatives, i.e. the terms inside the square brackets of the gradient descent algorithm above (**compute_gradient_logistic**).

The partial derivative with respect to **w[j]** will be denoted **dj_dw**, and the one with respect to **b** will be **dj_db**.

To implement the partial derivatives to find parameters **w** and **b**:

- Initialize variables to accumulate **dj_dw** and **dj_db**.
- For each example:
    - Calculate the error for that example, **f(x) - y[i]**.
    - For each input value **xj_i** in this example, multiply the error by the input **xj_i** and add it to the corresponding element of **dj_dw**.
    - Add the error to **dj_db**.
- Divide **dj_db** and **dj_dw** by the total number of examples (m).

Note that **x[i]** in numpy is X[i,:] or X[i], and **xj_i** is X[i,j].

```
import numpy as np

# sigmoid() is assumed to be defined elsewhere in these notes (a sketch is given below)

def compute_gradient_logistic(X, y, w, b):
    m, n = X.shape
    dj_dw = np.zeros((n,))                      # accumulator for the n partial derivatives w.r.t. w
    dj_db = 0.                                  # accumulator for the partial derivative w.r.t. b
    for i in range(m):
        f_wb_i = sigmoid(np.dot(X[i], w) + b)   # model prediction for example i
        err_i = f_wb_i - y[i]                   # error for example i
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err_i * X[i, j]
        dj_db = dj_db + err_i
    dj_dw = dj_dw / m                           # average over all m examples
    dj_db = dj_db / m
    return dj_db, dj_dw
```

**Notations:**

**Args:**
- X (ndarray (m,n)): data, m examples with n features
- y (ndarray (m,)): target values
- w (ndarray (n,)): model parameters
- b (scalar): model parameter

**Returns:**
- dj_dw (ndarray (n,)): the gradient of the cost w.r.t. the parameters w
- dj_db (scalar): the gradient of the cost w.r.t. the parameter b
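
The code above calls `sigmoid`, and `gradient_descent` below also calls `compute_cost_logistic`; neither is defined in this section. Below is a minimal sketch of both, assuming the standard sigmoid function and the binary cross-entropy cost from the earlier notes; the exact implementations there may differ in detail.

```
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function, applied element-wise
    return 1 / (1 + np.exp(-z))

def compute_cost_logistic(X, y, w, b):
    # Average logistic (cross-entropy) loss over the m examples
    m = X.shape[0]
    f_wb = sigmoid(X @ w + b)                                   # predictions for all m examples
    loss = -(y * np.log(f_wb) + (1 - y) * np.log(1 - f_wb))     # per-example loss
    return np.sum(loss) / m
```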

```
import copy
import math

def gradient_descent(X, y, w_in, b_in, alpha, num_iters):
    # An array to store cost J at each iteration
    J_history = []
    w = copy.deepcopy(w_in)    # avoid modifying the caller's array
    b = b_in
    for i in range(num_iters):
        # Calculate the gradient at the current parameters
        dj_db, dj_dw = compute_gradient_logistic(X, y, w, b)
        # Update parameters using w, b, alpha, and the gradient
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        # Save cost J at each iteration (capped to avoid excessive memory use)
        if i < 100000:
            J_history.append(compute_cost_logistic(X, y, w, b))
        # Print the cost 10 times over the run (or every iteration if num_iters < 10)
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i:4d}: Cost {J_history[-1]}")
    return w, b, J_history
```
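
As a quick illustration of how these functions fit together, the snippet below runs `gradient_descent` on a small hypothetical dataset. The data values, initial parameters, learning rate, and iteration count are made up for this example and are not from the lesson.

```
import numpy as np

# Hypothetical toy dataset: 6 examples, 2 features (values are illustrative only)
X_train = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5],
                    [3.0, 0.5], [2.0, 2.0], [1.0, 2.5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# Assumed starting parameters and hyperparameters
w_init = np.zeros(X_train.shape[1])
b_init = 0.0
alpha = 0.1
num_iters = 10000

w_out, b_out, J_history = gradient_descent(X_train, y_train, w_init, b_init, alpha, num_iters)
print(f"Final parameters: w = {w_out}, b = {b_out}")
```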
