
Day 6: Multiple Linear Regression



Previously, we looked at a version of linear regression with only one feature. Today, we're going to explore linear regression with multiple features. Take the housing data below as an example:


Size in sqft (X1) | Number of Bedrooms (X2) | Number of floors (X3) | Age of home in years (X4) | Price ($) in $1000s
------------------|-------------------------|-----------------------|---------------------------|--------------------
2104              | 5                       | 1                     | 45                        | 460
1416              | 3                       | 2                     | 40                        | 232
1534              | 3                       | 2                     | 30                        | 315
852               | 2                       | 1                     | 36                        | 178

A few notations:

  • X1, X2, X3, X4 denote the 4 input features

  • Xj denotes the j-th feature (j = 1...n)

  • n denotes the number of features (here n = 4); we'll use m to denote the number of training examples

  • X_i = the features of the i-th training example, e.g. X_2 = [1416, 3, 2, 40] # a row vector

  • Xj_i = the value of feature j in the i-th training example, e.g. X3_2 = 2
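This notation maps directly onto a NumPy array, with one row per training example and one column per feature. A quick sketch (keep in mind NumPy indices start at 0, so X_2 in the notation above is `X[1]` in code):

```python
import numpy as np

# training data from the table: one row per example, one column per feature
X = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [ 852, 2, 1, 36],
])

m, n = X.shape   # m = number of training examples, n = number of features
print(f"m = {m}, n = {n}")

print(X[1])      # X_2, the 2nd training example: [1416  3  2 40]
print(X[1, 2])   # X3_2, feature 3 of the 2nd example: 2
```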


Now that we have multiple features, we're going to define our model function differently:

f(X) = w1*X1 + w2*X2 + w3*X3 + w4*X4 + b

We can define W as a list of numbers that holds the parameters:

W = [w1, w2, w3, w4]

In math, this is called a vector, and to designate that something is a vector (that is, a list of numbers), we sometimes add an arrow on top of it.

So, we can rewrite our model function as follows. Note that b doesn't have an arrow on top of it, since the bias is a single number, not a vector:

f(X) = W · X + b

Here · is the dot product: multiply the two vectors element by element and sum the results, which is exactly the expanded calculation shown in the first equation above.
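As a small sketch, we can check that the expanded form and the dot-product form give the same prediction for the first training example. The weight and bias values below are made up for illustration; they are not trained parameters:

```python
import numpy as np

# features of the first training example: size, bedrooms, floors, age
x = np.array([2104, 5, 1, 45])

# illustrative (untrained) parameters -- made-up values for the demo
w = np.array([0.1, 4.0, 10.0, -2.0])
b = 80.0

# expanded form: w1*x1 + w2*x2 + w3*x3 + w4*x4 + b
f_expanded = w[0]*x[0] + w[1]*x[1] + w[2]*x[2] + w[3]*x[3] + b

# vector form: dot product plus bias
f_vector = np.dot(w, x) + b

print(f_expanded, f_vector)  # both give the same prediction: 230.4
```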


Vectorization


When implementing a learning algorithm, using vectorization will both make our code shorter and make it run much more efficiently.

With vectorization, we can easily handle models with many input features using NumPy's dot function:

f = np.dot(w, x) + b

The NumPy dot function is a vectorized implementation of the dot product operation between 2 vectors. The reason the vectorized implementation is much faster is that np.dot can use the parallel hardware in your computer.
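A rough sketch of that comparison in code (the vector size is made up for the demo, and exact timings will vary by machine):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                      # number of features (made-up size for the demo)
w = rng.random(n)
x = rng.random(n)
b = 4.0

# without vectorization: one multiply-add per feature, run sequentially in Python
start = time.perf_counter()
f_loop = 0.0
for j in range(n):
    f_loop += w[j] * x[j]
f_loop += b
loop_time = time.perf_counter() - start

# with vectorization: np.dot runs in optimized native code
start = time.perf_counter()
f_vec = np.dot(w, x) + b
vec_time = time.perf_counter() - start

print(f"loop: {loop_time:.5f}s  vectorized: {vec_time:.5f}s")
print(np.isclose(f_loop, f_vec))   # same answer either way
```

On most machines the vectorized version is faster by a wide margin, and the gap grows with the number of features.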




Matrices

Matrices are 2-dimensional arrays. The elements of a matrix are all of the same type.

NumPy's basic data structure is an indexable, n-dimensional array containing elements of the same type (dtype). Matrices have a 2-dimensional index [m, n].


Matrix creation

The same functions that created 1-D vectors will create 2-D arrays.

Below, the shape tuple is provided to achieve a 2-D result. Notice how NumPy uses brackets to denote each dimension. Notice further that NumPy, when printing, will print one row per line.


a = np.zeros((1, 5))
print(f"a shape = {a.shape}, a = {a}")

# result:
a shape = (1, 5), a = [[0. 0. 0. 0. 0.]]

a = np.zeros((2, 1))
print(f" a shape = {a.shape}, a = {a}")

# result:
a shape = (2, 1), a = [[0.]
                       [0.]]

a = np.random.random_sample((1, 1))
print(f"a shape = {a.shape}, a = {a}")

# result:
a shape = (1, 1), a = [[0.44236513]]

Indexing

Matrices include a second index. The two indexes describe [row, column]

Access can either return an element or a row or column. See below:


# vector indexing operations on matrices
a = np.arange(6).reshape(-1, 2)
print(f"a shape: {a.shape}, \na = {a}")

# result:
a shape: (3, 2),
a = [[0 1]
     [2 3]
     [4 5]]

# access an element
print(f"\na[2,0].shape: {a[2,0].shape}, a[2,0] = {a[2,0]}, "
      f"type(a[2,0]) = {type(a[2,0])} \naccessing an element returns a scalar")

# result:
a[2,0].shape: {}, a[2,0] = 4, type(a[2,0]) = <class 'numpy.int64'>
accessing an element returns a scalar

# access a row
print(f"a[2].shape: {a[2].shape}, a[2] = {a[2]}, type(a[2]) = {type(a[2])}")

#result:
a[2].shape: (2,), a[2] = [4 5], type(a[2]) = <class 'numpy.ndarray'>
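The post shows element and row access; for completeness, here is a sketch of column access, which works the same way by slicing over the row index:

```python
import numpy as np

a = np.arange(6).reshape(-1, 2)

# access a column: take all rows (:), fix the column index
print(f"a[:,0].shape: {a[:,0].shape}, a[:,0] = {a[:,0]}")
# a[:,0].shape: (3,), a[:,0] = [0 2 4]
```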

Slicing

Slicing creates an array of indices using a set of 3 values (start:stop:step). A subset of values is also valid. Its use is best explained by an example:



# vector 2-D slicing operations
a = np.arange(20).reshape(-1, 10)
print(f"a = \n {a}")

# result
a =
 [[ 0  1  2  3  4  5  6  7  8  9]
  [10 11 12 13 14 15 16 17 18 19]]

# access 5 consecutive elements (start:stop:step)
print("a[0, 2:7:1] = ", a[0, 2:7:1], ", a[0, 2:7:1].shape =", a[0, 2:7:1].shape, "a 1-D array")

# result:
a[0, 2:7:1] = [2 3 4 5 6], a[0, 2:7:1].shape = (5,) a 1-D array

# access 5 consecutive elements in 2 rows (start:stop:step)
print("a[:, 2:7:1] = ", a[:, 2:7:1], ", a[:, 2:7:1].shape =", a[:, 2:7:1].shape, "a 2-D array")

# result:
a[:, 2:7:1] = [[ 2  3  4  5  6]
               [12 13 14 15 16]], a[:, 2:7:1].shape = (2, 5) a 2-D array

# access all elements
print("a[:, :] = \n", a[:, :], ", a[:,:].shape =", a[:, :].shape)

# result:
a[:,:] =
 [[ 0  1  2  3  4  5  6  7  8  9]
  [10 11 12 13 14 15 16 17 18 19]], a[:,:].shape = (2, 10)

# access all elements in one row
print("a[1,:] = ", a[1,:], ", a[1,:].shape =", a[1,:].shape, "a 1-D array")
# note: a[1,:] is the same as a[1]

# result:
a[1,:] = [10 11 12 13 14 15 16 17 18 19], a[1,:].shape = (10,) a 1-D array
