Before diving into the code, you need to set up your environment:
- Clone the repository where this project is hosted:
git clone https://github.com/HarrySSH/LinearModels.git
cd LinearModels
- Create a Conda environment by running:
conda env create -f regression_environment.yml
- Activate the new environment:
conda activate regression_env
Welcome to your task of completing the implementation of a LinearRegression class using Python and NumPy. The goal of this exercise is to deepen your understanding of the linear regression algorithm by manually coding the methods used to fit the model to the data, make predictions, and evaluate the model's performance.
Purpose: Initializes the `LinearRegression` instance (the `__init__` method).
Tasks:
- Initialize `self.coefficients` to `None`. This will later hold the coefficients (weights) calculated by the `fit` method.
- Initialize `self.intercept` to `None`. This will store the intercept of the regression model.
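As a rough illustration of the two tasks above, a bare skeleton of the class might look like the following. Only the two attribute names come from the task description; the docstring and the trailing usage lines are illustrative additions, and the remaining methods are deliberately left out.

```python
class LinearRegression:
    """Skeleton only: fit, predict and Rsquared are left to implement."""

    def __init__(self):
        # Set by fit(): the weights learned for the feature columns.
        self.coefficients = None
        # Set by fit(): the constant (intercept) term of the model.
        self.intercept = None


# A freshly created model has not learned anything yet.
model = LinearRegression()
print(model.coefficients, model.intercept)
```

Starting both attributes at `None` makes it easy to detect, later on, that a method was called before the model was fitted.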
Purpose: Fits the linear regression model to the provided data (the `fit` method).
Tasks:
- Add a column of ones to the input feature matrix `X` to account for the intercept in the linear model.
- Compute the transpose of matrix `X`.
- Calculate the product of the transpose of `X` and `X` itself.
- Compute the inverse of this product.
- Calculate the product of the transpose of `X` and the target vector `y`.
- Solve for the coefficient vector using the Normal Equation (`XTX_inv * XTy`, where `*` denotes a matrix product). This vector includes the intercept as its first element.
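The steps above can be sketched as a standalone function. This is one possible arrangement, not the repository's reference solution: the function name `fit_normal_equation` and the variables `X_b` and `beta` are hypothetical, and the sketch assumes `X` is a 2-D array with one row per sample.

```python
import numpy as np


def fit_normal_equation(X, y):
    """Return [intercept, w1, w2, ...] via the Normal Equation (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: prepend a column of ones so the first weight acts as the intercept.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])

    # Steps 2-4: transpose, X^T X, and its inverse.
    XT = X_b.T
    XTX = XT @ X_b
    XTX_inv = np.linalg.inv(XTX)

    # Step 5: X^T y.
    XTy = XT @ y

    # Step 6: coefficient vector, intercept first.
    return XTX_inv @ XTy


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
beta = fit_normal_equation(X, y)
print(beta)
```

Note that `np.linalg.inv` raises `LinAlgError` when the product matrix is singular (for example, with duplicated feature columns). The exercise asks for the explicit inverse, but `np.linalg.solve` or `np.linalg.lstsq` is generally the more numerically stable choice outside of a teaching setting.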
Purpose: Makes predictions using the linear model (the `predict` method).
Tasks:
- Add a column of ones to the input feature matrix `X` so that it matches the matrix used during fitting (skip this only if `X` already contains that column).
- Compute the dot product of the feature matrix `X` and the coefficients (including the intercept) to predict the target variable.
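A minimal sketch of the prediction step, assuming the coefficient vector stores the intercept as its first element as described for fitting. The helper name `predict_with_intercept` and the example values are hypothetical.

```python
import numpy as np


def predict_with_intercept(X, beta):
    """Predict targets for X given beta = [intercept, w1, w2, ...] (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    # Add the same column of ones that was used when the model was fitted.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    # Each prediction is intercept + sum(weight * feature).
    return X_b @ beta


# Assumed coefficients for the line y = 1 + 2x.
beta = np.array([1.0, 2.0])
X_new = np.array([[4.0], [5.0]])
y_pred = predict_with_intercept(X_new, beta)
print(y_pred)
```

If the column of ones is forgotten here, the matrix product fails with a shape mismatch, which is a quick way to spot the mistake.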
Purpose: Calculates the R-squared value to evaluate the model's performance (the `Rsquared` method).
Tasks:
- Use the `predict` method to obtain predictions for the input feature matrix `X`.
- Calculate the total sum of squares (variation of `y` from its mean).
- Calculate the residual sum of squares (variation of `y` from the predicted values).
- Compute the R-squared value using the formula `1 - (residual sum of squares / total sum of squares)`.
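The formula above can be sketched as a small function that takes the true and predicted targets directly. The name `r_squared` and the sample arrays are illustrative; in the class, the predictions would come from `predict` instead of being passed in.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Compute 1 - SS_res / SS_tot for two equal-length arrays (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Total sum of squares: spread of y around its own mean.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    # Residual sum of squares: spread of y around the predictions.
    ss_res = np.sum((y_true - y_pred) ** 2)

    return 1.0 - ss_res / ss_tot


y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 5.0])  # last prediction is off by one
score = r_squared(y_true, y_pred)
print(score)
```

Here the total sum of squares is 5 and the residual sum of squares is 1, so the score is about 0.8. A perfect fit gives 1, and a model that does worse than always predicting the mean gives a negative value. The sketch does not guard against a constant `y`, where the total sum of squares is zero.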
To complete this exercise, follow these steps:
- Read and understand the purpose of each method and what it is supposed to accomplish.
- Start by implementing the `fit` method, as it computes the coefficients used by the other methods.
- Implement the `predict` method to make use of the coefficients computed in `fit`.
- Implement the `Rsquared` method to evaluate the model's performance using the predictions.
- Test each method as you implement it to ensure correctness.
Once you have implemented the methods, you can test your class using pytest. Run the following command in your terminal:
pytest
This command executes the test cases defined in the pytest test suite. Your implementation is correct when all of the tests pass.
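For a sense of what such a test can look like, here is a hedged, self-contained example. It is not taken from the repository's test suite: the helper `solve_normal_equation`, the test name, and the synthetic data are all assumptions made for illustration. pytest collects any function whose name starts with `test_` in files named `test_*.py`.

```python
import numpy as np


def solve_normal_equation(X, y):
    """Illustrative stand-in for the model's fitting logic."""
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.linalg.inv(X_b.T @ X_b) @ (X_b.T @ y)


def test_matches_numpy_least_squares():
    # Synthetic data with a fixed seed so the test is reproducible.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    y = 0.5 + X @ np.array([1.0, -2.0, 3.0]) + rng.normal(scale=0.1, size=50)

    beta = solve_normal_equation(X, y)

    # Compare against NumPy's own least-squares solver as a reference.
    X_b = np.hstack([np.ones((50, 1)), X])
    expected, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    assert np.allclose(beta, expected)
```

Comparing against an independent reference solver, rather than hard-coded numbers, keeps the test meaningful even if the synthetic data changes.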
Further reading on unit testing.