Making AI and ML comprehensible
Linear regression is a way of predicting an output variable (“Y”) based on one or more input variables (“Xs”). For example, a linear regression model might attempt to predict house prices (the output variable) based on input variables such as number of rooms, school district, and proximity to the ocean.
Input variables (Xs) are also called explanatory variables because they are used to “explain” variation in the output variable. In the housing prices example, the input variables help “explain” why each house sells at a certain price.
Linear regression should be used when (1) your target output variable is a numerical value, and (2) you can expect a roughly linear relationship between the input variables and the output variable. A linear relationship means that each one-unit change in an input variable (e.g., number of rooms) is associated with a constant change in the output variable (e.g., housing price). That change can be an increase or a decrease: more rooms may raise the price, while a greater distance from the ocean may lower it.
A linear regression model works by fitting a line (when there is one input variable), a plane (when there are two input variables), or a higher-dimensional hyperplane (when there are three or more). “Fitting” means that the model finds the line or plane which best explains the observed values of the output variable.
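As a rough illustration of fitting with two input variables, the sketch below fits a plane to a handful of made-up houses using NumPy's least-squares solver. The data, the coefficients used to generate it, and the helper name `fit_plane` are all assumptions chosen for the example, not values from a real housing dataset.

```python
import numpy as np

# Hypothetical toy data: each row is (number of rooms, safety rating).
X = np.array([
    [2.0, 3.0],
    [3.0, 5.0],
    [4.0, 4.0],
    [5.0, 8.0],
    [6.0, 6.0],
])

# Made-up prices generated exactly from: price = 50 + 30*rooms + 10*safety.
y = 50.0 + 30.0 * X[:, 0] + 10.0 * X[:, 1]


def fit_plane(X, y):
    """Return [weight per input..., intercept] for the least-squares plane."""
    # A column of ones lets the solver estimate the intercept as well.
    A = np.column_stack([X, np.ones(len(X))])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


coeffs = fit_plane(X, y)
w_rooms, w_safety, intercept = coeffs

# Predict the price of a hypothetical house with 4 rooms and safety rating 7.
predicted_price = w_rooms * 4.0 + w_safety * 7.0 + intercept
```

Because the toy prices were generated without noise, the fitted plane should recover the generating coefficients almost exactly; with real data the fit would only approximate the observed prices.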
The image below shows a linear regression model with a line fit between one input variable (X) and one output variable (Y). The x-axis shows the value of the input variable (such as a neighborhood’s safety rating) and the y-axis shows the value of the output variable (such as house price); each dot is one observed data point. The linear regression model determines the fit between these variables by finding the line which minimizes the total distance between the line and the data points, most commonly measured as the sum of squared vertical distances.
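One common way to make "minimizing the distance" concrete is ordinary least squares, which picks the line with the smallest sum of squared vertical gaps between the line and the points. The sketch below assumes that criterion and uses invented data; `fit_line` and `sum_squared_residuals` are illustrative helper names, not part of any library.

```python
import numpy as np

# Hypothetical data: safety rating (x) and house price in thousands (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([110.0, 135.0, 145.0, 170.0, 190.0])


def fit_line(x, y):
    """Least-squares slope and intercept for a single input variable."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def sum_squared_residuals(x, y, slope, intercept):
    """Total squared vertical distance between the line and the points."""
    predicted = slope * x + intercept
    return float(np.sum((y - predicted) ** 2))


slope, intercept = fit_line(x, y)
best = sum_squared_residuals(x, y, slope, intercept)

# A deliberately different line, for comparison with the fitted one.
other = sum_squared_residuals(x, y, slope + 1.0, intercept)
```

Comparing `best` with `other` shows the idea behind the fit: changing the slope away from the fitted value increases the total squared distance.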