Visualizing the results of a linear regression model is crucial for understanding how well the model fits the data and for spotting potential issues such as underfitting or overfitting. Here are a few common techniques:
- Scatter Plot with Regression Line: One of the simplest ways to visualize the results is to plot the actual data points together with the regression line fitted by the model. This works when there is a single input feature, and it shows at a glance how well the line tracks the data.
import matplotlib.pyplot as plt
# Assumes X (one input feature), y (targets), and predictions (model output for X) already exist
plt.scatter(X, y, label='Actual Data')
plt.plot(X, predictions, color='red', label='Regression Line')
plt.xlabel('Input Feature')
plt.ylabel('Target Variable')
plt.title('Linear Regression Results')
plt.legend()
plt.show()
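The snippets in this answer assume that X, y, and predictions already exist. As a minimal sketch for trying them out, the following builds synthetic single-feature data and fits a line with NumPy's polyfit. The data, the seed, and the helper name fit_line are illustrative assumptions, and any regression library could take the place of polyfit.

```python
import numpy as np

def fit_line(X, y):
    """Least-squares fit of y = slope * X + intercept (hypothetical helper)."""
    X = np.asarray(X, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    slope, intercept = np.polyfit(X, y, deg=1)  # highest-degree coefficient first
    predictions = slope * X + intercept
    return slope, intercept, predictions

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, size=100))  # sorted so line plots run left to right
y = 2.0 * X + 1.0 + rng.normal(0, 1.0, size=100)

slope, intercept, predictions = fit_line(X, y)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With X, y, and predictions defined this way, the plotting snippets can be run as written.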
- Residual Plot: A residual plot shows the differences between the actual target values and the predicted values. Residuals scattered randomly around the horizontal line y = 0, with a roughly even spread, suggest that a linear relationship is appropriate and that the error variance is roughly constant. A curved or funnel-shaped pattern points to a missing nonlinear term or to non-constant variance.
residuals = y - predictions
plt.scatter(X, residuals)
plt.axhline(y=0, color='red', linestyle='--')
plt.xlabel('Input Feature')
plt.ylabel('Residuals')
plt.title('Residual Plot')
plt.show()
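To see what a problematic residual plot looks like, here is a sketch under an assumed setup: a straight line fitted to data that actually follows a quadratic curve. The toy data and variable names are made up for illustration. The residuals form a U shape instead of a random band around zero.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed toy data: the true relationship is quadratic, but we fit a straight line
x = np.linspace(-3, 3, 61)
y_curved = x ** 2

slope, intercept = np.polyfit(x, y_curved, deg=1)
fitted = slope * x + intercept
curved_residuals = y_curved - fitted

fig, ax = plt.subplots()
ax.scatter(x, curved_residuals)
ax.axhline(y=0, color='red', linestyle='--')
ax.set_xlabel('Input Feature')
ax.set_ylabel('Residuals')
ax.set_title('Residual Plot for a Misspecified Model')
# Call plt.show() to display the figure in an interactive session
```

The residuals are positive at both ends of the range and negative in the middle, which is the kind of systematic pattern a well-specified model should not leave behind.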
- Histogram of Residuals: A histogram of the residuals helps you check whether they are approximately normally distributed. Normality of the errors is a common assumption in linear regression, and it matters mainly for confidence intervals and hypothesis tests on the coefficients.
plt.hist(residuals, bins=20)
plt.xlabel('Residuals')
plt.ylabel('Frequency')
plt.title('Histogram of Residuals')
plt.show()
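Judging normality from the bars alone can be hard, so one option is to overlay a normal curve with the same mean and standard deviation as the residuals. The sketch below does that with stand-in residuals drawn from a normal distribution. The sample, the seed, and the helper name normal_pdf are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return np.exp(-0.5 * z ** 2) / (std * np.sqrt(2.0 * np.pi))

# Stand-in residuals (an assumption for illustration); use your own y - predictions
rng = np.random.default_rng(1)
sample_residuals = rng.normal(0.0, 1.5, size=500)

mean, std = sample_residuals.mean(), sample_residuals.std(ddof=1)
grid = np.linspace(sample_residuals.min(), sample_residuals.max(), 200)

fig, ax = plt.subplots()
# density=True rescales the bars so they are comparable to a probability density
ax.hist(sample_residuals, bins=20, density=True, alpha=0.6, label='Residuals')
ax.plot(grid, normal_pdf(grid, mean, std), color='red', label='Normal curve')
ax.set_xlabel('Residuals')
ax.set_ylabel('Density')
ax.set_title('Histogram of Residuals with Normal Curve')
ax.legend()
# Call plt.show() to display the figure in an interactive session
```

Note that the y-axis now shows density rather than frequency, because the histogram has to be on the same scale as the curve.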
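- Predicted vs. Actual Plot: Another way to visualize the model's performance is to plot the predicted values against the actual values. The closer the points lie to the diagonal line y = x, the more closely the predictions match the true values.
import numpy as np
plt.scatter(y, predictions)
# Diagonal reference line: points on it are predicted exactly
lo, hi = np.min(y), np.max(y)
plt.plot([lo, hi], [lo, hi], color='red', linestyle='--')
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Predicted vs. Actual Values')
plt.show()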
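A single number can accompany this plot. The sketch below computes the coefficient of determination, R^2 = 1 - SS_res / SS_tot, directly from its definition. The helper name r_squared and the toy values are assumptions for illustration, not part of any particular library.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)      # variation left unexplained
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Toy values (assumed): predictions close to the actual targets
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]
print(r_squared(actual, predicted))
```

A value near 1 corresponds to points hugging the diagonal in the plot, while a value near 0 means the predictions do no better than always predicting the mean of the actual values.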
- Q-Q Plot (Quantile-Quantile Plot): A Q-Q plot compares the quantiles of the residuals with the quantiles of a theoretical normal distribution. Points that lie close to the reference line support the normality assumption, while systematic departures at the ends suggest heavy tails or skew.
import scipy.stats as stats
stats.probplot(residuals, dist="norm", plot=plt)
plt.title("Q-Q Plot")
plt.show()
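When probplot is called without the plot argument, it returns the Q-Q points and the fitted reference line instead of drawing them, which is handy if you want a rough numeric summary. The sketch below uses stand-in residuals drawn from a normal distribution. The sample and seed are assumptions for illustration.

```python
import numpy as np
import scipy.stats as stats

# Stand-in residuals (assumed for illustration); use your own y - predictions
rng = np.random.default_rng(2)
sample_residuals = rng.normal(0.0, 1.0, size=500)

# Without plot=..., probplot only computes the Q-Q points and a least-squares line
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    sample_residuals, dist="norm"
)
print(f"Q-Q line: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

Here r is the correlation between the theoretical and observed quantiles. A value close to 1 means the points lie close to the line, but it is an informal indicator, not a formal normality test.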
Together, these visualizations show how well your linear regression model fits the data and whether its assumptions appear to hold. Which ones to use depends on the characteristics of your data and the goals of your analysis.