Why R² Needs to Be Adjusted


Dec 05, 2021

Adjusted R-Square: Why Does R-Square Need to Be Adjusted?

R-square and adjusted R-square are two of the most widely used metrics for evaluating a regression model. In this discussion we will look at the difference between them and when to use which, so let’s start with a brief introduction to each:

R-square

It is an evaluation metric for a regression model, defined as the proportion of the variation in the dependent variable that is explained by the independent variables. In other words, R² tells us how much of the total variation in the dependent variable the model accounts for. It is calculated from the total variation in the observed values of the dependent variable and the variation left unexplained by the predictions (the sum of squared residuals).

For a least-squares model with an intercept, evaluated on the data it was fitted to, R² ranges from 0 to 1 (or 0 to 100%). An R² of 1 means that the variation in the dependent variable is fully explained by the independent variables. R² is closely tied to correlation: with a single independent variable, it equals the square of the correlation coefficient between the independent and dependent variables, so the stronger the correlation, the higher the R² and the better the fit. R² becomes zero when the regression model performs no better than the averaging model, which always predicts the mean of the dependent variable. We can visualize the regression model with the below figure.

The equation of R² is given as:

R² = 1 − SSR/SST

Where,

SSR = The sum of squared residual errors (Unexplained variability in fig.)

SST = The total sum of squares, i.e. the total variation of the points around their mean (Total variability in fig.)
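As a minimal sketch of the formula above, the following pure-Python function computes R² from observed and predicted values. The function name `r_squared` and the sample numbers are made up for illustration; they are not taken from any library.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SSR/SST for paired lists of observed and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    # SSR: sum of squared residuals (unexplained variability)
    ssr = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # SST: total sum of squares around the mean (total variability)
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ssr / sst


y_true = [3.0, 5.0, 7.0, 9.0]      # made-up observations, mean = 6.0
perfect = [3.0, 5.0, 7.0, 9.0]     # predictions that match every point
mean_only = [6.0, 6.0, 6.0, 6.0]   # the averaging model

print(r_squared(y_true, perfect))    # 1.0
print(r_squared(y_true, mean_only))  # 0.0
```

A perfect fit leaves no residual error, so R² is 1; predicting the mean everywhere leaves SSR equal to SST, so R² is 0.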

Drawback of R²

The problem with R² is that whenever we add a new independent variable to the model, the R² value (for a least-squares fit, measured on the training data) never decreases, and in practice it almost always increases, regardless of whether the added variable actually improves the model. So, by looking at R² alone we cannot judge whether a newly added feature is improving the model or not.
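This drawback can be checked with a small experiment. The sketch below fits an ordinary least-squares model twice on synthetic data, once with a genuine predictor and once with an extra column of pure noise, and compares the two R² values. Everything here is an assumption made for illustration: the helper names `solve` and `fit_r_squared`, the data-generating process, and the use of the normal equations (a numerical library would normally be used instead).

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_r_squared(columns, y):
    """Fit least squares with an intercept on the feature columns; return R²."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations: (XᵀX) beta = Xᵀy
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    pred = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ssr = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ssr / sst


random.seed(0)
n = 30
x1 = [random.uniform(0, 10) for _ in range(n)]           # genuine predictor
noise_feature = [random.gauss(0, 1) for _ in range(n)]   # unrelated to y
y = [2.0 * v + 1.0 + random.gauss(0, 2) for v in x1]     # synthetic target

r2_one = fit_r_squared([x1], y)
r2_two = fit_r_squared([x1, noise_feature], y)
print(r2_one, r2_two)  # the second value is not lower than the first
```

The noise column carries no information about `y`, yet the fit can still use it to shave a little off SSR, so R² does not go down.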

Adjusted R²

Adjusted R² is a modified version of R² that accounts for whether adding a new feature actually improves the model. The value of adjusted R² increases only when the added feature reduces the residual error enough to offset the degree of freedom it uses up. If the newly added feature has little or no effect on the dependent variable, the adjusted R² will not increase; it will typically decrease.

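The equation of adjusted R² is given as:

Adjusted R² = 1 − (SSR/(n − k)) / (SST/(n − 1))

Where,

n = The number of observations

k = The number of estimated coefficients, including the intercept (so n − k is the residual degrees of freedom)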

The reason adjusted R² gives a better assessment of the model is that it takes the degrees of freedom into account: SSR and SST are each divided by their degrees of freedom, so every extra coefficient has to reduce SSR enough to pay for itself.
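To make the effect of the degrees of freedom concrete, here is a minimal sketch of the adjusted R² formula applied to two hypothetical models. The function name `adjusted_r_squared` and all numbers are invented for illustration, and `k` is assumed to count every estimated coefficient, intercept included.

```python
def adjusted_r_squared(ssr, sst, n, k):
    """Adjusted R² = 1 - (SSR/(n-k)) / (SST/(n-1)).

    Assumption: k counts all estimated coefficients, including the intercept.
    """
    return 1 - (ssr / (n - k)) / (sst / (n - 1))


# Hypothetical setup: 21 observations with SST = 100.
n, sst = 21, 100.0

# Model A: intercept + 1 predictor, SSR = 40.0
r2_a = 1 - 40.0 / sst
adj_a = adjusted_r_squared(40.0, sst, n, k=2)

# Model B: intercept + 2 predictors, SSR = 39.5 (the new feature barely helps)
r2_b = 1 - 39.5 / sst
adj_b = adjusted_r_squared(39.5, sst, n, k=3)

print(round(r2_a, 3), round(r2_b, 3))    # 0.6 0.605
print(round(adj_a, 4), round(adj_b, 4))  # 0.5789 0.5611
```

With these numbers R² rises slightly from model A to model B, while adjusted R² falls, because the small drop in SSR does not make up for the extra coefficient.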

Conclusion

So, when it comes to assessing the goodness of fit of a regression model after adding new predictive features, we prefer adjusted R², because it shows whether a feature genuinely improves the model. For a single independent variable, R² and adjusted R² give almost the same result, provided the number of observations is reasonably large.