Why R2 needs to be adjusted

Why does R-Square need to be adjusted? (Dec 06, 2021)

Adjusted R-Square: Why does R-Square need to be adjusted?

R-square and adjusted R-square are the two most commonly used metrics for evaluating a regression model. In this discussion we look at the difference between the two and when to use which, so let's start with a brief introduction to each.

R-square

It is an evaluation metric for a regression model, defined as the proportion of the variation in the dependent variable that is explained by the independent variables. In other words, R2 tells us how much of the total variation in the dependent variable is accounted for by the model. It is calculated from two quantities: the total variation in the dependent variable and the variation left unexplained by the predictions (the sum of squared residuals).

For an ordinary least-squares model with an intercept, evaluated on the data it was fitted to, R2 ranges from 0 to 1 (0 to 100%). An R2 of 1 means that the total variation in the dependent variable is fully explained by the independent variables. R2 is closely tied to correlation: for a single independent variable, R2 equals the square of the correlation between the independent and dependent variable, so the stronger the correlation, the higher the R2 and the better the model fits the data. R2 becomes zero when the regression model does no better than the averaging model, which always predicts the mean of the dependent variable. The figure below visualizes the regression model.

The equation of R2 is given as:

R2 = 1 - SSR/SST

Where,

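SSR = the sum of squared residuals, i.e. the squared differences between the observed and the predicted values (Unexplained variability in fig.)

SST = the total sum of squares, i.e. the squared differences between the observed values and their mean (Total variability in fig.)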
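To make the formula concrete, here is a minimal Python sketch that computes R2 directly from SSR and SST. The data points and the helper name r_squared are made up for illustration, and the line is fitted with the textbook closed-form formulas for a single independent variable. It also compares the fitted line with the averaging model and with the squared correlation.

```python
import math

def r_squared(y_true, y_pred):
    """Return R2 = 1 - SSR/SST for observed values and model predictions."""
    mean_y = sum(y_true) / len(y_true)
    ssr = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained variability
    sst = sum((y - mean_y) ** 2 for y in y_true)             # total variability
    return 1 - ssr / sst

# Illustrative data (invented for this example)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# Least-squares line y = a + b*x from the closed-form formulas
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

line_pred = [a + b * xi for xi in x]
mean_pred = [mean_y] * len(y)        # the "averaging model"

r2_line = r_squared(y, line_pred)
r2_mean = r_squared(y, mean_pred)
r = sxy / math.sqrt(sxx * syy)       # Pearson correlation of x and y

print(f"R2 of fitted line:     {r2_line:.4f}")
print(f"R2 of averaging model: {r2_mean:.4f}")
print(f"Squared correlation:   {r * r:.4f}")
```

The averaging model has SSR equal to SST, so its R2 comes out as zero, while the fitted line's R2 matches the squared correlation between x and y.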

Drawback of R2

The problem with R2 is that whenever we add a new independent variable to a least-squares model, the R2 value measured on the training data never decreases, and in practice it almost always increases, regardless of whether the added variable actually improves the model. So, by looking at R2 alone we cannot judge whether a newly added feature is improving the model or not.
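As a rough illustration of this drawback, the sketch below fits an ordinary least-squares model twice on synthetic data: once with a single informative feature, and once with an extra column of pure random noise. Everything here is made up for the example (the data, the seed, and the helper names fit_ols and r_squared), and the small normal-equations solver is only there to keep the snippet dependency-free; it is not meant as a production fitting routine.

```python
import random

def fit_ols(X, y):
    """Return fitted values of an OLS model with intercept (normal equations)."""
    rows = [[1.0] + list(r) for r in X]            # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda idx: abs(A[idx][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return [sum(bj * v for bj, v in zip(beta, r)) for r in rows]

def r_squared(y_true, y_pred):
    """Return R2 = 1 - SSR/SST."""
    mean_y = sum(y_true) / len(y_true)
    ssr = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    sst = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ssr / sst

random.seed(0)                                      # arbitrary seed for repeatability
n = 40
x = [random.uniform(0, 10) for _ in range(n)]       # informative feature
junk = [random.gauss(0, 1) for _ in range(n)]       # unrelated to y by construction
y = [2 + 3 * xi + random.gauss(0, 2) for xi in x]

r2_without = r_squared(y, fit_ols([[xi] for xi in x], y))
r2_with = r_squared(y, fit_ols([[xi, ji] for xi, ji in zip(x, junk)], y))

print(f"R2 with x only:       {r2_without:.6f}")
print(f"R2 with x and noise:  {r2_with:.6f}")
```

Because the larger model can always set the new coefficient to zero, its SSR on the training data cannot exceed that of the smaller model, so the second R2 should never be lower than the first, even though the extra column carries no information about y.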

Adjusted R2

Adjusted R2 is a modified version of R2 that accounts for whether a newly added feature actually improves the model. It increases only when the new feature reduces the residual error by more than would be expected from adding an arbitrary variable. If the new feature has little or no effect on the dependent variable, adjusted R2 does not increase; it decreases.

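The equation of adjusted R2 is given as:

Adjusted R2 = 1 - (SSR / (n - k)) / (SST / (n - 1))

Where,

n = the number of observations

k = the number of estimated coefficients, including the intercept (so a model with p independent variables has k = p + 1)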

Adjusted R2 gives a better assessment of the model because it takes the degrees of freedom into account: SSR is divided by n - k and SST by n - 1, so every additional coefficient has to reduce SSR enough to pay for the degree of freedom it uses up.
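The effect of the degrees-of-freedom correction can be seen with a small worked example. The sketch below is a direct translation of the formula above; it assumes that k counts every estimated coefficient including the intercept, which is one reading of the n - k term. The function names and the SSR/SST numbers are hypothetical and chosen only for illustration.

```python
def r_squared(ssr, sst):
    """Plain R2 = 1 - SSR/SST."""
    return 1 - ssr / sst

def adjusted_r_squared(ssr, sst, n, k):
    """Adjusted R2 = 1 - (SSR / (n - k)) / (SST / (n - 1)).

    Assumption: k counts all estimated coefficients, including the intercept.
    """
    if n - k <= 0:
        raise ValueError("need more observations than coefficients")
    return 1 - (ssr / (n - k)) / (sst / (n - 1))

# Hypothetical numbers: 50 observations, total variation of 200
n, sst = 50, 200.0
models = {
    "baseline (1 feature)":  (40.0, 2),   # (SSR, k)
    "plus a weak feature":   (39.9, 3),   # SSR barely moves
    "plus a useful feature": (30.0, 3),   # SSR drops noticeably
}

for name, (ssr, k) in models.items():
    print(f"{name:24s} R2 = {r_squared(ssr, sst):.4f}   "
          f"adjusted R2 = {adjusted_r_squared(ssr, sst, n, k):.4f}")
```

With these numbers the weak feature nudges R2 up from 0.8000 to 0.8005, yet adjusted R2 falls from 0.7958 to 0.7920; the useful feature raises both, to 0.8500 and 0.8436 respectively.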

Conclusion

So, when assessing the goodness of fit of a regression model after adding new predictive features, we prefer adjusted R2, because it shows whether each feature genuinely improves the model. For a single independent variable, R2 and adjusted R2 give almost the same result, particularly when the number of observations is large.
