This is a short note based on this.

Short answer: because different formulas are used to calculate the R-squared of a linear regression, depending on whether it has an intercept or not.

R² for a linear model that has an intercept:

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where $y$ is the variable that the linear model is trying to predict (the response variable), $\hat{y}$ is the predicted value and $\bar{y}$ is the mean value of the response variable.

And the R² for a linear model that is forced through the origin:

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i y_i^2}$$

Here the mean value of the response variable is removed from the equation, which makes the denominator bigger, the result of the division smaller, and therefore the R² larger. The mean cannot be used in this calculation because it no longer makes sense as a baseline: forcing the fit through zero is a bit like adding an infinite number of (0,0) points to the dataset.
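To see why the denominator grows, it may help to expand the uncentered sum of squares. This is a standard identity rather than something taken from the original note, and $n$ (the number of observations) is a symbol introduced here only for illustration:

$$\sum_i y_i^2 = \sum_i (y_i - \bar{y})^2 + n\,\bar{y}^2$$

Since $n\,\bar{y}^2 \ge 0$, the through-origin denominator is never smaller than the centered one, and the gap widens as the mean of the response moves away from zero.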

This means that the R-squared values of two different linear models (one with an intercept, one without) cannot be compared directly. When the intercept is small relative to the residuals (essentially the numerator), the R² “advantage” that the through-origin regression gets from its larger denominator outweighs the reduction in residuals that comes from including the intercept.
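The effect can be reproduced with a small sketch in plain Python. The data points are made up for illustration, and the helper names (`fit_with_intercept`, `fit_through_origin`, `sse`) are hypothetical, not taken from any library. Both models are fitted by ordinary least squares, and each R² uses the formula that matches its model.

```python
def fit_with_intercept(x, y):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


def fit_through_origin(x, y):
    """Least squares fit of y = b*x (no intercept); returns b."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


def sse(y, y_hat):
    """Sum of squared residuals (the numerator in both R^2 formulas)."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))


# Made-up illustrative data: noisy, with a small true intercept.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

a, b = fit_with_intercept(x, y)
b0 = fit_through_origin(x, y)

sse_int = sse(y, [a + b * xi for xi in x])   # residuals, model with intercept
sse_org = sse(y, [b0 * xi for xi in x])      # residuals, model through origin

y_mean = sum(y) / len(y)
sst_centered = sum((yi - y_mean) ** 2 for yi in y)  # denominator with intercept
sst_raw = sum(yi ** 2 for yi in y)                  # denominator through origin

r2_int = 1 - sse_int / sst_centered
r2_org = 1 - sse_org / sst_raw

print(f"with intercept: SSE = {sse_int:.3f}, R^2 = {r2_int:.3f}")
print(f"through origin: SSE = {sse_org:.3f}, R^2 = {r2_org:.3f}")
# with intercept: SSE = 3.600, R^2 = 0.640
# through origin: SSE = 3.927, R^2 = 0.929
```

On this toy data the through-origin model fits slightly worse (larger residuals) yet reports a much higher R², because its denominator is 55 instead of 10.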