The following statistical data analysis is motivated by the work of Dr. John Reid There is no significant trend in global average temperature or – in everyday language without formula – Is There a Trend in Global Average Temperature?
His publication shows that due to strong correlation between successive yearly “global temperature measurements” the usual deterministic regression model with an assumed systematic trend, even when overlayed with a cyclic component, do not fulfill the statistically necessary requirements of a “random white noise” residuum. A much more appropriate model is achieved by assuming autoregressive effects, combined with a “red noise” stochastic component in the moving average term, explaining all temperature variations with 2 effects:
- next year’s temperature is based on nearly (slightly less than factor 1) this year’s temperature (“autoregressive”)
- on top of this there are random variations, slow variations are more likely than fast variations (“red noise”)
Dr. Reid discusses at length, why the “autoregressive parameter” is smaller than 1, but very close to 1. In fact this cannot be decided with absolute certainty by the 170 year long HadCRUT4 temperature data series, but requires the longer, ice core based series GISP2. This is important, because a factor 1 would imply a true “random walk”, resulting in physically not feasible unlimited temperature deviations. A value smaller than 1 constrains this technically called centrally biased random walk to physically meaningful values.
This investigation has 2 purposes:
- Does the deterministic model improve, when we do the regression on the log of the real
atmospheric concentrations instead of an artifical model. Will the modelling quality be as good or better than the stochastic ARMA model?
- What happens when we combine the deterministic and the stochastic model into an ARMAX model? This would mean that reality is not an “either-or” situation, but there is a contribution of “natural variability” and a contribution of “
-forcing”.
In order to be able to compare the quality of the different models, comparing the likelihood is not sufficient, because an increasing number of parameters will always increase the likelihood. Therefore the Bayes Information Criterion (BIC) is used in general to compare the quality of the different models.
The Python code for this page can be found here. This article is not yet finalized and represents work in progress.
Dataset Temperature (HadCRUT4)
This is the well accepted mainstream global temperature data set.

Dataset CO2 (ETH Zürich)
This dataset is compiled by the Zürich university as a contribution tot the latest climate model CMIP6, extending the Mauna Loa data (which begin in 1959) to the year 1850. Here the log of the content is displayed, because the
forcing is proportional to the log of the concentration. The relative changes of forcing since 1850 are remarkably small (5%)

Deterministic Regression on constant and CO2¶
This is the well known deterministic regression model
It is the obvious choice in the climate discussion. The complex climate models include other forcings, mainly aerosols, but for the purpose of this discussion it is enough to look at .
analyzeSingleConfiguration(y,t,(0,0,0),[],x)
BIC: -211.7488415026756 Params: [-1.61015536e+01 2.78040603e+00 1.47884419e-02] Summary: Statespace Model Results ============================================================================== Dep. Variable: y No. Observations: 165 Model: SARIMAX Log Likelihood 113.533 Date: Mon, 14 Sep 2020 AIC -221.067 Time: 14:18:34 BIC -211.749 Sample: 0 HQIC -217.284 - 165 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ const -16.1016 0.745 -21.617 0.000 -17.561 -14.642 x1 2.7804 0.130 21.402 0.000 2.526 3.035 sigma2 0.0148 0.002 8.760 0.000 0.011 0.018 =================================================================================== Ljung-Box (Q): 450.09 Jarque-Bera (JB): 0.73 Prob(Q): 0.00 Prob(JB): 0.69 Heteroskedasticity (H): 0.73 Skew: -0.15 Prob(H) (two-sided): 0.24 Kurtosis: 2.90 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step).

Visually the temperature data (blue) follow the trend given by the log(<CO2 content>) data very well. But the Ljung-Box Test fails strongly: The residual is still highly correlated, which means that there is more systematic information in the data. A more detailled analysis is found in Dr. Reid’s paper.
ARIMA model (1,0,2) without trend¶
This is a strictly stationary ARMA model, 1 lag autoregression, and 2 lags moving average.
analyzeSingleConfiguration(y,t,(1,0,2),[])
BIC: -256.5000701793232 Params: [ 0.9993786 -0.42618278 -0.23084281 0.01060225] Summary: Statespace Model Results ============================================================================== Dep. Variable: y No. Observations: 165 Model: SARIMAX(1, 0, 2) Log Likelihood 138.462 Date: Mon, 14 Sep 2020 AIC -268.924 Time: 14:18:34 BIC -256.500 Sample: 0 HQIC -263.881 - 165 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ ar.L1 0.9994 0.001 675.387 0.000 0.996 1.002 ma.L1 -0.4262 0.068 -6.278 0.000 -0.559 -0.293 ma.L2 -0.2308 0.080 -2.894 0.004 -0.387 -0.074 sigma2 0.0106 0.001 8.055 0.000 0.008 0.013 =================================================================================== Ljung-Box (Q): 38.50 Jarque-Bera (JB): 1.56 Prob(Q): 0.54 Prob(JB): 0.46 Heteroskedasticity (H): 1.04 Skew: 0.06 Prob(H) (two-sided): 0.88 Kurtosis: 2.54 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step).

The model follows very closely the data, and according to the Ljung-Box test the Null-Hypothesis the residual is independently distributed. The autoregressive parameter is very close to 1. If it was 1, that would mean a unit root associated with a random walk which is physically not meaningful. In his article Dr. Reid discusses this issue at length, and comes to the conclusion that the process is indeed stationary and therefore the autoregressive parameter is slightly smaller than 1.
If the alternative is “deterministic regression with log(CO2)” or “Stochastic model without CO2-forcing” then the Stochastic model ARIMA(1,0,2) is to be preferred based on the significantly smaller BIC value (BIC=-256) compared to the deterministic log(CO2) regression (BIC=-211). In terms of probability this difference is dramatic – the stochastic model is (
) times more probable than the
-based deterministic model.
Best ARMA model: ARIMA(1,0,0) with polynomial drift [1,1,1]¶
The drift term is modelled as ():
analyzeSingleConfiguration(y,t,(1,0,0),[1,1,1])
BIC: -263.02876380347186 Params: [-1.24089579e-01 -1.65247863e-03 2.36624010e-05 5.43354186e-01 1.01856191e-02] Summary: Statespace Model Results ============================================================================== Dep. Variable: y No. Observations: 165 Model: SARIMAX(1, 0, 0) Log Likelihood 144.279 Date: Mon, 14 Sep 2020 AIC -278.558 Time: 14:18:34 BIC -263.029 Sample: 0 HQIC -272.254 - 165 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ intercept -0.1241 0.031 -4.056 0.000 -0.184 -0.064 drift -0.0017 0.001 -2.241 0.025 -0.003 -0.000 trend.2 2.366e-05 5.2e-06 4.552 0.000 1.35e-05 3.39e-05 ar.L1 0.5434 0.063 8.615 0.000 0.420 0.667 sigma2 0.0102 0.001 8.098 0.000 0.008 0.013 =================================================================================== Ljung-Box (Q): 68.57 Jarque-Bera (JB): 0.90 Prob(Q): 0.00 Prob(JB): 0.64 Heteroskedasticity (H): 1.06 Skew: -0.14 Prob(H) (two-sided): 0.83 Kurtosis: 2.76 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step).

Introducing polynomial drift (thus replacing the MA terms) slightly improves the BIC value (-263), suggesting that there might be a deterministic component in the data.
Combined AR(1) model (ARIMA(1,0,0)) with Regression on constant and CO2¶
analyzeSingleConfiguration(y,t,(1,0,0),[],x)
BIC: -267.866992289424 Params: [-1.61011849e+01 2.78041928e+00 5.55015354e-01 1.01790898e-02] Summary: Statespace Model Results ============================================================================== Dep. Variable: y No. Observations: 165 Model: SARIMAX(1, 0, 0) Log Likelihood 144.145 Date: Mon, 14 Sep 2020 AIC -280.291 Time: 14:18:35 BIC -267.867 Sample: 0 HQIC -275.248 - 165 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ const -16.1012 1.248 -12.901 0.000 -18.547 -13.655 x1 2.7804 0.217 12.801 0.000 2.355 3.206 ar.L1 0.5550 0.062 8.930 0.000 0.433 0.677 sigma2 0.0102 0.001 8.282 0.000 0.008 0.013 =================================================================================== Ljung-Box (Q): 64.22 Jarque-Bera (JB): 0.77 Prob(Q): 0.01 Prob(JB): 0.68 Heteroskedasticity (H): 0.94 Skew: -0.11 Prob(H) (two-sided): 0.81 Kurtosis: 2.74 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step).

The absolute best fit based on the BIC criterion of -267 is achieved by the combination of an AR(1) process (ARIMA (1,0,0)) and a deterministic regression of log(CO2). It is astonishing that the regression parameter of has hardly changed by mixing with the AR model. This will be discussed in more detail below.
Combined ARMA model (ARIMA(1,0,2)) with Regression on constant and CO2
analyzeSingleConfiguration(y,t,(1,0,2),[],x)
BIC: -265.8040141626335 Params: [-1.60783099e+01 2.77713548e+00 9.19541699e-01 -4.34644409e-01 -2.40173341e-01 9.67862586e-03] Summary: Statespace Model Results ============================================================================== Dep. Variable: y No. Observations: 165 Model: SARIMAX(1, 0, 2) Log Likelihood 148.220 Date: Tue, 15 Sep 2020 AIC -284.440 Time: 15:56:54 BIC -265.804 Sample: 0 HQIC -276.875 - 165 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ const -16.0783 1.966 -8.177 0.000 -19.932 -12.224 x1 2.7771 0.342 8.130 0.000 2.108 3.447 ar.L1 0.9195 0.061 15.176 0.000 0.801 1.038 ma.L1 -0.4346 0.098 -4.428 0.000 -0.627 -0.242 ma.L2 -0.2402 0.090 -2.663 0.008 -0.417 -0.063 sigma2 0.0097 0.001 8.110 0.000 0.007 0.012 =================================================================================== Ljung-Box (Q): 44.23 Jarque-Bera (JB): 0.87 Prob(Q): 0.30 Prob(JB): 0.65 Heteroskedasticity (H): 0.89 Skew: -0.03 Prob(H) (two-sided): 0.66 Kurtosis: 2.65 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step).

Combining the ARMA model ARIMA(1,0,2) with the log CO2 regression as an external variable further reduces the log likelihood while slightly increasing the BIC value. But we have a contradicting result, preferring ARIMA(1,0,2), when we apply the AIC criterion, which gives the log likelihood more weight than the BIC criterion. With regards to the HQIC criterion, both results are nearly equal.
Facing the fact that the MA-parameters, which characterize all other (weather) forcings of the climate system, must be identital for both longterm and shorterm time series, we conclude from Dr. Reid’s analysis of the GISP2 data set, that it is preferable to use the ARIMA(1,0,2) model for HadCRUT4 data.
The improvement from adding the deterministic CO2 term indicates that the CO2 rise indeed contributes to the rise in temperatures. The strange fact that the estimated weight parameter does hardly change by mixing the deterministic model with the stochastic ARMA model, indicates that the way the software handles exogeneous parameters, may be incorrect
Adapting the ARMAX model for estimating CO2 sensitivity
The ARMAX related Algorithm above of the Python statsmodels package has a serious drawback. The mode of operation is that the deterministic model of the exogeneous variables is executed first, and the ARMA model is modelling only the residuals:
This gives the exogeneous variables an unjustified high weight, ignoring the forcing caused by the ARMA process. This problems has been discussed in the article The ARIMAX model muddle, where this “standard” ARIMAX treatment is called “Regression with ARMA errors”. The proper treatment is noted in this article as:
Unfortunately this “true” ARMAX model is not fully implemented in the standard Python “statsmodels” package. Therefore the required least squares models are built explizitely and they are estimated by the Python package “statsmodels.regression.linear_model.OLS”, following the textbook Time series, chapter 4 Estimation of linear models, the “Hannan-Rissanen method” (4.3.1):
Running new model with ARMA(1,2) and CO2
exogModel=[co2,[1]*len(co2)] testEvaluation(estimateArmax(y,4,1,2,exogModel))

Discussion of the results:
25.9.2020
The Ljung-Box test, based on the (arbitrary) maximum lag of 16, has a Q-Value of 14.8 at a p-value of 0.54, which means that the Null-Hypothesis of an independently distributed residuum cannot be rejected. The total residual variance is 0.0099 and the model quality criterion BIC is -257.
The estimated weight of x1, (representing the log(CO2-content)) is 0.49, which results in a CO2-doubling sensitiviy of 0.34 degree C. This is remarkable, because it means a sensitivity reduction of approximately a factor 7 when comparing to the model that did not take into account the ARMA process.
It needs to be noted, that the estimated standard error of x1 with 0.43 is very large, which means that according to the T-test the probability of a significant deviation of CO2-sensitivity from 0 is only 25%. The same argument applies to the constant. We will need to test what happens when discarding the constant from the model.
All 3 ARMA parameters (1,2) are statistically significant.
The algorithm detected a collinearity in the data model. This needs to be handled. The likely reason is that the constant and the CO2 data are considered to be nearly parallel, because due to the offset of the CO2 data (appr. 5.5) and their small variability they are nearly parallel to the constant. The mitigation is by making the CO2-data zero-mean:
Testing new model with ARMA(1,2) and zero-mean CO2 data and global constant, no drift
#Model: zero mean CO2 and constant exogModel=[co2-np.mean(co2),[1]*len(co2)] testEvaluation(estimateArmax(y,4,1,2,exogModel))

Discussion of the results:
The condition number has strongly improved by making the CO2-data zero-mean. Otherwise most relevant estimates have not changed – which is to be expected. The constant is now very close to 0 and has become even more insignificant.