Will Generative Models be the Redeemer of Macroeconomic Forecasting?
If you are in a hurry, the short answer is no; at least, not with traditional data.
Some time ago, a student of mine suggested generating fake macroeconomic time series with GANs, appending them to the real data, and hoping for forecasting improvements. I was originally dismissive of the idea. Surely, the limited length of available time series is at the heart of macroeconomic forecasting failures, but the solution sounds too good to be true.
Reflecting on this again upon seeing this paper, let me indulge in recycling my old thoughts here. What is the statistical mechanism at work behind it? Do we really need GANs, or sophisticated NNs for that matter? This is not a discussion of the aforementioned paper, but some generic thoughts on the use of GANs in macro forecasting.
When applied to the well-known curated FRED data sets, GANs supposedly learn a nonparametric multivariate joint density of the data. But with macroeconomic tabular data, we could very well do such a thing with a nonlinear factor model, or simply a linear one with many factors (which can capture some of the nonlinearities). Steps (a minimal code sketch follows the list):
Estimate the factor model;
Shuffle residuals;
Add them back to get one set of “fake” data;
Repeat.
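To make the recipe concrete, here is a minimal sketch in Python, assuming a static linear factor model estimated by PCA on a standardized T x N panel; the function and parameter names are mine, not from any particular paper.

```python
import numpy as np

def fake_macro_panel(X, n_factors=8, rng=None):
    """One 'fake' panel: fit a PCA factor model, shuffle the residuals
    over time, and add them back to the common component."""
    rng = np.random.default_rng(rng)
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize the panel
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA via SVD
    common = (U[:, :n_factors] * S[:n_factors]) @ Vt[:n_factors]  # common component
    resid = Xc - common                                # idiosyncratic residuals
    shuffled = resid[rng.permutation(len(resid))]      # shuffle rows (time periods)
    return common + shuffled                           # one draw of fake data

# Repeat the call to get as many fake panels as needed.
```

Shuffling whole rows keeps the cross-sectional correlation of the residuals intact but destroys their serial correlation; a block shuffle, or dynamic factors, would be the natural refinement. De-standardize at the end if original units matter.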
As should be clear from the above, there are many other models that could be used for that purpose. If nonlinearities are crucial, maybe Autoencoders are what one should use. Maybe factors should be dynamic.
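If nonlinearities matter, the same shuffle-and-add-back scheme works with an autoencoder's reconstruction in place of the PCA fit. A hypothetical variant using scikit-learn's MLPRegressor as a quick-and-dirty autoencoder (names and layer sizes are mine; Xc is the standardized panel from the sketch above):

```python
from sklearn.neural_network import MLPRegressor

# Bottleneck MLP fit to reproduce its own input: an autoencoder in spirit.
ae = MLPRegressor(hidden_layer_sizes=(32, 8, 32), max_iter=5000)
ae.fit(Xc, Xc)           # X -> X, with 8 "nonlinear factors" in the middle
common = ae.predict(Xc)  # nonlinear common component
resid = Xc - common      # shuffle and add back exactly as before
```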
Using the above suggestion as a guiding light in the dark, let's think about how fake data can help. In fact, we don't need to get overly creative: it helps on the variance side of things by shrinking the final model towards the restriction that the multivariate data has been generated by a particular density. This density is already implicit within the real data, but the fake data make it smoother. In essence, this is very close in philosophy to data augmentation, another popular regularization technique used to keep NNs from producing highly discontinuous predictions as a function of X.
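In practice, the augmentation simply amounts to pooling real and fake observations before fitting the forecaster. A sketch of that pooling, with hypothetical names (X_real is the raw panel, supervised_pairs builds one-step-ahead targets from, say, the first column; standardization details are glossed over):

```python
def supervised_pairs(panel, target_col=0, lag=1):
    """Predictors today, target variable one step ahead; purely illustrative."""
    return panel[:-lag], panel[lag:, target_col]

# Pool the real panel with a few fake ones, then fit any forecasting model.
panels = [X_real] + [fake_macro_panel(X_real, rng=s) for s in range(4)]
pairs = [supervised_pairs(p) for p in panels]
X_aug = np.vstack([x for x, _ in pairs])
y_aug = np.concatenate([y for _, y in pairs])
# model.fit(X_aug, y_aug)   # the fake rows act as a smoothness prior
```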
Do we have other techniques for creating new samples in a nonparametric way from the original X matrix? Yes, Bagging (contrasted in a short sketch after the questions below). But given that GAN-simulated data comes from a fully articulated model of the joint density, it is presumably smoother and covers values that the original X does not contain. In any case, here are three questions that should be answered before embracing generative models as regularizing agents for forecasting models.
Are GANs even a decent model of the joint density of economic variables, especially keeping in mind that this is straight-up tabular data?
Do we really need the sort of regularization described above, and if so, in which models? NNs are obvious contenders; trees, not quite. It is also not hard to think of models for which this fake data would be redundant.
Even for NNs, how is that regularization better than early stopping, which is akin to l2 shrinkage (and which is known to shrink the model towards one that uses only a few principal components)?
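As for the Bagging comparison above: a bootstrap resample only reshuffles observed rows, so it never leaves the support of the data, which is precisely the contrast with model-based fake data. A minimal sketch, reusing the hypothetical X_real:

```python
# Bagging-style resample: rows of X drawn with replacement.
# Unlike fake data from a fitted density, it never produces unseen values.
rng = np.random.default_rng(0)
idx = rng.integers(0, len(X_real), size=len(X_real))
X_boot = X_real[idx]
```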
Thus, there are many open questions to answer before jumping on the Generative bandwagon. This is not to say that such models have no use, of course, as this very text has been sharpened with the help of one particularly illustrious generative model.