As empirical economists, how can we get the most out of machines? We already know that they do the work: Facebook serves ads and articles according to your interests, Siri understands speech, and Google translates almost anything. Doing these “intelligent” things is, at bottom, statistical and computational. What made them possible was tackling the tasks empirically. Let’s see how we can use the latest technology!
“Prediction, perhaps because of its model-free nature, is an area where algorithmic developments have run far ahead of their inferential justification.” – Efron & Hastie (2016, p. 209)
The traditional approach in econometrics specifies a target, an estimand, that is a functional of the joint distribution of the data. This target is very often a parameter of a statistical model, and the set of parameters can be finite or infinite. Researchers then focus on the quality of the estimators of the target, measured through large-sample properties such as efficiency.
In the machine learning literature, by contrast, the focus shifts to developing algorithms. The goal is to make predictions about some variables given others, or to classify units based on limited information, for example to classify handwritten digits based on pixel values.
- Validation and Cross-Validation
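In traditional econometrics the specification of the model is usually taken as given. The task of the researcher is then to estimate the unknown parameters of this model, and the emphasis is placed on doing this estimation step efficiently, most often operationalised through definitions of large-sample efficiency. Validation and cross-validation ask a different question: how well does the fitted model predict data that were not used to fit it?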
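To make the idea concrete, here is a minimal sketch of k-fold cross-validation. Everything in it is an assumption made for illustration: the simulated data, the helper name `kfold_mse`, the choice of ordinary least squares as the model, and the two candidate specifications being compared.

```python
import numpy as np

def kfold_mse(X, y, k=5, seed=0):
    """Average out-of-sample mean squared error of OLS over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))            # shuffle once, then split
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit OLS on the training folds only
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        # Score the fit on the held-out fold
        errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errors))

# Simulated data (hypothetical): y is linear in x plus unit-variance noise
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Candidate 1: the correctly specified linear model
X_linear = np.column_stack([np.ones(n), x])
# Candidate 2: a much more flexible degree-10 polynomial in x
X_poly = np.column_stack([x ** p for p in range(11)])

cv_linear = kfold_mse(X_linear, y)
cv_poly = kfold_mse(X_poly, y)
print("CV MSE, linear:   ", cv_linear)
print("CV MSE, degree 10:", cv_poly)
```

The specification with the lower cross-validated error is the one expected to predict better on new data. In-sample fit alone cannot show this, because the more flexible model always fits the training sample at least as well.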
- Overfitting, Regularisation, and Tuning Parameters
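“Regularisation theory was one of the first signs of the existence of intelligent inference.” – Vapnik (2013, p. 9)
Overfitting means fitting the sample so closely that out-of-sample prediction suffers. Regularisation guards against it by adding a term that penalises the complexity of the model to the objective function, which could be the sum of squared residuals in the least-squares regression setting or the negative of the logarithm of the likelihood function. How heavily complexity is penalised is set by a tuning parameter, usually chosen for out-of-sample performance.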
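One standard instance of a penalised least-squares objective is ridge regression, sketched below. Ridge is not named in the text above and is used here only as an illustration. The simulated data, the helper `ridge_fit`, the grid of penalty values and the simple train/validation split are all assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimise ||y - X b||^2 + lam * ||b||^2 using the closed-form solution."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated data (hypothetical): 40 regressors, only 3 of which matter.
# There is no intercept here, so every coefficient is penalised.
rng = np.random.default_rng(0)
n, p = 100, 40
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [1.0, -1.0, 0.5]
y = X @ beta_true + rng.normal(size=n)

# Hold out part of the sample to choose the tuning parameter
X_train, y_train = X[:60], y[:60]
X_val, y_val = X[60:], y[60:]

lams = [0.0, 0.1, 1.0, 10.0, 100.0]      # lam = 0 is plain least squares
val_mse = {}
for lam in lams:
    b = ridge_fit(X_train, y_train, lam)
    val_mse[lam] = float(np.mean((y_val - X_val @ b) ** 2))

# Pick the penalty with the lowest validation error
best_lam = min(val_mse, key=val_mse.get)
print(val_mse)
print("chosen penalty:", best_lam)
```

The penalty `lam` is the tuning parameter: larger values shrink the coefficients further towards zero. In practice it would more often be chosen by cross-validation than by a single split.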
- Sparsity
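Exact sparsity, where only a few explanatory variables have non-zero coefficients, is stronger than necessary. In many cases it is sufficient to have approximate sparsity: most of the explanatory variables have very limited explanatory power, even if it is not exactly zero, and only a few are of substantial importance.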
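A common way to exploit sparsity is an L1 penalty (the lasso), which sets many coefficients exactly to zero. The lasso is not named in the text above and is swapped in here purely as an illustration. The sketch below uses a plain coordinate-descent loop, and the data, the helper names and the penalty value `lam=0.2` are all assumptions.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink a towards zero by lam, and set it to zero if |a| <= lam."""
    return np.sign(a) * max(abs(a) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    Assumes no column of X is identically zero.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    resid = y - X @ b                      # current residual (new array)
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * b[j]        # remove coordinate j's contribution
            rho = X[:, j] @ resid / n      # correlation with partial residual
            b[j] = soft_threshold(rho, lam) / col_scale[j]
            resid -= X[:, j] * b[j]        # add the updated contribution back
    return b

# Simulated data (hypothetical): 50 regressors, only the first 3 matter
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.normal(size=n)

b_hat = lasso_cd(X, y, lam=0.2)
print("non-zero coefficients:", int(np.sum(b_hat != 0)), "of", p)
print("first five estimates: ", np.round(b_hat[:5], 2))
```

The estimates for the variables that matter are shrunk towards zero by the penalty, which is the price paid for dropping the irrelevant ones. As with any regularised method, `lam` would normally be tuned on held-out data.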
- Computational Issues and Scalability
The machine learning literature is more concerned with computational issues and with the capacity to implement estimation methods on large data sets. Solutions that do not scale well to large data sets are discarded in favour of methods that can be implemented easily on very large data sets.
- Ensemble Methods and Model Averaging
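Another key feature of the machine learning literature is the use of model averaging and ensemble methods. Often a single model or algorithm does not perform as well as a combination of several, possibly quite different, models whose predictions are averaged with weights chosen for out-of-sample performance. Mixture models may likewise be used to blend different parameter values in a single prediction.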
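The averaging step can be sketched in a few lines. This is an illustration under stated assumptions, not a recommended recipe: the simulated data, the two polynomial models, the helper names and the grid of candidate weights are all hypothetical.

```python
import numpy as np

# Simulated data (hypothetical): a non-linear signal plus noise
rng = np.random.default_rng(0)
n = 300
x = rng.uniform(-2, 2, size=n)
y = np.sin(2 * x) + 0.5 * x + rng.normal(scale=0.5, size=n)

train, val = np.arange(0, 200), np.arange(200, 300)

def poly_design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.column_stack([x ** d for d in range(degree + 1)])

def fit_predict(degree):
    """Fit a polynomial by OLS on the training part, predict on the validation part."""
    beta, *_ = np.linalg.lstsq(poly_design(x[train], degree), y[train], rcond=None)
    return poly_design(x[val], degree) @ beta

def mse(pred):
    return float(np.mean((y[val] - pred) ** 2))

pred_a = fit_predict(1)   # model A: simple linear fit
pred_b = fit_predict(7)   # model B: flexible degree-7 polynomial

# Weighted average w * A + (1 - w) * B, with w chosen on the validation set
weights = np.linspace(0, 1, 21)
scores = [mse(w * pred_a + (1 - w) * pred_b) for w in weights]
best_w = float(weights[int(np.argmin(scores))])
ensemble_mse = min(scores)

print("MSE model A: ", mse(pred_a))
print("MSE model B: ", mse(pred_b))
print("MSE ensemble:", ensemble_mse, "with weight", best_w, "on model A")
```

Because the grid includes the weights 0 and 1, the ensemble can never look worse than either model on the validation set. For the same reason its validation error is optimistic, so an honest assessment needs a further held-out sample.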
How machine learning can be applied
Machine learning algorithms provide a powerful, flexible way of making quality predictions, but they have a weakness: in the absence of strong and mostly unverifiable assumptions, they do not produce stable estimates of the underlying parameters.
The first category of application is new kinds of data. Satellites have been taking images of the earth for decades, and the literature includes work that uses satellite data to predict future harvest sizes. The images do not themselves contain measures of crop yield, but they do provide a large x vector of image-based data, which is then matched to yield data. The satellite images thus give us a prediction problem, for which machine learning is the natural tool, and solving it extracts economically meaningful signals from the data.
The second category of application is in tasks that we usually approach as estimation problems. When an empirical economist is interested in a parameter, the procedure used to recover it can contain a prediction component. In linear instrumental variables regression, for example, the first stage is typically handled as an estimation step. In effect, however, it is a prediction task: only the fitted values enter the second stage, and the coefficients in the first stage are merely a means to these values.
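A familiar estimation procedure with a prediction step inside it is two-stage least squares. The text above may not name the procedure, so treat it as an assumed example. The sketch below simulates data with a known coefficient to show that only the first-stage predictions are carried into the second stage. The data-generating process and all variable names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated data (hypothetical): x is endogenous because the unobserved u
# drives both x and y; z is an instrument that affects y only through x.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
beta_true = 2.0
y = beta_true * x + 2.0 * u + rng.normal(size=n)

const = np.ones(n)

# Stage 1: a pure prediction problem, predict x from the instrument
Z = np.column_stack([const, z])
x_hat = Z @ ols(Z, x)      # only these fitted values are used below

# Stage 2: regress y on the predicted x
beta_2sls = ols(np.column_stack([const, x_hat]), y)[1]

# Naive OLS of y on x, for comparison (biased by the confounder u)
beta_ols = ols(np.column_stack([const, x]), y)[1]

print("true coefficient:", beta_true)
print("naive OLS:       ", beta_ols)
print("two-stage:       ", beta_2sls)
```

The first-stage coefficients never appear in the final estimate, which is why that stage can be viewed as a prediction task. Running the second stage by hand like this gives the usual point estimate but not valid standard errors, so the sketch is for intuition only.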
Takeaway
Machine learning offers the modern economist many ways to put the latest intelligent models to use. In this article, we discussed the main ideas behind machine learning and the ways in which experts can apply it. In the long run, these tools can help solve problems we face in business. They can also broaden the scope of our work: they deliver new predictions and new methods, and allow us to focus on new questions!
Resources
https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.31.2.87
https://www.annualreviews.org/doi/full/10.1146/annurev-economics-080217-053433#_i3
https://towardsdatascience.com/10-machine-learning-methods-that-every-data-scientist-should-know-3cc96e0eeee9