Comparative Quantitative Reasoning Part 2
Why general reasoning is a powerful tool in economics.
Natural Language Is The Most Powerful Language For Mathematics
One of the things that surprised me while working toward a mathematics undergrad (which I did not complete) was the use of English in mathematical proof writing. The logic of using general English, rather than some other symbolic or formal language, is that mathematics employs a broad variety of ideas and principles, and only a general-purpose natural language like English is well suited to the full breadth of concepts and principles that might be required in mathematical proofs.
This is not to say that mathematical proofs never use statements expressed in a formal language such as logic or set theory. Rather, plain English (or another natural language) becomes the glue and the universal standard for communicating and developing the ideas of a mathematical proof.
I was initially a computer science major, and was a teaching assistant for a second-year discrete mathematics class in the computer science department. After passing the class with excellent grades, especially on the test material, I had the opportunity to grade and tutor that material as a paid position, which really solidified my knowledge of the subject. More relevant to the topic at hand, this background provided an alternative and parallel introduction to logic and set theory, which are foundational to mathematics and mathematical proofs as well.
One very fascinating part of that CS course was building an interpreter for a formal language called “Datalog”, which is a query language, but one focused specifically on inferring facts using rules of logic.
The best-known example of a query language is probably SQL, which is very useful because it allows you to work with structured relational data. Datalog as a query language has a very different goal: it uses a fixed-point algorithm, repeating a process until the results stop changing, to infer a set of facts or conclusions from a set of initial statements.
A common example in Datalog is the inference of family relationships. If Fred is the son of George, and George is the son of Tim, then Tim is Fred’s grandfather, and Fred is Tim’s grandson. Basically, you can apply simple logical rules in this manner until you can no longer derive any new facts.
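The family-relationship example above can be sketched as a naive fixed-point loop in Python. This is only an illustration of the idea, not actual Datalog syntax; the single hard-coded rule and the names are taken from the example.

```python
def infer(facts):
    """Naive fixed-point evaluation: apply the rule until no new facts appear."""
    facts = set(facts)
    while True:
        new = set()
        # Rule: son_of(A, B) and son_of(B, C) implies
        #       grandson_of(A, C) and grandfather_of(C, A)
        for (rel1, a, b) in facts:
            for (rel2, b2, c) in facts:
                if rel1 == "son_of" and rel2 == "son_of" and b == b2:
                    new.add(("grandson_of", a, c))
                    new.add(("grandfather_of", c, a))
        if new <= facts:  # fixed point reached: nothing new was derived
            return facts
        facts |= new

# Fred is George's son; George is Tim's son.
facts = infer([("son_of", "Fred", "George"), ("son_of", "George", "Tim")])
```

Real Datalog engines generalize this loop to arbitrary user-defined rules, but the essential mechanism is the same: keep deriving until the set of facts stops growing.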
This kind of open-ended logic might seem very important or powerful, and early researchers in artificial intelligence treated it as such. But this approach of building and applying systems of formal logic fell out of favor in AI research compared to statistical machine learning models. In contrast with formal verification systems and languages (of which Datalog is a very simple example), modern machine learning is essentially automated, large-scale statistical inference: it may draw flawed conclusions, but it is able to build its own models rather than simply computing the extension of a well-defined logical system with predefined rules.
The power of statistical machine learning is that it can discover and apply its own set of rules, using data and observation, without any need to formally specify or define the rules beforehand. This is much closer to the human experience of learning, where through experience and feedback we assimilate rules such as grammar, etiquette, or other cultural norms.
This emphasis on statistical inference over the extension of basic logical rules makes a lot of sense in AI research, both because it is a more intuitive way for humans to interact with machines and algorithms, and because it is much more flexible and resilient: it can often tolerate problematic inputs or design, and even noisy data full of errors and mistakes.
In comparison, formal systems that apply logical rules to extend a foundational set of facts require complete precision and accuracy; otherwise they do not work at all.
To summarize, statistical machine learning has the following advantages over formal verification systems:
It is more intuitive to understand how it works and what it is doing.
It can directly leverage unformatted natural data sources—although labelled data can be preferable at times, labeling data is still a much simpler task than translating it into correct formal statements.
It is more resilient and flexible in dealing with errors and bad inputs.
It is easier to adapt and reapply to different kinds of problems or processes.
In contrast, while a system of logical deduction is capable of being mathematically flawless, it also requires complete correctness in building and defining the system, and therefore has much less flexibility and adaptability.
Application of Statistical Machine Learning: Engines vs Open-Ended Models
Even if we decide to use statistical machine learning as our preferred tool, there are at least a couple of very different ways to structure or constrain these systems. For this article I am going to use two terms to describe two possible extremes: statistical ML can either be embedded in an “engine” or simply be an “open-ended model”.
The best example of an “engine” would be a chess engine. AlphaZero was an early example of a chess engine that used neural networks, a learning technique, to evaluate moves, rather than relying solely on algorithmic search and handcrafted ranking of moves. This means we can use ML techniques like neural nets to develop the decision-making process, but the “engine” is still constrained by a layer of formal logic as to what it can even consider in the first place. Such a chess engine will never consider or process illegal or invalid moves, because the formal rules have been embedded into its design, rather than simply expecting the system to learn and apply the formal rules correctly from its training process.
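The engine pattern described above can be sketched in a few lines of Python. The “rules” and the “learned” scorer here are hypothetical stand-ins (not real chess logic or a real trained model); the point is the structure: a formal layer generates only valid options, and the statistical layer merely ranks them.

```python
def legal_moves(position):
    # Formal layer: encodes the rules of the game exactly.
    # Illustrative only -- here a "move" is just an integer the rules allow.
    return [m for m in range(10) if (position + m) % 2 == 0]

def learned_score(position, move):
    # Statistical layer: in a real engine this would be a trained model.
    # Hypothetical preference for "central" moves, for demonstration.
    return -abs(move - 4)

def pick_move(position):
    moves = legal_moves(position)  # invalid moves never reach the model
    return max(moves, key=lambda m: learned_score(position, m))

best = pick_move(0)  # guaranteed legal by construction, ranked by the model
```

Because the scorer only ever sees the output of `legal_moves`, no amount of training error can make the engine emit an invalid move.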
To summarize, an “engine” has a layer of formal logic to ensure that its outputs are valid, whereas what I am calling an “open-ended model” relies only on the training process itself to attempt to generate valid outputs.
The flaws and weaknesses of an open-ended model are demonstrated humorously by examples where an LLM like ChatGPT is asked to play chess against a more constrained engine.
AI includes both open models and formal engines
It is an unfortunate regression that people have begun to associate “AI” almost exclusively with open models like generative AI, ChatGPT, and so on. The problem with relying on the learning process to regulate valid outputs and follow formal rules is that it is much less reliable. Chess is a perfect example of how the manner of play can exhibit creativity and unexpected thinking, while the rules of play are strictly defined and not open to creative interpretation.
An engine can in many cases achieve the best of both worlds: it has the rigor and correctness of a formal logical system, but the flexibility and ease of use that is achieved through statistical models.
By separating the rules of what is valid from the principles for optimizing decision making, you can have formal correctness with a small set of simple rules, plus flexibility in training.
Comparative Quantitative Reasoning: We Can Solve Problems Like an AI Engine
The power of comparative quantitative reasoning is that it allows us to use our intelligence to solve problems the same way an AI engine would: we have a small set of rules to check whether something is formally valid, but can leverage a broad set of principles for our decision making.
Again, the simplest example of the power of comparative reasoning is the classic game “20 questions”. We use broad categories to rapidly eliminate choices and home in on the best guesses.
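The 20-questions pattern can be sketched as category-based elimination. The candidates and the two yes/no categories below are made up for illustration; the point is that each broad question discards a large fraction of the pool without evaluating any single candidate in depth.

```python
candidates = {
    "dog":  {"alive": True,  "bigger_than_a_breadbox": True},
    "ant":  {"alive": True,  "bigger_than_a_breadbox": False},
    "car":  {"alive": False, "bigger_than_a_breadbox": True},
    "coin": {"alive": False, "bigger_than_a_breadbox": False},
}

def ask(pool, question, answer):
    """Keep only the candidates consistent with the answer."""
    return {name: traits for name, traits in pool.items()
            if traits[question] == answer}

pool = ask(candidates, "alive", True)             # eliminates half the pool
pool = ask(pool, "bigger_than_a_breadbox", True)  # narrows to one candidate
```

With well-chosen categories, each question roughly halves the pool, which is why twenty questions can distinguish among a huge number of possibilities.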
Another example is how we play strategy games. We use the formal rules to check whether moves are valid, but we use creative “guesswork” or intuition to guide our move preferences.
The synthesis of formal logic with the flexible application of concepts and principles is what makes this approach so powerful.
For many trained in formal academics, this can be very unsatisfying, or even viewed as lacking rigor. But I think that is a mistake. Highly skilled chess players are able, with lots of training and feedback, to use essentially open comparative reasoning to find the best move.
Chess and similar strategy games benefit from a clear measurement of performance, whereas in academic disciplines it is much more difficult to assess relative performance. This makes it much more difficult, and much less repeatable, to develop and confirm mastery. But it is still possible.
The difficulty in these disciplines is that there is no good feedback on relative skill and performance other than peer assessment.
Academia relies on peer review to judge scholarly work. But this is much less reliable than competitive play like what occurs in games like chess, so it is very difficult to ensure that the process of peer review actually identifies skilled work and the best performance.
Instead, it is much easier for peer review to optimize for trivial and superficial assessment: for example, the use of formal mathematical models. The problem with this, as I have explained, is that it leads to too much specificity, which means an inefficient search through and consideration of new ideas.
If what people needed in economics discourse were mathematical specificity, then it would make sense to spend the bulk of our time and effort building explicit mathematical models. And again, this is still a useful exercise, and can develop general skill the same way an exercise like weight lifting develops strength. But it is not the most expedient or direct way to refine broad competing ideas, which needs to be done at the level of concepts and principles.
One simple example of comparative quantitative reasoning comes from playing strategy games: we only need to do enough reasoning to eliminate a choice, not to completely evaluate its expected value.
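Eliminating a choice without fully evaluating it can be sketched as a dominance check: if a cheap upper bound on one option is below a cheap lower bound on another, the first can be discarded without ever computing its exact value. The options and their value ranges below are hypothetical.

```python
def eliminate(options, lower_bound, upper_bound):
    """Drop any option whose best case is worse than some other option's worst case."""
    keep = []
    for opt in options:
        dominated = any(upper_bound(opt) < lower_bound(other)
                        for other in options if other is not opt)
        if not dominated:
            keep.append(opt)
    return keep

# Hypothetical value ranges for three candidate moves: (worst case, best case)
moves = {"a": (5, 9), "b": (1, 3), "c": (4, 8)}
survivors = eliminate(list(moves),
                      lower_bound=lambda m: moves[m][0],
                      upper_bound=lambda m: moves[m][1])
```

Move "b" is eliminated because its best case (3) is still worse than move "a"'s worst case (5); no exact expected value was ever needed. This same bound-and-prune idea is what makes search techniques like alpha-beta pruning efficient.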
In a game like poker, assuming we are not counting cards, drawing two aces is better than drawing two kings, even though both are very strong. This is an example where you do not have to calculate exact odds to assess relative strength.
Perhaps this is a poor example, because poker players would do best to know exact expected values. But for strategy games like Magic: The Gathering or Hearthstone, computing exact expected values is much less accessible, so most people would be better off learning to apply comparative reasoning: under reasonable assumptions, does playing card A or card B better improve my odds of winning? The common principle of “win more” cards is critically important in this process. A “win more” card is a card that appears very beneficial, but often can only be used if you already have some degree of advantage; examples might be powerful combinations or tools that require a lot of resources. Even if its impact is large, a card needs to be able to tip you from a losing position to a winning position to be useful.
A lot of economics work seems to fall into this “win more” trap, with a lot of assumptions made along the way. Just like a Magic: The Gathering or Hearthstone card that requires good conditions to play, highly specific economic models require you to already have a strong formal description and understanding of the economy.
If an economic model only helps you when you already have an accurate and complete description of what is happening in the economy, then it is just another example of the “win more” trap common in collectible card games. It has no swing potential to help you when you need it, so it becomes a performative flourish rather than a useful tool.
But this critique is not limited to mainstream economics. It can apply to many heterodox schools, whether Georgism, the Austrian school, or even MMT and the Post-Keynesians. It is something we should all consider.
What I define as “mainstream” economics is economics adapted for messaging to the general public, the finance profession, and political policy makers. This in fact tends to greatly constrain and simplify what ideas and policies it can consider.
Heterodox economics, in a sense, has the luxury of not being constrained by wide acceptance. While we often talk about heterodox economics as being disadvantaged by its obscurity and lack of general support, this actually gives it a lot of flexibility. Again, this is not even the same thing as academic rigor or intellectual merit. All three of these are separate issues: public support and acceptance, academic rigor, and intellectual merit.
My argument is that in times of reform or change, academic rigor temporarily becomes a second priority to the other two matters. But it is never irrelevant. Even as non-professionals, we like and want more rigorous work. It may not be the most accessible or efficient use of our time, and we must pick our battles. But the goal of “20 questions” is not to stay general and broad in our inquiries forever. As important as that may be early on, the eventual goal is specificity and rigor. As we gain more resources and time to dedicate toward this goal, and better clarification of foundational concepts, we can begin to move in that direction.
