Imagine a world where you can choose to study machine learning, not by taking an extra set of courses during your computer science degree, but by choosing to become a machine learning scientist. Imagine a world where machine learning isn't just a branch of computer science, but also a pillar of physics. Imagine looking at your curriculum, finding geometric deep learning, singular learning theory, categorical deep learning, or maybe a specialized course in neuromanifolds?
I am thinking about this future where AI has stopped being a subfield of engineering and has become a cornerstone of science. Of course no one knows exactly what that science will consist of–maybe a science of learning systems in general? Maybe a ”science of backpropagating learning systems"? Something much broader and overarching than what we currently think machine learning is all about?–but whatever it is, I believe it is coming.
I started thinking about this because I myself fell victim to a specific trap in theory-building. I want to tell you about this trap, starting with a statement: Stop trying to subsume and start proving boundaries. There are currently numerous theories being developed aimed at ”explaining AI”, ”providing a scientific theory of deep learning”, ”proving grokking and double-descent”, and it can be overwhelming what to make out of all of these competing theories. That is, which one is better? Which one is a subsumption of the other? Which one is more general? Which one will win?!
But I realize, while it is important to know about a theory's generalizations, it is more important to know the boundaries of the theory, what it can do, what it can explain, what predictions it can make, and what it definitively cannot do! Moreover, if we truly want to move towards a scientific field of AI, then we shouldn’t focus too much on which one subsume which. Instead, we should focus on truly understanding which one can be used to understand which empirical phenomena.
We didn’t throw away Newton’s theory of mechanics and Galilean’s theory of motion just because Einstein generalized some parts of them. We use them in different contexts, to understand different parts of our world!
So stop trying to subsume and start proving boundaries, so future machine learning scientists will know when to use what theory. Subsumption may satisfy the ego and the desire for elegance (finding the one abstraction that explains everything). But boundaries are useful. Knowing exactly where a theory breaks down and where other theories "take over" is what makes science actionable.