The gap between the rigorous work in categorical deep learning (CDL) and its adoption by "mainstream AI" comes down to the difference between description and prediction.
The industry is driven by empirical scaling and hardware optimization. Big AI companies rely on highly optimized operations and engineering filled with practical hacks that just "make stuff work". Right now, categorical deep learning (and, I assume, other theories that aim to make AI as a whole more scientific and explainable) mostly acts as a translation layer: it rigorously describes existing architectures and other invariant components of deep learning, but it hasn't yet yielded a strictly superior architecture that outperforms current baselines, nor a novel prediction that has changed our understanding of deep learning.
If the gap is to be closed, we should be able to derive a law and prove it mathematically, in a way that helps both a mathematician trying to understand the behavior of learning systems and an AI engineer at a top tech company trying to build a safer and faster model.
Until our abstractions can directly address the cost of training and the bounds of generalization, optimize resource scaling, or predict emergent capabilities before they happen, the mainstream industry will stick to what empirically works, and rightfully so. Building that bridge between rigorous abstraction and practical utility is a necessary next step.