Algorithmic Compression via Pretrained Neural Networks
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript is very well written, providing many useful insights connecting Artificial Intelligence (mostly regarding LLM-based approaches) with Algorithmic Information Theory. Although far from being trivial, it is written in a way that will be accessible to those having introductory level knowledge of both fields and, therefore, it will be appealing to a broad audience. Moreover, it includes a list of open problems that certainly will motivate the research community in pursuing inquiry in these areas.
I've only a couple of remarks/questions:
- In line 145, is the inequality sign correct? In my understanding, since (7) is a summation of positive terms, the sum should be larger than any of its terms...
- In line 330, shouldn't it be "...did not produce an output after s steps..." instead of "...did not produce an output after L steps..."?
Author Response
We thank the reviewer for the careful review, and are very pleased to hear that the reviewer found our manuscript to provide many useful insights and to be accessible yet non-trivial. Additionally, we want to thank the reviewer for pointing out two mistakes, which we have corrected in our updated manuscript (changes highlighted in blue):
- Line 145: the reviewer is entirely correct that the inequality should be the other way round (giving a lower bound, not an upper bound). Apologies for this oversight on our end.
- Line 330: it is correct that the phrase should be: “... did not produce an output after s steps”.
Reviewer 2 Report
Comments and Suggestions for Authors
This paper presents a high-quality, timely, and insightful synthesis of a growing body of work that bridges the rigorous, theoretical frameworks of Algorithmic Information Theory (AIT) and Universal Artificial Intelligence (UAI) with the remarkable empirical success of modern large-scale sequence models. The core argument, that next-token prediction on massive, diverse datasets implicitly performs algorithmic compression and amortized Bayesian inference, is compellingly articulated and well-supported. The manuscript successfully fulfills its aim of providing a theoretical grounding for understanding the emergent capabilities of large language models (LLMs), moving beyond mere statistical pattern matching. The quality of writing, logical structure, and depth of analysis are excellent, making it a valuable contribution to the field.
Author Response
We thank the reviewer for their careful assessment of our work, and are happy to hear that the reviewer finds the manuscript to be a valuable contribution to the field, and to be compellingly articulated, well supported, and of high quality.

