The Darwinian Mind of the Machine: Rethinking LLM Training as Evolution
DOI:
https://doi.org/10.7557/12.8313
Keywords:
LLMs, language acquisition, grammar, evolution, poverty of the stimulus, evolutionary competence
Abstract
This essay challenges the prevailing metaphor of “learning” used to describe Large Language Model (LLM) training, proposing instead that these systems represent a form of hyper-accelerated, data-driven evolution. Through analysis of Daniel Dennett’s hierarchy of evolutionary competence and examination of the poverty of the stimulus problem, we argue that LLMs are Darwinian creatures evolved at computational speeds in environments of pure text. This framework explains their linguistic capabilities through convergent evolution rather than learning, resolves paradoxes about their competence without understanding, and has implications for our understanding of the relevance of these models for generative grammar.
References
Attah, Nuhu Osman. 2025. Do language models lack communicative intentions? Synthese 205 187. https://doi.org/10.1007/s11229-025-05022-6.
Baker, Mark C. 2008. The macroparameter in a microparametric world. Linguistic Analysis 34 1–2: 1–46. https://doi.org/10.1075/la.132.16bak.
Battaglia, Peter W., Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinícius Flores Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Çağlar Gülçehre, H. Francis Song, Andrew J. Ballard, Justin Gilmer, George E. Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, Yujia Li, and Razvan Pascanu. 2018. Relational inductive biases, deep learning, and graph networks. arXiv preprint 1806.01261. https://doi.org/10.48550/arXiv.1806.01261.
Bowdon, Chris. 2025. How many parameters does GPT-5 have? Available at https://www.r-bloggers.com/2025/08/how-many-parameters-does-gpt-5-have/, accessed 2025.
Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901. Curran Associates.
Browning, Jacob. 2025. Intentionality all-stars redux: Do language models know what they are talking about? In Communicating with AI: Philosophical Perspectives, edited by Herman Cappelen and Rachel Sterken. Oxford University Press, Oxford. Preprint available at https://philarchive.org/rec/BROIAR-4.
Chomsky, Noam. 1959. Review of B. F. Skinner’s Verbal Behavior. Language 35: 26–58.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. MIT Press, Cambridge, Ma.
Chomsky, Noam. 1981. Lectures on Government and Binding: The Pisa Lectures. Foris, Dordrecht. https://doi.org/10.1515/9783110884166.
Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin, and Use. Praeger, New York.
Chomsky, Noam. 1995. The Minimalist Program. MIT Press, Cambridge, Ma. https://doi.org/10.7551/mitpress/9780262527347.001.0001.
Chomsky, Noam. 2001. Derivation by phase. In Ken Hale: A Life in Language, edited by Michael Kenstowicz, no. 36 in Current Studies in Linguistics, pp. 1–52. MIT Press, Cambridge, Ma. https://doi.org/10.7551/mitpress/4056.003.0004.
Chomsky, Noam. 2005. Three factors in language design. Linguistic Inquiry 36 1: 1–22. https://doi.org/10.1162/0024389052993655.
Chomsky, Noam and Robert C. Berwick. 2017. Why Only Us: Language and Evolution. MIT Press, Cambridge, Ma. https://doi.org/10.7551/mitpress/9780262034241.001.0001.
Dennett, Daniel C. 2017. From Bacteria to Bach and Back: The Evolution of Minds. W. W. Norton & Company, New York.
Diester, Ilka, Miklíos Bartos, Jörg Bödecker, Andreas Kortylewski, Christian Leibold, Johannes Letzkus, Mohamed M. Nour, Matthias M. Schönauer, Alexander Straw, Andreas Vlachos, and Thomas Brox. 2024. Internal world models in humans, animals, and AI. Neuron 112 16: 2661–2824. https://doi.org/10.1016/j.neuron.2024.06.019.
Giorgi, Alessandra and Giuseppe Longobardi. 1991. The Syntax of Noun Phrases: Configuration, Parameters and Empty Categories. No. 57 in Cambridge Studies in Linguistics. Cambridge University Press, Cambridge.
Gregory, Richard L. 1963. Distortion of visual space as inappropriate constancy scaling. Nature 199: 678–680. https://doi.org/10.1038/199678a0.
Gregory, Richard L. 1970. The Intelligent Eye. Weidenfeld & Nicolson, London.
Hao, Shibo, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, and Zhiting Hu. 2023. Reasoning with language model is planning with world model. arXiv preprint 2305.14992. https://doi.org/10.48550/arXiv.2305.14992.
Hu, Jennifer, Kyle Mahowald, Gary Lupyan, Anna Ivanova, and Roger Levy. 2024. Language models align with human judgments on key grammatical constructions. Proceedings of the National Academy of Sciences 121 36. https://doi.org/10.1073/pnas.2400917121.
Intuition Lab. 2024. Mechanistic interpretability: Understanding AI and LLMs. Available at https://intuitionlabs.ai/articles/mechanistic-interpretability-ai-llms, accessed 2025.
Jackendoff, Ray. 1977. X-bar Syntax: A Study of Phrase Structure. Linguistic Inquiry Monographs. MIT Press, Cambridge, Ma.
Jin, Charles and Martin Rinard. 2024. Emergent representations of program semantics in language models trained on programs. arXiv preprint 2305.11169. https://doi.org/10.48550/arXiv.2305.11169.
Kayne, Richard S. 1994. The Antisymmetry of Syntax. No. 25 in Linguistic Inquiry Monographs. MIT Press, Cambridge, Ma.
McCoy, R. Thomas, Robert Frank, and Tal Linzen. 2020. Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks. Transactions of the Association for Computational Linguistics 8: 125–140. https://doi.org/10.1162/tacl_a_00304.
Moro, Andrea. 2016. Impossible Languages. MIT Press, Cambridge, Ma. https://doi.org/10.7551/mitpress/9780262034890.001.0001.
Moro, Andrea. 2025. Linguistics in a battlefield. A short note on syntax and the “Newtonian style of research”. Available at https://ling.auf.net/lingbuzz/008827.
Mulders, Iris and Eddy Ruys. 2024. ChatGPT as an informant. Nota Bene 1 2: 242–260. https://doi.org/10.1075/nb.00015.mul.
Müller, Stefan. 2025. Large language models: The best linguistic theory, a wrong linguistic theory, or no theory at all? Journal of the Linguistic Society of Germany 44 1. https://doi.org/10.18148/zs/2025-2001.
Murphy, Elliot, Evelina Leivada, Vittoria Dentella, Fritz Gunther, and Gary Marcus. 2025. Fundamental principles of linguistic structure are not represented by o3. arXiv preprint 2502.10934. https://doi.org/10.48550/arXiv.2502.10934.
Nalpas, Maud. 2024. LLM sizes. Available at https://web.dev/articles/llm-sizes, accessed 2025.
Ngaihlian, Dorothy. 2025. Machine learning algorithms: Simulating intentionality in artificial intelligence. https://doi.org/10.2139/ssrn.5271061.
Ouyang, Long, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems, vol. 35, pp. 27730–27744. Curran Associates.
Piantadosi, Steven T. 2024. Modern language models refute Chomsky’s approach to language. In From Fieldwork to Linguistic Theory: A Tribute to Dan Everett, edited by Edward Gibson and Moshe Poliak, no. 15 in Empirically Oriented Theoretical Morphology and Syntax, pp. 353–414. Language Science Press, Berlin. https://doi.org/10.5281/zenodo.12665933.
Piantadosi, Steven T. and Yuan Yang. 2022. Reply to Murphy and Leivada: Program induction can learn language. Proceedings of the National Academy of Sciences 119 23: e2202925119. https://doi.org/10.1073/pnas.2202925119.
Popper, Karl. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. Routledge, London.
Popper, Karl. 1972. Objective Knowledge: An Evolutionary Approach. Oxford University Press, Oxford.
Qiu, Zhuang, Xufeng Duan, and Zhenguang G. Cai. 2024. Evaluating grammatical well-formedness in large language models: A comparative study with human judgments. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pp. 189–198. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.cmcl-1.16.
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. Tech. rep., OpenAI.
Skinner, B. F. 1938. The Behavior of Organisms: An Experimental Analysis. Appleton-Century-Crofts, New York.
Skinner, B. F. 1953. Science and Human Behavior. Macmillan, New York.
Stowell, Timothy. 1981. Origins of Phrase Structure. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, Ma.
Tak, Ala N., Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, and Jonathan Gratch. 2025. Mechanistic interpretability of emotion inference in large language models. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 13090–13120. Association for Computational Linguistics, Vienna. https://doi.org/10.18653/v1/2025.findings-acl.679.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, vol. 30. Curran Associates.
Yildirim, Ilker and L.A. Paul. 2024. From task structures to world models: what do LLMs know? Trends in Cognitive Sciences 28 5: 404–415. https://doi.org/10.1016/j.tics.2024.02.008.
License
Copyright (c) 2026 Marc van Oostendorp, Roberta D'Alessandro

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.