
Notes on artificial understanding

I went to a lecture and took some notes.
Picture generated with Gemini 2.5 Pro

Today I had the pleasure of attending a lecture by Melanie Mitchell, as part of the Amsterdam Lectures in AI and Society. She spoke about the elusive question of understanding in machine cognition, which I'd argue is a necessary condition for true reasoning. Some friends asked if I could share my notes, which I figured I could do via this blog, as the topic fits well with its overall themes.

Notes

Mitchell started out by noting some issues plaguing AI systems, such as brittleness (the breakdown of performance in the face of unusual data) and shortcut learning (leveraging patterns in training data that are irrelevant in real-world contexts). Both failure modes can be explained by these systems not understanding the content of their training data.

Are LLMs different? There is plenty of conjecture about this (such as here and here), but the first question should probably be: how would we evaluate understanding? Mitchell mentioned four approaches:

  1. Looking at their behaviour (or what I've here called the utterance level). This is Turing Test-type stuff. Mitchell said it's problematic, because we tend to ascribe mental states to linguistic communication, a point also often made by Emily Bender (as recently discussed).
  2. Testing performance on benchmarks that assess understanding. This is done quite often, and LLMs can outperform humans on several benchmarks. However, Mitchell noted that we don't know whether the benchmarks are valid measures of understanding (also see key point 1), and that performance can be misleading because of data contamination, shortcut learning and approximate retrieval.
  3. Probing for emergent world models. In what I suppose is a variation on Mental Models Theory, the idea is that an LLM would show understanding if one could delineate a world model that it uses for its reasoning steps. This blog discussed one such attempt before, which Mitchell criticised; for details, see her Substack on the topic.
  4. Adapting experimental methods from cognitive science. This is the approach that Mitchell herself prefers: do rigorous work on LLM performance under experimentally controlled conditions.

She then showed how she tested the robustness of analogical reasoning, comparing humans and machines. In response to the claim that LLMs can do zero-shot analogical reasoning, she and Martha Lewis ran controlled experiments with letter-string analogies, while being vigilant about the models taking shortcuts. She showed evidence of the LLMs failing to understand the analogies, instead latching onto surface similarities and artefacts in the training and testing data (e.g., the order of the two options in an analogous-story task was not balanced, and GPT learned to simply pick the first one when responding).
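To make the control concrete, here is a minimal sketch of what counterbalancing answer order in a two-alternative letter-string task could look like. This is not Lewis and Mitchell's actual experimental code; the prompt wording and the example strings are made up for illustration. The point is that each item appears in both orders, so a model that always picks the first option ends up at chance instead of inheriting an artefact of the test set.

```python
import random

def make_item(premise: str, transformed: str, probe: str,
              correct: str, foil: str, flip: bool) -> dict:
    """One two-alternative letter-string analogy with controlled answer order."""
    first, second = (foil, correct) if flip else (correct, foil)
    prompt = (
        f"If {premise} changes to {transformed}, what does {probe} change to?\n"
        f"1) {first}\n"
        f"2) {second}\n"
        "Reply with 1 or 2."
    )
    return {"prompt": prompt, "correct_choice": 2 if flip else 1}

# Present every item in both orders, so a positional shortcut
# ("always answer 1") scores at chance instead of inflating accuracy.
trials = [make_item("abc", "abd", "ijk", correct="ijl", foil="ijd", flip=flip)
          for flip in (False, True)]
random.shuffle(trials)

for trial in trials:
    print(trial["prompt"], "| correct option:", trial["correct_choice"], "\n")
```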

Briefly put, LLMs are often better at solving reasoning tasks similar to the ones in their training data, which suggests a failure of abstract understanding. This is similar to their persistent failure on variations of the Wolf, Cabbage and Goat problem.

Comments

During the Q&A, there was some focus on what would be needed to make artificial systems that do display understanding. Some interesting remarks there:

  • Mitchell, much like Gary Marcus, believes neurosymbolic AI would provide a path towards machine cognition that displays understanding.
  • She also mentioned that the LLM capacity for memory is so large that there is no incentive to compress. I am not sure if she meant to imply that understanding is a consequence of compression (I guess you could make the point that mental models are a reduced representation).
  • Speaking of Mental Models Theory: that theory hinges on the co-opting of visuo-spatial and motor processing. Interestingly enough, Mitchell mentioned that she would see a spatial reasoning module as an important addition for achieving understanding in machines, e.g. by using neural networks that are similar to hippocampal circuits. I am not sure what her reasons were for bringing in spatial reasoning.
  • Another thing she mentioned is episodic memory (which I should add has also traditionally been associated with hippocampal functioning), both as a compressed form of memory and in the context of a self being necessary for understanding.
  • This mention of the self stuck with me: I think that a good definition of understanding is ultimately relational, about how an organism can act on information. This presupposes a self-world distinction. I am not sure whether this is why Mitchell mentioned it, but I do myself think embodiment is a prerequisite for understanding. Mitchell also expanded on artificial systems needing to learn the way babies do, which I would say amounts to building concepts from the ground up in service of manipulating the world (including other people).
  • She mentioned that complexity science can inform the architecture of artificial cognitive systems, as the complex dynamical systems view of the brain emphasises feedback processes, top-down regulation, modularity and hierarchical structure. A transformer network does not have this architecture by itself.

Modular LLMs

From what I've understood elsewhere, commercial parties are working with hierarchically embedded LLMs, so that tasks are routed to models that have been trained more narrowly, to improve performance. This does not make the LLM itself modular, but rather uses it as a step towards massive modularity in artificial systems.
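A toy sketch of that routing idea, for concreteness: the model names and the keyword heuristic below are entirely made up (real systems would use a learned classifier or a model-based router), but the structure shows where the modularity lives, namely in the surrounding system rather than inside any single network.

```python
# Toy illustration of routing tasks to narrower models; the model names
# and the keyword heuristic are hypothetical, not any vendor's real setup.

SPECIALISTS = {
    "math": "math-tuned-model",
    "code": "code-tuned-model",
    "general": "general-purpose-model",
}

def classify(query: str) -> str:
    """Crude keyword router standing in for a learned task classifier."""
    q = query.lower()
    if any(w in q for w in ("integral", "prove", "equation")):
        return "math"
    if any(w in q for w in ("python", "function", "bug")):
        return "code"
    return "general"

def route(query: str) -> str:
    """Return the narrowly trained model that should handle the query."""
    return SPECIALISTS[classify(query)]

print(route("Fix the bug in this Python function"))  # -> code-tuned-model
```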

  • Then there was the question of why rigour seems to be missing from so many performance studies of machine cognition. Mitchell proposed that computer scientists, unlike psychologists, are not trained in experimental methods: it's a difference between disciplines. But gauging understanding is relevant to the engineering of these systems, because generalisation might depend on it.