The Linking Problem
Fintan Mallory (Durham University)
Georg Morgenstiernes hus, room 652
Blindernveien 31
Oslo 0851
Norway
Details
Research on language model interpretation has two distinct objects of inquiry: models and text. Work on mechanistic interpretability aims to identify how deep neural networks come to represent the world. During training, a network's internal states can come to track features in the data to which it has been exposed; these features are encoded in the weights and biases of the network's layers. Finding out what is represented in a network, and how it is represented, is called the model interpretation problem. In contrast, work on language model metasemantics aims to determine whether the text output by generative language models is meaningful and, if so, how that meaning is determined. The Linking Problem concerns how the results of these two fields should inform each other, if at all. This talk will attempt to clarify one way of responding to this problem.
This guest lecture is part of the "Artificial Intelligence: Human-LLM Communication" and "How do we understand machines that talk to us?" projects.
Registration: No