Keynote speakers

Keynote: Susan Goldin-Meadow

Title: What small data can tell us about the resilience of language


Children learn the languages to which they are exposed. Understanding how they accomplish this feat requires that we know as much as we can not only about the linguistic input they receive, but also about the architecture of the minds that process this input. But because linguistic input has such a massive and immediate effect on the language children acquire, it is difficult to determine whether children come to language learning with biases and, if so, what those biases are; gathering ever more data about linguistic input will not solve the problem. Examining children who cannot make use of the linguistic input that surrounds them does, however, allow us to discover the biases that children bring to language learning. The properties of language that such children develop are not only central to human language, but also provide hints about the kind of mind that is driven to structure communication in this way. In this talk, I describe congenitally deaf individuals who cannot learn the spoken language that surrounds them and have not been exposed to sign language by their hearing families. Individuals in these circumstances use their hands to communicate: they gesture. These gestures, called homesigns, take on many, but not all, of the forms and functions of languages that have been handed down from generation to generation. I first describe properties of language that are found in homesign. I then consider properties not found in homesign and explore conditions that could lead to their development, first in a naturally occurring situation of language emergence (Nicaraguan Sign Language) and then in an experimentally induced situation of language emergence (silent gesturers asked to communicate using only their hands). My goal is to shed light on the type of cognitive system that can not only pick up regularities in the input but, if necessary, also create a communication system that shares many properties with established languages.


Keynote: Afra Alishahi

Title:  Emerging representations of form and meaning in models of grounded language


Humans learn to understand speech from weak and noisy supervision: they manage to extract structure and meaning from speech simply by being exposed to utterances situated and grounded in their daily sensory experience. Emulating this remarkable skill has been the goal of numerous studies; however, researchers have often used severely simplified settings in which either the language input or the extralinguistic sensory input, or both, are small in scale and symbolically represented.

We simulate this process in visually grounded models of language understanding that project utterances and images into a joint semantic space. We use variations of recurrent neural networks to model the temporal nature of spoken language, and examine how form- and meaning-based linguistic knowledge emerges from the input signal. We carry out an in-depth analysis of the representations used by different components of the trained model and show that the encoding of semantic aspects tends to become richer as we move up the hierarchy of layers, whereas the encoding of form-related aspects of the language input tends to initially increase and then plateau or decrease.
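The architecture described above can be illustrated with a minimal sketch: a recurrent encoder maps a sequence of speech frames to an utterance embedding, an image encoder projects visual features into the same joint space, and cosine similarity scores utterance-image pairs. All names, dimensions, and parameters here are hypothetical, and training (e.g., with a contrastive objective over matching and mismatching pairs) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 13 speech features per frame (MFCC-like),
# 512-dim image features, 32-dim joint semantic space.
SPEECH_DIM, IMAGE_DIM, JOINT_DIM = 13, 512, 32

# Randomly initialised parameters; a real model would train these.
W_in = rng.normal(scale=0.1, size=(JOINT_DIM, SPEECH_DIM))
W_rec = rng.normal(scale=0.1, size=(JOINT_DIM, JOINT_DIM))
W_img = rng.normal(scale=0.1, size=(JOINT_DIM, IMAGE_DIM))

def encode_utterance(frames):
    """Simple recurrent encoder: the final hidden state, L2-normalised,
    serves as the utterance embedding in the joint space."""
    h = np.zeros(JOINT_DIM)
    for x in frames:
        h = np.tanh(W_in @ x + W_rec @ h)
    return h / np.linalg.norm(h)

def encode_image(feats):
    """Linear projection of image features into the joint space."""
    z = W_img @ feats
    return z / np.linalg.norm(z)

def similarity(utterance_frames, image_feats):
    """Cosine similarity between an utterance and an image."""
    return float(encode_utterance(utterance_frames) @ encode_image(image_feats))

# Toy usage: score one 20-frame utterance against two candidate images.
utt = rng.normal(size=(20, SPEECH_DIM))
img_a = rng.normal(size=IMAGE_DIM)
img_b = rng.normal(size=IMAGE_DIM)
scores = [similarity(utt, img_a), similarity(utt, img_b)]
```

The analysis the abstract describes would then probe the hidden states of such an encoder layer by layer, measuring how much form versus meaning each layer's representations carry.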

Keynote: Phil Blunsom

Title: Structure and grounding in language


Computational models of language built upon recent advances in Artificial Intelligence are able to produce remarkably accurate predictive distributions when trained on large text corpora. However, there is significant evidence that such models are not discovering and using the latent syntactic and semantic structure inherent in language. In the first part of this talk I will discuss recent work at DeepMind and Oxford University aimed at understanding to what extent current deep learning models are learning structure, and whether models equipped with a preference for memorisation or hierarchical composition are better able to discover lexical and syntactic units. In the second part of this talk I will describe initial work at DeepMind to train agents in simulated 3D worlds to ground simple linguistic expressions.

Keynote: Cynthia Fisher

Title: Words, syntax, and conversation: How children use sentence and discourse structure to learn about words 


Children learn their native languages from notably noisy input. How do they manage this feat? One class of proposals suggests that children do so by learning in a biased system that expects (and therefore detects or builds) straightforward links between different levels of linguistic structure. Syntactic bootstrapping is an example of such a proposal, arguing that word learning (verb learning in particular) is guided by the child's growing syntactic knowledge. The structure-mapping account proposes that syntactic bootstrapping begins with universal biases (1) to map nouns in sentences onto distinct participant roles in a structured conceptual representation and (2) to represent syntactic knowledge in abstract terms. These biases make some aspects of sentence structure inherently meaningful to children (e.g., the number of nouns in the sentence), and permit children to generalize newly acquired syntactic knowledge rapidly to new verbs. In this talk, I will first review evidence for the structure-mapping account. Next, I will discuss challenges to the account arising from the existence of languages, such as Korean, that allow verbs’ arguments to be omitted. I will propose and review evidence that an expectation of discourse continuity allows children to gather linguistic evidence for verbs’ arguments across sentences in a conversation. Taken together, these lines of evidence make clear that simple aspects of sentence structure can guide verb learning from the start of multi-word sentence comprehension, even when some of a new verb’s arguments are missing due to discourse redundancy.

Keynote: Chen Yu

Title: Statistical Word Learning: Data, Mechanisms and Models


Recent theory and experiments offer a solution as to how human learners may break into word learning, by using cross-situational statistics to find the underlying word-referent mappings. Computational models demonstrate the in-principle plausibility of this statistical learning solution and experimental evidence shows that both adults and infants can aggregate and make statistically appropriate decisions from word-referent co-occurrence data. In this talk, I will first review these empirical and modeling contributions to investigate cognitive processes in statistical word learning. Next, I will present a set of studies using head-mounted cameras and eye trackers to collect and analyze toddlers’ visual input as parents label novel objects during an object-play session. The results show how toddlers and parents coordinate momentary visual attention when exploring novel objects in free-flowing interaction, and how toddlers accumulate co-occurring statistics of seen objects and heard words through free play. I will conclude by suggesting that future research should focus on detailing the statistics in the learning environment and the cognitive processes that make use of those statistics.
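The core statistical mechanism the abstract describes can be sketched in a few lines: accumulate word-referent co-occurrence counts across individually ambiguous scenes, then resolve each word to its most frequent co-occurring referent. The toy corpus below is invented for illustration; real models of cross-situational learning are probabilistic and handle far noisier data.

```python
from collections import Counter, defaultdict

# Toy cross-situational corpus: each scene pairs the words a learner
# hears with the referents in view. No single scene disambiguates
# the word-referent mappings; only the aggregate statistics do.
scenes = [
    ({"ball", "dog"}, {"BALL", "DOG"}),
    ({"ball", "cup"}, {"BALL", "CUP"}),
    ({"dog", "cup"}, {"DOG", "CUP"}),
    ({"ball", "dog"}, {"BALL", "DOG"}),
]

# Accumulate word-referent co-occurrence counts across scenes.
cooc = defaultdict(Counter)
for words, referents in scenes:
    for w in words:
        for r in referents:
            cooc[w][r] += 1

# Decision rule: map each word to its most frequent co-occurring referent.
lexicon = {w: counts.most_common(1)[0][0] for w, counts in cooc.items()}
# e.g., "ball" co-occurs with BALL in 3 scenes but with DOG in only 2,
# so the learner settles on ball -> BALL despite the ambiguity.
```

The head-camera studies mentioned in the talk ask where such co-occurrence statistics come from in the first place: which objects actually dominate the toddler's view at the moment a word is heard.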

Keynote: Michael Frank

Title: Variability and Consistency in Early Language Learning: The Wordbank Project


Every typically developing child learns to talk, but children vary tremendously in how and when they do so. What predicts this variability? And which aspects of early language learning are consistent across the world’s languages and cultures? We use data from tens of thousands of children learning dozens of different languages to create a data-driven picture of universals and variation in early language learning.
