For lists of possible undergraduate, postgraduate and PhD research topics, see my teaching page.
Grants / Funding
- CroDeCo (Cross-Lingual Analysis for Detection of Cognitive Impairment in Less-Resourced Languages) - ARIS J6-60109 2025-27, Principal Investigator
- AdSoLve (Addressing Socio-Technical Limitations of LLMs for Medical and Social Computing) - Responsible AI UK (EPSRC EP/Y009800/1) KP0016 2024-28, Co-Investigator
- ARCIDUCA (Annotating Reference and Coreference In Dialogue Using Conversational Agents in games) - EPSRC EP/W001632/1 2022-25, Co-Investigator
- Sodestream (Streamlining Social Decision-Making for Enhanced Internet Standards) - EPSRC EP/S033564/1 2020-24, Principal Investigator
- RobaCOFI (Robust and Adaptable Comment Filtering) - AI4Media 2022-23, Co-Investigator
- EMBEDDIA (Cross-Lingual Embeddings for Less-Represented Languages in European News Media) - EU H2020 825153 2019-22, Local Principal Investigator
- SEVERE (SEVErity Detection in REal Care Comments) - Care Quality Commission 2018, Principal Investigator
- Institute of Coding - HEFCE 2018-20, Co-Investigator
- TEMPO (Twitter Exploration for Mining Patient Opinions) - InnovateUK 2016, Principal Investigator
- SLaDe (Sensing Language for Dementia Diagnosis) - QM Intelligent Sensing Fund 2015, Principal Investigator
- ConCreTe (Concept Creation Technology) - EU FP7 611733 2013-16, Co-Investigator
- CMSI (Cultural Mobility through Social Intelligence) - CreativeWorksLondon/AHRC 2013, Academic Partner
- AOTD (Analysing Online Therapy Dialogue) - QM Innovation Fund 2013, Principal Investigator
- Reel Reviews: A Social Movie Recommendation App - QM Innovation Fund 2013-14, Principal Investigator
- Centre for Digital Music - EPSRC Platform Grant EP/K009559/1 2013-18, Co-Investigator
- RISER (Robust Incremental Semantic Resources for Dialogue) - EPSRC EP/J010383/1 2012-13, Principal Investigator
- Chatterbox Development of Prototype - TSB SMART 720149 2012-13
- PPAT (Predicting Patient Adherence to Treatment from Dialogue Transcripts) - EPSRC EP/J501360/1 2012, Principal Investigator
- Prototyping Social Clothing - ImpactQM 2012, Principal Investigator
- Chatterbox - QM Proof of Concept Fund 2012
- Chatterbox Proof of Market - TSB SMART 700081 2011-12
- DynDial (The Dynamics of Conversational Dialogue) - ESRC ES/F027117/1 2008-11, Named Researcher
Research Students & Post-docs
Current
- Pakawat Nakwijit - PhD, social use of misspelling in Thai
- Iacopo Ghinassi - PhD, multimodal broadcast segmentation and understanding
- Peyman Hosseini - PhD, sentiment analysis for longer texts
- Zahraa Al Sahili - PhD, bias in multimodal machine learning
- Nikola Ivačič - PhD, text clustering and tracking over time
- Xiangyan Chen - PhD, improving factuality in dialogue systems
- Jaya Caporusso - PhD, language and expressions of self
- Zicen Liao - PhD, clarification and adaptation in dialogue systems
Previous
- Vanja M Karan (→ Vienna) - post-doc RA, Sodestream project
- Yujian Gan (→ UCL) - post-doc RA, ARCIDUCA project (previously PhD, text-to-SQL parsing)
- George Wright (→ Birkbeck) - PhD, creative narrative generation
- Shamila Nasreen (→ MUST) - PhD, NLP for dementia diagnosis
- Ravi Shekhar (→ Essex) - post-doc RA, EMBEDDIA and Sodestream projects
- Prashant Khare (→ UK8) - post-doc RA, Sodestream project
- Morteza Rohanian (→ Zurich) - PhD, multimodal NLP and mental health
- Jorge del Bosque Trevino (→ Nanu) - PhD, explanations in dialogue systems
- Carlos Armendariz - RA, EMBEDDIA project & MPhil, meaning fluidity in dialogue
- Tanmoy Mukherjee (→ VUB) - PhD, cross-modal representation learning
- Mariano Mora McGinity - PhD, language learning and cooperativity
- Max Droog-Hayes (→ action.ai, ieso) - PhD, semantic summarisation
- Nanda Khaorapapong (→ qConsult) - PhD, wearable computing for social interaction
- Christine Farion (→ York) - PhD, ubiquitous computing to combat forgetfulness
- Jeni Maleshkova - PhD, virtual interactive environments
- Stephen McGregor (→ Goldsmiths, action.ai) - PhD, context-dependent distributional semantics and non-literal language
- Sasha Scott (→ European Broadcasting Union) - PhD, public mourning behaviour on social media
- Dmitrijs Milajevs (→ NIST) - PhD, distributional semantics for words and sentences
- Shauna Concannon (→ Newcastle) - RA, CMSI/TEMPO projects & PhD, argumentation in online discussion
- Sascha Griffiths (→ Hamburg) - post-doc RA, creativity and conceptual semantics (ConCreTe project)
- Niall Gunter (→ OpenBet) - RA, SLaDe project
- Julian Hough (→ Bielefeld) - RA, RISER project & PhD, incremental self-repair processing in dialogue
- Christine Howes (→ Gothenburg) - post-doc RA, dialogue mining for clinical consultations (PPAT and AOTD projects)
- Henrietta Eyre (→ Black Swan Data) - post-doc RA, creativity and conceptual semantics (ConCreTe project)
- Arash Eshghi (→ Heriot-Watt) - post-doc RA, induction for incremental semantic grammars (RISER project)
Research Topics
Incrementality in Dialogue
Dialogue is incremental -- people don't speak (or listen) in complete, stand-alone sentences, but build up meaning bit by bit in an interactive process. We can interrupt each other, continue each other's utterances, and engage in a continuous process of feedback and repair. I'm involved in the DynDial project, investigating how this happens (through corpus and experimental work) and how we can model it (in grammatical frameworks and dialogue systems). As part of this we've built various parsers and dialogue systems using Dynamic Syntax; see here.
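To give a concrete flavour of what word-by-word processing involves, here is a minimal toy sketch in Python -- emphatically not Dynamic Syntax itself (whose parsers are linked above), and with an invented repair convention -- showing an interpretation state that grows with each word, survives a speaker change, and rolls back on self-repair:

```python
# Toy incremental interpreter: meaning is built word by word, either
# speaker can extend the emerging structure, and a repair marker rolls
# it back. Illustrative only -- not Dynamic Syntax.

class IncrementalInterpreter:
    def __init__(self):
        self.partial = []   # the interpretation built so far
        self.history = []   # snapshots, so repairs can roll back

    def add_word(self, speaker, word):
        """Extend (or repair) the shared structure one word at a time."""
        self.history.append(list(self.partial))
        if word in {"uh", "um"}:              # filled pause: adds no content
            return self.partial
        if word == "no,":                     # (invented) self-repair marker:
            self.partial = self.history[-2]   # roll back the last word
        else:
            self.partial.append((speaker, word))
        return self.partial

interp = IncrementalInterpreter()
# A repaired, jointly produced utterance:
# A: "pass the pepper no, salt"  B: "please"
for speaker, word in [("A", "pass"), ("A", "the"), ("A", "pepper"),
                      ("A", "no,"), ("A", "salt"), ("B", "please")]:
    print(interp.add_word(speaker, word))  # usable partial state at every word
```

The point of the sketch is that there is a well-defined, usable interpretation at every word boundary -- which is exactly what lets hearers interrupt, complete, and repair mid-utterance.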
Open-Domain Dialogue Understanding
Understanding dialogue is a genuinely hard problem; we humans are good at it, partly because we have a pretty good idea of what might make sense in a given context. Most computer dialogue systems make use of the same insight: as they work in a restricted domain, they can map from words to meaning representations based on what's sensible or possible in that domain; and as they're involved in the dialogue themselves, they can always ask for clarification if they need it. When the problem is to understand what people are saying as an overhearer, without knowing much about the domain, it's harder; but this is exactly the problem we encounter if we want to build something like an automatic meeting assistant (like the system we developed at Stanford on the CALO project for detecting decisions and action items -- see here). I'm interested in developing robust techniques for detecting high-level topic structure, low-level aspects like addressing, and important conversational structures like decision-making and action item assignment.
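As a toy illustration of the shallow end of this kind of detection (not the CALO system itself; the labels and utterances below are invented), one can treat it as utterance-level classification and read conversational structure off the predicted tags:

```python
# Minimal sketch: tag each meeting utterance as decision / action_item /
# other, then read structure off the tags. Training data are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_utts = [
    "so we've agreed to go with the second design",
    "let's fix the release date as the tenth",
    "John, can you draft the report by Friday?",
    "I'll set up the user study next week",
    "the weather was terrible on the way in",
    "could you repeat that?",
]
train_tags = ["decision", "decision", "action_item",
              "action_item", "other", "other"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_utts, train_tags)

for utt in ["okay, we'll go with option B then",
            "can you email the slides to everyone?"]:
    print(utt, "->", clf.predict([utt])[0])
```

In practice, of course, decisions and action items emerge over several utterances and speakers, so single-utterance tagging like this is only a starting point.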
Multi-Device Dialogue Systems
I am also working on an in-car spoken dialogue system project, to help people interact with the increasingly complex set of devices in their car (stereo, phone, navigation & information systems) without having to divert their eyes or hands from the more critical job of driving. This raises a couple of interesting questions: firstly, how do we know which device is being addressed at any given time (especially given the perennial problem of noisy speech recognition); and secondly, how do we know whether the system is being addressed at all, rather than a passenger? Amongst other things, we're approaching this by combining deep & shallow information (e.g. parse structures with topic classifiers) for increased robustness, while working on intelligent clarification and confirmation strategies.
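As a rough sketch of what combining evidence sources might look like (the keyword lists, feature extractors, and weights here are invented placeholders, not the project's actual models), one can late-fuse a shallow topic/keyword score with a stand-in for parse-derived evidence:

```python
# Illustrative sketch: fuse shallow (keyword/topic) and "deep"
# (parse-style) evidence to guess which device, if any, is addressed.
# All features and weights are invented for the example.

DEVICES = ["stereo", "phone", "navigation", "none"]  # "none" = passenger talk

def shallow_scores(utterance):
    """Keyword spotting: cheap, and fairly robust to noisy ASR."""
    keywords = {"stereo": ["play", "song", "volume"],
                "phone": ["call", "dial", "text"],
                "navigation": ["route", "turn", "exit"]}
    scores = {d: 0.0 for d in DEVICES}
    for device, words in keywords.items():
        scores[device] = float(sum(w in utterance.lower() for w in words))
    return scores

def deep_scores(utterance):
    """Stand-in for parse-derived evidence: an imperative-like opening
    suggests the system, not a passenger, is being addressed."""
    scores = {d: 0.0 for d in DEVICES}
    first = utterance.split()[0].lower()
    if first in {"play", "call", "dial", "show"}:
        scores["none"] -= 1.0   # imperative: probably not passenger talk
    else:
        scores["none"] += 0.5   # declarative-ish: maybe passenger talk
    return scores

def addressee(utterance, w_shallow=1.0, w_deep=1.0):
    s, d = shallow_scores(utterance), deep_scores(utterance)
    combined = {dev: w_shallow * s[dev] + w_deep * d[dev] for dev in DEVICES}
    return max(combined, key=combined.get)

print(addressee("play that song again"))        # -> stereo
print(addressee("the weather is awful today"))  # -> none (passenger talk)
```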
Clarification Requests
While at King's College London I worked on the ROSSINI project, and my PhD thesis investigated clarification questions: what types people use when, how they should be interpreted, how they can be treated or used by a dialogue system, and what they tell us about semantics in general. I'm still working in this area, particularly on suitable semantic representations, in collaboration with Jonathan Ginzburg. As part of my thesis I built a prototype dialogue system, CLARIE, designed to be able to (a) interpret users' clarification questions and respond suitably, and (b) ask clarification questions in order to learn new words and phrases. One of the things I'm currently working on (with Raquel Fernández) is extending it to incorporate an element of machine learning: using classifiers to determine the optimal method of fragment resolution. In the meantime, you can try the basic (rule-based) thesis version here, but be warned that the grammar is very limited -- it might be worth getting in touch with me first.
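A minimal sketch of that machine-learning extension (with invented classes, features, and training examples -- not CLARIE's actual design) might classify each clarification fragment in context and map the predicted class onto a resolution method:

```python
# Sketch: predict which resolution method suits a clarification
# fragment, given its antecedent utterance. Classes and data invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Each example is "antecedent utterance || clarification fragment".
train = [
    "put it on the shelf || the shelf?",           # echo: confirm the words
    "give it to Bo || Bo?",                        # echo of a name
    "hand me the doohickey || the what?",          # unknown word
    "pass me the spanner || a what?",
    "meet me at the usual place || which place?",  # referent unclear
    "bring the red one || which one?",
]
labels = ["confirm", "confirm", "define", "define", "identify", "identify"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train, labels)

strategy = {"confirm": "re-assert the clarified constituent",
            "define": "paraphrase or gloss the unknown word",
            "identify": "supply the intended referent"}

frag = "take the left fork || the what?"
method = clf.predict([frag])[0]
print(method, "->", strategy[method])  # likely 'define' here
```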