Research
I lead a growing team of researchers working on the application of speech and audio technology in healthcare. This is an expanding but challenging field with great potential for translational research and real impact on people's lives. Recent advances in mainstream speech recognition and understanding are only partially portable to the healthcare domain, because of the confounding challenges posed by the need for personalised technology tailored to an individual's specific impairment, and by the lack of data available to train increasingly sophisticated computational models.
Automatic speech recognition for atypical voices
Sheffield has a long-standing track record in this area. We were the first to introduce mainstream techniques such as deep learning to improve performance (e.g., Christensen et al, 2012, 2013a, 2013b). In the homeService project we implemented an online system that we deployed in people's homes long-term; this was the demonstration system for an EPSRC programme grant. The system was one of the first cloud-based systems of its kind, and we were able to demonstrate state-of-the-art performance for its users. We have subsequently made the collected database public (e.g., Green et al, 2016; Nicolao et al, 2016; Malavasi et al, 2016). We extended this work (Google Research Award; DeepArt) by exploring the use of deep learning to obtain articulatory representations (Xiong et al, 2018, 2019, 2020). Currently, an ESR on the H2020 ITN-ETN TAPAS is working on improving continuous dysarthric speech recognition (Yue et al, 2020a, 2020b). Relatedly, my PhD student Lubna Alhinti is working on the automatic recognition and detection of linguistic as well as paralinguistic information in speech (Alhinti et al, 2018, 2020a, 2020b). Funded by the EU, Google and MRC (Confidence in Concept) Round 3.
Detection of verbal and non-verbal traits in speech and language
As part of a multi-disciplinary team involving neuroscientists, neuropsychologists, a clinical linguist and a general practitioner, I have led the technical work on developing a stratification test for people with memory concerns, in which a virtual agent asks memory-probing questions and the underlying speech analysis and machine learning looks for signs of neurodegenerative dementia in a person's speech and language (Mirheidari et al, 2016, 2017a, 2017b, 2018, 2019, 2020). Currently, Yilin Pan, an ESR on TAPAS, is working with Philips on home-based monitoring for dementia (Pan et al, 2019, 2020a, 2020b). In 2019, Dr Dan Blackburn and I received the Rosetrees Trust Interdisciplinary Prize for AI and medicine to port this technology to the assessment of cognitive health in stroke survivors (COMPASS). Also in 2019, we took the CognoSpeak system to Kenya to work with a local neurologist (GCRF pump-priming funds) and to begin exploring how to make the underpinning speech analytics more language-agnostic. Recently, I have started working with Megan Thomas, one of the CDT students, on a PhD collaboration with Apple on speech- and language-based automatic tracking of depression and anxiety. Relatedly, my PhD student Attas is developing a system for automatically analysing psychotherapist-client sessions with the aim of detecting cues of potential rupture. Wider applications of conversational agents in therapy include a motivational system for people with COPD, which was co-created and whose prototype has had an initial evaluation (Easton et al, 2020), and the exploration of a similar system for creating an empathic AI agent for online peer-mediated mental health intervention (Easton et al, in preparation). Funded by the EU, a Johns Hopkins workshop grant, MRC (Confidence in Concept) Round 6, the Rosetrees Trust, CLAHRC, and SHSC RCF NIHR.