Alumni
Morgan Bailey
Morgan was part of the second cohort of Social AI CDT students. Her research focused on social intelligence and how it can be used when building human-AI teams, in collaboration with Qumodo, whose main goal is to advance human-AI teams. Qumodo had previously developed software for the Home Office, which drew her to this project, as she has a passion for conducting research that has real-world applications and ultimately improves quality of life.
Prior to joining the Social AI CDT, she completed a BSc in Psychology and an MSc in Psychological Research Methods at the University of Plymouth. Her undergraduate dissertation looked at psycholinguistics, with a focus on the relationship between phonemic awareness and reading ability in the general population; having been diagnosed with dyslexia at the age of 8, language acquisition has always fascinated her. Her love of psycholinguistics continued into her MSc, but as she wanted to expand her knowledge and challenge herself with different areas of research, she chose to focus her MSc dissertation on the effects of pitch on social cooperation between humans and robots. During her literature review, she discovered how complex and intriguing the relationship between humans and AI is, which motivated her to look for future opportunities to conduct more research into improving the relationships between humans and AI.
In her opinion, the most appealing aspect of the Social AI CDT is the unique opportunity to work with a group of academics who have such a diverse set of skills and interests; it is inspiring to be working as part of a team who can all help train and support each other.
In 2025, Morgan successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Human or Machine? Exploring How Anthropomorphism and Performance Shape Trust in Human-AI Teams
Visions of the workplace-of-the-future include applications of machine learning and artificial intelligence embedded in nearly every aspect of work (Brynjolfsson & Mitchell, 2017). This “digital transformation” holds promise to broadly increase effectiveness and efficiency. A challenge to realising this transformation is that the workplace is substantially a human social environment and machines are not intrinsically social. Imbuing machines with social intelligence holds promise to help build human-AI teams, and current approaches to teaming one human with one machine appear reasonably straightforward to design. However, when more than one human and more than one system work together, the complexity of social interactions increases, and we need to understand the society of human-AI teams. This research proposes to take a first step in this direction by considering the interaction of triads containing humans and machines.
Our proposed testbed concerns automatic image classification, chosen because identity and location recognition is a primary work context of our industrial partner Qumodo. Moreover, many image classification systems have recently shown the ability to approach or exceed human performance. There are two scenarios we would like to examine involving human-AI triads, which we term the sharing problem and the consensus problem:
In the sharing problem we examine two humans teamed with the same AI and examine how the human-AI team is influenced by the learning style of the AI, which after initial training can either learn from a single trainer or from multiple trainers. We will examine how trust in the classifier evolves depending upon the presence/absence of another trainer and the accuracy of the other trainer(s). To obtain precise control the “other” trainer(s) could either be actual operators or simulations obtained by parametrically modifying accuracy based on ground truth. Of interest are the questions of when human-AI teams benefit from pooling of human judgment and if pooling can lead to reduced trust.
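As a purely illustrative sketch (not the project’s implementation), a simulated “other” trainer with a parametrically controlled accuracy could be generated from ground-truth labels along these lines; the function name, class counts and data are hypothetical:

import random

def simulated_trainer(ground_truth, accuracy, n_classes, seed=0):
    """Return labels that agree with the ground truth at the requested rate."""
    rng = random.Random(seed)
    labels = []
    for true_label in ground_truth:
        if rng.random() < accuracy:
            labels.append(true_label)  # correct response
        else:
            # uniform error over the remaining classes
            labels.append(rng.choice([c for c in range(n_classes) if c != true_label]))
    return labels

# Example: a trainer that is right 80% of the time on a 3-class labelling task
truth = [0, 1, 2, 1, 0, 2, 2, 1]
print(simulated_trainer(truth, accuracy=0.8, n_classes=3))

Sweeping the accuracy parameter would then let the experimenters control how reliable the “other” trainer appears to participants.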
In the consensus problem we use the scenario of a human manager who must reach a consensus view based on input from a pair of judgments (human-human, human-AI). This consensus will be reached either with or without “explanation” from the two judgments. To make the experiment tractable we will consider the case of a binary decision (e.g. two facial images are of the same person or a different person). Aspects of the design will be taken from a recent paper examining recognition of identity from facial images (Phillips, et al., 2018).
In addition to these experimental studies we also wish to conduct qualitative studies involving surveys or structured interviews in the workplace to ascertain whether the experimental results are consistent or not with people’s attitudes towards the scenarios depicted in the experiments.
As industry moves further towards AI automation, this research will have substantial impact on future practices within the workplace. Even as AI performance increases, in most scenarios a human is still required to be in the loop, yet there has been very little research into what such human-AI integration and interaction should look like. This research is therefore of pressing importance across the many sectors moving towards automation.
References
[BRY17] Brynjolfsson, E., & Mitchell, T. (2017). What can machine learning do? Workforce implications. Science, 358(6370), 1530-1534.
[PHI18] Phillips, P. J., Yates, A. N., Hu, Y., Hahn, C. A., Noyes, E., Jackson, K., … & Chen, J. C. (2018). Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms. Proceedings of the National Academy of Sciences, 115(24), 6171-6176.
Andrei Birladeanu
Andrei was part of the first cohort of Social AI CDT students, working on a project at the intersection of psychiatry and social signal processing. He completed his undergraduate degree in Psychology at the University of Aberdeen, finishing with a thesis examining the physical symptoms of social anxiety. His academic interests are broad, but he has been particularly drawn to the fields of theoretical cognitive science and cognitive neuroscience in both its basic and translational forms. The latter is what motivated him to pursue research in the field of computational psychiatry, a novel approach aiming to detect and define mental disorders with the help of data-driven techniques. For his PhD research, he used methods from social signal processing to help psychiatrists identify children who display signs of Reactive Attachment Disorder, a severe cluster of psychological and behavioural issues affecting abused and neglected children.
In 2025, Andrei was awarded an MSc in Computing Science and Psychology.
Multimodal Deep Learning for Detection and Analysis of Reactive Attachment Disorder in Abused and Neglected Children
The goal of this project is to develop AI-driven methodologies for detection and analysis of Reactive Attachment Disorder (RAD), a psychiatric disorder affecting abused and neglected children. The main effect of RAD is “failure to seek and accept comfort”, i.e., the shut-down of a set of psychological processes, known as the Attachment System and essential for normal development, that allow children to establish and maintain beneficial relationships with their caregivers [YAR16]. While having serious implications for the child’s future (e.g., RAD is common in children with complex psychiatric disorders and criminal behaviour [MOR17]), RAD is highly amenable to treatment if recognised in infancy [YAR16]. However, the disorder is hard for clinicians to detect because its symptoms are not easily visible to the naked eye.
Encouraging progress in RAD diagnosis has been achieved by manually analysing videos of children involved in therapeutic sessions with their caregivers, but such an approach is too expensive and time consuming to be applied in a standard clinical setting. For this reason, this project proposes the use of AI-driven technologies for the analysis of human behaviour [VIN09]. These have been successfully applied to other attachment related issues [ROF19] and can help not only to automate the observation of the interactions, thus reducing the amount of time needed for possible diagnosis, but also to identify behavioural markers that might escape clinical observation. The emphasis will be on approaches that jointly model multiple behavioural modalities through the use of appropriate deep network architectures [BAL18].
The experimental activities will revolve around an existing corpus of over 300 real-world videos collected in a clinical setting and they will include three main steps:
- Identification of the behavioural cues (the RAD markers) most likely to account for RAD through manual observation of a representative sample of the corpus;
- Development of AI-driven methodologies, mostly based on signal processing and deep networks, for the detection of the RAD markers in the videos of the corpus;
- Development of AI-driven methodologies, mostly based on deep networks, for the automatic identification of children affected by RAD based on the presence and intensity of the cues detected at the previous step (a minimal illustrative sketch follows this list).
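As a purely illustrative sketch of how the second and third steps could fit together (assuming per-modality marker detectors already produce fixed-length presence/intensity scores for each video), a late-fusion deep network might look like the following; the modality names, dimensions and PyTorch architecture are hypothetical, not the project’s actual design:

import torch
import torch.nn as nn

class LateFusionRADClassifier(nn.Module):
    """Fuse per-modality marker scores (e.g. face, body, voice) into a RAD prediction."""
    def __init__(self, dims=(16, 16, 8), hidden=32):
        super().__init__()
        # one small encoder per behavioural modality
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in dims])
        self.head = nn.Linear(hidden * len(dims), 2)  # RAD vs. non-RAD logits

    def forward(self, modalities):
        fused = torch.cat([enc(x) for enc, x in zip(self.encoders, modalities)], dim=-1)
        return self.head(fused)

# Hypothetical marker scores for a batch of 4 children
face, body, voice = torch.rand(4, 16), torch.rand(4, 16), torch.rand(4, 8)
logits = LateFusionRADClassifier()([face, body, voice])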
The likely outcomes of the system include a scientific analysis of RAD related behaviours as well as AI-driven methodologies capable of supporting the activity of clinicians. In this respect, the project aligns with needs and interests of private and public bodies dealing with child and adolescent mental health (e.g., the UK National Health Service and National Society for the Prevention of Cruelty to Children).
References
[BAL18] Baltrušaitis, T., Ahuja, C. and Morency, L.P. (2018). Multimodal Machine Learning: A Survey and Taxonomy, IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 421-433.
[HUM17] Humphreys, K. L., Nelson, C. A., Fox, N. A., & Zeanah, C. H. (2017). Signs of reactive attachment disorder and disinhibited social engagement disorder at age 12 years: Effects of institutional care history and high-quality foster care. Development and Psychopathology, 29(2), 675-684.
[MOR17] Moran, K., McDonald, J., Jackson, A., Turnbull, S., & Minnis, H. (2017). A study of Attachment Disorders in young offenders attending specialist services. Child Abuse & Neglect, 65, 77-87.
[ROF19] Roffo, G., Vo, D.B., Tayarani, M., Rooksby, M., Sorrentino, A., Di Folco, S., Minnis, H., Brewster, S., & Vinciarelli, A. (2019). Automating the Administration and Analysis of Psychiatric Tests: The Case of Attachment in School Age Children. Proceedings of the ACM CHI Conference on Human Factors in Computing Systems, Paper No. 595, pp. 1–12.
[VIN09] Vinciarelli, A., Pantic, M. and Bourlard, H. (2009), Social Signal Processing: Survey of an Emerging Domain, Image and Vision Computing Journal, 27(12), 1743-1759.
[YAR16] Yarger, H. A., Hoye, J. R., & Dozier, M. (2016). Trajectories of change in attachment and biobehavioral catch-up among high risk mothers: a randomised clinical trial. Infant Mental Health Journal, 37(5), 525-536.
Jacqueline Borgstedt
Jacqueline was part of the first cohort of Social AI CDT students, interested in how socially intelligent artificial systems can be shaped and adopted in order to improve mental health and well-being of humans. Her doctoral research explored how multimodal interaction between robots and humans can facilitate reduction of stress or anxiety and may foster emotional support. Specifically, she investigated the role of different aspects of touch during robot-human interaction and its potential for improving psychological well-being.
Prior to her PhD studies, she completed an MA in Psychology at the University of Glasgow. During her undergraduate studies, she was particularly interested in the neural circuitry underlying emotion recognition and regulation. As part of her undergraduate dissertation, she thus investigated emotion recognition abilities and sensitivity to affective expressions in epilepsy patients. Further research interests include the evaluation and development of interventions for OCD, anxiety disorders and emotion regulation difficulties.
During her PhD she looked forward to integrating her knowledge of psychological theories into the development and evaluation of socially assistive robots. Furthermore, she hoped to contribute to novel solutions for the application of robots in mental health interventions as well as the enhancement of human-robot interaction.
In 2025, Jacqueline successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Multimodal Interaction and Huggable Robot
The aim of the project is to investigate the combination of Human Computer Interaction and social/huggable robots for care, the reduction of stress and anxiety, and emotional support. Existing projects, such as Paro and the Huggable, focus on very simple interactions. The goal of this PhD project will be to create more complex feedback and sensing to enable a richer interaction between the human and the robot.
The plan would be to study two different aspects of touch: thermal feedback and squeeze input/output. These are key aspects of human-human interaction but have not been studied in human-robot settings where robots and humans come into physical contact.
Thermal feedback has strong associations with emotion and social cues [Wil17]. We use terms like ‘warm and loving’ or ‘cold and distant’ in everyday language. By investigating different uses of warm and cool feedback we can facilitate different emotional relationships with a robot. (This could be used alongside more familiar vibration feedback, such as purring). A series of studies will be undertaken looking at how we can use warming/cooling, rate of change and amount of change in temperature to change responses to robots. We will study responses in terms of, for example, valence and arousal.
We will also look at squeeze interaction from the device. Squeezing in real life offers comfort and support. One half of this task will look at squeeze input, with the human squeezing the robot. This can be done with simple pressure sensors on the robot. The second half will investigate the robot squeezing the arm of the human. For this we will need to build some simple hardware. The studies will look at human responses to squeezing, the social acceptability of these more intimate interactions, and emotional responses to them.
The output of this work will be a series of design prototypes and UI guidelines to help robot designers use new interaction modalities in their robots. The impact of this work will be to enable robots to have a richer and more natural interaction with the humans they touch. This has many practical applications for the acceptability of robots for care and emotional support.
References
[Wil17] Wilson, G., and Brewster, S.: Multi-moji: Combining Thermal, Vibrotactile & Visual Stimuli to Expand the Affective Range of Feedback. In Proceedings of the 35th Conference on Human Factors in Computing Systems – CHI ’17, ACM Press, 2017.
Robin Bretin
Robin was part of the second cohort of Social AI CDT students. From the deep forests of Sologne in France, he grew to become an extremely curious person, sensitive to his environment and the beings living in it. This curiosity and sensitivity led him, one way or another, to the National Graduate School of Cognitive Engineering (ENSC) in Bordeaux, France. Cognitics aims to understand and improve human-machine symbiosis, in terms of performance, substitution, safety, ease and comfort, and to augment humans through technology.
He is passionate about our future and the infinite possibilities that are presented to us. Being part of the Social AI CDT programme as a PhD student was the first step to what he hopes will be a great journey toward the integration of new technologies in our society, designed around and for humanity.
In 2024, Robin successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Beyond Boundaries: Unveiling Human-Drone Proxemic Dynamics Using Virtual Reality
Developing autonomous drones designed to assist and interact with people in social contexts, known as ‘social drones,’ is a rapidly evolving field in robotics. Human-drone interaction (HDI) applications range from supporting users in exercise and recreation to providing navigation cues and functioning as aerial interfaces. Understanding how humans perceive and interact with these drones is crucial for enhancing their social capabilities. As social drones are expected to engage closely with humans, similar to everyday human interactions, comprehending the spatial dynamics between these machines and individuals becomes essential. While proxemic behavior has been extensively studied in human-robot interaction, its application to interactions with drones remains unexplored.
Traditionally, researchers have conducted experiments involving direct contact between users and drones. However, such studies are often challenging, costly, and inflexible. For example, conducting real-world experiments to investigate how different drone sizes or flying altitudes impact user behavior can be impractical or hazardous. In contrast, conducting valid experiments in immersive virtual reality (VR) allows researchers to engage a larger and more diverse participant pool while maintaining greater control over environmental variables. For instance, adjusting a drone’s size in VR is as simple as modifying a variable, and VR eliminates the physical constraints of real-world experiments. However, for VR-based studies of human-drone interactions to inform our understanding of HDI in real-world settings, it is crucial to determine the extent to which findings from VR experiments generalize to real-life HDI scenarios.
This project aims to address theoretical gaps in Human-Drone Proxemics by proposing an initial theoretical framework grounded in social psychology and cognitive theories, supported by empirical evidence. Furthermore, this project explores the innovative use of VR as a tool to investigate human-drone proxemics, offering guidelines and practical examples for its optimal application. This approach frees researchers from the limitations of traditional real-world studies and expands the potential research avenues beyond current technological constraints.
To achieve these goals, the project will investigate and compare human proxemic behavior around drones in both VR and real-world settings.
Christopher Chandler
Christopher was part of the second cohort of Social AI CDT students, interested in societies of humans and machines.
As a philosophy undergraduate at the University of Aberdeen, he was interested in modal metaphysics, gravitating to logic and issues in the foundations of mathematics in the later stages of his degree. His dissertation investigated the status of axiomatic set theory as fundamental ontology for mathematics given controversies in the search for large cardinal axioms. However, he started out studying for a joint honours in philosophy and psychology and so gained fair experience in the latter. He subsequently completed a masters in software development at the University of Glasgow with a dissertation on model checking applied to gain insight into user engagement for a mobile game, spending a couple of years thereafter developing analytics for a company in the commercial HVAC and refrigeration space.
He looked forward to working with a diverse team of talented people, challenging discussions and the inevitable maturation of thought that follows. It was his hope to make a novel contribution to the development of practical and reliable autonomous systems capable of integrating into human society.
In 2025, Christopher successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Model checking for trustworthy reasoning in autonomous driving
Trustworthiness is a property of an agent or organisation which engenders trust. From a social perspective, trust is multi-dimensional and widely believed to be a key factor in the successful adoption of AVs [1, 2]. One crucial aspect is safety. At present, the state of the art in AV navigation is dominated by deep reinforcement learning (DRL), where an agent’s knowledge of the environment is represented by sensory inputs (states) linked by actions. To ensure fast decision-making at high speed, models are typically trained offline on datasets irrelevant to the immediate context, which can lead to unexpected and risky behaviours for out-of-distribution observations. Alternatively, DRL models are trained online in simulated environments, which may generate significantly more cases, but this ultimately results in a rigid black-box system which cannot be guaranteed safe. While it is tempting to learn state/action pairs prior to deployment, this ignores vital structural information from the immediate environment. In general, the DRL approach implies that all behaviour is either attraction or avoidance, both of which are closed-loop behaviours.
Closed-loop behaviours are goal-directed reflexes which react to a disturbance (e.g., an obstacle) and terminate when it is eliminated, or when the behaviour fails. The lifecycle of closed-loop behaviours depends on active feedback from the environment and relies on real-time disturbance processing. This is biologically realistic, as even simple biological organisms react to previously unseen environments in a closed-loop fashion in real time [2, 3]. However, it is non-trivial to achieve this in the context of AVs, where compute resources are limited to minimise energy use and ultimately cost. From a safety perspective, it is therefore desirable to develop a closed-loop system which operates on data relevant to the immediate environment, is computationally lightweight, and achieves a low processing latency for decision-making to ensure reliable operation at high speeds. In addition, the method should be transparent, as explanations for AV actions are an important tool both for calibrating user trust during operation and for supporting post-hoc investigations by manufacturers and regulators in the event of accidents [5].
Model checking [6] is a widely used technique for automatically verifying reactive systems which is based on a precise mathematical and unambiguous model of possible system behaviour. While there are hard limitations, such as state-space explosion, bespoke implementations using models with small state spaces can be fast and light on computational resources.
In this project we investigate the use of model checking for the development of a transparent, lightweight and fast closed-loop system for collision avoidance in AVs which operates on sensed data in real time. Live model checking will be implemented on a resource-constrained robot in situ to investigate the effectiveness of model checking for closed-loop reasoning in AVs, with a focus on collision avoidance.
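To make the idea concrete, here is a minimal, hypothetical illustration of explicit-state safety checking: a breadth-first search over a toy transition system that either shows no collision state is reachable within a bounded horizon or returns a counterexample. A real system would use a mature model checker and models built from live sensed data rather than this hand-written toy:

from collections import deque

# Toy transition system: a state is (ego_cell, obstacle_cell) on a one-dimensional road
def successors(state):
    ego, obstacle = state
    return [(ego_next, obs_next)
            for ego_next in (ego, ego + 1)             # ego waits or advances
            for obs_next in (obstacle, obstacle - 1)]  # obstacle may approach

def violates_safety(state):
    ego, obstacle = state
    return ego == obstacle  # collision

def check_safety(initial, horizon=10):
    """Bounded reachability check: search for an unsafe state within the horizon."""
    frontier, seen = deque([(initial, 0)]), {initial}
    while frontier:
        state, depth = frontier.popleft()
        if violates_safety(state):
            return False, state  # counterexample found
        if depth < horizon:
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return True, None  # no collision reachable within the horizon

# Here the checker finds a reachable collision, so a controller would need to intervene
print(check_safety((0, 5)))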
References
[1] J. Moody, N. Bailey, and J. Zhao, “Public perceptions of autonomous vehicle safety: An international comparison,” Saf. Sci., vol. 121, pp. 634–650, Jan. 2020.
[2] K. Kaur and G. Rampersad, “Trust in driverless cars: Investigating key factors influencing the adoption of driverless cars,” J. Eng. Technol. Manag., vol. 48, pp. 87–96, Apr. 2018.
[3] E. S. Spelke and K. D. Kinzler, “Core knowledge,” Dev. Sci., vol. 10, no. 1, pp. 89–96, Jan. 2007.
[4] Y. LeCun, “A Path Towards Autonomous Machine Intelligence Version 0.9.2, 2022-06-27”.
[5] D. Omeiza, H. Webb, M. Jirotka, and L. Kunze, “Explanations in Autonomous Driving: A Survey,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 10142–10162, Aug. 2022.
[6] C. Baier and J-P. Katoen, Principles of Model Checking. Cambridge, Mass.: The MIT Press, 2008.
Radu Chirila
Radu was part of the second cohort of Social AI CDT students. His research focused on designing an artificial smart e-Skin with embedded microactuators, able to convey expressions and emotions biomimetically. To achieve this, the stimuli behind human emotions and how these link to the various kinematic aspects of the human body needed to be understood first.
He came to the Social AI CDT with a diverse technical background. He had worked as a software engineer in the United States of America (focusing on developing low-level neural network nodes) and as an electronic systems consulting intern in the UK. His academic journey started with a Bachelor of Engineering (BEng) degree in Electronics and Electrical Engineering at the University of Glasgow. Joining the Social AI CDT was a natural step in his technical evolution, given the increasing capabilities of machine intelligence algorithms and the essential role social robotics plays in today’s world.
In 2024, Radu successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Soft eSkin with Embedded Microactuators
Research on tactile skin or e-skin has attracted significant interest recently as it is the key underpinning technology for safe physical interaction between humans and machines such as robots. Thus far, eSkin research has focussed on imitating some of the features of human touch sensing. However, skin is not just designed for feeling the real world; it is also a medium for expressing feeling through gestures. For example, the skin on the face, which can fold and wrinkle into specific patterns, allows us to express emotions such as varying degrees of happiness, sadness or anger. Yet this important role of skin has not received any attention so far. Here, for the first time, this project will explore the emotion signal generation capacities of skin by developing programmable soft e-skin patches with embedded micro actuators that will emulate real skin movements. Building on the flexible and soft electronics research in the James Watt School of Engineering and the social robotics research in the Institute of Neuroscience & Psychology, this project aims to achieve the following scientific and technological goals:
- Identify suitable actuation methods to generate simple emotive features such as wrinkles on the forehead
- Develop a soft eSkin patch with embedded microactuators
- Use dynamic facial expression models for specific movement patterns in the soft eSkin patch
- Develop an AI approach to program and control the actuators
Andreas Drakopoulos
Andreas was part of the first cohort of Social AI CDT students. His research was concerned with how humans perceive virtual and physical space, the simultaneous modelling of the two, and determining whether they are represented by different areas in the brain.
He came to the centre from a mathematical background: He completed a BSc and MSc in Mathematics at the Universities of Glasgow and Leeds respectively, gravitating towards pure mathematics. His undergraduate dissertation was on Stone duality, a connection between algebra and geometry expressed in the language of category theory; his master’s dissertation focused on the Curry-Howard correspondence, which is the observation that aspects of constructive logic harmonise with aspects of computation (e.g. proofs can be viewed as programs).
He developed his academic skills by studying abstract mathematics, and was excited about the opportunity to use them in an applied setting. He also particularly looked forward to being part of a group with diverse backgrounds and interests, something that drew him to the CDT in the first place.
In 2024, Andreas was awarded a PG Diploma in Computing Science and Psychology with Distinction.
Optimising Interactions with Virtual Environments
Virtual and Mixed Reality systems are socio-technical applications in which users experience different configurations of digital media and computation that give different senses of how a “virtual environment” relates to their local physical environment. In Human-Computer Interaction (HCI), we recently developed computational models capable of representing physical and virtual space, solving the problem of how to recognise virtual spatial regions starting from the detected physical position of the users [BEN16]. The models are bigraphs [MIL09] derived from the universal computational model introduced by Turing Award Laureate Robin Milner. Bigraphs encapsulate both dynamic and spatial behaviour of agents that interact and move among each other, or within each other. We used the models to investigate cognitive dissonance, namely the inability or difficulty to interact with the virtual environment.
How the brain represents physical versus virtual environments is also an issue very much debated within Psychology and Neuroscience, with some researchers arguing that the brain makes little distinction between the two [BOZ12]. Yet, more in line with Sevegnani’s work, Harvey and colleagues have shown that different brain areas represent these different environments and that they are processed on different time scales [HAR12; ROS09]. Moreover, special populations struggle more with virtual than with real environments [ROS11].
The overarching goal of this PhD project is, therefore, to adapt the computational models developed in HCI and apply them to psychological scenarios, to test whether the environmental processing within the brain is different as proposed. This information will then refine the HCI model and ideally allow a refined application to special populations.
References
[BEN16] Benford, S., Calder, M., Rodden, T., & Sevegnani, M., On lions, impala, and bigraphs: Modelling interactions in physical/virtual spaces. ACM Transactions on Computer-Human Interaction (TOCHI), 23(2), 9, 2016.
[BOZ12] Bozzacchi., C., Giusti, M.A., Pitzalis, S., Spinelli, D., & Di Russo, F., Similar Cerebral Motor Plans for Real and Virtual Actions. PLOS One (7910), e47783, 2012.
[HAR12] Harvey, M. and Rossit, S., Visuospatial neglect in action. Neuropsychologia, 50, 1018-1028, 2012.
[MIL09] Milner, R., The space and motion of communicating agents. Cambridge University Press, 2009.
[ROS11] Rossit, S., Malhotra, P., Muir, K., Reeves, I., Duncan G. and Harvey, M., The role of right temporal lobe structures in off-line action: evidence from lesion-behaviour mapping in stroke patients. Cerebral Cortex, 21 (12), 2751-2761, 2011.
[ROS09] Rossit, S., Malhotra, P., Muir, K., Reeves, I., Duncan, G., Livingstone, K., Jackson, H., Hogg, C., Castle, P., Learmonth, G. and Harvey, M., No neglect-specific deficits in reaching tasks. Cerebral Cortex, 19, 2616-2624, 2009.
Rhiannon Fyfe
Rhiannon was part of the first cohort of Social AI CDT students. Their MA is in English Language and Linguistics from the University of Glasgow. Their area of research was the further development of socially intelligent robots, with a hope to improve Human-Robot Interaction through the use of theory and methods from socially informed linguistics, and through the deployment in a real-world context of MuMMER (a humanoid robot based on SoftBank Robotics’ Pepper robot). During their undergraduate degree, their research interests included the ways in which speech is produced and understood in practice, which social factors have an effect on speech, which conversational rules are applied in different social situations, what causes breakdowns in communication, and how such breakdowns can be avoided. Their dissertation was titled “Are There New Emerging Basic Colour Terms in British English? A Statistical Analysis”, a study of how the semantic space of colour is divided linguistically by speakers of different social backgrounds. The prospect of developing helpful and entertaining robots that could be used to aid child language development, the elderly and the general public drew them to the Social AI CDT.
In 2022, Rhiannon was awarded a PG Diploma in Computing Science and Psychology with Merit.
Evaluating and Enhancing Human-Robot Interaction for Multiple Diverse Users in a Real-World Context
The increasing availability of socially-intelligent robots with functionality for a range of purposes, from guidance in museums [Geh15] to companionship for the elderly [Heb16], has motivated a growing number of studies attempting to evaluate and enhance Human-Robot Interaction (HRI). But, as Honig and Oron-Gilad’s review of recent work on understanding and resolving failures in HRI observes [Hon18], most research has focussed on technical ways of improving robot reliability. They argue that progress requires a “holistic approach” in which “[t]he technical knowledge of hardware and software must be integrated with cognitive aspects of information processing, psychological knowledge of interaction dynamics, and domain-specific knowledge of the user, the robot, the target application, and the environment” (p.16). Honig and Oron-Gilad point to a particular need to improve the ecological validity of evaluating user communication in HRI, by moving away from experimental, single-person environments, with low-relevance tasks, mainly with younger adult users, to more natural settings, with users of different social profiles and communication strategies, where the outcome of successful HRI matters.
The main contribution of this PhD project is to develop an interdisciplinary approach to evaluating and enhancing communication efficacy of HRI, by combining state-of-the-art social robotics with theory and methods from socially-informed linguistics [Cou14] and conversation analysis [Cli16]. Specifically, the project aims to improve HRI with the newly-developed MultiModal Mall Entertainment Robot (MuMMER). MuMMER is a humanoid robot, based on the SoftBank Robotics’ Pepper robot, which has been designed to interact naturally and autonomously in the communicatively-challenging space of a public shopping centre/mall with unlimited possible users of differing social backgrounds and communication styles [Fos16]. MuMMER’s role is to entertain and engage visitors to the shopping mall, thereby enhancing their overall experience in the mall. This in turn requires ensuring successful HRI which is socially acceptable, helpful and entertaining for multiple, diverse users in a real-world context. As of June 2019, the technical development of the MuMMER system has been nearly completed, and the final robot system will be located for 3 months in a shopping mall in Finland during the autumn of 2019.
The PhD project will evaluate HRI with MuMMER in a new context: a large shopping mall in an English-speaking setting, in Scotland’s largest and most socially and ethnically diverse city, Glasgow. Project objectives are to:
- Design a set of sociolinguistically-informed observational studies of HRI with MuMMER in situ with users from a range of social, ethnic, and language backgrounds, using direct and indirect methods
- Identify the minimal technical modifications (dialogue, non-verbal, other) needed to optimise HRI, and thereby user experience and engagement, also considering indices such as consumer footfall to the mall
- Implement technical alterations, and re-evaluate with new users.
References
[Cli16] Clift, R. (2016). Conversation Analysis. Cambridge: Cambridge University Press.
[Cou14] Coupland, N., Sarangi, S., & Candlin, C. N. (2014). Sociolinguistics and social theory. Routledge.
[Fos16] Foster M.E., Alami, R., Gestranius, O., Lemon, O., Niemela, M., Odobez, J-M., Pandey, A.M. (2016) The MuMMER Project: Engaging Human-Robot Interaction in Real-World Public Spaces. In: Agah A., Cabibihan J., Howard A., Salichs M., He H. (eds) Social Robotics. ICSR 2016. Lecture Notes in Computer Science, vol 9979. Springer, Cham
[Geh15] Gehle, R., Pitsch, K., Dankert, T., & Wrede, S. (2015). Trouble-based group dynamics in real-world HRI – Reactions on unexpected next moves of a museum guide robot. In 24th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2015 (Kobe), 407–412.
[Heb16] Hebesberger, D., Dondrup, C., Koertner, T., Gisinger, C., & Pripfl, J. (2016). Lessons learned from the deployment of a long-term autonomous robot as companion in physical therapy for older adults with dementia: A mixed methods study. In: The Eleventh ACM/IEEE International Conference on Human Robot Interaction, 27–34.
[Hon18] Honig, S., & Oron-Gilad, T. (2018). Understanding and Resolving Failures in Human-Robot Interaction: Literature Review and Model Development. Frontiers in Psychology, 9, 861.
Thomas Goodge
Thomas was part of the second cohort of Social AI CDT students. His PhD research looked at Human-Car interactions in the context of autonomous vehicles, with a focus on the point of handover of control between the driver and the car.
He started studying Psychology at the University of Nottingham, with an interest in visual perception, decision making and interaction with technology. He then worked as a Research Assistant with the Transport Research in Psychology group at Nottingham Trent University. Here, he was involved in various projects looking at hazard perception across different presentation formats and in different vehicle types. The focus was on understanding the strategies drivers use to decide what a hazard is and the risk associated with it, and then on developing training interventions to try to impart these skills to new drivers. During this time, he also studied for an MSc in Information Security with Royal Holloway, University of London, which looked at the decisions and attitudes people form about their personal data depending on the environment.
He was excited to be joining the Social AI CDT cohort and to be working with a diverse group of academics across Computer Science and Psychology. He was particularly looking forward to developing the work he had been conducting over previous years as well as learning how incorporating artificial agents for drivers to engage with can further assist and improve driver safety.
In 2025, Thomas successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Investigating the Effects of Augmented Reality Cues during Non-Driving Related Activities on the Situational Awareness of Drivers
The aim of this project is to investigate, in the context of social interactions, the interaction between a driver and an autonomous vehicle. Autonomous cars are sophisticated agents that can handle many driving tasks. However, they may have to hand control to the human driver in different circumstances, for example if sensors fail or weather conditions are bad [MCA16, BUR19]. This is potentially difficult for the driver, as they may not have been driving the car for a long period and have to quickly take control [POL15]. This is an important issue for car companies as they want to add more automation to vehicles in a safe manner. Key to this problem is whether this interface would benefit from conceptualizing the exchange between human and car as a social interaction.
This project will study how best to handle handovers, from the car indicating to the driver that it is time to take over, through the takeover event, to the return to automated driving. The key factors to investigate are: situational awareness (the driver needs to know what the problem is and what must be done when they take over), responsibility (whose task it is to drive at which point), the in-car context (what is the driver doing: are they asleep, talking to another passenger), and driver skills (is the driver competent to drive or are they under the influence).
We will conduct a range of experiments in our driving simulator to test different types of handover situations and different types of multimodal interactions involving social cues to support the 4 factors outlined above.
The output will be experimental results and guidelines that can help automotive designers know how best to communicate and deal with handover situations between car and driver. We currently work with companies such as Jaguar Land Rover and Bosch, and our results will have direct application in their products.
References
[MCA16] McCall, R., McGee, F., Meschtscherjakov, A. and Engel, T., Towards a Taxonomy of Autonomous Vehicle Handover Situations. In Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 193–200, 2016.
[BUR19] Burnett, G., Large, D. R. & Salanitri, D., How will drivers interact with vehicles of the future? Royal Automobile Club Foundation for Motoring Report, 2019.
[POL15] Politis, I., Pollick, F. and Brewster, S., Language-based multimodal displays for the handover of control in autonomous cars. In Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 3–10, 2015.
Eleanor Gorton
Ellie was part of the fifth cohort of Social AI CDT students. Her background is primarily in Cognitive Neuroscience and Psychology, for which she received an MSci from the University of Manchester. She has also completed an MSc in Robotics and Autonomous Systems at the University of Sussex. Her previous research topics were quite different; however, both involved interdisciplinary concepts and solutions. For her MSci, she investigated links between medically unexplained symptoms, tactile sensory perception and mental health, combining ideas from both neuroscience and psychology. For her MSc, she used her neuroscience knowledge of human sensory systems to improve the touch accuracy of a vision-based tactile robotic sensor.
Her research interests are programming for robotics, mental health, emotions and personality psychology. These coincided well with her PhD project. Her project was focused on developing a robot that has improved social intelligence, with the ultimate aim to use this ability to improve people’s health and wellbeing. The Social AI CDT was the ideal programme on which to conduct this interdisciplinary research, and she was very excited to progress further with her project.
In 2025, Ellie was awarded a PG Diploma in Computing Science and Psychology with Merit.
Developing Robots’ Social Behaviour for Human-Robot Interaction and Personality/Health Assessment
As robots become more prevalent in our daily lives, there is an increasing need for them to be able to interact with humans in a social and intuitive manner. Even though robot and AI technologies have rapidly developed in recent years, the ability of robots to interact with humans in an intuitive and social way is still quite limited. An important challenge is to determine how to design robots that can perceive the user’s needs, feelings and intentions, and adapt to users over a broad range of cognitive abilities. In particular, we need to understand how the robot could react to human partners showing different behaviours, and how human-robot interactive behaviours might reveal novel aspects of human personality and health.
In this project, the student will develop robot control and machine learning methods on the UR3e robotic arm testbed (with haptic sensors) to autonomously control a robot to physically interact with human collaborators (clap hands, jointly move an object, etc.) showing different behaviours. The student will use machine learning methods to extract behaviour patterns and then design robot control methods (PID and continuous learning methods) to control the robotic arm to move with specific behaviour patterns. Furthermore, we will study how to use robot and joint human-robot behaviours to elicit and identify aspects of human personality and health.
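For illustration only, the PID part of such a controller could be sketched as below; the gains, sampling interval and joint model are hypothetical stand-ins rather than the UR3e setup used in the project:

class PID:
    """Discrete-time PID controller: u = Kp*e + Ki*sum(e)*dt + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: drive one (simulated) joint angle toward a target of 1.2 rad
controller = PID(kp=2.0, ki=0.1, kd=0.05, dt=0.01)
angle = 0.0
for _ in range(500):
    command = controller.update(setpoint=1.2, measurement=angle)
    angle += command * 0.01  # crude stand-in for the joint's response

In the project itself, the commanded trajectories would additionally be shaped by the behaviour patterns extracted with machine learning, as described above.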
The project’s impact will be broad, affecting health, neuroscience, social robotics and fundamental robot research. The output of the project can be published in various journals and conferences, such as the IEEE International Conference on Robotics and Automation (ICRA), the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), the ACM/IEEE International Conference on Human-Robot Interaction, Science Robotics, the Journal of Neural Engineering, and Frontiers in Psychology.
Casper Hyllested
Casper was part of the second cohort of Social AI CDT students. His research focused primarily on understanding, deconstructing, and measuring individual differences in people and data. The primary aim was to use this understanding to create both generalizable and individualized user models, allowing virtual agents to adapt to the dynamic state and trait profiles that can influence a person’s behaviors and responses.
During his psychology undergraduate at the University of Glasgow, he was particularly interested in the patterns which could be attributed to the plethora of individual differences that underlie most research in psychology. He specialized largely in reliability in research and generalizability theory, and his dissertation was predominantly focused on how responses could vary dynamically over time. Since then he began to explore the versatility of situated measures, having participants respond to a set of situations rather than generalized items on a questionnaire, once more with the same focus of uncovering the variance in data that forms the composition of any individual or group-wide responses.
Exploring both situational and individual differences, not to mention the unlimited pool of potential facets and confounds that amalgamate to generate just a single response or behavior, was a daunting task. Implementing programming and virtual agents can simultaneously allow for easier collection and analysis of the required data. In turn, categorizing underlying personal mechanics may allow virtual agents to better tailor themselves to, and understand, the individuals they encounter. It still required copious amounts of time and expertise from a multitude of different fields, which is why he was particularly looking forward to collaborating in a cohort of people with very different backgrounds in a highly interdisciplinary setting.
In 2024, Casper successfully completed a PhD with Integrated Study in Computing Science and Psychology.
A framework for establishing situated and generalizable models of users in intelligent virtual agents
Aims
Increasing research suggests that intelligent virtual agents are most effective and accepted when they adapt themselves to individual users. One way virtual agents can adapt to different individuals is by developing an effective model of a user’s traits and using it to anticipate dynamically varying states of these traits as situational conditions vary. The primary aims of the current work are to develop: (1) empirical methods for collecting data to build user models, (2) computational procedures for building models from these data, (3) computational procedures for adapting these models to current situations. Although the project’s primary goal is to develop a general framework for building user models, we will also explore preliminary implementations in digital interfaces.
Novel Elements
One standard approach to building a model of a user’s traits, Classical Test Theory, uses a coherent inventory of measurement items to assess a specific trait of interest (e.g., stress, conscientiousness, neuroticism). Typically, these items measure a trait explicitly via a self-report instrument or passively via a digital device. Well-known limitations of this approach include its inability to assess the generalizability of a model across situations and occasions, and its failure to incorporate specific situations into model development. In this project, we expand upon the classic approach by incorporating two new perspectives: (1) Generalizability Theory, (2) the Situated Assessment Method. Generalizability Theory will establish a general user model that varies across multiple facets, including individuals, measurement items, situations, and occasions. The Situated Assessment Method replaces standard unsituated assessment items with situations, fundamentally changing the character of assessment.
Approach
We will develop a general framework for collecting empirical data that enables building user models across many potential domains, including stress, personality, social connectedness, wellbeing, mindfulness, eating, daily habits, etc. The data collected (both explicit self-report and passive digital) will assess traits (and states) relevant to a domain across facets for individuals, measurement items, situations, and occasions. These data will be analysed with Generalizability Theory and the Situated Assessment Method to build user models and establish their variance profiles. Of particular interest will be how well user models generalize across facets, the magnitude of individual differences, and clusters of individuals sharing similar models. Situated and unsituated models will both be assessed to establish their relative strengths, weaknesses, and external validity. Once models are built, their ability to predict a user’s states on particular occasions will be assessed, using procedures from Generalizability Theory, the Situated Assessment Method, and autoregression. Prediction error will be assessed to establish optimal model-building methods. App prototypes will be developed and explored.
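To make the Generalizability Theory component concrete, the following is a minimal sketch (with simulated, hypothetical data and only two facets) of estimating variance components and a generalizability coefficient for a fully crossed persons-by-situations design with one observation per cell; the project’s actual analyses would involve more facets and real assessment data:

import numpy as np

def g_study(scores):
    """Variance components for a persons x situations design (one score per cell)."""
    n_p, n_s = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    situation_means = scores.mean(axis=0)

    ms_p = n_s * ((person_means - grand) ** 2).sum() / (n_p - 1)
    ms_s = n_p * ((situation_means - grand) ** 2).sum() / (n_s - 1)
    resid = scores - person_means[:, None] - situation_means[None, :] + grand
    ms_ps = (resid ** 2).sum() / ((n_p - 1) * (n_s - 1))

    var_p = max((ms_p - ms_ps) / n_s, 0.0)   # persons (object of measurement)
    var_s = max((ms_s - ms_ps) / n_p, 0.0)   # situations facet
    var_ps = ms_ps                           # person-by-situation interaction + error
    g_coef = var_p / (var_p + var_ps / n_s)  # relative generalizability coefficient
    return var_p, var_s, var_ps, g_coef

scores = np.random.default_rng(0).normal(size=(30, 12))  # 30 people x 12 situations
print(g_study(scores))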
Outputs and Impact
Generally, this work will increase our ability to construct and understand user models that virtual agents can employ. Specifically, we will develop novel methods that: (1) collect data for building user models; (2) assess the generalizability of models; (3) generate state-level inferences in specific situations. Besides being relevant for the development of intelligent social agents, this work will contribute to deeper understanding of classic assessment instruments and to alternative situated measurement approaches across multiple scientific domains. More practically, the framework, methods, and app prototypes we develop are of potential use to clinicians and individuals interested in understanding both functional and dysfunctional health behaviours.
Elizabeth Jacobs
Elizabeth was part of the third cohort of Social AI CDT students. Her interest in human behaviour and emotion science was cultivated during her time at York, where she was enrolled for her BSc in Psychology. In her final year, she specialised and designed her dissertation project around the Dangerous Decisions theory and first impressions of faces. She also conducted a literature survey on whether the modulation of human emotions is possible via artificial intelligence. Both of these independent research avenues honed her interest in the relationship between human behaviour and artificial intelligence, especially in relation to emotions, micro-expressions and body language.
Her PhD project used Magnetic Resonance Imaging and EEG neurofeedback to build cognitive models that explain the modulation of brain activity in regions associated with emotion. The aim was to build data-driven cognitive models of real-time brain network interaction during emotional modulation via these neurofeedback techniques, which would open up new avenues for the current field of wearable EEG technology.
She was enthused for this opportunity within the Social AI CDT, as the combination of psychology and the computer science behind it really caught her eye. All in all, she hoped that her contribution would bring to light novel perspectives and information on the relationship between how the brain functions and possible translations into artificial intelligence.
In 2025, Elizabeth was awarded an MSc in Computing Science and Psychology with Distinction.
Modulating Cognitive Models of Emotional Intelligence
State-of-the-art artificial intelligence (AI) systems mimic how the brain processes information to achieve unprecedented accuracy and performance in tasks such as object/face recognition and text/speech translation. However, one key characteristic that defines human success is emotional intelligence. Empathy, the ability to understand other people’s feelings and emotionally reflect upon them, shapes social interaction and is important in both personal and professional success. Although some progress has been achieved in developing systems that detect emotions based on facial expressions and physiological data, developing a way of relating to and reflecting upon those emotions is far more challenging. Therefore, understanding how empathy and emotional responses emerge via complex information processing between key brain regions is of paramount importance for developing emotionally-aware AI agents.
In this project, we will exploit real-time functional Magnetic Resonance Imaging (fMRI) neurofeedback techniques to build cognitive models that explain modulation of brain activity in key regions related to empathy and emotions. For example, the anterior insula is a brain region located in deep gray matter that has been consistently implicated in empathy/emotional responses and in the abnormal emotional processing observed in several disorders, such as Autism Spectrum Disorder and misophonia. Neurofeedback has shown promising results in regulating the activity of the anterior insula and could enable therapeutic training techniques (Kanel et al. 2019).
This approach will extract how brain regions interact during neuromodulation and allow cognitive models to emerge in real time. Subsequently, to allow training in more naturalistic environments, we suggest cross-domain learning between fMRI and EEG. The motivation is that, whereas fMRI is the gold-standard imaging technique for deep gray matter structures, it is limited by its lack of portability, discomfort in use and low temporal resolution (Deligianni et al. 2014). On the other hand, advances in wearable EEG technology show promising results in the use of the device beyond well-controlled lab experiments. Toward this end, advanced machine learning algorithms based on representation learning and domain generalisation will be developed. Domain/model generalisation in deep learning aims to learn generalised features and extract representations in an ‘unseen’ target domain by eliminating bias observed via multiple source domain data (Volpi et al. 2018).
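As a purely illustrative baseline for the domain-generalisation idea (not the project’s method), a shared encoder can be trained on pooled data from several source domains, for example different recording sessions or participants, so that the learned representation is not tied to any single source; all dimensions, labels and names below are hypothetical:

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU())  # shared feature extractor
classifier = nn.Linear(32, 2)                          # e.g. high vs. low emotional arousal
optimiser = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical source domains: (features, labels) pairs from different sessions/subjects
domains = [(torch.randn(100, 64), torch.randint(0, 2, (100,))) for _ in range(3)]

for epoch in range(10):
    total = 0.0
    for x, y in domains:               # pool the loss across all source domains
        total = total + loss_fn(classifier(encoder(x)), y)
    optimiser.zero_grad()
    total.backward()
    optimiser.step()

More sophisticated domain-generalisation objectives replace this simple pooled loss with penalties that explicitly discourage source-specific features.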
Summarising, the overall aims of the project are:
- To build data-driven cognitive models of real-time brain network interaction during emotional modulation via neurofeedback techniques.
- To develop advanced machine learning algorithms to perform cross-domain learning between fMRI and EEG.
- To develop intelligent artificial agents based on portable EEG systems to successfully regulate emotional responses, taking into account cognitive models derived in the fMRI scanner.
Salman Mohammadi
Salman was part of the first cohort of Social AI CDT students. In 2020, Salman was awarded an MSc in Computing Science and Psychology with Distinction.
Enhancing Social Interactions via Physiologically-Informed AI
Over the past few years major developments in machine learning (ML) have enabled important advancements in artificial intelligence (AI). Firstly, the field of deep learning (DL), which has enabled models to learn complex input-output functions (e.g. pixels in an image mapped onto object categories), has emerged as a major player in this area. DL builds upon neural network theory and design architectures, expanding these in ways that enable more complex function approximations.
The second major advance in ML has combined advances in DL with reinforcement learning (RL) to enable new AI systems for learning state-action policies – in what is often referred to as deep reinforcement learning (DRL) – to enhance human performance in complex tasks. Despite these advancements, however, critical challenges still exist in incorporating AI into a team with human(s).
One of the most important challenges is the need to understand how humans value intermediate decisions (i.e. before they generate a behaviour) through internal models of their confidence, expected reward, risk etc. Critically, such information about human decision-making is not only expressed through overt behaviour, such as speech or action, but more subtly through physiological changes and small changes in facial expression, posture, etc. Socially and emotionally intelligent people are excellent at picking up on this information to infer the current disposition of one another and to guide their decisions and social interactions.
In this project, we propose to develop a physiologically-informed AI platform, utilizing neural and systemic physiological information (e.g. arousal, stress) (Fouragnan et al., 2015; Pisauro et al., 2017; Gherman & Philiastides, 2018) together with affective cues from facial features (Vinciarelli, Pantic & Bourlard, 2009; Baltrušaitis, Robinson & Morency, 2016) to infer latent cognitive and emotional states from humans interacting in a series of social decision-making tasks (e.g. trust game, prisoner’s dilemma etc). Specifically, we will use these latent states to generate rich reinforcement signals to train AI agents (specifically DRL) and allow them to develop a “theory of mind” (Premack & Woodruff 1978; Frith & Frith 2005) in order to make predictions about upcoming human behaviour. The ultimate goal of this project is to deliver advancements towards “closing-the-loop”, whereby the AI agent feeds-back its own predictions to the human players in order to optimise behaviour and social interactions.
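As a minimal illustration of how physiological information could enrich a reinforcement signal, the sketch below augments a standard tabular Q-learning update with a scalar inferred from physiological features. The inference function is a placeholder and the shaping weight is an assumption; it is not the proposed platform itself.

```python
# Hypothetical sketch of "physiologically-informed" reward shaping in a
# temporal-difference update. The latent-state inference is a placeholder.
import numpy as np

def infer_latent_state(physio_features):
    # Placeholder: in the project this would be a model decoding confidence,
    # arousal or stress from EEG, systemic physiology and facial cues.
    return float(np.tanh(physio_features.mean()))

def q_update(Q, s, a, r, s_next, physio_features, alpha=0.1, gamma=0.95, beta=0.2):
    shaped_r = r + beta * infer_latent_state(physio_features)  # enriched reward signal
    td_target = shaped_r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])                   # prediction-error update
    return Q
```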
Emily O’Hara
Emily was part of the first cohort of Social AI CDT students. Her doctoral research focused on the social perception of speech, paying particular attention to how the usage of fillers affects the percepts of speaker personality. Within the frames of artificial intelligence, the project aimed to improve the functionality and naturalness of artificial voices. Her research interests during her undergraduate degree in English Language and Linguistics included sociolinguistics, natural language processing, and psycholinguistics. Her dissertation was entitled “Masked Degrees of Facilitation: Can They be Found for Phonological Features in Visual Word Recognition?” and was a psycholinguistic study of how the phonological elements of words are stored in the brain and accessed during reading. The opportunity to integrate her knowledge of linguistic methods and theory with computer science was what attracted her to the CDT, and she looked forward to undertaking research that can aid in the creation of more seamless user-AI communication.
In 2024, Emily successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Social Perception of Speech
Short vocalizations like “ehm” and “uhm”, known as fillers in linguistic terminology, are common in everyday conversations (up to one every 10.9 seconds, according to the analysis presented in [Vin15]). For this reason, it is important to understand whether the fillers uttered by a person convey personality impressions, i.e., whether people form different opinions about a given individual depending on how she/he utters them. This project will use an existing corpus of 2988 fillers (uttered by 120 persons interacting with one another) to achieve the following scientific and technological goals:
- To establish the vocal parameters that lead to consistent percepts of speaker personality both within and across listeners and the neural areas involved in these attributions from brief fillers.
- To develop an AI approach aimed at predicting the traits people attribute to an individual when they hear her/his fillers.
The first goal will be achieved through behavioural [Mah18] and neuroimaging experiments [Tod08] that pinpoint how and where in the brain stable personality percepts are processed. From there, acoustical analysis and data-driven approaches using cutting-edge acoustical morphing techniques will allow for the generation of hypotheses feeding subsequent AI networks [McA14]. This part of the project will allow the development of the skills necessary to design, implement, and analyse behavioural and neural experiments for establishing social percepts from speech and voice.
The final goal will be achieved through the development of an end-to-end automatic approach that can map the speech signal underlying a filler onto the traits that listeners attribute to the speaker. This will allow the development of the skills necessary to design and implement deep neural networks capable of modelling sequences of physical measurements (with an emphasis on speech signals).
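A minimal sketch of such an end-to-end mapping is given below: a small recurrent network consumes a sequence of per-frame acoustic features for one filler and outputs trait scores. The feature and trait dimensionalities are assumptions for illustration only, not the project's architecture.

```python
# Hypothetical sketch: map a filler's frame-level acoustic features to
# listener-attributed trait scores. Sizes (40 coefficients, 5 traits) assumed.
import torch
import torch.nn as nn

class FillerToTraits(nn.Module):
    def __init__(self, n_features=40, hidden=64, n_traits=5):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_traits)

    def forward(self, x):                # x: (batch, frames, n_features)
        _, h = self.rnn(x)               # h: (1, batch, hidden)
        return self.head(h.squeeze(0))   # trait scores: (batch, n_traits)

model = FillerToTraits()
dummy = torch.randn(8, 120, 40)          # 8 fillers, 120 frames each
print(model(dummy).shape)                # torch.Size([8, 5])
```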
The project is relevant to the emerging domain called personality computing [Vin14] and the main application related to this project is the synthesis of “personality colored” speech, i.e., artificial voices that can give the impression of a personality and sound not only more realistic, but also better at performing the task they are developed for [Nas05].
References
[Mah18]. G. Mahrholz, P. Belin and P. McAleer, “Judgements of a speaker’s personality are correlated across differing content and stimulus type”, PLOS ONE, 13(10): e0204991. 2018
[McA14]. P. McAleer, A. Todorov and P. Belin, “How Do You Say ‘Hello’? Personality Impressions from Brief Novel Voices”, PLOS ONE, 9(3): e90779. 2014
[Tod08]. A. Todorov, S. G. Baron and N. N. Oosterhof, “Evaluating face trustworthiness: a model-based approach”, Social Cognitive and Affective Neuroscience, 3(2), pp. 119-127. 2008
[Vin15] A.Vinciarelli, E.Chatziioannou and A.Esposito, “When the Words are not Everything: The Use of Laughter, Fillers, Back-Channel, Silence and Overlapping Speech in Phone Calls“, Frontiers in Information and Communication Technology, 2:4, 2015.
[Vin14] A.Vinciarelli and G.Mohammadi, “A Survey of Personality Computing“, IEEE Transactions on Affective Computing, Vol. 5, no. 3, pp. 273-291, 2014.
[Nas05] C.Nass, S.Brave, “Wired for speech: How voice activates and advances the human-computer relationship”, MIT Press, 2005.
Gordon Rennie
Gordon was part of the second cohort of Social AI CDT students. Before starting at the University of Glasgow he completed an undergraduate degree in Psychology and then an MSc in Human Robot Interaction at Heriot-Watt University. Initially he was drawn to psychology because of the sheer number of unknowns in the science and the possibility of discovering new knowledge, while also being aware that it could have a real impact on improving people’s lives. His MSc continued this by taking psychological knowledge and attempting to apply it to real-world computing technologies targeted at improving people’s lives. There he began working with Conversational Agents, computer programs that attempt to converse with users in natural language.
His MSc project took one such agent, Alana, created by Heriot-Watt’s Interaction Lab, and attempted to enable it to speak with multiple users at once. This was one improvement to the agent, building on top of the brilliant work done on it previously, and it gave him insight into how difficult it is to improve such systems.
The PhD he completed via the Social AI CDT offered him the chance to continue this vein of research by working on other areas where current conversational agents fail, specifically the understanding of conversational occurrences such as ‘uh’, ‘ah’, and laughter. He finds conversational agents a particularly exciting area of research because of the future they promise. Imagine: a computer stops working for some unknowable reason – a common occurrence for even the most technically literate. Imagine also that you could ask it why, and how to fix the issue, in plain English, without navigating a myriad of menus. That is the dream of voice interaction: every user being able to interact with computers in the most natural way possible.
In 2024, Gordon successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Deep Learning for Automatic Laughter Detection in Spontaneous Conversations
According to Emmanuel Schegloff, one of the most important linguists of the 20th century, conversation is the “primordial site of human sociality”, the setting that has shaped human communicative skills from neural processes to expressive abilities [TUR16]. This project focuses on the latter and, in particular, on the use of nonverbal behavioural cues such as laughter, pauses, fillers and interruptions during dyadic interactions. Specifically, the project targets the following main goals:
- To develop approaches for the automatic detection of laughter, pauses, fillers, overlapping speech and back-channel events in speech signals;
- To analyse the interplay between the cues above and social-psychological phenomena such as emotions, agreement/disagreement, negotiation, personality, etc.
The experiments will be performed over two existing corpora. One includes roughly 12 hours of spontaneous conversations involving 120 persons [VIN15], fully annotated in terms of the cues and phenomena above. The other is the Russian Acted Multimodal Affective Set (RAMAS) − the first multimodal corpus in the Russian language − including approximately 7 hours of high-quality close-up video recordings of faces, speech, motion-capture data and physiological signals such as electro-dermal activity and photoplethysmogram [PER18].
The main motivation behind the focus on nonverbal behavioural cues is that, while these tend to be used differently in different cultural contexts, they can still be detected independently of the language being used. In this respect, an approach based on nonverbal communication promises to be more robust when applied to data collected in different countries and linguistic areas. In addition, while the importance of nonverbal communication is widely recognised in social psychology, the way certain cues interplay with social and psychological phenomena still requires full investigation [VIN19].
From a methodological point of view, the project involves the following main aspects:
- Development of corpus analysis methodologies (observational statistics) for the investigation of the relationships between nonverbal behaviour and social phenomena;
- Development of signal processing methodologies for the conversion of speech signals into measurements suitable for computer processing;
- Development of Artificial Intelligence techniques (mainly based on deep networks) for the inference of information from raw speech signals (a minimal sketch follows this list).
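The sketch below illustrates the kind of deep-network component meant in the last item: a small convolutional network producing per-frame logits for nonverbal events from a spectrogram. The event classes and input dimensions are assumptions, not the project's final design.

```python
# Hypothetical sketch of frame-level detection of nonverbal events
# (laughter, fillers, overlap, back-channel, none) from a mel spectrogram.
import torch
import torch.nn as nn

class EventDetector(nn.Module):
    def __init__(self, n_mels=64, n_events=5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.head = nn.Conv1d(128, n_events, kernel_size=1)  # per-frame logits

    def forward(self, mel):               # mel: (batch, n_mels, frames)
        return self.head(self.conv(mel))  # (batch, n_events, frames)

detector = EventDetector()
logits = detector(torch.randn(4, 64, 500))  # 4 clips, 500 frames each
print(logits.shape)                         # torch.Size([4, 5, 500])
```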
From a scientific point of view, the impact of the project will be mainly in Affective Computing and Social Signal Processing [VIN09] while, from an industrial point of view, the impact will be mainly in the areas of Conversational Interfaces (e.g., Alexa and Siri), multimedia content analysis and, in more general terms, Social AI, the application domain encompassing all attempts to make machines capable of interacting with people as people do with one another. For this reason, the project is based on a collaboration between the University of Glasgow and Neurodata Lab, one of the top companies in Social and Emotion AI.
References
[PER18] Perepelkina O., Kazimirova E., Konstantinova M. RAMAS: Russian Multimodal Corpus of Dyadic Interaction for Affective Computing. In: Karpov A., Jokisch O., Potapova R. (eds) Speech and Computer. Lecture Notes in Computer Science, vol 11096. Springer, 2018.
[TUR16] S.Turkle, “Reclaiming conversation: The power of talk in a digital age”, Penguin, 2016.
[VIN19] M.Tayarani, A.Esposito and A.Vinciarelli, “What an ‘Ehm’ Leaks About You: Mapping Fillers into Personality Traits with Quantum Evolutionary Feature Selection Algorithms”, IEEE Transactions on Affective Computing, to appear, 2019.
[VIN15] A.Vinciarelli, E.Chatziioannou and A.Esposito, “When the Words are not Everything: The Use of Laughter, Fillers, Back-Channel, Silence and Overlapping Speech in Phone Calls“, Frontiers in Information and Communication Technology, 2:4, 2015.
[VIN09] A.Vinciarelli, M.Pantic, and H.Bourlard, “Social Signal Processing: Survey of an Emerging Domain“, Image and Vision Computing Journal, Vol. 27, no. 12, pp. 1743-1759, 2009.
Mary Roth
Mary was part of the first cohort of Social AI CDT students. She was a recent Psychology graduate from the University of Strathclyde, Glasgow. To her, conducting research had always been the most interesting part of her degree. She found that people and minds are the most complex and fascinating phenomena one could study, and throughout her degree she had been very passionate about learning more about the mechanisms underlying our cognition, emotion, and behaviour.
Grounded in her dissertation work, her research interests included the psychology of biases, heuristics, and automatic processing. In this PhD programme she worked on the project “Robust, Efficient, Dynamic Theory of Mind” with Stacy Marsella and Lawrence Barsalou.
Being part of the Social AI CDT programme, she looked forward to contributing to the emerging interdisciplinary junction between psychology and computer science. Coming from a psychological background, she was excited to apply psychological research to the development of more efficient and dynamic models of social situations.
In 2024, Mary successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Robust, Efficient, Dynamic Theory of Mind
Background
The ability to function effectively in social settings is a critical human skill, and providing such skills to artificial agents is a core challenge for these technologies. The aim of this work is to improve the social skills of artificial agents, making them more robust, by giving them a skill that is fundamental to effective human social interaction: the ability to possess and use beliefs about the mental processes and states of others, commonly called Theory of Mind (ToM) [Whiten, 1991]. Theory of Mind skills are predictive of social cooperation and collective intelligence, as well as key to cognitive empathy, emotional intelligence, and the use of shared mental models in teamwork. Although people typically develop ToM at an early age, research has shown that even adults with a fully formed capability for ToM are limited in their capacity to employ it (Keysar, Lin, & Barr, 2003; Lin, Keysar, & Epley, 2010).
From a computational perspective, there are sound explanations as to why this may be the case. As critical as they are, forming, maintaining and using models of others in decision making can be computationally intractable. Pynadath & Marsella [2007] presented an approach, called minimal mental models, that sought to reduce these costs by exploiting criteria such as prediction accuracy and utility costs associated with prediction errors as a way to limit model complexity. There is a clear relation of that work to the work in psychology on ad hoc categories formed in order to achieve goals [Barsalou, 1983], as well as ideas on motivated inference [Kunda, 1990].
Approach
This effort seeks to develop more robust artificial agents with ToM using an approach that collects data on human ToM performance, analyses the data and then constructs a computational model based on the analyses. The resulting model will provide artificial agents with a robust, efficient capacity to reason about others.
a) Study the nature of mental model formation and adaptation in people during social interaction – specifically, how one’s own goals, as well as the other’s goals, influence and make tractable the model formation and use process.
b) Develop a tractable computational model of this process that takes into account the artificial agent’s and the human’s goals, as well as their models of each other, in an interaction. Tractability is, of course, fundamental in face-to-face social interactions, where agents must respond rapidly.
c) Evaluate the model in artificial agent – human interactions.
We see this work as fundamental to taking embodied social agents beyond their limited, inflexible approaches to interacting socially with us to a significantly more robust capacity. Key to that will be making theory of mind reasoning in artificial agents more tractable via taking into account both the agent’s goals and the human’s goals in the interaction.
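As a toy illustration of what a minimal mental model might look like computationally, the sketch below maintains a belief over a deliberately small set of candidate goals for the other agent and updates it with Bayes’ rule from observed actions. The goals, actions and likelihood values are hypothetical placeholders; restricting the hypothesis set is simply one way to keep such reasoning tractable.

```python
# Hypothetical sketch of a "minimal" model of another agent's goal,
# updated by Bayesian inference from observed actions.
import numpy as np

GOALS = ["help", "compete", "ignore"]            # deliberately small hypothesis set
belief = np.ones(len(GOALS)) / len(GOALS)        # uniform prior

def likelihood(action, goal):
    # Placeholder action model: P(action | goal). In practice this would
    # come from a policy model of the other agent.
    table = {"share": {"help": 0.7, "compete": 0.1, "ignore": 0.2},
             "hoard": {"help": 0.1, "compete": 0.7, "ignore": 0.2}}
    return table[action][goal]

def update_belief(belief, action):
    post = belief * np.array([likelihood(action, g) for g in GOALS])
    return post / post.sum()

for obs in ["share", "share", "hoard"]:
    belief = update_belief(belief, obs)
print(dict(zip(GOALS, belief.round(3))))
```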
References
[Kun90] Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108(3), 480-498.
[Bar83] Barsalou, L. W. (1983). Ad hoc categories. Memory & Cognition, 11, 211-227.
[Key03] Keysar, B., Lin, S., & Barr, D. (2003). Limits on theory of mind use in adults. Cognition, 89, 25–41.
[Lin10] Lin, S., Keysar, B., & Epley, N. (2010). Reflexively mindblind: Using theory of mind to interpret behavior requires effortful attention. Journal of Experimental Social Psychology, 46, 551–556.
[Pyn07] Pynadath, D. V., & Marsella, S. C. (2007). Minimal Mental Models. In: AAAI, pp. 1038-1046.
[Whi91] Whiten, Andrew (ed). Natural Theories of Mind. Oxford: Basil Blackwell, 1991.
Niina Seittenranta
Niina was part of the fourth cohort of Social AI CDT students. In her PhD project, she employed deep learning feature extraction and fMRI to understand how the human brain creates predictions about social interaction. She was fascinated by the brain processes that underlie social interaction, as social situations are complex phenomena with several possible outcomes. The brain must therefore update predictions efficiently based on incoming sensory information that involves not only the person themselves but also other people, whose behaviour can be challenging to predict. She had a keen interest in brain function and computational research methods, and was excited to pursue these topics during her PhD project here at the University of Glasgow.
Before starting at the University of Glasgow, she completed Bachelor’s and Master’s degrees in Cognitive Science at the University of Helsinki. Her previous studies were interdisciplinary, including neuroscience, psychology, statistics and programming. During her previous degrees and after graduation, she worked as a research and technical assistant in several research projects. The projects have concerned flow experience during visuomotor learning, social interaction and trust, and cognitive and emotional mechanisms of technology-mediated human learning.
After achieving a PG Diploma in Computing Science and Psychology with Merit, Niina moved to the University of Helsinki to complete her current PhD, Neurofunctional and anatomical origins and computational models of dyslexia in 7 to 8 year-old children at dyslexia risk.
Deep Learning feature extraction for social interaction prediction in movies and visual cortex
While watching a movie, a viewer is immersed in the spatiotemporal structure of the movie’s audiovisual and high-level conceptual content [Raz19]. The nature of movies induces a natural waxing and waning of more and less socially immersive content. This immersion can be exploited during brain imaging experiments to emulate everyday human experience as closely as possible, including the brain processes involved in social perception.
The human brain is a prediction machine: in addition to receiving sensory information, it actively generates sensory predictions. It does this by creating internal models of the world, which are used to predict upcoming sensory inputs. This basic but powerful concept is used in several studies in Artificial Intelligence (AI) to perform different types of predictions: from video inner-frames for video interpolation [Bao19], to irregularity detection [Sabokrou18], to future sound prediction [Oord18]. Although several AI studies have focused on how to use visual features to detect and track actors in a movie [Afouras20], it is not clear how cortical networks for social cognition engage layers of the visual cortex when processing the social interaction cues occurring between actors. Several studies suggest that biological motion recognition (the visual processing of others’ actions) is central to understanding interactions between agents and involves combining top-down social cognition with bottom-up visual processing.
We will use cortical-layer-specific fMRI at Ultra High Field to read brain activity during movie stimulation. Using the latest advances in Deep Learning [Bao19, Afouras20], we will study how the interaction between two people in a movie is processed, analysing the predictions that occur between frames. The two representation sets, the Deep Learning analysis of the movie video and the brain response measured to the same movie, will be compared through model comparison with Representational Similarity Analysis (RSA) [Kriegeskorte08]. The work and its natural extensions will help clarify how the early visual cortex is responsible for guiding attention in social scene understanding.
The student will spend time in both domains. In Artificial Intelligence, they will study and analyse state-of-the-art methods in pose estimation and scene understanding. In brain imaging, they will learn how to conduct an fMRI study, from data collection and understanding to analysis methods. Together, these two fields will provide a solid background in both brain imaging and artificial intelligence, teaching the student to transfer skills and draw conclusions across domains.
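A minimal sketch of the RSA comparison is shown below: a representational dissimilarity matrix (RDM) is computed from deep-network features of movie frames and another from voxel patterns for the same frames, and the two are compared with a rank correlation. The random data and dimensionalities are placeholders for the real model activations and laminar fMRI patterns.

```python
# Hypothetical sketch of Representational Similarity Analysis (RSA).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

n_frames = 50
dnn_features = np.random.rand(n_frames, 512)      # e.g. activations of one network layer
brain_patterns = np.random.rand(n_frames, 2000)   # e.g. voxels of one cortical layer

rdm_model = pdist(dnn_features, metric="correlation")   # condensed RDMs
rdm_brain = pdist(brain_patterns, metric="correlation")

rho, p = spearmanr(rdm_model, rdm_brain)
print(f"model-brain RDM similarity: rho={rho:.3f}, p={p:.3f}")
```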
References
[Afouras20] Afouras, T., Owens, A., Chung, J. S., & Zisserman, A. (2020). Self-supervised learning of audio-visual objects from video. European Conference on Computer Vision (ECCV 2020).
[Bao19] Bao, W., Lai, W. S., Ma, C., Zhang, X., Gao, Z., & Yang, M. H. (2019). Depth-aware video frame interpolation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3703-3712).
[Kriegeskorte08] Kriegeskorte, N., Mur, M., & Bandettini, P. A. (2008). Representational similarity analysis: connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4.
[Oord18] Oord, A. V. D., Li, Y., & Vinyals, O. (2018). Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748.
[Raz19] Raz, G., Valente, G., Svanera, M., Benini, S., & Kovács, A. B. (2019). A Robust Neural Fingerprint of Cinematic Shot-Scale. Projections, 13(3), 23-52.
[Sabokrou18] Sabokrou, M., Pourreza, M., Fayyaz, M., Entezari, R., Fathy, M., Gall, J., & Adeli, E. (2018, December). Avid: Adversarial visual irregularity detection. In Asian Conference on Computer Vision (pp. 488-505). Springer, Cham.
Tobias Thejll-Madsen
Tobias was part of the second cohort of Social AI CDT students. His research focused on facial expressions in social signalling and on using this knowledge to automatically generate effective, humanlike facial expressions on virtual agents. To do this, the links between expressions, underlying emotional states, and social judgements needed to be understood and translated into models that a computer can use. He was excited to work with a range of people in both psychology and computer science.
Previously, he had completed an MA in Psychology and an MSc in Human Cognitive Neuropsychology at the University of Edinburgh. There he focused on cognitive psychology, most recently looking at active learning in a social setting, and he was very curious about social inference and cognition in general. However, like many others, he found it hard to have just one interest, so, in no particular order, he enjoys: moral philosophy/psychology, statistics and methodology, education/pedagogy (prior to his MSc, he worked on developing educational resources and research), reinforcement learning, epistemology, philosophy of science, outreach and science communication, cooking, stand-up comedy, roleplaying games, staying hydrated, and basically anything outdoors.
In 2025, Tobias successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Practical and Theoretical Considerations for Working with Emotions in Autonomous Systems
As artificial intelligence systems become increasingly embedded in social and interactive contexts, their ability to recognise and respond to human emotions grows ever more critical. Affective computing addresses this need by integrating insights from psychology, neuroscience, and computer science to recognise, model, and display affect and emotion [1, 2]. It aims both to enhance affect-aware technologies and to contribute to the affective sciences [3].
This PhD project proposes a dual-pronged approach to advance affective computing, addressing both the practical development of emotion-aware technologies and the theoretical advancement of the affective sciences. The first objective is to develop a novel, theory-driven framework for analysing facial expressions of emotion. Moving beyond fixed-category models [4], this work draws on appraisal theory [5] and causal modelling to infer context-sensitive ‘ground truth’ emotional states. Using data from interactive scenarios such as dyadic negotiations, the project will investigate how individual and contextual variability shapes expressive behaviour. It aims to develop approaches that capture individual and group differences, including cultural variations in emotional facial expressions [6].
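To illustrate the appraisal-theoretic direction, the sketch below maps a few appraisal variables onto candidate emotion labels with hand-written rules. The variables, thresholds and labels are assumptions for illustration; the project's framework would instead infer such mappings from interaction data within a causal model.

```python
# Hypothetical sketch of appraisal-style emotion inference with toy rules.
def appraise(goal_congruence, other_agency, certainty):
    """goal_congruence in [-1, 1]; other_agency and certainty in [0, 1]."""
    if goal_congruence > 0:
        return "joy" if certainty > 0.5 else "hope"
    if other_agency > 0.5:
        return "anger"
    return "sadness" if certainty > 0.5 else "fear"

# Example: a negotiation offer that blocks my goal, clearly caused by the
# other party, with a near-certain outcome -> "anger".
print(appraise(goal_congruence=-0.8, other_agency=0.9, certainty=0.8))
```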
The second strand of the project focuses on the application and evaluation of large language models (LLMs) in affective tasks, with the potential for such models to serve as back-end components in virtual humans. While systems like GPT-4 have shown promise in generating emotionally appropriate responses, their behaviour remains difficult to interpret and evaluate [7]. This project will develop a systematic framework for assessing LLMs’ emotional inference capabilities. Building on existing work in evaluating computational models of emotion [8], it will also explore hybrid systems that combine structured affective reasoning with the generative capabilities of LLMs.
Overall, the project aims to contribute both theoretically and practically to the development of robust, interpretable, and ethically aligned emotion-aware technologies.
References
[1] Picard, R. W. (2000). Affective computing. The MIT Press.
[2] de Melo, C., Gratch, J., Marsella, S., & Pelachaud, C. (2023). Social functions of machine emotional expressions. Proceedings of the IEEE, 111(10), 1382-1397.
[3] D’Mello, S., Kappas, A., & Gratch, J. (2018). The affective computing approach to affect measurement. Emotion Review, 10(2), 173-183.
[4] Barrett, L. F., Adolphs, R., Marsella, S., Martinez, A. M., & Pollak, S. D. (2019). Emotional Expressions Reconsidered: Challenges to Inferring Emotion From Human Facial Movements. Psychological Science in the Public Interest, 20(1), 1–68. https://doi.org/10.1177/1529100619832930
[5] Smith, C. A., & Lazarus, R. S. (1990). Emotion and adaptation. In L. A. Pervin (Ed.), Handbook of personality: Theory and research (pp. 609–637). Guilford Press.
[6] Jack, R. E., Caldara, R., & Schyns, P. G. (2012). Internal representations reveal cultural diversity in expectations of facial expressions of emotion. Journal of Experimental Psychology: General, 141(1), 19–25.
[7] Bubeck, S., Chandrasekaran, V., Eldan, R., et al. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.
[8] Marsella, S. C., Gratch, J., & Petta, P. (2010). Computational models of emotion. In A blueprint for affective computing (pp. 21–46). Oxford University Press.
Maria Vlachou
Maria was part of the second cohort of Social AI CDT students. She came to the Social AI CDT from a diverse background: she holds an MSc in Data Analytics (University of Glasgow), funded by The Data Lab Scotland, a Research Master’s degree (KU Leuven, Belgium), and a BSc in Psychology (Panteion University, Greece). She worked as an Applied Behavior Analysis Trainer (Greece & Denmark) and as a Research Intern at the Department of Quantitative Psychology (KU Leuven), where she focused on statistical methods for psychology and the reproducibility of science. She also spent two years as a Business Intelligence Developer in the pharmaceutical industry. As an MSc student in Glasgow, she was exposed to more advanced statistical methods, and her thesis focused on auto-encoder models for dimensionality reduction.
She considered her future work on the project “Conversational Venue Recommendation” (supervised by Dr. Craig MacDonald and Dr. Philip McAleer) as a natural evolution of all the above. She therefore looked forward to working on deep learning methods for building Conversational AI chatbots and discovering new complex methods for recommendations in social networks by incorporating users’ characteristics. Overall, she was excited to be part of an interdisciplinary CDT and to have the opportunity to work with people from different research backgrounds.
In 2024, Maria successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Predicting Retrieval Failures in Conversational Recommendation Systems
In recent years, online shopping has become increasingly popular. As a result, a variety of Conversational Recommendation Systems (CRS) have been proposed to assist users in finding items and making decisions. In particular, a CRS agent may elicit further preferences, ask whether the user prefers one item or another, or ask for clarification about the type of item, such as a fashion attribute or style. This proposal aims to:
- Examine the dense embedded representations of both text and images to predict the effectiveness of CRS multi-turn rankings;
- Predict conversational failures under more realistic scenarios (i.e., when an item is missing from the catalogue);
- Obtain users’ opinions about alternative satisfactory target items in order to inform predictions about user preferences;
- Make interventions tailored to the users’ needs that encourage a more pleasant CRS experience.
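A minimal sketch of the first two aims is shown below: the user’s request and the catalogue items are represented as dense embeddings, items are ranked by cosine similarity, and simple post-retrieval features of the score distribution are computed as candidate predictors of a failed turn (e.g., when the target item is missing from the catalogue). The random embeddings and feature set are illustrative assumptions, not the project’s method.

```python
# Hypothetical sketch of retrieval-failure features for one CRS turn.
import numpy as np

def cosine_scores(query_vec, item_matrix):
    q = query_vec / np.linalg.norm(query_vec)
    items = item_matrix / np.linalg.norm(item_matrix, axis=1, keepdims=True)
    return items @ q

rng = np.random.default_rng(0)
query = rng.normal(size=128)              # dense embedding of the user's request
catalogue = rng.normal(size=(1000, 128))  # dense embeddings of catalogue items

scores = np.sort(cosine_scores(query, catalogue))[::-1]
features = {
    "top_score": scores[0],
    "top10_mean": scores[:10].mean(),
    "top10_std": scores[:10].std(),       # a flat, low-scoring top-k hints at failure
}
print(features)
```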
Sean Westwood
Sean was part of the second cohort of Social AI CDT students. He completed his undergraduate and postgraduate degrees in the School of Psychology here at the University of Glasgow. For his PhD he continued to work under the supervision of Dr Marios Philiastides, who specialises in the neuroscience of decision making and had guided him through his MSc. He also worked under Professor Alessandro Vinciarelli from the School of Computing Science, who specialises in developing computational models involved in human-AI interactions.
His main research interests are reinforcement learning and decision making in humans, as well as the neurological basis for individual differences between people. These interests stem from his background in childcare and sports coaching. For his undergraduate dissertation he studied how gender and masculinity impact risk-taking under stress to investigate why people may act differently in response to physiological arousal. His postgraduate research focused on links between noradrenaline and aspects of reinforcement learning, using pupil dilation as a measure of noradrenergic activity.
He was looking forward to continuing this line of research, with the aim of building computational models that reflect individual patterns of learning based on pupil data. It was his hope that this would open up exciting possibilities for AI programmes able to dynamically respond to individual needs in an educational context.
In 2025, Sean successfully completed a PhD with Integrated Study in Computing Science and Psychology.
Neural characteristics of reward and punishment learning for optimising decision-making
Value-based decisions are often required in everyday life, where we must incorporate situational evidence with past experiences to work out which option will lead to the best outcome. However, the mechanisms that govern how these two factors are weighted are not yet fully understood. Gaining insight into these processes could greatly help towards the optimisation of feedback in gamified learning environments. This project aims to develop a closed-loop biofeedback system that leverages unique ways of fusing electroencephalographic (EEG) and pupillometry measurements to investigate the utility of the noradrenergic arousal system in value judgements and learning.
In recent years, it has become well established that pupil diameter consistently varies with certain decision-making variables such as uncertainty, prediction errors and environmental volatility (Larsen & Waters, 2018). The noradrenergic (NA) arousal system in the brainstem is thought to drive the neural networks involved in controlling these variables. Despite the increasing popularity of pupillometry in decision-making research, many aspects remain unexplored, such as the role of the NA arousal system in regulating learning rate, i.e., the rate at which new evidence outweighs past experiences in value-based decisions (Nassar et al., 2012).
Developing a neurobiological framework of how NA influences feedback processing, and the effect this has on learning rates, could enable the dynamic manipulation of learning. Recent studies have used real-time EEG analysis to manipulate arousal levels in a challenging perceptual task, showing that it is possible to improve task performance by manipulating feedback (Faller et al., 2019).
A promising area of application of such real-time EEG analysis is the gamification of learning, particularly in digital learning environments. Gamification in a pedagogical context is the idea of using game features (Landers, 2014) to enable a high level of control over stimuli and feedback. This project aims to dynamically alter learning rates via manipulation of the NA arousal system, using known neural correlates associated with learning and decision making such as attentional conflict and levels of uncertainty (Sara & Bouret, 2012). Specifically, the main aims of the project are:
- To model the relationship between EEG, pupil diameter and dynamic learning rate during reinforcement learning (Fouragnan et al., 2015) (see the sketch after this list).
- To model the effect of manipulating arousal, uncertainty and attentional conflict on dynamic learning rate during reinforcement learning.
- To develop a digital learning environment that allows for these principles to be applied in a pedagogical context.
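As a minimal illustration of the first aim, the sketch below implements a delta-rule value update in which the per-trial learning rate is scaled by a z-scored pupil signal used as an arousal proxy. The scaling rule and parameter values are assumptions for illustration, not the project’s actual model.

```python
# Hypothetical sketch: value learning with a pupil-modulated learning rate.
import numpy as np

def pupil_modulated_delta_rule(rewards, pupil, base_lr=0.3, gain=0.5):
    """Delta-rule updates where each trial's learning rate is scaled by a
    z-scored pupil-diameter signal (a simple arousal proxy)."""
    pupil_z = (pupil - pupil.mean()) / pupil.std()
    value = 0.5
    values, lrs = [], []
    for r, p in zip(rewards, pupil_z):
        lr = np.clip(base_lr * (1 + gain * p), 0.01, 1.0)  # arousal-scaled learning rate
        value += lr * (r - value)                          # prediction-error update
        values.append(value)
        lrs.append(lr)
    return np.array(values), np.array(lrs)

rewards = np.random.binomial(1, 0.7, size=100).astype(float)  # simulated outcomes
pupil = np.random.normal(size=100)                            # simulated pupil trace
values, lrs = pupil_modulated_delta_rule(rewards, pupil)
print(values[-5:], lrs[:5])
```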
Understanding the potential role of the NA arousal system in the way we learn, update beliefs and explore new options could have significant implications in the realm of education and performance. This project will facilitate the creation of an online learning environment which will provide an opportunity to benchmark the utility of neurobiological markers in an educational setting. Success in this endeavour would pave the way for a wide variety of adaptations to learning protocols that could in turn enable a level of learning optimisation and individualisation in which feedback is dynamically and continuously adapted to the needs of the learner.
References
[FAL19] Faller, J., Cummings, J., Saproo, S., & Sajda, P. (2019). Regulation of arousal via online neurofeedback improves human performance in a demanding sensory-motor task. Proceedings of the National Academy of Sciences, 116(13), 6482-6490.
[FOU15] Fouragnan, E., Retzler, C., Mullinger, K., & Philiastides, M. G. (2015). Two spatiotemporally distinct value systems shape reward-based learning in the human brain. Nature communications, 6, 8107.
[LAN14] Landers, R. N. (2014). Developing a theory of gamified learning: Linking serious games and gamification of learning. Simulation & Gaming, 45(6), 752-768.
[LAR18] Larsen, R. S., & Waters, J. (2018). Neuromodulatory correlates of pupil dilation. Frontiers in neural circuits, 12, 21.
[NAS12] Nassar, M. R., Rumsey, K. M., Wilson, R. C., Parikh, K., Heasly, B., & Gold, J. I. (2012). Rational regulation of learning dynamics by pupil-linked arousal systems. Nature neuroscience, 15(7), 1040.
[SAR12] Sara, S. J., & Bouret, S. (2012). Orienting and reorienting: the locus coeruleus mediates cognition through arousal. Neuron, 76(1), 130-141.