The application process is now open; please visit the following page for instructions on how to apply:
Should you have any enquiries, please contact us at email@example.com (please write “application enquiry” in the subject header).
Brain Based Inclusive Design
Supervisors: Monika Harvey (School of Psychology) and Alessandro Vinciarelli (School of Computing Science).
It is clear to everybody that people differ widely, but the underlying assumption of current technology designs is that all users are equal. The large cost of this is the exclusion of users who fall far from the average that technology designers use as their ideal abstraction (Holmes, 2019). In some cases, the mismatch is evident (e.g., a mouse typically designed for right-handed people is more difficult to use for left-handers) and attempts have been made to accommodate the differences. In other cases, the differences are more subtle and difficult to observe and, to the best of our knowledge, no attempt has yet been made to take them into account. This is the case, in particular, for change blindness (Rensink, 2004) and inhibition of return (Posner & Cohen, 1984), two brain phenomena that limit our ability to process stimuli presented too closely in space and time.
The overarching goal of the project is thus to design Human-Computer Interfaces capable of adapting to the limits of every user, in view of a fully inclusive design capable of putting every user at ease, i.e., enabling them to interact with technology according to their own processing speed and not according to the speed imposed by technology designers.
The proposed approach includes four steps:
- Development of the methodologies for the automatic measurement of the phenomena described above through their effect on EEG signals (e.g., changes in P1 and N1 components; McDonald et al., 1999) and behavioural performance (e.g., in/decreased accuracy, in/decreased reaction times);
- Identification of the relationship between the phenomena above and observable factors such as age, education level, computer familiarity, etc. of the user;
- Adaptation of the technology design to the factors above;
- Analysis of the improvement of the users’ experience.
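As an illustration of the behavioural side of the first step, inhibition of return is commonly seen as slower responses to targets at a previously cued location once the cue-target interval is long enough. The sketch below computes that reaction-time difference from a list of trials. The field names (cued, soa_ms, rt_ms, correct) and the 300 ms cut-off are assumptions made for this example, not part of the project description.

```python
from statistics import mean


def ior_effect_ms(trials, min_soa_ms=300):
    """Mean RT at cued minus uncued locations, in milliseconds.

    Only correct trials with a cue-target interval (SOA) of at least
    min_soa_ms are used. A positive value is consistent with inhibition
    of return. The 300 ms default is an illustrative assumption.
    """
    def usable(trial):
        return trial["correct"] and trial["soa_ms"] >= min_soa_ms

    cued = [t["rt_ms"] for t in trials if usable(t) and t["cued"]]
    uncued = [t["rt_ms"] for t in trials if usable(t) and not t["cued"]]
    if not cued or not uncued:
        raise ValueError("need at least one usable cued and one uncued trial")
    return mean(cued) - mean(uncued)


# Hypothetical trials for one participant.
example_trials = [
    {"cued": True, "soa_ms": 600, "rt_ms": 420, "correct": True},
    {"cued": True, "soa_ms": 600, "rt_ms": 440, "correct": True},
    {"cued": False, "soa_ms": 600, "rt_ms": 380, "correct": True},
    {"cued": False, "soa_ms": 600, "rt_ms": 400, "correct": True},
    {"cued": True, "soa_ms": 100, "rt_ms": 300, "correct": True},   # SOA too short, ignored
    {"cued": False, "soa_ms": 600, "rt_ms": 900, "correct": False},  # error trial, ignored
]
```

A per-user value of this kind could then be related to the observable factors in the second step, for example by comparing it across age groups.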
The main expected outcome is that technology will become more inclusive and capable of accommodating the individual needs of its users in terms of processing speed and ease of use. This will be particularly beneficial for those groups of users that, for different reasons, tend to be penalised in terms of processing speed, in particular older adults and special populations (e.g., children with developmental issues, stroke survivors, and related cohorts).
The project is of great industrial interest because, ultimately, making technology design more inclusive greatly increases user satisfaction, a crucial requirement for every company that aims to commercialise technology.
[HOL19] Holmes, K. (2019). Mismatch, MIT Press.
[MCD99] McDonald, J., Ward, L.M. & Kiehl, A.H. (1999). An event-related brain potential study of inhibition of return. Perception and Psychophysics, 61, 1411–1423.
[POS84] Posner, M.I. & Cohen, Y. (1984). “Components of visual orienting”. In Bouma, H.; Bouwhuis, D. (eds.). Attention and performance X: Control of language processes. Hillsdale, NJ: Erlbaum. pp. 531–56.
[RES04] Rensink, R.A. (2004). Visual Sensing without Seeing. Psychological Science, 15, 27-32.
Developing and assessing social interactions with virtual agents in a digital interface for reducing personal stress
Supervisors: Larry Barsalou (School of Psychology) and Stacy Marsella (School of Psychology).
Aims: One of the key goals of artificial social intelligence is to use the technology in tailored health interventions. In line with that goal, this CDT project will assess whether social interactions with a virtual agent increase engagement with a stress app and the effectiveness of using it, and will explore how the design of the virtual agent impacts those outcomes.
Novel Elements: This work will exploit a validated model of stress response, the Situated Assessment Method (SAM2), as the basis for a user model that will tailor the intervention, and will explore how the design of the agent and its behavior, described below, interact with the effectiveness of that model. The SAM2 instrument has been developed in our previous work and is based on a state-of-the-art model of the stress response. Besides working with classic stress (more aptly termed “distress”), users would be invited to work with positive stress that they feel excited about and that enables them to grow (“eustress”).
Approach: Three variables associated with social interaction will be manipulated between participant groups. First, we will manipulate levels of social interaction: (a) a virtual agent guides users through interactions with the app, (b) a text dialogue guides these interactions instead, or (c) only simple text instructions control app use. Second, we will manipulate whether the app collects a model of the user and uses it during interactions (theory of mind) to tailor the interaction. Manipulating presence versus absence of user models can be implemented across the three levels of social interaction just described. Third, in the version of the app that implements social interaction with a virtual agent, we will manipulate features of the agent, including its physical appearance/behavior, personality, and perceived social connectedness. Of interest across these manipulations is the impact on engagement and effectiveness.
Users would work with the app over at least a two-week period. At the start, we will collect detailed data on the user’s past and current stress experience using the SAM2. For the following two weeks, participants would provide brief reports of their daily stress experience each evening. At the end of the two-week period, participants would perform another SAM2 assessment like the one that began the study, providing us with pre- and post-intervention data. Engagement will be assessed by how often and how long participants engage with the app, and how much they enjoy doing so. Effectiveness will be assessed by how much stress changes over the two-week period, and potentially beyond.
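The engagement and effectiveness measures described above could be summarised along the following lines. This is only a minimal sketch: the Session record, the 1 to 7 enjoyment scale and the single numeric stress score are assumptions made for illustration and do not describe the SAM2 instrument or the planned app.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    day: int          # study day on which the app was used (1..14)
    minutes: float    # how long the participant used the app
    enjoyment: int    # hypothetical self-report rating, 1 (low) to 7 (high)


def engagement_summary(sessions, study_days=14):
    """Summarise how often, how long and how enjoyably the app was used."""
    if not sessions:
        return {"sessions_per_day": 0.0, "days_used": 0,
                "mean_minutes": 0.0, "mean_enjoyment": 0.0}
    return {
        "sessions_per_day": len(sessions) / study_days,
        "days_used": len({s.day for s in sessions}),
        "mean_minutes": mean(s.minutes for s in sessions),
        "mean_enjoyment": mean(s.enjoyment for s in sessions),
    }


def stress_change(pre_score, post_score):
    """Post minus pre stress score; a negative value means stress went down."""
    return post_score - pre_score


# Hypothetical usage log for one participant.
log = [Session(1, 5.0, 6), Session(1, 3.0, 4), Session(3, 4.0, 5)]
summary = engagement_summary(log)
```

Comparing these summaries between the participant groups (virtual agent, text dialogue, simple instructions) would be one way to address the engagement question.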
Outputs and Impact: The validated app itself is an output with potentially high impact. Stress impacts many health outcomes and therefore public health costs. SAM2 has been studied and evaluated across a wide range of health-related domains, so effective ways of using it have broad potential use in apps beyond stress. The creation of the app and its fielding promises a wealth of data for further study. Interesting AI problems associated with building the app include structuring dialogue and interaction between the virtual agent and user, constructing effective user models, and developing strategies to integrate user models, dialogue and virtual agent embodiment in effective health interventions.
[DUT19] Dutriaux, L., Clark, N., Papies, E. K., Scheepers, C., & Barsalou, L. W. (2019). Using the Situated Assessment Method (SAM2) to assess individual differences in common habits. Manuscript under review.
[EPE18] Epel, E. S., Crosswell, A. D., Mayer, S. E., Prather, A. A., Slavich, G. M., Puterman, E., & Mendes, W. B. (2018). More than a feeling: A unified view of stress measurement for population science. Frontiers in Neuroendocrinology, 49, 146–169.
[LEB16] Lebois, L. A. M., Hertzog, C., Slavich, G. M., Barrett, L. F., & Barsalou, L. W. (2016). Establishing the situated features associated with perceived stress. Acta Psychologica, 169, 119–132.
[MAR14] Stacy Marsella and Jonathan Gratch. Computationally Modeling Human Emotion. Communications of the ACM, December, 2014.
[MAR04] Marsella, S.C., Pynadath, D.V., and Read, S.J. PsychSim: Agent-based modeling of social interactions and influence. In Proceedings of the International Conference on Cognitive Modeling, pp. 243-248, Pittsburgh, 2004.
[LYN11] Lynn C. Miller, Stacy Marsella, Teresa Dey, Paul Robert Appleby, John L. Christensen, Jennifer Klatt and Stephen J. Read. Socially Optimized Learning in Virtual Environments (SOLVE). The Fourth International Conference on Interactive Digital Storytelling (ICIDS), Vancouver, Canada, Nov. 2011.
Into the thick of it: Situating digital health behaviour interventions
Supervisors: Esther Papies (School of Psychology) and Stacy Marsella (School of Psychology).
Aims and Objectives. This project will examine how best to integrate a digital health intervention into a user’s daily life. Are digital interventions more effective if they are situated, i.e., adapted to specific users and situations where behaviour change should happen? If so, which features of situations should a health (phone) app use to remind a user to perform a healthy behaviour (e.g., time of day, location, mood, activity pattern)? From a Social AI perspective, how do we make inferences about those situations from sensing data and prior models of users’ situated behaviors, how and when does the app socially interact with the user to improve the situated behavior, and how do we adapt the user model over time to improve the app’s tailored interaction with a specific user? We will test this in the domain of hydration, with an intervention to increase the consumption of water.
Background and Novelty. Digital interventions are a powerful new tool in the domain of individual health behaviour change. Health apps can reach large numbers of users at relatively low cost, and can be tailored to an individual’s health goals. So far, however, digital health interventions have not exploited a key strength compared to traditional interventions delivered by human health practitioners, namely the ability to situate interventions in critical situations in a user’s daily life. Rather than being presented statically at pre-set times, situating interventions means that they respond and adapt to the key contextual features that affect a user’s health behaviour. Previous work has shown that context features have a powerful influence on health behaviour, for example by triggering habits, impulses, and social norms. Therefore, it is vital for effective behaviour change interventions to take the specific context of a user’s health behaviours into account. The current proposal will test whether situating a mobile health intervention, i.e., designing it to respond adaptively to contextual features, increases its effectiveness compared to unsituated interventions. We will do this in the domain of hydration, because research suggests that many adults may be chronically dehydrated, with implications for cognitive functioning, mood, and physical health (e.g., risk of diabetes, overweight, kidney damage).
Methods. We will build an app to increase water intake and compare a static version of this app with a dynamic version that responds to time, a user’s activity level, location, social context, mood, and other possible features that may be linked to hydration (Paper 1). We will assess whether an app that responds actively to such features leads over time to more engagement and behaviour change than a static app (Paper 2), and which contextual inferences work best to situate an app for effective behaviour change (Paper 3).
Outputs. This project will lead to presentations and papers at both AI and Psychology conferences outlining the principles and results of situating health behaviour interventions, using the tested healthy hydration app.
Impact. Results from this work will have implications for the design of health behaviour interventions across domains, as well as for our understanding of the processes underlying behaviour change. It will explore how sensing and adaptive user modeling can situate both user and AI system in a common contextual frame and whether this facilitates engagement and behavior change.
Alignment with Industrial Interests. This work will be of interest to industry collaborators interested in personalised health behaviour, such as Danone.
[MUN15] Muñoz, C. X., Johnson, E. C., McKenzie, A. L., Guelinckx, I., Graverholt, G., Casa, D. J., … Armstrong, L. E. (2015). Habitual total water intake and dimensions of mood in healthy young women. Appetite, 92, 81–86.
[PAP17] Papies, E. K. (2017). Situating interventions to bridge the intention–behaviour gap: A framework for recruiting nonconscious processes for behaviour change. Social and Personality Psychology Compass, 11(7), n/a-n/a.
[RIE13] Riebl, S. K., & Davy, B. M. (2013). The Hydration Equation: Update on Water Balance and Cognitive Performance. ACSM’s health & fitness journal, 17(6), 21–28.
[WAN17] Wang and S. Marsella, “Assessing personality through objective behavioral sensing,” in Proceedings of the 7th international conference on affective computing and intelligent interaction, 2017.
[LYN11] Lynn C. Miller, Stacy Marsella, Teresa Dey, Paul Robert Appleby, John L. Christensen, Jennifer Klatt and Stephen J. Read. Socially Optimized Learning in Virtual Environments (SOLVE). The Fourth International Conference on Interactive Digital Storytelling (ICIDS), Vancouver, Canada, Nov. 2011.
[PYN07] Pynadath, David V.; Marsella, Stacy C. Minimal mental models. In Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI), pp. 1038-1044, 2007.
Optimising interactions with virtual environments
Supervisors: Michele Sevegnani (School of Computing Science) and Monika Harvey (School of Psychology).
Virtual and Mixed Reality systems are socio-technical applications in which users experience different configurations of digital media and computation that give different senses of how a “virtual environment” relates to their local physical environment. In Human-Computer Interaction (HCI), we recently developed computational models capable of representing physical and virtual space, solving the problems of how to recognise virtual spatial regions starting from the detected physical position of the users (Benford et al., 2016). The models are bigraphs [MIL09] derived from the universal computational model introduced by Turing Award Laureate Robin Milner. Bigraphs encapsulate both dynamic and spatial behaviour of agents that interact and move among each other, or within each other. We used the models to investigate cognitive dissonance, namely the inability or difficulty to interact with the virtual environment.
How the brain represents physical versus virtual environments is also an issue very much debated within Psychology and Neuroscience, with some researchers arguing that the brain makes little distinction between the two [BOZ12]. Yet, more in line with Sevegnani’s work, Harvey and colleagues have shown that different brain areas represent these different environments and that they are further processed on different time scales [HAR12; ROS09]. Moreover, special populations struggle more with virtual than with real environments [ROS11].
The overarching goal of this PhD project is, therefore, to adapt the computational models developed in HCI and apply them to psychological scenarios, to test whether the environmental processing within the brain is different as proposed. This information will then refine the HCI model and ideally allow a refined application to special populations.
[BEN16] Benford, S., Calder, M., Rodden, T., & Sevegnani, M., On lions, impala, and bigraphs: Modelling interactions in physical/virtual spaces. ACM Transactions on Computer-Human Interaction (TOCHI), 23(2), 9, 2016.
[BOZ12] Bozzacchi, C., Giusti, M.A., Pitzalis, S., Spinelli, D., & Di Russo, F., Similar Cerebral Motor Plans for Real and Virtual Actions. PLOS One, 7(10), e47783, 2012.
[HAR12] Harvey, M. and Rossit, S., Visuospatial neglect in action. Neuropsychologia, 50, 1018-1028, 2012.
[MIL09] Milner, R., The space and motion of communicating agents. Cambridge University Press, 2009.
[ROS11] Rossit, S., Malhotra, P., Muir, K., Reeves, I., Duncan G. and Harvey, M., The role of right temporal lobe structures in off-line action: evidence from lesion-behaviour mapping in stroke patients. Cerebral Cortex, 21 (12), 2751-2761, 2011.
[ROS09] Rossit, S., Malhotra, P., Muir, K., Reeves, I., Duncan, G., Livingstone, K., Jackson H., Hogg, C., Castle P., Learmonth G. and Harvey, M., No neglect-specific deficits in reaching tasks. Cerebral Cortex, 19, 2616-2624, 2009.
Sharing the road: Cyclists and automated vehicles
Supervisors: Steve Brewster (School of Computing Science) and Frank Pollick (School of Psychology).
Automated vehicles must share the road with pedestrians and cyclists, and drive safely around them. Autonomous cars, therefore, must have some form of social intelligence if they are to function correctly around other road users. There has been work looking at how pedestrians may interact with future autonomous vehicles [ROT15] and potential solutions have been proposed (e.g. displays on the outside of cars to indicate that the car has seen the pedestrian). However, there has been little work on automated cars and cyclists.
When there is no driver in the car, social cues such as eye contact, waving, etc., are lost [ROT15]. This changes the social interaction between the car and the cyclist, and may cause accidents if it is no longer clear, for example, who should proceed. Automated cars also behave differently to cars driven by humans, e.g. they may appear more cautious in their driving, which the cyclist may misinterpret. The aim of this project is to study the social cues used by drivers and cyclists, and create multimodal solutions that can enable safe cycling around autonomous vehicles.
The first stage of the work will be observation of the communication between human drivers and cyclists through literature review and fieldwork. The second stage will be to build a bike into our driving simulator [MAT19] so that we can test interactions between cyclists and drivers safely in a simulation.
We will then start to look at how we can facilitate the social interaction between autonomous cars and cyclists. This will potentially involve visual displays on cars or audio feedback from them, to indicate state information to cyclists nearby (e.g., whether they have been detected, whether the car is letting the cyclist go ahead). We will also investigate interactions and displays for cyclists, for example multimodal displays in cycling helmets [MAT19] to give them information about car state (which could be collected by V2X software on the cyclist’s phone, for example), or ways of communicating directly with the car through input made on the handlebars or via gestures. These will be experimentally tested in the simulator and, if we have time, in highly controlled real driving scenarios.
The output of this work will be a set of new techniques to support the social interaction between autonomous vehicles and cyclists. We currently work with companies such as Jaguar Land Rover and Bosch and our results will have direct application in their products.
[ROT15] Rothenbucher, D., Li, J., Sirkin, D. and Ju, W., Ghost driver: a platform for investigating interactions between pedestrians and driverless vehicles, Adjunct Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 44–49, 2015.
[MAT19] Matviienko, A., Brewster, S., Heuten, W. and Boll, S. Comparing unimodal lane keeping cues for child cyclists (https://doi.org/10.1145/3365610.3365632), Proceedings of the 18th International Conference on Mobile and Ubiquitous Multimedia, November, pp. 1-11, 2019.
Supervisors: Steve Brewster (School of Computing Science) and Frank Pollick (School of Psychology).
The aim of this project is to investigate, in the context of social interactions, the interaction between a driver and an autonomous vehicle. Autonomous cars are sophisticated agents that can handle many driving tasks. However, they may have to hand control to the human driver in different circumstances, for example if sensors fail or weather conditions are bad [MCA16, BUR19]. This is potentially difficult for the driver as they may not have been driving the car for a long period and have to quickly take control [POL15]. This is an important issue for car companies as they want to add more automation to vehicles in a safe manner. Key to this problem is whether this interface would benefit from conceptualizing the exchange between human and car as a social interaction.
This project will study how best to handle handovers, from the car indicating to the driver that it is time to take over, through the takeover event, to the return to automated driving. The key factors to investigate are: situational awareness (the driver needs to know what the problem is and what must be done when they take over), responsibility (whose task is it to drive at which point), the in-car context (what is the driver doing: are they asleep, talking to another passenger), and driver skills (is the driver competent to drive or are they under the influence).
We will conduct a range of experiments in our driving simulator to test different types of handover situations and different types of multimodal interactions involving social cues to support the four factors outlined above.
The output will be experimental results and guidelines that can help automotive designers know how best to communicate and deal with handover situations between car and driver. We currently work with companies such as Jaguar Land Rover and Bosch and our results will have direct application in their products.
[MCA16] McCall, R., McGee, F., Meschtscherjakov, A. and Engel, T., Towards A Taxonomy of Autonomous Vehicle Handover Situations, Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 193–200, 2016.
[BUR19] Burnett, G., Large, D. R. & Salanitri, D., How will drivers interact with vehicles of the future? Royal Automobile Club Foundation for Motoring Report, 2019.
[POL15] Politis, I., Pollick, F. and Brewster, S. Language-based multimodal displays for the handover of control in autonomous cars, Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 3–10, 2015.
Effective Facial Actions for Artificial Agents
Supervisors: Rachael Jack (School of Psychology) and Stacy Marsella (School of Psychology).
Face signals play a critical role in social interactions because humans make a wide range of inferences about others from their facial appearance, including emotional, mental and physiological states, culture, ethnicity, age, sex, social class, and personality traits (e.g., see Jack & Schyns, 2017). These judgments in turn impact how people interact with others, oftentimes with significant consequences such as who is hated or loved, hired or fired (e.g., Eberhardt et al., 2006). However, identifying which face features drive these social judgments is challenging because the human face is highly complex, comprising a high number of different facial expressions, textures, complexions, and 3D shapes. Consequently, no formal model of social face signalling currently exists, which in turn has limited the design of artificial agents’ faces to primarily ad hoc approaches that neglect the importance of facial dynamics (e.g., Chen et al., 2019). This project aims to address this knowledge gap by delivering a formal model of face signalling for use in socially interactive artificial agents.
Specifically, this project will a) model the space of 3D dynamic face signals that drive social judgments during social interactions, b) incorporate this model into artificial agents and c) evaluate the model in different human-artificial agent interactions. The result promises to provide a powerful improvement in the design of artificial agents’ face signalling and social interaction capabilities with broad potential for applications in wider society (e.g., social skills training; challenging stereotyping/prejudice).
The model of face signals will be derived using methods from human psychophysical perception studies (e.g., see Jack & Schyns, 2017), extending the work of Dr Jack to include a wider range of social signals used in social interactions (e.g., empathy, agreeableness, skepticism). Face signals that go beyond natural boundaries, such as hyper-realistic or super stimuli, will also be explored. The resulting model will be incorporated into artificial agents using the public domain SmartBody (Thiebaux et al., 2008) animation platform with possible extension to other platforms. Finally, the model will be evaluated in human-agent interaction using the SmartBody platform, possibly in combination with other modalities including head and eye movements, hand/arm gestures, and transient facial changes such as blushing, pallor, or sweating (e.g., Marsella et al., 2013).
Although there is not a current industrial partner, we expect the work to be very relevant to companies interested in the use of virtual agents for social skills training, such as Medical CyberWorld, and companies working on realistic humanoid robots, such as Furhat and Hanson Robotics. Jack and Marsella have pre-existing relations with these companies.
1. Jack, R. E., & Schyns, P. G. (2017). Toward a social psychophysics of face communication. Annual review of psychology, 68, 269-297.
2. Eberhardt, J. L., Davies, P. G., Purdie-Vaughns, V. J., & Johnson, S. L. (2006). Looking deathworthy: Perceived stereotypicality of Black defendants predicts capital-sentencing outcomes. Psychological science, 17(5), 383-386.
3. Chen, C., Hensel, L. B., Duan, Y., Ince, R., Garrod, O. G., Beskow, J., Jack, R. E. & Schyns, P. G. (2019). Equipping Social Robots with Culturally-Sensitive Facial Expressions of Emotion Using Data-Driven Methods. In: 14th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019), Lille, France, 14-18 May 2019, (Accepted for Publication).
4. Stacy Marsella, Yuyu Xu, Margaux Lhommet, Andrew Feng, Stefan Scherer, and Ari Shapiro, “Virtual Character Performance From Speech”, in Symposium on Computer Animation, July 2013.
5. Stacy Marsella and Jonathan Gratch, “Computationally modeling human emotion”, Communications of the ACM, vol. 57, Dec. 2014, pp. 56-67.
6. Marcus Thiebaux, Andrew Marshall, Stacy Marsella, and Marcelo Kallmann, “SmartBody Behavior Realization for Embodied Conversational Agents”, in Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS), 2008.
Multimodal Interaction and Huggable Robot
Supervisors: Stephen Brewster (School of Computing Science) and Frank Pollick (School of Psychology).
The aim of the project is to investigate the combination of Human Computer Interaction and social/huggable robots for care, the reduction of stress and anxiety, and emotional support. Existing projects, such as Paro (www.parorobots.com) and the Huggable (www.media.mit.edu/projects/huggable-a-social-robot-for-pediatric-care), focus on very simple interactions. The goal of this PhD project will be to create more complex feedback and sensing to enable a richer interaction between the human and the robot.
The plan would be to study two different aspects of touch: thermal feedback and squeeze input/output. These are key aspects of human-human interaction but have not been studied in human-robot settings where robots and humans come into physical contact.
Thermal feedback has strong associations with emotion and social cues [Wil17]. We use terms like ‘warm and loving’ or ‘cold and distant’ in everyday language. By investigating different uses of warm and cool feedback we can facilitate different emotional relationships with a robot. (This could be used alongside more familiar vibration feedback, such as purring). A series of studies will be undertaken looking at how we can use warming/cooling, rate of change and amount of change in temperature to change responses to robots. We will study responses in terms of, for example, valence and arousal.
We will also look at squeeze interaction from the device. Squeezing in real life offers comfort and support. One half of this task will look at squeeze input, with the human squeezing the robot. This can be done with simple pressure sensors on the robot. The second half will investigate the robot squeezing the arm of the human. For this we will need to build some simple hardware. The studies will look at human responses to squeezing, the social acceptability of these more intimate interactions, and emotional responses to them.
The output of this work will be a series of design prototypes and UI guidelines to help robot designers use new interaction modalities in their robots. The impact of this work will be to enable robots to have a richer and more natural interaction with the humans they touch. This has many practical applications for the acceptability of robots for care and emotional support.
[Wil17] Wilson, G., and Brewster, S.: Multi-moji: Combining Thermal, Vibrotactile & Visual Stimuli to Expand the Affective Range of Feedback. In Proceedings of the 35th Conference on Human Factors in Computing Systems – CHI ’17, ACM Press, 2017.
Soft eSkin with Embedded Microactuators
Supervisors: Ravinder Dahiya (School of Engineering) and Philippe Schyns (School of Psychology).
Research on tactile skin or eSkin has attracted significant interest recently as it is the key underpinning technology for safe physical interaction between humans and machines such as robots. Thus far, eSkin research has focussed on imitating some of the features of human touch sensing. However, skin is not just designed for feeling the real world; it is also a medium to express feeling through gestures. For example, the skin on the face, which can fold and wrinkle into specific patterns, allows us to express emotions such as varying degrees of happiness, sadness or anger. Yet, this important role of skin has not received any attention so far. Here, for the first time, this project will explore the emotion signal generation capacities of skin by developing programmable soft eSkin patches with embedded microactuators that will emulate real skin movements. Building on the flexible and soft electronics research in the James Watt School of Engineering and the social robotics research in the Institute of Neuroscience & Psychology, this project aims to achieve the following scientific and technological goals:
- Identify suitable actuation methods to generate simple emotive features such as wrinkles on the forehead
- Develop a soft eSkin patch with embedded microactuators
- Use dynamic facial expression models for specific movement patterns in the soft eSkin patch
- Develop an AI approach to program and control the actuators
Industrial Partners: Project briefly discussed with BMW. They have shown interest, but details could not be discussed as currently we are in the process of filing a patent application.
Conversational Venue Recommendation
Supervisors: Craig Macdonald (School of Computing Science) and Phil McAleer (School of Psychology).
Increasingly, location-based social networks, such as Foursquare, Facebook or Yelp, are replacing traditional static travel guidebooks. Indeed, personalized venue recommendation is an important task for location-based social networks. This task aims to suggest interesting venues that a user may visit, personalized to their tastes and current context, as might be detected from their current location, recent venue visits and historical venue visits. Recent models for venue recommendation have incorporated deep learning techniques, which are able to make effective personalized recommendations.
Venue recommendation is typically deployed such that the user interacts with a mobile phone application. To the best of our knowledge, voice-based venue recommendation has seen considerably less research, but is a rich area for potential improvement. In particular, a venue recommendation agent may be able to elicit further preferences, ask whether the user prefers one venue or another, or ask for clarification of the type of venue or the distance to be travelled to the next venue.
This proposal aims to:
- Develop and evaluate models for making venue recommendations using chatbot interfaces that can be adapted to voice through integration of text-to-speech technology, building upon recent neural network architectures for venue recommendation.
- Integrate additional factors about the personality of the user, or other voice-based context signals (stress, urgency, group interactions) that can inform the venue recommendation agent.
Venue recommendation is an information access scenario for citizens within a “smart city” – indeed, smart city sensors can be used to augment venue recommendation with information about which areas of the city are busy.
[Man18] Contextual Attention Recurrent Architecture for Context-aware Venue Recommendation. Jarana Manotumruksa, Craig Macdonald and Iadh Ounis. In Proceedings of SIGIR 2018.
[Man17] A Deep Recurrent Collaborative Filtering Framework for Venue Recommendation. Jarana Manotumruksa, Craig Macdonald and Iadh Ounis. In Proceedings of CIKM 2017.
[Dev15] Experiments with a Venue-Centric Model for Personalised and Time-Aware Venue Suggestion. Romain Deveaud, Dyaa Albakour, Craig Macdonald, Iadh Ounis. In Proceedings of CIKM 2015.
Language Independent Conversation Modelling
Supervisors: Olga Perepelkina (Neurodata Lab) and Alessandro Vinciarelli (School of Computing Science).
According to Emanuel Schegloff, one of the most important linguists of the 20th century, conversation is the “primordial site of human sociality”, the setting that has shaped human communicative skills from neural processes to expressive abilities [TUR16]. This project focuses on the latter and, in particular, on the use of nonverbal behavioural cues such as laughter, pauses, fillers and interruptions during dyadic interactions. The project targets the following main goals:
- To develop approaches for the automatic detection of laughter, pauses, fillers, overlapping speech and back-channel events in speech signals;
- To analyse the interplay between the cues above and social-psychological phenomena such as emotions, agreement/disagreement, negotiation, personality, etc.
The experiments will be performed over two existing corpora. One includes roughly 12 hours of spontaneous conversations involving 120 persons [VIN15] that have been fully annotated in terms of the cues and the phenomena above. The other is the Russian Acted Multimodal Affective Set (RAMAS), the first multimodal corpus in the Russian language, including approximately 7 hours of high-quality close-up video recordings of faces, speech, motion-capture data and physiological signals such as electro-dermal activity and photoplethysmogram [PER18].
The main motivation behind the focus on nonverbal behavioural cues is that these tend to be used differently in different cultural contexts, but they can still be detected independently of the language being used. In this respect, an approach based on nonverbal communication promises to be more robust when applied to data collected in different countries and linguistic areas. In addition, while the importance of nonverbal communication is widely recognised in social psychology, the way certain cues interplay with social and psychological phenomena still requires full investigation [VIN19].
From a methodological point of view, the project involves the following main aspects:
- Development of corpus analysis methodologies (observational statistics) for the investigation of the relationships between nonverbal behaviour and social phenomena;
- Development of signal processing methodologies for the conversion of speech signals into measurements suitable for computer processing;
- Development of Artificial Intelligence techniques (mainly based on deep networks) for the inference of information from raw speech signals.
From a scientific point of view, the impact of the project will be mainly in Affective Computing and Social Signal Processing [VIN09] while, from an industrial point of view, the impact will be mainly in the areas of Conversational Interfaces (e.g., Alexa and Siri), multimedia content analysis and, in more general terms, Social AI, the application domain encompassing all attempts at making machines capable of interacting with people as people do with one another. For this reason, the project is based on the collaboration between the University of Glasgow and Neurodata Lab (http://www.neurodatalab.com), one of the top companies in Social and Emotion AI.
[PER18] Perepelkina O., Kazimirova E., Konstantinova M. RAMAS: Russian Multimodal Corpus of Dyadic Interaction for Affective Computing. In: Karpov A., Jokisch O., Potapova R. (eds) Speech and Computer. Lecture Notes in Computer Science, vol 11096. Springer, 2018.
[TUR16] S.Turkle, “Reclaiming conversation: The power of talk in a digital age”, Penguin, 2016.
[VIN19] M.Tayarani, A.Esposito and A.Vinciarelli, “What an ‘Ehm’ Leaks About You: Mapping Fillers into Personality Traits with Quantum Evolutionary Feature Selection Algorithms”, accepted for publication by IEEE Transactions on Affective Computing, to appear, 2019.
[VIN15] A.Vinciarelli, E.Chatziioannou and A.Esposito, “When the Words are not Everything: The Use of Laughter, Fillers, Back-Channel, Silence and Overlapping Speech in Phone Calls”, Frontiers in Information and Communication Technology, 2:4, 2015.
[VIN09] A.Vinciarelli, M.Pantic, and H.Bourlard, “Social Signal Processing: Survey of an Emerging Domain”, Image and Vision Computing Journal, Vol. 27, no. 12, pp. 1743-1759, 2009.
Neurobiologically-informed optimization of gamified learning environments
Supervisors: Marios Philiastides (School of Psychology) and Alessandro Vinciarelli (School of Computing Science).
Value-based decisions are often required in everyday life, where we must incorporate situational evidence with past experiences to work out which option will lead to the best outcome. However, the mechanisms that govern how these two factors are weighted are not yet fully understood. Gaining insight into these processes could greatly help towards the optimisation of feedback in gamified learning environments. This project aims to develop a closed-loop biofeedback system that leverages unique ways of fusing electroencephalographic (EEG) and pupillometry measurements to investigate the utility of the noradrenergic arousal system in value judgements and learning.
In recent years, it has become well established that pupil diameter consistently varies with certain decision-making variables such as uncertainty, prediction errors and environmental volatility (Larsen & Waters, 2018). The noradrenergic (NA) arousal system in the brainstem is thought to be driving the neural networks involved in controlling these variables. Despite the increasing popularity of pupillometry in decision-making research, there are still many aspects that remain unexplored, such as the role of the NA arousal system in regulating learning rate, which is the rate at which new evidence outweighs past experiences in value-based decisions (Nassar et al., 2012).
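To make the notion of learning rate concrete, the following is a minimal sketch of a delta-rule (Rescorla-Wagner style) update, a standard textbook formulation and not necessarily the model this project will adopt. The function names, the fixed learning rates and the example outcomes are all illustrative assumptions.

```python
def update_value(value, outcome, learning_rate):
    """Delta-rule update: shift the current estimate towards the new outcome.

    A learning rate near 1 lets new evidence dominate; a rate near 0
    keeps the estimate anchored to past experience.
    """
    prediction_error = outcome - value
    return value + learning_rate * prediction_error


def track(outcomes, learning_rate, initial_value=0.0):
    """Return the value estimate after each outcome in the sequence."""
    value = initial_value
    history = []
    for outcome in outcomes:
        value = update_value(value, outcome, learning_rate)
        history.append(value)
    return history


# Hypothetical outcome sequence: three rewards followed by an omission.
outcomes = [1.0, 1.0, 1.0, 0.0]
slow = track(outcomes, learning_rate=0.1)  # past experience dominates
fast = track(outcomes, learning_rate=0.9)  # new evidence dominates
```

With the high learning rate the estimate collapses almost completely after the single omitted reward, whereas with the low rate it barely moves. A dynamic learning rate, as discussed above, would let this parameter change from trial to trial.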
Developing a neurobiological framework of how NA influences feedback processing and the effect it has on learning rates can potentially enable the dynamic manipulation of learning. Recent studies have used real-time EEG analysis to manipulate arousal levels in a challenging perceptual task, showing that it is possible to improve task performance by manipulating feedback (Faller et al., 2019).
A promising area of application of such real-time EEG analysis is the gamification of learning, particularly in digital learning environments. Gamification in a pedagogical context is the idea of using game features (Landers, 2014) to enable a high level of control over stimuli and feedback. This project aims to dynamically alter learning rates via manipulation of the NA arousal system using known neural correlates associated with learning and decision making such as attentional conflict and levels of uncertainty (Sara & Bouret, 2012). Specifically, the main aims of the project are:
- To model the relationship between EEG, pupil diameter and dynamic learning rate during reinforcement learning (Fouragnan et al., 2015).
- To model the effect of manipulating arousal, uncertainty and attentional conflict on dynamic learning rate during reinforcement learning.
- To develop a digital learning environment that allows for these principles to be applied in a pedagogical context.
Understanding the potential role of the NA arousal system in the way we learn, update beliefs and explore new options could have significant implications in the realm of education and performance. This project will facilitate the creation of an online learning environment which will provide an opportunity to benchmark the utility of neurobiological markers in an educational setting. Success in this endeavour would pave the way for a wide variety of adaptations to learning protocols that could in turn empower a level of learning optimisation and individualisation as feedback is dynamically and continuously adapted to the needs of the learner.
[FAL19] Faller, J., Cummings, J., Saproo, S., & Sajda, P. (2019). Regulation of arousal via online neurofeedback improves human performance in a demanding sensory-motor task. Proceedings of the National Academy of Sciences, 116(13), 6482-6490.
[FOU15] Fouragnan, E., Retzler, C., Mullinger, K., & Philiastides, M. G. (2015). Two spatiotemporally distinct value systems shape reward-based learning in the human brain. Nature communications, 6, 8107.
[LAN14] Landers, R. N. (2014). Developing a theory of gamified learning: Linking serious games and gamification of learning. Simulation & gaming, 45(6), 752-768.
[LAR18] Larsen, R. S., & Waters, J. (2018). Neuromodulatory correlates of pupil dilation. Frontiers in neural circuits, 12, 21.
[NAS12] Nassar, M. R., Rumsey, K. M., Wilson, R. C., Parikh, K., Heasly, B., & Gold, J. I. (2012). Rational regulation of learning dynamics by pupil-linked arousal systems. Nature neuroscience, 15(7), 1040.
[SAR12] Sara, S. J., & Bouret, S. (2012). Orienting and reorienting: the locus coeruleus mediates cognition through arousal. Neuron, 76(1), 130-141.
Testing social predictive processing in virtual reality
Supervisors: Lars Muckli (School of Psychology) and Alice Miller (School of Computing Science).
Virtual reality (VR) is a powerful entertainment tool allowing highly immersive and richly contextual experiences. At the same time, it can be used to flexibly manipulate the 3D (virtual) environment, allowing behavioural experiments to be tailored systematically. VR is particularly useful for social interaction research, because the experimenter can manipulate rich and realistic social environments, and have participants behave naturally within them [RB18].
While immersed in VR, a participant builds an inner map of the virtual space and stores multiple expectations about the mechanisms of the environment, i.e., where objects or rooms are and how the participant can interact with them, but also about the physical and emotional properties of virtual agents (e.g. theory of mind). Using this innovative and powerful technology, it is possible to manipulate both the virtual space and virtual agents within the virtual world, to test participants’ internal expectations and register their reactions to predictable and unpredictable scenarios.
The phenomenon of “change blindness” demonstrates the surprising difficulty observers have in noticing unpredictable changes to visual scenes [SR05]. When presented with two almost identical images, people can fail to notice small changes (e.g. in object colour) and even large changes (e.g. object disappearance). This phenomenon arises because the brain cannot attend to the entire wealth of environmental signals presented to our visual systems at any given moment, and instead uses attentional networks to selectively process the most relevant features whilst ignoring others. Testing which environmental attributes drive the detection of changes can give useful insights on how humans use predictive processing in social contexts.
In this PhD the student will run behavioural and brain imaging experiments in which they will use VR to investigate how contextual information drives predictive expectations in relation to changes to the environment and agents within it. They will investigate if change detection is due to visual attention or to a social cognitive mechanism such as empathy. This will involve testing word recognition whilst taking the visuospatial perspective of the agents previously seen in the VR (e.g. [FKS18]). The student will examine if social contextual information originating in higher brain areas modulates the processing of visual information. In brain imaging literature, an effective method to study contextual feedback information is the occlusion paradigm [MPM19]. Cortical layer specific fMRI is possible with 7T brain imaging; the student will test how top-down signals during social cognition activate specific layers of cortex. This data would contribute to redefining current theories explaining the predictive nature of the human brain.
The student will also develop quantitative models in order to assess developed theories. In recent work [PMT19], model checking was proposed as a simple technology to test and develop brain models. Model checking [CHVB18] involves building a simple, finite state model, and defining temporal properties which specify behaviour of interest. These properties can then be automatically checked using exhaustive search. Model checking can replace the need to perform thousands of simulations to measure the effect of an intervention, or of a modification to the model.
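As an illustration of the idea, the sketch below performs explicit-state checking of a safety property (an invariant that must hold in every reachable state) by exhaustive breadth-first search. The toy model of two processes sharing a lock is a hypothetical stand-in chosen for brevity; it is not one of the brain models discussed in [PMT19], and real model checkers support much richer temporal properties.

```python
from collections import deque

# State layout (assumed for this toy model): (process 0, process 1, lock held?)
INITIAL = ("idle", "idle", False)


def successors(state):
    """Yield every state reachable from `state` in one step."""
    procs, lock_held = list(state[:2]), state[2]
    for i in (0, 1):
        nxt = list(procs)
        if procs[i] == "idle":
            nxt[i] = "waiting"
            yield (nxt[0], nxt[1], lock_held)
        elif procs[i] == "waiting" and not lock_held:
            nxt[i] = "critical"
            yield (nxt[0], nxt[1], True)
        elif procs[i] == "critical":
            nxt[i] = "idle"
            yield (nxt[0], nxt[1], False)


def mutual_exclusion(state):
    """Invariant: the two processes are never both in the critical section."""
    return not (state[0] == "critical" and state[1] == "critical")


def check_invariant(initial, successors, invariant):
    """Explore all reachable states; return None if the invariant always
    holds, otherwise a shortest path (counterexample) to a violating state."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


result = check_invariant(INITIAL, successors, mutual_exclusion)
```

Because every reachable state is visited exactly once, a `None` result establishes the property for all possible executions of the model, which is what distinguishes this approach from running a large number of simulations.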
[MPM19] Morgan, A. T., Petro, L. S., & Muckli, L. (2019). Scene representations conveyed by cortical feedback to early visual cortex can be described by line drawings. Journal of Neuroscience, 39(47), 9410-9423.
[SR05] Simons, D. J., & Rensink, R. A. (2005). Change blindness: Past, present, and future. Trends in cognitive sciences, 9(1), 16-20.
[RB18] de la Rosa, S., & Breidt, M. (2018). Virtual reality: A new track in psychological research. British Journal of Psychology, 109(3), 427-430.
[FKS18] Freundlieb, M., Kovács, Á. M., & Sebanz, N. (2018). Reading Your Mind While You Are Reading—Evidence for Spontaneous Visuospatial Perspective Taking During a Semantic Categorization Task. Psychological science, 29(4), 614-622.
[PMT19] Porr, B., Miller, A., & Trew, A. (2019). An investigation into serotonergic and environmental interventions against depression in a simulated delayed reward paradigm. Adaptive behaviour, (online version available).
[CHVB18] Clarke, E. M., Henzinger, T., Veith, H., & Bloem, R. (2018). Handbook of model checking. Springer.
The coordination of gesture and voice in autism as a window into audiovisual perception of emotion
Supervisors: Frank Pollick (School of Psychology) and Stacy Marsella (School of Psychology).
When we speak we typically combine our speech with gesture and these gestures are often referred to as a back channel of face-to-face communication. Notably, a lack of coordination of gesture and speech is thought to be a diagnostic property of autism. While there have been many studies of the production of gesture and speech in autism (de Marchena & Eigsti, 2010), as well as differences in the perception of human movement (Todorova, Hatton & Pollick, 2019), there has been less investigation of what spatiotemporal properties people are sensitive to in the coordination of gesture and speech. Addressing this issue is important for the development of artificial systems that would include both gesture and speech. If we desire these systems to appear natural, as well as effective for special populations such as those with autism, then we need to find the spatiotemporal parameters of speech-gesture coordination that impact the perception of fluency. This is particularly important for robotic systems as it is known that the physical limits of robots can constrain their perception as natural movement (Pollick, Hale & Tzoneva-Hadjigeorgieva, 2005).
Experiment 1: One window into the perceived coordination of speech and gesture is provided by a fundamental aspect of multisensory perception, namely that sight and sound do not need to be precisely synchronous to be bound together in a unified percept. The amount of asynchrony allowed is up to 300 ms and has been found to vary across tasks and observers. We will take a published set of brief (2.5 to 3.5 s) audiovisual stimuli depicting emotional exchanges of two point-light display actors (Piwek, Pollick & Petrini, 2015) and parametrically vary the asynchrony to determine how asynchrony impacts emotion perception. For a second part of this experiment we will use the motion capture located in the School of Psychology to record typical individuals telling stories and examine how varying audiovisual asynchrony impacts understanding and enjoyment of watching these stories being told. These experiments will be performed with typically developed adults and adults with autism and their results compared.
Experiment 2: Here, we wish to model the combination of gesture and speech using Bayesian causal modeling (Körding, Beierholm, Ma, Quartz, Tenenbaum & Shams, 2007), which allows us to model how both the high-level semantic and low-level physical matches between sight and sound determine whether the audio and visual signals are likely to come from the same source. Using the data from Experiment 1, along with data from new experimental stimuli where the audio and visual signals are incongruent (e.g. a different speaker telling a different story), we will investigate how the fits of the model reflect different physical and semantic matching of the audiovisual pairings. We will investigate whether the model fits are sensitive to differences between the typically developed adults and the adults with autism.
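The core computation of such a causal inference model can be sketched as follows. This is a minimal illustration that follows, as we understand it, the Gaussian formulation of Körding et al. (2007): each cue is a noisy measurement of a single scalar feature (for example event timing), and the model returns the posterior probability that both cues share one source. The parameter values are arbitrary assumptions for illustration and are not fitted to any data.

```python
import math


def likelihood_common(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p=0.0):
    """p(x_v, x_a | one common source), with a Gaussian prior over the source."""
    var_v, var_a, var_p = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2
    denom = var_v * var_a + var_v * var_p + var_a * var_p
    quad = ((x_v - x_a) ** 2 * var_p
            + (x_v - mu_p) ** 2 * var_a
            + (x_a - mu_p) ** 2 * var_v) / denom
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(denom))


def likelihood_separate(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p=0.0):
    """p(x_v, x_a | two independent sources drawn from the same prior)."""
    var_vp = sigma_v ** 2 + sigma_p ** 2
    var_ap = sigma_a ** 2 + sigma_p ** 2
    quad = (x_v - mu_p) ** 2 / var_vp + (x_a - mu_p) ** 2 / var_ap
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(var_vp * var_ap))


def posterior_common(x_v, x_a, sigma_v, sigma_a, sigma_p, p_common, mu_p=0.0):
    """Posterior probability that the visual and auditory cues share a source."""
    like_c1 = likelihood_common(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p)
    like_c2 = likelihood_separate(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p)
    numerator = like_c1 * p_common
    return numerator / (numerator + like_c2 * (1.0 - p_common))


# Illustrative (assumed) parameters: unit sensory noise, broad prior.
params = dict(sigma_v=1.0, sigma_a=1.0, sigma_p=10.0, p_common=0.5)
p_matched = posterior_common(0.0, 0.0, **params)      # cues agree
p_mismatched = posterior_common(5.0, -5.0, **params)  # cues strongly disagree
```

Fitting such a model to behavioural data would amount to estimating the noise and prior parameters per participant or per group, which is where differences between observers could show up.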
Experiment 3: From Experiments 1 and 2 we hope to understand the properties that drive the perception of coordinated speech. This will allow us to use these parameters to drive audiovisual speech on a robot platform to investigate what audiovisual parameter combinations are better received by typical and autistic observers. This final study will be done in collaboration with Autism Foundation Finland.
We hope for these experimental and theoretical analyses to inform our understanding of how best to design coordinated gesture and speech on robots and how autism might influence preferences for different designs.
[DEM10] de Marchena, A., & Eigsti, I. M. (2010). Conversational gestures in autism spectrum disorders: Asynchrony but not decreased frequency. Autism research, 3(6), 311-322.
[KOR07] Körding, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B., & Shams, L. (2007). Causal inference in multisensory perception. PLoS one, 2(9), e943.
[PIW15] Piwek, L., Pollick, F., & Petrini, K. (2015). Audiovisual integration of emotional signals from others’ social interactions. Frontiers in psychology, 6, 611.
[POL05] Pollick, F. E., Hale, J. G., & Tzoneva-Hadjigeorgieva, M. (2005). Perception of humanoid movement. International Journal of Humanoid Robotics, 2(03), 277-300.
[TOD19] Todorova, G. K., Hatton, R. E. M., & Pollick, F. E. (2019). Biological motion perception in autism spectrum disorder: a meta-analysis. Molecular autism, 10(1), 49.
Social Intelligence towards Human-AI Teambuilding
Supervisors: Frank Pollick (School of Psychology) and Reuben Moreton (Qumodo).
Visions of the workplace-of-the-future include applications of machine learning and artificial intelligence embedded in nearly every aspect (Brynjolfsson & Mitchell, 2017). This “digital transformation” holds promise to broadly increase effectiveness and efficiency. A challenge to realising this transformation is that the workplace is substantially a human social environment and machines are not intrinsically social. Imbuing machines with social intelligence holds promise to help build human-AI teams, and current approaches to teaming one human and one machine appear reasonably straightforward to design. However, when more than one human and more than one system work together, the complexity of social interactions increases and we need to understand the society of human-AI teams. This research proposes to take a first step in this direction by considering the interaction of triads containing humans and machines.
Our proposed testbed will be concerned with automatic image classification and we choose this since identity and location recognition is a primary work context of our industrial partner Qumodo. Moreover, there are many image classification systems that have recently shown the ability to approach or exceed human performance. There are two scenarios we would like to examine involving human-AI triads and we term them the sharing problem and the consensus problem:
In the sharing problem we examine two humans teamed with the same AI and examine how the human-AI team is influenced by the learning style of the AI, which after initial training can either learn from a single trainer or from multiple trainers. We will examine how trust in the classifier evolves depending upon the presence/absence of another trainer and the accuracy of the other trainer(s). To obtain precise control the “other” trainer(s) could either be actual operators or simulations obtained by parametrically modifying accuracy based on ground truth. Of interest are the questions of when human-AI teams benefit from pooling of human judgment and if pooling can lead to reduced trust.
In the consensus problem we use the scenario of a human manager who must reach a consensus view based on input from a pair of judgments (human-human, human-AI). This consensus will be reached either with or without “explanation” from the two judgments. To make the experiment tractable we will consider the case of a binary decision (e.g. two facial images are of the same person or a different person). Aspects of the design will be taken from a recent paper examining recognition of identity from facial images (Phillips et al., 2018).
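One simple way to realise the simulated trainers mentioned for the sharing problem is sketched below: each binary ground-truth label is kept with a probability equal to the desired accuracy and flipped otherwise. This is only one possible reading of "parametrically modifying accuracy"; the function names, the independent-error assumption and the sample size are illustrative assumptions.

```python
import random


def simulate_trainer(ground_truth, accuracy, rng):
    """Return binary labels that match the ground truth with probability
    `accuracy`, assuming errors are independent across items."""
    labels = []
    for truth in ground_truth:
        if rng.random() < accuracy:
            labels.append(truth)       # correct judgment
        else:
            labels.append(1 - truth)   # flipped (incorrect) judgment
    return labels


def observed_accuracy(labels, ground_truth):
    """Fraction of labels that agree with the ground truth."""
    matches = sum(1 for label, truth in zip(labels, ground_truth) if label == truth)
    return matches / len(ground_truth)


rng = random.Random(0)  # fixed seed so the simulation is reproducible
ground_truth = [rng.randint(0, 1) for _ in range(10000)]
labels = simulate_trainer(ground_truth, accuracy=0.8, rng=rng)
```

A real operator's errors are unlikely to be independent of item difficulty, so a more faithful simulation might tie the flip probability to per-item properties.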
In addition to these experimental studies we also wish to conduct qualitative studies involving surveys or structured interviews in the workplace to ascertain whether the experimental results are consistent or not with people’s attitudes towards the scenarios depicted in the experiments.
As industry moves further towards AI automation, this research will have substantial impact on future practices within the workplace. Even as AI performance increases, in most scenarios a human is still required to be in the loop. There has been very little research into what such a human-AI integration/interaction should look like. Therefore this research is of pressing importance across a myriad of different sectors moving towards automation.
[BRY17] Brynjolfsson, E., & Mitchell, T. (2017). What can machine learning do? Workforce implications. Science, 358(6370), 1530-1534.
[PHI18] Phillips, P. J., Yates, A. N., Hu, Y., Hahn, C. A., Noyes, E., Jackson, K., … & Chen, J. C. (2018). Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms. Proceedings of the National Academy of Sciences, 115(24), 6171-6176.
Game-based techniques for the investigation of trust for autonomous robots
Supervisors: Alice Miller (School of Computing Science) and Frank Pollick (School of Psychology).
Trustworthiness is a property of an agent or organisation that engenders trust in others. Humans rely on trust in their day-to-day social interactions, be they in the context of personal relationships, commercial negotiation, or organisational consultation (with healthcare providers or employers for example). Social success therefore relies on the evaluation of the trustworthiness of others, and our own ability to present ourselves as trustworthy. If autonomous agents are to be used in a social environment, it is vital that we understand the concept of trustworthiness in this context [DEV18].
Some formal models of trust for autonomous systems have been proposed (e.g. [BAS16]), but these models are geared specifically towards autonomous vehicles. Any proposed model must be evaluated by testing. In many cases this would involve deploying complex hardware in sufficiently realistic scenarios in which trust would be a consideration. However, it is also possible to investigate trust in other scenarios. For example, it has been shown that different interfaces to an automatic image classifier change the calibration of human trust towards the classifier [ING20]. Relevant to social processing, in [GAL19] trust was examined via the use of videos. Here, the responses of human participants to videos involving an autonomous robot in a range of scenarios were used to investigate different aspects of trust.
Another way to generate user data to test formal models is using mobile games. In a recent paper [KAV19], a model of the way that users play games was used to investigate a concept known as game balance. A software tool known as a probabilistic model checker [KWI17] was used to predict user behaviour under the assumptions of the model. Subsequently the game has been released to generate user data in order to evaluate the credibility of the model used.
In this PhD project you will use a similar technique to evaluate trust for autonomous systems. The crucial aspects are the formal models of trust and the question of how to design a suitable game so that the way users respond to different scenarios reflects how much they trust (autonomous robot or animated) characters in the game. You will:
- Develop and evaluate models of trust for autonomous robots
- Devise a mobile game for which players will respond according to their trust in autonomous robot or animated characters
- Use an automatic technique such as model checking or simulation to determine player behaviour under the assumptions of your trust models
- Analyse how well player behaviour matches that predicted using model checking
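To give a flavour of the kind of quantity a probabilistic model checker computes, the sketch below calculates the probability of eventually reaching a target state in a small discrete-time Markov chain by iterating the standard fixed-point equations. The five-state game model and its transition probabilities are hypothetical, and dedicated tools handle far larger models and richer property languages than this plain-Python illustration.

```python
def reachability_probability(transitions, targets, tolerance=1e-12,
                             max_iterations=100000):
    """Probability of eventually reaching a state in `targets` from each state.

    `transitions` maps each state to a dict {successor: probability}.
    Values start at 0 (1 for targets) and are updated until they stabilise.
    """
    prob = {s: (1.0 if s in targets else 0.0) for s in transitions}
    for _ in range(max_iterations):
        delta = 0.0
        for state, successors in transitions.items():
            if state in targets:
                continue
            new = sum(p * prob[nxt] for nxt, p in successors.items())
            delta = max(delta, abs(new - prob[state]))
            prob[state] = new
        if delta < tolerance:
            break
    return prob


def toy_game(p_win_round):
    """Hypothetical game: the score moves up or down by one each round;
    the player loses at score 0 and wins at score 4."""
    q = 1.0 - p_win_round
    return {
        0: {0: 1.0},                       # absorbing loss
        1: {2: p_win_round, 0: q},
        2: {3: p_win_round, 1: q},
        3: {4: p_win_round, 2: q},
        4: {4: 1.0},                       # absorbing win
    }


fair = reachability_probability(toy_game(0.5), targets={4})
skewed = reachability_probability(toy_game(0.6), targets={4})
```

Comparing such predicted probabilities with the win rates observed in released game data is one way the match between model and player behaviour could be assessed.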
[DEV18] Trustworthiness of autonomous systems– K. Devitt, Foundations of Trusted Autonomy, Studies in Systems, Decision and Control, 2018
[BAS16] Trust dynamics in human autonomous vehicle interaction: a review of trust models – C. Basu et al. AAAI 2016.
[ING20] Calibrating trust towards an autonomous image classifier: a comparison of four interfaces – M. Ingram et al., submitted.
[GAL19] Trusting Robocop: Gender-Based Effects on Trust of an Autonomous Robot – D. Gallimore et al. Frontiers in Psychology 2019
[KAV19] Balancing turn-based games with chained strategy generation – W. Kavanagh, A. Miller et al. IEEE Transactions on Games 2019.
[KWI17] Probabilistic model checking: advances and applications – M. Kwiatkowska et al. Formal System Verification 2017.
Cross-cultural detection of and adaptation to different user types for a public-space robot
Supervisors: Monika Harvey (School of Psychology), Mary Ellen Foster (School of Computing Science) and Olga Perepelkina (Neurodata Lab).
It is well known that people from different demographic groups – be it age, gender, socio-economic status, culture to name a few – have different preferred interaction styles. However, when a robot is placed in a public space, it often has a single, standard interaction style that it uses in all situations across the different populations engaging with it. If a robot were able to detect the type of person it was interacting with and adapt its behaviour accordingly on the fly, this would support longer, higher-quality interactions which in turn would increase its utility and acceptance.
The overarching goal of this PhD project is to create such a robot and our collaboration with Neurodata Lab in Russia will allow us to investigate cultural as well as other more common demographic markers. We will further make use of the audiovisual sensing software developed by Neurodata Lab to be implemented in the robot.
As a result, the proposed project will consist of several distinct phases. Firstly, a simple robot system will be built and deployed in various locations across Scotland and Russia, and the audiovisual data of all people interacting with it will be recorded. As a second step, this data will be processed and classified with the aim of identifying characteristic behaviours of different user types. In a further step, the robot behaviour will be modified so that it is able to adapt to the different users, and, in a final step, the modified robot will be evaluated in the original deployment locations.
The results of the project will be of great relevance to our industrial partner, allowing them to further develop and market their audiovisual sensing software. The student will greatly benefit from the industrial as well as the cross-cultural work experience. More generally the results will be of significant interest in areas including social robotics, affective computing, and intelligent user interfaces.
[FOS16] Foster, M. E., Alami, R., Gestranius, O., Lemon, O., Niemelä, M., Odobez, J.-M., & Pandey, A. K. (2016). The MuMMER Project: Engaging Human-Robot Interaction in Real-World Public Spaces. In Social Robotics (pp. 753–763).
[LEA18] Learmonth, G., Maerker, G., McBride, N., Pellinen, P. & Harvey, M. (2018). Right-lateralised lane keeping in young and older British drivers. PLoS One, 13(9),
[MAE19] Maerker, G., Learmonth, G., Thut. G. & Harvey, M. (2019). Intra- and inter-task reliability of spatial attention measures in healthy older adults. PLoS One, 14(2), 1-21.
[PER19] Perepelkina, O., & Vinciarelli, A. (2019). Social and Emotion AI: The Potential for Industry Impact. 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW).
Developing a digital avatar that invites user engagement
Supervisors: Philippe Schyns (School of Psychology) and Mary Ellen Foster (School of Computing Science).
Digital avatars can engage with humans to interact socially. However, before they do so they typically are in a resting, default state. The question that arises is how we should design such digital avatars in a resting state so that they have a realistic appearance that promotes engagement with a human. We will combine methods from human psychophysics, computer graphics, machine vision and social robotics to design a digital avatar (presented in VR or on a computer screen) that looks to a human participant like a sentient being (e.g. with realistic appearance and spontaneous dynamic movements of the face and the eyes), and that can then engage with humans before starting an interaction (i.e. tracks their presence, engages with realistic eye contact and so forth). Building on the strength of digital avatar design in the Institute of Neuroscience and Psychology and the social robotics research in the School of Computing Science, this project will attempt to achieve the following scientific and technological goals:
- Identify the default face movements (including eye movements) that produce a realistic sentient appearance.
- Implement those movements on a digital avatar which can be displayed on a computer screen or in VR.
- Use tracking software to detect human beings in the environment, follow their movements and engage with realistic eye contact.
- Develop models to link human behaviour with avatar movements to encourage engagement.
- Evaluate the performance of the implemented models through deployment in labs and in public spaces.
Improving engagement with mobile health apps by understanding (mis)alignment between design elements and personal characteristics
Supervisors: Aleksandar Matic (Telefonica Alpha) and Esther Papies (School of Psychology)
Mobile health apps have brought a growing enthusiasm related to delivering behavioural and health interventions at low-cost and in a scalable fashion. Unfortunately, the potential impact of mobile health applications has been seriously limited by typically a low user engagement and high drop-out rates. A number of studies unpacked potential reasons for the high drop-out rates including the fit for user’s problems, ease of use, privacy concerns, and trustworthiness [TOR18]. Though the best practices for developing engaging apps have been established, there is a consensus that further engagement improvements require personalisation at an individual level. Yet, the factors that influence engagement at the personal level are very complex and the practice has rarely witnessed individually personalised mobile health apps.
The psychological literature provides numerous clues on how user interaction can be designed in a more engaging way based on personal characteristics. For instance, it is recommended to highlight rewards and social factors for extraverts, safety and certainty for neurotic individuals, achievements and structure for conscientious people [HIR12], or to use external vs internal factors to motivate individuals with high vs low locus of control [CC14]. Developing and testing personalised mobile health apps based on each personal characteristic would require a long process, many A/B trials and significant effort and cost. Perhaps this explains why personalisation has been limited in practice and why most mobile health apps have been designed in a one-size-fits-all manner. Instead of designing and testing each personalised element, this work will apply a different approach, namely a retrospective exploration of a) personal characteristics of individuals who have already used mobile health apps, b) corresponding service design elements, and c) the outcome (drop-out or engagement), together with the links between a) and b) that drive the outcome.
Aims and objectives
This project will deepen understanding of how to personalise mobile health apps to users’ personal characteristics, with the aim of improving engagement and, ultimately, intervention effectiveness. The main objectives will be the following:
- Identify specific links between personal characteristics and service design elements that predict engagement and/or drop-outs
- Explore whether engagement with mobile health apps can be improved by avoiding design elements that are misaligned with personal characteristics and predominantly drive drop-outs, and by reinforcing aligned design elements that predominantly drive engagement
- Deliver a set of takeaways for designing socially intelligent interfaces aware of personal characteristics
An extensive literature review will first be conducted to characterise design elements and their links to personal characteristics that can influence engagement. This will result in a set of hypotheses on the relationship between different personal characteristics and the engagement mechanisms. Subsequently, one or more studies will be conducted to capture personal traits of users who have already used a selected set of relevant mobile health apps. By applying standard statistical methods as well as machine learning (to unpack more complex interplay between personal characteristics and design elements), these data will be used to identify engagement/drop-out predictors. The findings will be used to design and test personalisation in a real-world scenario.
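To make the idea of a link between a personal characteristic and a design element concrete, the sketch below computes engagement rates for each combination of trait level and design element, and then a simple difference-in-differences contrast between them. It is only a toy illustration under stated assumptions: the records are invented, the trait and design-element labels are hypothetical, and the project may well use different statistical or machine learning models.

```python
from collections import defaultdict


def engagement_rates(records):
    """Return {(trait_level, design_element): share of users who stayed engaged}.

    Each record is a tuple (trait_level, design_element, engaged), where
    engaged is True for a retained user and False for a drop-out.
    """
    counts = defaultdict(lambda: [0, 0])  # [number engaged, total users]
    for trait_level, design_element, engaged in records:
        cell = counts[(trait_level, design_element)]
        cell[0] += 1 if engaged else 0
        cell[1] += 1
    return {key: engaged / total for key, (engaged, total) in counts.items()}


def interaction_contrast(rates, trait_levels, design_elements):
    """Difference-in-differences of engagement rates.

    A positive value means the first design element helps the first trait
    level more than it helps the second trait level.
    """
    (t1, t2), (d1, d2) = trait_levels, design_elements
    return (rates[(t1, d1)] - rates[(t1, d2)]) - (rates[(t2, d1)] - rates[(t2, d2)])


# Invented toy records, for illustration only (not real study data).
records = [
    ("high_extraversion", "social_rewards", True),
    ("high_extraversion", "social_rewards", True),
    ("high_extraversion", "structured_goals", True),
    ("high_extraversion", "structured_goals", False),
    ("low_extraversion", "social_rewards", False),
    ("low_extraversion", "social_rewards", True),
    ("low_extraversion", "structured_goals", True),
    ("low_extraversion", "structured_goals", True),
]

rates = engagement_rates(records)
contrast = interaction_contrast(
    rates,
    ("high_extraversion", "low_extraversion"),
    ("social_rewards", "structured_goals"),
)
```

A contrast far from zero would suggest that the effect of a design element depends on the trait, which is the kind of (mis)alignment the hypotheses are meant to capture; a real analysis would also need uncertainty estimates and control for confounders.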
Alignment with industrial interests
This work will be of direct interest to Telefonica Alpha, which is creating mobile-phone-based wellbeing services as well as digital therapeutics.
[TOR18] Torous, J., Nicholas, J., Larsen, M. E., Firth, J., & Christensen, H. (2018). Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements. Evidence-Based Mental Health, 21(3), 116-119.
[HIR12] Hirsh, J. B., Kang, S. K., & Bodenhausen, G. V. (2012). Personalized persuasion: Tailoring persuasive appeals to recipients’ personality traits. Psychological science, 23(6), 578-581.
[CC14] Cobb-Clark, D. A., Kassenboehmer, S. C., & Schurer, S. (2014). Healthy habits: The connection between diet, exercise, and locus of control. Journal of Economic Behavior & Organization, 98, 1-28.
Social Interaction via Touch Interactive Volumetric 3D Virtual Agents
Supervisors: Ravinder Dahiya (School of Engineering) and Philippe Schyns (School of Psychology)
Vision- and touch-based interactions are fundamental modes of interaction between humans, and between humans and the real world. Several portable devices use these modes to display gestures that communicate social messages such as emotions. Recently, non-volumetric 3D displays have attracted considerable interest because they give users a 3D visual experience: for example, 3D movies provide viewers with a perceptual sensation of depth via a pair of glasses. Using a newly developed haptics-based holographic 3D volumetric display, this project will develop new forms of social interaction with virtual agents. Unlike various VR tools that require headsets (which can lead to motion sickness), here the interaction with 3D virtual objects will be less restricted, closer to its natural form, and, critically, will give the user the illusion that the virtual agent is physically present. The experiments will involve interactions with holographically displayed virtual human faces and bodies engaging in various social gestures. To this end, the simulated 2D images showing these various gestures will be displayed mid-air in 3D. For enriched interaction and enhanced realism, this project will also involve hand gesture recognition and the control of haptic feedback (i.e. air patterns) to simulate the surface of several classes of virtual objects. This fundamental study is transformative for sectors where physical interaction with virtual objects is critical, including medical, mental health, sports, education, heritage, security, and entertainment.