An Adaptive Engagement Support Robot Based on Behavior Update Model and Engagement Estimation
Shinobu Hasegawa
Japan Advanced Institute of Science and Technology (JAIST), Japan
This research aims to develop an adaptive engagement support robot based on a behavior update model and engagement estimation. With the development of robotics, there is a growing expectation that robots can help people learn. Such robots include learning partner robots that accompany learners and functional robots that assist with learning. Most of these robots follow predetermined rules and interact with learners based on their emotions or behavior. However, the interactions that are effective differ from learner to learner.
Thus, our main idea is to propose an interaction network model that lets learning partner robots accommodate individual differences and realize adaptive interactions that facilitate engagement in the learning process. This research consists of the following three steps.
1. Engagement estimation: We adopted V. Huynh's approach, which was proposed for a sub-challenge of the 7th Emotion Recognition in the Wild Challenge (EmotiW 2019). His method involves three basic steps: feature extraction, regression, and model combination. First, the input video is divided into segments, and facial features are extracted from them by a pre-trained model. Then, the engagement intensity is predicted as multiple regression tasks, using different LSTM models to capture temporal information. Finally, these models are combined to achieve better performance. In this model, the intensity of engagement is divided into four levels: highly-engaged, engaged, barely-engaged, and disengaged.
2. Interaction network: As the learning partner robot in this research, we employed the communication robot Sota, which has a camera, microphone, speaker, and network functions and can interact with the learner through words and actions. To generate Sota's interactions with learners, we built an interaction network of three layers representing Sota's action, words, and speech rate. It is a 3×4 fully connected network whose initial weights are set manually in advance. The weights are converted into probabilities, which determine the content of the robot's interaction.
3. Adaptation model: We propose an adaptation model that compares the learner's engagement intensity before and after an interaction and updates the interaction network weights accordingly, so that the robot changes the interaction content in response to the learner's reaction. This may enable the robot to personalize its interactions for each learner.
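Steps 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax conversion, the additive update rule, the learning rate, the reading of "3×4" as three layers with four options each, and the treatment of the layers as independent (ignoring the connections between layers) are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

# The three layers named in the text.
LAYERS = ("action", "words", "speech_rate")
# Assumption: four candidate options per layer (one reading of "3x4").
N_OPTIONS = 4


def to_probabilities(weights):
    """Convert one layer's weights to probabilities with a softmax.

    Assumption: the source only says the weights are put into a
    probabilistic form; softmax is one common way to do that.
    """
    peak = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def choose_interaction(network, rng):
    """Sample one option index per layer according to its probabilities."""
    choice = {}
    for layer in LAYERS:
        probs = to_probabilities(network[layer])
        choice[layer] = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return choice


def update_weights(network, choice, before, after, rate=0.5):
    """Reinforce the chosen options if engagement rose, weaken them if it fell.

    before / after are engagement levels from 0 (disengaged) to
    3 (highly-engaged). The additive rule and the rate are hypothetical.
    """
    delta = after - before
    for layer, index in choice.items():
        network[layer][index] += rate * delta
    return network


# Example: uniform initial weights, one interaction, one update.
rng = random.Random(0)
network = {layer: [1.0] * N_OPTIONS for layer in LAYERS}
choice = choose_interaction(network, rng)
update_weights(network, choice, before=1, after=3)
```

In this sketch, an interaction followed by a rise in engagement becomes more likely to be selected again for the same learner, and one followed by a drop becomes less likely, which is the behavior the adaptation model is described as aiming for.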
To verify the effectiveness of the proposed method, we conducted a small experiment with a within-subject design and the following three primary objectives.
A) To evaluate the accuracy of the engagement intensity detection model, the subjects were asked to review their own facial video and score their engagement intensity at 0, 5, 10, 15, 20, 25, and 30 minutes. The model's judgments were then compared with the subjects' self-reported scores. Of 280 judgments, 150 were correct, an accuracy of 53.6%.
B) Sota's interaction was evaluated from the following three perspectives. a) Which condition, with-Sota or without-Sota, kept engagement higher? The results indicated that engagement intensity was higher in the with-Sota condition. b) Which condition improved engagement? The results show that engagement intensity recovered more often in the with-Sota condition. c) How were the timing and content of Sota's interactions perceived? According to the questionnaire results, around 40% of Sota's interactions helped maintain learning engagement.
C) The effectiveness of the proposed adaptive algorithm was verified by comparing the subjects' average satisfaction with the interaction content across the first, second, and third periods of the experiment. The results indicate that subject satisfaction with the interactions increased significantly in the final period as the experiment progressed.
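The accuracy figure in objective A is an agreement rate between the model's judgments and the subjects' self-scores. A minimal sketch of that comparison follows; the function name and the integer encoding of the four levels (0 for disengaged up to 3 for highly-engaged) are assumptions, and the sample lists are invented for illustration rather than taken from the experiment.

```python
def agreement_rate(model_levels, self_levels):
    """Return (number of matches, fraction of matches) between two label lists.

    Each list holds one engagement level per scored time point,
    encoded here as integers 0..3 (an assumed encoding).
    """
    if len(model_levels) != len(self_levels):
        raise ValueError("both lists must cover the same time points")
    if not model_levels:
        raise ValueError("at least one judgment is required")
    correct = sum(m == s for m, s in zip(model_levels, self_levels))
    return correct, correct / len(model_levels)


# Invented example: four time points, three of which agree.
matches, rate = agreement_rate([3, 2, 1, 0], [3, 2, 0, 0])

# The reported totals: 150 correct out of 280 judgments.
reported_rate = round(100 * 150 / 280, 1)  # 53.6
```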
These results suggest that the proposed method is effective to some degree. In the future, we would like to introduce a more robust model for estimating learners' engagement and to accommodate the individual preferences of learners from different cultures and backgrounds.
This work was conducted in collaboration with Mr. Yao Bowei, who completed his master's degree in March 2022, and was supported by JSPS KAKENHI Grant Number 20H04294 and Photron Limited.