Design and Implementation of Adam: A Humanoid Robotic Head with Social Interaction Capabilities

Abstract: Social robots are being conceived with different characteristics and used in different applications. The growth of social robotics benefits from advances in fabrication, sensing, and actuation technologies, as well as in signal processing and artificial intelligence. This paper presents the design and implementation of the humanoid robotic platform Adam, consisting of a motorized human-like head with precise movements of the eyes, jaw, and neck, together with capabilities for face tracking and vocal conversation using ChatGPT. Adam relies on 3D-printed parts together with a microphone, a camera, a loudspeaker, and commercial control and actuation components.

From a hardware point of view, additive manufacturing has facilitated the design, implementation, and modification of robotic platforms due to the flexibility and customizability that it enables [30]. In addition, advances in actuator manufacturing have enabled social robots to perform precise actions, such as displaying biologically inspired motions using motors of different types and muscle-like actuators [31,32].
In this paper, the design and implementation of the Adam platform are detailed. This platform consists of a humanoid robotic head that can perform human-like head motions and display expressions thanks to its seven degrees of freedom. The platform is also equipped with sound acquisition, processing, and generation capabilities, used in conjunction with ChatGPT to engage in conversations. This 3D-printed platform has been designed for social interaction contexts, such as education and visitor reception. It is easily reproducible and modifiable, allowing for improvements in design or the addition of capabilities. The design is planned for initial mounting on a human-like upper body, with the aim of subsequently adding other body parts to develop a multipurpose modular humanoid robot.
This paper is organized as follows. Section 2 reviews recent work in the domain of human-humanoid robot interaction. Section 3 presents the mechanical aspects of the robot design and implementation. Section 4 covers the electronics and interaction control of the robot. Section 5 concludes the paper and outlines future work.

Related Work
Humanoid robot designs have previously been proposed with different aims. While some were general-purpose open-source designs made for 3D printing, others were made for specific purposes and their designs were not made publicly available for reproduction. Table 1 shows the characteristics of some of the robots mentioned below.

Among the open-source designs, "InMoov" is a 3D-printed robot that is easily replicable [33]. It is a life-size robot with a humanoid design and a large number of degrees of freedom, allowing it to produce human-like motions. It has been used, for example, in research on reception and direction-giving [34], spatial perception and object identification [39], and real-time human motion imitation [40]. Another humanoid robot, "imNEU", was based on InMoov and had features like a differential drive mobile platform [41].
"Roboquin" was presented in [36] as a humanoid robot with human-like body proportions and degrees of freedom, allowing it to emulate human movements. This robot was capable of performing nonverbal communication through gestures. It did not have facial components; it expressed states like sadness, anger, and fear by controlling the direction of the head and the posture of the arms and hands. The absence of facial components was addressed in [37], where the "Berrick" robotic head platform was presented. It was based on the InMoov head and was equipped to perform facial detection and gazing tasks.
Other robotic head platforms worth mentioning include Alan and Alena [38], which are highly customizable and designed to utilize technologies related to sound- and image-processing tasks. Another notable platform is Furhat [42,43], equipped with a 3D display that allows it to show facial movements and expressions. It can both receive and emit sound, enabling it to engage in conversations. Kismet is a robot head with a cartoonish face that can engage in social interactions [44]. Fritz was shown in [45] as a humanoid robot with degrees of freedom in the face and arms that allowed it to interact with people using modalities like speech, facial expressions, and gestures. Additionally, "Han" is a humanoid robot capable of recognizing and interacting with people. It uses cameras and voice recognition technology, can perform complex facial expressions, and has about 40 motors to control its artificial facial muscles. The face is covered in a soft, flesh-like rubber, allowing it to move in a human-like way [46][47][48].
From the above, it can be seen that increasing the number of degrees of freedom, as well as equipping robotic platforms with verbal communication abilities based on strong language models, can enhance their interactions with humans and, thus, their acceptability. Adam's design addresses these considerations in both hardware and software. Manufactured with 3D printing, using commercial control and actuation devices, and built around ChatGPT on the software side, the implementation presented in this paper achieves human-likeness with a high degree of robustness, flexibility, and user-friendliness.

Mechanical Design and Implementation
The humanoid robotic platform Adam is designed to mimic human-like head motion and has seven degrees of freedom. The outer head consists of only three 3D-printed shells: the jaw, the frontal face, and the back of the head. These shells are obtained by splitting a complete humanoid head model without any simplification. The platform was designed with this low number of shells because it improves manufacturability and ease of assembly while ensuring the required degrees of freedom and allowing human-like movements. This aspect is important because it helps ensure the acceptability of the platform when interacting with human users. The robot head is equipped with a camera for face tracking and uses a total of nine servo motors for eye, jaw, and neck movements. The hardware setup involves constructing the robot head structure, connecting the servo motors to a controller board, and interfacing the controller board with a Raspberry Pi. The software setup spans both the controller board and the Raspberry Pi and includes integrating speech recognition and synthesis modules, establishing a connection to ChatGPT, implementing face tracking algorithms, and controlling the servomotors [49].
Figures 1 and 2 show Adam's structural design, with its different layers, as well as its dimensions.The internal support structure consists of two identical and interconnected 3D-printed plates and all other parts are connected to this structure.The different hardware and software aspects of this platform will be presented in the following parts.

Mechanism
Adam incorporates mechanisms that enable precise and lifelike movements. The mechanisms use a total of nine servo motors, as detailed below. These servomotors are precisely controlled by a controller board in coordination with the Raspberry Pi. The mechanisms also rely on 3D-printed parts designed to ensure optimal fit and functionality; these parts provide the structural integrity and flexibility required for the animatronic movements of the robot head. To create lifelike movements, Adam has a total of seven degrees of freedom. Each eye possesses two degrees of freedom, allowing it to pan and tilt, providing a wide range of gaze directions and conveying a sense of focus and attention. The jaw, with its single degree of freedom, can articulate in a natural opening and closing motion, enabling the head to simulate speech through subtle mouth movements. The neck, with its two degrees of freedom, allows for fluid rotation and tilting, granting the head the ability to turn and nod realistically and enhancing its interaction with the environment. Through the combination of these precise and coordinated movements, along with the capacity to engage in vocal conversations, the animatronic robot head Adam achieves a striking level of realism, engaging audiences with its interactivity, lifelike expressions, and seamless integration of motion.
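As a concrete illustration, the degree-of-freedom layout described above can be captured as a small configuration table mapping each DoF to a servo channel with angle limits. This is a minimal hypothetical sketch in Python: the channel numbers and angle limits are illustrative assumptions, not values from the actual platform.

```python
# Hypothetical mapping of Adam's 7 DoF onto its 9 servos. Channel numbers and
# angle limits are illustrative assumptions; the jaw and neck pitch are each
# driven by two physical servos behind one logical channel here.
DOF_MAP = {
    "eye_left_pan":   {"channel": 0, "min_deg": 60, "max_deg": 120},
    "eye_left_tilt":  {"channel": 1, "min_deg": 70, "max_deg": 110},
    "eye_right_pan":  {"channel": 2, "min_deg": 60, "max_deg": 120},
    "eye_right_tilt": {"channel": 3, "min_deg": 70, "max_deg": 110},
    "jaw":            {"channel": 4, "min_deg": 0,  "max_deg": 30},   # 2 servos
    "neck_yaw":       {"channel": 6, "min_deg": 40, "max_deg": 140},  # 100 deg interval
    "neck_pitch":     {"channel": 7, "min_deg": 60, "max_deg": 120},  # 60 deg, 2 servos
}

def command(dof: str, angle_deg: float) -> tuple:
    """Clamp a requested angle to the DoF's limits and return (channel, angle)."""
    cfg = DOF_MAP[dof]
    clamped = max(cfg["min_deg"], min(cfg["max_deg"], angle_deg))
    return cfg["channel"], clamped
```

Keeping the limits in one table makes it straightforward to retune the platform after a mechanical modification without touching the control code.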

Eye Movements
As shown in Figure 3, Adam's eye movements are controlled by four MG90S (available: https://www.towerpro.com.tw/product/mg90s-3/, accessed on 10 March 2024) servo motors, with two dedicated to each eye. One servo motor controls the vertical movement of the eye, allowing it to be oriented up and down, while the other controls the horizontal movement, enabling the eye to turn left and right. These two degrees of freedom per eye are independently controlled and make it possible to simulate natural eye movement. This mechanism allows the robot head to establish eye contact, convey emotions through subtle eye gestures, and focus its attention on individuals. The dynamic and realistic movements of the eyes contribute significantly to Adam's ability to engage its audience. The eye mechanisms were designed using 3D modeling software, and the corresponding parts were 3D-printed to ensure precise alignment and smooth movements. The 3D-printed components provide the range of motion and eye stability needed to accurately track and engage with the user.
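A minimal sketch of how a normalized gaze target could be converted into the pan and tilt servo angles described above; the center angles and angular throws are illustrative assumptions, not the platform's actual calibration.

```python
# Convert a normalized gaze target to eye pan/tilt servo angles.
# Center angles and throws below are illustrative assumptions.
PAN_CENTER, PAN_THROW = 90.0, 25.0    # degrees
TILT_CENTER, TILT_THROW = 90.0, 15.0  # degrees

def gaze_to_angles(gx: float, gy: float) -> tuple:
    """gx, gy in [-1, 1] (right/up positive) -> (pan, tilt) servo angles."""
    gx = max(-1.0, min(1.0, gx))  # clamp so the eyes never over-rotate
    gy = max(-1.0, min(1.0, gy))
    return PAN_CENTER + PAN_THROW * gx, TILT_CENTER + TILT_THROW * gy
```

Since the two eyes normally move conjugately, the same command can be sent to both pan servos and both tilt servos.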

Jaw Movements
Adam's jaw, as shown in Figure 4, is equipped with two HK15298 (available: https://hobbyking.com/en_us/hobbykingtm-hk15298-high-voltage-coreless-digital-servo-mgbb-15kg-0-11sec-66g.html, accessed on 10 March 2024) servo motors responsible for its up and down movements in one degree of freedom. These servo motors control the opening and closing of the jaw, allowing the robot head to simulate speech and produce realistic movements synchronized with the audio output when delivering oral responses during interactions with humans. The jaw mechanism relies on 3D-printed components, ensuring its flexibility and durability. When the robot is answering questions, the articulating jaw brings an added layer of realism to the animatronic head, making it engaging and relatable to observers.
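The speech-synchronized jaw motion can be sketched as a simple time-based profile; the amplitude, rest angle, and flapping rate below are illustrative assumptions rather than the robot's actual parameters.

```python
import math

# Jaw "flapping" profile while a response is being played.
# Amplitude, rate, and rest angle are illustrative assumptions.
JAW_REST_DEG = 0.0   # closed position
JAW_OPEN_DEG = 25.0  # widest opening while speaking
FLAP_HZ = 3.0        # opening/closing cycles per second

def jaw_angle(t: float, speaking: bool) -> float:
    """Jaw angle at time t (seconds); returns the rest angle when not speaking."""
    if not speaking:
        return JAW_REST_DEG
    # Rectified half-sine so the jaw only opens downward from rest.
    return JAW_OPEN_DEG * abs(math.sin(math.pi * FLAP_HZ * t))
```

Driving the jaw from elapsed playback time keeps the mouth motion loosely synchronized with the loudspeaker output without any audio analysis.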

Neck Movements
Adam's neck comprises three HS-805BB (available: https://hitecrcd.com/products/servos/analog/giant-analog/hs-805bb/product, accessed on 10 March 2024) servo motors that enable two degrees of freedom, as shown in Figure 5. One servo motor controls the rotation of the neck in the horizontal plane, allowing it to turn right and left. The other two servo motors are responsible for the vertical movement of the neck, enabling it to tilt up and down. The neck mechanism relies on 3D-printed components that provide the strength and precision needed for smooth, realistic movements that mimic those of a human neck. The neck is designed to have a pitch interval of 60° and a yaw interval of 100°, as depicted in Figure 6.

Kinematics and Stress Analysis
As shown above, three independent mechanisms control the head movement: the eye mechanism, the jaw mechanism, and the neck mechanism, all of which are adaptations of a four-bar linkage. The head weighs 1.05 kg and is supported solely by the neck, making the neck the most critical structural component. This section provides the kinematic and structural analysis of the neck mechanism.
The pitching motion is driven by two synchronized servo motors actuating a set of parallel four-bar linkages with identical input but moving in opposite directions.
Figure 7 shows a simplified kinematic diagram of the linkages; O4B is the crank, which is actuated by the servo motor, AB is the coupler link, and O2A is the output link, which is part of the internal frame of the head. The rotation of the output link produces the pitching motion of the head. The folded-in configuration of the crank-coupler is selected because it delivers a good overall transmission angle and keeps the neck compact. A kinematic motion study is performed using the SolidWorks motion analysis package by simulating the crank rotation for one complete cycle of operation. The transmission angle is the angle between the coupler and the rocker; it is an important parameter indicating the efficiency of force transfer between the links. The motion analysis results show that the transmission angle varies from 47° to 95°, as shown in Figure 8, ensuring efficient and smooth motion.
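The transmission angle of such a four-bar linkage can also be computed in closed form using the law of cosines on the diagonal joining the crank pin to the rocker pivot. The sketch below assumes illustrative link lengths, not the actual dimensions of Adam's neck mechanism.

```python
import math

# Transmission angle of a four-bar linkage versus crank angle, via the law of
# cosines on the B-O2 diagonal. Link lengths are illustrative assumptions.
GROUND = 40.0   # O2-O4 distance (mm)
CRANK = 15.0    # O4-B (mm), driven by the servo
COUPLER = 45.0  # A-B (mm)
ROCKER = 35.0   # O2-A (mm), part of the head's internal frame

def transmission_angle_deg(crank_deg: float) -> float:
    """Angle between coupler and rocker for a given crank angle (degrees)."""
    theta = math.radians(crank_deg)
    # Squared diagonal from crank pin B to the rocker pivot O2.
    diag_sq = CRANK**2 + GROUND**2 - 2 * CRANK * GROUND * math.cos(theta)
    # Law of cosines in triangle A-B-O2 gives the transmission angle at A.
    cos_gamma = (COUPLER**2 + ROCKER**2 - diag_sq) / (2 * COUPLER * ROCKER)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_gamma))))
```

Sweeping the crank over a full cycle with such a function reproduces the kind of transmission-angle curve obtained from the SolidWorks motion study.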
The head's center of mass is offset from the neck pivot by a radial distance of 70 mm, creating unbalanced forces whose magnitude varies with the angle of the head. This off-balance mass also produces inertial forces, especially during sudden changes of direction such as in a nodding motion. A dynamic force analysis is, hence, appropriate for capturing the internal forces in the system. The nodding motion is simulated in SolidWorks by applying two virtual motors representing the servomotors, with a pitching angle of 60° to match the range of motion of the actual robot. These motors are set to oscillate at 1 Hz around the horizontal plane to simulate the nodding motion. The results show a peak motor torque of 325 N-mm (see Figure 9). This peak occurs when the head changes direction from a downward to an upward motion; at this position, the center of gravity of the head is furthest from the neck pivot, which creates a high unbalanced force. The structural analysis presented below is performed in this critical position with the highest internal forces. The two notches observed in the chart around 0.2 s and 0.8 s correspond to the position where the head is perfectly upright and the center of mass is vertically above the neck pivot; there, the head is perfectly balanced and very little motor torque is required to support it.
Stress analysis performed on the important load-bearing members of the neck is depicted in Figure 10. The dynamic forces applied to each of the members are obtained directly from the SolidWorks motion study. They are applied as remote loads in the simulation and are shown in the figure using pink lines; the region of maximum stress is highlighted and the corresponding values are displayed. All the analyzed critical components show stresses below the yield strength, and the corresponding values are recorded in Table 2.
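A rough hand check of the simulated peak torque can be made by treating the head as a point mass on a lever arm and summing the gravitational and inertial contributions of a 1 Hz sinusoidal nod. Assuming the 60° pitch range is split as ±30° about the vertical (an assumption; the paper does not state the split), the estimate lands in the same range as the 325 N-mm simulated peak, which additionally accounts for the full geometry and link masses:

```python
import math

# Back-of-the-envelope check of the peak motor torque, treating the head as a
# point mass. Mass, offset, frequency, and pitch range come from the paper;
# the +/-30 deg split and the point-mass approximation are assumptions.
MASS = 1.05       # head mass, kg
RADIUS = 0.070    # center-of-mass offset from neck pivot, m
G = 9.81          # gravitational acceleration, m/s^2
THETA0 = math.radians(30.0)  # assumed peak tilt from vertical
FREQ = 1.0        # nodding frequency, Hz
N_MOTORS = 2      # synchronized pitch servos share the load

# Gravitational torque at peak tilt.
tau_gravity = MASS * G * RADIUS * math.sin(THETA0)
# Inertial torque for sinusoidal nodding: alpha_max = theta0 * (2*pi*f)^2.
inertia = MASS * RADIUS**2  # point-mass moment of inertia about the pivot
tau_inertia = inertia * THETA0 * (2 * math.pi * FREQ) ** 2
tau_per_motor_nmm = (tau_gravity + tau_inertia) / N_MOTORS * 1000.0
print(f"estimated peak torque per motor: {tau_per_motor_nmm:.0f} N-mm")
```

This crude estimate gives roughly 230 N-mm per motor, i.e. the same order of magnitude as the simulation, which is a useful sanity check on the motion-study setup.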

Interaction and Control
In terms of electronics, Adam integrates various components to facilitate communication, control, and power distribution. Control is performed with a Raspberry Pi 4, which interfaces with an EZ-B robot controller board (available: https://www.ez-robot.com/store/p24/EZB-smart-robot-controller.html, accessed on 10 March 2024), an EZ-B robotics camera (available: https://www.ez-robot.com/store/p64/robotics-camera.html, accessed on 10 March 2024), a Turtle Beach USB microphone (available: https://uk.turtlebeach.com/pages/stream-mic, accessed on 10 March 2024), and a loudspeaker. These components work in concert to provide the robot's interaction capabilities. The controller board drives the actuators in conjunction with the Raspberry Pi. The camera module captures real-time video input for face tracking and recognition. The USB microphone captures user speech input, which is converted to text for prompting ChatGPT. Adam's image-processing component uses the EZ-B robotics camera module and employs algorithms for detecting and tracking faces through the controller board.

Visual Tracking
The algorithm used for visual tracking consists of a face detection module that provides the face location in the image. This location is then used to control the servomotors in order to keep the face at the center of the image. For instance, if the person is detected in the upper left part of the image, the neck servomotors rotate the head up and to the left. A visual scene captured by Adam with face detection is illustrated in Figure 11.
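The centering behavior described above can be sketched as a proportional controller on the pixel error between the face center and the image center; the frame size, gains, and sign conventions below are illustrative assumptions, not the actual implementation.

```python
# Minimal proportional face-centering controller. Frame size, gains, and sign
# conventions are illustrative assumptions, not values from the platform.
FRAME_W, FRAME_H = 320, 240    # assumed camera resolution
KP_YAW, KP_PITCH = 0.05, 0.05  # degrees of neck motion per pixel of error

def track_step(face_cx: float, face_cy: float,
               yaw_deg: float, pitch_deg: float) -> tuple:
    """One control step: return updated neck (yaw, pitch) from the face center."""
    err_x = face_cx - FRAME_W / 2  # >0: face is right of the image center
    err_y = face_cy - FRAME_H / 2  # >0: face is below the image center
    # Turn toward the face: larger pitch is taken to mean head tilted up.
    return yaw_deg + KP_YAW * err_x, pitch_deg - KP_PITCH * err_y
```

Calling this once per detected frame nudges the head toward the face; the gains trade responsiveness against overshoot and can be tuned empirically.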

Conversational System
Adam's conversational ability is centered around ChatGPT and is achieved in several stages, as illustrated in Figure 12:
• Sound acquisition: the utterance of the user interacting with Adam is recorded through a microphone connected to the Raspberry Pi.
• Speech recognition: the recorded speech signal is processed by a Python speech recognition library (available: https://pypi.org/project/SpeechRecognition/, accessed on 10 March 2024) that returns the sequence of uttered words.
• Prompt and response: the sequence of uttered words is sent as a prompt to ChatGPT through the OpenAI Python library (available: https://platform.openai.com/docs/libraries, accessed on 10 March 2024), which returns the response as a sequence of words. The prompt also contains the previous parts of the conversation, so that they are taken into account in the returned answer. The answer is also kept reasonably short so that the conversation remains lively on both sides.
• Speech synthesis and playing: the sequence of words returned by ChatGPT is synthesized into a sound signal with a Python text-to-speech conversion library (available: https://pypi.org/project/pyttsx3/, accessed on 10 March 2024) and played through a loudspeaker connected to the Raspberry Pi. In conjunction with the loudspeaker sound emission, the jaw is activated to move up and down repeatedly; it returns to its initial position and stops when the speech utterance is completed.
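The stages above can be sketched as a single loop. The three libraries are the ones named in the text; the system prompt, model name, and history-trimming policy below are illustrative assumptions for the sketch, not the platform's actual configuration.

```python
# Sketch of one conversational cycle, assuming the SpeechRecognition,
# openai (>= 1.0), and pyttsx3 packages named in the text. The system prompt,
# model name, and history-trimming policy are illustrative assumptions.
SYSTEM_PROMPT = "You are Adam, a friendly robotic receptionist. Keep answers brief."
MAX_TURNS = 6  # how many past user/assistant turns are resent as context

def build_messages(history: list, user_text: str) -> list:
    """Assemble the prompt: system role, trimmed history, then the new utterance."""
    recent = history[-2 * MAX_TURNS:]  # each turn is a user + assistant pair
    return [{"role": "system", "content": SYSTEM_PROMPT},
            *recent,
            {"role": "user", "content": user_text}]

def converse_once(history):
    """One acquisition -> recognition -> ChatGPT -> synthesis cycle."""
    # Third-party imports are kept local so the prompt logic above stays
    # importable without the audio/LLM dependencies installed.
    import speech_recognition as sr
    import pyttsx3
    from openai import OpenAI

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:            # sound acquisition
        audio = recognizer.listen(source)
    text = recognizer.recognize_google(audio)  # speech recognition
    reply = OpenAI().chat.completions.create(  # prompt and response
        model="gpt-3.5-turbo",
        messages=build_messages(history, text),
    ).choices[0].message.content
    history += [{"role": "user", "content": text},
                {"role": "assistant", "content": reply}]
    engine = pyttsx3.init()                    # speech synthesis and playing
    engine.say(reply)                          # the jaw flaps while this plays
    engine.runAndWait()
```

Trimming the resent history bounds both the prompt size and the response latency, which matters for keeping the interaction lively.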
Compared with existing sound-based conversational systems (see [50,51], for example), Adam features a human-like animated embodiment that enriches the interaction with humans and makes it more readily accepted. Additionally, the current software structure of the conversational system makes it easily configurable and modifiable and allows it to benefit from emerging libraries and toolboxes that may improve its performance.

Conversational System Evaluation
Adam's conversational abilities were tested in different environments, with ten people of both genders and different age groups asking Adam questions. In each case, the interacting person was in front of Adam, looking at it, while Adam's head was oriented toward the person. Although not strictly necessary, a screen displayed signaling information, such as the times when the robot was acquiring sound from the user. As stated above, Adam's conversational interaction relies on libraries requiring an internet connection, especially the OpenAI Python library. The speed of the connection and the responsiveness of the related servers can affect the time the robot takes to answer a user's utterance. However, the robot's answers were fast enough not to affect the interactions negatively, and the participants did not report any issues related to this aspect. The testing involved asking each person five different questions, with the robot accurately answering four or five of these per person. This demonstrates the effectiveness of the speech recognition, ChatGPT connection, and speech synthesis modules, as well as their operation as a cohesive unit. The results of this test are reported in Figure 13.

Conclusions and Future Work
This paper presented the successful development and implementation of the humanoid robotic head Adam; a detailed statistical study of its acceptability and usefulness in specific roles is planned for future research.
Adam incorporates a mechanism with servo motors and a servo controller board to enable precise and lifelike movements of the eyes, jaw, and neck. The system includes a camera for face tracking and integrates with the ChatGPT server to generate responses. The performance of the system was evaluated through user testing sessions, including the accuracy of responses and the ability to maintain eye contact with users. Adam has also been showcased at several events and was positively received by most people who interacted with it.
Adam can be used as an interactive research platform for human-robot interaction in areas like education and visitor reception. It combines 3D modeling and additive manufacturing with actuation and computing technologies in a configurable humanoid embodiment, ensuring structural integrity and flexibility with smooth and realistic movements. Compared with existing robotic platforms (whose characteristics are summarized in Table 1), Adam features a relatively high number of degrees of freedom. It has both vision and sound acquisition capabilities, along with sound emission. It also offers realistic expressiveness and a lifelike design that is easy to manufacture and assemble, with a production cost not exceeding USD 650. Beyond its ability to track persons visually and engage in conversations, Adam's programmability and multiple degrees of freedom extend its potential for additional tasks and features.
Moving forward, future work on this project involves expanding Adam's capabilities. The next phase will focus on developing a body to accompany the robot head, enabling a more complete humanoid appearance. Plans also include incorporating skin-like materials to enhance the face, making it more human-like in appearance and texture; this will improve Adam's facial movements and positively impact its communication with humans. Additionally, Adam's interaction capabilities can be extended to handle more than one person at a time. In the presence of several persons, the robot needs to identify its current interlocutor at any given moment. This can be done by detecting the speaking person through lip motion or by estimating the direction of sound emission as a first step, followed by orienting the head toward this person. To this end, additional signal processing hardware and software can be added, and the Adam platform offers the flexibility to allow this.

Figure 2 .
Figure 2. Dimensions of the Adam robotic head. All measurements are in millimeters.

Figure 7 .
Figure 7. (a) Side view of the neck pitching mechanism with the kinematic chain highlighted in orange; (b) kinematic diagram of the pitching mechanism showing the transmission angle γ.

Figure 10 .
Figure 10. FEA von Mises stresses in the important neck components.

Figure 11 .
Figure 11. Detection of the face of a person interacting with Adam. Note that the face is blurred in the current illustration. A demonstration video of Adam's operation is available (https://youtu.be/6w9tZgyRsAs?si=9OjM9k1w_Xy-wXd_, accessed on 10 March 2024).

Figure 12 .
Figure 12. Consecutive steps in the conversational block of Adam.

Figure 13 .
Figure 13. Number of accurate answers provided by Adam to the five questions asked by each of the ten interlocutors.

Table 1 .
Some characteristics of humanoid robots surveyed in Section 2.

Table 2 .
Finite element analysis results for the critical components of Adam.