Interactive Games With Humanoids: Playing With Jaemi Hubo

Daniel M. Lofaro, Robert Ellenberg, Paul Oh

Abstract— Being an interactive participant in a physical-world game requires the players to stay within the bounds of the rules and the game moderators to recognize when a player has stepped outside of them. For a robot to be an interactive participant in a physical-world game, the robot must understand the rules of the game, sense the world, track the state of the game, and give proper world feedback for the given game. This document gives two examples of how Jaemi Hubo, an adult-size humanoid robot, addressed these challenges when acting as an interactive participant in two physical-world games.

I. INTRODUCTION

In late May 2009 Jaemi Hubo, an adult-size humanoid robot (see Fig. 1), spent three days at the Please Touch Museum (PTM), a children's hands-on museum in Philadelphia, PA. During Jaemi's visit, multiple interactive demonstrations and hands-on games were played with children between the ages of three and seven. The demonstrations consisted of a quick explanation of how humans walk and how robots walk; in this case a simple version of Zero Moment Point (ZMP) walking was explained in easily understood terms [1]. A game of Simon Says was then played with the children, with Jaemi Hubo as Simon. The children played this game at three speeds: slow, normal, and fast. Jaemi Hubo was well received by the children, parents, and press (both print and broadcast). See Fig. 2 for a picture of children playing Simon Says with Jaemi Hubo at the PTM.

In late May 2010 Jaemi Hubo was scheduled to make its second appearance at the Philadelphia Please Touch Museum for a more interactive demonstration. This demonstration would have included the addition of the game Red Light Green Light.

The overarching goal of this effort is to make Jaemi Hubo an interactive participant in physical-world games. Any robot acting as an interactive participant in a physical-world game must understand the rules of the game, sense the world, track the state of the game, and give proper world feedback for the given game. The challenges in implementing such a system included:

• representing the games' rules on a robot platform,
• adhering to the robot's kinematic limitations,
• adhering to the robot's computational capability, and
• keeping the players in a structured environment without the use of non-game-specific rules.

Fig. 1. (LEFT) Jaemi Hubo, a 130 cm tall, 41 degree-of-freedom humanoid robot: full-body view and camera location. (TOP RIGHT) Jaemi Hubo front head view and camera location. (BOTTOM RIGHT) Inside Jaemi Hubo's head, showing the camera location.

Fig. 2. Daniel M. Lofaro (top right) and the robot Jaemi Hubo (center) playing Simon Says with children at the PTM. (Screen shot from ABC News, May 28th, 2009.)

Project supported by the Drexel Autonomous Systems Lab (DASL). Support for this work was provided by a National Science Foundation Partnerships for International Research and Education grant (#0730206).

D. Lofaro is a Ph.D. Candidate with the Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA 19104, USA. [email protected]

R. Ellenberg is a Ph.D. Candidate with the Department of Mechanical Engineering & Mechanics, Drexel University, Philadelphia, PA 19104, USA. [email protected]

P. Oh is with the Department of Mechanical Engineering & Mechanics, Drexel University, Philadelphia, PA 19104, USA. [email protected]
This document gives a brief overview and timeline of commercial robots and their sensing capabilities, and a detailed description of how Simon Says and Red Light Green Light were implemented on Jaemi Hubo, the adult-size humanoid robot.

II. RELATED WORK

Robots are intuitive physical links between the human world (physical) and the computer world (virtual). Traditional Human Interface Devices (HID), such as computer mice and keyboards, take information from the world, primarily user input. However, traditional HIDs do not respond in the physical world; they respond in the virtual world. Robots are the physical embodiment of a computer. This physical embodiment allows the computer world to interact with the physical world, i.e. Human Robot Interaction (HRI). Over the past three decades the commercial world has released multiple robots specifically designed to play games with humans in the physical world.

In 1984 Tomy's Omnibot¹ was introduced. This was an important step because it was one of the first commercially available robots to include simple voice recognition, see Fig. 3.

Fig. 3. Omnibot MKII - 1984 (left) and Omnibot OOM "Hearoid" - 1985 (right). Two examples of Tomy's line of robot platforms. These robots had obstacle avoidance and could be tele-operated and voice controlled (in some models) [2].

In the mid 1980s R.O.B. (Robotic Operating Buddy) was released for the Nintendo Entertainment System (NES), see Fig. 4. When coupled with the NES and a cathode ray tube (CRT) television, R.O.B. was able to play hands-on games with the user. These games included Gyromite² [3], a game where you place a spinning top on the correct color block representing dynamite, and Stack-Up³ [4], a memory game played with the included blocks and tops.

Fig. 4. R.O.B. (Robotic Operating Buddy), released in 1985 as an accessory for the popular Nintendo Entertainment System (NES). When coupled with the NES and a CRT television set, R.O.B. was able to play physical-world games with the user such as Gyromite and Stack-Up. The picture above shows R.O.B. with the Gyromite attachment [5].

In the late 1990s two of the most popular robotic toys were introduced: the Furby (1998) and the Aibo (1999), see Fig. 5. The Furby was targeted at children. It has pressure and infrared (IR) sensors that allow it to detect when it is being handled by a user or when another Furby is in the room. The Aibo is an autonomous robot dog with many degrees of freedom and a variety of sensors including touch, auditory, and video. It has the ability to "learn and mature" as it grows older. The Aibo was initially advertised as a robot pet for children and adults but was soon adopted by the robot community for research purposes, including the study of walking gaits and multi-agent systems, primarily in reference to playing robot soccer (known as RoboCup) [6], [7].

Fig. 5. Furby - 1998 (left) and Aibo - 1999 (right). The Furby had pressure sensors and an infrared (IR) sensor that allowed it to detect when it was being petted or handled by a user or when another Furby was in the area. The Aibo was initially advertised as a robot pet for children and adults but was soon adopted by the robot community for research purposes.

¹ Omnibot: http://en.wikipedia.org/wiki/Omnibot
² Gyromite: http://en.wikipedia.org/wiki/Gyromite
³ Stack-Up: http://en.wikipedia.org/wiki/Stack-Up
Starting in the early 2000s, Dr. Mark Tilden, inventor of BEAM (Biology, Electronics, Aesthetics, and Mechanics) Robotics, began releasing his line of human-interacting robots under the flag of the company WowWee⁴ [8]. These robots include B.I.O. Bugs (2001), Constructobots (2002), G.I. Joe Hoverstrike (2003), RoboSapien (2004), Robosapien v2 (2005), Roboraptor (2005), Robopet (2005), Roboreptile (2006), RS Media (2006, co-developed with Daven Sufer and Maxwell Bogue), Roboquad (2007), Roboboa (2007), and the human-form Femisapien (2008) and Joebot (2009). The primary focus of this line was to make inexpensive, durable, and interactive robots for children of all ages. The most advanced of these is the Joebot, see Fig. 6 [8]. Joebot's abilities include movement detection, object tracking, and processing simple verbal commands [9].

⁴ WowWee: http://www.wowwee.com/

Fig. 6. Robosapien - 2004 (left) and Joebot - 2009 (right). Two examples of Dr. Mark Tilden's line of robot platforms. The primary focus of these robots was to be inexpensive, durable, and interactive for children of all ages [9].

These examples of commercial robots with human-interaction abilities show how simple feedback mechanisms, such as touch sensors or simple cameras, can provide a wide variety of interaction abilities. We present two examples of our own showing how a robot with limited feedback abilities can be an interactive participant in a physical-world game.

III. METHODOLOGY & RESULTS

A. Simon "Jaemi" Says

Simon Says is a children's game played with multiple players and one leader, or "Simon." Simon gives verbal movement instructions to the players, such as "stand on one foot" or "place your right hand on your head." While giving the verbal instruction, Simon performs a motion. The performed motion may be the same as the verbal instruction or it may be different. The objective for the players is to perform the verbal instruction stated by Simon whether or not it corresponds with the movement Simon performed. If a player does not do the correct movement, that player is "out" and has to stop playing until the next round. Simon may increase the speed of the commands as the game progresses. The game is over when there is one player remaining.

In our implementation of Simon Says, Jaemi Hubo acts as Simon, and thus the game is renamed Jaemi Says. Humans act as the players. There are three variables in the Jaemi Says game: the visual gesture command, the verbal gesture command, and the speed. The visual gesture commands are the physical motions that the robot performs; see Table I and Fig. 7 for the list and pictures of the gestures used. These motions were designed to emulate the movements from the classic children's game Simon Says. The verbal gesture commands are the same as the visual gesture commands, except that instead of performing the given gesture, the gesture name is spoken.

Fig. 7. Jaemi Hubo performing four of the gestures for Simon "Jaemi" Says. (TOP LEFT) Simon says raise your right hand. (TOP RIGHT) Simon says raise your left hand. (BOTTOM LEFT) Simon says touch your head. (BOTTOM RIGHT) Simon says put your hands on your hips.
TABLE I
VISUAL AND VERBAL GESTURE COMMANDS

Raise Right Hand    Raise Left Hand          Touch Head
Raise Right Arm     Raise Left Arm           Touch Nose
Right Arm Circle    Left Arm Circle          Clap Hands
Hands on Hips       Rub Stomach              Flap Arms
-                   Right Arm "Choo Choo"    -

The speed is a combination of two variables: how fast the gesture is performed (visual gesture command) and the rate at which the gestures are commanded (verbal gesture command). When playing with children, the game has two set scripts of visual gesture commands consisting of 18 and 13 random gesture commands. The script length and gesture order can be changed at the robot operator's discretion. The corresponding voice commands are given to the players through the speakers in Jaemi Hubo's head, see Fig. 1. The verbal gesture commands and the visual gesture commands have between 50% and 75% correct correspondence. The game starts with the speed set to slow. This allows the children to get accustomed to the robot's commands. After one game at slow speed, three games are played at normal speed. Finally, the game is played at fast speed, also referred to as "robot speed."

A major limitation of the system was that Jaemi Hubo was unable to autonomously obtain pose information on each player. Instead, Jaemi relied on its human partners to provide the pose information for each of the players. This limitation was later addressed in other games Jaemi participated in.

The game was played with ten separate groups of young children and their parents. Most of the children were between three and seven years of age. This event was reported on by television stations including CBS and ABC, see Fig. 2, as well as the radio station KYW News Radio. The event was well received by the children, parents, and press.
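As an illustration of this scripting logic, the following minimal Python sketch generates and runs a Jaemi-Says-style command script. It is a sketch rather than the code run on Jaemi Hubo: the speak() and perform_gesture() helpers are hypothetical stand-ins for the robot's text-to-speech and motion interfaces, and the speed periods are assumed values, not measurements from the paper.

    import random

    # Gesture vocabulary from Table I (abbreviated).
    GESTURES = ["raise right hand", "raise left hand", "touch head",
                "touch nose", "hands on hips", "rub stomach",
                "clap hands", "flap arms"]

    # Seconds between commands for each game speed (assumed values).
    SPEED_PERIODS = {"slow": 4.0, "normal": 2.5, "fast": 1.0}

    def speak(text):
        print("[TTS]", text)                  # stand-in for Jaemi's speech output

    def perform_gesture(name, duration):
        print("[MOTION] %s (%.1f s)" % (name, duration))  # stand-in for motion playback

    def make_script(length, match_ratio):
        """Build (spoken, performed) pairs; the verbal and visual commands
        agree with probability match_ratio (the paper used 50%-75%)."""
        script = []
        for _ in range(length):
            spoken = random.choice(GESTURES)
            if random.random() < match_ratio:
                performed = spoken            # verbal and visual commands correspond
            else:
                performed = random.choice([g for g in GESTURES if g != spoken])
            script.append((spoken, performed))
        return script

    def play_round(script, speed):
        for spoken, performed in script:
            speak("Simon says " + spoken)
            perform_gesture(performed, duration=SPEED_PERIODS[speed])

    play_round(make_script(18, 0.6), "slow")  # an 18-command script, as in the paper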
B. Red Light Green Light

Red Light Green Light is a children's game played with one traffic director and multiple players. The goal of the game is for the players to get from their starting position (far away from the traffic director) to the end position (close to the traffic director) without getting caught. The traffic director is at one end of a linear playing field and the players start at the opposite end. The traffic director has two commands that the players must follow, "Red Light" and "Green Light." Red Light means that the players are not permitted to move. Green Light means the players are permitted to move. The traffic director must have their back facing the players when they say Green Light. The traffic director can turn around after they say Red Light, at which point none of the players should be moving. If the traffic director catches any player moving, that player is "out" and has to stop playing until the next round. The game is over when there is one player remaining.

1) Field Setup: In this game of Red Light Green Light, Jaemi Hubo is the traffic director. Due to the resolution and field of view of Jaemi's camera, and the computational power of Jaemi's computers, only two players are able to play at the same time. One player is located to the front right of Hubo and one to the front left (Fig. 8). The two players are referred to as SA and SB for player A and player B respectively. Key points about the setup of the Red Light Green Light field:

• SA and SB must start in the field of view of the robot; in this case the horizontal and vertical fields of view are 59.4° and 31.5° respectively. The reference point is Jaemi's camera, located in Jaemi's head, Fig. 1.
• SA and SB must move in a straight line toward the robot.
• SA must be on the robot's left side and SB must be on the robot's right side.

Given these restrictions, we had to design a way to keep the children within the bounds of the game without stating strict rules for them to follow. This issue was addressed by setting up a series of circles, referred to as "lily pads." The players have to jump from one lily pad to the next instead of simply running towards the goal. This ensures that the players stay within the horizontal and vertical field of view of the robot and that they move in a straight line on the correct side of the robot. Fig. 9 shows the side view of the "lily pad" setup for Red Light Green Light.

Fig. 8. H: Jaemi Hubo, A: Player "A", B: Player "B." Top view representation of the Red Light Green Light playing area. The angle 59.4° is the horizontal field of view of the camera being used for the game. The arrows denote the direction of travel each player must go. The start and finish lines denote where each player starts and ends.

Fig. 9. H: Jaemi Hubo, A: Player "A", B: Player "B." Side view representation of the Red Light Green Light playing area with the "lily pad" setup. The angle 31.5° is the vertical field of view of the camera being used for the game. The arrows denote the direction of travel each player must go. Each player must hop from one pad to the next until they reach the last pad. Whoever reaches the last pad first without being caught moving by the robot wins.

2) World Interaction: In order for Jaemi Hubo to play Red Light Green Light interactively with the players, Hubo must be able to detect movement and signal which players were moving after "Red Light" is called. The Lucas-Kanade Optical Flow (L-K Optic Flow) algorithm was used as the motion detection method [10]. The L-K Optic Flow algorithm was used instead of the Horn-Schunck Optical Flow (H-S Optic Flow) method [11] because the only information needed is the motion state in a given area; we do not need the magnitude of the movement at each point.

The world feedback used is pointing: the robot points its arm to the side of the scene that is moving, see Fig. 10. The decision tree for the combination of the motion detection and the world feedback can be found in Fig. 11. The steps for determining the values of the decision tree in Fig. 11 are defined in Table II and below. First the optic flow of the intensity image is computed:

    I_f = OpticFlow(I)                                        (1)

where I is the intensity (gray scale) image, I_f is an array of the same size as I containing only binary values, and OpticFlow() is the L-K Optic Flow function:

    I_f(x, y) = { 1 : if OpticFlow(I(x, y)) > R
                { 0 : else                                    (2)

I_f(x, y) represents the movement state at (x, y). If I_f(x, y) = 1 then the point (x, y) has moved since the last recorded frame; if I_f(x, y) = 0 then the point (x, y) has not moved since the last recorded frame. R is the cut point value for the magnitude of the movement. This value is determined by a combination of the user-defined values:

• the time between each video frame, and
• the number of frames being compared.

Fig. 12 shows the above system implemented in OpenCV; in each scene the left panel shows the video frame and the right panel shows I_f.

Fig. 10. Jaemi Hubo (robot in the left of each scene) pointing to the side of its viewable scene that is moving. In the picture, the subject's (human in the right of each scene) arms act as Player A (left arm) and Player B (right arm). Top: Player B is moving and Player A is not. Middle: Player A is moving and Player B is not. Bottom: both Player A and Player B are moving. A video of this system can be found on YouTube at: http://www.youtube.com/watch?v=es_4qYe55sw

Fig. 12. In each scene (Top and Bottom) the left panel shows the video frame and the right panel shows I_f (1 = blue, 0 = black). Top: a player moving on the LEFT side of the viewable scene, triggering a left-side move event. Bottom: a player moving on the RIGHT side of the viewable scene, triggering a right-side move event.
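The paper does not include its OpenCV source, so the sketch below is only one plausible Python reading of Eqs. (1)-(2), assuming the modern OpenCV Python bindings. It approximates the dense motion map I_f by tracking a regular grid of points with OpenCV's pyramidal Lucas-Kanade tracker (cv2.calcOpticalFlowPyrLK); the grid spacing step and cut point R are illustrative values, not parameters from the paper.

    import cv2
    import numpy as np

    def motion_map(prev_gray, gray, R=1.0, step=8):
        """Approximate I_f = OpticFlow(I) of Eqs. (1)-(2): track a grid of
        points with pyramidal Lucas-Kanade flow and mark a grid cell 1 when
        its displacement magnitude exceeds the cut point R."""
        h, w = prev_gray.shape
        ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
        pts = np.stack([xs.ravel(), ys.ravel()], axis=1)
        pts = pts.astype(np.float32).reshape(-1, 1, 2)
        new_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        disp = np.linalg.norm((new_pts - pts).reshape(-1, 2), axis=1)
        moved = (disp > R) & (status.ravel() == 1)
        return moved.reshape(xs.shape)        # binary motion map I_f (downsampled)

Here prev_gray and gray are consecutive grayscale frames, e.g. obtained with cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).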
I_f is then split in half along the x axis (left-right), creating a left image and a right image, referenced as I_fL and I_fR respectively. I_fL and I_fR are then summed and averaged to form M_fL and M_fR respectively:

    M_fL = ( Σ I_fL ) / ( width(I_fL) · length(I_fL) )        (3)

    M_fR = ( Σ I_fR ) / ( width(I_fR) · length(I_fR) )        (4)

A given side is considered moving if M_fL and/or M_fR is greater than a movement cut point value defined as M_c. For our demonstrations M_c was a user-defined value that was set on-line. The decision-tree states are then:

    m_r = TRUE : if M_fR > M_c, FALSE : else                  (5)

    m_l = TRUE : if M_fL > M_c, FALSE : else                  (6)

    m_b = TRUE : if (m_r AND m_l) == TRUE, FALSE : else       (7)

    m_a = TRUE : if (m_r OR m_l) == TRUE, FALSE : else        (8)

Fig. 11. Decision tree for motion detection and movement reaction. Values for the decision tree can be found in Table II.

TABLE II
DECISION TREE STATE DEFINITIONS

Variable Name    Block name in Fig. 11
m_a              Movement Detected
m_b              Right and Left Sides Moving
m_r              Right Side Moving
m_l              Left Side Moving
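As a worked illustration of Eqs. (3)-(8), the sketch below averages the binary motion map over its left and right halves and evaluates the four decision-tree states. It assumes the motion_map() output from the previous sketch; the default cut point M_c and the mapping from states to pointing gestures are illustrative assumptions (the paper sets M_c on-line and encodes the reactions in Fig. 11).

    import numpy as np

    def decision_states(I_f, M_c=0.05):
        """Eqs. (3)-(8): mean motion of each half-image vs. the cut point M_c."""
        w = I_f.shape[1]
        M_fL = I_f[:, :w // 2].mean()         # Eq. (3): sum / (width * length)
        M_fR = I_f[:, w // 2:].mean()         # Eq. (4)
        m_r = bool(M_fR > M_c)                # Eq. (5): right side moving
        m_l = bool(M_fL > M_c)                # Eq. (6): left side moving
        m_b = m_r and m_l                     # Eq. (7): both sides moving
        m_a = m_r or m_l                      # Eq. (8): movement detected
        return m_a, m_b, m_r, m_l

    def react(m_a, m_b, m_r, m_l):
        """A hypothetical reading of the Fig. 11 reactions."""
        if not m_a:
            return "no movement: do nothing"
        if m_b:
            return "point to both sides"
        return "point to the right side" if m_r else "point to the left side"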
The above system knows the rules of the game, is able to get the state of the game through visual methods, and is able to give the proper world feedback through the use of the robot's two arms. The system is fully functional in a lab setting; however, we were unable to test it with a variety of players in 2010, as we did with Simon "Jaemi" Says in 2009 at the PTM, due to extenuating circumstances which prevented us from visiting the PTM in 2010.⁵,⁶

⁵ Jaemi's Big Trip: http://www.youtube.com/watch?v=DF8zAM4FLB4
⁶ Jaemi's Big Fix: http://www.youtube.com/watch?v=A67nY2ifDyY

IV. CONCLUSION & FUTURE WORK

The implementation of Simon Says with Jaemi Hubo, where Jaemi acted as Simon, showed that the game can be played with children, creating a fun and exciting environment. The primary limitation of the system was that it had no world feedback: the players could do any movement they wanted and Jaemi would not know the difference. Jaemi Hubo relied on its human partners to watch the players. The world interaction abilities were increased with the implementation of Red Light Green Light. In this game Jaemi acted as the traffic director. Using a vision system, Jaemi was able to autonomously moderate the game with up to two players. In a future system, support for more than two players is highly desirable. In addition, a wider variety of world feedback mechanisms, such as pose detection and speech recognition, will be implemented.

V. ACKNOWLEDGMENTS

Support for this work was provided by a National Science Foundation - Partnerships for International Research and Education grant (#0730206).

REFERENCES

[1] M. Vukobratovic and D. Juricic, "Contribution to the synthesis of biped gait," IEEE Trans. Biomed. Eng., vol. 16, no. 1, pp. 1-6, 1969.
[2] Wikipedia. (October 2010) Omnibot by Tomy. [Online]. Available: http://en.wikipedia.org/wiki/Omnibot
[3] Wikipedia. (October 2010) Nintendo game - Gyromite. [Online]. Available: http://en.wikipedia.org/wiki/Gyromite
[4] Wikipedia. (October 2010) Nintendo game - Stack-Up. [Online]. Available: http://en.wikipedia.org/wiki/Stack-Up
[5] Wikipedia. (October 2010) R.O.B. - Robotic Operating Buddy. [Online]. Available: http://en.wikipedia.org/wiki/R.O.B.
[6] G. Hornby, S. Takamura, J. Yokono, O. Hanagata, T. Yamamoto, and M. Fujita, "Evolving robust gaits with Aibo," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '00), vol. 3, 2000, pp. 3040-3045.
[7] H. Kitano, "RoboCup Rescue: a grand challenge for multi-agent systems," in Proceedings of the Fourth International Conference on MultiAgent Systems, 2000, pp. 5-12.
[8] WowWee. (October 2010) WowWee - Astonishing Imagination. [Online]. Available: http://www.wowwee.com
[9] D. M. Lofaro, conversation with Dr. Mark Tilden at the 2010 Telluride Workshop held by the Institute of Neuromorphic Engineering (http://www.ine-web.org/index.php), July 2010.
[10] B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proceedings of the DARPA Image Understanding Workshop, 1981, pp. 121-130.
[11] B. Horn and B. Schunck, "Determining optical flow," Artificial Intelligence, vol. 17, pp. 185-204, 1981.