Header logo is am


2017


Interactive Perception: Leveraging Action in Perception and Perception in Action
Interactive Perception: Leveraging Action in Perception and Perception in Action

Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., Sukhatme, G.

IEEE Transactions on Robotics, 33, pages: 1273-1291, December 2017 (article)

Abstract
Recent approaches in robotics follow the insight that perception is facilitated by interactivity with the environment. These approaches are subsumed under the term of Interactive Perception (IP). We argue that IP provides the following benefits: (i) any type of forceful interaction with the environment creates a new type of informative sensory signal that would otherwise not be present and (ii) any prior knowledge about the nature of the interaction supports the interpretation of the signal. This is facilitated by knowledge of the regularity in the combined space of sensory information and action parameters. The goal of this survey is to postulate this as a principle and collect evidence in support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of Interactive Perception. We close this survey by discussing the remaining open questions. Thereby, we hope to define a field and inspire future work.

arXiv DOI Project Page [BibTex]

2017

arXiv DOI Project Page [BibTex]


Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning
Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Li, W., Bohg, J., Fritz, M.

arXiv, November 2017 (article) Submitted

Abstract
Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we bypass to explicitly model physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example navigating in a grid world with different target positions and in a block stacking task with different target structures of the final tower. In contrast to prior work, our policies show better generalization across different goals.

arXiv [BibTex]


no image
Event-based State Estimation: An Emulation-based Approach

Trimpe, S.

IET Control Theory & Applications, 11(11):1684-1693, July 2017 (article)

Abstract
An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

arXiv Supplementary material PDF DOI Project Page [BibTex]


Probabilistic Articulated Real-Time Tracking for Robot Manipulation
Probabilistic Articulated Real-Time Tracking for Robot Manipulation

(Best Paper of RA-L 2017, Finalist of Best Robotic Vision Paper Award of ICRA 2017)

Garcia Cifuentes, C., Issac, J., Wüthrich, M., Schaal, S., Bohg, J.

IEEE Robotics and Automation Letters (RA-L), 2(2):577-584, April 2017 (article)

Abstract
We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods. Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter. We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

arXiv video code and dataset video PDF DOI Project Page [BibTex]


no image
Anticipatory Action Selection for Human-Robot Table Tennis

Wang, Z., Boularias, A., Mülling, K., Schölkopf, B., Peters, J.

Artificial Intelligence, 247, pages: 399-414, 2017, Special Issue on AI and Robotics (article)

Abstract
Abstract Anticipation can enhance the capability of a robot in its interaction with humans, where the robot predicts the humans' intention for selecting its own action. We present a novel framework of anticipatory action selection for human-robot interaction, which is capable to handle nonlinear and stochastic human behaviors such as table tennis strokes and allows the robot to choose the optimal action based on prediction of the human partner's intention with uncertainty. The presented framework is generic and can be used in many human-robot interaction scenarios, for example, in navigation and human-robot co-manipulation. In this article, we conduct a case study on human-robot table tennis. Due to the limited amount of time for executing hitting movements, a robot usually needs to initiate its hitting movement before the opponent hits the ball, which requires the robot to be anticipatory based on visual observation of the opponent's movement. Previous work on Intention-Driven Dynamics Models (IDDM) allowed the robot to predict the intended target of the opponent. In this article, we address the problem of action selection and optimal timing for initiating a chosen action by formulating the anticipatory action selection as a Partially Observable Markov Decision Process (POMDP), where the transition and observation are modeled by the \{IDDM\} framework. We present two approaches to anticipatory action selection based on the \{POMDP\} formulation, i.e., a model-free policy learning method based on Least-Squares Policy Iteration (LSPI) that employs the \{IDDM\} for belief updates, and a model-based Monte-Carlo Planning (MCP) method, which benefits from the transition and observation model by the IDDM. Experimental results using real data in a simulated environment show the importance of anticipatory action selection, and that \{POMDPs\} are suitable to formulate the anticipatory action selection problem by taking into account the uncertainties in prediction. We also show that existing algorithms for POMDPs, such as \{LSPI\} and MCP, can be applied to substantially improve the robot's performance in its interaction with humans.

DOI Project Page [BibTex]

DOI Project Page [BibTex]

2014


no image
Pole Balancing with Apollo

Holger Kaden

Eberhard Karls Universität Tübingen, December 2014 (mastersthesis)

[BibTex]

2014

[BibTex]


no image
Wenn es was zu sagen gibt

(Klaus Tschira Award 2014 in Computer Science)

Trimpe, S.

Bild der Wissenschaft, pages: 20-23, November 2014, (popular science article in German) (article)

PDF Project Page [BibTex]

PDF Project Page [BibTex]


no image
Robotics and Neuroscience

Floreano, Dario, Ijspeert, Auke Jan, Schaal, S.

Current Biology, 24(18):R910-R920, sep 2014 (article)

[BibTex]

[BibTex]


no image
Learning Coupling Terms for Obstacle Avoidance

Rai, A.

École polytechnique fédérale de Lausanne, August 2014 (mastersthesis)

Project Page [BibTex]

Project Page [BibTex]


no image
Object Tracking in Depth Images Using Sigma Point Kalman Filters

Issac, J.

Karlsruhe Institute of Technology, July 2014 (mastersthesis)

Project Page [BibTex]

Project Page [BibTex]


Nonmyopic View Planning for Active Object Classification and Pose Estimation
Nonmyopic View Planning for Active Object Classification and Pose Estimation

Atanasov, N., Sankaran, B., Le Ny, J., Pappas, G., Daniilidis, K.

IEEE Transactions on Robotics, May 2014, clmc (article)

Abstract
One of the central problems in computer vision is the detection of semantically important objects and the estimation of their pose. Most of the work in object detection has been based on single image processing and its performance is limited by occlusions and ambiguity in appearance and geometry. This paper proposes an active approach to object detection by controlling the point of view of a mobile depth camera. When an initial static detection phase identifies an object of interest, several hypotheses are made about its class and orientation. The sensor then plans a sequence of viewpoints, which balances the amount of energy used to move with the chance of identifying the correct hypothesis. We formulate an active M-ary hypothesis testing problem, which includes sensor mobility, and solve it using a point-based approximate POMDP algorithm. The validity of our approach is verified through simulation and real-world experiments with the PR2 robot. The results suggest a significant improvement over static object detection

Web pdf link (url) [BibTex]

Web pdf link (url) [BibTex]


Data-Driven Grasp Synthesis - A Survey
Data-Driven Grasp Synthesis - A Survey

Bohg, J., Morales, A., Asfour, T., Kragic, D.

IEEE Transactions on Robotics, 30, pages: 289 - 309, IEEE, April 2014 (article)

Abstract
We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects. Finally for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.

PDF link (url) DOI Project Page [BibTex]

PDF link (url) DOI Project Page [BibTex]


no image
Learning objective functions for autonomous motion generation

Kalakrishnan, M.

University of Southern California, University of Southern California, Los Angeles, CA, 2014 (phdthesis)

Project Page Project Page [BibTex]

Project Page Project Page [BibTex]


no image
A Limiting Property of the Matrix Exponential

Trimpe, S., D’Andrea, R.

IEEE Transactions on Automatic Control, 59(4):1105-1110, 2014 (article)

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Event-Based State Estimation With Variance-Based Triggering

Trimpe, S., D’Andrea, R.

IEEE Transactions on Automatic Control, 59(12):3266-3281, 2014 (article)

PDF Supplementary material DOI Project Page [BibTex]

PDF Supplementary material DOI Project Page [BibTex]


no image
Perspective: Intelligent Systems: Bits and Bots

Spatz, J. P., Schaal, S.

Nature, (509), 2014, clmc (article)

Abstract
What is intelligence, and can we create it? Animals can perceive, reason, react and learn, but they are just one example of an intelligent system. Intelligent systems could be robots as large as humans, helping with search-and- rescue operations in dangerous places, or smart devices as tiny as a cell, delivering drugs to a target within the body. Even computing systems can be intelligent, by perceiving the world, crawling the web and processing â??big dataâ?? to extract and learn from complex information.Understanding not only how intelligence can be reproduced, but also how to build systems that put these ideas into practice, will be a challenge. Small intelligent systems will require new materials and fabrication methods, as well as com- pact information processors and power sources. And for nano-sized systems, the rules change altogether. The laws of physics operate very differently at tiny scales: for a nanorobot, swimming through water is like struggling through treacle.Researchers at the Max Planck Institute for Intelligent Systems have begun to solve these problems by developing new computational methods, experiment- ing with unique robotic systems and fabricating tiny, artificial propellers, like bacterial flagella, to propel nanocreations through their environment.

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Data-driven autonomous manipulation

Pastor, P.

University of Southern California, University of Southern California, Los Angeles, CA, 2014 (phdthesis)

Project Page Project Page [BibTex]

Project Page Project Page [BibTex]


no image
An autonomous manipulation system based on force control and optimization

Righetti, L., Kalakrishnan, M., Pastor, P., Binney, J., Kelly, J., Voorhies, R. C., Sukhatme, G. S., Schaal, S.

Autonomous Robots, 36(1-2):11-30, January 2014 (article)

Abstract
In this paper we present an architecture for autonomous manipulation. Our approach is based on the belief that contact interactions during manipulation should be exploited to improve dexterity and that optimizing motion plans is useful to create more robust and repeatable manipulation behaviors. We therefore propose an architecture where state of the art force/torque control and optimization-based motion planning are the core components of the system. We give a detailed description of the modules that constitute the complete system and discuss the challenges inherent to creating such a system. We present experimental results for several grasping and manipulation tasks to demonstrate the performance and robustness of our approach.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning of grasp selection based on shape-templates

Herzog, A., Pastor, P., Kalakrishnan, M., Righetti, L., Bohg, J., Asfour, T., Schaal, S.

Autonomous Robots, 36(1-2):51-65, January 2014 (article)

Abstract
The ability to grasp unknown objects still remains an unsolved problem in the robotics community. One of the challenges is to choose an appropriate grasp configuration, i.e., the 6D pose of the hand relative to the object and its finger configuration. In this paper, we introduce an algorithm that is based on the assumption that similarly shaped objects can be grasped in a similar way. It is able to synthesize good grasp poses for unknown objects by finding the best matching object shape templates associated with previously demonstrated grasps. The grasp selection algorithm is able to improve over time by using the information of previous grasp attempts to adapt the ranking of the templates to new situations. We tested our approach on two different platforms, the Willow Garage PR2 and the Barrett WAM robot, which have very different hand kinematics. Furthermore, we compared our algorithm with other grasp planners and demonstrated its superior performance. The results presented in this paper show that the algorithm is able to find good grasp configurations for a large set of unknown objects from a relatively small set of demonstrations, and does improve its performance over time.

link (url) DOI [BibTex]

2010


Learning Grasping Points with Shape Context
Learning Grasping Points with Shape Context

Bohg, J., Kragic, D.

Robotics and Autonomous Systems, 58(4):362-377, North-Holland Publishing Co., Amsterdam, The Netherlands, The Netherlands, April 2010 (article)

Abstract
This paper presents work on vision based robotic grasping. The proposed method adopts a learning framework where prototypical grasping points are learnt from several examples and then used on novel objects. For representation purposes, we apply the concept of shape context and for learning we use a supervised learning approach in which the classifier is trained with labelled synthetic images. We evaluate and compare the performance of linear and non-linear classifiers. Our results show that a combination of a descriptor based on shape context with a non-linear classification algorithm leads to a stable detection of grasping points for a variety of objects.

pdf link (url) DOI [BibTex]

2010

pdf link (url) DOI [BibTex]


no image
Policy learning algorithmis for motor learning (Algorithmen zum automatischen Erlernen von Motorfähigkigkeiten)

Peters, J., Kober, J., Schaal, S.

Automatisierungstechnik, 58(12):688-694, 2010, clmc (article)

Abstract
Robot learning methods which allow au- tonomous robots to adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to ful- fill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general ap- proach policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human- like performance. For doing so, we study two major components for such an approach, i. e., firstly, we study policy learning algo- rithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structu- res for task representation and execution.

link (url) [BibTex]


no image
A Bayesian approach to nonlinear parameter identification for rigid-body dynamics

Ting, J., DSouza, A., Schaal, S.

Neural Networks, 2010, clmc (article)

Abstract
For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving an error of up to three times lower than other state-of-the-art machine learning methods.

link (url) [BibTex]


no image
A first optimal control solution for a complex, nonlinear, tendon driven neuromuscular finger model

Theodorou, E. A., Todorov, E., Valero-Cuevas, F.

Proceedings of the ASME 2010 Summer Bioengineering Conference August 30-September 2, 2010, Naples, Florida, USA, 2010, clmc (article)

Abstract
In this work we present the first constrained stochastic op- timal feedback controller applied to a fully nonlinear, tendon driven index finger model. Our model also takes into account an extensor mechanism, and muscle force-length and force-velocity properties. We show this feedback controller is robust to noise and perturbations to the dynamics, while successfully handling the nonlinearities and high dimensionality of the system. By ex- tending prior methods, we are able to approximate physiological realism by ensuring positivity of neural commands and tendon tensions at all timesthus can, for the first time, use the optimal control framework to predict biologically plausible tendon tensions for a nonlinear neuromuscular finger model. METHODS 1 Muscle Model The rigid-body triple pendulum finger model with slightly viscous joints is actuated by Hill-type muscle models. Joint torques are generated by the seven muscles of the index fin-

PDF [BibTex]

PDF [BibTex]


no image
Efficient learning and feature detection in high dimensional regression

Ting, J., D’Souza, A., Vijayakumar, S., Schaal, S.

Neural Computation, 22, pages: 831-886, 2010, clmc (article)

Abstract
We present a novel algorithm for efficient learning and feature selection in high- dimensional regression problems. We arrive at this model through a modification of the standard regression model, enabling us to derive a probabilistic version of the well-known statistical regression technique of backfitting. Using the Expectation- Maximization algorithm, along with variational approximation methods to overcome intractability, we extend our algorithm to include automatic relevance detection of the input features. This Variational Bayesian Least Squares (VBLS) approach retains its simplicity as a linear model, but offers a novel statistically robust â??black- boxâ? approach to generalized linear regression with high-dimensional inputs. It can be easily extended to nonlinear regression and classification problems. In particular, we derive the framework of sparse Bayesian learning, e.g., the Relevance Vector Machine, with VBLS at its core, offering significant computational and robustness advantages for this class of methods. We evaluate our algorithm on synthetic and neurophysiological data sets, as well as on standard regression and classification benchmark data sets, comparing it with other competitive statistical approaches and demonstrating its suitability as a drop-in replacement for other generalized linear regression techniques.

link (url) [BibTex]

link (url) [BibTex]


no image
Stochastic Differential Dynamic Programming

Theodorou, E., Tassa, Y., Todorov, E.

In the proceedings of American Control Conference (ACC 2010) , 2010, clmc (article)

Abstract
We present a generalization of the classic Differential Dynamic Programming algorithm. We assume the existence of state- and control-dependent process noise, and proceed to derive the second-order expansion of the cost-to-go. Despite having quartic and cubic terms in the initial expression, we show that these vanish, leaving us with the same quadratic structure as standard DDP.

PDF [BibTex]

PDF [BibTex]


no image
Learning control in robotics – trajectory-based opitimal control techniques

Schaal, S., Atkeson, C. G.

Robotics and Automation Magazine, 17(2):20-29, 2010, clmc (article)

Abstract
In a not too distant future, robots will be a natural part of daily life in human society, providing assistance in many areas ranging from clinical applications, education and care giving, to normal household environments [1]. It is hard to imagine that all possible tasks can be preprogrammed in such robots. Robots need to be able to learn, either by themselves or with the help of human supervision. Additionally, wear and tear on robots in daily use needs to be automatically compensated for, which requires a form of continuous self-calibration, another form of learning. Finally, robots need to react to stochastic and dynamic environments, i.e., they need to learn how to optimally adapt to uncertainty and unforeseen changes. Robot learning is going to be a key ingredient for the future of autonomous robots. While robot learning covers a rather large field, from learning to perceive, to plan, to make decisions, etc., we will focus this review on topics of learning control, in particular, as it is concerned with learning control in simulated or actual physical robots. In general, learning control refers to the process of acquiring a control strategy for a particular control system and a particular task by trial and error. Learning control is usually distinguished from adaptive control [2] in that the learning system can have rather general optimization objectivesâ??not just, e.g., minimal tracking errorâ??and is permitted to fail during the process of learning, while adaptive control emphasizes fast convergence without failure. Thus, learning control resembles the way that humans and animals acquire new movement strategies, while adaptive control is a special case of learning control that fulfills stringent performance constraints, e.g., as needed in life-critical systems like airplanes. Learning control has been an active topic of research for at least three decades. However, given the lack of working robots that actually use learning components, more work needs to be done before robot learning will make it beyond the laboratory environment. This article will survey some ongoing and past activities in robot learning to assess where the field stands and where it is going. We will largely focus on nonwheeled robots and less on topics of state estimation, as typically explored in wheeled robots [3]â??6], and we emphasize learning in continuous state-action spaces rather than discrete state-action spaces [7], [8]. We will illustrate the different topics of robot learning with examples from our own research with anthropomorphic and humanoid robots.

link (url) [BibTex]

link (url) [BibTex]


no image
Learning, planning, and control for quadruped locomotion over challenging terrain

Kalakrishnan, M., Buchli, J., Pastor, P., Mistry, M., Schaal, S.

International Journal of Robotics Research, 30(2):236-258, 2010, clmc (article)

Abstract
We present a control architecture for fast quadruped locomotion over rough terrain. We approach the problem by decomposing it into many sub-systems, in which we apply state-of-the-art learning, planning, optimization, and control techniques to achieve robust, fast locomotion. Unique features of our control strategy include: (1) a system that learns optimal foothold choices from expert demonstration using terrain templates, (2) a body trajectory optimizer based on the Zero- Moment Point (ZMP) stability criterion, and (3) a floating-base inverse dynamics controller that, in conjunction with force control, allows for robust, compliant locomotion over unperceived obstacles. We evaluate the performance of our controller by testing it on the LittleDog quadruped robot, over a wide variety of rough terrains of varying difficulty levels. The terrain that the robot was tested on includes rocks, logs, steps, barriers, and gaps, with obstacle sizes up to the leg length of the robot. We demonstrate the generalization ability of this controller by presenting results from testing performed by an independent external test team on terrain that has never been shown to us.

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]

2008


no image
Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in com- plex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for opertional space control as a direct inverse model learning problem. A first important insight for this paper is that a physically cor- rect solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function as- sociated with this optimal control problem allows us to formulate a learn- ing algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corre- sponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The applica- tion to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on real, physical Mitsubishi PA-10 medical robotics arm.

link (url) DOI [BibTex]

2008

link (url) DOI [BibTex]


no image
Adaptation to a sub-optimal desired trajectory

M. Mistry, E. A. G. L. T. Y. S. S. M. K.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

PDF [BibTex]


no image
Operational space control: A theoretical and emprical comparison

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

International Journal of Robotics Research, 27(6):737-757, 2008, clmc (article)

Abstract
Dexterous manipulation with a highly redundant movement system is one of the hallmarks of hu- man motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space con- trol in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics imple- mentations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors.

link (url) [BibTex]

link (url) [BibTex]


no image
A library for locally weighted projection regression

Klanke, S., Vijayakumar, S., Schaal, S.

Journal of Machine Learning Research, 9, pages: 623-626, 2008, clmc (article)

Abstract
In this paper we introduce an improved implementation of locally weighted projection regression (LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data. As the key features, our code supports multi-threading, is available for multiple platforms, and provides wrappers for several programming languages.

link (url) [BibTex]

link (url) [BibTex]


no image
Optimization strategies in human reinforcement learning

Hoffmann, H., Theodorou, E., Schaal, S.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

PDF [BibTex]

2007


no image
Machine Learning of Motor Skills for Robotics

Peters, J.

University of Southern California, Los Angeles, CA, USA, University of Southern California, Los Angeles, CA, USA, 2007, clmc (phdthesis)

Abstract
Autonomous robots that can assist humans in situations of daily life have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can accomplish a multitude of different tasks, triggered by environmental context or higher level instruction. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning and human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this thesis, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. As a theoretical foundation, we first study a general framework to generate control laws for real robots with a particular focus on skills represented as dynamical systems in differential constraint form. We present a point-wise optimal control framework resulting from a generalization of Gauss' principle and show how various well-known robot control laws can be derived by modifying the metric of the employed cost function. The framework has been successfully applied to task space tracking control for holonomic systems for several different metrics on the anthropomorphic SARCOS Master Arm. In order to overcome the limiting requirement of accurate robot models, we first employ learning methods to find learning controllers for task space control. However, when learning to execute a redundant control problem, we face the general problem of the non-convexity of the solution space which can force the robot to steer into physically impossible configurations if supervised learning methods are employed without further consideration. This problem can be resolved using two major insights, i.e., the learning problem can be treated as locally convex and the cost function of the analytical framework can be used to ensure global consistency. Thus, we derive an immediate reinforcement learning algorithm from the expectation-maximization point of view which leads to a reward-weighted regression technique. This method can be used both for operational space control as well as general immediate reward reinforcement learning problems. We demonstrate the feasibility of the resulting framework on the problem of redundant end-effector tracking for both a simulated 3 degrees of freedom robot arm as well as for a simulated anthropomorphic SARCOS Master Arm. While learning to execute tasks in task space is an essential component to a general framework to motor skill learning, learning the actual task is of even higher importance, particularly as this issue is more frequently beyond the abilities of analytical approaches than execution. We focus on the learning of elemental tasks which can serve as the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized task representations based on splines or nonlinear differential equations with desired attractor properties. While imitation learning of parameterized motor primitives is a relatively well-understood problem, the self-improvement by interaction of the system with the environment remains a challenging problem, tackled in the fourth chapter of this thesis. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. In conclusion, in this thesis, we have contributed a general framework for analytically computing robot control laws which can be used for deriving various previous control approaches and serves as foundation as well as inspiration for our learning algorithms. We have introduced two classes of novel reinforcement learning methods, i.e., the Natural Actor-Critic and the Reward-Weighted Regression algorithm. These algorithms have been used in order to replace the analytical components of the theoretical framework by learned representations. Evaluations have been performed on both simulated and real robot arms.

[BibTex]

2007

[BibTex]


no image
The new robotics - towards human-centered machines

Schaal, S.

HFSP Journal Frontiers of Interdisciplinary Research in the Life Sciences, 1(2):115-126, 2007, clmc (article)

Abstract
Research in robotics has moved away from its primary focus on industrial applications. The New Robotics is a vision that has been developed in past years by our own university and many other national and international research instiutions and addresses how increasingly more human-like robots can live among us and take over tasks where our current society has shortcomings. Elder care, physical therapy, child education, search and rescue, and general assistance in daily life situations are some of the examples that will benefit from the New Robotics in the near future. With these goals in mind, research for the New Robotics has to embrace a broad interdisciplinary approach, ranging from traditional mathematical issues of robotics to novel issues in psychology, neuroscience, and ethics. This paper outlines some of the important research problems that will need to be resolved to make the New Robotics a reality.

link (url) [BibTex]

link (url) [BibTex]

1996


no image
A Kendama learning robot based on bi-directional theory

Miyamoto, H., Schaal, S., Gandolfo, F., Koike, Y., Osu, R., Nakano, E., Wada, Y., Kawato, M.

Neural Networks, 9(8):1281-1302, 1996, clmc (article)

Abstract
A general theory of movement-pattern perception based on bi-directional theory for sensory-motor integration can be used for motion capture and learning by watching in robotics. We demonstrate our methods using the game of Kendama, executed by the SARCOS Dextrous Slave Arm, which has a very similar kinematic structure to the human arm. Three ingredients have to be integrated for the successful execution of this task. The ingredients are (1) to extract via-points from a human movement trajectory using a forward-inverse relaxation model, (2) to treat via-points as a control variable while reconstructing the desired trajectory from all the via-points, and (3) to modify the via-points for successful execution. In order to test the validity of the via-point representation, we utilized a numerical model of the SARCOS arm, and examined the behavior of the system under several conditions.

link (url) [BibTex]

1996

link (url) [BibTex]


no image
One-handed juggling: A dynamical approach to a rhythmic movement task

Schaal, S., Sternad, D., Atkeson, C. G.

Journal of Motor Behavior, 28(2):165-183, 1996, clmc (article)

Abstract
The skill of rhythmic juggling a ball on a racket is investigated from the viewpoint of nonlinear dynamics. The difference equations that model the dynamical system are analyzed by means of local and non-local stability analyses. These analyses yield that the task dynamics offer an economical juggling pattern which is stable even for open-loop actuator motion. For this pattern, two types of pre dictions are extracted: (i) Stable periodic bouncing is sufficiently characterized by a negative acceleration of the racket at the moment of impact with the ball; (ii) A nonlinear scaling relation maps different juggling trajectories onto one topologically equivalent dynamical system. The relevance of these results for the human control of action was evaluated in an experiment where subjects performed a comparable task of juggling a ball on a paddle. Task manipulations involved different juggling heights and gravity conditions of the ball. The predictions were confirmed: (i) For stable rhythmic performance the paddle's acceleration at impact is negative and fluctuations of the impact acceleration follow predictions from global stability analysis; (ii) For each subject, the realizations of juggling for the different experimental conditions are related by the scaling relation. These results allow the conclusion that for the given task, humans reliably exploit the stable solutions inherent to the dynamics of the task and do not overrule these dynamics by other control mechanisms. The dynamical scaling serves as an efficient principle to generate different movement realizations from only a few parameter changes and is discussed as a dynamical formalization of the principle of motor equivalence.

link (url) [BibTex]

link (url) [BibTex]