Header logo is am


2017


Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets

Hausman, K., Chebotar, Y., Schaal, S., Sukhatme, G., Lim, J.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

pdf video [BibTex]

2017

pdf video [BibTex]


On the Design of {LQR} Kernels for Efficient Controller Learning
On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]


Optimizing Long-term Predictions for Model-based Policy Search
Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, modelbased RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

PDF Project Page [BibTex]

PDF Project Page [BibTex]


no image
Learning optimal gait parameters and impedance profiles for legged locomotion

Heijmink, E., Radulescu, A., Ponton, B., Barasuol, V., Caldwell, D., Semini, C.

Proceedings International Conference on Humanoid Robots, IEEE, 2017 IEEE-RAS 17th International Conference on Humanoid Robots, November 2017 (conference)

Abstract
The successful execution of complex modern robotic tasks often relies on the correct tuning of a large number of parameters. In this paper we present a methodology for improving the performance of a trotting gait by learning the gait parameters, impedance profile and the gains of the control architecture. We show results on a set of terrains, for various speeds using a realistic simulation of a hydraulically actuated system. Our method achieves a reduction in the gait's mechanical energy consumption during locomotion of up to 26%. The simulation results are validated in experimental trials on the hardware system.

paper [BibTex]

paper [BibTex]


no image
A New Data Source for Inverse Dynamics Learning

Kappler, D., Meier, F., Ratliff, N., Schaal, S.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

[BibTex]

[BibTex]


no image
Bayesian Regression for Artifact Correction in Electroencephalography

Fiebig, K., Jayaram, V., Hesse, T., Blank, A., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 131-136, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

DOI [BibTex]

DOI [BibTex]


no image
Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces

Grossberger, L., Hohmann, M. R., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 160-164, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

DOI [BibTex]

DOI [BibTex]


On the relevance of grasp metrics for predicting grasp success
On the relevance of grasp metrics for predicting grasp success

Rubert, C., Kappler, D., Morales, A., Schaal, S., Bohg, J.

In Proceedings of the IEEE/RSJ International Conference of Intelligent Robots and Systems, September 2017 (inproceedings) Accepted

Abstract
We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

Project Page [BibTex]

Project Page [BibTex]


no image
Local Bayesian Optimization of Motor Skills

Akrour, R., Sorokin, D., Peters, J., Neumann, G.

Proceedings of the 34th International Conference on Machine Learning, 70, pages: 41-50, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S.

Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]


Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers
Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

PDF arXiv DOI Project Page [BibTex]

PDF arXiv DOI Project Page [BibTex]


Learning Feedback Terms for Reactive Planning and Control
Learning Feedback Terms for Reactive Planning and Control

Rai, A., Sutanto, G., Schaal, S., Meier, F.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]


Virtual vs. {R}eal: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]


Path Integral Guided Policy Search
Path Integral Guided Policy Search

Chebotar, Y., Kalakrishnan, M., Yahya, A., Li, A., Schaal, S., Levine, S.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]

2004


no image
Learning Movement Primitives

Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.

In 11th International Symposium on Robotics Research (ISRR2003), pages: 561-572, (Editors: Dario, P. and Chatila, R.), Springer, ISRR, 2004, clmc (inproceedings)

Abstract
This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstration. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB, and learning biped walking from demonstration in simulation, including self-improvement of the movement patterns towards energy efficiency through resonance tuning.

link (url) DOI [BibTex]

2004

link (url) DOI [BibTex]


no image
Learning Composite Adaptive Control for a Class of Nonlinear Systems

Nakanishi, J., Farrell, J. A., Schaal, S.

In IEEE International Conference on Robotics and Automation, pages: 2647-2652, New Orleans, LA, USA, April 2004, 2004, clmc (inproceedings)

link (url) [BibTex]

link (url) [BibTex]


no image
A framework for learning biped locomotion with dynamic movement primitives

Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., Kawato, M.

In IEEE-RAS/RSJ International Conference on Humanoid Robots (Humanoids 2004), IEEE, Los Angeles, CA: Nov.10-12, Santa Monica, CA, 2004, clmc (inproceedings)

Abstract
This article summarizes our framework for learning biped locomotion using dynamical movement primitives based on nonlinear oscillators. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a frequency adaptation algorithm based on phase resetting and entrainment of coupled oscillators. Numerical simulations and experimental implementation on a physical robot demonstrate the effectiveness of the proposed locomotion controller. Furthermore, we demonstrate that phase resetting contributes to robustness against external perturbations and environmental changes by numerical simulations and experiments.

link (url) [BibTex]

link (url) [BibTex]


no image
Learning Motor Primitives with Reinforcement Learning

Peters, J., Schaal, S.

In Proceedings of the 11th Joint Symposium on Neural Computation, http://resolver.caltech.edu/CaltechJSNC:2004.poster020, 2004, clmc (inproceedings)

Abstract
One of the major challenges in action generation for robotics and in the understanding of human motor control is to learn the "building blocks of move- ment generation," or more precisely, motor primitives. Recently, Ijspeert et al. [1, 2] suggested a novel framework how to use nonlinear dynamical systems as motor primitives. While a lot of progress has been made in teaching these mo- tor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this poster, we evaluate different reinforcement learning approaches can be used in order to improve the performance of motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and line out how these lead to a novel algorithm which is based on natural policy gradients [3]. We compare this algorithm to previous reinforcement learning algorithms in the context of dynamic motor primitive learning, and show that it outperforms these by at least an order of magnitude. We demonstrate the efficiency of the resulting reinforcement learning method for creating complex behaviors for automous robotics. The studied behaviors will include both discrete, finite tasks such as baseball swings, as well as complex rhythmic patterns as they occur in biped locomotion

[BibTex]

[BibTex]

1994


no image
Robot learning by nonparametric regression

Schaal, S., Atkeson, C. G.

In Proceedings of the International Conference on Intelligent Robots and Systems (IROS’94), pages: 478-485, Munich Germany, 1994, clmc (inproceedings)

Abstract
We present an approach to robot learning grounded on a nonparametric regression technique, locally weighted regression. The model of the task to be performed is represented by infinitely many local linear models, i.e., the (hyper-) tangent planes at every query point. Such a model, however, is only generated when a query is performed and is not retained. This is in contrast to other methods using a finite set of linear models to accomplish a piecewise linear model. Architectural parameters of our approach, such as distance metrics, are also a function of the current query point instead of being global. Statistical tests are presented for when a local model is good enough such that it can be reliably used to build a local controller. These statistical measures also direct the exploration of the robot. We explicitly deal with the case where prediction accuracy requirements exist during exploration: By gradually shifting a center of exploration and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved. We illustrate this approach by describing how it has been used to enable a robot to learn a challenging juggling task: Within 40 to 100 trials the robot accomplished the task goal starting out with no initial experiences.

[BibTex]

1994

[BibTex]


no image
Assessing the quality of learned local models

Schaal, S., Atkeson, C. G.

In Advances in Neural Information Processing Systems 6, pages: 160-167, (Editors: Cowan, J.;Tesauro, G.;Alspector, J.), Morgan Kaufmann, San Mateo, CA, 1994, clmc (inproceedings)

Abstract
An approach is presented to learning high dimensional functions in the case where the learning algorithm can affect the generation of new data. A local modeling algorithm, locally weighted regression, is used to represent the learned function. Architectural parameters of the approach, such as distance metrics, are also localized and become a function of the query point instead of being global. Statistical tests are given for when a local model is good enough and sampling should be moved to a new area. Our methods explicitly deal with the case where prediction accuracy requirements exist during exploration: By gradually shifting a "center of exploration" and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved. We illustrate this approach with simulation results and results from a real robot learning a complex juggling task.

link (url) [BibTex]

link (url) [BibTex]


no image
Memory-based robot learning

Schaal, S., Atkeson, C. G.

In IEEE International Conference on Robotics and Automation, 3, pages: 2928-2933, San Diego, CA, 1994, clmc (inproceedings)

Abstract
We present a memory-based local modeling approach to robot learning using a nonparametric regression technique, locally weighted regression. The model of the task to be performed is represented by infinitely many local linear models, the (hyper-) tangent planes at every query point. This is in contrast to other methods using a finite set of linear models to accomplish a piece-wise linear model. Architectural parameters of our approach, such as distance metrics, are a function of the current query point instead of being global. Statistical tests are presented for when a local model is good enough such that it can be reliably used to build a local controller. These statistical measures also direct the exploration of the robot. We explicitly deal with the case where prediction accuracy requirements exist during exploration: By gradually shifting a center of exploration and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved. We illustrate this approach by describing how it has been used to enable a robot to learn a challenging juggling task: within 40 to 100 trials the robot accomplished the task goal starting out with no initial experiences.

[BibTex]

[BibTex]


no image
Nonparametric regression for learning

Schaal, S.

In Conference on Adaptive Behavior and Learning, Center of Interdisciplinary Research (ZIF) Bielefeld Germany, also technical report TR-H-098 of the ATR Human Information Processing Research Laboratories, 1994, clmc (inproceedings)

Abstract
In recent years, learning theory has been increasingly influenced by the fact that many learning algorithms have at least in part a comprehensive interpretation in terms of well established statistical theories. Furthermore, with little modification, several statistical methods can be directly cast into learning algorithms. One family of such methods stems from nonparametric regression. This paper compares nonparametric learning with the more widely used parametric counterparts and investigates how these two families differ in their properties and their applicability. 

link (url) [BibTex]

link (url) [BibTex]