Header logo is am


2019


no image
On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset

Gondal, M. W., Wuthrich, M., Miladinovic, D., Locatello, F., Breidt, M., Volchkov, V., Akpo, J., Bachem, O., Schölkopf, B., Bauer, S.

Advances in Neural Information Processing Systems 32, pages: 15714-15725, (Editors: H. Wallach and H. Larochelle and A. Beygelzimer and F. d’Alché-Buc and E. Fox and R. Garnett), Curran Associates, Inc., 33rd Annual Conference on Neural Information Processing Systems, December 2019 (conference)

link (url) [BibTex]

2019

link (url) [BibTex]


Accurate Vision-based Manipulation through Contact Reasoning
Accurate Vision-based Manipulation through Contact Reasoning

Kloss, A., Bauza, M., Wu, J., Tenenbaum, J. B., Rodriguez, A., Bohg, J.

In International Conference on Robotics and Automation, May 2019 (inproceedings) Accepted

Abstract
Planning contact interactions is one of the core challenges of many robotic tasks. Optimizing contact locations while taking dynamics into account is computationally costly and in only partially observed environments, executing contact-based tasks often suffers from low accuracy. We present an approach that addresses these two challenges for the problem of vision-based manipulation. First, we propose to disentangle contact from motion optimization. Thereby, we improve planning efficiency by focusing computation on promising contact locations. Second, we use a hybrid approach for perception and state estimation that combines neural networks with a physically meaningful state representation. In simulation and real-world experiments on the task of planar pushing, we show that our method is more efficient and achieves a higher manipulation accuracy than previous vision-based approaches.

Video link (url) [BibTex]

Video link (url) [BibTex]


Learning Latent Space Dynamics for Tactile Servoing
Learning Latent Space Dynamics for Tactile Servoing

Sutanto, G., Ratliff, N., Sundaralingam, B., Chebotar, Y., Su, Z., Handa, A., Fox, D.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings) Accepted

pdf video [BibTex]

pdf video [BibTex]


Leveraging Contact Forces for Learning to Grasp
Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two- fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

video arXiv [BibTex]

video arXiv [BibTex]

2018


Motion-based Object Segmentation based on Dense RGB-D Scene Flow
Motion-based Object Segmentation based on Dense RGB-D Scene Flow

Shao, L., Shah, P., Dwaracherla, V., Bohg, J.

IEEE Robotics and Automation Letters, 3(4):3797-3804, IEEE, IEEE/RSJ International Conference on Intelligent Robots and Systems, October 2018 (conference)

Abstract
Given two consecutive RGB-D images, we propose a model that estimates a dense 3D motion field, also known as scene flow. We take advantage of the fact that in robot manipulation scenarios, scenes often consist of a set of rigidly moving objects. Our model jointly estimates (i) the segmentation of the scene into an unknown but finite number of objects, (ii) the motion trajectories of these objects and (iii) the object scene flow. We employ an hourglass, deep neural network architecture. In the encoding stage, the RGB and depth images undergo spatial compression and correlation. In the decoding stage, the model outputs three images containing a per-pixel estimate of the corresponding object center as well as object translation and rotation. This forms the basis for inferring the object segmentation and final object scene flow. To evaluate our model, we generated a new and challenging, large-scale, synthetic dataset that is specifically targeted at robotic manipulation: It contains a large number of scenes with a very diverse set of simultaneously moving 3D objects and is recorded with a commonly-used RGB-D camera. In quantitative experiments, we show that we significantly outperform state-of-the-art scene flow and motion-segmentation methods. In qualitative experiments, we show how our learned model transfers to challenging real-world scenes, visually generating significantly better results than existing methods.

Project Page arXiv DOI [BibTex]

2018

Project Page arXiv DOI [BibTex]


Probabilistic Recurrent State-Space Models
Probabilistic Recurrent State-Space Models

Doerr, A., Daniel, C., Schiegg, M., Nguyen-Tuong, D., Schaal, S., Toussaint, M., Trimpe, S.

In Proceedings of the International Conference on Machine Learning (ICML), International Conference on Machine Learning (ICML), July 2018 (inproceedings)

Abstract
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g., LSTMs) proved extremely successful in modeling complex time-series data. Fully probabilistic SSMs, however, unfortunately often prove hard to train, even for smaller problems. To overcome this limitation, we propose a scalable initialization and training algorithm based on doubly stochastic variational inference and Gaussian processes. In the variational approximation we propose in contrast to related approaches to fully capture the latent state temporal correlations to allow for robust training.

arXiv pdf Project Page [BibTex]

arXiv pdf Project Page [BibTex]


Online Learning of a Memory for Learning Rates
Online Learning of a Memory for Learning Rates

(nominated for best paper award)

Meier, F., Kappler, D., Schaal, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018, accepted (inproceedings)

Abstract
The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, either in batch or online learning settings.

pdf video code [BibTex]

pdf video code [BibTex]


Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks
Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks

Sutanto, G., Su, Z., Schaal, S., Meier, F.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

pdf video [BibTex]

pdf video [BibTex]


no image
On Time Optimization of Centroidal Momentum Dynamics

Ponton, B., Herzog, A., Del Prete, A., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 5776-5782, IEEE, Brisbane, Australia, 2018 (inproceedings)

Abstract
Recently, the centroidal momentum dynamics has received substantial attention to plan dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non convex which renders any optimization approach difficult and timing is usually kept fixed in most trajectory optimization techniques to not introduce additional non convexities to the problem. But this can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that allowed to efficiently compute momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of similar nature as the centroidal dynamics one, we propose two convex relaxations to the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized dynamically consistent trajectories sufficiently fast to make the approach realtime capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem and illustrate how timing optimization allows to find motion plans that would be difficult to plan with fixed timing † †Implementation details and demos can be found in the source code available at https://git-amd.tuebingen.mpg.de/bponton/timeoptimization.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Unsupervised Contact Learning for Humanoid Estimation and Control

Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 411-417, IEEE, Brisbane, Australia, 2018 (inproceedings)

Abstract
This work presents a method for contact state estimation using fuzzy clustering to learn contact probability for full, six-dimensional humanoid contacts. The data required for training is solely from proprioceptive sensors - endeffector contact wrench sensors and inertial measurement units (IMUs) - and the method is completely unsupervised. The resulting cluster means are used to efficiently compute the probability of contact in each of the six endeffector degrees of freedom (DoFs) independently. This clustering-based contact probability estimator is validated in a kinematics-based base state estimator in a simulation environment with realistic added sensor noise for locomotion over rough, low-friction terrain on which the robot is subject to foot slip and rotation. The proposed base state estimator which utilizes these six DoF contact probability estimates is shown to perform considerably better than that which determines kinematic contact constraints purely based on measured normal force.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Task-Specific Dynamics to Improve Whole-Body Control

Gams, A., Mason, S., Ude, A., Schaal, S., Righetti, L.

In Hua, IEEE, Beijing, China, November 2018 (inproceedings)

Abstract
In task-based inverse dynamics control, reference accelerations used to follow a desired plan can be broken down into feedforward and feedback trajectories. The feedback term accounts for tracking errors that are caused from inaccurate dynamic models or external disturbances. On underactuated, free-floating robots, such as humanoids, high feedback terms can be used to improve tracking accuracy; however, this can lead to very stiff behavior or poor tracking accuracy due to limited control bandwidth. In this paper, we show how to reduce the required contribution of the feedback controller by incorporating learned task-space reference accelerations. Thus, we i) improve the execution of the given specific task, and ii) offer the means to reduce feedback gains, providing for greater compliance of the system. With a systematic approach we also reduce heuristic tuning of the model parameters and feedback gains, often present in real-world experiments. In contrast to learning task-specific joint-torques, which might produce a similar effect but can lead to poor generalization, our approach directly learns the task-space dynamics of the center of mass of a humanoid robot. Simulated and real-world results on the lower part of the Sarcos Hermes humanoid robot demonstrate the applicability of the approach.

link (url) [BibTex]

link (url) [BibTex]


no image
An MPC Walking Framework With External Contact Forces

Mason, S., Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 1785-1790, IEEE, Brisbane, Australia, May 2018 (inproceedings)

Abstract
In this work, we present an extension to a linear Model Predictive Control (MPC) scheme that plans external contact forces for the robot when given multiple contact locations and their corresponding friction cone. To this end, we set up a two-step optimization problem. In the first optimization, we compute the Center of Mass (CoM) trajectory, foot step locations, and introduce slack variables to account for violating the imposed constraints on the Zero Moment Point (ZMP). We then use the slack variables to trigger the second optimization, in which we calculate the optimal external force that compensates for the ZMP tracking error. This optimization considers multiple contacts positions within the environment by formulating the problem as a Mixed Integer Quadratic Program (MIQP) that can be solved at a speed between 100-300 Hz. Once contact is created, the MIQP reduces to a single Quadratic Program (QP) that can be solved in real-time ({\textless}; 1kHz). Simulations show that the presented walking control scheme can withstand disturbances 2-3× larger with the additional force provided by a hand contact.

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2017


Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets

Hausman, K., Chebotar, Y., Schaal, S., Sukhatme, G., Lim, J.

In Proceedings from the conference "Neural Information Processing Systems 2017., (Editors: Guyon I. and Luxburg U.v. and Bengio S. and Wallach H. and Fergus R. and Vishwanathan S. and Garnett R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

pdf video [BibTex]

2017

pdf video [BibTex]


On the Design of {LQR} Kernels for Efficient Controller Learning
On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]


Optimizing Long-term Predictions for Model-based Policy Search
Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, modelbased RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

PDF Project Page [BibTex]

PDF Project Page [BibTex]


no image
Learning optimal gait parameters and impedance profiles for legged locomotion

Heijmink, E., Radulescu, A., Ponton, B., Barasuol, V., Caldwell, D., Semini, C.

Proceedings International Conference on Humanoid Robots, IEEE, 2017 IEEE-RAS 17th International Conference on Humanoid Robots, November 2017 (conference)

Abstract
The successful execution of complex modern robotic tasks often relies on the correct tuning of a large number of parameters. In this paper we present a methodology for improving the performance of a trotting gait by learning the gait parameters, impedance profile and the gains of the control architecture. We show results on a set of terrains, for various speeds using a realistic simulation of a hydraulically actuated system. Our method achieves a reduction in the gait's mechanical energy consumption during locomotion of up to 26%. The simulation results are validated in experimental trials on the hardware system.

paper [BibTex]

paper [BibTex]


no image
A New Data Source for Inverse Dynamics Learning

Kappler, D., Meier, F., Ratliff, N., Schaal, S.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

[BibTex]

[BibTex]


no image
Bayesian Regression for Artifact Correction in Electroencephalography

Fiebig, K., Jayaram, V., Hesse, T., Blank, A., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 131-136, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

DOI [BibTex]

DOI [BibTex]


no image
Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces

Grossberger, L., Hohmann, M. R., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 160-164, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

DOI [BibTex]

DOI [BibTex]


On the relevance of grasp metrics for predicting grasp success
On the relevance of grasp metrics for predicting grasp success

Rubert, C., Kappler, D., Morales, A., Schaal, S., Bohg, J.

In Proceedings of the IEEE/RSJ International Conference of Intelligent Robots and Systems, September 2017 (inproceedings) Accepted

Abstract
We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

Project Page [BibTex]

Project Page [BibTex]


no image
Local Bayesian Optimization of Motor Skills

Akrour, R., Sorokin, D., Peters, J., Neumann, G.

Proceedings of the 34th International Conference on Machine Learning, 70, pages: 41-50, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S.

Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]


Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers
Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

PDF arXiv DOI Project Page [BibTex]

PDF arXiv DOI Project Page [BibTex]


Learning Feedback Terms for Reactive Planning and Control
Learning Feedback Terms for Reactive Planning and Control

Rai, A., Sutanto, G., Schaal, S., Meier, F.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]


Virtual vs. {R}eal: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]


Path Integral Guided Policy Search
Path Integral Guided Policy Search

Chebotar, Y., Kalakrishnan, M., Yahya, A., Li, A., Schaal, S., Levine, S.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

pdf video [BibTex]

pdf video [BibTex]

2011


Mind the gap - robotic grasping under incomplete observation
Mind the gap - robotic grasping under incomplete observation

Bohg, J., Johnson-Roberson, M., Leon, B., Felip, J., Gratal, X., Bergstrom, N., Kragic, D., Morales, A.

In Robotics and Automation (ICRA), 2011 IEEE International Conference on, pages: 686-693, May 2011 (inproceedings)

Abstract
We consider the problem of grasp and manipulation planning when the state of the world is only partially observable. Specifically, we address the task of picking up unknown objects from a table top. The proposed approach to object shape prediction aims at closing the knowledge gaps in the robot's understanding of the world. A completed state estimate of the environment can then be provided to a simulator in which stable grasps and collision-free movements are planned. The proposed approach is based on the observation that many objects commonly in use in a service robotic scenario possess symmetries. We search for the optimal parameters of these symmetries given visibility constraints. Once found, the point cloud is completed and a surface mesh reconstructed. Quantitative experiments show that the predictions are valid approximations of the real object shape. By demonstrating the approach on two very different robotic platforms its generality is emphasized.

pdf video code data DOI Project Page [BibTex]

2011

pdf video code data DOI Project Page [BibTex]


no image
STOMP: Stochastic trajectory optimization for motion planning

Kalakrishnan, M., Chitta, S., Theodorou, E., Pastor, P., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produced an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness cost is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use and so general costs for which derivatives may not be available (e.g. costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a dual-arm mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based optimizers like CHOMP can get stuck in.

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
Development of a Low-Pressure Fluidic Servo-Valve for Wearable Haptic Interfaces and Lightweight Robotic Systems"

Folgheraiter, M., Jordan, M., Benitez, L. M. V., Grimminger, F., Schmidt, S., Albiez, J., Kirchner, F.

In Informatics in Control, Automation and Robotics, pages: 239-252, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011 (inproceedings)

Abstract
This document presents a low-pressure servo-valve specifically designed for haptic interfaces and lightweight robotic applications. The device is able to work with hydraulic and pneumatic fluidic sources, operating within a pressure range of (0{\thinspace}−{\thinspace}50 {\textperiodcentered}105Pa). All sensors and electronics were integrated inside the body of the valve, reducing the need for external circuits. Positioning repeatability as well as the capability to fine modulate the hydraulic flow were measured and verified. Furthermore, the static and dynamic behavior of the valve were evaluated for different working conditions, and a non-linear model identified using a recursive Hammerstein-Wiener parameter adaptation algorithm.

DOI [BibTex]

DOI [BibTex]


no image
An Experimental Demonstration of a Distributed and Event-based State Estimation Algorithm

(Best Interactive Paper Award (top out of 450))

Trimpe, S., D’Andrea, R.

In Proceedings of the 18th IFAC World Congress, 2011 (inproceedings)

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Path Integral Control and Bounded Rationality

Braun, D. A., Ortega, P. A., Theodorou, E., Schaal, S.

In IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011, clmc (inproceedings)

Abstract
Path integral methods [7], [15],[1] have recently been shown to be applicable to a very general class of optimal control problems. Here we examine the path integral formalism from a decision-theoretic point of view, since an optimal controller can always be regarded as an instance of a perfectly rational decision-maker that chooses its actions so as to maximize its expected utility [8]. The problem with perfect rationality is, however, that finding optimal actions is often very difficult due to prohibitive computational resource costs that are not taken into account. In contrast, a bounded rational decision-maker has only limited resources and therefore needs to strike some compromise between the desired utility and the required resource costs [14]. In particular, we suggest an information-theoretic measure of resource costs that can be derived axiomatically [11]. As a consequence we obtain a variational principle for choice probabilities that trades off maximizing a given utility criterion and avoiding resource costs that arise due to deviating from initially given default choice probabilities. The resulting bounded rational policies are in general probabilistic. We show that the solutions found by the path integral formalism are such bounded rational policies. Furthermore, we show that the same formalism generalizes to discrete control problems, leading to linearly solvable bounded rational control policies in the case of Markov systems. Importantly, Bellman?s optimality principle is not presupposed by this variational principle, but it can be derived as a limit case. This suggests that the information- theoretic formalization of bounded rationality might serve as a general principle in control design that unifies a number of recently reported approximate optimal control methods both in the continuous and discrete domain.

PDF [BibTex]

PDF [BibTex]


no image
Skill learning and task outcome prediction for manipulation

Pastor, P., Kalakrishnan, M., Chitta, S., Theodorou, E., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
Learning complex motor skills for real world tasks is a hard problem in robotic manipulation that often requires painstaking manual tuning and design by a human expert. In this work, we present a Reinforcement Learning based approach to acquiring new motor skills from demonstration. Our approach allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration. Our approach aims to incorporate task domain knowledge, where appropriate, by working in a space consistent with the constraints of a specific task. In addition, we also present an approach to using sensor feedback to learn a predictive model of the task outcome. This allows our system to learn the proprioceptive sensor feedback needed to monitor subsequent executions of the task online and abort execution in the event of predicted failure. We illustrate our approach using two example tasks executed with the PR2 dual-arm robot: a straight and accurate pool stroke and a box flipping task using two chopsticks as tools.

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


no image
An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks

Theodorou, E., Stulp, F., Buchli, J., Schaal, S.

In Proceedings of the 18th World Congress of the International Federation of Automatic Control, 2011, clmc (inproceedings)

Abstract
Recent work on path integral stochastic optimal control theory Theodorou et al. (2010a); Theodorou (2011) has shown promising results in planning and control of nonlinear systems in high dimensional state spaces. The path integral control framework relies on the transformation of the nonlinear Hamilton Jacobi Bellman (HJB) partial differential equation (PDE) into a linear PDE and the approximation of its solution via the use of the Feynman Kac lemma. In this work, we are reviewing the generalized version of path integral stochastic optimal control formalism Theodorou et al. (2010a), used for optimal control and planing of stochastic dynamical systems with state dependent control and diffusion matrices. Moreover we present the iterative path integral control approach, the so called Policy Improvement with Path Integrals or (PI2 ) which is capable of scaling in high dimensional robotic control problems. Furthermore we present a convergence analysis of the proposed algorithm and we apply the proposed framework to a variety of robotic tasks. Finally with the goal to perform locomotion the iterative path integral control is applied for learning nonlinear limit cycle attractors with adjustable land scape.

PDF [BibTex]

PDF [BibTex]


Enhanced visual scene understanding through human-robot dialog
Enhanced visual scene understanding through human-robot dialog

Johnson-Roberson, M., Bohg, J., Skantze, G., Gustafson, J., Carlson, R., Rasolzadeh, B., Kragic, D.

In Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, pages: 3342-3348, 2011 (inproceedings)

Abstract
We propose a novel human-robot-interaction framework for robust visual scene understanding. Without any a-priori knowledge about the objects, the task of the robot is to correctly enumerate how many of them are in the scene and segment them from the background. Our approach builds on top of state-of-the-art computer vision methods, generating object hypotheses through segmentation. This process is combined with a natural dialog system, thus including a `human in the loop' where, by exploiting the natural conversation of an advanced dialog system, the robot gains knowledge about ambiguous situations. We present an entropy-based system allowing the robot to detect the poorest object hypotheses and query the user for arbitration. Based on the information obtained from the human-robot dialog, the scene segmentation can be re-seeded and thereby improved. We present experimental results on real data that show an improved segmentation performance compared to segmentation without interaction.

pdf video DOI Project Page [BibTex]

pdf video DOI Project Page [BibTex]


Risk and gain battery management for self-docking mobile robots
Risk and gain battery management for self-docking mobile robots

Berenz, V., Suzuki, K.

In Robotics and Biomimetics (ROBIO), 2011 IEEE International Conference on, pages: 1766-1771, 2011 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Reduced Communication State Estimation for Control of an Unstable Networked Control System

Trimpe, S., D’Andrea, R.

In Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, 2011 (inproceedings)

PDF Supplementary material DOI [BibTex]

PDF Supplementary material DOI [BibTex]


TDM: A software framework for elegant and rapid development of autonomous behaviors for humanoid robots.
TDM: A software framework for elegant and rapid development of autonomous behaviors for humanoid robots.

Berenz, V., Tanaka, F., Suzuki, K., Herink, M.

In Humanoids, pages: 179-186, IEEE, 2011 (inproceedings)

link (url) [BibTex]

link (url) [BibTex]


Coaching robot behavior using continuous physiological affective feedback
Coaching robot behavior using continuous physiological affective feedback

Gruebler, A., Berenz, V., Suzuki, K.

In 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2011), Bled, Slovenia, October 26-28, 2011, pages: 466-471, 2011 (inproceedings)

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Neuromuscular Stochastic Optimal Control of a Tendon Driven Index Finger

Theodorou, E. A., Todorov, E., Valero-Cuevas, F.

In Proceedings of American Control Conference (ACC), 2011, clmc (inproceedings)

Abstract
With the goal to build robotic hands which can reach the levels of dexterity and robustness of the hand, the question of what are the candidate control principles that can handle the nonlinearities, the high dimensionality and the internal noise of biomechanical structures of the complexity of the hand, is still open. In this work we present the first stochastic optimal feedback controller applied to a full tendon driven simulated robotic index finger. In our model we do take into account the full tendon structure of the index finger which consist of 11 tendons based on the underlying physiology and we consider muscle with the typical force - length and force velocity properties. Our feedback controller show robustness against noise and perturbation of the dynamics while it can also successfully handle the nonlinearities and high dimensionality of the robotic index finger. Furthermore as it is shown in the evaluations, it provides the complete time history of the tendon excursions and the tendon velocities of the index finger for the tasks of tapping with zero and nonzero terminal velocities.

PDF [BibTex]

PDF [BibTex]


no image
Learning Force Control Policies for Compliant Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 4639-4644, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Developing robots capable of fine manipulation skills is of major importance in order to build truly assistive robots. These robots need to be compliant in their actuation and control in order to operate safely in human environments. Manipulation tasks imply complex contact interactions with the external world, and involve reasoning about the forces and torques to be applied. Planning under contact conditions is usually impractical due to computational complexity, and a lack of precise dynamics models of the environment. We present an approach to acquiring manipulation skills on compliant robots through reinforcement learning. The initial position control policy for manipulation is initialized through kinesthetic demonstration. We augment this policy with a force/torque profile to be controlled in combination with the position trajectories. We use the Policy Improvement with Path Integrals (PI2) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success. We demonstrate our approach on the Barrett WAM robot arm equipped with a 6-DOF force/torque sensor on two different manipulation tasks: opening a door with a lever door handle, and picking up a pen off the table. We show that the learnt force control policies allow successful, robust execution of the tasks.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Control of legged robots with optimal distribution of contact forces

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 11th IEEE-RAS International Conference on Humanoid Robots, pages: 318-324, IEEE, Bled, Slovenia, 2011 (inproceedings)

Abstract
The development of agile and safe humanoid robots require controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction is of crucial importance for robots that will actively interact with their environment. Model-based controllers such as inverse dynamics or operational space control are very appealing as they offer both high tracking performance and compliance. However, while widely used for fully actuated systems such as manipulators, they are not yet standard controllers for legged robots such as humanoids. Indeed such robots are fundamentally different from manipulators as they are underactuated due to their floating-base and subject to switching contact constraints. In this paper we present an inverse dynamics controller for legged robots that use torque redundancy to create an optimal distribution of contact constraints. The resulting controller is able to minimize, given a desired motion, any quadratic cost of the contact constraints at each instant of time. In particular we show how this can be used to minimize tangential forces during locomotion, therefore significantly improving the locomotion of legged robots on difficult terrains. In addition to the theoretical result, we present simulations of a humanoid and a quadruped robot, as well as experiments on a real quadruped robot that demonstrate the advantages of the controller.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Motion Primitive Goals for Robust Manipulation

Stulp, F., Theodorou, E., Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 325-331, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. Finally, there is uncertainty in the object pose, and even the most carefully planned movement may fail if the object is not at the expected position. To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI2; 2) extend PI2 so that it simultaneously learns shape parameters and goal parameters of motion primitives; 3) use shape and goal learning to acquire motion primitives that are robust to object pose uncertainty. We evaluate these contributions on a manipulation platform consisting of a 7-DOF arm with a 4-DOF hand.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Inverse Dynamics Control of Floating-Base Robots with External Constraints: a Unified View

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 IEEE International Conference on Robotics and Automation, pages: 1085-1090, IEEE, Shanghai, China, 2011 (inproceedings)

Abstract
Inverse dynamics controllers and operational space controllers have proved to be very efficient for compliant control of fully actuated robots such as fixed base manipulators. However legged robots such as humanoids are inherently different as they are underactuated and subject to switching external contact constraints. Recently several methods have been proposed to create inverse dynamics controllers and operational space controllers for these robots. In an attempt to compare these different approaches, we develop a general framework for inverse dynamics control and show that these methods lead to very similar controllers. We are then able to greatly simplify recent whole-body controllers based on operational space approaches using kinematic projections, bringing them closer to efficient practical implementations. We also generalize these controllers such that they can be optimal under an arbitrary quadratic cost in the commands.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Movement segmentation using a primitive library

Meier, F., Theodorou, E., Stulp, F., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), Sept. 25-30, San Francisco, CA, 2011, clmc (inproceedings)

Abstract
Segmenting complex movements into a sequence of primitives remains a difficult problem with many applications in the robotics and vision communities. In this work, we show how the movement segmentation problem can be reduced to a sequential movement recognition problem. To this end, we reformulate the orig-inal Dynamic Movement Primitive (DMP) formulation as a linear dynamical sys-tem with control inputs. Based on this new formulation, we develop an Expecta-tion-Maximization algorithm to estimate the duration and goal position of a par-tially observed trajectory. With the help of this algorithm and the assumption that a library of movement primitives is present, we present a movement seg-mentation framework. We illustrate the usefulness of the new DMP formulation on the two applications of online movement recognition and movement segmen-tation.

link (url) [BibTex]

link (url) [BibTex]


no image
Online movement adaptation based on previous sensor experiences

Pastor, P., Righetti, L., Kalakrishnan, M., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 365-371, IEEE, San Francisco, USA, sep 2011 (inproceedings)

Abstract
Personal robots can only become widespread if they are capable of safely operating among humans. In uncertain and highly dynamic environments such as human households, robots need to be able to instantly adapt their behavior to unforseen events. In this paper, we propose a general framework to achieve very contact-reactive motions for robotic grasping and manipulation. Associating stereotypical movements to particular tasks enables our system to use previous sensor experiences as a predictive model for subsequent task executions. We use dynamical systems, named Dynamic Movement Primitives (DMPs), to learn goal-directed behaviors from demonstration. We exploit their dynamic properties by coupling them with the measured and predicted sensor traces. This feedback loop allows for online adaptation of the movement plan. Our system can create a rich set of possible motions that account for external perturbations and perception uncertainty to generate truly robust behaviors. As an example, we present an application to grasping with the WAM robot arm.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Additional DOFs and sensors for bio-inspired locomotion: Towards active spine, ankle joints, and feet for a quadruped robot

Kuehn, D., Grimminger, F., Beinersdorf, F., Bernhard, F., Burchardt, A., Schilling, M., Simnofske, M., Stark, T., Zenzes, M., Kirchner, F.

In 2011 IEEE International Conference on Robotics and Biomimetics, pages: 2780-2786, December 2011 (inproceedings)

DOI [BibTex]

DOI [BibTex]


no image
Learning to grasp under uncertainty

Stulp, F., Theodorou, E., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2011 IEEE International Conference on, Shanghai, China, May 9-13, 2011, clmc (inproceedings)

Abstract
We present an approach that enables robots to learn motion primitives that are robust towards state estimation uncertainties. During reaching and preshaping, the robot learns to use fine manipulation strategies to maneuver the object into a pose at which closing the hand to perform the grasp is more likely to succeed. In contrast, common assumptions in grasp planning and motion planning for reaching are that these tasks can be performed independently, and that the robot has perfect knowledge of the pose of the objects in the environment. We implement our approach using Dynamic Movement Primitives and the probabilistic model-free reinforcement learning algorithm Policy Improvement with Path Integrals (PI2 ). The cost function that PI2 optimizes is a simple boolean that penalizes failed grasps. The key to acquiring robust motion primitives is to sample the actual pose of the object from a distribution that represents the state estimation uncertainty. During learning, the robot will thus optimize the chance of grasping an object from this distribution, rather than at one specific pose. In our empirical evaluation, we demonstrate how the motion primitives become more robust when grasping simple cylindrical objects, as well as more complex, non-convex objects. We also investigate how well the learned motion primitives generalize towards new object positions and other state estimation uncertainty distributions.

link (url) [BibTex]

link (url) [BibTex]

2010


no image
Reinforcement learning of full-body humanoid motor skills

Stulp, F., Buchli, J., Theodorou, E., Schaal, S.

In Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on, pages: 405-410, December 2010, clmc (inproceedings)

Abstract
Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive amount of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, called Policy Improvement with Path Integrals (PI2), has a surprisingly simple form, has no open tuning parameters besides the exploration noise, is model-free, and performs numerically robustly in high dimensional learning problems. We demonstrate how PI2 is able to learn full-body motor skills on a 34-DOF humanoid robot. To demonstrate the generality of our approach, we also apply PI2 in the context of variable impedance control, where both planned trajectories and gain schedules for each joint are optimized simultaneously.

link (url) [BibTex]

2010

link (url) [BibTex]


Enhanced Visual Scene Understanding through Human-Robot Dialog
Enhanced Visual Scene Understanding through Human-Robot Dialog

Johnson-Roberson, M., Bohg, J., Kragic, D., Skantze, G., Gustafson, J., Carlson, R.

In Proceedings of AAAI 2010 Fall Symposium: Dialog with Robots, November 2010 (inproceedings)

pdf [BibTex]

pdf [BibTex]


Scene Representation and Object Grasping Using Active Vision
Scene Representation and Object Grasping Using Active Vision

Gratal, X., Bohg, J., Björkman, M., Kragic, D.

In IROS’10 Workshop on Defining and Solving Realistic Perception Problems in Personal Robotics, October 2010 (inproceedings)

Abstract
Object grasping and manipulation pose major challenges for perception and control and require rich interaction between these two fields. In this paper, we concentrate on the plethora of perceptual problems that have to be solved before a robot can be moved in a controlled way to pick up an object. A vision system is presented that integrates a number of different computational processes, e.g. attention, segmentation, recognition or reconstruction to incrementally build up a representation of the scene suitable for grasping and manipulation of objects. Our vision system is equipped with an active robotic head and a robot arm. This embodiment enables the robot to perform a number of different actions like saccading, fixating, and grasping. By applying these actions, the robot can incrementally build a scene representation and use it for interaction. We demonstrate our system in a scenario for picking up known objects from a table top. We also show the system’s extendibility towards grasping of unknown and familiar objects.

video pdf slides [BibTex]

video pdf slides [BibTex]