

2014


Robot Arm Pose Estimation through Pixel-Wise Part Classification

Bohg, J., Romero, J., Herzog, A., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA), pages: 3143-3150, June 2014 (inproceedings)

Abstract
We propose to frame the problem of marker-less robot arm pose estimation as a pixel-wise part classification problem. As input, we use a depth image in which each pixel is classified to be either from a particular robot part or the background. The classifier is a random decision forest trained on a large number of synthetically generated and labeled depth images. From all the training samples ending up at a leaf node, a set of offsets is learned that votes for relative joint positions. Pooling these votes over all foreground pixels and subsequent clustering gives us an estimate of the true joint positions. Due to the intrinsic parallelism of pixel-wise classification, this approach can run in super real-time and is more efficient than previous ICP-like methods. We quantitatively evaluate the accuracy of this approach on synthetic data. We also demonstrate that the method produces accurate joint estimates on real data despite being purely trained on synthetic data.

video code pdf DOI Project Page [BibTex]


A Self-Tuning LQR Approach Demonstrated on an Inverted Pendulum

Trimpe, S., Millane, A., Doessegger, S., D’Andrea, R.

In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, 2014 (inproceedings)

PDF Supplementary material DOI [BibTex]


Learning coupling terms for obstacle avoidance

Rai, A., Meier, F., Ijspeert, A., Schaal, S.

In International Conference on Humanoid Robotics, pages: 512-518, IEEE, 2014, clmc (inproceedings)

Abstract
Autonomous manipulation in dynamic environments is important for robots to perform everyday tasks. For this, a manipulator should be capable of interpreting the environment and planning an appropriate movement. At least two possible approaches exist for this in the literature. Usually, a planning system is used to generate a complex movement plan that satisfies all constraints. Alternatively, a simple plan could be chosen and modified with sensory feedback to accommodate additional constraints by equipping the controller with features that remain dormant most of the time, except when specific situations arise. Dynamic Movement Primitives (DMPs) form a robust and versatile starting point for such a controller that can be modified online using a non-linear term, called the coupling term. This can prove to be a fast and reactive way of obstacle avoidance in a human-like fashion. We propose a method to learn this coupling term from human demonstrations, starting with simple features and making it more robust to avoid a larger range of obstacles. We test the ability of our coupling term to model different kinds of obstacle avoidance behaviours in humans and use this learnt coupling term to avoid obstacles in a reactive manner. This line of research aims at pushing the boundary of reactive control strategies to more complex scenarios, such that complex and usually computationally more expensive planning methods can be avoided as much as possible.

link (url) Project Page [BibTex]


Generalization of the tacit learning controller based on periodic tuning functions

Berenz, V., Hayashibe, M., Alnajjar, F., Shimoda, S.

In 5th IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics, pages: 893-898, 2014 (inproceedings)

DOI [BibTex]


Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

PDF link (url) [BibTex]


Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244-2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant control and force control in many modern complex robots, like humanoid or disaster robots. Due to many unknown and hard-to-model nonlinearities, analytical models of such robots are often only very rough approximations. However, modern optimization controllers frequently depend on reasonably accurate models, and degrade greatly in robustness and performance if model errors are too large. For a long time, machine learning has been expected to provide automatic empirical model synthesis, yet so far, research has only generated feasibility studies but no learning algorithms that run reliably on complex robots. In this paper, we combine two promising worlds of regression techniques to generate a more powerful regression learning system. On the one hand, locally weighted regression techniques are computationally efficient, but hard to tune due to a variety of data dependent meta-parameters. On the other hand, Bayesian regression has rather automatic and robust methods to set learning parameters, but becomes quickly computationally infeasible for big and high-dimensional data sets. By reducing the complexity of Bayesian regression in the spirit of local model learning through variational approximations, we arrive at a novel algorithm that is computationally efficient and easy to initialize for robust learning. Evaluations on several datasets demonstrate very good learning performance and the potential for a general regression learning tool for robotics.

PDF link (url) DOI [BibTex]


Stability Analysis of Distributed Event-Based State Estimation

Trimpe, S.

In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, 2014 (inproceedings)

Abstract
An approach for distributed and event-based state estimation that was proposed in previous work [1] is analyzed and extended to practical networked systems in this paper. Multiple sensor-actuator-agents observe a dynamic process, sporadically exchange their measurements over a broadcast network according to an event-based protocol, and estimate the process state from the received data. The event-based approach was shown in [1] to mimic a centralized Luenberger observer up to guaranteed bounds, under the assumption of identical estimates on all agents. This assumption, however, is unrealistic (it is violated by a single packet drop or slight numerical inaccuracy) and removed herein. By means of a simulation example, it is shown that non-identical estimates can actually destabilize the overall system. To achieve stability, the event-based communication scheme is supplemented by periodic (but infrequent) exchange of the agents' estimates and reset to their joint average. When the local estimates are used for feedback control, the stability guarantee for the estimation problem extends to the event-based control system.
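
As a rough illustration of the event-based idea in this abstract, the sketch below runs a scalar observer that uses a measurement only when it deviates from the model prediction by more than a threshold. The trigger rule, the scalar model, and all names and parameter values are illustrative assumptions, not the paper's formulation; the multi-agent exchange and the periodic reset of estimates are not modeled.

```python
def simulate_process(x0, a, steps):
    """Noise-free scalar process x[k+1] = a * x[k] (hypothetical toy model)."""
    states, x = [], x0
    for _ in range(steps):
        states.append(x)
        x = a * x
    return states


def run_event_based_estimator(measurements, a=0.9, gain=0.5, threshold=0.2):
    """Scalar observer that applies a correction only when an event fires.

    An event fires when the measurement differs from the model prediction
    by more than `threshold` (an assumed trigger rule).
    """
    estimate = 0.0
    estimates, sent = [], []
    for y in measurements:
        prediction = a * estimate  # open-loop prediction step
        transmit = abs(y - prediction) > threshold
        if transmit:
            # Luenberger-style correction using the transmitted measurement
            prediction += gain * (y - prediction)
        estimate = prediction
        estimates.append(estimate)
        sent.append(transmit)
    return estimates, sent


MEASUREMENTS = simulate_process(x0=1.0, a=0.9, steps=20)
ESTIMATES, SENT = run_event_based_estimator(MEASUREMENTS)
```

In this toy run the estimator starts with a wrong initial estimate, so the first measurements trigger transmissions; once the prediction error falls below the threshold, no further measurements are sent.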

PDF Supplementary material DOI Project Page [BibTex]


Dual Execution of Optimized Contact Interaction Trajectories

Toussaint, M., Ratliff, N., Bohg, J., Righetti, L., Englert, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 47-54, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Efficient manipulation requires contact to reduce uncertainty. The manipulation literature refers to this as funneling: a methodology for increasing reliability and robustness by leveraging haptic feedback and control of environmental interaction. However, there is a fundamental gap between traditional approaches to trajectory optimization and this concept of robustness by funneling: traditional trajectory optimizers do not discover force feedback strategies. From a POMDP perspective, these behaviors could be regarded as explicit observation actions planned to sufficiently reduce uncertainty, thereby enabling a task. While we are sympathetic to the full POMDP view, solving full continuous-space POMDPs in high dimensions is hard. In this paper, we propose an alternative approach in which trajectory optimization objectives are augmented with new terms that reward uncertainty reduction through contacts, explicitly promoting funneling. This augmentation shifts the responsibility of robustness toward the actual execution of the optimized trajectories. Directly tracing trajectories through configuration space would lose all robustness; dual execution achieves robustness by devising force controllers to reproduce the temporal interaction profile encoded in the dual solution of the optimization problem. This work introduces dual execution in depth and analyzes its performance through robustness experiments both in simulation and on a real-world robotic platform.

link (url) DOI [BibTex]


Learning and Exploration in a Novel Dimensionality-Reduction Task

Ebert, J., Kim, S., Schweighofer, N., Sternad, D., Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Amsterdam, Netherlands, 2014 (inproceedings)

[BibTex]


Balancing experiments on a torque-controlled humanoid with hierarchical inverse dynamics

Herzog, A., Righetti, L., Grimminger, F., Pastor, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 981-988, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Recently, several hierarchical inverse dynamics controllers based on cascades of quadratic programs have been proposed for application on torque controlled robots. They have important theoretical benefits but have never been implemented on a torque controlled robot where model inaccuracies and real-time computation requirements can be problematic. In this contribution we present an experimental evaluation of these algorithms in the context of balance control for a humanoid robot. The presented experiments demonstrate the applicability of the approach under real robot conditions (i.e. model uncertainty, estimation errors, etc.). We propose a simplification of the optimization problem that allows us to decrease computation time enough to implement it in a fast torque control loop. We implement a momentum-based balance controller which shows robust performance in the face of unknown disturbances, even when the robot is standing on only one foot. In a second experiment, a tracking task is evaluated to demonstrate the performance of the controller with more complicated hierarchies. Our results show that hierarchical inverse dynamics controllers can be used for feedback control of humanoid robots and that momentum-based balance control can be efficiently implemented on a real robot.

link (url) DOI [BibTex]


Full Dynamics LQR Control of a Humanoid Robot: An Experimental Study on Balancing and Squatting

Mason, S., Righetti, L., Schaal, S.

In 2014 IEEE-RAS International Conference on Humanoid Robots, pages: 374-379, IEEE, Madrid, Spain, 2014 (inproceedings)

Abstract
Humanoid robots operating in human environments require whole-body controllers that can offer precise tracking and well-defined disturbance rejection behavior. In this contribution, we propose an experimental evaluation of a linear quadratic regulator (LQR) using a linearization of the full robot dynamics together with the contact constraints. The advantage of the controller is that it explicitly takes into account the coupling between the different joints to create optimal feedback controllers for whole-body control. We also propose a method to explicitly regulate other tasks of interest, such as the regulation of the center of mass of the robot or its angular momentum. In order to evaluate the performance of linear optimal control designs in a real-world scenario (model uncertainty, sensor noise, imperfect state estimation, etc), we test the controllers in a variety of tracking and balancing experiments on a torque controlled humanoid (e.g. balancing, split plane balancing, squatting, pushes while squatting, and balancing on a wheeled platform). The proposed control framework shows a reliable push recovery behavior competitive with more sophisticated balance controllers, rejecting impulses up to 11.7 Ns with peak forces of 650 N, with the added advantage of great computational simplicity. Furthermore, the controller is able to track squatting trajectories up to 1 Hz without relinearization, suggesting that the linearized dynamics is sufficient for significant ranges of motion.
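
As a rough illustration of the LQR design this abstract refers to, the sketch below computes a discrete-time LQR gain for a scalar system by iterating the Riccati recursion. The scalar model, the values of a, b, q and r, and the function name are illustrative assumptions; the paper itself works with a linearization of the full robot dynamics together with the contact constraints.

```python
def scalar_lqr_gain(a, b, q, r, iterations=500):
    """Iterate the discrete-time Riccati recursion for x[k+1] = a*x[k] + b*u[k].

    Stage cost: q*x^2 + r*u^2. Returns (gain, cost_to_go) with u = -gain * x.
    A toy stand-in for a full linearized robot model.
    """
    p = q  # start the recursion from the stage cost
    for _ in range(iterations):
        gain = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - a * b * p * gain
    gain = (a * b * p) / (r + b * b * p)
    return gain, p


# Hypothetical unstable plant (|a| > 1) that the feedback must stabilize.
A, B = 1.2, 1.0
K, P = scalar_lqr_gain(A, B, q=1.0, r=1.0)
closed_loop_pole = A - B * K  # magnitude below 1 means the loop is stable
```

The same recursion carries over to the matrix case, where the gain couples all states and inputs; that coupling between joints is the property the abstract highlights.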

link (url) DOI [BibTex]


State Estimation for a Humanoid Robot

Rotella, N., Bloesch, M., Righetti, L., Schaal, S.

In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 952-958, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
This paper introduces a framework for state estimation on a humanoid robot platform using only common proprioceptive sensors and knowledge of leg kinematics. The presented approach extends that detailed in prior work on a point-foot quadruped platform by adding the rotational constraints imposed by the humanoid's flat feet. As in previous work, the proposed Extended Kalman Filter accommodates contact switching and makes no assumptions about gait or terrain, making it applicable on any humanoid platform for use in any task. A nonlinear observability analysis is performed on both the point-foot and flat-foot filters and it is concluded that the addition of rotational constraints significantly simplifies singular cases and improves the observability characteristics of the system. Results on a simulated walking dataset demonstrate the performance gain of the flat-foot filter as well as confirm the results of the presented observability analysis.

link (url) DOI [BibTex]


2013


Probabilistic Object Tracking Using a Range Camera

Wüthrich, M., Pastor, P., Kalakrishnan, M., Bohg, J., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3195-3202, IEEE, November 2013 (inproceedings)

Abstract
We address the problem of tracking the 6-DoF pose of an object while it is being manipulated by a human or a robot. We use a dynamic Bayesian network to perform inference and compute a posterior distribution over the current object pose. Depending on whether a robot or a human manipulates the object, we employ a process model with or without knowledge of control inputs. Observations are obtained from a range camera. As opposed to previous object tracking methods, we explicitly model self-occlusions and occlusions from the environment, e.g., the human or robotic hand. This leads to a strongly non-linear observation model and additional dependencies in the Bayesian network. We employ a Rao-Blackwellised particle filter to compute an estimate of the object pose at every time step. In a set of experiments, we demonstrate the ability of our method to accurately and robustly track the object pose in real-time while it is being manipulated by a human or a robot.

arXiv Video Code Video DOI Project Page [BibTex]


Learning and Optimization with Submodular Functions

Sankaran, B., Ghazvininejad, M., He, X., Kale, D., Cohen, L.

ArXiv, May 2013 (techreport)

Abstract
In many naturally occurring optimization problems one needs to ensure that the definition of the optimization problem lends itself to solutions that are tractable to compute. In cases where exact solutions cannot be computed tractably, it is beneficial to have strong guarantees on the tractable approximate solutions. In order to operate under these criteria, most optimization problems are cast under the umbrella of convexity or submodularity. In this report we will study design and optimization over a common class of functions called submodular functions. Set functions, and specifically submodular set functions, characterize a wide variety of naturally occurring optimization problems, and the property of submodularity of set functions has deep theoretical consequences with wide ranging applications. Informally, the property of submodularity of set functions concerns the intuitive principle of diminishing returns. This property states that adding an element to a smaller set has more value than adding it to a larger set. Common examples of submodular monotone functions are entropies, concave functions of cardinality, and matroid rank functions; non-monotone examples include graph cuts, network flows, and mutual information. In this paper we will review the formal definition of submodularity; the optimization of submodular functions, both maximization and minimization; and finally discuss some applications in relation to learning and reasoning using submodular functions.
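
The diminishing-returns property described in this abstract can be made concrete with a small sketch. The example below uses a set-coverage function, a standard monotone submodular function that is not among the examples the abstract lists, checks the property by brute force, and runs the usual greedy heuristic for maximization under a cardinality limit. The data in `SETS` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def coverage(sets, chosen):
    """Number of distinct elements covered by the chosen named subsets."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def marginal_gain(sets, chosen, candidate):
    """Extra coverage obtained by adding `candidate` to `chosen`."""
    return coverage(sets, chosen | {candidate}) - coverage(sets, chosen)


def subsets(items):
    """Yield every subset of `items` as a set."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def has_diminishing_returns(sets):
    """Brute-force check: gain on a smaller set >= gain on any superset."""
    names = set(sets)
    for larger in subsets(names):
        for smaller in subsets(larger):
            for candidate in names - larger:
                if marginal_gain(sets, smaller, candidate) < marginal_gain(sets, larger, candidate):
                    return False
    return True


def greedy_max(sets, k):
    """Pick up to k subsets, each time taking the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        remaining = sorted(set(sets) - chosen)
        if not remaining:
            break
        best = max(remaining, key=lambda name: marginal_gain(sets, chosen, name))
        if marginal_gain(sets, chosen, best) == 0:
            break
        chosen.add(best)
    return chosen


# Hypothetical data: each named subset covers some elements of a ground set.
SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
PICKED = greedy_max(SETS, 2)
```

For monotone submodular functions under a cardinality constraint, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimal value, which is the kind of guarantee on tractable approximate solutions the abstract mentions.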

arxiv link (url) [BibTex]


Hypothesis Testing Framework for Active Object Detection

Sankaran, B., Atanasov, N., Le Ny, J., Koletschka, T., Pappas, G., Daniilidis, K.

In IEEE International Conference on Robotics and Automation (ICRA), May 2013, clmc (inproceedings)

Abstract
One of the central problems in computer vision is the detection of semantically important objects and the estimation of their pose. Most of the work in object detection has been based on single image processing and its performance is limited by occlusions and ambiguity in appearance and geometry. This paper proposes an active approach to object detection by controlling the point of view of a mobile depth camera. When an initial static detection phase identifies an object of interest, several hypotheses are made about its class and orientation. The sensor then plans a sequence of view-points, which balances the amount of energy used to move with the chance of identifying the correct hypothesis. We formulate an active M-ary hypothesis testing problem, which includes sensor mobility, and solve it using a point-based approximate POMDP algorithm. The validity of our approach is verified through simulation and experiments with real scenes captured by a Kinect sensor. The results suggest a significant improvement over static object detection.

pdf [BibTex]


Action and Goal Related Decision Variables Modulate the Competition Between Multiple Potential Targets

Enachescu, V., Christopoulos, V. N., Schrater, P. R., Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2013), February 2013 (inproceedings)

[BibTex]


The functional role of automatic body response in shaping voluntary actions based on muscle synergy theory

Alnajjar, F. S., Berenz, V., Shimoda, S.

In Neural Engineering (NER), 2013 6th International IEEE/EMBS Conference on, pages: 1230-1233, 2013 (inproceedings)

DOI [BibTex]


Coaching robots with biosignals based on human affective social behaviors

Suzuki, K., Gruebler, A., Berenz, V.

In ACM/IEEE International Conference on Human-Robot Interaction, HRI 2013, Tokyo, Japan, March 3-6, 2013, pages: 419-420, 2013 (inproceedings)

link (url) [BibTex]


Fusing visual and tactile sensing for 3-D object reconstruction while grasping

Ilonen, J., Bohg, J., Kyrki, V.

In IEEE International Conference on Robotics and Automation (ICRA), pages: 3547-3554, 2013 (inproceedings)

Abstract
In this work, we propose to reconstruct a complete 3-D model of an unknown object by fusion of visual and tactile information while the object is grasped. Assuming the object is symmetric, a first hypothesis of its complete 3-D shape is generated from a single view. This initial model is used to plan a grasp on the object which is then executed with a robotic manipulator equipped with tactile sensors. Given the detected contacts between the fingers and the object, the full object model including the symmetry parameters can be refined. This refined model will then allow the planning of more complex manipulation tasks. The main contribution of this work is an optimal estimation approach for the fusion of visual and tactile data applying the constraint of object symmetry. The fusion is formulated as a state estimation problem and solved with an iterative extended Kalman filter. The approach is validated experimentally using both artificial and real data from two different robotic platforms.

DOI Project Page [BibTex]


Learning Objective Functions for Manipulation

Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In 2013 IEEE International Conference on Robotics and Automation, IEEE, Karlsruhe, Germany, 2013 (inproceedings)

Abstract
We present an approach to learning objective functions for robotic manipulation based on inverse reinforcement learning. Our path integral inverse reinforcement learning algorithm can deal with high-dimensional continuous state-action spaces, and only requires local optimality of demonstrated trajectories. We use L1 regularization in order to achieve feature selection, and propose an efficient algorithm to minimize the resulting convex objective function. We demonstrate our approach by applying it to two core problems in robotic manipulation. First, we learn a cost function for redundancy resolution in inverse kinematics. Second, we use our method to learn a cost function over trajectories, which is then used in optimization-based motion planning for grasping and manipulation tasks. Experimental results show that our method outperforms previous algorithms in high-dimensional settings.

link (url) DOI [BibTex]


Learning Task Error Models for Manipulation

Pastor, P., Kalakrishnan, M., Binney, J., Kelly, J., Righetti, L., Sukhatme, G. S., Schaal, S.

In 2013 IEEE Conference on Robotics and Automation, IEEE, Karlsruhe, Germany, 2013 (inproceedings)

Abstract
Precise kinematic forward models are important for robots to successfully perform dexterous grasping and manipulation tasks, especially when visual servoing is rendered infeasible due to occlusions. A lot of research has been conducted to estimate geometric and non-geometric parameters of kinematic chains to minimize reconstruction errors. However, kinematic chains can include non-linearities, e.g. due to cable stretch and motor-side encoders, that result in significantly different errors for different parts of the state space. Previous work either does not consider such non-linearities or proposes to estimate non-geometric parameters of carefully engineered models that are robot specific. We propose a data-driven approach that learns task error models that account for such unmodeled non-linearities. We argue that in the context of grasping and manipulation, it is sufficient to achieve high accuracy in the task relevant state space. We identify this relevant state space using previously executed joint configurations and learn error corrections for those. Therefore, our system is developed to generate subsequent executions that are similar to previous ones. The experiments show that our method successfully captures the non-linearities in the head kinematic chain (due to a counterbalancing spring) and the arm kinematic chains (due to cable stretch) of the considered experimental platform, see Fig. 1. The feasibility of the presented error learning approach has also been evaluated in independent DARPA ARM-S testing, contributing to the successful completion of 67 out of 72 grasping and manipulation tasks.

link (url) DOI [BibTex]


2012


Towards Multi-DOF model mediated teleoperation: Using vision to augment feedback

Willaert, B., Bohg, J., Van Brussel, H., Niemeyer, G.

In IEEE International Workshop on Haptic Audio Visual Environments and Games (HAVE), pages: 25-31, October 2012 (inproceedings)

Abstract
In this paper, we address some of the challenges that arise as model-mediated teleoperation is applied to systems with multiple degrees of freedom and multiple sensors. Specifically we use a system with position, force, and vision sensors to explore an environment geometry in two degrees of freedom. The inclusion of vision is proposed to alleviate the difficulties of estimating an increasing number of environment properties. Vision can furthermore increase the predictive nature of model-mediated teleoperation, by effectively predicting touch feedback before the slave is even in contact with the environment. We focus on the case of estimating the location and orientation of a local surface patch at the contact point between the slave and the environment. We describe the various information sources with their respective limitations and create a combined model estimator as part of a multi-d.o.f. model-mediated controller. An experiment demonstrates the feasibility and benefits of utilizing vision sensors in teleoperation.

DOI [BibTex]


Failure Recovery with Shared Autonomy

Sankaran, B., Pitzer, B., Osentoski, S.

In International Conference on Intelligent Robots and Systems, October 2012 (inproceedings)

Abstract
Building robots capable of long term autonomy has been a long standing goal of robotics research. Such systems must be capable of performing certain tasks with a high degree of robustness and repeatability. In the context of personal robotics, these tasks could range anywhere from retrieving items from a refrigerator, loading a dishwasher, to setting up a dinner table. Given the complexity of tasks there are a multitude of failure scenarios that the robot can encounter, irrespective of whether the environment is static or dynamic. For a robot to be successful in such situations, it would need to know how to recover from failures or when to ask a human for help. This paper presents a novel shared autonomy behavioral executive to address these issues. We demonstrate how this executive combines generalized logic based recovery and human intervention to achieve continuous failure free operation. We tested the system over 250 trials of two different use case experiments. Our current algorithm drastically reduced human intervention from 26% to 4% on the first experiment and 46% to 9% on the second experiment. This system provides a new dimension to robot autonomy, where robots can exhibit long term failure free operation with minimal human supervision. We also discuss how the system can be generalized.

link (url) [BibTex]


Task-Based Grasp Adaptation on a Humanoid Robot

Bohg, J., Welke, K., León, B., Do, M., Song, D., Wohlkinger, W., Aldoma, A., Madry, M., Przybylski, M., Asfour, T., Marti, H., Kragic, D., Morales, A., Vincze, M.

In 10th IFAC Symposium on Robot Control, SyRoCo 2012, Dubrovnik, Croatia, September 5-7, 2012, pages: 779-786, September 2012 (inproceedings)

Abstract
In this paper, we present an approach towards autonomous grasping of objects according to their category and a given task. Recent advances in the field of object segmentation and categorization as well as task-based grasp inference have been leveraged by integrating them into one pipeline. This allows us to transfer task-specific grasp experience between objects of the same category. The effectiveness of the approach is demonstrated on the humanoid robot ARMAR-IIIa.

Video pdf DOI [BibTex]


Movement Segmentation and Recognition for Imitation Learning

Meier, F., Theodorou, E., Schaal, S.

In Fifteenth International Conference on Artificial Intelligence and Statistics, La Palma, Canary Islands, April 2012 (inproceedings)

link (url) [BibTex]


Event-based State Estimation with Switching Static-gain Observers

Trimpe, S.

In Proceedings of the 3rd IFAC Workshop on Distributed Estimation and Control in Networked Systems, 2012 (inproceedings)

PDF DOI [BibTex]


Usability benchmarks of the Targets-Drives-Means robotic architecture

Berenz, V., Suzuki, K.

In 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), Osaka, Japan, November 29 - Dec. 1, 2012, pages: 514-519, 2012 (inproceedings)

link (url) DOI [BibTex]


Event-based State Estimation with Variance-Based Triggering

Trimpe, S., D’Andrea, R.

In Proceedings of the 51st IEEE Conference on Decision and Control, 2012 (inproceedings)

PDF Supplementary material DOI [BibTex]


Inverse dynamics with optimal distribution of contact forces for the control of legged robots

Righetti, L., Schaal, S.

In Dynamic Walking 2012, Pensacola, 2012 (inproceedings)

[BibTex]


Encoding of Periodic and their Transient Motions by a Single Dynamic Movement Primitive

Ernesti, J., Righetti, L., Do, M., Asfour, T., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 57-64, IEEE, Osaka, Japan, November 2012 (inproceedings)

link (url) DOI [BibTex]


An adaptive sensor foot for a bipedal and quadrupedal robot

Fondahl, K., Kuehn, D., Beinersdorf, F., Bernhard, F., Grimminger, F., Schilling, M., Stark, T., Kirchner, F.

In 2012 4th IEEE RAS EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob), pages: 270-275, June 2012 (inproceedings)

DOI [BibTex]


Learning Force Control Policies for Compliant Robotic Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In ICML’12 Proceedings of the 29th International Conference on Machine Learning, pages: 49-50, Edinburgh, Scotland, 2012 (inproceedings)

[BibTex]


Quadratic programming for inverse dynamics with optimal distribution of contact forces

Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 538-543, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
In this contribution we propose an inverse dynamics controller for a humanoid robot that exploits torque redundancy to minimize any combination of linear and quadratic costs in the contact forces and the commands. In addition, the controller satisfies linear equality and inequality constraints in the contact forces and the commands such as torque limits, unilateral contacts or friction cone limits. The originality of our approach resides in the formulation of the problem as a quadratic program where we only need to solve for the control commands and where the contact forces are optimized implicitly. Furthermore, we do not need a structured representation of the dynamics of the robot (i.e. an explicit computation of the inertia matrix), in contrast with existing methods based on quadratic programs. The controller is then robust to uncertainty in the estimation of the dynamics model and the optimization is fast enough to be implemented in high bandwidth torque control loops that are increasingly available on humanoid platforms. We demonstrate properties of our controller with simulations of a human size humanoid robot.

link (url) DOI [BibTex]

Towards Associative Skill Memories

Pastor, P., Kalakrishnan, M., Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 309-315, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
Movement primitives as a basis of movement planning and control have become a popular topic in recent years. The key idea of movement primitives is that a rather small set of stereotypical movements should suffice to create a large set of complex manipulation skills. An interesting side effect of stereotypical movement is that it also creates stereotypical sensory events, e.g., in terms of kinesthetic variables, haptic variables, or, if processed appropriately, visual variables. Thus, a movement primitive executed towards a particular object in the environment will associate a large number of sensory variables that are typical for this manipulation skill. These associations can be used to increase robustness towards perturbations, and they also allow failure detection and switching towards other behaviors. We call such movement primitives augmented with sensory associations Associative Skill Memories (ASM). This paper addresses how ASMs can be acquired by imitation learning and how they can create robust manipulation skills by determining subsequent ASMs online to achieve a particular manipulation goal. Evaluations for grasping and manipulation with a Barrett WAM/Hand illustrate our approach.

link (url) DOI [BibTex]

Template-based learning of grasp selection

Herzog, A., Pastor, P., Kalakrishnan, M., Righetti, L., Asfour, T., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 2379-2384, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
The ability to grasp unknown objects is an important skill for personal robots, which has been addressed by many present and past research projects, but still remains an open problem. A crucial aspect of grasping is choosing an appropriate grasp configuration, i.e. the 6d pose of the hand relative to the object and its finger configuration. Finding feasible grasp configurations for novel objects, however, is challenging because of the huge variety in shape and size of these objects. Moreover, possible configurations also depend on the specific kinematics of the robotic arm and hand in use. In this paper, we introduce a new grasp selection algorithm able to find object grasp poses based on previously demonstrated grasps. Assuming that objects with similar shapes can be grasped in a similar way, we associate a grasp template with each demonstrated grasp. The template is a local shape descriptor for a possible grasp pose and is constructed using 3d information from depth sensors. For each new object to grasp, the algorithm then finds the best grasp candidate in the library of templates. The grasp selection is also able to improve over time using the information of previous grasp attempts to adapt the ranking of the templates. We tested the algorithm on two different platforms, the Willow Garage PR2 and the Barrett WAM arm, which have very different hands. Our results show that the algorithm is able to find good grasp configurations for a large set of objects from a relatively small set of demonstrations, and does indeed improve its performance over time.

link (url) DOI [BibTex]

Probabilistic depth image registration incorporating nonvisual information

Wüthrich, M., Pastor, P., Righetti, L., Billard, A., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 3637-3644, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
In this paper, we derive a probabilistic registration algorithm for object modeling and tracking. In many robotics applications, such as manipulation tasks, nonvisual information about the movement of the object is available, which we combine with the visual information. Furthermore, we consider not only observations of the object but also space that has been observed not to be part of the object. Finally, we compute a posterior distribution over the relative alignment rather than a point estimate as is typically computed in, for example, Iterative Closest Point (ICP). To our knowledge, no existing algorithm meets these three conditions, and we thus derive a novel registration algorithm in a Bayesian framework. Experimental results suggest that the proposed methods perform favorably in comparison to PCL [1] implementations of feature mapping and ICP, especially if nonvisual information is available.

link (url) DOI [BibTex]


2003


Dynamic movement primitives - A framework for motor control in humans and humanoid robots

Schaal, S.

In The International Symposium on Adaptive Motion of Animals and Machines, Kyoto, Japan, March 4-8, 2003, March 2003, clmc (inproceedings)

Abstract
Sensory-motor integration is one of the key issues in robotics. In this paper, we propose an approach to rhythmic arm movement control that is synchronized with an external signal based on exploiting a simple neural oscillator network. Trajectory generation by the neural oscillator is a biologically inspired method that can allow us to generate a smooth and continuous trajectory. The parameter tuning of the oscillators is used to generate a synchronized movement with wide intervals. We adopted the method for the drumming task as an example task. By using this method, the robot can realize synchronized drumming with wide drumming intervals in real time. The paper also shows the experimental results of drumming by a humanoid robot.

link (url) [BibTex]

Bayesian backfitting

D’Souza, A., Vijayakumar, S., Schaal, S.

In Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003), Irvine, CA, May 2003, 2003, clmc (inproceedings)

Abstract
We present an algorithm aimed at addressing both computational and analytical intractability of Bayesian regression models which operate in very high-dimensional, usually underconstrained spaces. Several domains of research frequently provide such datasets, including chemometrics [2], and human movement analysis [1]. The literature in nonparametric statistics provides interesting solutions such as Backfitting [3] and Partial Least Squares [4], which are extremely robust and efficient, yet lack a probabilistic interpretation that could place them in the context of current research in statistical learning algorithms that emphasize the estimation of confidence, posterior distributions, and model complexity. In order to achieve numerical robustness and low computational cost, we first derive a novel Bayesian interpretation of Backfitting (BB) as a computationally efficient regression algorithm. BB's learning complexity scales linearly with the input dimensionality by decoupling inference among individual input dimensions. We embed BB in an efficient, locally variational model selection mechanism that automatically grows the number of backfitting experts in a mixture-of-experts regression model. We demonstrate the effectiveness of the algorithm in performing principled regularization of model complexity when fitting nonlinear manifolds while avoiding the numerical hazards associated with highly underconstrained problems. We also note that this algorithm appears applicable in various areas of neural computation, e.g., in abstract models of computational neuroscience, or implementations of statistical learning on artificial systems.

link (url) [BibTex]

Reinforcement learning for humanoid robotics

Peters, J., Vijayakumar, S., Schaal, S.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids2003), Karlsruhe, Germany, Sept.29-30, 2003, clmc (inproceedings)

Abstract
Reinforcement learning offers one of the most general frameworks to take traditional robotics towards true autonomy and versatility. However, applying reinforcement learning to high-dimensional movement systems like humanoid robots remains an unsolved problem. In this paper, we discuss different approaches of reinforcement learning in terms of their applicability in humanoid robotics. Methods can be coarsely classified into three different categories, i.e., greedy methods, 'vanilla' policy gradient methods, and natural gradient methods. We argue that greedy methods are not likely to scale into the domain of humanoid robotics as they are problematic when used with function approximation. 'Vanilla' policy gradient methods, on the other hand, have been successfully applied on real-world robots including at least one humanoid robot. We demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. A derivation of the natural policy gradient is provided, proving that the average policy gradient of Kakade (2002) is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges to the nearest local minimum of the cost function with respect to the Fisher information metric under suitable conditions. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and for learning nonlinear dynamic motor primitives for humanoid robot control. It offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems.

link (url) [BibTex]

Discovering imitation strategies through categorization of multi-dimensional data

Billard, A., Epars, Y., Schaal, S., Cheng, G.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, Oct. 27-31, 2003, clmc (inproceedings)

Abstract
An essential problem of imitation is that of determining "what to imitate", i.e. to determine which of the many features of the demonstration are relevant to the task and which should be reproduced. The strategy followed by the imitator can be modeled as a hierarchical optimization system, which minimizes the discrepancy between two multidimensional datasets. We consider imitation of a manipulation task. To classify across manipulation strategies, we apply a probabilistic analysis to data in Cartesian and joint spaces. We determine a general metric that optimizes the policy of task reproduction, following strategy determination. The model successfully discovers strategies in six different manipulation tasks and controls task reproduction by a full body humanoid robot. or the complete path followed by the demonstrator. We follow a similar taxonomy and apply it to the learning and reproduction of a manipulation task by a humanoid robot. We take the perspective that the features of the movements to imitate are those that appear most frequently, i.e. the invariants in time. The model builds upon previous work [3], [4] and is composed of a hierarchical time delay neural network that extracts invariant features from a manipulation task performed by a human demonstrator. The system analyzes the Cartesian trajectories of the objects and the joint

link (url) [BibTex]

Scaling reinforcement learning paradigms for motor learning

Peters, J., Vijayakumar, S., Schaal, S.

In Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003), Irvine, CA, May 2003, 2003, clmc (inproceedings)

Abstract
Reinforcement learning offers a general framework to explain reward-related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high-dimensional movement systems and mainly operate in discrete, low-dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation, a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of destabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid motor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade's 'average natural policy gradient' is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum in Riemannian space of the cost function. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems.

link (url) [BibTex]

Design and Control of a Leg for the Running Machine PANTER

Berns, K., Grimminger, F., Hochholdinger, U., Kerscher, T., Albiez, J.

In Proceedings of the ICAR 2003–11th International Conference on Advanced Robotics, pages: 1737-1742, 2003 (inproceedings)

[BibTex]

Learning attractor landscapes for learning motor primitives

Ijspeert, A., Nakanishi, J., Schaal, S.

In Advances in Neural Information Processing Systems 15, pages: 1547-1554, (Editors: Becker, S.;Thrun, S.;Obermayer, K.), Cambridge, MA: MIT Press, 2003, clmc (inproceedings)

Abstract
If globally high-dimensional data has locally only low-dimensional distributions, it is advantageous to perform a local dimensionality reduction before further processing the data. In this paper we examine several techniques for local dimensionality reduction in the context of locally weighted linear regression. As possible candidates, we derive local versions of factor analysis regression, principal component regression, principal component regression on joint distributions, and partial least squares regression. After outlining the statistical bases of these methods, we perform Monte Carlo simulations to evaluate their robustness with respect to violations of their statistical assumptions. One surprising outcome is that locally weighted partial least squares regression offers the best average results, thus outperforming even factor analysis, the theoretically most appealing of our candidate techniques.

link (url) [BibTex]

PANTER-prototype for a fast-running quadruped robot with pneumatic muscles

Albiez, J., Kerscher, T., Grimminger, F., Hochholdinger, U., Dillmann, R., Berns, K.

In Proceedings of the 6th International Conference on Climbing and Walking Robots, pages: 617-624, 2003 (inproceedings)

[BibTex]

Learning from demonstration and adaptation of biped locomotion with dynamical movement primitives

Nakanishi, J., Morimoto, J., Endo, G., Schaal, S., Kawato, M.

In Workshop on Robot Learning by Demonstration, IEEE International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, Oct. 27-31, 2003, clmc (inproceedings)

Abstract
In this paper, we report on our research for learning biped locomotion from human demonstration. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a CPG of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through the movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a novel frequency adaptation algorithm based on phase resetting and entrainment of oscillators. Numerical simulations demonstrate the effectiveness of the proposed locomotion controller.

link (url) [BibTex]

Movement planning and imitation by shaping nonlinear attractors

Schaal, S.

In Proceedings of the 12th Yale Workshop on Adaptive and Learning Systems, Yale University, New Haven, CT, 2003, clmc (inproceedings)

Abstract
Given the continuous stream of movements that biological systems exhibit in their daily activities, an account for such versatility and creativity has to assume that movement sequences consist of segments, executed either in sequence or with partial or complete overlap. Therefore, a fundamental question that has pervaded research in motor control both in artificial and biological systems revolves around identifying movement primitives (a.k.a. units of actions, basis behaviors, motor schemas, etc.). What are the fundamental building blocks that are strung together, adapted to, and created for ever new behaviors? This paper summarizes results that led to the hypothesis of Dynamic Movement Primitives (DMP). DMPs are units of action that are formalized as stable nonlinear attractor systems. They are useful for autonomous robotics as they are highly flexible in creating complex rhythmic (e.g., locomotion) and discrete (e.g., a tennis swing) behaviors that can quickly be adapted to the inevitable perturbations of a dynamically changing, stochastic environment. Moreover, DMPs provide a formal framework that also lends itself to investigations in computational neuroscience. A recent finding that allows creating DMPs with the help of well-understood statistical learning methods has elevated DMPs from a more heuristic to a principled modeling approach, and, moreover, created a new foundation for imitation learning. Theoretical insights, evaluations on a humanoid robot, and behavioral and brain imaging data will serve to outline the framework of DMPs for a general approach to motor control and imitation in robotics and biology.

link (url) [BibTex]


2002


Learning rhythmic movements by demonstration using nonlinear oscillators

Ijspeert, J. A., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2002), pages: 958-963, Piscataway, NJ: IEEE, Lausanne, Sept.30-Oct.4 2002, 2002, clmc (inproceedings)

Abstract
Locally weighted learning (LWL) is a class of statistical learning techniques that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional beliefs that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested in up to 50 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing of a humanoid robot arm, and inverse-dynamics learning for a seven degree-of-freedom robot.

link (url) [BibTex]

Reliable stair climbing in the simple hexapod ’RHex’

Moore, E. Z., Campbell, D., Grimminger, F., Buehler, M.

In Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292), 3, pages: 2222-2227 vol.3, May 2002 (inproceedings)

DOI [BibTex]

Movement imitation with nonlinear dynamical systems in humanoid robots

Ijspeert, J. A., Nakanishi, J., Schaal, S.

In International Conference on Robotics and Automation (ICRA2002), Washington, May 11-15 2002, 2002, clmc (inproceedings)

Abstract
Locally weighted learning (LWL) is a class of statistical learning techniques that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional beliefs that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested in up to 50 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing of a humanoid robot arm, and inverse-dynamics learning for a seven degree-of-freedom robot.

link (url) [BibTex]

A locally weighted learning composite adaptive controller with structure adaptation

Nakanishi, J., Farrell, J. A., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, Sept.30-Oct.4 2002, 2002, clmc (inproceedings)

Abstract
This paper introduces a provably stable adaptive learning controller which employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the proposed learning adaptive control algorithm uses both the tracking error and the estimation error to update the parameters. We provide Lyapunov analyses that demonstrate the stability properties of the learning controller. Numerical simulations illustrate rapid convergence of the tracking error and the automatic structure adaptation capability of the function approximator.

link (url) [BibTex]
