Header logo is am


2011


no image
Learning, planning, and control for quadruped locomotion over challenging terrain

Kalakrishnan, Mrinal, Buchli, Jonas, Pastor, Peter, Mistry, Michael, Schaal, S.

International Journal of Robotics Research, 30(2):236-258, February 2011 (article)

[BibTex]

2011

[BibTex]


no image
Bayesian robot system identification with input and output noise

Ting, J., D’Souza, A., Schaal, S.

Neural Networks, 24(1):99-108, 2011, clmc (article)

Abstract
For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving an error of up to three times lower than other state-of-the-art machine learning methods

link (url) [BibTex]

link (url) [BibTex]


no image
Learning variable impedance control

Buchli, J., Stulp, F., Theodorou, E., Schaal, S.

International Journal of Robotics Research, 2011, clmc (article)

Abstract
One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks. In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PISq ({f P}olicy {f I}mprovement with {f P}ath {f I}ntegrals). PISq is a model-free, sampling based learning method derived from first principles of stochastic optimal control. The PISq algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particular useful property of PISq is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems becomes feasible. We sketch the PISq algorithm and its theoretical properties, and how it is applied to gain scheduling for variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider tasks involving accurate tracking through via-points, and manipulation tasks requiring physical contact with the environment. In these tasks, the optimal strategy requires both tuning of a reference trajectory emph{and} the impedance of the end-effector. The results show that we can use path integral based reinforcement learning not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.

link (url) [BibTex]

link (url) [BibTex]


no image
Understanding haptics by evolving mechatronic systems

Loeb, G. E., Tsianos, G.A., Fishel, J.A., Wettels, N., Schaal, S.

Progress in Brain Research, 192, pages: 129, 2011 (article)

[BibTex]

[BibTex]


no image
Intelligent Mobility—Autonomous Outdoor Robotics at the DFKI

Joyeux, S., Schwendner, J., Kirchner, F., Babu, A., Grimminger, F., Machowinski, J., Paranhos, P., Gaudig, C.

KI, 25(2):133-139, May 2011 (article)

DOI [BibTex]

DOI [BibTex]

2008


no image
Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in com- plex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for opertional space control as a direct inverse model learning problem. A first important insight for this paper is that a physically cor- rect solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function as- sociated with this optimal control problem allows us to formulate a learn- ing algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corre- sponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The applica- tion to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on real, physical Mitsubishi PA-10 medical robotics arm.

link (url) DOI [BibTex]

2008

link (url) DOI [BibTex]


no image
Adaptation to a sub-optimal desired trajectory

M. Mistry, E. A. G. L. T. Y. S. S. M. K.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

PDF [BibTex]


no image
Operational space control: A theoretical and emprical comparison

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

International Journal of Robotics Research, 27(6):737-757, 2008, clmc (article)

Abstract
Dexterous manipulation with a highly redundant movement system is one of the hallmarks of hu- man motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space con- trol in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics imple- mentations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors.

link (url) [BibTex]

link (url) [BibTex]


no image
A library for locally weighted projection regression

Klanke, S., Vijayakumar, S., Schaal, S.

Journal of Machine Learning Research, 9, pages: 623-626, 2008, clmc (article)

Abstract
In this paper we introduce an improved implementation of locally weighted projection regression (LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data. As the key features, our code supports multi-threading, is available for multiple platforms, and provides wrappers for several programming languages.

link (url) [BibTex]

link (url) [BibTex]


no image
Adaptive stair-climbing behaviour with a hybrid legged-wheeled robot

Eich, M., Grimminger, F., Kirchner, F.

In Advances In Mobile Robotics, pages: 768-775, World Scientific, August 2008 (incollection)

DOI [BibTex]

DOI [BibTex]


no image
Optimization strategies in human reinforcement learning

Hoffmann, H., Theodorou, E., Schaal, S.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

PDF [BibTex]

2002


no image
Forward models in visuomotor control

Mehta, B., Schaal, S.

J Neurophysiol, 88(2):942-53, August 2002, clmc (article)

Abstract
In recent years, an increasing number of research projects investigated whether the central nervous system employs internal models in motor control. While inverse models in the control loop can be identified more readily in both motor behavior and the firing of single neurons, providing direct evidence for the existence of forward models is more complicated. In this paper, we will discuss such an identification of forward models in the context of the visuomotor control of an unstable dynamic system, the balancing of a pole on a finger. Pole balancing imposes stringent constraints on the biological controller, as it needs to cope with the large delays of visual information processing while keeping the pole at an unstable equilibrium. We hypothesize various model-based and non-model-based control schemes of how visuomotor control can be accomplished in this task, including Smith Predictors, predictors with Kalman filters, tapped-delay line control, and delay-uncompensated control. Behavioral experiments with human participants allow exclusion of most of the hypothesized control schemes. In the end, our data support the existence of a forward model in the sensory preprocessing loop of control. As an important part of our research, we will provide a discussion of when and how forward models can be identified and also the possible pitfalls in the search for forward models in control.

link (url) [BibTex]

2002

link (url) [BibTex]


no image
Learning robot control

Schaal, S.

In The handbook of brain theory and neural networks, 2nd Edition, pages: 983-987, 2, (Editors: Arbib, M. A.), MIT Press, Cambridge, MA, 2002, clmc (inbook)

Abstract
This is a review article on learning control in robots.

link (url) [BibTex]

link (url) [BibTex]


no image
Arm and hand movement control

Schaal, S.

In The handbook of brain theory and neural networks, 2nd Edition, pages: 110-113, 2, (Editors: Arbib, M. A.), MIT Press, Cambridge, MA, 2002, clmc (inbook)

Abstract
This is a review article on computational and biological research on arm and hand control.

link (url) [BibTex]

link (url) [BibTex]


no image
Scalable techniques from nonparameteric statistics for real-time robot learning

Schaal, S., Atkeson, C. G., Vijayakumar, S.

Applied Intelligence, 17(1):49-60, 2002, clmc (article)

Abstract
Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on up to 90 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing by a humanoid robot arm, and inverse-dynamics learning for a seven and a 30 degree-of-freedom robot. In all these examples, the application of our statistical neural networks techniques allowed either faster or more accurate acquisition of motor control than classical control engineering.

link (url) [BibTex]

link (url) [BibTex]

2000


no image
A brachiating robot controller

Nakanishi, J., Fukuda, T., Koditschek, D. E.

IEEE Transactions on Robotics and Automation, 16(2):109-123, 2000, clmc (article)

Abstract
We report on our empirical studies of a new controller for a two-link brachiating robot. Motivated by the pendulum-like motion of an apeâ??s brachiation, we encode this task as the output of a â??target dynamical system.â? Numerical simulations indicate that the resulting controller solves a number of brachiation problems that we term the â??ladder,â? â??swing-up,â? and â??ropeâ? problems. Preliminary analysis provides some explanation for this success. The proposed controller is implemented on a physical system in our laboratory. The robot achieves behaviors including â??swing locomotionâ? and â??swing upâ? and is capable of continuous locomotion over several rungs of a ladder. We discuss a number of formal questions whose answers will be required to gain a full understanding of the strengths and weaknesses of this approach.

link (url) [BibTex]

2000

link (url) [BibTex]


no image
Biomimetic gaze stabilization

Shibata, T., Schaal, S.

In Robot learning: an Interdisciplinary approach, pages: 31-52, (Editors: Demiris, J.;Birk, A.), World Scientific, 2000, clmc (inbook)

Abstract
Accurate oculomotor control is one of the essential pre-requisites for successful visuomotor coordination. In this paper, we suggest a biologically inspired control system for learning gaze stabilization with a biomimetic robotic oculomotor system. In a stepwise fashion, we develop a control circuit for the vestibulo-ocular reflex (VOR) and the opto-kinetic response (OKR), and add a nonlinear learning network to allow adaptivity. We discuss the parallels and differences of our system with biological oculomotor control and suggest solutions how to deal with nonlinearities and time delays in the control system. In simulation and actual robot studies, we demonstrate that our system can learn gaze stabilization in real time in only a few seconds with high final accuracy.

link (url) [BibTex]

link (url) [BibTex]


no image
Interaction of rhythmic and discrete pattern generators in single joint movements

Sternad, D., Dean, W. J., Schaal, S.

Human Movement Science, 19(4):627-665, 2000, clmc (article)

Abstract
The study investigates a single-joint movement task that combines a translatory and cyclic component with the objective to investigate the interaction of discrete and rhythmic movement elements. Participants performed an elbow movement in the horizontal plane, oscillating at a prescribed frequency around one target and shifting to a second target upon a trigger signal, without stopping the oscillation. Analyses focused on extracting the mutual influences of the rhythmic and the discrete component of the task. Major findings are: (1) The onset of the discrete movement was confined to a limited phase window in the rhythmic cycle. (2) Its duration was influenced by the period of oscillation. (3) The rhythmic oscillation was "perturbed" by the discrete movement as indicated by phase resetting. On the basis of these results we propose a model for the coordination of discrete and rhythmic actions (K. Matsuoka, Sustained oscillations generated by mutually inhibiting neurons with adaptations, Biological Cybernetics 52 (1985) 367-376; Mechanisms of frequency and pattern control in the neural rhythm generators, Biological Cybernetics 56 (1987) 345-353). For rhythmic movements an oscillatory pattern generator is developed following models of half-center oscillations (D. Bullock, S. Grossberg, The VITE model: a neural command circuit for generating arm and articulated trajectories, in: J.A.S. Kelso, A.J. Mandel, M. F. Shlesinger (Eds.), Dynamic Patterns in Complex Systems. World Scientific. Singapore. 1988. pp. 305-326). For discrete movements a point attractor dynamics is developed close to the VITE model For each joint degree of freedom both pattern generators co-exist but exert mutual inhibition onto each other. The suggested modeling framework provides a unified account for both discrete and rhythmic movements on the basis of neuronal circuitry. Simulation results demonstrated that the effects observed in human performance can be replicated using the two pattern generators with a mutually inhibiting coupling.

link (url) [BibTex]


no image
Dynamics of a bouncing ball in human performance

Sternad, D., Duarte, M., Katsumata, H., Schaal, S.

Physical Review E, 63(011902):1-8, 2000, clmc (article)

Abstract
On the basis of a modified bouncing-ball model, we investigated whether human movements utilize principles of dynamic stability in their performance of a similar movement task. Stability analyses of the model provided predictions about conditions indicative of a dynamically stable period-one regime. In a series of experiments, human subjects bounced a ball rhythmically on a racket and displayed these conditions supporting that they attuned to and exploited the dynamic stability properties of the task.

link (url) [BibTex]

link (url) [BibTex]

1998


no image
Constructive incremental learning from only local information

Schaal, S., Atkeson, C. G.

Neural Computation, 10(8):2047-2084, 1998, clmc (article)

Abstract
We introduce a constructive, incremental learning system for regression problems that models data by means of spatially localized linear models. In contrast to other approaches, the size and shape of the receptive field of each locally linear model as well as the parameters of the locally linear model itself are learned independently, i.e., without the need for competition or any other kind of communication. Independent learning is accomplished by incrementally minimizing a weighted local cross validation error. As a result, we obtain a learning system that can allocate resources as needed while dealing with the bias-variance dilemma in a principled way. The spatial localization of the linear models increases robustness towards negative interference. Our learning system can be interpreted as a nonparametric adaptive bandwidth smoother, as a mixture of experts where the experts are trained in isolation, and as a learning system which profits from combining independent expert knowledge on the same problem. This paper illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields. 

link (url) [BibTex]

1998

link (url) [BibTex]


no image
Local adaptive subspace regression

Vijayakumar, S., Schaal, S.

Neural Processing Letters, 7(3):139-149, 1998, clmc (article)

Abstract
Incremental learning of sensorimotor transformations in high dimensional spaces is one of the basic prerequisites for the success of autonomous robot devices as well as biological movement systems. So far, due to sparsity of data in high dimensional spaces, learning in such settings requires a significant amount of prior knowledge about the learning task, usually provided by a human expert. In this paper we suggest a partial revision of the view. Based on empirical studies, we observed that, despite being globally high dimensional and sparse, data distributions from physical movement systems are locally low dimensional and dense. Under this assumption, we derive a learning algorithm, Locally Adaptive Subspace Regression, that exploits this property by combining a dynamically growing local dimensionality reduction technique  as a preprocessing step with a nonparametric learning technique, locally weighted regression, that also learns the region of validity of the regression. The usefulness of the algorithm and the validity of its assumptions are illustrated for a synthetic data set, and for data of the inverse dynamics of human arm movements and an actual 7 degree-of-freedom anthropomorphic robot arm. 

link (url) [BibTex]

link (url) [BibTex]

1993


no image
Learning passive motor control strategies with genetic algorithms

Schaal, S., Sternad, D.

In 1992 Lectures in complex systems, pages: 913-918, (Editors: Nadel, L.;Stein, D.), Addison-Wesley, Redwood City, CA, 1993, clmc (inbook)

Abstract
This study investigates learning passive motor control strategies. Passive control is understood as control without active error correction; the movement is stabilized by particular properties of the controlling dynamics. We analyze the task of juggling a ball on a racket. An approximation to the optimal solution of the task is derived by means of optimization theory. In order to model the learning process, the problem is coded for a genetic algorithm in representations without sensory or with sensory information. For all representations the genetic algorithm is able to find passive control strategies, but learning speed and the quality of the outcome are significantly different. A comparison with data from human subjects shows that humans seem to apply yet different movement strategies to the ones proposed. For the feedback representation some implications arise for learning from demonstration.

link (url) [BibTex]

1993

link (url) [BibTex]


no image
A genetic algorithm for evolution from an ecological perspective

Sternad, D., Schaal, S.

In 1992 Lectures in Complex Systems, pages: 223-231, (Editors: Nadel, L.;Stein, D.), Addison-Wesley, Redwood City, CA, 1993, clmc (inbook)

Abstract
In the population model presented, an evolutionary dynamic is explored which is based on the operator characteristics of genetic algorithms. An essential modification in the genetic algorithms is the inclusion of a constraint in the mixing of the gene pool. The pairing for the crossover is governed by a selection principle based on a complementarity criterion derived from the theoretical tenet of perception-action (P-A) mutuality of ecological psychology. According to Swenson and Turvey [37] P-A mutuality underlies evolution and is an integral part of its thermodynamics. The present simulation tested the contribution of P-A-cycles in evolutionary dynamics. A numerical experiment compares the population's evolution with and without this intentional component. The effect is measured in the difference of the rate of energy dissipation, as well as in three operationalized aspects of complexity. The results support the predicted increase in the rate of energy dissipation, paralleled by an increase in the average heterogeneity of the population. Furthermore, the spatio-temporal evolution of the system is tested for the characteristic power-law relations of a nonlinear system poised in a critical state. The frequency distribution of consecutive increases in population size shows a significantly different exponent in functional relationship.

[BibTex]

[BibTex]


no image
Design concurrent calculation: A CAD- and data-integrated approach

Schaal, S., Ehrlenspiel, K.

Journal of Engineering Design, 4, pages: 71-85, 1993, clmc (article)

Abstract
Besides functional regards, product design demands increasingly more for further reaching considerations. Quality alone cannot suffice anymore to compete in the market; design for manufacturability, for assembly, for recycling, etc., are well-known keywords. Those can largely be reduced to the necessity of design for costs. This paper focuses on a CAD-based approach to design concurrent calculation. It will discuss how, in the meantime well-established, tools like feature technology, knowledge-based systems, and relational databases can be blended into one coherent concept to achieve an entirely CAD- and data-integrated cost information tool. This system is able to extract data from the CAD-system, combine it with data about the company specific manufacturing environment, and subsequently autonomously evaluate manufacturability aspects and costs of the given CAD-model. Within minutes the designer gets quantitative in-formation about the major cost sources of his/her design. Additionally, some alternative methods for approximating manu-facturing times from empirical data, namely neural networks and local weighted regression, are introduced.

[BibTex]

[BibTex]

1991


no image
Ways to smarter CAD-systems

Ehrlenspiel, K., Schaal, S.

In Proceedings of ICED’91Heurista, pages: 10-16, (Editors: Hubka), Edition, Schriftenreihe WDK 21. Zürich, 1991, clmc (inbook)

[BibTex]

1991

[BibTex]