Header logo is am


2012


Thumb xl screen shot 2015 08 23 at 13.56.29
Towards Multi-DOF model mediated teleoperation: Using vision to augment feedback

Willaert, B., Bohg, J., Van Brussel, H., Niemeyer, G.

In IEEE International Workshop on Haptic Audio Visual Environments and Games (HAVE), pages: 25-31, October 2012 (inproceedings)

Abstract
In this paper, we address some of the challenges that arise as model-mediated teleoperation is applied to systems with multiple degrees of freedom and multiple sensors. Specifically we use a system with position, force, and vision sensors to explore an environment geometry in two degrees of freedom. The inclusion of vision is proposed to alleviate the difficulties of estimating an increasing number of environment properties. Vision can furthermore increase the predictive nature of model-mediated teleoperation, by effectively predicting touch feedback before the slave is even in contact with the environment. We focus on the case of estimating the location and orientation of a local surface patch at the contact point between the slave and the environment. We describe the various information sources with their respective limitations and create a combined model estimator as part of a multi-d.o.f. model-mediated controller. An experiment demonstrates the feasibility and benefits of utilizing vision sensors in teleoperation.

DOI [BibTex]

2012

DOI [BibTex]


Thumb xl sankaran iros 20121
Failure Recovery with Shared Autonomy

Sankaran, B., Pitzer, B., Osentoski, S.

In International Conference on Intelligent Robots and Systems, October 2012 (inproceedings)

Abstract
Building robots capable of long term autonomy has been a long standing goal of robotics research. Such systems must be capable of performing certain tasks with a high degree of robustness and repeatability. In the context of personal robotics, these tasks could range anywhere from retrieving items from a refrigerator, loading a dishwasher, to setting up a dinner table. Given the complexity of tasks there are a multitude of failure scenarios that the robot can encounter, irrespective of whether the environment is static or dynamic. For a robot to be successful in such situations, it would need to know how to recover from failures or when to ask a human for help. This paper, presents a novel shared autonomy behavioral executive to addresses these issues. We demonstrate how this executive combines generalized logic based recovery and human intervention to achieve continuous failure free operation. We tested the systems over 250 trials of two different use case experiments. Our current algorithm drastically reduced human intervention from 26% to 4% on the first experiment and 46% to 9% on the second experiment. This system provides a new dimension to robot autonomy, where robots can exhibit long term failure free operation with minimal human supervision. We also discuss how the system can be generalized.

link (url) [BibTex]

link (url) [BibTex]


Thumb xl bottlehandovergrasp
Task-Based Grasp Adaptation on a Humanoid Robot

Bohg, J., Welke, K., León, B., Do, M., Song, D., Wohlkinger, W., Aldoma, A., Madry, M., Przybylski, M., Asfour, T., Marti, H., Kragic, D., Morales, A., Vincze, M.

In 10th IFAC Symposium on Robot Control, SyRoCo 2012, Dubrovnik, Croatia, September 5-7, 2012., pages: 779-786, September 2012 (inproceedings)

Abstract
In this paper, we present an approach towards autonomous grasping of objects according to their category and a given task. Recent advances in the field of object segmentation and categorization as well as task-based grasp inference have been leveraged by integrating them into one pipeline. This allows us to transfer task-specific grasp experience between objects of the same category. The effectiveness of the approach is demonstrated on the humanoid robot ARMAR-IIIa.

Video pdf DOI [BibTex]

Video pdf DOI [BibTex]


no image
Movement Segmentation and Recognition for Imitation Learning

Meier, F., Theodorou, E., Schaal, S.

In Seventeenth International Conference on Artificial Intelligence and Statistics, La Palma, Canary Islands, Fifteenth International Conference on Artificial Intelligence and Statistics , April 2012 (inproceedings)

link (url) [BibTex]

link (url) [BibTex]


no image
From Dynamic Movement Primitives to Associative Skill Memories

Pastor, P., Kalakrishnan, M., Meier, F., Stulp, F., Buchli, J., Theodorou, E., Schaal, S.

Robotics and Autonomous Systems, 2012 (article)

Project Page [BibTex]

Project Page [BibTex]


no image
Inverse dynamics with optimal distribution of contact forces for the control of legged robots

Righetti, L., Schaal, S.

In Dynamic Walking 2012, Pensacola, 2012 (inproceedings)

[BibTex]

[BibTex]


no image
Encoding of Periodic and their Transient Motions by a Single Dynamic Movement Primitive

Ernesti, J., Righetti, L., Do, M., Asfour, T., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 57-64, IEEE, Osaka, Japan, November 2012 (inproceedings)

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Force Control Policies for Compliant Robotic Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In ICML’12 Proceedings of the 29th International Coference on International Conference on Machine Learning, pages: 49-50, Edinburgh, Scotland, 2012 (inproceedings)

[BibTex]

[BibTex]


no image
Quadratic programming for inverse dynamics with optimal distribution of contact forces

Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 538-543, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
In this contribution we propose an inverse dynamics controller for a humanoid robot that exploits torque redundancy to minimize any combination of linear and quadratic costs in the contact forces and the commands. In addition the controller satisfies linear equality and inequality constraints in the contact forces and the commands such as torque limits, unilateral contacts or friction cones limits. The originality of our approach resides in the formulation of the problem as a quadratic program where we only need to solve for the control commands and where the contact forces are optimized implicitly. Furthermore, we do not need a structured representation of the dynamics of the robot (i.e. an explicit computation of the inertia matrix). It is in contrast with existing methods based on quadratic programs. The controller is then robust to uncertainty in the estimation of the dynamics model and the optimization is fast enough to be implemented in high bandwidth torque control loops that are increasingly available on humanoid platforms. We demonstrate properties of our controller with simulations of a human size humanoid robot.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Model-free reinforcement learning of impedance control in stochastic environments

Stulp, Freek, Buchli, Jonas, Ellmer, Alice, Mistry, Michael, Theodorou, Evangelos A., Schaal, S.

Autonomous Mental Development, IEEE Transactions on, 4(4):330-341, 2012 (article)

[BibTex]

[BibTex]


no image
Towards Associative Skill Memories

Pastor, P., Kalakrishnan, M., Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 309-315, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
Movement primitives as basis of movement planning and control have become a popular topic in recent years. The key idea of movement primitives is that a rather small set of stereotypical movements should suffice to create a large set of complex manipulation skills. An interesting side effect of stereotypical movement is that it also creates stereotypical sensory events, e.g., in terms of kinesthetic variables, haptic variables, or, if processed appropriately, visual variables. Thus, a movement primitive executed towards a particular object in the environment will associate a large number of sensory variables that are typical for this manipulation skill. These association can be used to increase robustness towards perturbations, and they also allow failure detection and switching towards other behaviors. We call such movement primitives augmented with sensory associations Associative Skill Memories (ASM). This paper addresses how ASMs can be acquired by imitation learning and how they can create robust manipulation skill by determining subsequent ASMs online to achieve a particular manipulation goal. Evaluation for grasping and manipulation with a Barrett WAM/Hand illustrate our approach.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Template-based learning of grasp selection

Herzog, A., Pastor, P., Kalakrishnan, M., Righetti, L., Asfour, T., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 2379-2384, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
The ability to grasp unknown objects is an important skill for personal robots, which has been addressed by many present and past research projects, but still remains an open problem. A crucial aspect of grasping is choosing an appropriate grasp configuration, i.e. the 6d pose of the hand relative to the object and its finger configuration. Finding feasible grasp configurations for novel objects, however, is challenging because of the huge variety in shape and size of these objects. Moreover, possible configurations also depend on the specific kinematics of the robotic arm and hand in use. In this paper, we introduce a new grasp selection algorithm able to find object grasp poses based on previously demonstrated grasps. Assuming that objects with similar shapes can be grasped in a similar way, we associate to each demonstrated grasp a grasp template. The template is a local shape descriptor for a possible grasp pose and is constructed using 3d information from depth sensors. For each new object to grasp, the algorithm then finds the best grasp candidate in the library of templates. The grasp selection is also able to improve over time using the information of previous grasp attempts to adapt the ranking of the templates. We tested the algorithm on two different platforms, the Willow Garage PR2 and the Barrett WAM arm which have very different hands. Our results show that the algorithm is able to find good grasp configurations for a large set of objects from a relatively small set of demonstrations, and does indeed improve its performance over time.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Reinforcement Learning with Sequences of Motion Primitives for Robust Manipulation

Stulp, F., Theodorou, E., Schaal, S.

IEEE Transactions on Robotics, 2012 (article)

[BibTex]

[BibTex]


no image
Probabilistic depth image registration incorporating nonvisual information

Wüthrich, M., Pastor, P., Righetti, L., Billard, A., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 3637-3644, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
In this paper, we derive a probabilistic registration algorithm for object modeling and tracking. In many robotics applications, such as manipulation tasks, nonvisual information about the movement of the object is available, which we will combine with the visual information. Furthermore we do not only consider observations of the object, but we also take space into account which has been observed to not be part of the object. Furthermore we are computing a posterior distribution over the relative alignment and not a point estimate as typically done in for example Iterative Closest Point (ICP). To our knowledge no existing algorithm meets these three conditions and we thus derive a novel registration algorithm in a Bayesian framework. Experimental results suggest that the proposed methods perform favorably in comparison to PCL [1] implementations of feature mapping and ICP, especially if nonvisual information is available.

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2005


no image
Composite adaptive control with locally weighted statistical learning

Nakanishi, J., Farrell, J. A., Schaal, S.

Neural Networks, 18(1):71-90, January 2005, clmc (article)

Abstract
This paper introduces a provably stable learning adaptive control framework with statistical learning. The proposed algorithm employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the proposed learning adaptive control algorithm uses both the tracking error and the estimation error to update the parameters. We first discuss statistical learning of nonlinear functions, and motivate our choice of the locally weighted learning framework. Second, we begin with a class of first order SISO systems for theoretical development of our learning adaptive control framework, and present a stability proof including a parameter projection method that is needed to avoid potential singularities during adaptation. Then, we generalize our adaptive controller to higher order SISO systems, and discuss further extension to MIMO problems. Finally, we evaluate our theoretical control framework in numerical simulations to illustrate the effectiveness of the proposed learning adaptive controller for rapid convergence and high accuracy of control.

link (url) [BibTex]

2005

link (url) [BibTex]


no image
Natural Actor-Critic

Peters, J., Vijayakumar, S., Schaal, S.

In Proceedings of the 16th European Conference on Machine Learning, 3720, pages: 280-291, (Editors: Gama, J.;Camacho, R.;Brazdil, P.;Jorge, A.;Torgo, L.), Springer, ECML, 2005, clmc (inproceedings)

Abstract
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing AmariÕs natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regres- sion. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and BradtkeÕs Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Em- pirical evaluations illustrate the effectiveness of our techniques in com- parison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Comparative experiments on task space control with redundancy resolution

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3901-3908, Edmonton, Alberta, Canada, Aug. 2-6, IROS, 2005, clmc (inproceedings)

Abstract
Understanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundacy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, despite, from a theoretical point, the algorithm does not share the same beauty as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics.

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
A model of smooth pursuit based on learning of the target dynamics using only retinal signals

Shibata, T., Tabata, H., Schaal, S., Kawato, M.

Neural Networks, 18, pages: 213-225, 2005, clmc (article)

Abstract
While the predictive nature of the primate smooth pursuit system has been evident through several behavioural and neurophysiological experiments, few models have attempted to explain these results comprehensively. The model we propose in this paper in line with previous models employing optimal control theory; however, we hypothesize two new issues: (1) the medical superior temporal (MST) area in the cerebral cortex implements a recurrent neural network (RNN) in order to predict the current or future target velocity, and (2) a forward model of the target motion is acquired by on-line learning. We use stimulation studies to demonstrate how our new model supports these hypotheses.

link (url) [BibTex]

link (url) [BibTex]


no image
Linear and Nonlinear Estimation models applied to Hemodynamic Model

Theodorou, E.

Technical Report-2005-1, Computational Action and Vision Lab University of Minnesota, 2005, clmc (techreport)

Abstract
The relation between BOLD signal and neural activity is still poorly understood. The Gaussian Linear Model known as GLM is broadly used in many fMRI data analysis for recovering the underlying neural activity. Although GLM has been proved to be a really useful tool for analyzing fMRI data it can not be used for describing the complex biophysical process of neural metabolism. In this technical report we make use of a system of Stochastic Differential Equations that is based on Buxton model [1] for describing the underlying computational principles of hemodynamic process. Based on this SDE we built a Kalman Filter estimator so as to estimate the induced neural signal as well as the blood inflow under physiologic and sensor noise. The performance of Kalman Filter estimator is investigated under different physiologic noise characteristics and measurement frequencies.

PDF [BibTex]

PDF [BibTex]


no image
Predicting EMG Data from M1 Neurons with Variational Bayesian Least Squares

Ting, J., D’Souza, A., Yamamoto, K., Yoshioka, T., Hoffman, D., Kakei, S., Sergio, L., Kalaska, J., Kawato, M., Strick, P., Schaal, S.

In Advances in Neural Information Processing Systems 18 (NIPS 2005), (Editors: Weiss, Y.;Schölkopf, B.;Platt, J.), Cambridge, MA: MIT Press, Vancouver, BC, Dec. 6-11, 2005, clmc (inproceedings)

Abstract
An increasing number of projects in neuroscience requires the statistical analysis of high dimensional data sets, as, for instance, in predicting behavior from neural firing, or in operating artificial devices from brain recordings in brain-machine interfaces. Linear analysis techniques remain prevalent in such cases, but classi-cal linear regression approaches are often numercially too fragile in high dimen-sions. In this paper, we address the question of whether EMG data collected from arm movements of monkeys can be faithfully reconstructed with linear ap-proaches from neural activity in primary motor cortex (M1). To achieve robust data analysis, we develop a full Bayesian approach to linear regression that automatically detects and excludes irrelevant features in the data, and regular-izes against overfitting. In comparison with ordinary least squares, stepwise re-gression, partial least squares, and a brute force combinatorial search for the most predictive input features in the data, we demonstrate that the new Bayesian method offers a superior mixture of characteristics in terms of regularization against overfitting, computational efficiency, and ease of use, demonstrating its potential as a drop-in replacement for other linear regression techniques. As neuroscientific results, our analyses demonstrate that EMG data can be well pre-dicted from M1 neurons, further opening the path for possible real-time inter-faces between brains and machines.

link (url) [BibTex]

link (url) [BibTex]


no image
Rapbid synchronization and accurate phase-locking of rhythmic motor primitives

Pongas, D., Billard, A., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 2911-2916, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Rhythmic movement is ubiquitous in human and animal behavior, e.g., as in locomotion, dancing, swimming, chewing, scratching, music playing, etc. A particular feature of rhythmic movement in biology is the rapid synchronization and phase locking with other rhythmic events in the environment, for instance music or visual stimuli as in ball juggling. In traditional oscillator theories to rhythmic movement generation, synchronization with another signal is relatively slow, and it is not easy to achieve accurate phase locking with a particular feature of the driving stimulus. Using a recently developed framework of dynamic motor primitives, we demonstrate a novel algorithm for very rapid synchronizaton of a rhythmic movement pattern, which can phase lock any feature of the movement to any particulur event in the driving stimulus. As an example application, we demonstrate how an anthropomorphic robot can use imitation learning to acquire a complex rumming pattern and keep it synchronized with an external rhythm generator that changes its frequency over time.

link (url) [BibTex]

link (url) [BibTex]


no image
Parametric and Non-Parametric approaches for nonlinear tracking of moving objects

Hidaka, Y, Theodorou, E.

Technical Report-2005-1, 2005, clmc (article)

PDF [BibTex]

PDF [BibTex]


no image
A new methodology for robot control design

Peters, J., Mistry, M., Udwadia, F. E., Schaal, S.

In The 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005), Long Beach, CA, Sept. 24-28, 2005, clmc (inproceedings)

Abstract
Gauss principle of least constraint and its generalizations have provided a useful insights for the development of tracking controllers for mechanical systems (Udwadia,2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with or without external constraints, holonomic or nonholonomic constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.

link (url) [BibTex]

link (url) [BibTex]


no image
Arm movement experiments with joint space force fields using an exoskeleton robot

Mistry, M., Mohajerian, P., Schaal, S.

In IEEE Ninth International Conference on Rehabilitation Robotics, pages: 408-413, Chicago, Illinois, June 28-July 1, 2005, clmc (inproceedings)

Abstract
A new experimental platform permits us to study a novel variety of issues of human motor control, particularly full 3-D movements involving the major seven degrees-of-freedom (DOF) of the human arm. We incorporate a seven DOF robot exoskeleton, and can minimize weight and inertia through gravity, Coriolis, and inertia compensation, such that subjects' arm movements are largely unaffected by the manipulandum. Torque perturbations can be individually applied to any or all seven joints of the human arm, thus creating novel dynamic environments, or force fields, for subjects to respond and adapt to. Our first study investigates a joint space force field where the shoulder velocity drives a disturbing force in the elbow joint. Results demonstrate that subjects learn to compensate for the force field within about 100 trials, and from the strong presence of aftereffects when removing the field in some randomized catch trials, that an inverse dynamics, or internal model, of the force field is formed by the nervous system. Interestingly, while post-learning hand trajectories return to baseline, joint space trajectories remained changed in response to the field, indicating that besides learning a model of the force field, the nervous system also chose to exploit the space to minimize the effects of the force field on the realization of the endpoint trajectory plan. Further applications for our apparatus include studies in motor system redundancy resolution and inverse kinematics, as well as rehabilitation.

link (url) [BibTex]

link (url) [BibTex]


no image
A unifying framework for the control of robotics systems

Peters, J., Mistry, M., Udwadia, F. E., Cory, R., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 1824-1831, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Recently, [1] suggested to derive tracking controllers for mechanical systems using a generalization of GaussÕ principle of least constraint. This method al-lows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sar-cos Master Arm robot for some of the the derived controllers.We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equa-tions, both with or without external constraints, with over-actuation or under-actuation, as well as open-chain and closed-chain kinematics.

link (url) [BibTex]

link (url) [BibTex]

1999


no image
Is imitation learning the route to humanoid robots?

Schaal, S.

Trends in Cognitive Sciences, 3(6):233-242, 1999, clmc (article)

Abstract
This review will focus on two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It will be postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. This hope is justified because imitation learning channels research efforts towards three important issues: efficient motor learning, the connection between action and perception, and modular motor control in form of movement primitives. In order to make these points, first, a brief review of imitation learning will be given from the view of psychology and neuroscience. In these fields, representations and functional connections between action and perception have been explored that contribute to the understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution provided a first idea of the possible neural basis of imitation. Secondly, computational approaches to imitation learning will be described, initially from the perspective of traditional AI and robotics, and then with a focus on neural network models and statistical learning research. Parallels and differences between biological and computational approaches to imitation will be highlighted. The review will end with an overview of current projects that actually employ imitation learning for humanoid robots.

link (url) [BibTex]

1999

link (url) [BibTex]


no image
Nonparametric regression for learning nonlinear transformations

Schaal, S.

In Prerational Intelligence in Strategies, High-Level Processes and Collective Behavior, 2, pages: 595-621, (Editors: Ritter, H.;Cruse, H.;Dean, J.), Kluwer Academic Publishers, 1999, clmc (inbook)

Abstract
Information processing in animals and artificial movement systems consists of a series of transformations that map sensory signals to intermediate representations, and finally to motor commands. Given the physical and neuroanatomical differences between individuals and the need for plasticity during development, it is highly likely that such transformations are learned rather than pre-programmed by evolution. Such self-organizing processes, capable of discovering nonlinear dependencies between different groups of signals, are one essential part of prerational intelligence. While neural network algorithms seem to be the natural choice when searching for solutions for learning transformations, this paper will take a more careful look at which types of neural networks are actually suited for the requirements of an autonomous learning system. The approach that we will pursue is guided by recent developments in learning theory that have linked neural network learning to well established statistical theories. In particular, this new statistical understanding has given rise to the development of neural network systems that are directly based on statistical methods. One family of such methods stems from nonparametric regression. This paper will compare nonparametric learning with the more widely used parametric counterparts in a non technical fashion, and investigate how these two families differ in their properties and their applicabilities. We will argue that nonparametric neural networks offer a set of characteristics that make them a very promising candidate for on-line learning in autonomous system.

link (url) [BibTex]

link (url) [BibTex]


no image
Segmentation of endpoint trajectories does not imply segmented control

Sternad, D., Schaal, D.

Experimental Brain Research, 124(1):118-136, 1999, clmc (article)

Abstract
While it is generally assumed that complex movements consist of a sequence of simpler units, the quest to define these units of action, or movement primitives, still remains an open question. In this context, two hypotheses of movement segmentation of endpoint trajectories in 3D human drawing movements are re-examined: (1) the stroke-based segmentation hypothesis based on the results that the proportionality coefficient of the 2/3 power law changes discontinuously with each new â??strokeâ?, and (2) the segmentation hypothesis inferred from the observation of piecewise planar endpoint trajectories of 3D drawing movements. In two experiments human subjects performed a set of elliptical and figure-8 patterns of different sizes and orientations using their whole arm in 3D. The kinematic characteristics of the endpoint trajectories and the seven joint angles of the arm were analyzed. While the endpoint trajectories produced similar segmentation features as reported in the literature, analyses of the joint angles show no obvious segmentation but rather continuous oscillatory patterns. By approximating the joint angle data of human subjects with sinusoidal trajectories, and by implementing this model on a 7-degree-of-freedom anthropomorphic robot arm, it is shown that such a continuous movement strategy can produce exactly the same features as observed by the above segmentation hypotheses. The origin of this apparent segmentation of endpoint trajectories is traced back to the nonlinear transformations of the forward kinematics of human arms. The presented results demonstrate that principles of discrete movement generation may not be reconciled with those of rhythmic movement as easily as has been previously suggested, while the generalization of nonlinear pattern generators to arm movements can offer an interesting alternative to approach the question of units of action.

link (url) [BibTex]

link (url) [BibTex]

1997


no image
Locally weighted learning

Atkeson, C. G., Moore, A. W., Schaal, S.

Artificial Intelligence Review, 11(1-5):11-73, 1997, clmc (article)

Abstract
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control. Keywords: locally weighted regression, LOESS, LWR, lazy learning, memory-based learning, least commitment learning, distance functions, smoothing parameters, weighting functions, global tuning, local tuning, interference.

link (url) [BibTex]

1997

link (url) [BibTex]


no image
Locally weighted learning for control

Atkeson, C. G., Moore, A. W., Schaal, S.

Artificial Intelligence Review, 11(1-5):75-113, 1997, clmc (article)

Abstract
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control. Keywords: locally weighted regression, LOESS, LWR, lazy learning, memory-based learning, least commitment learning, forward models, inverse models, linear quadratic regulation (LQR), shifting setpoint algorithm, dynamic programming.

link (url) [BibTex]

link (url) [BibTex]


no image
Learning from demonstration

Schaal, S.

In Advances in Neural Information Processing Systems 9, pages: 1040-1046, (Editors: Mozer, M. C.;Jordan, M.;Petsche, T.), MIT Press, Cambridge, MA, 1997, clmc (inproceedings)

Abstract
By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases as well as strategies how to approach a learning problem from instructions and/or demonstrations of other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speed-up after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30 second long demonstration of the human instructor. 

link (url) [BibTex]

link (url) [BibTex]


no image
Robot learning from demonstration

Atkeson, C. G., Schaal, S.

In Machine Learning: Proceedings of the Fourteenth International Conference (ICML ’97), pages: 12-20, (Editors: Fisher Jr., D. H.), Morgan Kaufmann, Nashville, TN, July 8-12, 1997, 1997, clmc (inproceedings)

Abstract
The goal of robot learning from demonstration is to have a robot learn from watching a demonstration of the task to be performed. In our approach to learning from demonstration the robot learns a reward function from the demonstration and a task model from repeated attempts to perform the task. A policy is computed based on the learned reward function and task model. Lessons learned from an implementation on an anthropomorphic robot arm using a pendulum swing up task include 1) simply mimicking demonstrated motions is not adequate to perform this task, 2) a task planner can use a learned model and reward function to compute an appropriate policy, 3) this model-based planning process supports rapid learning, 4) both parametric and nonparametric models can be learned and used, and 5) incorporating a task level direct learning component, which is non-model-based, in addition to the model-based planner, is useful in compensating for structural modeling errors and slow model learning. 

link (url) [BibTex]

link (url) [BibTex]


no image
Local dimensionality reduction for locally weighted learning

Vijayakumar, S., Schaal, S.

In International Conference on Computational Intelligence in Robotics and Automation, pages: 220-225, Monteray, CA, July10-11, 1997, 1997, clmc (inproceedings)

Abstract
Incremental learning of sensorimotor transformations in high dimensional spaces is one of the basic prerequisites for the success of autonomous robot devices as well as biological movement systems. So far, due to sparsity of data in high dimensional spaces, learning in such settings requires a significant amount of prior knowledge about the learning task, usually provided by a human expert. In this paper we suggest a partial revision of the view. Based on empirical studies, it can been observed that, despite being globally high dimensional and sparse, data distributions from physical movement systems are locally low dimensional and dense. Under this assumption, we derive a learning algorithm, Locally Adaptive Subspace Regression, that exploits this property by combining a local dimensionality reduction as a preprocessing step with a nonparametric learning technique, locally weighted regression. The usefulness of the algorithm and the validity of its assumptions are illustrated for a synthetic data set and data of the inverse dynamics of an actual 7 degree-of-freedom anthropomorphic robot arm.

link (url) [BibTex]

link (url) [BibTex]


no image
Learning tasks from a single demonstration

Atkeson, C. G., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA97), 2, pages: 1706-1712, Piscataway, NJ: IEEE, Albuquerque, NM, 20-25 April, 1997, clmc (inproceedings)

Abstract
Learning a complex dynamic robot manoeuvre from a single human demonstration is difficult. This paper explores an approach to learning from demonstration based on learning an optimization criterion from the demonstration and a task model from repeated attempts to perform the task, and using the learned criterion and model to compute an appropriate robot movement. A preliminary version of the approach has been implemented on an anthropomorphic robot arm using a pendulum swing up task as an example

link (url) [BibTex]

link (url) [BibTex]

1992


no image
Ins CAD integrierte Kostenkalkulation (CAD-Integrated Cost Calculation)

Ehrlenspiel, K., Schaal, S.

Konstruktion 44, 12, pages: 407-414, 1992, clmc (article)

[BibTex]

1992

[BibTex]


no image
Integrierte Wissensverarbeitung mit CAD am Beispiel der konstruktionsbegleitenden Kalkulation (Ways to smarter CAD Systems)

Schaal, S.

Hanser 1992. (Konstruktionstechnik München Band 8). Zugl. München: TU Diss., München, 1992, clmc (book)

[BibTex]

[BibTex]


no image
Informationssysteme mit CAD (Information systems within CAD)

Schaal, S.

In CAD/CAM Grundlagen, pages: 199-204, (Editors: Milberg, J.), Springer, Buchreihe CIM-TT. Berlin, 1992, clmc (inbook)

[BibTex]

[BibTex]


no image
What should be learned?

Schaal, S., Atkeson, C. G., Botros, S.

In Proceedings of Seventh Yale Workshop on Adaptive and Learning Systems, pages: 199-204, New Haven, CT, May 20-22, 1992, clmc (inproceedings)

[BibTex]

[BibTex]