2018


A Value-Driven Eldercare Robot: Virtual and Physical Instantiations of a Case-Supported Principle-Based Behavior Paradigm

Anderson, M., Anderson, S., Berenz, V.

Proceedings of the IEEE, pages: 1,15, October 2018 (article)

Abstract
In this paper, a case-supported principle-based behavior paradigm is proposed to help ensure ethical behavior of autonomous machines. We argue that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. Such a consensus is likely to emerge in many areas in which autonomous systems are apt to be deployed and for the actions they are liable to undertake. We believe that this is the case since we are more likely to agree on how machines ought to treat us than on how human beings ought to treat one another. Given such a consensus, particular cases of ethical dilemmas where ethicists agree on the ethically relevant features and the right course of action can be used to help discover principles that balance these features when they are in conflict. Such principles not only help ensure ethical behavior of complex and dynamic systems but also can serve as a basis for justification of this behavior. The requirements, methods, implementation, and evaluation components of the paradigm are detailed as well as its instantiation in both a simulated and real robot functioning in the domain of eldercare.

link (url) DOI [BibTex]


Playful: Reactive Programming for Orchestrating Robotic Behavior

Berenz, V., Schaal, S.

IEEE Robotics & Automation Magazine, 25(3):49-60, September 2018 (article) In press

Abstract
For many service robots, reactivity to changes in their surroundings is a must. However, developing software suitable for dynamic environments is difficult. Existing robotic middleware allows engineers to design behavior graphs by organizing communication between components. But because these graphs are structurally inflexible, they hardly support the development of complex reactive behavior. To address this limitation, we propose Playful, a software platform that applies reactive programming to the specification of robotic behavior.

playful website playful_IEEE_RAM link (url) DOI [BibTex]


ClusterNet: Instance Segmentation in RGB-D Images

Shao, L., Tian, Y., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of individual objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well to real-world data, achieving visually better segmentation results.

link (url) [BibTex]


Real-time Perception meets Reactive Motion Generation

(Best Systems Paper Finalists - Amazon Robotics Best Paper Awards in Manipulation)

Kappler, D., Meier, F., Issac, J., Mainprice, J., Garcia Cifuentes, C., Wüthrich, M., Berenz, V., Schaal, S., Ratliff, N., Bohg, J.

IEEE Robotics and Automation Letters, 3(3):1864-1871, July 2018 (article)

Abstract
We address the challenging problem of robotic grasping and manipulation in the presence of uncertainty. This uncertainty is due to noisy sensing, inaccurate models and hard-to-predict environment dynamics. Our approach emphasizes the importance of continuous, real-time perception and its tight integration with reactive motion generation methods. We present a fully integrated system where real-time object and robot tracking as well as ambient world modeling provides the necessary input to feedback controllers and continuous motion optimizers. Specifically, they provide attractive and repulsive potentials based on which the controllers and motion optimizer can online compute movement policies at different time intervals. We extensively evaluate the proposed system on a real robotic platform in four scenarios that exhibit either challenging workspace geometry or a dynamic environment. We compare the proposed integrated system with a more traditional sense-plan-act approach that is still widely used. In 333 experiments, we show the robustness and accuracy of the proposed system.

arxiv video video link (url) DOI Project Page [BibTex]


Distributed Event-Based State Estimation for Networked Systems: An LMI Approach

Muehlebach, M., Trimpe, S.

IEEE Transactions on Automatic Control, 63(1):269-276, January 2018 (article)

arXiv (extended version) DOI Project Page [BibTex]

Memristor-enhanced humanoid robot control system–Part I: theory behind the novel memcomputing paradigm

Ascoli, A., Baumann, D., Tetzlaff, R., Chua, L. O., Hild, M.

International Journal of Circuit Theory and Applications, 46(1):155-183, 2018 (article)

DOI [BibTex]

Memristor-enhanced humanoid robot control system–Part II: circuit theoretic model and performance analysis

Baumann, D., Ascoli, A., Tetzlaff, R., Chua, L. O., Hild, M.

International Journal of Circuit Theory and Applications, 46(1):184-220, 2018 (article)

DOI [BibTex]

2017


Interactive Perception: Leveraging Action in Perception and Perception in Action

Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., Sukhatme, G.

IEEE Transactions on Robotics, 33, pages: 1273-1291, December 2017 (article)

Abstract
Recent approaches in robotics follow the insight that perception is facilitated by interactivity with the environment. These approaches are subsumed under the term of Interactive Perception (IP). We argue that IP provides the following benefits: (i) any type of forceful interaction with the environment creates a new type of informative sensory signal that would otherwise not be present and (ii) any prior knowledge about the nature of the interaction supports the interpretation of the signal. This is facilitated by knowledge of the regularity in the combined space of sensory information and action parameters. The goal of this survey is to postulate this as a principle and collect evidence in support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of Interactive Perception. We close this survey by discussing the remaining open questions. Thereby, we hope to define a field and inspire future work.

arXiv DOI Project Page [BibTex]


Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Li, W., Bohg, J., Fritz, M.

arXiv, November 2017 (article) Submitted

Abstract
Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. We thereby avoid explicitly modeling physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example navigating in a grid world with different target positions and in a block stacking task with different target structures of the final tower. In contrast to prior work, our policies show better generalization across different goals.

arXiv [BibTex]


Event-based State Estimation: An Emulation-based Approach

Trimpe, S.

IET Control Theory & Applications, 11(11):1684-1693, July 2017 (article)

Abstract
An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

arXiv Supplementary material PDF DOI Project Page [BibTex]
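
The event-triggering idea summarized in the abstract above (transmit a measurement only when it is needed to keep the estimate accurate) can be illustrated with a generic send-on-delta rule. This is a minimal sketch, not the trigger or estimator design from the paper: the class name `SendOnDeltaSensor`, the scalar process model `x[k+1] = a * x[k]`, and the reset-to-measurement update are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SendOnDeltaSensor:
    """Hypothetical sensor agent that transmits only when the remote prediction drifts too far."""
    threshold: float        # assumed accuracy bound (delta)
    a: float = 1.0          # assumed scalar process model: x[k+1] = a * x[k]
    estimate: float = 0.0   # local copy of the remote estimate, kept in sync

    def step(self, measurement: float) -> bool:
        # What the remote estimator would believe if no data were sent.
        predicted = self.a * self.estimate
        if abs(measurement - predicted) > self.threshold:
            # Transmit: in this sketch the estimator simply resets to the measurement.
            self.estimate = measurement
            return True
        # Stay silent: the estimator keeps its own prediction.
        self.estimate = predicted
        return False


def run(measurements, threshold):
    """Return one transmit/skip decision per measurement."""
    sensor = SendOnDeltaSensor(threshold=threshold)
    return [sensor.step(y) for y in measurements]


sent = run([0.0, 0.05, 0.3, 0.32, 0.9], threshold=0.1)
print(sent)
```

With these assumed values only two of the five measurements are transmitted, which is the kind of communication saving the abstract refers to; the guaranteed bounds discussed in the paper require its actual design and analysis.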


Probabilistic Articulated Real-Time Tracking for Robot Manipulation

(Best Paper of RA-L 2017, Finalist of Best Robotic Vision Paper Award of ICRA 2017)

Garcia Cifuentes, C., Issac, J., Wüthrich, M., Schaal, S., Bohg, J.

IEEE Robotics and Automation Letters (RA-L), 2(2):577-584, April 2017 (article)

Abstract
We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods. Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter. We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

arXiv video code and dataset video PDF DOI Project Page [BibTex]


Anticipatory Action Selection for Human-Robot Table Tennis

Wang, Z., Boularias, A., Mülling, K., Schölkopf, B., Peters, J.

Artificial Intelligence, 247, pages: 399-414, 2017, Special Issue on AI and Robotics (article)

Abstract
Anticipation can enhance the capability of a robot in its interaction with humans, where the robot predicts the humans' intention for selecting its own action. We present a novel framework of anticipatory action selection for human-robot interaction, which is capable of handling nonlinear and stochastic human behaviors such as table tennis strokes and allows the robot to choose the optimal action based on prediction of the human partner's intention with uncertainty. The presented framework is generic and can be used in many human-robot interaction scenarios, for example, in navigation and human-robot co-manipulation. In this article, we conduct a case study on human-robot table tennis. Due to the limited amount of time for executing hitting movements, a robot usually needs to initiate its hitting movement before the opponent hits the ball, which requires the robot to be anticipatory based on visual observation of the opponent's movement. Previous work on Intention-Driven Dynamics Models (IDDM) allowed the robot to predict the intended target of the opponent. In this article, we address the problem of action selection and optimal timing for initiating a chosen action by formulating the anticipatory action selection as a Partially Observable Markov Decision Process (POMDP), where the transition and observation are modeled by the IDDM framework. We present two approaches to anticipatory action selection based on the POMDP formulation, i.e., a model-free policy learning method based on Least-Squares Policy Iteration (LSPI) that employs the IDDM for belief updates, and a model-based Monte-Carlo Planning (MCP) method, which benefits from the transition and observation model by the IDDM. Experimental results using real data in a simulated environment show the importance of anticipatory action selection, and that POMDPs are suitable to formulate the anticipatory action selection problem by taking into account the uncertainties in prediction. We also show that existing algorithms for POMDPs, such as LSPI and MCP, can be applied to substantially improve the robot's performance in its interaction with humans.

DOI Project Page [BibTex]

2008


Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for operational space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.

link (url) DOI [BibTex]

Adaptation to a sub-optimal desired trajectory

M. Mistry, E. A. G. L. T. Y. S. S. M. K.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

Operational space control: A theoretical and empirical comparison

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

International Journal of Robotics Research, 27(6):737-757, 2008, clmc (article)

Abstract
Dexterous manipulation with a highly redundant movement system is one of the hallmarks of human motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space control in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics implementations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors.

link (url) [BibTex]

A library for locally weighted projection regression

Klanke, S., Vijayakumar, S., Schaal, S.

Journal of Machine Learning Research, 9, pages: 623-626, 2008, clmc (article)

Abstract
In this paper we introduce an improved implementation of locally weighted projection regression (LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data. As the key features, our code supports multi-threading, is available for multiple platforms, and provides wrappers for several programming languages.

link (url) [BibTex]

Optimization strategies in human reinforcement learning

Hoffmann, H., Theodorou, E., Schaal, S.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

PDF [BibTex]

2005


Composite adaptive control with locally weighted statistical learning

Nakanishi, J., Farrell, J. A., Schaal, S.

Neural Networks, 18(1):71-90, January 2005, clmc (article)

Abstract
This paper introduces a provably stable learning adaptive control framework with statistical learning. The proposed algorithm employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the proposed learning adaptive control algorithm uses both the tracking error and the estimation error to update the parameters. We first discuss statistical learning of nonlinear functions, and motivate our choice of the locally weighted learning framework. Second, we begin with a class of first order SISO systems for theoretical development of our learning adaptive control framework, and present a stability proof including a parameter projection method that is needed to avoid potential singularities during adaptation. Then, we generalize our adaptive controller to higher order SISO systems, and discuss further extension to MIMO problems. Finally, we evaluate our theoretical control framework in numerical simulations to illustrate the effectiveness of the proposed learning adaptive controller for rapid convergence and high accuracy of control.

link (url) [BibTex]

A model of smooth pursuit based on learning of the target dynamics using only retinal signals

Shibata, T., Tabata, H., Schaal, S., Kawato, M.

Neural Networks, 18, pages: 213-225, 2005, clmc (article)

Abstract
While the predictive nature of the primate smooth pursuit system has been evident through several behavioural and neurophysiological experiments, few models have attempted to explain these results comprehensively. The model we propose in this paper is in line with previous models employing optimal control theory; however, we hypothesize two new issues: (1) the medial superior temporal (MST) area in the cerebral cortex implements a recurrent neural network (RNN) in order to predict the current or future target velocity, and (2) a forward model of the target motion is acquired by on-line learning. We use simulation studies to demonstrate how our new model supports these hypotheses.

link (url) [BibTex]

Parametric and Non-Parametric approaches for nonlinear tracking of moving objects

Hidaka, Y., Theodorou, E.

Technical Report-2005-1, 2005, clmc (article)

PDF [BibTex]

2003


Computational approaches to motor learning by imitation

Schaal, S., Ijspeert, A., Billard, A.

Philosophical Transactions of the Royal Society of London: Series B, Biological Sciences, 358(1431):537-547, 2003, clmc (article)

Abstract
Movement imitation requires a complex set of mechanisms that map an observed movement of a teacher onto one's own movement apparatus. Relevant problems include movement recognition, pose estimation, pose tracking, body correspondence, coordinate transformation from external to egocentric space, matching of observed against previously learned movement, resolution of redundant degrees-of-freedom that are unconstrained by the observation, suitable movement representations for imitation, modularization of motor control, etc. All of these topics by themselves are active research problems in computational and neurobiological sciences, such that their combination into a complete imitation system remains a daunting undertaking - indeed, one could argue that we need to understand the complete perception-action loop. As a strategy to untangle the complexity of imitation, this paper will examine imitation purely from a computational point of view, i.e. we will review statistical and mathematical approaches that have been suggested for tackling parts of the imitation problem, and discuss their merits, disadvantages and underlying principles. Given the focus on action recognition of other contributions in this special issue, this paper will primarily emphasize the motor side of imitation, assuming that a perceptual system has already identified important features of a demonstrated movement and created their corresponding spatial information. Based on the formalization of motor control in terms of control policies and their associated performance criteria, useful taxonomies of imitation learning can be generated that clarify different approaches and future research directions.

link (url) [BibTex]

2002


Forward models in visuomotor control

Mehta, B., Schaal, S.

J Neurophysiol, 88(2):942-53, August 2002, clmc (article)

Abstract
In recent years, an increasing number of research projects investigated whether the central nervous system employs internal models in motor control. While inverse models in the control loop can be identified more readily in both motor behavior and the firing of single neurons, providing direct evidence for the existence of forward models is more complicated. In this paper, we will discuss such an identification of forward models in the context of the visuomotor control of an unstable dynamic system, the balancing of a pole on a finger. Pole balancing imposes stringent constraints on the biological controller, as it needs to cope with the large delays of visual information processing while keeping the pole at an unstable equilibrium. We hypothesize various model-based and non-model-based control schemes of how visuomotor control can be accomplished in this task, including Smith Predictors, predictors with Kalman filters, tapped-delay line control, and delay-uncompensated control. Behavioral experiments with human participants allow exclusion of most of the hypothesized control schemes. In the end, our data support the existence of a forward model in the sensory preprocessing loop of control. As an important part of our research, we will provide a discussion of when and how forward models can be identified and also the possible pitfalls in the search for forward models in control.

link (url) [BibTex]

Scalable techniques from nonparametric statistics for real-time robot learning

Schaal, S., Atkeson, C. G., Vijayakumar, S.

Applied Intelligence, 17(1):49-60, 2002, clmc (article)

Abstract
Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on up to 90 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing by a humanoid robot arm, and inverse-dynamics learning for a seven and a 30 degree-of-freedom robot. In all these examples, the application of our statistical neural networks techniques allowed either faster or more accurate acquisition of motor control than classical control engineering.

link (url) [BibTex]
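
The memory-based class of locally weighted learning mentioned in the abstract above can be sketched as a locally weighted linear regression: all training data are stored, and a small weighted linear fit is computed around each query point. This is a minimal one-dimensional illustration under stated assumptions, not one of the paper's algorithms: the function name `lwr_predict`, the Gaussian kernel, and the fixed `bandwidth` are choices made here for the example.

```python
import math


def lwr_predict(xs, ys, query, bandwidth):
    """Locally weighted linear regression at one query point (1-D inputs)."""
    # Gaussian kernel weights centred on the query point.
    w = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    # Weighted means of inputs and outputs.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope of the local linear model.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (query - mx)


# Stored training data: samples of sin(x) on [0, 6.2].
xs = [0.1 * i for i in range(63)]
ys = [math.sin(x) for x in xs]
pred = lwr_predict(xs, ys, query=1.0, bandwidth=0.2)
print(pred, math.sin(1.0))
```

The bandwidth trades bias against variance: a smaller value follows the curvature of the target more closely but relies on fewer effective data points. The incremental variants described in the abstract avoid storing the data at all, which this sketch does not attempt.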

2000


A brachiating robot controller

Nakanishi, J., Fukuda, T., Koditschek, D. E.

IEEE Transactions on Robotics and Automation, 16(2):109-123, 2000, clmc (article)

Abstract
We report on our empirical studies of a new controller for a two-link brachiating robot. Motivated by the pendulum-like motion of an ape's brachiation, we encode this task as the output of a "target dynamical system." Numerical simulations indicate that the resulting controller solves a number of brachiation problems that we term the "ladder," "swing-up," and "rope" problems. Preliminary analysis provides some explanation for this success. The proposed controller is implemented on a physical system in our laboratory. The robot achieves behaviors including "swing locomotion" and "swing up" and is capable of continuous locomotion over several rungs of a ladder. We discuss a number of formal questions whose answers will be required to gain a full understanding of the strengths and weaknesses of this approach.

link (url) [BibTex]

Interaction of rhythmic and discrete pattern generators in single joint movements

Sternad, D., Dean, W. J., Schaal, S.

Human Movement Science, 19(4):627-665, 2000, clmc (article)

Abstract
The study investigates a single-joint movement task that combines a translatory and cyclic component with the objective to investigate the interaction of discrete and rhythmic movement elements. Participants performed an elbow movement in the horizontal plane, oscillating at a prescribed frequency around one target and shifting to a second target upon a trigger signal, without stopping the oscillation. Analyses focused on extracting the mutual influences of the rhythmic and the discrete component of the task. Major findings are: (1) The onset of the discrete movement was confined to a limited phase window in the rhythmic cycle. (2) Its duration was influenced by the period of oscillation. (3) The rhythmic oscillation was "perturbed" by the discrete movement as indicated by phase resetting. On the basis of these results we propose a model for the coordination of discrete and rhythmic actions (K. Matsuoka, Sustained oscillations generated by mutually inhibiting neurons with adaptations, Biological Cybernetics 52 (1985) 367-376; Mechanisms of frequency and pattern control in the neural rhythm generators, Biological Cybernetics 56 (1987) 345-353). For rhythmic movements an oscillatory pattern generator is developed following models of half-center oscillations (D. Bullock, S. Grossberg, The VITE model: a neural command circuit for generating arm and articulated trajectories, in: J.A.S. Kelso, A.J. Mandel, M. F. Shlesinger (Eds.), Dynamic Patterns in Complex Systems. World Scientific. Singapore. 1988. pp. 305-326). For discrete movements a point attractor dynamics is developed close to the VITE model. For each joint degree of freedom both pattern generators co-exist but exert mutual inhibition onto each other. The suggested modeling framework provides a unified account for both discrete and rhythmic movements on the basis of neuronal circuitry. Simulation results demonstrated that the effects observed in human performance can be replicated using the two pattern generators with a mutually inhibiting coupling.

link (url) [BibTex]


Dynamics of a bouncing ball in human performance

Sternad, D., Duarte, M., Katsumata, H., Schaal, S.

Physical Review E, 63(011902):1-8, 2000, clmc (article)

Abstract
On the basis of a modified bouncing-ball model, we investigated whether human movements utilize principles of dynamic stability in their performance of a similar movement task. Stability analyses of the model provided predictions about conditions indicative of a dynamically stable period-one regime. In a series of experiments, human subjects bounced a ball rhythmically on a racket and displayed these conditions supporting that they attuned to and exploited the dynamic stability properties of the task.

link (url) [BibTex]

1999


Is imitation learning the route to humanoid robots?

Schaal, S.

Trends in Cognitive Sciences, 3(6):233-242, 1999, clmc (article)

Abstract
This review will focus on two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It will be postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. This hope is justified because imitation learning channels research efforts towards three important issues: efficient motor learning, the connection between action and perception, and modular motor control in form of movement primitives. In order to make these points, first, a brief review of imitation learning will be given from the view of psychology and neuroscience. In these fields, representations and functional connections between action and perception have been explored that contribute to the understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution provided a first idea of the possible neural basis of imitation. Secondly, computational approaches to imitation learning will be described, initially from the perspective of traditional AI and robotics, and then with a focus on neural network models and statistical learning research. Parallels and differences between biological and computational approaches to imitation will be highlighted. The review will end with an overview of current projects that actually employ imitation learning for humanoid robots.

link (url) [BibTex]



Segmentation of endpoint trajectories does not imply segmented control

Sternad, D., Schaal, S.

Experimental Brain Research, 124(1):118-136, 1999, clmc (article)

Abstract
While it is generally assumed that complex movements consist of a sequence of simpler units, the quest to define these units of action, or movement primitives, still remains an open question. In this context, two hypotheses of movement segmentation of endpoint trajectories in 3D human drawing movements are re-examined: (1) the stroke-based segmentation hypothesis based on the results that the proportionality coefficient of the 2/3 power law changes discontinuously with each new “stroke”, and (2) the segmentation hypothesis inferred from the observation of piecewise planar endpoint trajectories of 3D drawing movements. In two experiments human subjects performed a set of elliptical and figure-8 patterns of different sizes and orientations using their whole arm in 3D. The kinematic characteristics of the endpoint trajectories and the seven joint angles of the arm were analyzed. While the endpoint trajectories produced similar segmentation features as reported in the literature, analyses of the joint angles show no obvious segmentation but rather continuous oscillatory patterns. By approximating the joint angle data of human subjects with sinusoidal trajectories, and by implementing this model on a 7-degree-of-freedom anthropomorphic robot arm, it is shown that such a continuous movement strategy can produce exactly the same features as observed by the above segmentation hypotheses. The origin of this apparent segmentation of endpoint trajectories is traced back to the nonlinear transformations of the forward kinematics of human arms. The presented results demonstrate that principles of discrete movement generation may not be reconciled with those of rhythmic movement as easily as has been previously suggested, while the generalization of nonlinear pattern generators to arm movements can offer an interesting alternative to approach the question of units of action.

link (url) [BibTex]


1997


Locally weighted learning

Atkeson, C. G., Moore, A. W., Schaal, S.

Artificial Intelligence Review, 11(1-5):11-73, 1997, clmc (article)

Abstract
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control. Keywords: locally weighted regression, LOESS, LWR, lazy learning, memory-based learning, least commitment learning, distance functions, smoothing parameters, weighting functions, global tuning, local tuning, interference.

link (url) [BibTex]
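As a rough illustration of the locally weighted linear regression that the survey above focuses on, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian weighting function, a Euclidean distance function, and a single global bandwidth as the smoothing parameter; the function name `lwr_predict` and its parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def lwr_predict(X, y, x_query, bandwidth=0.5):
    """Predict y at x_query with a locally weighted linear fit.

    Assumptions (illustrative, not taken from the paper): Gaussian
    kernel weights, Euclidean distance, one global bandwidth.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D input as a single feature column
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))

    # Squared Euclidean distance from every stored point to the query.
    d2 = np.sum((X - x_query) ** 2, axis=1)

    # Gaussian weighting: nearby points dominate the local fit.
    w = np.exp(-0.5 * d2 / bandwidth ** 2)

    # Local linear model with an intercept column.
    A = np.hstack([X, np.ones((X.shape[0], 1))])

    # Weighted least squares by scaling rows with sqrt(weight).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)

    # Evaluate the local model at the query point.
    return float(np.append(x_query, 1.0) @ beta)


if __name__ == "__main__":
    xs = np.linspace(0.0, 2.0 * np.pi, 200)
    ys = np.sin(xs)
    print(lwr_predict(xs, ys, np.pi / 2, bandwidth=0.3))
```

Because the model is refit for every query from the stored data, nothing is trained in advance, which is the "lazy" or memory-based aspect the abstract refers to. A smaller bandwidth reduces smoothing bias at the cost of higher sensitivity to noise.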



Locally weighted learning for control

Atkeson, C. G., Moore, A. W., Schaal, S.

Artificial Intelligence Review, 11(1-5):75-113, 1997, clmc (article)

Abstract
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control. Keywords: locally weighted regression, LOESS, LWR, lazy learning, memory-based learning, least commitment learning, forward models, inverse models, linear quadratic regulation (LQR), shifting setpoint algorithm, dynamic programming.

link (url) [BibTex]


1996


A Kendama learning robot based on bi-directional theory

Miyamoto, H., Schaal, S., Gandolfo, F., Koike, Y., Osu, R., Nakano, E., Wada, Y., Kawato, M.

Neural Networks, 9(8):1281-1302, 1996, clmc (article)

Abstract
A general theory of movement-pattern perception based on bi-directional theory for sensory-motor integration can be used for motion capture and learning by watching in robotics. We demonstrate our methods using the game of Kendama, executed by the SARCOS Dextrous Slave Arm, which has a very similar kinematic structure to the human arm. Three ingredients have to be integrated for the successful execution of this task. The ingredients are (1) to extract via-points from a human movement trajectory using a forward-inverse relaxation model, (2) to treat via-points as a control variable while reconstructing the desired trajectory from all the via-points, and (3) to modify the via-points for successful execution. In order to test the validity of the via-point representation, we utilized a numerical model of the SARCOS arm, and examined the behavior of the system under several conditions.

link (url) [BibTex]



One-handed juggling: A dynamical approach to a rhythmic movement task

Schaal, S., Sternad, D., Atkeson, C. G.

Journal of Motor Behavior, 28(2):165-183, 1996, clmc (article)

Abstract
The skill of rhythmically juggling a ball on a racket is investigated from the viewpoint of nonlinear dynamics. The difference equations that model the dynamical system are analyzed by means of local and non-local stability analyses. These analyses yield that the task dynamics offer an economical juggling pattern which is stable even for open-loop actuator motion. For this pattern, two types of predictions are extracted: (i) Stable periodic bouncing is sufficiently characterized by a negative acceleration of the racket at the moment of impact with the ball; (ii) A nonlinear scaling relation maps different juggling trajectories onto one topologically equivalent dynamical system. The relevance of these results for the human control of action was evaluated in an experiment where subjects performed a comparable task of juggling a ball on a paddle. Task manipulations involved different juggling heights and gravity conditions of the ball. The predictions were confirmed: (i) For stable rhythmic performance the paddle's acceleration at impact is negative and fluctuations of the impact acceleration follow predictions from global stability analysis; (ii) For each subject, the realizations of juggling for the different experimental conditions are related by the scaling relation. These results allow the conclusion that for the given task, humans reliably exploit the stable solutions inherent to the dynamics of the task and do not overrule these dynamics by other control mechanisms. The dynamical scaling serves as an efficient principle to generate different movement realizations from only a few parameter changes and is discussed as a dynamical formalization of the principle of motor equivalence.

link (url) [BibTex]


1994


Robot juggling: An implementation of memory-based learning

Schaal, S., Atkeson, C. G.

Control Systems Magazine, 14(1):57-71, 1994, clmc (article)

Abstract
This paper explores issues involved in implementing robot learning for a challenging dynamic task, using a case study from robot juggling. We use a memory-based local modeling approach (locally weighted regression) to represent a learned model of the task to be performed. Statistical tests are given to examine the uncertainty of a model, to optimize its prediction quality, and to deal with noisy and corrupted data. We develop an exploration algorithm that explicitly deals with prediction accuracy requirements during exploration. Using all these ingredients in combination with methods from optimal control, our robot achieves fast real-time learning of the task within 40 to 100 trials.

link (url) [BibTex]
