The present invention is a method and an apparatus for reward-based learning of policies for managing or controlling a system or plant. In one embodiment, a method for reward-based learning includes receiving a set of exemplars, where at least two of the exemplars each comprise a (state, action) pair for a system, and at least one of the exemplars includes an immediate reward responsive to a (state, action) pair. A distance measure between pairs of exemplars is used to compute a non-linear dimensionality reduction (NLDR) mapping of (state, action) pairs into a lower-dimensional representation. The mapping is then applied to the set of exemplars, and reward-based learning is applied to the transformed exemplars to obtain a management policy.
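
The steps named in the abstract (exemplars, a pairwise distance measure, an NLDR mapping, and reward-based learning on the transformed exemplars) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the toy data, the function names, the choice of an Isomap-style embedding as the NLDR step, the nearest-exemplar lookup used to map a query into the embedding, and the use of nearest-neighbour averaging of immediate rewards as a simple one-step stand-in for a full reward-based learner.

```python
import math

def neighbourhood_geodesics(points, eps):
    """Distance measure: Euclidean within radius eps, then shortest paths
    through the neighbourhood graph (Floyd-Warshall). Assumes the graph
    is connected; otherwise some distances stay infinite."""
    n = len(points)
    g = [[math.dist(p, q) for q in points] for p in points]
    g = [[d if d <= eps else math.inf for d in row] for row in g]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g

def embed(geo, dim, iters=300):
    """Classical MDS on geodesic distances (the Isomap recipe), using
    power iteration with deflation for the top `dim` eigenpairs."""
    n = len(geo)
    sq = [[d * d for d in row] for row in geo]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double-centred Gram matrix B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    coords = [[] for _ in range(n)]
    for _component in range(dim):
        v = [float(i + 1) for i in range(n)]  # fixed, deterministic start
        for _step in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                  for i in range(n))
        for i in range(n):
            coords[i].append(math.sqrt(max(lam, 0.0)) * v[i])
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]  # deflate the found component
    return [tuple(c) for c in coords]

# Hypothetical exemplars: the state is two redundant readings (t, t^2) of a
# latent load t, the action is a setting a, and the immediate reward is
# highest when the action matches the load.
ACTIONS = (0.0, 0.5, 1.0)
raw, rewards = [], []
for step in range(11):
    t = step / 10
    for a in ACTIONS:
        raw.append((t, t * t, a))      # 3-D (state, action) vector
        rewards.append(-(a - t) ** 2)  # immediate reward

# NLDR mapping applied to the exemplars: 3-D (state, action) -> 2-D.
embedded = embed(neighbourhood_geodesics(raw, eps=0.55), dim=2)

def predict_reward(z, k=3):
    """Average immediate reward of the k exemplars nearest to z in the
    low-dimensional space."""
    nearest = sorted(range(len(embedded)),
                     key=lambda i: math.dist(z, embedded[i]))[:k]
    return sum(rewards[i] for i in nearest) / k

def policy(state, actions=ACTIONS):
    """Greedy policy: pick the action with the best predicted reward."""
    def score(a):
        query = tuple(state) + (a,)
        # Crude out-of-sample mapping: reuse the nearest exemplar's embedding.
        idx = min(range(len(raw)), key=lambda i: math.dist(query, raw[i]))
        return predict_reward(embedded[idx])
    return max(actions, key=score)

print(policy((0.5, 0.25)))
```

The neighbourhood radius `eps` and the neighbour count `k` are tuning choices for this toy data set. A learner that accounts for delayed rewards, such as a value-function method, could replace `predict_reward` while leaving the distance and embedding steps unchanged.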

 