Version 6

DefaultProceduralLearningModule6 provides access to production compilation and utility learning. Utility learning uses the IExpectedUtilityEquation, production compilation delegates to the IProductionCompiler, and the module itself is responsible for propagating rewards backward in time. Utility learning is fully functional and can even support the learning of utility noise (UtilityNoiseLearningExtension).

The 'Reward' parameter determines how a production participates in utility learning. A numeric value makes the production start a chain of rewards that propagates backward after it fires. A value of 'default' or NaN marks the production as participating in utility learning but not starting it. A value of 'skip' or -Infinity causes the production to be skipped while the reward continues to propagate to earlier productions. Finally, a value of 'stop' or Infinity halts reward processing at this production, giving no credit to it or to any prior production.
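The four 'Reward' settings can be easier to follow as a small walk over a firing history. The sketch below is illustrative only and is not the module's code: the class RewardChainSketch, its Role enum, and both methods are hypothetical names. It assumes the history holds only the firings since the last reward, that the production starting the chain is itself credited, and that a numeric value found further back is treated as an ordinary participant.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the jACT-R API. */
class RewardChainSketch {

  enum Role { START, PARTICIPATE, SKIP, STOP }

  /** Maps a 'Reward' parameter string to the role described in the text. */
  static Role roleOf(String reward) {
    String v = reward.trim();
    if (v.equalsIgnoreCase("default")) return Role.PARTICIPATE;
    if (v.equalsIgnoreCase("skip")) return Role.SKIP;
    if (v.equalsIgnoreCase("stop")) return Role.STOP;
    // Double.parseDouble accepts "NaN", "Infinity" and "-Infinity".
    double d = Double.parseDouble(v);
    if (Double.isNaN(d)) return Role.PARTICIPATE;
    if (d == Double.NEGATIVE_INFINITY) return Role.SKIP;
    if (d == Double.POSITIVE_INFINITY) return Role.STOP;
    return Role.START;
  }

  /**
   * Takes the 'Reward' values of the fired productions, oldest first, and
   * returns the indices that would receive credit, most recent first.
   */
  static List<Integer> credited(List<String> rewardParams) {
    List<Integer> result = new ArrayList<>();
    int last = rewardParams.size() - 1;
    // Only a numeric reward on the most recent firing starts a chain.
    if (last < 0 || roleOf(rewardParams.get(last)) != Role.START) {
      return result;
    }
    result.add(last); // assumption: the initiating production is credited too
    for (int i = last - 1; i >= 0; i--) {
      Role role = roleOf(rewardParams.get(i));
      if (role == Role.STOP) break;    // no credit here or any earlier
      if (role == Role.SKIP) continue; // no credit here, keep propagating
      result.add(i);
    }
    return result;
  }
}
```

For the history "default", "stop", "default", "skip", "NaN", "5.0" (oldest first), the walk starts at the final production, credits the "NaN" and the later "default" productions, passes over the "skip" production, and ends at "stop", so the oldest production receives nothing.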

Production compilation, while implemented, is based directly on the Lisp version and is not as extensible as desired. The long-term goal is to have it use the ICompilableContext, which describes buffers in terms of the properties that are critical for production compilation.

This implementation can also reward productions selectively, based upon the buffers they act upon. Obviously, we want to reward productions that act on the goal and imaginal buffers, but there exists a class of productions for which rewards are less relevant. Goal-free or reflexive productions are often used for basic model behavior that exists below the intentional level. By limiting rewards to productions that access the buffers listed in the "IncludeBuffers" parameter, you can exclude some productions from rewards without affecting the rest of the reward chain.
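As a rough illustration of this kind of buffer-based filter, the sketch below parses a comma-separated "IncludeBuffers" value and checks whether a production touches any listed buffer. The class IncludeBuffersSketch and its methods are hypothetical and not the module's code. Two assumptions are made here that the description does not confirm: one matching buffer is enough, and buffer names are compared case-insensitively.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the jACT-R API. */
class IncludeBuffersSketch {

  /** Splits a comma-separated parameter value into normalized buffer names. */
  static Set<String> parse(String value) {
    Set<String> names = new LinkedHashSet<>();
    for (String token : value.split(",")) {
      String name = token.trim().toLowerCase(Locale.ROOT);
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /**
   * Returns true if the production touches at least one included buffer.
   * Assumption: a single match is sufficient for reward eligibility.
   */
  static boolean eligible(Collection<String> productionBuffers, Set<String> include) {
    for (String buffer : productionBuffers) {
      if (include.contains(buffer.trim().toLowerCase(Locale.ROOT))) {
        return true;
      }
    }
    return false;
  }
}
```

With the value "goal, imaginal, retrieval", a production that matches the visual and goal buffers would be eligible under these assumptions, while a purely reflexive production that touches only visual and manual buffers would not.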


  • ExpectedUtilityEquation : class name of the IExpectedUtilityEquation implementor (default: DefaultExpectedUtilityEquation)
  • ProductionCompiler : class name of the IProductionCompiler implementor (default: DefaultProductionCompiler6)
  • EnableProductionCompilation : whether production compilation should be used (values: true/false. default: false)
  • OptimizedLearning : number of time references (for the production) to be retained. 0 means retain them all. (default: 0)
  • ParameterLearningRate : discount applied to utility learning (values: numeric > 0, or NaN to turn learning off. default: NaN)
  • IncludeBuffers : a comma-separated list of buffer names that productions must match/manipulate in order to be considered for rewarding. (default: "goal, imaginal, retrieval")
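To show how ParameterLearningRate might enter a utility update, here is a minimal sketch of the difference-learning rule used by ACT-R 6, U(n) = U(n-1) + rate * (R - U(n-1)). Whether DefaultExpectedUtilityEquation computes exactly this is an assumption, and the class UtilityUpdateSketch and its update method are hypothetical names. A NaN rate leaves the utility unchanged, mirroring the "off" setting listed above.

```java
/** Hypothetical helper, not part of the jACT-R API. */
class UtilityUpdateSketch {

  /**
   * Applies one difference-learning step to a production's utility.
   *
   * @param utility the utility before the reward
   * @param reward  the reward received by the production
   * @param rate    ParameterLearningRate; NaN means learning is off
   */
  static double update(double utility, double reward, double rate) {
    if (Double.isNaN(rate)) {
      return utility; // learning disabled: utility is left untouched
    }
    return utility + rate * (reward - utility);
  }
}
```

Under this rule, repeated rewards of the same size move the utility steadily toward that reward, and a larger rate moves it faster.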