Estimating scale-invariant future in continuous time
Natural learners must compute an estimate of the future outcomes that follow from a stimulus in continuous time. Critically, the learner cannot in general know a priori the relevant time scale over which meaningful relationships will be observed. Widely used reinforcement learning algorithms discretize continuous time and use the Bellman equation to estimate exponentially-discounted future reward. However, exponential discounting introduces a time scale into the computation of value. This dependence on a scale is a serious problem in continuous time: efficient learning with scaled algorithms requires prior knowledge of the relevant scale; that is, one must already know part of the solution to a problem before attempting to solve it. We present a computational mechanism, developed based on work in psychology and neuroscience, for computing a scale-invariant timeline of future events. This mechanism efficiently computes a model of future time on a logarithmically-compressed scale and can be used to generate a scale-invariant, power-law-discounted estimate of expected future reward. Moreover, the representation of future time retains information about what will happen when, enabling flexible decision making based on future events. The entire timeline can be constructed in a single parallel operation.
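As a minimal numerical sketch of the scale problem described above (not the paper's mechanism), the snippet below contrasts exponential discounting, which commits to a single time constant, with a power-law discount assembled from exponentials spanning many time scales: rescaling all delays by a common factor leaves power-law preference ratios unchanged but distorts exponential ones. The specific parameter values (tau, the rate range, the test delays) are illustrative assumptions.

```python
import numpy as np

# Delays spanning four orders of magnitude.
t = np.logspace(-1, 3, 400)

# Exponential discount with a single, fixed time constant (assumed value).
tau = 10.0
exp_disc = np.exp(-t / tau)

# Mixing exponentials uniformly over decay rates yields a power law, since
# integral_0^inf exp(-s*t) ds = 1/t.  Approximate that integral on a
# logarithmically spaced grid of rates s (quadrature weight ds = s * dlog s).
s = np.logspace(-4, 2, 600)
dlogs = np.diff(np.log(s)).mean()
pow_disc = (np.exp(-np.outer(s, t)) * (s * dlogs)[:, None]).sum(axis=0)

def pref(disc, t1, t2):
    """Relative preference between rewards delayed by t1 versus t2."""
    i1 = np.argmin(np.abs(t - t1))
    i2 = np.argmin(np.abs(t - t2))
    return disc[i1] / disc[i2]

# Scale-invariance check: stretching both delays by a common factor c leaves
# the power-law preference ratio unchanged, but not the exponential one.
c = 5.0
for name, disc in [("exponential", exp_disc), ("power-law", pow_disc)]:
    print(f"{name:11s}  d(2)/d(20) = {pref(disc, 2, 20):8.3f}   "
          f"d(2c)/d(20c) = {pref(disc, 2 * c, 20 * c):8.3f}")
```

Running this prints a ratio near 10 for the power-law discount at both scales, while the exponential ratio changes by orders of magnitude when the delays are stretched, illustrating why a fixed discounting time constant presupposes knowledge of the relevant scale.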