Low-discrepancy sampling for approximate dynamic programming with local approximators (Journal article)

Type
Label
  • Low-discrepancy sampling for approximate dynamic programming with local approximators (Journal article) (literal)
Year
  • 2014-01-01T00:00:00+01:00 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#doi
  • 10.1016/j.cor.2013.09.006 (literal)
Alternative label
  • C. Cervellera, M. Gaggero, D. Maccio (2014)
    Low-discrepancy sampling for approximate dynamic programming with local approximators
    in Computers & Operations Research; PERGAMON-ELSEVIER SCIENCE LTD, THE BOULEVARD, LANGFORD LANE, KIDLINGTON, OXFORD (United Kingdom)
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • C. Cervellera, M. Gaggero, D. Maccio (literal)
First page
  • 108 (literal)
Last page
  • 115 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#altreInformazioni
  • Journal Q1 in Management Science and Operations Research (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 43 (literal)
Journal
Notes
  • Scopus (literal)
  • ISI Web of Science (WOS) (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • Institute of Intelligent Systems for Automation, National Research Council, Via De Marini 6, 16149 Genova, Italy (literal)
Title
  • Low-discrepancy sampling for approximate dynamic programming with local approximators (literal)
Abstract
  • Approximate dynamic programming (ADP) relies, in the continuous-state case, on both a flexible class of models for the approximation of the value functions and a smart sampling of the state space for the numerical solution of the recursive Bellman equations. In this paper, low-discrepancy sequences, commonly employed for number-theoretic methods, are investigated as a sampling scheme in the ADP context when local models, such as the Nadaraya-Watson (NW) ones, are employed for the approximation of the value function. The analysis is carried out both from a theoretical and a practical point of view. In particular, it is shown that the combined use of low-discrepancy sequences and NW models enables the convergence of the ADP procedure. Then, the regular structure of the low-discrepancy sampling is exploited to derive a method for automatic selection of the bandwidth of NW models, which yields a significant saving in the computational effort with respect to the standard cross validation approach. Simulation results concerning an inventory management problem are presented to show the effectiveness of the proposed techniques. (literal)
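    The two ingredients named in the abstract, a low-discrepancy sample of the state space and a Nadaraya-Watson (NW) local model fitted on it, can be illustrated with a minimal sketch. It is not the authors' implementation: the Halton sequence, the Gaussian kernel, the fixed bandwidth, and the function names halton and nadaraya_watson are all assumptions chosen for illustration, since the record does not state which sequence, kernel, or bandwidth rule the paper uses.

```python
import math


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence in [0, 1)^d (assumed sampling scheme)."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]


def nadaraya_watson(query, points, values, bandwidth):
    """NW estimate at `query` with a Gaussian kernel (assumed kernel choice)."""
    weights = [
        math.exp(-sum((q - p) ** 2 for q, p in zip(query, x)) / (2.0 * bandwidth ** 2))
        for x in points
    ]
    total = sum(weights)
    # Kernel-weighted average of the sampled values.
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical usage: sample a stand-in "value function" on 256 Halton points
# and query the local model at the centre of the unit square.
states = halton(256)
costs = [x0 + x1 for x0, x1 in states]
estimate = nadaraya_watson((0.5, 0.5), states, costs, bandwidth=0.1)
```

    The bandwidth is fixed by hand here; the automatic bandwidth selection described in the abstract is not reproduced.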
Publisher
Product of
CNR author
