March 01, 2021
Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a result, they often possess tens of hyperparameters and architectural choices. For this reason, MBRL typically requires significant human expertise before it can be applied to new problems and domains. To alleviate this problem, we propose to use automatic hyperparameter optimization (HPO). We demonstrate that this problem can be tackled effectively with automated HPO, yielding significantly improved performance compared to human experts. In addition, we show that dynamically tuning several MBRL hyperparameters, i.e., during training itself, further improves performance compared to using static hyperparameters that are kept fixed for the whole training. Finally, our experiments provide valuable insights into the effects of several hyperparameters, such as the planning horizon and the learning rate, and their influence on the stability of training and the resulting rewards.
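To make the HPO setting concrete, the sketch below shows a basic random search over two MBRL hyperparameters named in the abstract (planning horizon and learning rate). This is a minimal illustration, not the method used in the paper: the search space, the `evaluate` stand-in, and the toy objective are all hypothetical placeholders for a full MBRL training run.

```python
import random

# Hypothetical search space over two MBRL hyperparameters mentioned
# in the abstract: the planning horizon and the model learning rate.
SEARCH_SPACE = {
    "plan_horizon": [10, 20, 30, 50],
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
}


def evaluate(config):
    """Stand-in for a full MBRL training run.

    In practice this would fit the dynamics model, run the planner,
    and return the average episode return; here it is a toy objective
    that peaks at a mid-range horizon and a moderate learning rate.
    """
    return (-abs(config["plan_horizon"] - 30)
            - 1000 * abs(config["learning_rate"] - 1e-3))


def random_search(n_trials, seed=0):
    """Basic random-search HPO: sample configurations, keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values)
                  for name, values in SEARCH_SPACE.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best, score = random_search(n_trials=50)
```

Random search is only the simplest baseline; the dynamic tuning the abstract refers to would instead adjust such hyperparameters while a single training run is in progress, rather than fixing one configuration per trial.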
Written by
Baohe Zhang
Raghu Rajan
Luis Pineda
Nathan Lambert
André Biedenkapp
Kurtland Chua
Frank Hutter
Roberto Calandra
Publisher
AISTATS