Publications

(2024). Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form. arXiv preprint arXiv:2408.16286.

(2024). A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees. arXiv preprint arXiv:2401.17780.

(2023). Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. International Conference on Machine Learning (ICML).

(2023). (Invited Talk, Organized Session) On Problem Settings in Sequential Decision-Making and the Impact of Prior Knowledge About the Problem on Performance Guarantees. Proceedings of the 37th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI).

(2022). KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. arXiv preprint arXiv:2205.14211.

(2021). Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning. Asian Conference on Machine Learning (ACML).

(2021). Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning. arXiv preprint arXiv:2107.05798.

(2021). Cautious Actor-Critic. Asian Conference on Machine Learning (ACML).
