Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

Publication
arXiv preprint arXiv:2502.10138
