Abstract: In standard reinforcement learning, since the uncertainty of task objectives is not adequately considered in the policy training, the policy achieves poor generalization for the ...