Reinforcement learning for sequential decision making with constraints