Vision-based language-to-action mapping