Towards effective and efficient video object detection