Seeing Time: Benchmarking Chronological Reasoning and Shortcut Biases in Vision-Language Models. A benchmark for image-based temporal reasoning in vision-language models. Sole first author; under review. Paper Code Project