Abstract: Video captioning is the complex task of automatically generating descriptive, natural-language narratives for video content. The current landscape of video captioning models ...