This AI Paper from China Introduces Multimodal ArXiv Dataset: Consisting of ArXivCap and ArXivQA for Enhancing Large Vision-Language Models Scientific Comprehension

This AI Paper from China Introduces Multimodal ArXiv Dataset: Consisting of ArXivCap and ArXivQA for Enhancing Large Vision-Language Models Scientific Comprehension