Cross-media intelligence combines multimedia computing with artificial intelligence. It covers theoretical, methodological, and technical research on understanding and generating multimedia content, including text, images, video, audio, documents, and 3D data. The main objective is to draw on the cross-media characteristics of the human brain, bridging the perception and cognition of different sensory information such as vision, language, and hearing, so that multimedia information can be processed intelligently.
The main research areas include multimedia compression and processing, multimedia analysis, cross-media retrieval, cross-media generation, cross-media transmission, cross-media knowledge graphs, document intelligence, and text computing. Cross-media intelligence technologies are widely applied in news publishing, new media, the Internet, and a range of enterprises and institutions. Applied technologies include image and video generation (AIGC), fine-grained image classification, detection and recognition of specific content, large-scale cross-media content retrieval, document information recognition and analysis, and multimodal information fusion.