Special Sessions
Abstract
Video coding is a cornerstone of modern visual communication, enabling applications such as video streaming, conferencing, broadcasting, cloud gaming, immersive media, and intelligent visual services. While international standards such as H.265/HEVC, AV1, AV2, and H.266/VVC have continuously improved compression efficiency, emerging visual applications are driving video coding beyond traditional rate–distortion optimization. Future coding systems must simultaneously address compression performance, perceptual quality, task utility, deployment feasibility, model generalization, and standardization requirements. In response, next-generation video coding is evolving through multiple complementary directions, including advanced coding tools extending existing standards, end-to-end neural video coding, machine- and task-oriented compression, generative visual communication, and immersive media coding. These developments are fostering the convergence of conventional codec evolution, learned compression, application-aware representations, and emerging standardization efforts. In line with these trends, we invite contributions on advanced video coding tools, neural video coding, perceptual and task-oriented compression, machine vision and vision-language-model-based communication, generative video communication, immersive media coding, codec deployment, and future video coding standardization.
Organizers
- Zhuoyuan Li (The Hong Kong Polytechnic University, HK)
- Bolin Chen (DAMO Academy, Alibaba Group)
- Yichi Zhang (Purdue University, US)
- Li Li (University of Science and Technology of China, CN)
- Giuseppe Valenzise (Université Paris-Saclay, FR)
- Yan Ye (DAMO Academy, Alibaba Group)
Abstract
Recent advances in visual sensing, computational imaging, neural representations, and multimodal learning are transforming the way visual data are acquired, processed, communicated, and understood. Modern visual systems increasingly rely on high-dimensional visual data that extend beyond conventional RGB images and videos to include event streams, light fields, hyperspectral and polarization imagery, LiDAR, time-of-flight sensing, neural scene representations, 3D Gaussian splats, and hybrid multimodal sensing. These data capture rich spatial, temporal, geometric, spectral, and cross-modal information, enabling more robust visual processing under challenging conditions such as fast motion, low light, occlusion, missing modalities, and distribution shift. At the same time, the growing complexity and volume of high-dimensional visual data create new challenges in acquisition, restoration, compression, representation, quality assessment, perception, and reasoning. Emerging solutions increasingly integrate imaging, communication, perception, and multimodal intelligence to support reliable visual understanding and decision-making. In line with these developments, we invite contributions on computational imaging and novel sensing systems, event-based and multimodal vision, high-dimensional visual restoration and enhancement, learned compression, implicit and neural representations, quality assessment, cross-modal fusion and alignment, robust visual perception, vision-language reasoning, trustworthy AI, and efficient visual communication for next-generation visual systems.
Organizers
- Haowen Bai (Nanyang Technological University, SG)
- Rui Zhao (Nanyang Technological University, SG)
- Zeyu Xiao (National University of Singapore, SG)
- Taewoo Kim (INSAIT, BG)
- Hadi Amirpour (University of Klagenfurt, AT)
- Tae Hyun Kim (Hanyang University, KR)