CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution.

Title: CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution.
Publication Type: Journal Article
Year of Publication: 2021
Authors: Y Tan, H Zheng, Y Zhu, X Yuan, X Lin, D Brady, and L Fang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Start Page: 4291
Pagination: 4291 - 4305
Date Published: 12/2021

The ability of camera arrays to efficiently capture a higher space-bandwidth product than single cameras has led to various multiscale and hybrid systems. These systems play vital roles in computational photography, including light field imaging, 360 VR cameras, gigapixel videography, etc. One of the critical tasks in multiscale hybrid imaging is matching and fusing cross-resolution images from different cameras under perspective parallax. In this paper, we investigate the reference-based super-resolution (RefSR) problem associated with dual-camera or multi-camera systems. RefSR consists of super-resolving a low-resolution (LR) image given an external high-resolution (HR) reference image, where the two images exhibit both a significant resolution gap (8×) and large parallax (∼10% pixel displacement). We present CrossNet++, an end-to-end network containing novel two-stage cross-scale warping modules, an image encoder, and a fusion decoder. Stage I learns to narrow down the parallax distinctively with the strong guidance of landmarks and intensity distribution consensus. Stage II then performs more fine-grained alignment and aggregation in the feature domain to synthesize the final super-resolved image. To further address the large parallax, new hybrid loss functions comprising warping loss, landmark loss, and super-resolution loss are proposed to regularize training and enable better convergence. CrossNet++ significantly outperforms the state of the art on light field datasets as well as real dual-camera data. We further demonstrate the generalization of our framework by transferring it to video super-resolution and video denoising.
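
The abstract does not say how the warping modules resample the reference image. As an illustration only, the sketch below assumes the alignment is expressed as a dense per-pixel displacement field and applies it by bilinear backward warping, a generic operation often used for this purpose. The function name `backward_warp`, the `(dx, dy)` flow layout, and the border clamping are all hypothetical choices, not details taken from the paper.

```python
import math


def backward_warp(ref, flow):
    """Resample `ref` at positions displaced by `flow` (bilinear, border-clamped).

    ref  : list of rows of floats, a single-channel image (hypothetical layout).
    flow : same grid shape, each entry a (dx, dy) displacement in pixels.
    This is a generic warping sketch, not the CrossNet++ module itself.
    """
    h, w = len(ref), len(ref[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx, dy = flow[r][c]
            # Sampling position in the reference image, clamped to the border.
            x = min(max(c + dx, 0.0), w - 1.0)
            y = min(max(r + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(x)), int(math.floor(y))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = x - x0, y - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1 - wx) * ref[y0][x0] + wx * ref[y0][x1]
            bot = (1 - wx) * ref[y1][x0] + wx * ref[y1][x1]
            out[r][c] = (1 - wy) * top + wy * bot
    return out


# Usage: shift a tiny 2x3 image by half a pixel horizontally.
ref = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
half_pixel = [[(0.5, 0.0)] * 3 for _ in range(2)]
warped = backward_warp(ref, half_pixel)
```

In a learned pipeline the displacement field would be predicted by the network and the sampling would run on feature maps as well as pixels; here the field is supplied by hand purely to show the resampling step.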

Short Title: IEEE Transactions on Pattern Analysis and Machine Intelligence