Background Inpainting for Videos with Dynamic Objects and a Free-moving Camera

M. Granados1       K. I. Kim1       J. Tompkin1,2,3       J. Kautz2       C. Theobalt1      

1MPI Informatik      2UCL      3Intel Visual Computing Institute


We propose a method for removing marked dynamic objects from videos captured with a free-moving camera, provided that the objects occlude parts of the scene with a static background. Our approach takes as input a video, a mask marking the object to be removed, and a mask marking the dynamic objects that should remain in the scene. To inpaint a frame, we align other candidate frames in which parts of the missing region are visible. Among these candidates, a single source is chosen to fill each pixel so that the final arrangement is color-consistent, and intensity differences between sources are smoothed using gradient-domain fusion. Our frame alignment assumes that the scene can be approximated by piecewise planar geometry: a set of homographies is estimated for each frame pair, and one homography is selected per pixel such that color discrepancy is minimized and epipolar constraints are maintained. We provide experimental validation on several real-world video sequences, demonstrating that, unlike in previous work, inpainting videos shot with free-moving cameras does not necessarily require estimating absolute camera positions and per-frame, per-pixel depth maps.
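The core alignment step can be illustrated with a minimal sketch: warping a candidate frame into the target view through a single 3x3 homography. This is only the building block; the paper estimates a set of homographies per frame pair and selects one per pixel. The function name, the convention that H maps target coordinates to source coordinates, and the nearest-neighbour sampling are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def warp_by_homography(src, H, out_shape):
    """Warp a source frame into target coordinates via a 3x3 homography H.
    H maps homogeneous target pixel coords (x, y, 1) to source coords.
    Returns the warped image and a mask of pixels that landed inside src.
    (Illustrative sketch; nearest-neighbour sampling for brevity.)"""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates of every target pixel, shape (3, h*w).
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    mapped = H @ pts
    sx = mapped[0] / mapped[2]          # source x coordinates
    sy = mapped[1] / mapped[2]          # source y coordinates
    sxi = np.round(sx).astype(int)
    syi = np.round(sy).astype(int)
    valid = (sxi >= 0) & (sxi < src.shape[1]) & \
            (syi >= 0) & (syi < src.shape[0])
    out = np.zeros((h, w), dtype=src.dtype)
    mask = valid.reshape(h, w)
    out[mask] = src[syi[valid], sxi[valid]]
    return out, mask
```

In a full pipeline, each candidate frame would be warped this way under several hypothesized plane homographies, and the per-pixel source/homography pair minimizing color discrepancy (subject to epipolar consistency) would be kept before gradient-domain blending.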


In Proc. European Conference on Computer Vision (ECCV), 2012: Full text | Supplementary video


The video sequences shown in the paper are available in lossless H.264 format:

S1: [Input] [Mask] [Foreground mask] [Result]

S2: [Input] [Mask] [Foreground mask] [Result]

S3: [Input] [Mask] [Foreground mask] [Result]

S4: [Input] [Mask] [Foreground mask] [Result]

S5: [Input] [Mask] [Foreground mask] [Result]

S6: [Input] [Mask] [Foreground mask] [Result]

S7: [Input] [Mask] [Foreground mask] [Result]