Majority of state-of-the-art monocular depth estimation methods are supervised learning approaches . Success of such approaches heavily depends on the high-quality depth labels which are expensive to obtain . Some recent methods try to learn depth networks by leveraging unsupervised cues from monocular videos which are easier to acquire but less reliable . We propose a temporally-consistent domain adaptation (TCDA) approach that simultaneously explores labels in the synthetic domain and temporal constraints in the videos to improve style transfer and depth prediction . Furthermore, we make use of the ground-truth optical flow to learn moving mask and pose prediction networks . The learned moving masks can filter out moving regions that produces erroneous temporal constraints and the estimated poses provide better initializations for estimating temporal constraints. The learned masks can filters out moving areas that produce moving regions and the . estimated poses provided better initial

Author(s) :

Links : PDF - Abstract

Code :


Keywords : depth - moving - constraints - temporal - monocular -

Leave a Reply

Your email address will not be published. Required fields are marked *