论文笔记之：Optical Flow Estimation using a Spatial Pyramid Network-阿里云开发者社区

论文笔记之：Optical Flow Estimation using a Spatial Pyramid Network

2017-06-21 2532

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介： 　　Optical Flow Estimation using a Spatial Pyramid Network spynet 　　本文将经典的 spatial-pyramid formulation 和 deep learning 的方法相结合，以一种 coarse to fine approach，进行光流的计算。

　　Optical Flow Estimation using a Spatial Pyramid Network

spynet

　　本文将经典的 spatial-pyramid formulation 和 deep learning 的方法相结合，以一种 coarse to fine approach，进行光流的计算。This estiamates large motions in a coarse to fine approach by warping one image of a pair at each pyramid level by the current flow estimate and compute an update to the flow.

　　我们利用 CNN 来进行每一层 flow 的更新，而不是传统方法中目标函数的最小化。与 FlowNet 相比，本文的方法不需要处理 large motions；这些已经在 pyramid 中处理了。该方法的主要优势有：

　　1. our Spatial Pyramid Network is much simpler and 96% smaller than FlowNet in terms of model parameters.

　　2. since the flow at each pyramid level is small (< 1 pixel), a convolutional approach applied to pairs of warped images is appropriate.

　　3. unlike FlowNet, the learned convolution filters appear similar to classical spatio-temporal filters, giving insight into the method and how to improve it.

　　现有方法存在的 主要问题：

　　将两张图直接 stack大一起，放到 CNN 当中。当两帧图像之间的 motion 大于 one or a few pixels， spatial-temporal convolutional filters 将不会收到有效的相应。也就是说，if a convolutional window in one image does not overlap with related image pixels at the next time instant, no meaningful temporal filter can be learned.

　　这里需要解决两个关键性的问题：1. 长期依赖的问题；　　2. detailed, sub-pixel, optical flow and precise motion boundaries。FlowNet 是尝试在一个网络中解决这两个问题，而该方法则是用 CNN 来解决第二个问题，用现有的方法来解决第一个问题。

　　Approach：

　　本文用 spatial pyramid 的方式，from coarse to fine 的方法来解决 large motion的问题。

　　其流程图如下所示：