[Pose Estimation] Pose estimation methods built on the waterfall module (UniPose, UniPose+, OmniPose, BAPose)

꾸준희 2022. 4. 20. 02:22

The waterfall module, WASP (Waterfall Atrous Spatial Pooling), is structured as shown in the figure above. It was originally proposed for semantic segmentation: a "Waterfall" Atrous Spatial Pooling design that applies progressive filtering in a cascade architecture while maintaining multiscale fields-of-view (FOV). The pose estimation methods built on this module are listed below.
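To make "progressive filtering in a cascade while keeping multiscale FOV" concrete, here is a minimal pure-Python toy in 1D. It is only a sketch under stated assumptions: the dilation rates (6, 12, 18, 24) are assumed for illustration and are not given in this post, the all-ones kernel stands in for learned weights, and the function names are hypothetical. The real module works on 2D feature maps and has further components that are omitted here. The toy feeds an impulse through a waterfall cascade and through independent parallel branches, then measures how wide each branch's response is.

```python
def dilated_conv1d(x, kernel, rate):
    """Zero-padded, same-length 1D atrous (dilated) convolution."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def waterfall(x, kernel, rates):
    """Cascade: each branch filters the previous branch's output,
    and every intermediate output is kept as its own scale."""
    branches, feat = [], x
    for r in rates:
        feat = dilated_conv1d(feat, kernel, r)
        branches.append(feat)
    return branches


def parallel(x, kernel, rates):
    """Baseline for comparison: every branch filters the input independently."""
    return [dilated_conv1d(x, kernel, r) for r in rates]


def fov(signal):
    """Width of the nonzero span of an impulse response."""
    nz = [i for i, v in enumerate(signal) if v != 0]
    return nz[-1] - nz[0] + 1 if nz else 0


RATES = (6, 12, 18, 24)   # assumed rates, for illustration only
KERNEL = [1.0, 1.0, 1.0]  # stand-in for learned 3-tap weights

impulse = [0.0] * 301
impulse[150] = 1.0

waterfall_fov = [fov(b) for b in waterfall(impulse, KERNEL, RATES)]
parallel_fov = [fov(b) for b in parallel(impulse, KERNEL, RATES)]

print("waterfall FOV per branch:", waterfall_fov)
print("parallel  FOV per branch:", parallel_fov)
```

In this toy, the cascade reuses earlier outputs, so later branches see a wider span than independent branches with the same rates, while the earlier branches still contribute their smaller scales.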

1. UniPose, Unified Human Pose Estimation in Single Images and Videos (CVPR 2020)

UniPose is a pose estimation model that uses the WASP module (with a cascade of atrous convolutions and multi-scale representations). The large FOV of WASP captures more of the contextual information in the frame, which yields more accurate pose estimates; the model achieved state-of-the-art results at the time. See the related post: https://eehoeskrap.tistory.com/631

Paper : https://arxiv.org/abs/2001.08095

Github : https://github.com/bmartacho/UniPose

2. UniPose+, A unified framework for 2D and 3D human pose estimation in images and videos

UniPose+ extends UniPose to 3D pose estimation. The code has not been released yet.

Paper : https://ieeexplore.ieee.org/document/9599531

3. OmniPose, A Multi-Scale Framework for Multi-Person Pose Estimation

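OmniPose is a multi-person pose estimation method based on an improved waterfall module (WASPv2), and it achieves state-of-the-art results on several datasets. The code has not been released yet. As an aside, the prefix "omni" is said to mean "at any time, anywhere, everywhere, anything."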

Paper : https://arxiv.org/abs/2103.10180

4. BAPose, Bottom-Up Pose Estimation with Disentangled Waterfall Representations

BAPose estimates poses using D-WASP, a disentangled waterfall module. The code has not been released yet. The name reportedly comes from "Basso verso l'Alto", the Italian for bottom-up. BAPose is a single-stage, end-to-end trainable network that extends the UniPose, UniPose+, and OmniPose approaches to bottom-up multi-person 2D pose estimation. It is reported to achieve SOTA on two large-scale datasets without post-processing, intermediate supervision, multiple iterations, or anchor poses.

Paper : https://arxiv.org/abs/2112.10716
