[Deep Learning] MediaPipe


꾸준희

2020. 6. 3. 00:57


MediaPipe GitHub: https://github.com/google/mediapipe


MediaPipe Documentation: https://mediapipe.readthedocs.io/en/latest/


Google has released MediaPipe, which provides solutions and applications for running machine learning on mobile, edge, and cloud platforms.

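MediaPipe is a graph-based framework for building multimodal machine learning pipelines. It is cross-platform, running on mobile devices, workstations, and servers, and it supports mobile GPU acceleration. According to the project, MediaPipe lets you build an applied machine learning pipeline as a graph of modular components such as inference models and media processing functions. Data such as audio and video streams goes into the graph, and information such as object localization and face landmarks comes out.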
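To make the "graph of modular components" idea concrete, here is a minimal toy sketch in plain Python. It is not the MediaPipe API: the `Node` and `Graph` classes, the stream names, and the stub functions (`scale_down`, `fake_detector`, `to_pixels`) are all hypothetical and exist only to illustrate how named streams connect processing steps.

```python
from typing import Callable, Dict, List


class Node:
    """One processing step: reads named input streams, writes one output stream."""

    def __init__(self, name: str, inputs: List[str], output: str, fn: Callable):
        self.name = name
        self.inputs = inputs
        self.output = output
        self.fn = fn


class Graph:
    """Toy dataflow graph: runs each node once its input streams are available."""

    def __init__(self, nodes: List[Node]):
        self.nodes = nodes

    def run(self, packets: Dict[str, object]) -> Dict[str, object]:
        streams = dict(packets)
        pending = list(self.nodes)
        while pending:
            ready = [n for n in pending if all(i in streams for i in n.inputs)]
            if not ready:
                raise ValueError("some nodes have inputs that are never produced")
            for node in ready:
                streams[node.output] = node.fn(*[streams[i] for i in node.inputs])
                pending.remove(node)
        return streams


# Hypothetical stand-ins for media processing and model inference.
def scale_down(frame):
    return {"width": frame["width"] // 2, "height": frame["height"] // 2}


def fake_detector(frame):
    # Pretend model output: one box as normalized (x, y, width, height).
    return [(0.25, 0.25, 0.5, 0.5)]


def to_pixels(frame, boxes):
    w, h = frame["width"], frame["height"]
    return [(int(x * w), int(y * h), int(bw * w), int(bh * h)) for x, y, bw, bh in boxes]


# Nodes are declared out of order on purpose; the stream names define the wiring.
graph = Graph([
    Node("to_pixels", ["input_video", "boxes"], "pixel_boxes", to_pixels),
    Node("scale", ["input_video"], "small_video", scale_down),
    Node("detect", ["small_video"], "boxes", fake_detector),
])

result = graph.run({"input_video": {"width": 640, "height": 480}})
print(result["pixel_boxes"])
```

The point of the design is that each component only knows the names of the streams it reads and writes, so a detector or a preprocessing step can be swapped without touching the rest of the pipeline.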

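MediaPipe seems like a great fit for people who want to get a project done quickly. Many people already seem to be using it that way, and it feels like the era in which anyone can conveniently develop with AI has already arrived. I look forward to seeing where MediaPipe goes next.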

The solutions released so far are listed below.

Face Detection (web demo)
Face Mesh
Hand Detection
Hand Tracking (web demo)
Multi-hand Tracking
Hair Segmentation (web demo)
Object Detection
Object Detection and Tracking
Objectron: 3D Object Detection and Tracking
AutoFlip: Intelligent Video Reframing
KNIFT: Template Matching with Neural Image Features

Among these, here is an example of a graph that performs real-time hand tracking on a mobile GPU.

I tried running the Hand Tracking (web demo) published on GitHub and got the result below...

After always seeing speeds of 2 to 3 ms in GPU environments, seeing 2.48 fps makes my chest feel tight, but it is a web environment, so I suppose it can't be helped, haha.

MediaPipe's hand estimation & tracking web demo (I was definitely holding my palm open and facing the camera...)
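The two speed figures quoted above use different units (milliseconds versus frames per second), so a quick conversion helps compare them. This is a small sketch that assumes the 2 to 3 ms figure is a per-frame processing time; the function names are my own.

```python
def fps_to_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def ms_to_fps(ms: float) -> float:
    """Frame rate that corresponds to a given per-frame time in milliseconds."""
    return 1000.0 / ms


web_demo_ms = fps_to_ms(2.48)                       # per-frame time of the web demo
gpu_fps_range = (ms_to_fps(3.0), ms_to_fps(2.0))    # frame rates for 3 ms and 2 ms

print(f"web demo: {web_demo_ms:.1f} ms per frame")
print(f"GPU: {gpu_fps_range[0]:.1f} to {gpu_fps_range[1]:.1f} fps")
```

Under that assumption, the web demo spends roughly 400 ms per frame, which is on the order of a hundred times slower than the GPU numbers.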

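Normally, as shown below, it estimates 21 3D keypoints per frame and tracks them accurately. The fact that it targets mobile environments seems to be its most distinctive feature. It is a top-down approach, and it reportedly uses a single-shot detector called BlazePalm to locate the hand.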
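The top-down flow described above (detect the hand first, then estimate keypoints inside the detected region) can be sketched roughly as follows. This is an illustration under stated assumptions, not MediaPipe's implementation: `detect_palms` and `regress_landmarks` are hypothetical stubs that return fixed values in place of real models, and scaling depth by the box width is my own simplification.

```python
from typing import List, Tuple

NUM_HAND_LANDMARKS = 21  # keypoints per hand, as stated in the post

Box = Tuple[float, float, float, float]   # (x, y, width, height) in pixels
Point3D = Tuple[float, float, float]


def detect_palms(frame_size: Tuple[int, int]) -> List[Box]:
    """Stub for the palm detector stage: returns one fixed box."""
    w, h = frame_size
    return [(w * 0.25, h * 0.25, w * 0.5, h * 0.5)]


def regress_landmarks(box: Box) -> List[Point3D]:
    """Stub for the landmark stage: 21 points in crop-relative [0, 1] coordinates."""
    return [(i / 20, i / 20, 0.0) for i in range(NUM_HAND_LANDMARKS)]


def crop_to_image(points: List[Point3D], box: Box) -> List[Point3D]:
    """Map crop-relative points back into full-image pixel coordinates."""
    x0, y0, bw, bh = box
    # Assumption: depth is scaled by the box width, like x.
    return [(x0 + px * bw, y0 + py * bh, pz * bw) for px, py, pz in points]


def track_hands(frame_size: Tuple[int, int]) -> List[List[Point3D]]:
    """Top-down pipeline: detect each hand, then estimate its keypoints."""
    hands = []
    for box in detect_palms(frame_size):
        local_points = regress_landmarks(box)
        hands.append(crop_to_image(local_points, box))
    return hands


hands = track_hands((640, 480))
print(len(hands), "hand(s),", len(hands[0]), "keypoints each")
```

Splitting the work this way lets the keypoint stage operate on a tight crop of the hand rather than the whole frame.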

Reference 1:

https://my-pick.tistory.com/entry/%EA%B5%AC%EA%B8%80%EC%9D%B4-%EB%A7%8C%EB%93%9C%EB%8A%94-AI-%EC%88%98%ED%99%94-%EC%9D%B8%EC%8B%9D-%ED%86%B5%EC%97%AD-%EC%8B%9C%EC%8A%A4%ED%85%9C?category=760043?category=760043

Google's AI Sign Language Recognition and Interpretation System


Reference 2: https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html

On-Device, Real-Time Hand Tracking with MediaPipe



