Source: http://liris.cnrs.fr/voir/wiki/doku.php?id=datasets
Computer vision datasets
(maintained by the computer vision and ML group of the LIRIS laboratory, INSA-Lyon)
Other dataset lists and surveys
Semantic Full Scene Labelling
- 2017 Mapillary Vistas Dataset 25,000 high-resolution images.
- 2017 Stanford 2D-3D-Semantics Dataset: 100k indoor images with segmentation and 3D labels (surface normals etc.; S. Savarese); Paper
- 2016 ADE20K
- 2016 Cityscapes dataset (5000 densely annotated frames plus 20000 coarsely annotated); a loading sketch follows this list
- CamVid Dataset
- Stanford Dataset
- KITTI Dataset
- SiftFlow dataset
- NYU Depth v2 dataset
- Places dataset: 2.5 million images with 205 labeled scene categories.
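Several of these datasets come with ready-made loaders. Below is a minimal sketch for Cityscapes using torchvision's built-in `Cityscapes` class; the root path is illustrative, and the archives must be downloaded manually after registering on the Cityscapes site.

```python
# Minimal sketch: reading Cityscapes semantic segmentation pairs with
# torchvision's built-in loader. Cityscapes cannot be auto-downloaded;
# "./cityscapes" is an illustrative path to a manually downloaded copy.
from torchvision.datasets import Cityscapes

dataset = Cityscapes(
    root="./cityscapes",
    split="train",           # "train", "val", or "test"
    mode="fine",             # the densely annotated frames
    target_type="semantic",  # per-pixel class-ID mask
)

image, mask = dataset[0]     # both are PIL Images when no transform is set
print(image.size, mask.size)
```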
Synthetically created datasets
- 2017 Physically-Based Rendering for Indoor Scene Understanding 300,000 images.
Gesture recognition
- Review on datasets: http://www.sciencedirect.com/science/article/pii/S1077314214001568#
- 2015 INRIA LARSEN Dataset 12 Kinect video sequences of people in a cluttered environment, including MoCap ground truth
- 2014, 2013, 2012 ChaLearn Gesture Dataset
Full body pose estimation
- 2016 COCO 2016 Keypoint Challenge 90k RGB images (a keypoint-loading sketch follows this list)
- 2015 ChaLearn Looking at People 2015 - Track 1: Human Pose Recovery 8000 RGB images
- 2014 H3.6M Full body pose, 32 joints, multiple RGB cameras + a Swiss Ranger TOF camera
- 2014 MultiHumanPose Shelf & Campus Datasets, Multi-camera RGB images. Calibration available. 3200 images per camera, but ground truth is available for only 300 frames for Shelf and 270 frames for Campus.
- 2014 Poses in the Wild 30 video sequences (~30 RGB frames each)
- 2014 MPII Human Pose Dataset ~21K RGB images, also annotated with action classes
- 2014 PARSE dataset 300 RGB images, D. Ramanan. Additional images
- 2013 FLIC (Frames Labeled in Cinema) dataset: 3987 + 1016 RGB images
- 2013 KTH Multiview Football Dataset: 800 time frames captured from 3 views (ground truth joint annotation & calibration data available) + 5907 images (ground truth joint annotation available)
- 2011 Utrecht Multi-Person Motion (UMPM) Multi-cameras (RGB), multiple activities. Camera calibration & ground truth available.
- 2010 "We are family" stickmen - multiple people per image 525 images
- 2010 LSP - Leeds Sports Pose Dataset 2000 RGB images
- 2008 Buffy pose dataset RGB images
- 2006 HumanEva dataset Multi-camera RGB / Grayscale images
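For COCO keypoints, the official pycocotools API is the usual entry point. A minimal sketch follows, assuming a locally downloaded annotation file (the val2017 filename is illustrative); each person annotation stores 17 joints as a flat (x, y, visibility) list.

```python
# Minimal sketch: reading COCO person keypoints with the official
# pycocotools API. The annotation path is illustrative; files come
# from cocodataset.org.
from pycocotools.coco import COCO

coco = COCO("annotations/person_keypoints_val2017.json")
person_id = coco.getCatIds(catNms=["person"])[0]
img_id = coco.getImgIds(catIds=[person_id])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=[img_id], iscrowd=False))

for ann in anns:
    # ann["keypoints"] is a flat list [x1, y1, v1, x2, y2, v2, ...] over
    # 17 joints; v = 0 (not labeled), 1 (labeled, occluded), 2 (visible).
    visible = sum(1 for v in ann["keypoints"][2::3] if v > 0)
    print(f"annotation {ann['id']}: {visible} labeled joints")
```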
Hand pose estimation
- 2017 Big Hand Dataset 2.2M images
- 2015 FingerPaint dataset
Action recognition
Surveys and dataset lists
- Survey paper with descriptions and tables: J.M. Chaquet, E.J. Carmona, A. Fernández-Caballero, A Survey of Video Datasets for Human Action and Activity Recognition, Computer Vision and Image Understanding, 2013. PDF
- Survey paper with descriptions and tables: Orit Kliper-Gross, Tal Hassner, and Lior Wolf, The Action Similarity Labeling Challenge, PAMI 2012. PDF
- This page references many datasets (including HARL): https://www.cs.utexas.edu/~chaoyeh/web_action_data/dataset_list.html
Datasets
- 2017 Procedural Human Action Videos De Souza, Gaidon et al., CVPR 2017.
- 2017 20BN-something-something Dataset 256,591 labeled videos
- 2017 20BN-jester Dataset 148,092 videos
- 2017 Edinburgh Ceilidh Overhead Video Data 16 dances with 2 dance patterns. Overhead video. Tracked people.
- 2017 AVA 64k videos, 60 classes, localized with bounding boxes
- 2017 PKU-MMD multi-modal human action understanding 51 classes, 66 subjects, 1076 videos, ~20 actions per video.
- 2015 A2D 7 actor classes × 8 actions, ≥ 99 videos per class
- 2015 ActivityNet 203 classes, 137 videos per class, collected from the web
- 2014 Sports-1M (Google) 1 million YouTube videos, 487 classes
- 2013 Hollywood3D
- 2013 YouCook Dataset
- 2012 UCF101 dataset (a loading sketch follows at the end of this section)
- 2011 The LIRIS dataset
- 2011 RGBD-HuDaAct Dataset
- 2011 VIRAT Video Dataset surveillance, 12 types of events
- 2011 HMDB51: Large Video Database for Human Motion, 51 action categories, 6849 clips.
- 2010 Olympic sports dataset 16 classes of sports.
- 2010 TV Human Interactions dataset For video retrieval (handshakes, high fives, hugs and kisses)
- 2010 UT Interaction dataset ground truth: time + bounding boxes (shaking hands, pointing, hugging, pushing, kicking and punching)
- 2010 UT Tower dataset ground truth: bounding boxes, foreground masks (pointing, standing, digging, walking, carrying, running, waving 1, waving 2, and jumping)
- 2009 The MSR dataset (hand clapping, hand waving, boxing)
- 2009 TUM Kitchen Dataset
- 2008 UIUC Action dataset Ground truth: foreground masks (walking, running, jumping, waving, jumping jacks, clapping, jumping from sit-up, raise one hand, stretching out, turning, sitting to standing, crawling, pushing up and standing to sitting)
- 2007 Drinking and Smoking ("Coffee and Cigarettes") Including bounding boxes on key frames!
- 2007 The CASIA Action database outdoor, (walking, running, bending, jumping, crouching, fainting, wandering and punching a car)
- 2005 The VISOR dataset
- 2005 The ETISEO dataset (walking, running, sitting, lying, crouching, holding, pushing, jumping, pick up, puts down, fighting, queueing, tailgating, meeting and exchanging an object)
- 2004 The KTH dataset, Multi-KTH Dataset (motion sequence with 6 persons, each performing a different KTH action; includes camera motion, zoom and a structured background with multiple planes)
- 2004 The Behave dataset (Abnormal crime behavior)
- 2002 The CAVIAR dataset (shopping mall; bounding boxes, activity labels)
- 2001 The Weizmann dataset
- Various UCF datasets: UCF50, UCF Sports, UCF Aerial Actions, UCF YouTube, UCF Crowd Segmentation (eye gaze annotations available for the UCF Sports dataset).
- Different and various MSR Action Recognition Datasets: MSRGesture3D, MSRDailyActivity3D, MSRAction3D
- Multiview datasets:
- 2010 VideoWeb Dataset focuses on interactions (people meeting, people following, vehicles turning, people dispersing, shaking hands, gesturing, waving, hugging, and pointing)
- 2009 i3DPost Multi-view Dataset Ground truth includes 3D mesh models (walking, running, jumping, bending, hand-waving, jumping in place, sitting-stand up, running-falling, walking-sitting, running-jumping-walking, handshaking, pulling, and facial-expressions)
- 2009 PETS 2009
- 2007 PETS 2007
- 2006 INRIA IXMAS dataset 5 cameras, daily living (nothing, checking watch, crossing arms, scratching head, sitting down, getting up, turning around, walking, waving, punching, kicking, pointing, picking up, throwing (over head), and throwing (from bottom up))
- 2006 HumanEva dataset Multi-camera RGB / Grayscale images
- Egocentric datasets:
- Accelerometer actions:
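For the clip-based datasets above, torchvision also ships video loaders (e.g. for UCF101 and HMDB51). A minimal sketch for UCF101 follows; the paths are illustrative, the official train/test split files are assumed to be on disk, and video decoding requires the PyAV backend.

```python
# Minimal sketch: iterating UCF101 clips with torchvision's video dataset.
# "./UCF-101" and "./ucfTrainTestlist" are illustrative paths to the videos
# and the official split files; the first run scans all videos to index clips.
from torchvision.datasets import UCF101

dataset = UCF101(
    root="./UCF-101",
    annotation_path="./ucfTrainTestlist",
    frames_per_clip=16,      # clip length in frames
    step_between_clips=16,   # non-overlapping clips
    train=True,              # use the training split
)

video, audio, label = dataset[0]  # video: (T, H, W, C) uint8 tensor
print(video.shape, label)
```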
Body part segmentation
Object recognition and segmentation
Pedestrian detection
- 2014 PETA Dataset 19,000 images
Motion capture