A first-person vision dataset of office activities

We present a multi-subject first-person vision dataset of office activities. The dataset includes more subjects and activities than existing office activity datasets. Office activities include person-to-person interactions, such as chatting and handshaking; person-to-object interactions, such as using a computer or a whiteboard; and generic activities such as walking. The videos in the dataset present a number of challenges that, in addition to intra-class differences and inter-class similarities, include frames with illumination changes, motion blur, and lack of texture. Moreover, we present and discuss state-of-the-art features extracted from the dataset and baseline activity recognition results with a number of existing methods. The dataset is provided along with its annotation and the extracted features.
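As a rough illustration of how per-clip feature vectors like those shipped with the dataset could feed a baseline activity recognizer, the sketch below fits a simple nearest-centroid classifier. The file layout, label names ("chat", "walk"), and synthetic features here are hypothetical placeholders, not the dataset's actual format or the paper's baseline methods.

```python
import numpy as np

def fit_centroids(features, labels):
    """Compute one mean feature vector (centroid) per activity label."""
    return {lab: features[labels == lab].mean(axis=0) for lab in np.unique(labels)}

def predict(centroids, x):
    """Assign x to the label of the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda lab: np.linalg.norm(x - centroids[lab]))

# Synthetic stand-in for per-clip feature vectors; the real extracted
# features are provided with the dataset in their own format.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(1, 0.1, (20, 8))])
labs = np.array(["chat"] * 20 + ["walk"] * 20)

model = fit_centroids(feats, labs)
print(predict(model, feats[0]))
```

Any of the stronger baselines discussed in the paper would replace this toy classifier, but the overall pipeline (features in, per-clip activity label out) is the same.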

G. Abebe, A. Catala, A. Cavallaro, “A first-person vision dataset of office activities”, Proc. of Int. Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction, Beijing, China, August 20, 2018

A few sample videos [data]


Continuous version of the video sequences (12 subjects) [1/4] [2/4] [3/4] [4/4]

The annotation [data]

Extracted features used in the paper [data]

Downloads for VIP Cup

Segmented video sequences (12 subjects) [1/4] [2/4] [3/4] [4/4]


New videos collected for VIP Cup (4 subjects) [1/2] [2/2]