Hand Detection in Egocentric Video and Investigation Towards Fine-grained Cooking Activities Recognition


The analysis of egocentric videos has recently become one of the hot topics in computer vision. In this paper, we focus on recognizing cooking activities in egocentric videos. Recognizing cooking activities automatically and precisely requires solving three problems: how to detect hand regions in egocentric videos, how to represent hand motion, and how to classify the cooking activities. To address these problems, we propose a new method for cooking activity recognition in egocentric videos. The method has three characteristics: 1) hand regions are accurately detected against cluttered backgrounds by exploiting color, texture, and location information; 2) temporal hand features are extracted from sequential frame images with a thinning algorithm; and 3) a fully connected multilayer neural network recognizes cooking activities from the extracted features. Towards fine-grained cooking activity recognition, we evaluated the proposed method on our benchmark, which comprises 12 fine-grained cooking activities grouped into 5 coarse categories. The experimental results confirm that the proposed method recognizes cooking activities with an accuracy of 45.2%.
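As an illustration of the third step only, the following is a minimal sketch of a forward pass through a fully connected multilayer network that maps a hand feature vector to one of the 12 fine-grained activities. It is not the authors' implementation: the abstract gives no layer sizes, activation functions, or feature dimension, so `FEATURE_DIM`, `HIDDEN_DIM`, the ReLU and softmax choices, the random untrained weights, and all function names are assumptions made for illustration.

```python
import math
import random

NUM_CLASSES = 12   # the 12 fine-grained cooking activities named in the abstract
FEATURE_DIM = 64   # hypothetical length of the temporal hand feature vector
HIDDEN_DIM = 32    # hypothetical hidden layer width


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (random, untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine map: one output value per row of the weight matrix."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    """Element-wise rectifier (an assumed activation, not stated in the abstract)."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax that turns scores into class probabilities."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features, layers):
    """Forward pass: ReLU after each hidden layer, softmax after the output layer."""
    h = features
    for layer in layers[:-1]:
        h = relu(dense(h, layer))
    return softmax(dense(h, layers[-1]))


rng = random.Random(0)
layers = [
    init_layer(FEATURE_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, NUM_CLASSES, rng),
]

# Stand-in for a feature vector extracted from a sequence of frames.
features = [rng.random() for _ in range(FEATURE_DIM)]
probs = predict(features, layers)
predicted_class = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

Because the weights here are random, `predicted_class` carries no meaning; the sketch only shows the shape of the computation, from one feature vector to a probability for each of the 12 activity classes.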

Technical Papers (Information and Communication Technology)