We propose an activity detection system that uses the Action
Tubelet (ACT) detector to localize activities in video data. Our
network is trained on all activities in the ActEV dataset with a
backbone convolutional neural network pre-trained on the ImageNet
dataset. We inserted a thresholding module into the original ACT
framework to adapt the detector to the ActEV task, since activities in
this task are more sparsely distributed than those in the action
detection task. Our result was 0.882 in mean-p [email protected] at the AD
Leaderboard Evaluation.