Relation-Based Associative Joint Location for Human Pose Estimation in Videos
Temporal Distance Matrices for Squat Classification (Waseda squat dataset)
HiFi-GAN
SpeechSplit 2.0
ViViT: A Video Vision Transformer
ViT