PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov, Anurag Arnab, Krzysztof Marcin Choromanski et al.
https://openreview.net/forum?id=zKnqZeUCLO
#videos #polyvit #modality