FedSearch - Federated network search engine

tl;dr: RAFT for frame-sequence + log-frame-sequence. Also propagate uncertainty and occlusion. Generate several hypotheses, select the least uncertain.
https://arxiv.org/abs/2305.12998
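
The "generate several hypotheses, select the least uncertain" step can be sketched roughly as below. This is only an illustration: the function names and the data layout (a list of `(flow, uncertainty)` pairs per pixel) are assumptions made for the example, not the paper's code, and occlusion handling is left out.

```python
from typing import List, Tuple

Flow = Tuple[float, float]          # (dx, dy) displacement of one pixel
Hypothesis = Tuple[Flow, float]     # (flow, predicted uncertainty)


def select_least_uncertain(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Return the hypothesis with the smallest predicted uncertainty."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    return min(hypotheses, key=lambda h: h[1])


def select_per_pixel(candidates: List[List[Hypothesis]]) -> List[Hypothesis]:
    """candidates[k][p] is the k-th hypothesis for pixel p (hypothetical layout)."""
    num_pixels = len(candidates[0])
    return [
        select_least_uncertain([cand[p] for cand in candidates])
        for p in range(num_pixels)
    ]


# Two candidate flow fields for a two-pixel "image".
candidates = [
    [((1.0, 0.0), 0.5), ((2.0, 2.0), 0.1)],
    [((1.5, 0.0), 0.2), ((0.0, 0.0), 0.9)],
]
best = select_per_pixel(candidates)
```

Each pixel keeps the flow from whichever candidate reports the lowest uncertainty, so different pixels may end up using different candidates.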

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1004 followers · 421 posts · Server sigmoid.social

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar

tl;dr: monodepth + SfM to init non-zero voxel grid, then densify and refine -> ScanNet scene <30 min

https://arxiv.org/abs/2305.13220

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1004 followers · 420 posts · Server sigmoid.social

NeRFuser: Large-Scale Scene Representation by NeRF Fusion

Jiading Fang, Shengjie Lin, Igor Vasiljevic, Vitor Guizilini, Rares Ambrus, Adrien Gaidon, Gregory Shakhnarovich, Matthew R. Walter

tl;dr: render->SuperGlue registration->weighted blend
https://arxiv.org/abs/2305.13307.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1004 followers · 419 posts · Server sigmoid.social

Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching

Yang Liu, Muzhi Zhu, Hengtao Li, Hao Chen, Xinlong Wang, Chunhua Shen

tl;dr: segmented reference image of the same class -> use semantic correspondences to segment target image.
https://arxiv.org/abs/2305.13310.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1004 followers · 418 posts · Server sigmoid.social

DAC: Detector-Agnostic Spatial Covariances for Deep Local Features

Javier Tirado-Garín, Frederik Warburg, Javier Civera

tl;dr: uncertainty in keypoint position from second moment matrix on scores instead of image (Baumberg-Affine).
https://arxiv.org/abs/2305.12250
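
A rough sketch of the idea: build the second moment matrix (structure tensor) from gradients of the detector's score map around a keypoint, then invert it to get a 2x2 positional covariance. The function name, the window radius and the use of central differences are assumptions for illustration; the paper's exact formulation may differ.

```python
import math


def keypoint_covariance(scores, row, col, radius=1):
    """2x2 covariance [[xx, xy], [xy, yy]] as the inverse structure tensor.

    x runs along columns, y along rows. The keypoint must lie at least
    radius + 1 pixels away from the border of the score map.
    """
    sxx = sxy = syy = 0.0
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            # Central differences of the score map, not of the image.
            gx = (scores[r][c + 1] - scores[r][c - 1]) / 2.0
            gy = (scores[r + 1][c] - scores[r - 1][c]) / 2.0
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    det = sxx * syy - sxy * sxy
    if det <= 1e-12:
        raise ValueError("structure tensor is (near-)singular in this window")
    return [[syy / det, -sxy / det], [-sxy / det, sxx / det]]


# Toy score map: a peak that is sharp along x and flat along y.
scores = [
    [math.exp(-((c - 3) ** 2) / 2.0 - ((r - 3) ** 2) / 18.0) for c in range(7)]
    for r in range(7)
]
cov = keypoint_covariance(scores, row=3, col=3)
```

On this toy map the peak is well localized along x and poorly along y, so the x variance comes out much smaller than the y variance.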

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1003 followers · 416 posts · Server sigmoid.social

Mimetic Initialization of Self-Attention Layers

Asher Trockman, J. Zico Kolter

tl;dr: Initialize ViT so that the attention maps have a diagonal structure, similar to what was observed in trained ones.
https://arxiv.org/abs/2305.09828
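
To see why an initialization can give diagonal attention maps, here is a toy sketch. It assumes the product of the query and key weights is fixed to a scaled identity (`alpha * I`) and that tokens are unit-normalized; both are simplifications chosen for the example, not necessarily the paper's exact recipe.

```python
import math
import random


def unit_tokens(n, d, seed=0):
    """n random token vectors of dimension d, each scaled to unit norm."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        tokens.append([x / norm for x in v])
    return tokens


def attention_map(tokens, alpha):
    """Softmax attention with W_q W_k^T set to alpha * I (illustrative choice)."""
    d = len(tokens[0])
    rows = []
    for q in tokens:
        logits = [
            alpha * sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in tokens
        ]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows


tokens = unit_tokens(n=6, d=16)
attn = attention_map(tokens, alpha=20.0)
```

With unit-norm tokens each token's similarity to itself is the largest in its row, so every row of `attn` peaks on the diagonal; a larger `alpha` makes the peak sharper.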

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

1001 followers · 406 posts · Server sigmoid.social

SFD2: Semantic-guided Feature Detection and Description

Fei Xue, Ignas Budvytis, Roberto Cipolla

tl;dr: a more principled approach to using segmentation for image matching than class-name filtering.

https://arxiv.org/abs/2304.14845.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL
#CVPR2023

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

998 followers · 404 posts · Server sigmoid.social

Explicit Correspondence Matching for Generalizable Neural Radiance Fields

Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai

tl;dr: correspondence helps density modeling, but frozen GMFlow is not enough - you have to finetune.

https://arxiv.org/abs/2304.12294.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

998 followers · 403 posts · Server sigmoid.social

The Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior

Yilin Liu, Jiang Li, Yunkui Pang, Dong Nie, Pew-Thian Yap

tl;dr: it's in the title. Bilinear upsampling FTW, transposed convolution fits noise.

https://arxiv.org/abs/2304.11409

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

998 followers · 402 posts · Server sigmoid.social

No Free Lunch in Self Supervised Representation Learning

Ihab Bendidi, Adrien Bardes, Ethan Cohen, Alexis Lamiable, Guillaume Bollot, Auguste Genovesio

tl;dr: augmentations define the properties of the SSL embedding

https://arxiv.org/abs/2304.11718.pdf
P.S. I have been calling SSL "augmentation-supervised" for years

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

998 followers · 401 posts · Server sigmoid.social

3rd Place Solution to Meta AI Video Similarity Challenge

Shuhei Yokoo, Peifei Zhu, Junki Ishikawa, Rintaro Hasegawa

tl;dr: individual frame editing prediction (SSL) + Temporal Network + video meta-data filtering

https://arxiv.org/abs/2304.11964.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

993 followers · 398 posts · Server sigmoid.social

Are Local Features All You Need for Cross-Domain Visual Place Recognition?

Giovanni Barbarani et al.

tl;dr: SG, CVNet and DELG are good for retrieval reranking for VPR, new datasets are far from solved, night imagery is hard not only because of darkness.

https://arxiv.org/abs/2304.05887.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

993 followers · 397 posts · Server sigmoid.social

SiLK -- Simple Learned Keypoints

Pierre Gleize, Weiyao Wang, Matt Feiszli

tl;dr: DISK simplified - trained on random homography with other augmentations -> near SOTA on #IMC2022, better than DISK or SG cosim
Also, padding is detrimental to performance.

https://arxiv.org/abs/2304.06194.pdf

#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

990 followers · 391 posts · Server sigmoid.social

ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation

Xiaoming Zhao, Xingming Wu, Weihai Chen, Peter C. Y. Chen, Qingsong Xu, Zhengguo Li

tl;dr: Journal version of ALIKE with many architecture ablations. As good as DISK, but much faster.

https://arxiv.org/abs/2304.03608.pdf
#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago

Dmytro Mishkin 🇺🇦 · @ducha_aiki

988 followers · 385 posts · Server sigmoid.social

OrienterNet: Visual Localization in 2D Public Maps with Neural Matching

Paul-Edouard Sarlin, Daniel DeTone, Tsun-Yi Yang, Armen Avetisyan, Julian Straub, Tomasz Malisiewicz, Samuel Rota Bulo, Richard Newcombe, Peter Kontschieder, Vasileios Balntas

tl;dr: Photo -> neural bird's view -> matching vs encoded 2D map -> profit!

https://arxiv.org/abs/2304.02009
#computervision #deeplearning
#dmytrotweetsaboutDL

Last updated 2 years ago