Examining Autoexposure for Challenging Scenes
"Presents a new dataset of images and video sequences captured using a DSLR camera with a large solution space (i.e., shutter speeds from 1/500 s to 15 s)." [gal30b+] #CV
On the Efficacy of Multi-Scale Data Samplers for Vision Applications
"Shows that variable-batch-size multi-scale data sampling acts as an implicit regularizer which improves performance and model calibration, and makes models more robust to scaling and data distribution shifts." [gal30b+] #CV
MoEController: Instruction-Based Arbitrary Image Manipulation with Mixture-of-Expert Controllers
"Leverages large language models (ChatGPT) and image synthesis models (ControlNet) to generate large numbers of image-text pairs for building global and local image manipulation datasets." [gal30b+] #CV #CL
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask
"We advance the diffusion model with an adaptive mask, which is conditioned on the attention maps and the prompt embeddings, to dynamically adjust the contribution of each text token to the image features." [gal30b+] #CV
CNN Injected Transformer for Image Exposure Correction
"A CNN Injected Transformer (CIT) is proposed to harness the complementary strengths of CNNs and Transformers for image exposure correction by incorporating a channel attention block (CAB) and a half-instance normalization block (HINB) into each window-based Transformer block." [gal30b+] #CV
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models
"Proposes EMMA, an evaluation-mitigation framework that automatically creates fine-grained and diverse visual question answering examples to comprehensively assess the extent of agnosia in multimodal large language models (MLLMs)." [gal30b+] #CV #CL
Mobile v-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
"Proposes a simplified, mobile-friendly MoE design in which entire images, rather than individual patches, are routed to the experts, achieving a better accuracy-efficiency trade-off on vision tasks." [gal30b+] #CV #LG
Unsupervised Object Localization with Representer Point Selection
"Proposes a novel unsupervised object localization method based on representer point selection: using a self-supervised pre-trained model, the model's predictions can be represented as a linear combination of the representer values of training points." [gal30b+] #CV
Grouping Boundary Proposals for Fast Interactive Image Segmentation
"An adaptive cut disconnects the image domain so that the target contours are forced to pass through the cut only once, and the selected boundary proposals and their corresponding minimal paths are used to delineate the target contours." [gal30b+] #CV (see the sketch below)
https://github.com/Mirebeau/HamiltonFastMarching
https://arxiv.org/abs/2309.04169v1 #arxiv
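The released HamiltonFastMarching code implements anisotropic fast marching; as a much simpler stand-in, the sketch below extracts a minimal path between two boundary-proposal points with Dijkstra on an isotropic cost grid, which conveys the idea of contours following low-cost image boundaries (the cost definition and 4-connectivity are assumptions):

    import heapq
    import numpy as np

    def minimal_path(cost, start, end):
        """Dijkstra shortest path on a 2D cost grid (4-connectivity).
        cost: (H, W) positive array, e.g. 1 / (1 + |image gradient|) so that paths
        prefer strong edges; start/end: (row, col) boundary-proposal points."""
        H, W = cost.shape
        dist = np.full((H, W), np.inf)
        prev = {}
        dist[start] = 0.0
        heap = [(0.0, start)]
        while heap:
            d, (r, c) = heapq.heappop(heap)
            if (r, c) == end:
                break
            if d > dist[r, c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < H and 0 <= nc < W:
                    nd = d + cost[nr, nc]
                    if nd < dist[nr, nc]:
                        dist[nr, nc] = nd
                        prev[(nr, nc)] = (r, c)
                        heapq.heappush(heap, (nd, (nr, nc)))
        path, node = [], end
        while node != start:
            path.append(node)
            node = prev[node]
        return [start] + path[::-1]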
Context-Aware Prompt Tuning for Vision-Language Model with Dual-Alignment
"Dual-Aligned Prompt Tuning (DuAl-PT) utilizes both implicit and explicit context modeling to learn more context-aware prompts that benefit from LLMs." [gal30b+] #CV
Comparative Study of Visual SLAM-Based Mobile Robot Localization Using Fiducial Markers
"The three approaches share a similar algorithmic pipeline with a few variations, starting with fiducial marker detection and camera pose estimation using the Perspective-n-Point (PnP) algorithm [Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Fischler]." [gal30b+] #RO #CV
Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match vs. Mismatch Classification
"Proposes a match-vs-mismatch deep learning model to classify whether a video clip induces excitatory responses in recorded EEG signals and to learn associations between the visual content and the corresponding neural recordings." [gal30b+] #CV #CE
Representation Synthesis by Probabilistic Many-Valued Logic Operation in Self-Supervised Learning
"Proposes a new SSL method using mixed images, called logic-based SSL with mixed images (LSLwMI), together with a new representation format based on many-valued logic." [gal30b+] #CV
Robot Localization and Mapping Final Report -- Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry
"We propose to estimate depth and pose in a self-supervised manner, where the task is modeled as an image generation task and pose estimation comes as a by-product." [gal30b+] #CV
Score-PA: Score-Based 3D Part Assembly
"The Score-based 3D Part Assembly framework (Score-PA) formulates 3D part assembly from a novel generative perspective and introduces a Fast Predictor-Corrector Sampler (FPC) to accelerate the sampling process within the Score-PA framework." [gal30b+] #CV (see the sketch below)
https://github.com/J-F-Cheng/Score-PA_Score-based-3D-Part-Assembly
https://arxiv.org/abs/2309.04220v1 #arxiv
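Not the paper's FPC sampler itself, but a generic score-based predictor-corrector loop (reverse-SDE Euler predictor plus Langevin corrector) of the kind such samplers build on; the VE noise schedule, step counts, and SNR value are placeholder assumptions:

    import math
    import torch

    @torch.no_grad()
    def pc_sample(score_fn, shape, n_steps=100, n_corrector=1, snr=0.16,
                  sigma_max=10.0, sigma_min=0.01):
        """Generic predictor-corrector sampling for a score model score_fn(x, sigma)
        approximating grad_x log p_sigma(x) (VE-SDE formulation)."""
        x = torch.randn(shape) * sigma_max
        sigmas = torch.logspace(math.log10(sigma_max), math.log10(sigma_min), n_steps)
        for i in range(n_steps - 1):
            sigma, sigma_next = sigmas[i], sigmas[i + 1]
            # predictor: Euler-Maruyama step of the reverse-time SDE
            x = x + (sigma**2 - sigma_next**2) * score_fn(x, sigma)
            x = x + torch.sqrt(torch.clamp(sigma**2 - sigma_next**2, min=0.0)) * torch.randn_like(x)
            # corrector: Langevin dynamics at the new noise level
            for _ in range(n_corrector):
                score = score_fn(x, sigma_next)
                noise = torch.randn_like(x)
                grad_norm = score.flatten(1).norm(dim=1).mean()
                eps = 2 * (snr * noise.flatten(1).norm(dim=1).mean() / (grad_norm + 1e-12)) ** 2
                x = x + eps * score + torch.sqrt(2 * eps) * noise
        return x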
Depth Completion with Multiple Balanced Bases and Confidence for Dense Monocular SLAM
"Can predict multiple balanced bases and a confidence map from a monocular image with sparse points generated by off-the-shelf keypoint-based SLAM systems." [gal30b+] #CV
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
"A novel method is proposed to utilize the attention mechanism in diffusion models, which allows extracting rich word-pixel correlation from text-to-image diffusion models without re-training or inference-time optimization." [gal30b+] #CV
Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality
"Achieves real-time inference by unrolling an iterative cost aggregation in time (i.e., in the temporal dimension), which allows us to distribute and reuse the aggregated features over time." [gal30b+] #CV
Toward Sufficient Spatial-Frequency Interaction for Gradient-Aware Underwater Image Enhancement
"Proposes SFGNet for underwater image enhancement, which consists of a DSFFNet and a GAC for sufficient spatial-frequency interaction and detail correction." [gal30b+] #CV
Towards Efficient SDRTV-to-HDRTV by Learning From Image Formation
"A novel three-step solution pipeline comprises adaptive global color mapping, local enhancement, and highlight refinement; the adaptive global color mapping step uses global statistics as guidance to perform image-adaptive color mapping." [gal30b+] #CV #MM (see the sketch below)
https://github.com/xiaom233/HDRTVNet-plus
https://arxiv.org/abs/2309.04084v1 #arxiv
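A hedged sketch of the first step only (adaptive global color mapping): image-level statistics drive a small MLP that outputs a per-image 3x3 color matrix and bias, applied uniformly to every pixel. Layer sizes and the choice of statistics are assumptions, and local enhancement and highlight refinement are omitted:

    import torch
    import torch.nn as nn

    class AdaptiveGlobalColorMapping(nn.Module):
        """Predict an image-adaptive 3x3 color transform (+ bias) from global
        statistics of the SDR input and apply it pixel-wise."""
        def __init__(self, hidden=64):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(6, hidden), nn.ReLU(inplace=True),
                                     nn.Linear(hidden, 12))       # 9 matrix + 3 bias terms

        def forward(self, sdr):                                   # sdr: (B, 3, H, W) in [0, 1]
            stats = torch.cat([sdr.mean(dim=(2, 3)), sdr.std(dim=(2, 3))], dim=1)  # (B, 6)
            params = self.mlp(stats)
            mat = params[:, :9].reshape(-1, 3, 3)
            bias = params[:, 9:].reshape(-1, 3, 1, 1)
            return torch.einsum('bij,bjhw->bihw', mat, sdr) + bias  # same mapping for all pixels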