
Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries

  • Edgar Margffoy-Tuay*
  • Juan C. Pérez
  • Emilio Botero
  • Pablo Arbeláez
  • *Corresponding author for this work
  • Universidad de los Andes, Colombia

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

42 Citations (Scopus)

Abstract

We address the problem of segmenting an object given a natural language expression that describes it. Current techniques tackle this task by either (i) directly or recursively merging linguistic and visual information in the channel dimension and then performing convolutions; or by (ii) mapping the expression to a space in which it can be thought of as a filter, whose response is directly related to the presence of the object at a given spatial coordinate in the image, so that a convolution can be applied to look for the object. We propose a novel method that integrates these two insights in order to fully exploit the recursive nature of language. Additionally, during the upsampling process, we take advantage of the intermediate information generated when downsampling the image, so that detailed segmentations can be obtained. We compare our method against state-of-the-art approaches on four standard datasets, where it surpasses all previous methods in six of the eight splits for this task.
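
As a rough illustration of the two ideas contrasted in the abstract, the toy sketch below shows (i) concatenating an expression embedding with visual features along the channel dimension and (ii) using a language-derived vector as a 1x1 convolution filter over the image features. It is not the authors' implementation: the function names, the tiny feature map, and the embedding values are all hypothetical, and in a real system the filter would presumably come from a learned mapping of the expression rather than being reused directly.

```python
from typing import List

# A feature map indexed as [row][col][channel]; plain lists keep the sketch dependency-free.
FeatureMap = List[List[List[float]]]


def concat_language(visual: FeatureMap, text: List[float]) -> FeatureMap:
    """Idea (i): tile the expression embedding over every spatial location
    and concatenate it with the visual features along the channel axis."""
    return [[pixel + text for pixel in row] for row in visual]


def dynamic_filter_response(visual: FeatureMap, kernel: List[float]) -> List[List[float]]:
    """Idea (ii): treat a language-derived vector as a 1x1 convolution kernel;
    the response at each location is its dot product with the local features."""
    return [[sum(k * v for k, v in zip(kernel, pixel)) for pixel in row]
            for row in visual]


# Toy 2x2 feature map with 2 channels (hypothetical values).
visual = [[[1.0, 0.0], [0.0, 1.0]],
          [[0.5, 0.5], [0.0, 0.0]]]

# Hypothetical expression embedding, reused here as the filter for simplicity.
text_embedding = [1.0, -1.0]

merged = concat_language(visual, text_embedding)            # 4 channels per location
response = dynamic_filter_response(visual, text_embedding)  # one score per location
```

In this toy setting, a high value in `response` marks a location whose features align with the expression-derived filter, while `merged` is the kind of multimodal tensor on which further convolutions could then operate.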

Original language: English
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss, Martial Hebert
Publisher: Springer Verlag
Pages: 656-672
Number of pages: 17
ISBN (print): 9783030012519
DOI
Status: Published - 2018
Externally published
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 8 Sep 2018 to 14 Sep 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11215 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 15th European Conference on Computer Vision, ECCV 2018
Country/Territory: Germany
City: Munich
Period: 8/09/18 to 14/09/18
