Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: Picture it in your mind: generating high level visual representations from textual descriptions
Publication year: 2018
Format
  • Electronic
  • Print
Author(s): Carrara F.; Esuli A.; Fagni T.; Falchi F.; Moreo Fernandez A.
Author affiliations: CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy
CNR authors and affiliations
  • FABIO CARRARA
  • FABRIZIO FALCHI
  • ANDREA ESULI
  • TIZIANO FAGNI
  • ALEJANDRO DAVID MOREO FERNANDEZ
Language(s)
  • English
Abstract: In this paper we tackle the problem of image search when the query is a short textual description of the image the user is looking for. We choose to implement the actual search process as a similarity search in a visual feature space, by learning to translate a textual query into a visual representation. Searching in the visual feature space has the advantage that any update to the translation model does not require reprocessing the (typically huge) image collection on which the search is performed. We propose various neural network models of increasing complexity that learn to generate, from a short descriptive text, a high level visual representation in a visual feature space such as the pool5 layer of the ResNet-152 or the fc6-fc7 layers of an AlexNet trained on ILSVRC12 and Places databases. The Text2Vis models we explore include (1) a relatively simple regressor network relying on a bag-of-words representation for the textual descriptors, (2) a deep recurrent network that is sensitive to word order, and (3) a wide and deep model that combines a stacked LSTM deep network with a wide regressor network. We compare the models we propose with other search strategies, also including textual search methods that exploit state-of-the-art caption generation models to index the image collection.
Abstract language: English
Other abstract: -
Other abstract language: -
First page: 208
Last page: 229
Total pages: -
Journal: Information retrieval (Boston)
Active since 1998
Publisher: Kluwer Academic Publishers - Boston
Country of publication: United States of America
Language: English
ISSN: 1386-4564
Key title: Information retrieval (Boston)
Title proper: Information retrieval. (Boston)
Abbreviated title: Inf. retr. (Boston)
Alternative titles:
  • Information retrieval (Dordrecht) (Boston)
  • Information retrieval (London) (Boston)
Journal volume: 21
Journal issue: 2-3
DOI: 10.1007/s10791-017-9318-6
Peer reviewed: Yes: International
Publication status: Published version
Indexing (in curated databases)
  • Scopus (Code: 2-s2.0-85031427867)
  • ISI Web of Science (WOS) (Code: 000432210500004)
Keywords: Image Retrieval, Cross-media Retrieval, Text Representation
Link (URL, URI): https://link.springer.com/article/10.1007%2Fs10791-017-9318-6
Parallel title: -
License: -
Embargo expiry: -
Acceptance date: -
Notes/Other information: First Online: 14 October 2017
CNR structures
  • IIT — Istituto di informatica e telematica
  • ISTI — Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR modules/activities/subprojects
  • ICT.P08.010.002 : Digital Libraries
European projects: -
Attachments
Picture it in your mind... (private document)
Document type: application/pdf
Picture it in your mind
Description: post-print
Document type: application/pdf