Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: Learning to weight for text classification
Publication year: 2018
Format: Electronic
Author(s): Moreo Fernández A.D.; Esuli A.; Sebastiani F.
Author affiliations: CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy
CNR authors and affiliations
  • ANDREA ESULI
  • ALEJANDRO DAVID MOREO FERNANDEZ
  • FABRIZIO SEBASTIANI
Language(s)
  • English
Abstract: In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification) it seems logical to design a term weighting function that leverages the distribution (as estimated from training data) of the term across the classes of interest. Although "supervised term weighting" approaches that use this intuition have been described before, they have failed to show consistent improvements. In this article we analyse the possible reasons for this failure, and call consolidated assumptions into question. Following this criticism, we propose a novel supervised term weighting approach that, instead of relying on any predefined formula, learns a term weighting function optimised on the training set of interest; we dub this approach Learning to Weight (LTW). The experiments that we have run on several well-known benchmarks, and using different learning methods, show that our method outperforms previous term weighting approaches in text classification.
Abstract language: English
Other abstract: -
Other abstract language: -
First page: 302
Last page: 316
Total pages: 15
Journal: IEEE transactions on knowledge and data engineering (Online)
Active since 1989
Publisher: Institute of Electrical and Electronics Engineers, New York, NY
Country of publication: United States of America
Language: English
ISSN: 1558-2191
Key title: IEEE transactions on knowledge and data engineering (Online)
Title proper: IEEE transactions on knowledge and data engineering (Online)
Abbreviated title: IEEE trans. knowl. data eng. (Online)
Alternative titles:
  • Institute of Electrical and Electronics Engineers transactions on knowledge and data engineering (Online)
  • Transactions on knowledge and data engineering (Online)
  • Knowledge and data engineering (Online)
Journal volume: 32
Journal issue: 2
DOI: 10.1109/TKDE.2018.2883446
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in curated databases)
  • Scopus (Code: 2-s2.0-85057792872)
  • ISI Web of Science (WOS) (Code: 000507883700008)
Keywords: Term weighting, Supervised term weighting, Text classification, Neural networks, Deep learning
Link (URL, URI): https://ieeexplore.ieee.org/document/8550687
Parallel title: -
License: -
Embargo expiry: -
Acceptance date: 28/11/2018
Notes/Other information: Online First - 28 November 2018 - The volume and page data refer to the print version, published in 2020
CNR structures
  • ISTI — Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR modules/activities/subprojects
  • ICT.P08.010.002 : Digital Libraries
European projects: -
Attachments
Preprint author's version
Description: Author's version of Learning To Weight for Text Classification
Document type: application/pdf
Learning to weight for text classification (private document)
Description: Published version
Document type: application/pdf