Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: Optimizing non-decomposable measures with deep networks
Year of publication: 2018
Format
  • Electronic
  • Print
Author(s): Sanyal A.; Kumar P.; Kar P.; Chawla S.; Sebastiani F.
Author affiliations: University of Oxford, Oxford, United Kingdom; Alan Turing Institute, London, United Kingdom; Indian Institute of Technology Kanpur, Kanpur, India; Qatar Computing Research Institute, Doha, Qatar; CNR-ISTI, Pisa, Italy
CNR authors and affiliations
  • FABRIZIO SEBASTIANI
Language(s)
  • English
Abstract: We present a class of algorithms capable of directly training deep neural networks with respect to popular families of task-specific performance measures for binary classification, such as the F-measure, QMean, and the Kullback-Leibler divergence, which are structured and non-decomposable. Our goal is to address tasks such as label-imbalanced learning and quantification. Our techniques depart from standard deep learning practice, which typically trains neural networks with squared or cross-entropy loss functions (both decomposable). We demonstrate that directly training with task-specific loss functions yields faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations offer several advantages, including (i) the use of fewer training samples to reach a desired level of convergence, (ii) a substantial reduction in training time, (iii) seamless integration of our implementation into existing symbolic gradient frameworks, and (iv) assured convergence to first-order stationary points. Notably, the algorithms achieve this, especially point (iv), despite optimizing complex objective functions. We implement our techniques on a variety of deep architectures, including multi-layer perceptrons and recurrent neural networks, and show that on a range of benchmark and real-world datasets our algorithms outperform traditional approaches to training deep networks, as well as popular techniques used to handle label imbalance.
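To make the abstract's central idea concrete: one standard way to train a network directly on a non-decomposable measure such as the F-measure is to replace the hard true/false-positive counts with probability-weighted ("soft") counts, yielding a differentiable surrogate loss. The sketch below illustrates this general idea only; it is not the paper's specific algorithms, and the name soft_f1_loss, the smoothing constant, and the toy training loop are assumptions made for illustration.

```python
import torch

def soft_f1_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Differentiable surrogate for 1 - F1 over a batch of binary labels.

    Hard TP/FP/FN counts are replaced by probability-weighted ("soft")
    counts, so the expression is smooth in the logits and can be
    minimized with ordinary gradient descent.
    """
    probs = torch.sigmoid(logits)                 # P(y = 1) per example
    tp = (probs * labels).sum()                   # soft true positives
    fp = (probs * (1.0 - labels)).sum()           # soft false positives
    fn = ((1.0 - probs) * labels).sum()           # soft false negatives
    f1 = 2.0 * tp / (2.0 * tp + fp + fn + 1e-8)   # epsilon avoids 0/0
    return 1.0 - f1                               # flip sign to minimize

# Toy usage on a linear model with imbalanced labels (hypothetical data):
x = torch.randn(64, 10)                           # 64 examples, 10 features
y = (torch.rand(64) < 0.1).float()                # roughly 10% positives
w = torch.zeros(10, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    loss = soft_f1_loss(x @ w, y)                 # train on the measure itself
    loss.backward()
    opt.step()
```

Because the soft counts appear inside a ratio, the gradient couples all examples in the batch; this coupling is exactly what distinguishes such objectives from a decomposable loss like cross-entropy.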
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: 1597
Pages to: 1620
Total pages: 24
Journal: Machine learning
Active since: 1986
Publisher: Kluwer Academic Publishers, Boston, U.S.A.
Country of publication: United States of America
Language: English
ISSN: 0885-6125
Key title: Machine learning
Proper title: Machine learning.
Abbreviated title: Mach. learn.
Journal volume: 107
Journal issue: 8-10
DOI: 10.1007/s10994-018-5736-y
Peer-reviewed: Yes, international
Publication status: Published version
Indexed in (controlled databases)
  • ISI Web of Science (WOS) (code: 000440438700017)
  • Scopus (code: 2-s2.0-85049596317)
Keywords: Optimization, Deep learning, F-measure, Task-specific training
Link (URL, URI): https://link.springer.com/article/10.1007/s10994-018-5736-y
Parallel title: -
License: -
Embargo expiry: -
Acceptance date: -
Notes/Other information: -
CNR institutes
  • ISTI — Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR modules/activities/subprojects
  • ICT.P08.010.002: Digital Libraries
European projects: -
Attachments
Optimizing non-decomposable measures with deep networks
Description: Pre-print version
Document type: application/pdf
Optimizing non-decomposable measures with deep networks (private document)
Description: Published version
Document type: application/pdf