http://www.cnr.it/ontology/cnr/individuo/prodotto/ID156632
A Scalable Algorithm for Metric High-Quality Clustering in Information Retrieval Tasks (Rapporti tecnici, manuali, carte geologiche e tematiche e prodotti multimediali)
- Type
- Label
- A Scalable Algorithm for Metric High-Quality Clustering in Information Retrieval Tasks (Rapporti tecnici, manuali, carte geologiche e tematiche e prodotti multimediali) (literal)
- Anno
- 2005-01-01T00:00:00+01:00 (literal)
- Alternative label
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
- Geraci F., Pellegrini M., Pisati P. (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#note
- Technical Report IIT TR-08/2005 (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#descrizioneSinteticaDelProdotto
- We consider the problem of finding efficiently a high quality k-clustering of n points in a (possibly discrete) metric space. Many methods are known when the point are vectors in a real vector space, and the distance function is a standard geometric distance such as L1, L2 (Euclidean) or L2 2 (squared Euclidean distance). In such cases efficiency is often sought via sophisticated multidimensional search structures for speeding up nearest neighbor queries (e.g. variants of kd-trees). Such techniques usually work well in spaces of moderately high dimension say up to 6 or 8). Our target is a scenario in which either the metric space cannot be mapped into a vector space, or, if this mapping is possible, the dimension of such a space is so high as to rule out the use of the above mentioned techniques. This setting is rather typical in Information Retrieval applications. We augment the well known furthest-point-first algorithm for kcenter clustering in metric spaces with a filtering step based on the triangular inequality and we compare this algorithm with some recent fast variants of the classical k-means iterative algorithm augmented with an analogous filtering schemes. We extensively tested the two solutions on synthetic geometric data and real data from Information Retrieval applications. (literal)
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#supporto
- Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
- Titolo
- A Scalable Algorithm for Metric High-Quality Clustering in Information Retrieval Tasks (literal)
- Prodotto di
- Autore CNR
- Insieme di parole chiave
Incoming links:
- Prodotto
- Autore CNR di
- Insieme di parole chiave di