Database

Corpus Paisà

Institute

Institute of Computational Linguistics "Antonio Zampolli" (ILC)

Contact person

Vito Pirrelli
Email: vito.pirrelli@ilc.cnr.it

Description

A large corpus (250 million tokens) of authentic contemporary Italian texts from the web. It is freely available and freely distributable, fully annotated in CoNLL format, and openly accessible and searchable through an advanced, learner-oriented interface. ILC-CNR carried out the linguistic annotation of the texts.

Web address

URL: http://www.corpusitaliano.it/preview/en/index.html

Access mode

On-line

Data typology

Textual corpus

Database type

Corpus