Database

CItA (Corpus Italiano di Apprendenti L1)


Istituto di linguistica computazionale "Antonio Zampolli" (ILC)


Felice Dell'Orletta


CItA (Corpus Italiano di Apprendenti L1) is the first freely available, digitized corpus of essays written by Italian L1 learners. It was collected in 7 lower secondary schools in different areas of Rome: 3 in the historical center and 4 in the suburbs. The current version of the corpus contains 1,353 essays (369,456 tokens in total), manually annotated for errors and their corrections; the corpus is continually updated. It is accompanied by a 34-question questionnaire on the students' biographical, socio-cultural, and sociolinguistic background. The resource was jointly compiled by the ItaliaNLP Lab and the experimental pedagogists of the Department of Psychology of Developmental Processes and Socialization at the Sapienza University of Rome.

Web address


Access method

Freely downloadable from the Internet

Data type

Textual corpus

Database type