The University of Pittsburgh English Language Institute Corpus (PELIC) dataset
Updated Mar 31, 2023 - HTML
A corpus of short answers written by learners of English and graded with CEFR levels
Essay Grammar Checker trained on REALEC Corpus using spaCy
Information and code about applying spelling correction to the PELIC dataset
Russian Learner Corpus, a platform for corpus search and annotation
Tool for converting error corpora to parallel datasets
The implementation of the Inspector tool.
Statistics on some error categories from the REALEC corpus.
Code for the thesis "A Corpus-Based Case Analysis on Syntactic Complexity in Russian ESL Learners’ Writing".
Supplementary material for "Correlations between accuracy, complexity, and task type: Learner corpus research"
Coursework on "Clustering of English texts on the basis of automated extraction of key properties"
Dataset of Estonian L2 writings and source code used to train and test machine learning models for CEFR-based classification.
Writing assistant
Text Normalization on Learner Texts (South Tyrolean German as an L2)