Using Automatic Morphological Tools to Process Data from a Learner Corpus of Hungarian

Péter Durst; Martina Szabó; Veronika Vincze; János Zsibrita

Keywords:

Hungarian language, Natural language processing, Morphological parsing, Automatic error tagging, Learner corpus

Abstract

This article shows how automatic morphological tools originally developed for native speaker data can be applied to a learner corpus of Hungarian. We collected written data from 35 students majoring in Hungarian studies at the University of Zagreb, Croatia. The data were analyzed with magyarlanc, a toolkit comprising a sentence splitter, morphological analyzer, POS tagger and dependency parser, which flagged 667 unknown word forms. We examined the suggestions that the Hungarian spellchecker hunspell offered for these unknown words and manually selected the correct forms. Automatically accepting hunspell's first suggestion would have yielded an accuracy of 82%. We also introduce our automatic error tagger, which relies on an annotation scheme designed around the special characteristics of Hungarian morphology and learner language, and which can reliably locate and label morphological errors.
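
The first-suggestion evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `first_suggestion_accuracy`, the `suggest` callable, and the toy word lists are all hypothetical, and the toy data are invented for illustration rather than taken from the corpus. In practice `suggest` would wrap a real spellchecker; here it is a plain dictionary lookup so that the example runs on its own.

```python
from typing import Callable, Dict, List


def first_suggestion_accuracy(
    gold: Dict[str, str],
    suggest: Callable[[str], List[str]],
) -> float:
    """Share of unknown forms whose first suggestion equals the manually chosen form."""
    if not gold:
        return 0.0
    hits = 0
    for unknown, correct in gold.items():
        suggestions = suggest(unknown)
        # Only the top-ranked suggestion counts, mirroring "accept the first suggestion".
        if suggestions and suggestions[0] == correct:
            hits += 1
    return hits / len(gold)


# Invented toy data (assumption): unknown form -> ranked suggestions.
TOY_SUGGESTIONS: Dict[str, List[str]] = {
    "hazban": ["házban"],
    "konyv": ["könyv"],
    "iskolaban": ["iskolában"],
    "hazam": ["hazám", "házam"],  # the intended form is ranked second
}

# Invented toy data (assumption): unknown form -> manually chosen correction.
TOY_GOLD: Dict[str, str] = {
    "hazban": "házban",
    "konyv": "könyv",
    "iskolaban": "iskolában",
    "hazam": "házam",
}


def toy_suggest(word: str) -> List[str]:
    """Stand-in for a spellchecker's suggestion call (hypothetical)."""
    return TOY_SUGGESTIONS.get(word, [])


accuracy = first_suggestion_accuracy(TOY_GOLD, toy_suggest)
print(f"First-suggestion accuracy: {accuracy:.2f}")
```

On the toy data three of the four first suggestions match the manually chosen form. Passing the suggestion source in as a callable keeps the scoring logic independent of any particular spellchecker binding.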

Published

2014-06-27

Issue

Vol. 8 No. 3 (2014): Special issue on Learner Language, Learner Corpora: From corpus compilation to data analysis

Section

Articles

How to Cite

Durst, P., Szabó, M., Vincze, V., & Zsibrita, J. (2014). Using Automatic Morphological Tools to Process Data from a Learner Corpus of Hungarian. Apples - Journal of Applied Language Studies, 8(3), 39-54. https://apples.journal.fi/article/view/97871
