Faria, P. (2015). Increased Recall in Annotation Variance Detection in Treebanks. In Text, Speech, and Dialogue (pp. 578-586). Springer International Publishing.

Abstract. Automatic inconsistency detection in parsed corpora is of significant help in building more and larger corpora of annotated texts. Inconsistencies are inevitable and originate from variance in annotation caused by various factors, such as lack of attention or the absence of clear annotation guidelines. This paper presents results on the automatic detection of annotation variance in parsed corpora. In particular, it shows that a generalization procedure substantially increases the recall of the variant detection algorithm proposed in [1].

Keywords: Treebank, Inconsistency detection, Quality control