Faria, P. and Galves, C. (2016). Criando “bancos de árvores”: o sistema de anotação e o processamento automático. Cadernos de Estudos Linguísticos, (58.2), Campinas, pp. 299-315, May/Aug.

Abstract. In this paper, we highlight the tight relationship between annotation systems and parsing. We present an experiment that evaluates alternative parses obtained with the current version and a modified version of the verbal tag system used in the Tycho Brahe Corpus. The modified version improved the F1 measure of parsing accuracy by two percentage points, as evaluated by the evalb software. This result shows that an annotation system can be designed to be more concise and more informative to the parser. In conclusion, we suggest two guidelines for the specification of annotation systems and the training of the parser. Finally, we contextualize the discussion with an outline and a brief account of the treebank-building process and its importance for linguistic research.

Keywords: corpus linguistics, annotated corpora, automatic processing