METU-TDB Annotation Tool

Annotation Tool

The Annotation Tool, DATT (Discourse Annotation Tool for Turkish), was developed for this project and uses the tags listed in the table below. DATT follows a stand-off annotation methodology and stores the annotation data as XML files.

METU-TDB Tags

Annotated Discourse Elements	METU-TDB
Explicit connectives	annotated
Connective modifiers	annotated
1st and 2nd argument	annotated
Supporting Material	annotated
Reference elements in supporting material	annotated
Shared elements	annotated
Supporting material for shared elements	annotated

As explained in Aktaş et al. (2010), stand-off annotation was preferred over in-line annotation in order to avoid overlapping tags. In stand-off markup, annotations are stored separately from the data. Since the base file is not modified during annotation, all annotators are guaranteed to work on the same version of the data. The text spans of dependency constructions are represented as character offsets from the beginning of the text file. On its own, this is a highly error-prone way of storing annotation data: if the character indexes in the original text file shift, previously made annotations no longer point to the intended text. To compensate for this, we also keep the text of each annotated span for recovery purposes.
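The recovery idea can be illustrated with a minimal Python sketch. It is not DATT's implementation: the `Span` record, the `resolve` function, and the nearest-match strategy are assumptions made for illustration only. Each annotation carries its character offsets plus a copy of the annotated text; if the offsets no longer match the base text, the stored text is used to look for the span again.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical stand-off record: offsets plus a copy of the text."""
    start: int  # offset of the first character in the base text
    end: int    # offset one past the last character
    text: str   # annotated text, kept only for recovery


def resolve(base: str, span: Span) -> Optional[Span]:
    """Return a span whose offsets are valid for `base`, or None."""
    # Normal case: the offsets still point at the stored text.
    if base[span.start:span.end] == span.text:
        return span

    # Offsets have drifted: collect every occurrence of the stored text.
    hits = []
    pos = base.find(span.text)
    while pos != -1:
        hits.append(pos)
        pos = base.find(span.text, pos + 1)
    if not hits:
        return None  # the text is gone; the annotation cannot be recovered

    # Assumption: the occurrence closest to the old offset is the right one.
    best = min(hits, key=lambda p: abs(p - span.start))
    return Span(best, best + len(span.text), span.text)
```

For example, with the base text "Ali geldi ama Ayşe gitmedi." the connective "ama" is `Span(10, 13, "ama")`. If four characters are inserted at the start of the file, `resolve` relocates it to offsets 14 to 17. When the same text occurs several times, the nearest-match heuristic can pick the wrong occurrence, so recovered spans would still deserve a manual check.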

Annotation files are well-formed XML files, so new features can easily be added to the annotations. Freely available XML resources, such as libraries for search and post-processing, reduce the implementation effort of adding new features.
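As a rough sketch of this extensibility, the Python snippet below adds a new attribute to an annotation using the standard library's `xml.etree.ElementTree`. The XML layout in `SAMPLE` is hypothetical and is not DATT's actual output format; the element names (`annotations`, `relation`, `connective`, `arg1`, `arg2`), the `id`, `start` and `end` attributes, and the `sense` feature are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-off annotation file; NOT the real DATT schema.
SAMPLE = """<annotations>
  <relation id="r1">
    <connective start="10" end="13">ama</connective>
    <arg1 start="0" end="9">Ali geldi</arg1>
    <arg2 start="14" end="26">Ayşe gitmedi</arg2>
  </relation>
</annotations>"""


def add_feature(xml_text: str, relation_id: str, name: str, value: str) -> str:
    """Set attribute `name` on the relation with the given id."""
    root = ET.fromstring(xml_text)
    for rel in root.iter("relation"):
        if rel.get("id") == relation_id:
            rel.set(name, value)  # existing attributes and children are untouched
    return ET.tostring(root, encoding="unicode")


# Add an illustrative "sense" feature to relation r1.
updated = add_feature(SAMPLE, "r1", "sense", "contrast")
```

Because the new feature is only an extra attribute, tools that read the older attributes would continue to work on the updated file.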

An XML output example is shown below:

The Annotation Tool interface is shown below:
