Construction of the Turkish national corpus (TNC)

Date
2012
Authors
Yeşim Aksan
Mustafa Aksan
Ahmet Hasan Koltuksuz
Taner Sezer
Ümit Mersinli
Umut Ufuk Demirhan
Hakan Yilmazer
Özlem Kurtoglu
Gülsüm Atasoy
Seda Öz
Publisher
European Language Resources Association (ELRA)
Abstract
This paper addresses theoretical and practical issues encountered in the construction of the Turkish National Corpus (TNC). The TNC is designed to be a balanced, large-scale (50 million words), general-purpose corpus of contemporary Turkish. It builds on previous practices and efforts in corpus construction: the TNC generally follows the framework of the British National Corpus, with adjustments to the corpus design made where needed. Throughout the process, different types of open-source software are used for specific tasks, and the resulting corpus is a free resource for non-commercial use. This paper presents the TNC's design features, its web-based corpus management system, its carefully planned workflow, and its web-based, user-friendly search interface. © 2017 Elsevier B.V. All rights reserved.
Keywords
Corpus Construction, Corpus Linguistics, Turkish National Corpus, Open Systems, Software Engineering, British National Corpora, Design Features, Management Systems, Practical Issues, Search Interfaces, Turkish, Open Source Software
Source
8th International Conference on Language Resources and Evaluation LREC 2012
