If you use this treebank in any form of publication, please make sure you cite the following papers:
Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences (DOI: 10.3906/elk-1706-81): 1–23. Accepted for publication 9 January 2018.
Umut Sulubacak, Tuğba Pamay and Gülşen Eryiğit. IMST: A Revisited Turkish Dependency Treebank. In Proceedings of the 1st International Conference on Turkic Computational Linguistics (TurCLing) at CICLing, Konya, Turkey, 2016.
Kemal Oflazer, Bilge Say, Dilek Zeynep Hakkani-Tür, Gökhan Tür. Building a Turkish Treebank. In Building and Exploiting Syntactically-Annotated Corpora. Anne Abeille (ed.), Kluwer Academic Publishers, 2003.
Nart B. Atalay, Kemal Oflazer, Bilge Say. The Annotation Process in the Turkish Treebank. In Proceedings of the EACL Workshop on Linguistically Interpreted Corpora (LINC), Budapest, Hungary, 2003.
The IMST is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
If you use this treebank in any form of publication, please make sure you cite the following papers:
Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences (DOI: 10.3906/elk-1706-81): 1–23. Accepted for publication 9 January 2018.
Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
The ITU Web Treebank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
If you use this treebank in any form of publication, please make sure you cite the following papers:
Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences (DOI: 10.3906/elk-1706-81): 1–23. Accepted for publication 9 January 2018.
Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
Umut Sulubacak, Memduh Gökırmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, and Gülşen Eryiğit. Universal Dependencies for Turkish. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka, Japan, December 2016.
The IWT-UD Treebank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
If you use this propbank in any form of publication, please make sure you cite the following papers:
Gözde Gül Şahin, Eşref Adalı. Annotation of semantic roles for the Turkish proposition bank. Language Resources and Evaluation: 1–34. 2017.
Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences (DOI: 10.3906/elk-1706-81): 1–23. Accepted for publication 9 January 2018.
Gözde Gül Şahin. Verb sense annotation for Turkish propbank via crowdsourcing. In Proceedings of the 17th International Conference Computational Linguistics and Intelligent Text Processing (CICLing), Konya, Turkey, 2016: 496–506.
Umut Sulubacak, Tuğba Pamay and Gülşen Eryiğit. IMST: A Revisited Turkish Dependency Treebank. In Proceedings of the 1st International Conference on Turkic Computational Linguistics (TurCLing) at CICLing, Konya, Turkey, 2016.
Available Turkish PropBank versions:
Turkish PropBank - Original: This version is compatible with IMST v1.1 with some minor modifications and was used to obtain the results reported in (Şahin & Adalı, 2017).
Turkish PropBank - Latest: This version is compatible with the latest IMST release.
The lexicon of semantic frames, the semantic role labeling source code, and the crowdsourcing resources (task design files, instructions and task results) are available from the project's main website.
The Turkish PropBank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
If you use this propbank in any form of publication, please make sure you cite the following papers:
Gözde Gül Şahin, Eşref Adalı. Annotation of semantic roles for the Turkish proposition bank. Language Resources and Evaluation: 1–34. 2017.
Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences (DOI: 10.3906/elk-1706-81): 1–23. Accepted for publication 9 January 2018.
Gözde Gül Şahin. Verb sense annotation for Turkish propbank via crowdsourcing. In Proceedings of the 17th International Conference Computational Linguistics and Intelligent Text Processing (CICLing), Konya, Turkey, 2016: 496–506.
Umut Sulubacak, Memduh Gökırmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, and Gülşen Eryiğit. Universal Dependencies for Turkish. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka, Japan, December 2016.
The lexicon of semantic frames, the semantic role labeling source code, and the crowdsourcing resources (task design files, instructions and task results) are available from the project's main website.
Available Turkish PropBank-UD versions:
Turkish PropBank-UD - Original: This version is compatible with IMST-UD v1.3 with some minor modifications and was used to obtain the results reported in (Şahin & Adalı, 2017).
Turkish PropBank-UD - Latest: This version is compatible with the latest IMST-UD release.
The Turkish PropBank-UD is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
This page provides normalization resources used in the following work:
Gülşen Eryiğit and Dilara Torunoğlu-Selamet. 2017. Social Media Text Normalization for Turkish. In Natural Language Engineering, 23(6): 835–875. (indexed in SCI-Expanded and SSCI)
Terms of Use
If you use these normalization resources in any form of publication, please make sure you cite the following articles:
Gülşen Eryiğit and Dilara Torunoğlu-Selamet. 2017. Social Media Text Normalization for Turkish. In Natural Language Engineering, 23(6): 835–875. (indexed in SCI-Expanded and SSCI)
Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
The Normalization Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
This page provides code-switching resources used in the following work:
Yirmibeşoğlu, Z., & Eryiğit, G. (2018, November). Detecting code-switching between Turkish-English language pair.
In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text (pp. 110-115).
Terms of Use
If you use these code-switching resources in any form of publication, please make sure you cite the following article:
Yirmibeşoğlu, Z., & Eryiğit, G. (2018, November). Detecting code-switching between Turkish-English language pair.
In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text (pp. 110-115).
The Code-Switching Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
This page provides named entity resources used in the following work:
Gökhan Akın Şeker and Gülşen Eryiğit. 2017. Extending a CRF-based named entity recognition model for Turkish well formed text and user generated content of Turkish.
In Semantic Web Journal, doi:10.3233/SW-170253
Available resources: NE Annotations (Train7, WFS7, TDS1_V4 and IWT), Gazetteers, Feature Templates
Terms of Use
If you use these named entity recognition resources in any form of publication, please make sure you cite the following article:
Gökhan Akın Şeker and Gülşen Eryiğit. 2017. Extending a CRF-based named entity recognition model for Turkish well formed text and user generated content of Turkish.
In Semantic Web Journal, doi:10.3233/SW-170253
The Named Entity Recognition Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see the full license text for more information).
Under the terms of the license,
You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.