Datasets

License Agreement

Treebank Synopsis

[sample sentence]
  • # Sentences: 5635
  • # (Orthographic) Words: 56422
  • # (Syntactic) Tokens: 63066
  • # Single-headed Tokens: 61585
  • # Multi-headed Tokens: 1481
  • # Surface Dependencies (incl. DERIV): 63066
  • # Surface Dependencies (excl. DERIV): 56424
  • # Deep Dependencies: 1746
POS Tag #
Adj 7582 (11,70%)
Adverb 3096 (4,78%)
Conj 2260 (3,49%)
Det 1006 (1,55%)
Dup 23 (0,04%)
Interj 99 (0,15%)
Noun 24539 (37,86%)
Postp 1726 (2,66%)
Pron 2320 (3,58%)
Punc 10425 (16,08%)
Verb 11736 (18,11%)
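
The percentages in this table use a decimal comma, and they appear to be shares of the table's own column total rather than of the syntactic token count. A minimal Python sketch of reading the rows and rechecking the shares, assuming every row has the form `Tag count (percent%)` as shown above; `parse_row` and `shares` are hypothetical helper names, not part of any distributed tool:

```python
import re

# One table row, e.g. "Adj 7582 (11,70%)"; the percentage uses a decimal comma.
ROW = re.compile(r"^(\S+) (\d+) \((\d+),(\d+)%\)$")

def parse_row(line):
    """Return (tag, count, percent) for one row of a synopsis table."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    tag, count, whole, frac = m.groups()
    return tag, int(count), float(f"{whole}.{frac}")

def shares(rows):
    """Recompute each tag's share of the column total, in percent."""
    total = sum(count for _, count, _ in rows)
    return {tag: 100.0 * count / total for tag, count, _ in rows}

# Rows copied from the POS tag table above.
pos_rows = [parse_row(s) for s in [
    "Adj 7582 (11,70%)", "Adverb 3096 (4,78%)", "Conj 2260 (3,49%)",
    "Det 1006 (1,55%)", "Dup 23 (0,04%)", "Interj 99 (0,15%)",
    "Noun 24539 (37,86%)", "Postp 1726 (2,66%)", "Pron 2320 (3,58%)",
    "Punc 10425 (16,08%)", "Verb 11736 (18,11%)",
]]

recomputed = shares(pos_rows)
# Each listed percentage agrees with the recomputed share to within rounding.
assert all(abs(recomputed[tag] - pct) < 0.01 for tag, _, pct in pos_rows)
```

The tolerance of 0.01 allows for the two-decimal rounding used in the table.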

Feature #
A1pl 565 (0,51%)
A1sg 1733 (1,56%)
A2pl 371 (0,33%)
A2sg 639 (0,58%)
A3pl 3888 (3,51%)
A3sg 27029 (24,39%)
Abl 1095 (0,99%)
Able 500 (0,45%)
Abr 1 (0,00%)
Acc 2678 (2,42%)
Aor 1117 (1,01%)
Caus 679 (0,61%)
Cond 155 (0,14%)
Cop 372 (0,34%)
Dat 2833 (2,56%)
Desr 134 (0,12%)
Dist 12 (0,01%)
Equ 38 (0,03%)
Fitfor 9 (0,01%)
Fut 315 (0,28%)
Gen 2884 (2,60%)
Hastily 12 (0,01%)
Imp 410 (0,37%)
Ins 810 (0,73%)
Loc 2218 (2,00%)
Narr 760 (0,69%)
Neces 58 (0,05%)
Neg 1054 (0,95%)
Nom 14303 (12,91%)
Noun 42 (0,04%)
Opt 148 (0,13%)
Ord 65 (0,06%)
P1pl 261 (0,24%)
P1sg 824 (0,74%)
P2pl 118 (0,11%)
P2sg 242 (0,22%)
P3pl 652 (0,59%)
P3sg 6345 (5,73%)
Pass 1198 (1,08%)
Past 3209 (2,90%)
Pnon 19059 (17,20%)
Pos 9920 (8,95%)
Pres 682 (0,62%)
Prog1 1329 (1,20%)
Prog2 35 (0,03%)
Prop 1 (0,00%)
Stay 3 (0,00%)

DepRel #
APPOSITION 79 (0,12%)
ARGUMENT 1782 (2,75%)
CONJUNCTION 1325 (2,04%)
COORDINATION 3062 (4,72%)
DERIV 6642 (10,25%)
DETERMINER 2159 (3,33%)
INTENSIFIER 1033 (1,59%)
MODIFIER 15058 (23,23%)
MWE:COMP 582 (0,90%)
MWE:CONJ 101 (0,16%)
MWE:DUP 228 (0,35%)
MWE:ENAMEX:LOC 83 (0,13%)
MWE:ENAMEX:ORG 255 (0,39%)
MWE:ENAMEX:PERS 313 (0,48%)
MWE:FORMEX 27 (0,04%)
MWE:IDEX 882 (1,36%)
MWE:LVC 541 (0,83%)
MWE:NCOMP 370 (0,57%)
MWE:NUMEX 61 (0,09%)
MWE:NUMEX:DATE 1 (0,00%)
MWE:NUMEX:MONEY 115 (0,18%)
MWE:NUMEX:PCT 46 (0,07%)
MWE:PROVERB 8 (0,01%)
MWE:SIMEX 15 (0,02%)
MWE:TIMEX:DATE 73 (0,11%)
MWE:TIMEX:TIME 17 (0,03%)
OBJECT 4530 (6,99%)
POSSESSOR 4081 (6,30%)
PREDICATE 5743 (8,86%)
PUNCTUATION 10374 (16,01%)
RELATIVIZER 128 (0,20%)
SUBJECT 4889 (7,54%)
VOCATIVE 209 (0,32%)
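
Several of the synopsis counts for this treebank are related to one another, and the relationships can be checked directly. The sketch below uses only figures listed above; the variable names are illustrative.

```python
# Figures copied from the synopsis list and the DepRel table above.
tokens = 63066
single_headed = 61585
multi_headed = 1481
surface_incl_deriv = 63066
surface_excl_deriv = 56424
deriv_relations = 6642  # DERIV row of the DepRel table

# The single-headed and multi-headed counts add up to the token count.
assert single_headed + multi_headed == tokens

# The DERIV count equals the gap between the two surface-dependency counts.
assert surface_incl_deriv - deriv_relations == surface_excl_deriv
```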

Terms of Use

  • If you use this treebank in any form of publication, please make sure you cite the following papers:
    • Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences: 1–23. DOI: 10.3906/elk-1706-81. Accepted for publication 9 January 2018.
    • Umut Sulubacak, Tuğba Pamay and Gülşen Eryiğit. IMST: A Revisited Turkish Dependency Treebank. In Proceedings of the 1st International Conference on Turkic Computational Linguistics (TurCLing) at CICLing, Konya, Turkey, 2016.
    • Kemal Oflazer, Bilge Say, Dilek Zeynep Hakkani-Tür, Gökhan Tür. Building a Turkish Treebank. In Building and Exploiting Syntactically-Annotated Corpora. Anne Abeille (ed.), Kluwer Academic Publishers, 2003.
    • Nart B. Atalay, Kemal Oflazer, Bilge Say. The Annotation Process in the Turkish Treebank. In Proceedings of the EACL Workshop on Linguistically Interpreted Corpora (LINC), Budapest, Hungary, 2003.
  • The IMST is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
  • Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
I understand and agree to the Terms of Use.


License Agreement

Treebank Synopsis

[sample sentence]
  • # Sentences: 5009
  • # (Orthographic) Words: 43191
  • # (Syntactic) Tokens: 47226
  • # Single-headed Tokens: 46080
  • # Multi-headed Tokens: 1144
  • # Surface Dependencies (incl. DERIV): 47226
  • # Surface Dependencies (excl. DERIV): 43192
  • # Deep Dependencies: 1271
POS Tag #
Adj 5148 (10,62%)
Adverb 3572 (7,37%)
Conj 1709 (3,52%)
Det 974 (2,01%)
Dup 11 (0,02%)
Interj 309 (0,64%)
Noun 18556 (38,26%)
Num 2 (0,00%)
Postp 1519 (3,13%)
Pron 1905 (3,93%)
Punc 5983 (12,34%)
Verb 8809 (18,16%)

Feature #
A1pl 556 (0,66%)
A1sg 1882 (2,23%)
A2pl 546 (0,65%)
A2sg 835 (0,99%)
A3pl 2595 (3,08%)
A3sg 20759 (24,63%)
A3spl 2 (0,00%)
Abl 778 (0,92%)
Able 42 (0,05%)
Acc 2003 (2,38%)
Aor 1081 (1,28%)
Card 1 (0,00%)
Caus 9 (0,01%)
Cond 223 (0,26%)
Cop 406 (0,48%)
Dat 1907 (2,26%)
Desr 154 (0,18%)
Equ 49 (0,06%)
Fitfor 19 (0,02%)
Fut 326 (0,39%)
Gen 1223 (1,45%)
Imp 683 (0,81%)
Ins 342 (0,41%)
Loc 1459 (1,73%)
Mention 1 (0,00%)
Narr 492 (0,58%)
Neces 61 (0,07%)
Neg 1009 (1,20%)
Nom 12697 (15,07%)
Opt 162 (0,19%)
Ord 39 (0,05%)
P1pl 194 (0,23%)
P1sg 823 (0,98%)
P2pl 318 (0,38%)
P2sg 417 (0,49%)
P3pl 191 (0,23%)
P3sg 3259 (3,87%)
Pass 36 (0,04%)
Past 1707 (2,03%)
Pnom 3 (0,00%)
Pnon 15613 (18,53%)
Pos 7131 (8,46%)
Pres 909 (1,08%)
Prog1 1296 (1,54%)
Prog2 37 (0,04%)
Prop 1 (0,00%)

DepRel #
APPOSITION 16 (0,03%)
ARGUMENT 1555 (3,21%)
CONJUNCTION 921 (1,90%)
COORDINATION 2751 (5,67%)
DERIV 4034 (8,32%)
DETERMINER 1846 (3,81%)
INTENSIFIER 799 (1,65%)
MODIFIER 11808 (24,35%)
MWE:COMP 579 (1,19%)
MWE:CONJ 76 (0,16%)
MWE:DUP 139 (0,29%)
MWE:ENAMEX:LOC 22 (0,05%)
MWE:ENAMEX:ORG 122 (0,25%)
MWE:ENAMEX:PERS 136 (0,28%)
MWE:FORMEX 278 (0,57%)
MWE:IDEX 669 (1,38%)
MWE:LVC 660 (1,36%)
MWE:NCOMP 251 (0,52%)
MWE:NUMEX 36 (0,07%)
MWE:NUMEX:MONEY 50 (0,10%)
MWE:NUMEX:PCT 5 (0,01%)
MWE:PROVERB 23 (0,05%)
MWE:SIMEX 8 (0,02%)
MWE:TIMEX:DATE 41 (0,08%)
MWE:TIMEX:TIME 18 (0,04%)
OBJECT 3002 (6,19%)
POSSESSOR 2213 (4,56%)
PREDICATE 5027 (10,37%)
PUNCTUATION 5959 (12,29%)
RELATIVIZER 99 (0,20%)
SUBJECT 4124 (8,50%)
VOCATIVE 1230 (2,54%)

Terms of Use

  • If you use this treebank in any form of publication, please make sure you cite the following papers:
    • Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences: 1–23. DOI: 10.3906/elk-1706-81. Accepted for publication 9 January 2018.
    • Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
  • The ITU Web Treebank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
I understand and agree to the Terms of Use.


License Agreement

Treebank Synopsis

  • # Sentences: 5009
  • # (Orthographic) Words: 44463
  • # (Syntactic) Tokens: 44545
  • # Single-headed Tokens: 44545
  • # Multi-headed Tokens: 0
  • # Surface Dependencies: 44545
  • # Deep Dependencies: 0
  • # Projective Dependencies: 43855 (98,45%)
  • # Non-Projective Dependencies: 690 (1,55%)
POS Tag #
ADJ 4495 (10,09%)
ADP 1682 (3,78%)
ADV 2940 (6,60%)
AUX 1003 (2,25%)
CCONJ 1705 (3,83%)
DET 967 (2,17%)
INTJ 309 (0,69%)
NOUN 12129 (27,23%)
NUM 1403 (3,15%)
PRON 1773 (3,98%)
PROPN 1530 (3,43%)
PUNCT 5776 (12,97%)
SYM 536 (1,20%)
VERB 8286 (18,60%)
X 11 (0,02%)

Feature #
Abbr=Yes 238 (0,19%)
Aspect=Imp 1148 (0,93%)
Aspect=Perf 6808 (5,54%)
Aspect=Prog 1333 (1,08%)
Case=Abl 745 (0,61%)
Case=Acc 1881 (1,53%)
Case=Dat 1849 (1,50%)
Case=Equ 48 (0,04%)
Case=Gen 986 (0,80%)
Case=Ins 331 (0,27%)
Case=Loc 1404 (1,14%)
Case=Nom 12052 (9,80%)
Echo=Rdp 11 (0,01%)
Evident=Nfh 502 (0,41%)
Mood=AbilCnd 1 (0,00%)
Mood=AbilGen 3 (0,00%)
Mood=AbilImp 1 (0,00%)
Mood=Cnd 220 (0,18%)
Mood=CndPot 1 (0,00%)
Mood=Des 153 (0,12%)
Mood=Gen 376 (0,31%)
Mood=GenNec 13 (0,01%)
Mood=GenPot 3 (0,00%)
Mood=Imp 670 (0,55%)
Mood=ImpPot 1 (0,00%)
Mood=Ind 7597 (6,18%)
Mood=Nec 48 (0,04%)
Mood=Opt 161 (0,13%)
Mood=Pot 39 (0,03%)
Mood=Prs 10 (0,01%)
Negative=Neg 152 (0,12%)
NumType=Card 1362 (1,11%)
NumType=Ord 39 (0,03%)
Number=Plur 3369 (2,74%)
Number=Sing 21543 (17,53%)
Number[psor]=Plur 676 (0,55%)
Number[psor]=Sing 4304 (3,50%)
Person=1 2394 (1,95%)
Person=2 1350 (1,10%)
Person=3 21168 (17,22%)
Person[psor]=1 971 (0,79%)
Person[psor]=2 715 (0,58%)
Person[psor]=3 3294 (2,68%)
Polarity=Neg 1075 (0,87%)
Polarity=Pos 7057 (5,74%)
Polite=Form 37 (0,03%)
Polite=Infm 1297 (1,06%)
PronType=Dem 361 (0,29%)
PronType=Ind 189 (0,15%)
PronType=Prs 860 (0,70%)
Reflex=Yes 108 (0,09%)
Tense=Aor 1091 (0,89%)
Tense=AorPast 57 (0,05%)
Tense=Fut 442 (0,36%)
Tense=FutPast 19 (0,02%)
Tense=Past 2705 (2,20%)
Tense=Pqp 61 (0,05%)
Tense=Pres 4923 (4,01%)
VerbForm=Conv 532 (0,43%)
VerbForm=Part 1442 (1,17%)
VerbForm=Vnoun 651 (0,53%)
Voice=Cau 8 (0,01%)
Voice=CauPass 1 (0,00%)
Voice=Pass 35 (0,03%)

DepRel #
acl 990 (2,22%)
advmod 2784 (6,25%)
advmod:emph 782 (1,76%)
amod 2471 (5,55%)
appos 7 (0,02%)
aux:q 359 (0,81%)
case 1666 (3,74%)
cc 651 (1,46%)
ccomp 15 (0,03%)
compound 1792 (4,02%)
compound:lvc 642 (1,44%)
compound:redup 104 (0,23%)
conj 3908 (8,77%)
cop 783 (1,76%)
csubj 6 (0,01%)
det 1449 (3,25%)
discourse 327 (0,73%)
fixed 93 (0,21%)
flat 431 (0,97%)
mark 74 (0,17%)
nmod 2262 (5,08%)
nmod:poss 2025 (4,55%)
nsubj 3324 (7,46%)
nummod 492 (1,10%)
obj 2804 (6,29%)
obl 3510 (7,88%)
punct 5785 (12,99%)
root 5009 (11,24%)

Terms of Use

  • If you use this treebank in any form of publication, please make sure you cite the following papers:
    • Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences: 1–23. DOI: 10.3906/elk-1706-81. Accepted for publication 9 January 2018.
    • Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
    • Umut Sulubacak, Memduh Gökırmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, and Gülşen Eryiğit. Universal Dependencies for Turkish. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka, Japan, December 2016.
  • The IWT-UD Treebank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
I understand and agree to the Terms of Use.


License Agreement

Turkish PropBank Synopsis

[sample sentence]

Turkish PropBank Latest

  • # Sentences: 5635
  • # (Orthographic) Words: 56422
  • # (Syntactic) Tokens: 63066
  • # Predicates: 11738
  • # Arguments: 21169
  • # Predicates with First Sense: 7961
  • # Distinct Lemmas: 685
  • # Distinct Senses: 1052
  • # Distinct Roles: 36
  • # Total Morphemes: 260089
  • # Average Morphemes: 4.610716

Turkish PropBank Original

  • # Sentences: 5633
  • # (Orthographic) Words: 56401
  • # (Syntactic) Tokens: 63058
  • # Predicates: 11742
  • # Arguments: 21198
  • # Predicates with First Sense: 7969
  • # Distinct Lemmas: 685
  • # Distinct Senses: 1052
  • # Distinct Roles: 36
  • # Total Morphemes: 260049
  • # Average Morphemes: 4.609709
Argument Types # (Turkish PropBank Latest)
A1 7801
A0 3791
AM-TMP 1611
AM-MNR 1485
A2 1329
AM-LOC 858
AM-LVB 692
A4 621
AM-GOL 415
AM-EXT 384
AM-CAU 359
AM-ADV 317
A3 289
A-A 178
AM-INS 167
AM-PRD 144
AM-DIS 139
C-A1 126
AM-TWO 92
AM-COM 83
AM-DIR 78
AM-NEG 67
C-A0 52
R-A0 27
R-A1 22
C-A2 20
Argument Types # (Turkish PropBank Original)
A1 7812
A0 3800
AM-TMP 1614
AM-MNR 1486
A2 1330
AM-LOC 859
AM-LVB 693
A4 621
AM-GOL 416
AM-EXT 384
AM-CAU 359
AM-ADV 317
A3 289
A-A 178
AM-INS 167
AM-PRD 144
AM-DIS 139
C-A1 126
AM-TWO 92
AM-COM 84
AM-DIR 78
AM-NEG 67
C-A0 52
R-A0 27
R-A1 22
C-A2 20
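
The two argument-type tables above differ in only a few rows. A quick comparison, sketched here in Python with a hypothetical helper named `table`, shows that the gap between them equals the gap between the `# Arguments` totals listed for Latest (21169) and Original (21198). That is consistent with the first table describing the Latest version and the second the Original, assuming the roles not listed in either table are unchanged between versions.

```python
def table(text):
    """Parse whitespace-separated 'ROLE count' pairs into a dict."""
    parts = text.split()
    return {role: int(n) for role, n in zip(parts[::2], parts[1::2])}

# Counts copied from the two tables above, in page order.
first = table("""
    A1 7801 A0 3791 AM-TMP 1611 AM-MNR 1485 A2 1329 AM-LOC 858 AM-LVB 692
    A4 621 AM-GOL 415 AM-EXT 384 AM-CAU 359 AM-ADV 317 A3 289 A-A 178
    AM-INS 167 AM-PRD 144 AM-DIS 139 C-A1 126 AM-TWO 92 AM-COM 83
    AM-DIR 78 AM-NEG 67 C-A0 52 R-A0 27 R-A1 22 C-A2 20
""")
second = table("""
    A1 7812 A0 3800 AM-TMP 1614 AM-MNR 1486 A2 1330 AM-LOC 859 AM-LVB 693
    A4 621 AM-GOL 416 AM-EXT 384 AM-CAU 359 AM-ADV 317 A3 289 A-A 178
    AM-INS 167 AM-PRD 144 AM-DIS 139 C-A1 126 AM-TWO 92 AM-COM 84
    AM-DIR 78 AM-NEG 67 C-A0 52 R-A0 27 R-A1 22 C-A2 20
""")

# Per-role difference, keeping only the roles whose counts changed.
changed = {r: second[r] - first[r] for r in first if second[r] != first[r]}

# The summed difference matches the gap between the two '# Arguments' totals.
assert sum(changed.values()) == 21198 - 21169
```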

Terms of Use

  • If you use this propbank in any form of publication, please make sure you cite the following papers:
    • Gözde Gül Şahin, Eşref Adalı. Annotation of semantic roles for the Turkish proposition bank. Language Resources and Evaluation: 1–34. 2017.
    • Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences: 1–23. DOI: 10.3906/elk-1706-81. Accepted for publication 9 January 2018.
    • Gözde Gül Şahin. Verb sense annotation for Turkish propbank via crowdsourcing. In Proceedings of the 17th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Konya, Turkey, 2016: 496–506.
    • Umut Sulubacak, Tuğba Pamay and Gülşen Eryiğit. IMST: A Revisited Turkish Dependency Treebank. In Proceedings of the 1st International Conference on Turkic Computational Linguistics (TurCLing) at CICLing, Konya, Turkey, 2016.
  • Available Turkish PropBank versions:
    • Turkish PropBank - Original: This version is compatible with IMST v1.1 with some minor modifications and was used to obtain the results reported in (Şahin & Adalı, 2017).
    • Turkish PropBank - Latest: This version is compatible with the latest IMST release.
  • The lexicon of semantic frames, the semantic role labeling source code, and the resources for crowdsourcing (task design files, instructions and task results) are available from the project's main website.
  • The Turkish PropBank is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
I understand and agree to the Terms of Use.


License Agreement

Turkish PropBank-UD Synopsis

[sample sentence]

Turkish PropBank-UD Latest

  • # Sentences: 5635
  • # (Orthographic) Words: 58085
  • # (Syntactic) Tokens: 58146
  • # Predicates: 11729
  • # Arguments: 21159
  • # Predicates with First Sense: 8123
  • # Distinct Lemmas: 685
  • # Distinct Senses: 1052
  • # Distinct Roles: 36
  • # Total Morphemes: 314936
  • # Average Morphemes: 5.421985

Turkish PropBank-UD Original

  • # Sentences: 5635
  • # (Orthographic) Words: 58059
  • # (Syntactic) Tokens: 58060
  • # Predicates: 11729
  • # Arguments: 21184
  • # Predicates with First Sense: 8131
  • # Distinct Lemmas: 685
  • # Distinct Senses: 1052
  • # Distinct Roles: 36
  • # Total Morphemes: 314414
  • # Average Morphemes: 5.415422
Argument Types # (Turkish PropBank-UD Latest)
A1 7782
A0 3798
AM-TMP 1614
AM-MNR 1482
A2 1330
AM-LOC 859
AM-LVB 693
A4 621
AM-GOL 415
AM-EXT 382
AM-CAU 359
AM-ADV 317
A3 289
A-A 178
AM-INS 167
AM-PRD 144
AM-DIS 139
C-A1 126
AM-TWO 92
AM-COM 84
AM-DIR 78
AM-NEG 67
C-A0 52
R-A0 27
R-A1 22
C-A2 20
Argument Types # (Turkish PropBank-UD Original)
A1 7805
A0 3798
AM-TMP 1614
AM-MNR 1482
A2 1330
AM-LOC 859
AM-LVB 693
A4 621
AM-GOL 416
AM-EXT 383
AM-CAU 359
AM-ADV 317
A3 289
A-A 178
AM-INS 167
AM-PRD 144
AM-DIS 139
C-A1 126
AM-TWO 92
AM-COM 84
AM-DIR 78
AM-NEG 67
C-A0 52
R-A0 27
R-A1 22
C-A2 20

Terms of Use

  • If you use this propbank in any form of publication, please make sure you cite the following papers:
    • Gözde Gül Şahin, Eşref Adalı. Annotation of semantic roles for the Turkish proposition bank. Language Resources and Evaluation: 1–34. 2017.
    • Umut Sulubacak, Gülşen Eryiğit. Implementing Universal Dependency, Morphology and Multiword Expression Annotation Standards for Turkish Language Processing. Turkish Journal of Electrical Engineering & Computer Sciences: 1–23. DOI: 10.3906/elk-1706-81. Accepted for publication 9 January 2018.
    • Gözde Gül Şahin. Verb sense annotation for Turkish propbank via crowdsourcing. In Proceedings of the 17th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Konya, Turkey, 2016: 496–506.
    • Umut Sulubacak, Memduh Gökırmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, and Gülşen Eryiğit. Universal Dependencies for Turkish. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka, Japan, December 2016.
  • The lexicon of semantic frames, the semantic role labeling source code, and the resources for crowdsourcing (task design files, instructions and task results) are available from the project's main website.
  • Available Turkish PropBank-UD versions:
    • Turkish PropBank-UD - Original: This version is compatible with IMST-UD v1.3 with some minor modifications and was used to obtain the results reported in (Şahin & Adalı, 2017).
    • Turkish PropBank-UD - Latest: This version is compatible with the latest IMST-UD release.
  • The Turkish PropBank-UD is licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
I understand and agree to the Terms of Use.


License Agreement

Normalization Resources

  • This page provides normalization resources used in the following work:
    • Gülşen Eryiğit and Dilara Torunoğlu-Selamet. 2017. Social Media Text Normalization for Turkish. In Natural Language Engineering, 23(6): 835–875. (indexed in SCI-Expanded and SSCI)

Terms of Use

  • If you use these normalization resources in any form of publication, please make sure you cite the following articles:
    • Gülşen Eryiğit and Dilara Torunoğlu-Selamet. 2017. Social Media Text Normalization for Turkish. In Natural Language Engineering, 23(6): 835–875. (indexed in SCI-Expanded and SSCI)
    • Tuğba Pamay, Umut Sulubacak, Dilara Torunoğlu-Selamet, Gülşen Eryiğit. The Annotation Process of the ITU Web Treebank. In Proceedings of the 9th Linguistic Annotation Workshop (LAW) at NAACL, Denver, CO, USA, 2015.
  • The Normalization Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
  • Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
I understand and agree to the Terms of Use.


License Agreement

Code-Switching Turkish-English Language Pair

  • This page provides code-switching resources used in the following work:
    • Yirmibeşoğlu, Z., & Eryiğit, G. (2018, November). Detecting code-switching between Turkish-English language pair. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text (pp. 110-115).

Terms of Use

  • If you use these code-switching resources in any form of publication, please make sure you cite the following article:
    • Yirmibeşoğlu, Z., & Eryiğit, G. (2018, November). Detecting code-switching between Turkish-English language pair. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text (pp. 110-115).
  • The Code-Switching Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
  • Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
I understand and agree to the Terms of Use.


License Agreement

Named Entity Recognition Resources

  • This page provides named entity resources used in the following work:
    • Gökhan Akın Şeker and Gülşen Eryiğit. 2017. Extending a CRF-based named entity recognition model for Turkish well formed text and user generated content of Turkish. In Semantic Web Journal, doi:10.3233/SW-170253

    NE Annotations: Train7, WFS7, TDS1_V4 and IWT

    Gazetteers

    Feature Templates

Terms of Use

  • If you use these Named Entity Recognition resources in any form of publication, please make sure you cite the following article:
    • Gökhan Akın Şeker and Gülşen Eryiğit. 2017. Extending a CRF-based named entity recognition model for Turkish well formed text and user generated content of Turkish. In Semantic Web Journal, doi:10.3233/SW-170253
  • The Named Entity Recognition Resources are licensed under Creative Commons (BY-NC-SA 4.0). A summary of the terms of the license is given below (see here for more information). Under the terms of the license,

    You are free to:

    • Share — copy and redistribute the material in any medium or format
    • Adapt — remix, transform, and build upon the material
    • The licensor cannot revoke these freedoms as long as you follow the license terms.

    Under the following terms:

    • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    • NonCommercial — You may not use the material for commercial purposes.
    • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
    • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
  • Upon accepting the terms, you must manually fill in and sign the provided requisition form, and then scan and send it by e-mail to the address specified in the form. You will receive an e-mail response.
I understand and agree to the Terms of Use.
