Turkish NLP Suite

Turkish NLP Suite is a nonprofit organization that aims to deliver resources for Turkish NLP, including corpora, pretrained models, benchmarking resources, and much more. We aim to bring excellence to Turkish language and speech research by publishing and open-sourcing work across all areas of Turkish processing, including morphology, language modelling, subword modelling, corpus building, and more. The organization is run by Duygu, your research scientist, originally from Istanbul and currently living in Berlin and San Diego, CA.

Skills

Large-scale Turkish Corpus (99%)

Pretrained Turkish Models (99%)

Publications (99%)

Tutorials (99%)

Publications

Resources

Blogs

Ruj, Blazer ve Cesaret / Bir "Biz" Hikâyesi (Lipstick, Blazer and Courage / A Story of "Us")

In this post, we begin a small "we-language" manifesto for March 8: how is solidarity built in the beauty, style, and lifestyle pieces that women in Turkey write for other women? Starting from the SüslüTrendler corpus; outfits, makeup, good...

Optimal Turkish Subword Strategies I / Morphology Decides, Subwords Deliver

We proudly present the first part of our Turkish subword manifesto: how Turkish language modeling benefits from morphology. In this article, we compare character-, word-, and morphology-aware subword tokenization, where...

TrGLUE, Native and No‑Nonsense / A Turkish Benchmark You Can Trust

For a long time, Turkish models climbed leaderboards written somewhere else—translated datasets, English‑first assumptions, noisy web scrapes. Useful, yes. Representative of real Turkish? Not quite. TrGLUE is our answer: a...