Open resources

Vikizvornik > Wikisource-based dataset for Serbian, plain text, 20 million words, published in open access under a permissive licence: https://huggingface.co/datasets/procesaur/Vikizvornik

Vikipedija > Wikipedia-based dataset for Serbian, plain text, 150 million words, published in open access under a permissive licence: https://huggingface.co/datasets/procesaur/Vikipedija

Kišobran > Umbrella web corpus for Serbian, plain text, 8.2 billion words, published in open access under a permissive licence: https://huggingface.co/datasets/procesaur/kisobran

S.T.A.R.S. > Scientific publication corpora for Serbian, 700 million words, published in open access under a permissive licence: https://huggingface.co/datasets/procesaur/STARS
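
Since all four corpora are hosted on the Hugging Face Hub, they can probably be read with the `datasets` library. The sketch below is only an illustration: the repository IDs are copied from the URLs above, while the helper names `CORPORA`, `dataset_url` and `open_corpus` are hypothetical. It makes no assumption about split names, configurations or record fields, which should be checked on each dataset page.

```python
# Repository IDs copied from the URLs in the list above.
CORPORA = {
    "Vikizvornik": "procesaur/Vikizvornik",
    "Vikipedija": "procesaur/Vikipedija",
    "Kišobran": "procesaur/kisobran",
    "S.T.A.R.S.": "procesaur/STARS",
}


def dataset_url(name: str) -> str:
    """Return the Hugging Face page for one of the listed corpora."""
    return f"https://huggingface.co/datasets/{CORPORA[name]}"


def open_corpus(name: str):
    """Open a corpus in streaming mode (hypothetical helper).

    Streaming avoids downloading a multi-billion-word corpus up front.
    Assumption: the repository loads without an explicit configuration
    name; if it does not, pass one as described on the dataset page.
    """
    # Third-party dependency, imported lazily: pip install datasets
    from datasets import load_dataset

    return load_dataset(CORPORA[name], streaming=True)


# Example use (needs network access and the `datasets` package):
#   splits = open_corpus("Vikizvornik")
#   for split_name, split in splits.items():
#       print(split_name, next(iter(split)))
```

Streaming is chosen here because of the corpus sizes listed above; for the smaller datasets a regular download may be just as practical.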