
Access to this resource is restricted. To download it, contact a member of the team.

Authors: Cătălina Mărănduc, Augusto Perez

Treebank for diachronic Romanian from the Alexandru Ioan Cuza University of Iași (UAIC), annotated in the Universal Dependencies (UD) formalism.

Of the 21,403 sentences in the resource, 2,500 are folklore from Romania and the Republic of Moldova, and the rest are old texts from the 16th and 17th centuries. Note that part of the UAIC treebank (4,000 sentences) was converted to UD format by Augusto Perez and included in another treebank, UD-Romanian RRT, in 2015.

Resource size: 21,403 sentences and 449,959 tokens (words and punctuation marks), manually annotated.


  • Treebank
  1. Bobicev, V., T. Bumbu, V. Lazu, V. Maxim, D. Istrati, Folk poetry for computers: Moldovan Codri’s ballads parsing, in Proceedings of the 12th International Conference “Linguistic Resources and Tools for Processing the Romanian Language”, pp. 39-50, 2016.
  2. Svetlana C., A. Colesnicov, L. Malahov, Digitization of Old Romanian Texts Printed in the Cyrillic Script, in Proceedings of the International Conference on Digital Access to Textual Cultural Heritage, pp. 143-148, 2017.
  3. Colhon, M., C. Mărănduc, C. Mititelu, A Multiform Balanced Dependency Treebank for Romanian, in Proceedings of Knowledge Resources for the Socio-Economic Sciences and Humanities (KnowRSH), workshop at the Recent Advances in Natural Language Processing (RANLP) conference, Varna, Bulgaria, September 8, 2017, pp. 9-19, 2017.
  4. Mărănduc, C., F. Hociung, V. Bobicev, Treebank Annotator for multiple formats and conventions, in Proceedings of the 4th Conference of Mathematical and Computer Science Society of the Republic of Moldova, Chisinau, Republic of Moldova, June 28 - July 2, 2017, pp. 529-534, 2017.
  5. Mărănduc, C., V. Bobicev, C.-A. Perez, Tools for Building a Corpus to Study the Historical and Geographical Variation of the Romanian Language, in Proceedings of Language Technology for Digital Humanities in Central and (South-)Eastern Europe (LT4DH-CEE 2017), workshop at the Recent Advances in Natural Language Processing (RANLP) conference, Varna, Bulgaria, September 8, 2017, pp. 10-20, 2017.
  6. Mărănduc, C., V. Bobicev, Non Standard Treebank Romania - Republic of Moldova in the Universal Dependencies, in Proceedings of the Conference on Mathematical Foundations of Informatics (MFOI-2017), Chisinau, Moldova, November 9-11, 2017, pp. 111-116, 2017.
  7. Mărănduc, C., C. Mititelu, V. Bobicev, Syntactic Semantic Correspondence in Dependency Grammar, in Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories, Prague, January 23-24, 2018.
  8. Mărănduc, C., V. Bobicev, R. Untilov, Syntactic Parser for Old and Regional Romanian, at the 3rd DATeCH Conference, Brussels, May 2019.
