These webservices for Language & Speech Technology are offered by the research group Language Machines and the Centre for Language and Speech Technology, both part of the Centre of Language Studies of Radboud University Nijmegen. We work in close collaboration with the Tilburg Centre for Creative Computing (TiCC), Tilburg University, and will include web-interfaces to various relevant tools developed there.

Here we make various open-source computational linguistics tools available on the web through the CLAM software.

Many of these webservices support input and/or output in the FoLiA XML format. Some of them are explicitly designed for Dutch only.

You will need an account to access the tools. An account can be obtained free of charge for small-scale non-commercial use, so feel free to register. The webservices can be accessed both by human end-users, through an in-browser interface, and by client software. Apply for an account here; staff members may need to review your application and activate your account. If you already have an account but have lost your username or password, you can reset your password here.

Service Name | Description
Ucto Tokeniser | Ucto is a Unicode-compliant tokeniser. It takes input in the form of one or more untokenised texts, and subsequently tokenises them. Several languages are supported, and the software is extensible to other languages.
Frog | Frog is a suite containing a tokeniser, PoS-tagger, lemmatiser, morphological analyser, and dependency parser for Dutch, developed at Tilburg University. It is the successor of Tadpole.
Valkuil | Valkuil is a Dutch spelling correction system.
Fowlt | Fowlt is an English spelling correction system.
Oersetter | Oersetter is a Frisian-Dutch, Dutch-Frisian machine translation system.
T-scan | T-scan is a Dutch text analytics tool for readability prediction.
Colibri Core | Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way.
FoliAStats | Computes n-grams from FoLiA documents.
Alpino | Dependency parser for Dutch, developed at the University of Groningen.
PICCL | An OCR (Tesseract) and OCR post-correction (TICCL) pipeline.
Frisian Forced Alignment Demo | This webservice provides a ctm file with word alignments, given a Frisian speech recording and its transcription.
Automatic Transcription of Oral History Interviews | Automatic Transcription of Oral History Interviews
DEDICON Daisylezer App ASR Simulation | DEDICON Daisylezer App ASR Simulation
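
The n-grams and skipgrams mentioned in the Colibri Core description can be illustrated with a minimal pure-Python sketch. This is not Colibri Core's API or implementation; the function names `ngrams` and `skipgrams`, the `GAP` marker, and the `max_gaps` parameter are hypothetical and chosen for illustration only. The sketch covers only the simplest case, in which each gap marker stands for exactly one skipped token.

```python
from itertools import combinations

GAP = "{*}"  # hypothetical gap marker, used only in this illustration


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n, max_gaps=1):
    """Return patterns of length n in which 1..max_gaps interior tokens
    are replaced by a gap marker. The first and last token are always kept,
    and each gap marker covers exactly one token (a fixed-size gap)."""
    patterns = []
    for gram in ngrams(tokens, n):
        interior = range(1, n - 1)  # positions that may become gaps
        for k in range(1, max_gaps + 1):
            for positions in combinations(interior, k):
                patterns.append(
                    tuple(GAP if i in positions else tok
                          for i, tok in enumerate(gram))
                )
    return patterns


tokens = "to be or not to be".split()
bigrams = ngrams(tokens, 2)
trigram_skips = skipgrams(tokens, 3)
```

For the six-token example, `ngrams(tokens, 2)` yields five bigrams and `skipgrams(tokens, 3)` yields four patterns such as `("to", "{*}", "or")`. A library built for large corpora would store and count such patterns far more compactly than these plain tuples.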

For further details, please contact Antal van den Bosch. For technical support, contact the respective developers; for issues pertaining to CLAM and this server, contact Maarten van Gompel.