CodeNewbie Community 🌱

arushikulkarni
arushikulkarni

Posted on

german_transliterate

Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to clean messy text (e.g. map peculiar Unicode encodings to ASCII) or replace common abbreviations in text in combination with various text mining tasks.

german_transliterate is a Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to clean messy text (e.g. map peculiar Unicode encodings to ASCII) or replace common abbreviations in text in combination with various text mining tasks.

However, it is particularly useful for Text-To-Speech (TTS) preprocessing (both in training and inference) and has features to support phonemic encoding of the results (e.g. with espeak-ng) afterwards as next step in the processing pipeline.

Is has been successfully applied to preprocessing with Mozilla TTS in combination with espeak-ng phonemes as input data to both training and inference pipeline.

Top comments (0)