University of Colombo School of Computing - Language Technology Research Laboratory

Speech Initiative

The Sinhala Text-to-Speech Project was a major deliverable of the PAN Localization Project, funded by the IDRC of Canada. This system is the result of two years of research by the Language Technology Research Laboratory (LTRL), University of Colombo School of Computing (UCSC), Sri Lanka.

The aim was to produce a free and open-source Sinhala Text-to-Speech synthesis engine of commercial quality. The system is based on the Festival speech synthesizer, developed at the University of Edinburgh.

The resources are available under the GNU General Public License.

The project outputs are grouped by intended audience, as follows:

* End users: Those who simply wish to install and use a Sinhala speech synthesizer.
* Developers: Developers and researchers working in this area who wish to study, adapt or incorporate the Sinhala Text-to-Speech synthesis engine into their work.

Resources for End Users

Download: Microsoft Speech API compliant Sinhala Text-to-Speech engine setup

Download: Festival Sinhala voice (Festival-si) Debian Linux Package

Download archived Festival Sinhala Voice: ucsc_sin_sdn_diphone.zip

Instructions for setting up Gnopernicus to read Sinhala text on the screen.

Resources for Developers

Publications

A Rule Based Syllabification Algorithm for Sinhala (285 KB)
Ruvan Weerasinghe, Asanka Wasala and Kumudu Gamage, Language Technology Research Laboratory, University of Colombo School of Computing

Abstract.
This paper presents a study of Sinhala syllable structure and an algorithm for identifying syllables in Sinhala words. After a thorough study of the syllable structure and linguistic rules for syllabification of Sinhala words and a survey of the relevant literature, a set of rules was identified and implemented as a simple, easy-to-implement algorithm. The algorithm was tested using 30,000 distinct words obtained from a corpus and compared with the same words manually syllabified. The algorithm performs with 99.95 % accuracy.

Sinhala Grapheme to Phoneme Conversion and Rules for Schwa Epenthesis (271 KB)
Asanka Wasala, Ruvan Weerasinghe and Kumudu Gamage, Language Technology Research Laboratory, University of Colombo School of Computing

Abstract.
This paper describes an architecture to convert Sinhala Unicode text into phonemic specification of pronunciation. The study was mainly focused on disambiguating schwa-/ə/ and /a/ vowel epenthesis for consonants, which is one of the significant problems found in Sinhala. This problem has been addressed by formulating a set of rules. The proposed set of rules was tested using 30,000 distinct words obtained from a corpus and compared with the same words manually transcribed to phonemes by an expert. The Grapheme-to-Phoneme (G2P) conversion model achieves 98 % accuracy.
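
Both abstracts above report word-level accuracy against a manually prepared reference. The following is only a minimal sketch of that kind of evaluation, assuming exact-match scoring per word; the papers' actual scoring procedure is not described on this page. The function name word_accuracy and the toy data (Latin-letter placeholders, not real Sinhala output) are hypothetical.

```python
def word_accuracy(predicted, reference):
    """Return the percentage of words whose predicted output
    exactly matches the manually prepared reference output."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty word list")
    # Count positions where the system output equals the reference.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Hypothetical toy data: syllable boundaries marked with "."
reference = ["a.ma", "ka.da", "gas", "po.tha"]
predicted = ["a.ma", "kad.a", "gas", "po.tha"]

print(word_accuracy(predicted, reference))  # 75.0
```

Under this exact-match assumption, 99.95 % on 30,000 words would correspond to 15 mismatched words.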

Useful Links

Acknowledgements

This work was made possible through the PAN Localization Project (http://www.PANL10n.net), a grant from the International
Development Research Center (IDRC), Ottawa, Canada, administered through the Center for Research in Urdu Language Processing,
National University of Computer and Emerging Sciences, Pakistan.

The voice you hear in the Sinhala Text-to-Speech Synthesizer is that of Mr. Sandun Sulakshana.

We would like to thank Mr. Sandun for providing us with such a lucid, male voice.
We apologize for robotizing his manly natural voice to suit our selfish motives!

Developers of Sinhala Text-to-Speech Synthesis Voice 'Festival-si':

- Asanka Wasala, Viraj Welgama, Kumudu Gamage

Sinhala MSAPI solution (Win32) and Gnopernicus (Linux) solution developed by:

- Asanka Wasala

Special thanks to:

- Dr. Ruvan Weerasinghe - Director, University of Colombo School of Computing (UCSC) / Team lead of the PAN L10n Sri Lanka team

We would like to thank Dr. Ruvan Weerasinghe for providing thought leadership and access to academic thinking at UCSC. In addition, over the years he has provided mentoring and coaching on the development of TTS.

- Dr. Sarmad Hussain (NUCES, Pakistan) - Associate Professor and Head of the Center for Research in Urdu Language Processing.

We would like to thank him for conducting the training workshop on "Phonetics and Phonology for TTS" and for his support throughout the project.

- Mr. S.T. Nandasara

For providing assistance with recording in the UCSC studio.

- Rafia Bokhari (Team lead-Pakistan TTS team)

For helping us solve various technical issues.

- David Brown

For creating and sharing the Festival.NET solution.

Developers of the Festival-MSAPI bridge:

- Dr. Briony Williams (University of Wales, Bangor)
- Dr. Rhys J Jones (University of Wales, Bangor)

For their invaluable help in developing the Sinhala MSAPI solution, as well as with prosody aspects.

- Ambrose Choy (University of Wales, Bangor)
- Dewi B Jones (University of Wales, Bangor)

- Mr. Harsha Wijayawardhana, for being available whenever his technical assistance was sought.

- Mr. Harshula Jayasuriya, for helping us solve various Linux- and CVS-related issues.

- Roger Wilson-Hinds (Thunder screenreader), Dipendra Manocha (SAFA screenreader), for helping to improve the functionality of screen readers in Sinhala.

Sinhala Language scholars:

- Prof. J.B. Dissanayake
- Prof. R.M.W. Rajapaksha

Developers of the Festival Speech Synthesis System:

- Alan W Black (Carnegie Mellon University)
- Rob Clark (Edinburgh University)
- Korin Richmond (Edinburgh University)
- Heiga Zen (Nagoya Institute of Technology)

And other colleagues of the Language Technology Research Laboratory, UCSC:

- Mr. Vincent Halahakone
- Mr. Dulip Lakmal Herath
- Mr. Nishantha Medagoda
- Mr. Eranga Jayalatharachchi

For his valuable feedback:

- Prof. D.P.M.W. Weerakkody


© 2006-2010 by Language Technology Research Laboratory, University of Colombo School of Computing, Sri Lanka
Last Updated On: 1st February 2010