Test version of Papagayo supporting extra languages
Posted: Tue Aug 29, 2006 1:48 am
Greetings all,
With Lost Marble's permission, I am releasing an interim test version of Papagayo with 10 more languages supported for breakdown into phonemes.
Note: the interface is still in English; other languages are supported only in the breakdown feature.
New languages supported are: Svenska (Swedish), Deutsch (German), Nederlands (Dutch), Italiano (Nicola Jelmorini/AloRom's version and my own), Suomi (Finnish), Magyar (Hungarian), Norsk (Norwegian), Turkish, Ukrainian, and Russian. English and Espanol (Spanish) are of course still supported.
Language support is based on simple spelling-based pronunciation - languages that do not have very consistent phonetic spelling are likely to get only approximate breakdown.
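To illustrate what spelling-based breakdown means in practice, here is a minimal Python sketch. It is not Papagayo's actual code: the `RULES` table and the `breakdown` function are hypothetical, and the handful of rules are made up for illustration. The idea is simply to scan a word left to right, match the longest known spelling fragment, and emit the CMU phoneme(s) assigned to it (the trailing 0 marks the unstressed form of a vowel).

```python
# Hypothetical rule table: spelling fragments -> CMU phonemes.
# These few rules are illustrative only, not Papagayo's real tables.
RULES = {
    "sch": ["SH"],
    "ch": ["K"],
    "a": ["AA0"],
    "e": ["EH0"],
    "i": ["IY0"],
    "o": ["OW0"],
    "u": ["UW0"],
    "b": ["B"],
    "d": ["D"],
    "k": ["K"],
    "l": ["L"],
    "m": ["M"],
    "n": ["N"],
    "s": ["S"],
    "t": ["T"],
}


def breakdown(word, rules=RULES):
    """Greedy longest-match conversion of a word's spelling to phonemes."""
    word = word.lower()
    max_len = max(len(key) for key in rules)
    phonemes = []
    pos = 0
    while pos < len(word):
        # Try the longest fragment first, then progressively shorter ones.
        for size in range(min(max_len, len(word) - pos), 0, -1):
            chunk = word[pos:pos + size]
            if chunk in rules:
                phonemes.extend(rules[chunk])
                pos += size
                break
        else:
            # No rule matched this letter: skip it rather than guess.
            pos += 1
    return phonemes


print(" ".join(breakdown("banana")))
```

With this toy table, "banana" comes out as B AA0 N AA0 N AA0. A language whose spelling is less regular would need many more rules and exceptions, which is why the breakdown is only approximate for such languages.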
Phoneme support in Papagayo is limited internally to the USA CMU phoneme set - that means exact breakdown is often not possible, certainly not for many of the European vowel sounds. Even support for the short English 'o' sound in "hot dog" is not available, as North Americans use a broad "h-ah-t d-aw-g" instead (broken down as HH-AA1-T D-AO1-G).
For English breakdown, just type or paste in the English text and hit the "English" button. For other languages, type or paste in the text, select your language from the drop-down list, and press the "Breakdown" button.
No, the current version does not remember your selected language between sessions.
This is an unsupported test/beta version for people to play with. The final official version will eventually be released on Lost Marble's site.
This is a zip file only; there is no GUI installer. Retain the folder structure when unzipping.
Bugs, instabilities and features that may not make it into the final release version are probably my fault, but you use the program at your own risk - I accept no legal liability. There is a harmless bug on start-up - "No handler found for image type", to do with loading the default mouths.
This test version is a Windows binary only - download it here. Due to the Unicode handling system used to support non-Roman characters, it will not work on Windows 95 or Windows 98 (not even 98SE) (sorry Alex!).
Source code is provided for Linux people and Mac programmers to play with - if any Mac programmer wants to provide a Mac binary (or any Mac-based studio with excess budget wants to buy me a Mac Mini), please get in contact with me. I'll see if I can get a Linux binary up, but no guarantees, so if any Linux power-user wants to have a go, please get in contact with me. This source code zip file contains new and changed files only, based on the Papagayo 1.1 source code available here. Unzip the official source, then copy my new and changed files into the main Papagayo folder before building/bundling the app.
Hidden feature: this version allows advanced users to use a larger mouth-set than the default Preston Blair mouth-set. Note: this feature may not make it into the final release - that's a decision for Lost Marble.
Download this zip file, put the entire "Extended CMU" folder (containing mouth images) into the Papagayo "mouths" folder, and copy the convert.ini file from this folder into the main Papagayo folder.
The disadvantage: you can then only use the Extended CMU mouth set for mouth preview, and you don't get the button-press phoneme insertion for manual breakdown and editing - you have to type in custom phonemes.
The advantage: by editing the convert.ini entries and providing your own mouth images, you can create and use your own custom mouth-set, not just the Preston Blair mouth-set. Note: only alter the right-hand side of the conversion pairs in the convert.ini file, and do not delete any conversion pairs.
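If you want to check that an edited file still contains every conversion pair, something along these lines might help. This is only a sketch under a big assumption: that each conversion pair sits on its own line in the form `LEFT=RIGHT`. Check your own convert.ini before relying on it. The functions `parse_pairs` and `check_edit` and the sample entries are hypothetical and not part of Papagayo.

```python
def parse_pairs(text):
    """Read conversion pairs from text, ASSUMING one LEFT=RIGHT pair per line."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # ignore blank lines and anything that is not a pair
        left, right = line.split("=", 1)
        pairs[left.strip()] = right.strip()
    return pairs


def check_edit(original_text, edited_text):
    """Return (missing, added) left-hand sides, comparing edited to original."""
    original = parse_pairs(original_text)
    edited = parse_pairs(edited_text)
    missing = sorted(set(original) - set(edited))
    added = sorted(set(edited) - set(original))
    return missing, added


# Made-up sample entries, for illustration only.
original = "AA0=AI\nB=MBP\n"
edited = "AA0=my_wide_mouth\n"

missing, added = check_edit(original, edited)
print("Deleted pairs:", missing)
print("Unexpected new pairs:", added)
```

In this made-up sample the edited text has lost the `B` pair, so `missing` is `['B']` and `added` is empty. To check real files, read their contents into the two strings first.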
Limitation: you can only provide conversions for the CMU (Carnegie-Mellon University) phonemes used internally by Papagayo. The list of CMU phonemes can be found at http://www.speech.cs.cmu.edu/cgi-bin/cmudict
Vowel sounds have 3 versions, e.g. AA0, AA1, AA2, for no, primary, and secondary emphasis/stress. For example, abalone (the shellfish) is broken down as AE2 B AH0 L OW1 N IY0, indicating primary stress on the OW sound, secondary stress on the AE, and no stress on the other 2 vowel sounds. The language breakdown routines currently use only the no-stress version for most vowel sounds.
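For anyone scripting around these breakdowns, the stress digit can be separated from the phoneme with a few lines of Python. This is a small sketch, not code from Papagayo; the names `STRESS_NAMES`, `split_stress` and `describe` are my own for this example.

```python
# Meaning of the trailing digit on CMU vowel phonemes, as described above.
STRESS_NAMES = {"0": "none", "1": "primary", "2": "secondary"}


def split_stress(phoneme):
    """Split e.g. 'OW1' into ('OW', 'primary'); consonants give (name, None)."""
    if phoneme and phoneme[-1] in STRESS_NAMES:
        return phoneme[:-1], STRESS_NAMES[phoneme[-1]]
    return phoneme, None


def describe(breakdown):
    """Split a space-separated breakdown into (phoneme, stress) tuples."""
    return [split_stress(p) for p in breakdown.split()]


for name, stress in describe("AE2 B AH0 L OW1 N IY0"):
    print(name, stress)
```

For the abalone breakdown this reports OW as primary, AE as secondary, AH and IY as none, and the consonants B, L and N with no stress value at all.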
Many thanks to all my breakdown testers for their help and feedback!
Yes, I will be working on other languages, and I will need some feedback from native speakers.
My next goal is to provide language breakdown for Japanese (only for romaji, hiragana, and katakana; kanji will not be supported), Polish, and French (although I suspect French will be rather inaccurate). Don't hold your breath - it may take some time. Experimental releases will be provided when they are ready.
Regards, Myles.