Test version of Papagayo supporting extra languages
Moderators: Víctor Paredes, slowtiger
Greetings all,
With Lost Marble's permission, I am releasing an interim test version of Papagayo with 10 more languages supported for breakdown into phonemes.
Note: the interface is still in English; other languages are only supported in the breakdown feature.
New languages supported are: Svenska (Swedish), Deutsch (German), Nederlands (Dutch), Italiano (Nicola Jelmorini/AloRom's version and my own), Suomi (Finnish), Magyar (Hungarian), Norsk (Norwegian), Turkish, Ukrainian, and Russian. English and Espanol (Spanish) are of course still supported.
Language support is based on simple spelling-based pronunciation - languages that do not have very consistent phonetic spelling are likely to get only approximate breakdown.
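To make "spelling-based" concrete, here is a rough sketch of a greedy longest-match breakdown in Python. This is not Papagayo's actual code: the rule table `TOY_RULES` and the function `breakdown_word` are invented for illustration and do not describe any real language.

```python
# Hypothetical rule table: spelling chunk -> CMU phonemes.
# Vowels use the no-stress (0) variants. All entries are invented examples.
TOY_RULES = {
    "sch": ["SH"],
    "ch": ["K"],
    "a": ["AA0"], "e": ["EH0"], "i": ["IY0"], "o": ["OW0"], "u": ["UW0"],
    "b": ["B"], "d": ["D"], "k": ["K"], "l": ["L"], "m": ["M"],
    "n": ["N"], "p": ["P"], "r": ["R"], "s": ["S"], "t": ["T"],
}


def breakdown_word(word, rules=TOY_RULES):
    """Convert a word to phonemes by always taking the longest matching chunk."""
    longest = max(len(key) for key in rules)
    text = word.lower()
    phonemes = []
    pos = 0
    while pos < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(longest, len(text) - pos), 0, -1):
            chunk = text[pos:pos + size]
            if chunk in rules:
                phonemes.extend(rules[chunk])
                pos += size
                break
        else:
            # No rule covers this letter in the toy table, so skip it.
            pos += 1
    return phonemes
```

A scheme like this works only as well as the spelling predicts the sound, which is why languages with inconsistent spelling get an approximate result.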
Phoneme support in Papagayo is limited internally to the USA CMU phoneme set - that means exact breakdown is often not possible, certainly not for many of the European vowel sounds. Even support for the short English 'o' sound in "hot dog" is not available, as North Americans use a broad "h-ah-t d-aw-g" instead (broken down as HH-AA1-T D-AO1-G).
For English breakdown, just type or paste in the English text and hit the "English" button. For other languages, type or paste in the text, select your language from the drop-down list, and press the "Breakdown" button.
No, the current version does not remember your selected language between sessions.
This is an unsupported test/beta version for people to play with. The final official version will eventually be released on Lost Marble's site.
This is a zip file only; there is no GUI installer. Retain the folder structure when unzipping.
Bugs, instabilities and features that may not make it into the final release version are probably my fault, but you use the program at your own risk - I accept no legal liability. There is a harmless bug on start-up - "No handler found for image type", to do with loading the default mouths.
This test version is a Windows binary only - download it here. Due to the Unicode handling system used to support non-roman characters, it will not work on Windows 95 or Windows 98 (not even 98SE) (sorry Alex!).
Source code is provided for Linux people and Mac programmers to play with - if any Mac programmer wants to provide a Mac binary (or any Mac-based studio with excess budget wants to buy me a Mac Mini), please get in contact with me. I'll see if I can get a Linux binary up, but no guarantees, so if any Linux power-user wants to have a go, please get in contact with me. This source code zip file contains new and changed files only, based on the Papagayo 1.1 source code available here. Unzip the official source, then copy my new and changed files into the main Papagayo folder before building/bundling the app.
Hidden feature: there is a hidden feature in this version that allows advanced users to use a larger mouth-set than the default Preston Blair mouth-set. Note: this feature may not make it into the final release - that's a decision for Lost Marble.
Download this zip file, put the entire "Extended CMU" folder (containing mouth images) into the Papagayo "mouths" folder, and copy the convert.ini file in this folder into the main Papagayo folder.
The disadvantage: you can then only use the Extended CMU mouth set for mouth preview, and you don't get the button-press phoneme insertion for manual breakdown and editing - you have to type in custom phonemes.
The advantage: by editing the convert.ini entries and providing your own mouth images, you can create and use your own custom mouth-set, not just the Preston Blair mouth-set. Note: only alter the right-hand side of the conversion pairs in the convert.ini file, and do not delete any conversion pairs.
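If you want to double-check an edited file against the original, something like the following Python sketch could help. It assumes each conversion pair sits on its own line as `LEFT=RIGHT`; that layout is an assumption, as are the helper names `read_pairs` and `check_edit` and the sample values in the usage note.

```python
def read_pairs(text):
    """Parse LEFT=RIGHT lines into a dict (assumed layout, see above)."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # ignore blank lines and anything that is not a pair
        left, right = line.split("=", 1)
        pairs[left.strip()] = right.strip()
    return pairs


def check_edit(original_text, edited_text):
    """Return (missing, added) left-hand sides of the edited file.

    Both lists should be empty if only right-hand sides were altered.
    """
    original = read_pairs(original_text)
    edited = read_pairs(edited_text)
    missing = sorted(set(original) - set(edited))
    added = sorted(set(edited) - set(original))
    return missing, added
```

For example, `check_edit("AA0=AI\nB=MBP\n", "AA0=open\n")` reports `B` as missing, meaning a pair was deleted by mistake.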
Limitation: you can only provide conversions for the CMU (Carnegie-Mellon University) phonemes used internally by Papagayo. The list of CMU phonemes can be found at http://www.speech.cs.cmu.edu/cgi-bin/cmudict
Vowel sounds have 3 versions, e.g. AA0, AA1, AA2, for no, primary, and secondary emphasis/stress. For example, abalone (the shellfish) is broken down as AE2 B AH0 L OW1 N IY0, indicating primary stress on the OW sound, secondary stress on the AE, and no stress on the other 2 vowel sounds. The language breakdown routines currently use only the no-stress version for most vowel sounds.
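As a small illustration of the stress digits, this Python sketch splits a breakdown string into phoneme and stress parts. The function names `split_stress` and `strip_stress` are made up here and are not part of Papagayo.

```python
import re

# A CMU symbol is a run of capital letters, optionally followed by one
# stress digit: 0 (none), 1 (primary) or 2 (secondary).
_SYMBOL = re.compile(r"^([A-Z]+)([012])?$")


def split_stress(breakdown):
    """Split a breakdown string into (phoneme, stress) pairs.

    Consonants carry no digit, so their stress is None.
    Hyphens are treated like spaces, so "HH-AA1-T" also works.
    """
    pairs = []
    for symbol in breakdown.replace("-", " ").split():
        match = _SYMBOL.match(symbol)
        if match is None:
            raise ValueError("not a CMU-style symbol: %r" % symbol)
        base, digit = match.groups()
        pairs.append((base, int(digit) if digit is not None else None))
    return pairs


def strip_stress(breakdown):
    """Return only the base phonemes, dropping the stress digits."""
    return [base for base, _ in split_stress(breakdown)]
```

Running `split_stress` on the abalone breakdown gives stress 2 for AE, 1 for OW, and 0 for AH and IY, matching the description above.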
Many thanks to all my breakdown testers for their help and feedback!
Yes, I will be working on other languages, and I will need some feedback from native speakers.
My next goal is to provide language breakdown for Japanese (only for romaji, hiragana, and katakana; kanji will not be supported), Polish, and French (although I suspect French will be rather inaccurate). Don't hold your breath - it may take some time. Experimental releases will be provided when they are ready.
Regards, Myles.
"Quote me as saying I was mis-quoted."
-- Groucho Marx
I took a turn at trying to apply your patches for linux, but ran into a snag:
There is a file "_lm.dll" but that is for windows. Do you know how I create the _lm module for linux?
Thanks,
jorgy
Code: Select all
# python papagayo.py
Traceback (most recent call last):
  File "papagayo.py", line 24, in ?
    from LipsyncFrame import LipsyncFrame
  File "/home/c180391/papagayo/ORIG/papagayo_1.1_source/LipsyncFrame.py", line 26, in ?
    import lm
  File "/home/c180391/papagayo/ORIG/papagayo_1.1_source/lm.py", line 5, in ?
    import _lm
ImportError: No module named _lm
jorgy wrote: There is a file "_lm.dll" but that is for windows. Do you know how I create the _lm module for linux?
Code: Select all
import _lm
ImportError: No module named _lm
Oops, sorry about that Jorgy! Grab the Linux version of Papagayo (which I think is in source code format), and copy my changed code over the top. I'm fairly sure it contains an _lm.so (the equivalent of the Windows .dll).
Regards, Myles.
heyvern wrote: What software or tools would be needed to compile this for Mac? Or linux?
Hi Vern,
On Macs you'll probably need a Python update (to the version that comes pre-installed with MacOS). See http://www.python.org/download/mac/
You'll also need wxPython, the cross-platform GUI library used by Papagayo. See http://pythonmac.org/packages/py24-fat/index.html and/or http://www.wxpython.org/download.php#binaries
You'll need the Lost Marble _lm sound-handling library - if it's not included in the source code, it should be with the binary version of Papagayo from Lost Marble - and possibly a libsndfile library (likewise, if it's not pre-installed on Mac systems - I'm a Mac ignoramus).
You don't need the following to run Papagayo from Python, only to build a double-clickable program that you can also distribute:
The MacPython update (above) may include a Python application bundler - I'm not sure of the exact details. I think it includes BundleBuilder - see http://pythonmac.org/wiki/BundleBuilder
Otherwise/Anyway, you'll need py2app (see http://pythonmac.org/wiki/py2app) - either install it using setuptools or download the source version from http://cheeseshop.python.org/pypi/py2app/
Instructions for setuptools or the source version at http://svn.pythonmac.org/py2app/py2app/ ... rom-source
For Linux, to get it to run you'll need to install Python (http://www.python.org/download/) if not already installed (upgrade to 2.4 if a previous version is installed), also wxPython (http://www.wxpython.org/download.php#sources, although there are packages for Debian, Ubuntu and some rpm-based systems), and the Linux version (Python source) of Papagayo from Lost Marble for the _lm.so library and the base code (copy my changed code over the top).
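A quick way to see which of these pieces Python can actually find is sketched below. Note the assumptions: it uses `importlib.util.find_spec`, which exists only in modern Python 3 (not the Python 2.4 mentioned above), and the helper name `find_missing` is made up. Run it from inside the Papagayo source folder so that `_lm` can be located.

```python
import importlib.util


def find_missing(modules):
    """Return the names from `modules` that Python cannot locate."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup errors the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Calling `find_missing(["wx", "_lm"])` returns an empty list when both wxPython and the _lm library are visible; a missing `_lm` corresponds to the "No module named _lm" error reported earlier in this thread.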
To create a distributable executable bundle you'll also need cxFreeze (http://www.python.net/crew/atuining/cx_Freeze/). Unless you are running Fedora Core 5, you'll possibly need to re-compile from source - the binaries provided require a somewhat older glibc 2.4 than is available on the live Linux system I play with.
Alternatively, you could try PyInstaller, at http://pyinstaller.hpcf.upr.edu/cgi-bin/trac.cgi
When I get the chance I'll be playing around compiling cxFreeze and trying PyInstaller myself.
Regards, Myles.
myles wrote: Oops, sorry about that Jorgy! Grab the Linux version of Papagayo (which I think is in source code format), and copy my changed code over the top. I'm fairly sure it contains an _lm.so (the equivalent of the Windows .dll).
Regards, Myles.
Bingo, that was it. It's now working, and I'll investigate the tools you mention in your other post to convert to a distributable form. I'll let you know how I progress.
jorgy
- Lost Marble
Nice, Myles, very nice.
My own improvements (speed, interface, etc) have kind of gone on hold as other work has got in the way. I'm currently looking at rewriting some of the code as a Python module for extra speed (the entire waveform view is slooow with large files, this should help a lot).
I'll see if I can fold these edits into my version too!
Linux binary version ready
Okay, I've added (amongst other things) a C/C++ compiler to my Linux setup and built a binary version of cxFreeze that matches my system, so I've been able to bundle together a binary version of Papagayo for Linux.
What sort of Linux system doesn't come with a C/C++ compiler pre-installed? A 200MB (including KDE) base version running live from a 400MB partition of a 512MB USB stick. Yay Slax! Easiest live Linux system to customise I've found yet.
Will the binary version work on your Linux distribution? I don't know - isn't Linux fun?
You shouldn't need Python or wxPython, and you probably won't even need a Unicode-enabled wxGTK. However, you will probably need glibc version 2.8 or later - but I'd be surprised if most modern Linux distributions didn't include this (and if you run Slax, you can download 2.12 as a module).
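If you happen to have Python on the machine anyway, a rough way to compare the installed glibc against that minimum is sketched below. `platform.libc_ver()` is a standard-library call; the helper names `version_tuple` and `glibc_at_least` are made up, and the check returns None when glibc cannot be detected.

```python
import platform


def version_tuple(text):
    """Turn a dotted version such as "2.12" into a tuple like (2, 12)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(minimum, reported=None):
    """Compare a glibc version against `minimum`.

    If `reported` is not given, ask the platform module. Returns None
    when the C library is not reported as glibc.
    """
    if reported is None:
        lib, reported = platform.libc_ver()
        if lib != "glibc":
            return None
    return version_tuple(reported) >= version_tuple(minimum)
```

The comparison is done on number tuples rather than strings because "2.12" sorts before "2.8" as text, even though 2.12 is the later version.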
It unzips into a papagayoM folder (just in case you have some other Papagayo version already present). Run it using the shell-script .sh file (./papagayo.sh). Lost Marble supplied the script as just papagayo (no .sh); I've renamed it because cxFreeze turns papagayo.py into papagayo, overwriting the shell script file.
A note about path names - the code in Papagayo (all versions) uses the Latin-1 encoding to handle directories and filenames. This may potentially result in Papagayo not working properly if you are trying to open or save files with accented letters or diacritics or non-Roman letters in their names, or in the directory/folder paths.
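To check whether a particular path can even be represented in Latin-1, a minimal Python 3 sketch follows. The function name `latin1_safe` is made up, and passing this check does not guarantee that Papagayo will handle the path correctly; it only flags names that Latin-1 cannot hold at all.

```python
def latin1_safe(path):
    """Return True if every character in `path` exists in Latin-1."""
    try:
        path.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True
```

For instance, a Western European name like "café.wav" fits in Latin-1, while a Cyrillic name like "файл.wav" does not, so the latter is the kind of path most likely to cause trouble.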
Regards, Myles.
DarkCryst wrote: btw - the hot dog thing? That's in UK english, and many US accents, pretty well described by the "AO2" CMU phoneme
It's all a matter of the lexical stress
Thanks DarkCryst!
That said, the CMU dictionary uses the AO2 phoneme for (as examples) the 'o' in glorification, the 'a' in softball, and the 'o' in therefore, all of which are a completely different sound to the short 'o' in hot dog, at least in Australian English, my native tongue, which I thought was closer to UK English in this respect than to US English.
I guess we just have to live with a certain amount of inaccuracy or variation even in English, let alone the very rough approximations of the sounds in some of the mainland European languages.
Regards, Myles.
Another note about the Linux binary for Papagayo: I found on my Slax system I had to change the default sound system from threaded OSS to ALSA to get it loading and displaying sound files and waveforms. I also set a default auto-suspend time of 6 seconds to prevent the KDE sound system blocking programs that use the audio hardware directly, and turned off all the default system sounds - I don't know if either of these changes were important.
I only briefly tested loading projects and wav files and phonetic breakdown; it occurred to me just now that I didn't actually test exporting from the Linux binary version.
Regards, Myles
late bump...
Australia, like the UK and USA, has many regional variations: Victoria vs Sydney accents, for example.
So really you are just describing in these files a generic English. Probably something transatlantic would be the best middle ground. Or... for an Ozzie - think Kyle or how Mel Gibson sounded 15 years ago.