Test version of Papagayo supporting extra languages

myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Test version of Papagayo supporting extra languages

Post by myles »

Greetings all,

With Lost Marble's permission, I am releasing an interim test version of Papagayo with 10 more languages supported for breakdown into phonemes.

Note: the interface is still in English; other languages are supported only in the breakdown feature.

New languages supported are: Svenska (Swedish), Deutsch (German), Nederlands (Dutch), Italiano (Nicola Jelmorini/AloRom's version and my own), Suomi (Finnish), Magyar (Hungarian), Norsk (Norwegian), Turkish, Ukrainian, and Russian. English and Espanol (Spanish) are of course still supported.

Language support is based on simple spelling-based pronunciation - languages that do not have very consistent phonetic spelling are likely to get only approximate breakdown.
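
As a rough illustration of what spelling-based breakdown means, here is a minimal Python sketch. The rule table and the function name are invented for this example and are not taken from the Papagayo source; the real per-language rules are certainly larger and more careful. It only shows the general idea of matching letter groups, longest first, and emitting CMU phonemes.

```python
# Toy spelling-to-phoneme rules; NOT Papagayo's real tables.
# Keys are letter groups, values are CMU phonemes (no-stress vowels).
TOY_RULES = {
    "ch": ["CH"],
    "sh": ["SH"],
    "a": ["AA0"],
    "e": ["EH0"],
    "i": ["IY0"],
    "o": ["OW0"],
    "u": ["UW0"],
    "b": ["B"], "d": ["D"], "k": ["K"], "l": ["L"], "m": ["M"],
    "n": ["N"], "p": ["P"], "s": ["S"], "t": ["T"],
}


def toy_breakdown(word, rules=TOY_RULES):
    """Greedy longest-match conversion of a spelled word to phonemes."""
    word = word.lower()
    longest = max(len(key) for key in rules)
    phonemes = []
    pos = 0
    while pos < len(word):
        for size in range(longest, 0, -1):
            chunk = word[pos:pos + size]
            if chunk in rules:
                phonemes.extend(rules[chunk])
                pos += len(chunk)
                break
        else:
            # Unknown letter: skip it rather than guess a sound.
            pos += 1
    return phonemes


print(toy_breakdown("mela"))  # ['M', 'EH0', 'L', 'AA0']
print(toy_breakdown("chat"))  # ['CH', 'AA0', 'T']
```

A table like this works tolerably for languages where one spelling means one sound, and it shows why irregular spelling gives only an approximate result: the lookup never sees the word as a whole.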

Phoneme support in Papagayo is limited internally to the USA CMU phoneme set - that means exact breakdown is often not possible, certainly not for many of the European vowel sounds. Even support for the short English 'o' sound in "hot dog" is not available, as North Americans use a broad "h-ah-t d-aw-g" instead (broken down as HH-AA1-T D-AO1-G).

For English breakdown, just type or paste in the English text and hit the "English" button. For other languages, type or paste in the text, select your language from the drop-down list, and press the "Breakdown" button.

No, the current version does not remember your selected language between sessions.

This is an unsupported test/beta version for people to play with. The final official version will eventually be released on Lost Marble's site.
This is a zip file only; there is no GUI installer. Retain the folder structure when unzipping.

Bugs, instabilities and features that may not make it into the final release version are probably my fault, but you use the program at your own risk - I accept no legal liability. There is a harmless bug on start-up - "No handler found for image type", to do with loading the default mouths.

This test version is a Windows binary only - download it here. Due to the Unicode handling system used to support non-roman characters, it will not work on Windows 95 or Windows 98 (not even 98SE) (sorry Alex!).

Source code is provided for Linux people and Mac programmers to play with - if any Mac programmer wants to provide a Mac binary (or any Mac-based studio with excess budget wants to buy me a Mac Mini), please get in contact with me. I'll see if I can get a Linux binary up, but no guarantees, so if any Linux power-user wants to have a go, please get in contact with me. This source code zip file contains new and changed files only, based on the Papagayo 1.1 source code available here. Unzip the official source, then copy my new and changed files into the main Papagayo folder before building/bundling the app.

Hidden feature: there is a hidden feature in this version that allows advanced users to use a larger mouth-set than the default Preston Blair mouth-set. Note: this feature may not make it into the final release - that's a decision for Lost Marble.
Download this zip file, put the entire "Extended CMU" folder (containing mouth images) into the Papagayo "mouths" folder, and copy the convert.ini file in this folder into the main Papagayo folder.
The disadvantage: you can then only use the Extended CMU mouth set for mouth preview, and you don't get the button-press phoneme insertion for manual breakdown and editing - you have to type in custom phonemes.
The advantage: by editing the convert.ini entries and providing your own mouth images, you can create and use your own custom mouth-set, not just the Preston Blair mouth-set. Note: only alter the right-hand side of the conversion pairs in the convert.ini file, and do not delete any conversion pairs.
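
The layout of convert.ini is not shown in this post, so the following Python sketch is only an illustration of the "change the right-hand side, keep every pair" rule. It assumes one `LEFT=RIGHT` pair per line, which may not match the real file, and the sample entries and function names are invented.

```python
# Sketch only: assumes one "LEFT=RIGHT" pair per line, which may not
# match the real convert.ini layout. Sample entries are invented.
SAMPLE = """\
AA0=open
AA1=open
M=closed
"""


def parse_pairs(text):
    """Return {cmu_phoneme: mouth_name} from LEFT=RIGHT lines."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blanks and anything that is not a pair
        left, right = line.split("=", 1)
        pairs[left.strip()] = right.strip()
    return pairs


def retarget(pairs, changes):
    """Change only right-hand sides; never add or drop a left-hand key."""
    unknown = set(changes) - set(pairs)
    if unknown:
        raise KeyError(f"not an existing conversion pair: {sorted(unknown)}")
    updated = dict(pairs)
    updated.update(changes)
    return updated


pairs = parse_pairs(SAMPLE)
custom = retarget(pairs, {"M": "lips_together"})
print(custom)
```

The point of `retarget` is that the set of left-hand phonemes stays exactly the same before and after editing, which is what the note above asks for.
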
Limitation: you can only provide conversions for the CMU (Carnegie-Mellon University) phonemes used internally by Papagayo. The list of CMU phonemes can be found at http://www.speech.cs.cmu.edu/cgi-bin/cmudict
Vowel sounds have 3 versions, e.g. AA0, AA1, AA2, for no, primary, and secondary emphasis/stress. For example, abalone (the shellfish) is broken down as AE2 B AH0 L OW1 N IY0, indicating primary stress on the OW sound, secondary stress on the AE, and no stress on the other 2 vowel sounds. The language breakdown routines currently use only the no-stress version for most vowel sounds.
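
To make the stress digits concrete, here is a small Python sketch that reads a CMU-style breakdown string and names the stress on each vowel. The function names are made up for this example; only the 0/1/2 meaning comes from the description above.

```python
def split_stress(symbol):
    """Split a CMU symbol such as 'OW1' into ('OW', 1); consonants get None."""
    if symbol[-1].isdigit():
        return symbol[:-1], int(symbol[-1])
    return symbol, None


STRESS_NAMES = {0: "none", 1: "primary", 2: "secondary"}


def describe(breakdown):
    """Return (phoneme, stress name) for each vowel in a breakdown string."""
    vowels = []
    for symbol in breakdown.split():
        base, stress = split_stress(symbol)
        if stress is not None:
            vowels.append((base, STRESS_NAMES[stress]))
    return vowels


print(describe("AE2 B AH0 L OW1 N IY0"))
# [('AE', 'secondary'), ('AH', 'none'), ('OW', 'primary'), ('IY', 'none')]
```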

Many thanks to all my breakdown testers for their help and feedback!



Yes, I will be working on other languages, and I will need some feedback from native speakers.

My next goal is to provide language breakdown for Japanese (only for romaji, hiragana, and katakana; kanji will not be supported), Polish, and French (although I suspect French will be rather inaccurate). Don't hold your breath, it may take some time. Experimental releases will be provided when they are ready.

Regards, Myles.
"Quote me as saying I was mis-quoted."
-- Groucho Marx
7feet
Posts: 840
Joined: Wed Aug 04, 2004 5:45 am
Location: L.I., New Yawk.

Post by 7feet »

Very cool Myles, I'm sure it'll make a lot of people happy(er). Also, I put in a strong vote to keep the "hidden feature", that's something that I would certainly use. Being able to use the particular set of phonemes you want in the manual breakdown would be a nice add, though.
slowtiger
Posts: 6176
Joined: Thu Feb 16, 2006 6:53 pm
Location: Berlin, Germany

Post by slowtiger »

Now where can I find a Mac programmer, beat him down, drag him to my place and make him do what's necessary ...

See? Macs make you a criminal! *fg*

Thank you, Myles, I'll give you a report as soon as I'm able to make this thing run.
heyvern
Posts: 7035
Joined: Fri Sep 02, 2005 4:49 am

Post by heyvern »

What software or tools would be needed to compile this for Mac? Or linux?


-vern
jorgy
Posts: 779
Joined: Sun Sep 05, 2004 8:01 pm
Location: Colorado, USA

Post by jorgy »

I took a turn at trying to apply your patches for linux, but ran into a snag:

# python papagayo.py
Traceback (most recent call last):
  File "papagayo.py", line 24, in ?
    from LipsyncFrame import LipsyncFrame
  File "/home/c180391/papagayo/ORIG/papagayo_1.1_source/LipsyncFrame.py", line 26, in ?
    import lm
  File "/home/c180391/papagayo/ORIG/papagayo_1.1_source/lm.py", line 5, in ?
    import _lm
ImportError: No module named _lm
There is a file "_lm.dll" but that is for windows. Do you know how I create the _lm module for linux?

Thanks,

jorgy
myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Post by myles »

jorgy wrote:

    import _lm
ImportError: No module named _lm
There is a file "_lm.dll" but that is for windows. Do you know how I create the _lm module for linux?
Oops, sorry about that Jorgy! Grab the Linux version of Papagayo (which I think is in source code format), and copy my changed code over the top. I'm fairly sure it contains an _lm.so (the equivalent of the Windows .dll).
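
For anyone hitting the same ImportError, a quick way to see what the interpreter is looking for is sketched below. It uses the Python 3 standard library (`importlib`), not the Python 2.4 that this thread is about, and the function name is just for illustration.

```python
import importlib.machinery
import importlib.util


def extension_status(name):
    """Report whether an importable module called `name` can be found."""
    spec = importlib.util.find_spec(name)
    return {
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        # File endings this interpreter accepts for compiled modules,
        # e.g. '.so' on Linux or '.pyd' on Windows.
        "accepted_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


print(extension_status("_lm"))
```

If "found" is false and the only `_lm` file in the folder has a suffix that is not in the accepted list (a Windows library on a Linux box, say), that is the same situation as in the traceback above.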

Regards, Myles.
myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Post by myles »

heyvern wrote:What software or tools would be needed to compile this for Mac? Or linux?
Hi Vern,

on Macs you'll probably need a Python update (to the version that comes pre-installed with MacOS). See http://www.python.org/download/mac/

You'll also need wxPython, the cross-platform GUI library used by Papagayo. See http://pythonmac.org/packages/py24-fat/index.html and/or http://www.wxpython.org/download.php#binaries

You'll need the Lost Marble _lm sound-handling library - if it's not included in the source code, it should be with the binary version of Papagayo from Lost Marble - and possibly a libsndfile library (likewise, if it's not pre-installed on Mac systems - I'm a Mac ignoramus).

You don't need the following to run Papagayo from Python, only to build a double-clickable program that you can also distribute:

The MacPython update (above) may include a Python application bundler - I'm not sure of the exact details. I think it includes BundleBuilder - see http://pythonmac.org/wiki/BundleBuilder
Otherwise/Anyway, you'll need py2app (see http://pythonmac.org/wiki/py2app) - either using setuptools or download the source version from http://cheeseshop.python.org/pypi/py2app/

Instructions for setuptools or the source version at http://svn.pythonmac.org/py2app/py2app/ ... rom-source


For Linux, to get it to run you'll need to install Python (http://www.python.org/download/) if not already installed (upgrade to 2.4 if a previous version is installed), also wxPython (http://www.wxpython.org/download.php#sources, although there are packages for Debian, Ubuntu and some rpm-based systems), and the Linux version (Python source) of Papagayo from Lost Marble for the _lm.so library and the base code (copy my changed code over the top).

To create a distributable executable bundle you'll also need cxFreeze (http://www.python.net/crew/atuining/cx_Freeze/). Unless you are running Fedora Core 5, you'll possibly need to re-compile from source - the binaries provided require a somewhat older glibc 2.4 than is available on the live Linux system I play with.
Alternatively, you could try PyInstaller, at http://pyinstaller.hpcf.upr.edu/cgi-bin/trac.cgi

When I get the chance I'll be playing around compiling cxFreeze and trying PyInstaller myself.

Regards, Myles.
jorgy
Posts: 779
Joined: Sun Sep 05, 2004 8:01 pm
Location: Colorado, USA

Post by jorgy »

myles wrote:Oops, sorry about that Jorgy! Grab the Linux version of Papagayo (which I think is in source code format), and copy my changed code over the top. I'm fairly sure it contains an _lm.so (the equivalent of the Windows .dll).

Regards, Myles.
Bingo, that was it. It's now working, and I'll investigate the tools you mention in your other post to convert to a distributable form. I'll let you know how I progress.

jorgy
Lost Marble
Site Admin
Posts: 2354
Joined: Tue Aug 03, 2004 6:02 pm
Location: Scotts Valley, California, USA

Post by Lost Marble »

Thanks for your hard work, Myles. This should all get integrated into the next "official" build of Papagayo. For now, if you need to lip-sync one of the languages on the list, Myles's version is the way to go.
DarkCryst
Posts: 24
Joined: Mon Jul 24, 2006 9:36 pm

Post by DarkCryst »

Nice Myles, very nice.

My own improvements (speed, interface, etc) have kind of gone on hold as other work has got in the way. I'm currently looking at rewriting some of the code as a Python module for extra speed (the entire waveform view is slooow with large files, this should help a lot).

I'll see if I can fold these edits into my version too! :D
DarkCryst
Posts: 24
Joined: Mon Jul 24, 2006 9:36 pm

Post by DarkCryst »

btw - the hot dog thing? That's, in UK English and many US accents, pretty well described by the "AO2" CMU phoneme

It's all a matter of the lexical stress :)
myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Linux binary version ready

Post by myles »

Okay, I've added (amongst other things) a C/C++ compiler to my Linux setup, so I've made a binary version of cxFreeze that matches my system, so I've been able to bundle together a binary version of Papagayo for Linux.

What sort of Linux system doesn't come with a C/C++ compiler pre-installed? A 200MB (including KDE) base version running live from a 400MB partition of a 512MB USB stick. Yay Slax! Easiest live Linux system to customise I've found yet.

Will the binary version work on your Linux distribution? I don't know - isn't Linux fun? :)

You shouldn't need Python or wxPython, and you probably won't even need a Unicode-enabled wxGTK. However, you will probably need glibc version 2.8 or later - but I'd be surprised if most modern Linux distributions didn't include this (and if you run Slax, you can download 2.12 as a module).
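
If you do have Python on the target machine, the C library version can be checked with a few lines. This is only a sketch: `platform.libc_ver()` is a standard-library call, while the helper names are made up here, and the "2.8" threshold is simply the figure quoted above. Note that versions must be compared as numbers, since "2.12" is newer than "2.8" even though it sorts earlier as text.

```python
import platform


def version_tuple(text):
    """Turn '2.12' into (2, 12) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def libc_is_at_least(required, found=None):
    """True if the C library version `found` is >= `required`."""
    if found is None:
        # platform.libc_ver() returns (library name, version string);
        # the version is '' when it cannot be determined.
        _, found = platform.libc_ver()
    if not found:
        return False
    return version_tuple(found) >= version_tuple(required)


print(platform.libc_ver())
print(libc_is_at_least("2.8"))
```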

It unzips into a papagayoM folder (just in case you have some other Papagayo version already present). Run it using the shell-script .sh file (./papagayo.sh) - I've renamed it from just papagayo (no .sh), as it was supplied by Lost Marble, because cxFreeze turns papagayo.py into papagayo, overwriting the shell script file.

A note about path names - the code in Papagayo (all versions) uses the Latin-1 encoding to handle directories and filenames. This may potentially result in Papagayo not working properly if you are trying to open or save files with accented letters or diacritics or non-Roman letters in their names, or in the directory/folder paths.
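
The sketch below shows, in plain Python 3, two ways a Latin-1 assumption can go wrong with file names. It is not Papagayo's code; the function names and sample file names are invented for illustration.

```python
def fits_latin1(name):
    """True if every character of `name` exists in Latin-1."""
    try:
        name.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def misread_utf8_as_latin1(name):
    """Show what a UTF-8 file name looks like when decoded as Latin-1."""
    return name.encode("utf-8").decode("latin-1")


print(fits_latin1("café.wav"))         # True
print(fits_latin1("привет.wav"))       # False
print(misread_utf8_as_latin1("café"))  # cafÃ©
```

Non-Roman names cannot be represented in Latin-1 at all, and even an accented name that does fit can come out garbled if the file system stores it as UTF-8, which would explain trouble with both kinds of name.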

Regards, Myles.
myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Post by myles »

DarkCryst wrote:btw - the hot dog thing? That's, in UK English and many US accents, pretty well described by the "AO2" CMU phoneme

It's all a matter of the lexical stress :)
Thanks DarkCryst!

However, the CMU dictionary uses the AO2 phoneme for (as examples) the 'o' in glorification, the 'a' in softball, and the 'o' in therefore, all of which are a completely different sound to the short 'o' in hot dog, at least in Australian English, my native tongue, which I thought was closer to UK English in this respect than to US English.

I guess we just have to live with a certain amount of inaccuracy or variation even in English, let alone the very rough approximations of the sounds in some of the mainland European languages. :)

Regards, Myles.
myles
Posts: 821
Joined: Sat Aug 21, 2004 3:32 am
Location: Australia, Victoria, Morwell

Post by myles »

Another note about the Linux binary for Papagayo: I found on my Slax system I had to change the default sound system from threaded OSS to ALSA to get it loading and displaying sound files and waveforms. I also set a default auto-suspend time of 6 seconds to prevent the KDE sound system blocking programs that use the audio hardware directly, and turned off all the default system sounds - I don't know if either of these changes was important.

I only briefly tested loading projects and wav files and phonetic breakdown; it occurred to me just now that I didn't actually test exporting from the Linux binary version.

Regards, Myles
DarkCryst
Posts: 24
Joined: Mon Jul 24, 2006 9:36 pm

Post by DarkCryst »

late bump...

Australia, like the UK and USA, has many regional variations. Victoria vs Sydney accents for example ;)

So really you are just describing a generic English in these files. Something transatlantic would probably be the best middle ground. Or, for an Ozzie, think Kyle or how Mel Gibson sounded 15 years ago ;)