Christoph's CJK-centered concerns

Mandarin phonology with IPA (2)

Submitted by Christoph on 2 June, 2008 - 19:57

I want to add something to what I said in Mandarin phonology with IPA:

It seems both sources, Hànyǔ Pǔtōnghuà Yǔyīn Biànzhèng and Das neue chinesisch-deutsche Wörterbuch, make a mistake on at least two finals. I say "mistake" knowing full well that the transcription chosen depends heavily on one's own interpretation of IPA and/or of Standard Mandarin. Let me explain why.

-üan (in yuan, juan, quan, xuan, see Views on initials and finals of Mandarin in Pinyin) is written as [yan] and thus shares the phone [a] with -iao [iau], whereas in fact it is pronounced like -ian [iɛn], with [ɛ] (IPA only in source 1). I would therefore write it as [yɛn].
-iong (yong, jiong, qiong, xiong) is written as [yŋ] and therefore shares [y] with -ün [yn], but in fact it is pronounced like -ong [uŋ], so I would write [iuŋ].
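The two corrections can be kept apart from the book's table, so the conversion follows the source by default and the disputed finals are only swapped on request. This is a minimal sketch; the names SOURCE_FINALS, PROPOSED_OVERRIDES and final_to_ipa are hypothetical, and the table holds only the finals mentioned above.

```python
# Finals as transcribed in the sources (only those discussed in this post).
SOURCE_FINALS = {
    "ian": "iɛn",   # as given in source 1
    "iao": "iau",
    "üan": "yan",
    "ün": "yn",
    "ong": "uŋ",
    "iong": "yŋ",
}

# Transcriptions proposed in this post for the two disputed finals.
PROPOSED_OVERRIDES = {
    "üan": "yɛn",   # same nucleus [ɛ] as -ian
    "iong": "iuŋ",  # same nucleus [u] as -ong
}


def final_to_ipa(final, use_overrides=False):
    """Return the IPA string for a Pinyin final.

    By default the book's transcription is used; the proposed
    corrections are applied only when use_overrides is True.
    """
    if use_overrides and final in PROPOSED_OVERRIDES:
        return PROPOSED_OVERRIDES[final]
    return SOURCE_FINALS[final]


if __name__ == "__main__":
    for final in ("üan", "iong"):
        print(final, final_to_ipa(final), final_to_ipa(final, use_overrides=True))
```

Keeping the overrides in their own table means the book's data stays untouched and the deviation is easy to document.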

Compare to article Pinyin on the English Wikipedia.

I will still stick tightly to the book when implementing the conversion table, but might point out this problem.

Christoph's blog

Mandarin phonology with IPA

Submitted by Christoph on 24 May, 2008 - 01:18

I am currently trying to create a simple conversion from (Hanyu) Pinyin to IPA for the Standard Mandarin pronunciation Pinyin was created for.

This conversion will be built on a table-based mapping from Pinyin to IPA syllables and has some simple routines for coping with tone sandhi occurrences (see German Wikipedia on Mandarin tone sandhi).
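As an illustration of such a routine, here is a minimal sketch of the best-known case, third-tone sandhi, where a third tone directly before another third tone is realised as a second tone. It assumes tones are given as numbers 1 to 4 per syllable; the function name is hypothetical, and other sandhi cases (for example 一 and 不) as well as the phrase structure of longer third-tone chains are deliberately ignored.

```python
def apply_third_tone_sandhi(tones):
    """Return a copy of `tones` with third-tone sandhi applied.

    Every tone 3 that is directly followed by a tone 3 in the
    original sequence becomes tone 2. This is a simplification:
    real speech groups longer chains by phrase structure.
    """
    result = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            result[i] = 2
    return result


if __name__ == "__main__":
    # nǐ hǎo (3, 3) is pronounced ní hǎo (2, 3)
    print(apply_third_tone_sandhi([3, 3]))
```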

Apart from the tonal aspect, conversion of plain syllables is pretty much straightforward. Based on the initial/final structure of Mandarin syllables (see earlier post on Views on initials and finals of Mandarin in Pinyin) one can create a mapping to IPA with only a few rules: while for duan and tuan the final part uan is obviously the same, this is less obvious for wen and dun. In the latter case one has to note that wen is the written form of uen, where a syllable-initial u changes to a w, and dun is the shortened form of what should actually be duen. A set of rules will make sure these "writing variations" in Pinyin are taken care of, such that in the end only a mapping of initials and finals needs to be done.
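The normalisation described above could look roughly like the following sketch. It assumes toneless, lower-case syllables and applies the usual Pinyin spelling conventions (y/w for syllable-initial i/u/ü, u for ü after j, q, x, and the contracted finals iu, ui, un). The names INITIALS, CONTRACTED_FINALS and split_syllable are hypothetical and not taken from any library.

```python
# Two-letter initials come first so that "zh" is not read as "z" + "h...".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]

# Finals that Pinyin writes in shortened form after an initial.
CONTRACTED_FINALS = {"iu": "iou", "ui": "uei", "un": "uen"}


def split_syllable(syllable):
    """Split a toneless Pinyin syllable into (initial, full final)."""
    # Syllables without an initial: y and w only mark i, u or ü.
    if syllable.startswith("yu"):
        return "", "ü" + syllable[2:]      # yuan -> üan
    if syllable.startswith("yi"):
        return "", syllable[1:]            # yin -> in
    if syllable.startswith("y"):
        return "", "i" + syllable[1:]      # you -> iou
    if syllable == "wu":
        return "", "u"
    if syllable.startswith("w"):
        return "", "u" + syllable[1:]      # wen -> uen

    for initial in INITIALS:
        if syllable.startswith(initial):
            final = syllable[len(initial):]
            break
    else:
        initial, final = "", syllable      # e.g. "an", "er"

    # After j, q and x a written u always stands for ü.
    if initial in ("j", "q", "x") and final.startswith("u"):
        final = "ü" + final[1:]

    # Expand shortened finals: dun -> d + uen.
    final = CONTRACTED_FINALS.get(final, final)
    return initial, final


if __name__ == "__main__":
    for s in ("duan", "tuan", "wen", "dun", "yuan", "jiong"):
        print(s, split_syllable(s))
```

After this step wen and dun share the final uen, so a single table entry covers both.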

There are different works on IPA for the Mandarin language (I collect them here; additionally see the Wikipedias). As pronunciation is highly dependent on the region (i.e. dialect) and the speaker (i.e. individual variation), and as IPA seems to give you some freedom when choosing phonological descriptions, no two proposals for a syllable set seem to arrive at the same solution.

For me it was important to have a set I could give a source for, so I checked my few books on this:

Hànyǔ Pǔtōnghuà Yǔyīn Biànzhèng (汉语普通话语音辨正). Page 15, Běijīng Yǔyán Dàxué Chūbǎnshè (北京语言大学出版社), Beijing 2003, ISBN 7-5619-0622-6. - This is a book for learners of Mandarin Chinese. With the pronunciation it gives a set of training lessons.
Das neue chinesisch-deutsche Wörterbuch - 新汉德词典, Shāngwù Yìnshūguǎn (商务印书馆), Beijing 2003, ISBN 7-100-00096-3. - This Chinese-German dictionary includes a table on pronunciation of Pinyin in IPA.

Both have nearly the same way of using IPA:

Aspiration is given using an apostrophe, though it seems [ʰ] is the standard character for this.
d and t ([t]), b and p ([p]), g and k ([k]), z and c ([ts]), zh and ch ([tʂ]), j and q ([tɕ]) each share the same string, except that the latter of each pair adds the [‘] to mark the aspiration.
For final vowels an a is an [a], an o can mostly be [o] or turn to [u], e can have several forms depending on the context (e.g. single vowel, diphthong...), u is mostly [u].
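Since each aspirated initial is just its unaspirated partner plus a mark, the table only needs the six base strings. The sketch below assumes this pairing and lets the caller choose between [ʰ] and the apostrophe used in the books. The names are hypothetical and only the twelve initials listed above are covered.

```python
# IPA for the unaspirated member of each pair, as listed above.
UNASPIRATED_IPA = {
    "b": "p", "d": "t", "g": "k",
    "z": "ts", "zh": "tʂ", "j": "tɕ",
}

# Each aspirated initial and its unaspirated partner.
ASPIRATED_PARTNER = {
    "p": "b", "t": "d", "k": "g",
    "c": "z", "ch": "zh", "q": "j",
}


def initial_to_ipa(initial, aspiration_mark="ʰ"):
    """Map one of the twelve paired initials to IPA.

    Raises KeyError for initials outside these pairs.
    """
    if initial in UNASPIRATED_IPA:
        return UNASPIRATED_IPA[initial]
    return UNASPIRATED_IPA[ASPIRATED_PARTNER[initial]] + aspiration_mark


if __name__ == "__main__":
    print(initial_to_ipa("d"), initial_to_ipa("t"))
    print(initial_to_ipa("q", aspiration_mark="‘"))  # notation used in the books
```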

Differences in the usage of IPA:

Hànyǔ Pǔtōnghuà Yǔyīn Biànzhèng uses [ɤ] instead of [ə] for a single e vowel and states a rule for when this can change to [ə].
In the same book ou is [ou] instead of [əu], uo is [uo] instead of [uə], ian is [iɛn] instead of [ian].
Das neue chinesisch-deutsche Wörterbuch has phones for initials y and w ([j] and [w]) whereas my first book places 有 under [iou] without mentioning any special phones.

Using this data a mapping should be easy to implement. What seems to be left is to implement the sound-change rule for the final and single vowel e that Hànyǔ Pǔtōnghuà Yǔyīn Biànzhèng gives. Simply speaking, it states that a syllable like ge is pronounced differently depending on the tone: 哥哥 (older brother), for example, is [kɤkə].
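A minimal sketch of that rule follows. It assumes that the change to [ə] happens exactly in the neutral tone, which is how the 哥哥 example reads; the book's actual wording may be more detailed. Tone 5 stands for the neutral tone here, and the function name is hypothetical.

```python
NEUTRAL_TONE = 5  # assumption: tones 1-4 are full tones, 5 is neutral


def single_e_to_ipa(tone):
    """IPA for the single vowel e, depending on the syllable's tone.

    Assumed rule: [ə] in the neutral tone, [ɤ] otherwise.
    """
    return "ə" if tone == NEUTRAL_TONE else "ɤ"


if __name__ == "__main__":
    # 哥哥 gēge: first tone followed by neutral tone
    gege = "k" + single_e_to_ipa(1) + "k" + single_e_to_ipa(NEUTRAL_TONE)
    print(gege)
```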

Addendum: In Mandarin phonology with IPA (2) I point out two issues in the two sources that I believe to be errors.


id3encodingconverter

Submitted by Christoph on 20 April, 2008 - 23:28

Current project image for application id3encodingconverter.


Creating KDE4 apps with Python

Submitted by Christoph on 15 April, 2008 - 23:15

Creating the Python KDE4 application id3encodingconverter has me diving into PyKDE programming again, after the first version of EncodingConverter, a KDE Amarok plugin written with PyKDE 3. But it's the first time I have gone this deep into the world of Qt, KDE and PyKDE, and it takes some time to solve all the issues that come up.

I'll just list a few of them and hope this might help others having the same problems.

PyKDE4 still seems to have some issues, like menus not showing up or the ui compiler translating imports wrongly. Overall it does work, but it's worthwhile scanning the mailing list for issues brought up by other people, as the whole project was only released a few weeks ago.

setup.py from the Python distutils module will install your application. It's important though to tell the script where to place files. Here's a short excerpt of my script; it took me a few minutes to get all the settings needed for the different files.

We have to get the target directories for KDE4 data files. This uses kde4-config to query the local system's default directories. Not too nice, as we generate absolute directories here. For Debian packaging I still need to adapt this.

import os

# get target directories for kde4 files by asking kde4-config
# (returns absolute paths for the local system)
kde4DataTarget = os.popen("kde4-config --expandvars --install data")\
    .read().strip()
kde4DesktopTarget = os.popen("kde4-config --expandvars --install apps")\
    .read().strip()

To distribute the different files I added this to the setup method:

from distutils.core import setup

setup(name='id3encodingconverter',

    [...]

    # UI files generated by pykdeuic4 and other private libraries
    py_modules=['encoding', 'ngram', 'ID3EncodingConverterUI',
        'ID3EncodingConverterGuessingSetup'],
    # main application
    scripts=['id3encodingconverter'],
    # ui.rc, .desktop and documentation files
    data_files=[(os.path.join(kde4DataTarget, 'id3encodingconverter'),
            ['id3encodingconverterui.rc']),
        (os.path.join(kde4DesktopTarget, 'id3encodingconverter'),
            ['id3encodingconverter.desktop']),
        ('share/doc/id3encodingconverter/', ['TODO'])],)

.desktop file: If you want to have a menu entry in KDE or other window managers you need to create a .desktop file. Some information can be found under https://wiki.ubuntu.com/PackagingGuide/SupplementaryFiles#Desktop and http://standards.freedesktop.org/desktop-entry-spec/latest/. desktop-file-validate can scan your file and check if everything is ok.
Packaging your application seems worthwhile, as many users won't take the time to solve all the problems that arise from installing a source package without dependency handling. For me it took quite some time, as I described in Debian package for id3encodingconverter, but every major distribution should provide help on how to package KDE and Python apps.
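For the .desktop file mentioned above, the following sketch generates a minimal entry with Python's configparser. The keys Type, Name, Exec and Categories come from the Desktop Entry specification; the values are purely illustrative and are not the contents of the project's real id3encodingconverter.desktop. The function name is hypothetical.

```python
import configparser
import io


def build_desktop_entry(name, exec_cmd, categories="Qt;KDE;AudioVideo;"):
    """Return the text of a minimal .desktop file (illustrative values)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the capitalisation of the keys
    parser["Desktop Entry"] = {
        "Type": "Application",
        "Name": name,
        "Exec": exec_cmd,
        "Categories": categories,
    }
    buffer = io.StringIO()
    # Write "Key=Value" without spaces around the equals sign.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_desktop_entry("id3encodingconverter", "id3encodingconverter"))
```

The result should still be checked with desktop-file-validate before shipping it.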

Still expecting more issues to learn from. Stay tuned.


Debian package for id3encodingconverter

Submitted by Christoph on 15 April, 2008 - 02:11

Just finished the Debian package for id3encodingconverter. It is my first Debian package ever.

I used the following steps to create the package:

svn checkout http://id3encodingconverter.googlecode.com/svn/trunk/ id3encodingconverter-0.1alpha.svn20080415

svn checkout http://id3encodingconverter.googlecode.com/svn/wiki/ id3encodingconverter_wiki/

cp id3encodingconverter_wiki/TODO.wiki id3encodingconverter-0.1alpha.svn20080415/TODO

tar --exclude=".svn" -cvvf id3encodingconverter-0.1alpha.svn20080415.tar id3encodingconverter-0.1alpha.svn20080415

gzip id3encodingconverter-0.1alpha.svn20080415.tar

cd id3encodingconverter-0.1alpha.svn20080415

dh_make -e christoph.burgmer@stud.uni-karlsruhe.de -f ../id3encodingconverter-0.1alpha.svn20080415.tar.gz

# changes changes changes to files under debian/* here

dpkg-buildpackage -rfakeroot

debc # have a look

lintian -i ../id3encodingconverter_0.1alpha.svn20080415-1_i386.changes
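The tar and gzip steps above can also be done from Python, which fits a project that already has a setup.py. This is a minimal sketch using the standard tarfile module; it assumes the same directory layout as in the commands above, and the function names are hypothetical.

```python
import os
import tarfile


def skip_svn(tarinfo):
    """Filter for TarFile.add: drop .svn directories and their contents."""
    if ".svn" in tarinfo.name.split("/"):
        return None
    return tarinfo


def make_source_tarball(source_dir, tarball_path):
    """Pack source_dir into a gzip-compressed tarball without .svn data."""
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(tarball_path, "w:gz") as tar:
        tar.add(source_dir, arcname=top_level, filter=skip_svn)


if __name__ == "__main__":
    release = "id3encodingconverter-0.1alpha.svn20080415"
    if os.path.isdir(release):
        make_source_tarball(release, release + ".tar.gz")
```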
