PinyinDialectConverter --- Hanyu Pinyin dialects
================================================

Specifics
---------

Examples
^^^^^^^^
The following examples show how to convert between different representations
of Pinyin.

- Create the Converter and convert from standard Pinyin to Pinyin with
  tones represented by numbers:

    >>> from cjklib.reading import *
    >>> targetOp = operator.PinyinOperator(toneMarkType='numbers')
    >>> pinyinConv = converter.PinyinDialectConverter(
    ...     targetOperators=[targetOp])
    >>> pinyinConv.convert(u'hànzì', 'Pinyin', 'Pinyin')
    u'han4zi4'

- Convert Pinyin written with numbers, the ü (u with umlaut) replaced
  by character v and omitted fifth tone to standard Pinyin:

    >>> sourceOp = operator.PinyinOperator(toneMarkType='numbers',
    ...    yVowel='v', missingToneMark='fifth')
    >>> pinyinConv = converter.PinyinDialectConverter(
    ...     sourceOperators=[sourceOp])
    >>> pinyinConv.convert('nv3hai2zi', 'Pinyin', 'Pinyin')
    u'nǚháizi'

- Or more elegantly:

    >>> f = ReadingFactory()
    >>> f.convert('nv3hai2zi', 'Pinyin', 'Pinyin',
    ...     sourceOptions={'toneMarkType': 'numbers', 'yVowel': 'v',
    ...     'missingToneMark': 'fifth'})
    u'nǚháizi'

- Decompose the reading of a dictionary entry from CEDICT into syllables
  and convert the ü-vowel and forms of *Erhua sound*:

    >>> pinyinFrom = operator.PinyinOperator(toneMarkType='numbers',
    ...     yVowel='u:', Erhua='oneSyllable')
    >>> syllables = pinyinFrom.decompose('sun1nu:r3')
    >>> print syllables
    ['sun1', 'nu:r3']
    >>> pinyinTo = operator.PinyinOperator(toneMarkType='numbers',
    ...     Erhua='twoSyllables')
    >>> pinyinConv = converter.PinyinDialectConverter(
    ...     sourceOperators=[pinyinFrom], targetOperators=[pinyinTo])
    >>> pinyinConv.convertEntities(syllables, 'Pinyin', 'Pinyin')
    [u'sun1', u'nü3', u'r5']

- Or more elegantly with entities already decomposed:

    >>> f.convertEntities(['sun1', 'nu:r3'], 'Pinyin', 'Pinyin',
    ...     sourceOptions={'toneMarkType': 'numbers', 'yVowel': 'u:',
    ...        'Erhua': 'oneSyllable'},
    ...     targetOptions={'toneMarkType': 'numbers',
    ...        'Erhua': 'twoSyllables'})
    [u'sun1', u'nü3', u'r5']

- Fix cosmetic errors in Pinyin input (note tone mark and apostrophe):

    >>> f.convert(u"Wǒ peí nǐ qù Xīān.", 'Pinyin', 'Pinyin')
    u"Wǒ péi nǐ qù Xī'ān."

- Fix more errors in Pinyin input (note diacritics):

    >>> string = u"Wŏ peí nĭ qù Xīān."
    >>> dialect = operator.PinyinOperator.guessReadingDialect(string)
    >>> f.convert(string, 'Pinyin', 'Pinyin', sourceOptions=dialect)
    u"Wǒ péi nǐ qù Xī'ān."


Class
-----

.. currentmodule:: cjklib.reading.converter

.. autoclass:: cjklib.reading.converter.PinyinDialectConverter
   :show-inheritance:
   :members:
   :undoc-members: