cjklib.reading.operator.MandarinBrailleOperator is an implementation for phonetically transcribing Mandarin using the Braille system.
In Braille, the fifth tone of Mandarin Chinese is written without a tone mark, so an unmarked entity becomes ambiguous whenever entities without tonal information are mixed in. Because Braille seems to be frequently written with tone marks omitted where they are unnecessary, the option missingToneMark, which controls how an absent tone mark is interpreted, defaults to 'extended'. This setting allows entities with the fifth tone and entities with no tone to be mixed. If lossless conversion is needed, set this option to 'fifth', which forbids entities without tonal information.
A small trick to display Braille output in an easily readable form on a normal screen:
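The effect of the two missingToneMark settings can be illustrated with a small standalone sketch. It does not use cjklib: the function name read_tone, the tone mark table (the marks commonly used in Mainland Chinese Braille for tones 1 to 4), and the choice to return None for an undetermined tone are all assumptions made for illustration.

```python
# Assumed tone marks for tones 1-4 (dots 1, 2, 3 and 23); the fifth tone
# carries no mark. This table is illustrative and not taken from cjklib.
TONE_MARKS = {'\u2801': 1, '\u2802': 2, '\u2804': 3, '\u2806': 4}


def read_tone(entity, missing_tone_mark='extended'):
    """Return the tone of one Braille syllable, or None if it is unknown.

    Hypothetical helper: 'fifth' treats an unmarked syllable as fifth tone,
    'extended' leaves it undetermined (fifth tone or no tonal information).
    """
    if missing_tone_mark not in ('extended', 'fifth'):
        raise ValueError('unknown option value: %r' % (missing_tone_mark,))
    # A trailing tone mark settles the question in both modes.
    if entity[-1:] in TONE_MARKS:
        return TONE_MARKS[entity[-1:]]
    # No tone mark present: the answer depends on the option.
    if missing_tone_mark == 'fifth':
        return 5
    return None


yi_first_tone = '\u280a\u2801'   # syllable followed by the assumed tone 1 mark
ni_unmarked = '\u281d\u280a'     # syllable written without any tone mark

print(read_tone(yi_first_tone))         # marked: same result in both modes
print(read_tone(ni_unmarked, 'fifth'))  # unmarked read as fifth tone
print(read_tone(ni_unmarked))           # unmarked stays undetermined
```

In this sketch, 'fifth' gives every syllable a definite tone, which is what makes a round trip lossless, while 'extended' accepts text with omitted tone marks at the cost of not knowing whether an unmarked syllable is really in the fifth tone.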
>>> import unicodedata
>>> input = u'⠅⠡ ⠝⠊ ⠙⠼ ⠊⠁⠓⠫⠰⠂'
>>> [unicodedata.name(char).replace('BRAILLE PATTERN DOTS-', 'P') \
... for char in input]
['P13', 'P16', 'SPACE', 'P1345', 'P24', 'SPACE', 'P145', 'P3456', 'SPACE', 'P24', 'P1', 'P125', 'P1246', 'P56', 'P2']
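The same trick can be wrapped in a reusable helper, together with its inverse. This is a minimal Python 3 sketch using only the standard library; the function names braille_dot_names and braille_from_dots are hypothetical and not part of cjklib. The inverse relies on the layout of the Unicode Braille Patterns block, where dot n corresponds to bit n-1 of the offset from U+2800.

```python
import unicodedata


def braille_dot_names(text):
    """Map each character to a short name, e.g. 'P13' for dots 1 and 3."""
    return [unicodedata.name(char).replace('BRAILLE PATTERN DOTS-', 'P')
            for char in text]


def braille_from_dots(dots):
    """Build one Braille character from a dot string such as '13'."""
    # Dot n is stored as bit n-1 above the block start U+2800.
    offset = sum(1 << (int(dot) - 1) for dot in dots)
    return chr(0x2800 + offset)


print(braille_dot_names('\u2805\u2821 \u281d\u280a'))
print(braille_from_dots('1246') == '\u282b')
```

Note that unicodedata.name() raises ValueError for characters without a Unicode name, such as a newline, so input containing control characters needs to be filtered first.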
Bases: cjklib.reading.operator.ReadingOperator
Provides an operator on strings written in the Braille system for Mandarin.
Todo
Parameters: |
|
---|
Composes the given list of basic entities to a string.
No special treatment is given to consecutive Braille entities; they are joined as they are. Use getSpaceSeparatedEntities() to insert spaces between two Braille syllables.
Parameter: | readingEntities (list of str) – list of basic entities or other content |
---|---|
Return type: | str |
Returns: | composed entities |
Decomposes the given string into basic entities that can be mapped to one Chinese character each (exceptions possible).
The given input string can contain other, non-reading characters, e.g. punctuation marks.
The returned list contains a mix of basic reading entities and other characters, e.g. spaces and punctuation marks.
Parameter: | readingString (str) – reading string |
---|---|
Return type: | list of str |
Returns: | a list of basic entities of the input string |
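The shape of the returned list (reading material interleaved with other characters) can be sketched with a rough first pass that only separates runs of Braille pattern characters from everything else. This is not the library's algorithm: the name split_braille_runs is hypothetical, and a run is only a single syllable if the input already has syllables separated by spaces. Splitting an unbroken run into syllables would need the syllable table and is not attempted here.

```python
import re

# Either a run of Braille pattern characters (U+2800..U+28FF)
# or a run of anything else (spaces, Latin punctuation, ...).
_RUNS = re.compile('[\u2800-\u28ff]+|[^\u2800-\u28ff]+')


def split_braille_runs(reading_string):
    """Split a string into alternating Braille and non-Braille runs."""
    return _RUNS.findall(reading_string)


parts = split_braille_runs('\u2805\u2821 \u281d\u280a, ')
print(parts)
# Joining the parts again reproduces the input unchanged.
print(''.join(parts) == '\u2805\u2821 \u281d\u280a, ')
```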
Splits the given plain syllable into onset (initial) and rhyme (final).
Parameter: | plainSyllable (str) – syllable without tone marks |
---|---|
Return type: | tuple of str |
Returns: | tuple of syllable onset and rhyme |
Raises InvalidEntityError: | if the entity is invalid. |
Inserts spaces between two Braille entities for a given list of reading entities.
In the Braille system, spaces are placed between words. This is not reflected here; instead, a space is added between individual syllables.
Parameter: | readingEntities (list of str) – list of basic entities or other content |
---|---|
Return type: | list of str |
Returns: | entities with spaces inserted between Braille sequences |
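The behaviour described above can be approximated in a few lines. The following is a standalone sketch, not cjklib's implementation: the helper names is_braille and space_separated are hypothetical, and "Braille entity" is assumed to mean any non-empty string made up only of characters from the Unicode Braille Patterns block.

```python
def is_braille(entity):
    """True if the entity is non-empty and consists only of Braille patterns."""
    return bool(entity) and all('\u2800' <= char <= '\u28ff' for char in entity)


def space_separated(entities):
    """Insert a space wherever two Braille entities directly follow each other."""
    result = []
    for entity in entities:
        if result and is_braille(result[-1]) and is_braille(entity):
            result.append(' ')
        result.append(entity)
    return result


entities = ['\u2805\u2821', '\u281d\u280a', ',', '\u2819\u283c']
spaced = space_separated(entities)
print(spaced)
# Plain concatenation then yields the final string.
print(''.join(spaced))
```

Because only directly adjacent Braille entities are separated, other content such as the comma in the example is passed through without extra spaces.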
Gets the entity with tone mark for the given plain entity and tone.
Parameters: |
|
---|---|
Return type: | str |
Returns: | entity with appropriate tone |
Raises InvalidEntityError: | if the entity is invalid. |
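Attaching a tone to a plain entity can be sketched as appending a tone mark. The sketch below is standalone and does not call cjklib: the name add_tone_mark is hypothetical, the tone mark table reflects the marks commonly used in Mainland Chinese Braille and is an assumption here, and ValueError stands in for the library's InvalidEntityError.

```python
# Assumed tone marks (dots 1, 2, 3 and 23); the fifth tone has no mark.
# Illustrative table only, not taken from cjklib.
TONE_MARKS = {1: '\u2801', 2: '\u2802', 3: '\u2804', 4: '\u2806', 5: ''}


def add_tone_mark(plain_entity, tone):
    """Return the plain entity followed by the mark for the given tone.

    A tone of None means "no tonal information" and leaves the entity as is.
    """
    if tone is None:
        return plain_entity
    if tone not in TONE_MARKS:
        raise ValueError('unsupported tone: %r' % (tone,))
    return plain_entity + TONE_MARKS[tone]


plain = '\u280a'
print(add_tone_mark(plain, 1) == '\u280a\u2801')
# Fifth tone and "no tone" produce the same output, which is the
# ambiguity that the missingToneMark option has to resolve on input.
print(add_tone_mark(plain, 5) == add_tone_mark(plain, None))
```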
Returns a set of tones supported by the reading.
Return type: | set |
---|---|
Returns: | set of supported tone marks. |
Splits the entity into an entity without tone mark and the name of the entity’s tone.
Parameter: | entity (str) – entity with tonal information |
---|---|
Return type: | tuple |
Returns: | plain entity without tone mark, and the tone |
Raises InvalidEntityError: | if the entity is invalid. |