cjklib.util — Utilities

Utilities.

Functions

cjklib.util.cachedproperty(fget)
Decorates a property to memoize its value.
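
The memoizing idea can be illustrated with a minimal sketch. This is not cjklib's implementation; the names `memoized_property` and `Circle` are hypothetical, and the value is assumed to be cached per instance.

```python
import functools


def memoized_property(fget):
    """Hypothetical stand-in for a memoizing property decorator."""
    cache_name = '_cached_' + fget.__name__

    @functools.wraps(fget)
    def getter(self):
        # Compute the value on first access, then reuse the stored result.
        if not hasattr(self, cache_name):
            setattr(self, cache_name, fget(self))
        return getattr(self, cache_name)

    return property(getter)


class Circle(object):
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often the getter body really runs

    @memoized_property
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2


circle = Circle(2)
first = circle.area
second = circle.area  # served from the cache, the getter body is not re-run
```
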
cjklib.util.cross(*args)

Builds a cross product of the given lists.

Example:
>>> cross(['A', 'B'], [1, 2, 3])
[['A', 1], ['A', 2], ['A', 3], ['B', 1], ['B', 2], ['B', 3]]
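
For comparison, the same result can be produced with the standard library. This is a sketch of the idea, not the function's actual implementation; `cross_sketch` is a hypothetical name.

```python
import itertools


def cross_sketch(*args):
    # itertools.product yields tuples; convert each one to a list to match
    # the list-of-lists shape shown in the example above.
    return [list(combo) for combo in itertools.product(*args)]


result = cross_sketch(['A', 'B'], [1, 2, 3])
```
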
cjklib.util.crossDict(*args)
Builds a cross product of the given dicts.
cjklib.util.deprecated(func)
Decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.
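
A decorator of this kind is commonly built on the standard `warnings` module. The following is a minimal sketch under that assumption; `deprecated_sketch` and `old_add` are hypothetical names, and the warning category and message used by cjklib are not stated on this page.

```python
import functools
import warnings


def deprecated_sketch(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # stacklevel=2 attributes the warning to the caller of the
        # deprecated function rather than to this wrapper.
        warnings.warn("Call to deprecated function %s." % func.__name__,
                      category=DeprecationWarning, stacklevel=2)
        return func(*args, **kwargs)
    return wrapper


@deprecated_sketch
def old_add(a, b):
    return a + b


# Record warnings so the effect can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_add(1, 2)
```
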
cjklib.util.fromCodepoint(codepoint)

Creates a character for a Unicode codepoint similar to unichr.

For Python narrow builds this function does not raise a ValueError for characters outside the BMP but returns a string with a UTF-16 surrogate pair of two characters.
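
The surrogate pair for a codepoint outside the BMP follows from the UTF-16 encoding rules. The sketch below shows only that arithmetic and returns the two code units as integers; `to_surrogate_pair` is a hypothetical helper, not part of cjklib.

```python
def to_surrogate_pair(codepoint):
    """Return the (high, low) UTF-16 code units for a non-BMP codepoint."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint is not outside the BMP")
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits of the offset
    return high, low


pair = to_surrogate_pair(0x20000)  # U+20000, first CJK Extension B character
```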

See also

PEP 261

cjklib.util.getCharacterList(string)
Splits a string of characters into a list of single characters. UTF-16 surrogate pairs are parsed so that each pair is treated as one character.
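
A minimal sketch of surrogate-aware splitting, written for Python 3 and not taken from cjklib, might look as follows. `split_characters` is a hypothetical name, and the sketch assumes that a high surrogate followed by a low surrogate should be kept together as one list item.

```python
def split_characters(string):
    characters = []
    i = 0
    while i < len(string):
        char = string[i]
        # A high surrogate directly followed by a low surrogate forms one
        # character; keep both code units together.
        if ('\ud800' <= char <= '\udbff' and i + 1 < len(string)
                and '\udc00' <= string[i + 1] <= '\udfff'):
            characters.append(string[i:i + 2])
            i += 2
        else:
            characters.append(char)
            i += 1
    return characters


parts = split_characters('a\ud840\udc00b')
```
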
cjklib.util.getConfigSettings(section, projectName='cjklib')

Reads the configuration from the given section of the project’s config file.

Parameters:
  • section (str) – section of the config file
  • projectName (str) – name of project which will be used as name of the config file
Return type:

dict

Returns:

configuration settings for the given project
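
As a rough illustration of turning one section of a config file into a dict, here is a sketch that assumes an INI-style file and uses the standard `configparser` module (Python 3). The section name `Example`, its keys, and the helper `read_section` are hypothetical; the actual file format and lookup rules are not described on this page.

```python
import configparser

# Hypothetical config content, inlined so the sketch is self-contained.
SAMPLE = """
[Example]
color = blue
size = 10
"""


def read_section(text, section):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        return {}
    # items() yields (option, value) pairs; all values are strings.
    return dict(parser.items(section))


settings = read_section(SAMPLE, 'Example')
```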

cjklib.util.getDataPath()

Gets the path to packaged data.

Return type: str
Returns: path
cjklib.util.getSearchPaths(projectName='cjklib')

Gets a list of search paths for the given project.

Parameter: projectName (str) – name of project
Return type: list
Returns: list of search paths
cjklib.util.istitlecase(strng)

Checks if the given string is in titlecase.

Parameter: strng (str) – a string
Return type: bool
Returns: True if the given string is in titlecase according to titlecase().
cjklib.util.isValidSurrogate(string)

Returns True if the given string is a single surrogate pair.

Always returns False for wide builds.

cjklib.util.locateProjectFile(relPath, projectName='cjklib')

Locates a project file relative to the project’s directory. Returns None if module pkg_resources is not installed or package information is not available.

Parameters:
  • relPath (str) – path relative to project directory
  • projectName (str) – name of project which will be used as name of the config file
cjklib.util.titlecase(strng)

Returns the string (without “word borders”) in titlecase.

This function is not designed to work for multi-entity strings in general, but rather for syllables with apostrophes (e.g. 'Ch’ien1') and combining diacritics (e.g. 'Hm\u0300h'). It additionally needs to support cases where a multi-entity string can derive from a single entity, as is the case for Gwoyeu Romatzyh (GR), e.g. 'Shern.me' for 'Sherm'.

Parameter: strng (str) – a string
Return type: str
Returns: the given string in titlecase
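
To show why a dedicated function is useful here, the sketch below capitalizes only the first character and lowercases the rest, so apostrophes and combining marks do not start a new word. It is an assumption about the intended behaviour, not cjklib's code; `titlecase_sketch` is a hypothetical name.

```python
def titlecase_sketch(strng):
    # Treat the whole string as one entity: uppercase the first character,
    # lowercase everything after it. Slicing keeps the empty string safe.
    return strng[:1].upper() + strng[1:].lower()


wade_giles = titlecase_sketch('ch\u2019ien1')   # syllable with apostrophe
diacritic = titlecase_sketch('hm\u0300h')       # combining grave accent
```

By contrast, the built-in `str.title()` restarts capitalization after every non-letter, which is why it is unsuitable for such syllables.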

Todo

  • Impl: This function is only needed as long as Python doesn’t ship with a proper title casing algorithm as defined by Unicode. Until then, proper handling is needed for Wade-Giles, because Pinyin Erhua forms convert to two entities separated by a hyphen, which does not fall under the Unicode title casing algorithm’s definition of a case-ignorable character.
cjklib.util.toCodepoint(char)

Returns the Unicode codepoint for this character similar to ord.

This function can handle surrogate pairs as used by narrow builds.

Raises ValueError:
 if the string is not a single char or not a valid surrogate pair
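
The described behaviour can be sketched as follows, assuming the input is either one character or a two-unit surrogate pair. `codepoint_of` is a hypothetical name and the error message is illustrative only.

```python
def codepoint_of(char):
    if len(char) == 1:
        return ord(char)
    if len(char) == 2:
        high, low = ord(char[0]), ord(char[1])
        # Combine a high/low surrogate pair into a single codepoint.
        if 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF:
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    raise ValueError("expected a single character or a valid surrogate pair")


bmp = codepoint_of('A')
supplementary = codepoint_of('\ud840\udc00')
```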

Classes

class cjklib.util.CharacterRangeIterator(ranges)

Bases: object

Iterates over a given set of codepoint ranges given in hex.

next()
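
The exact format of `ranges` is not given on this page. The sketch below assumes inclusive `(start, end)` pairs of hexadecimal strings and yields characters; both choices, and the name `iterate_ranges`, are assumptions made for illustration (Python 3).

```python
def iterate_ranges(ranges):
    for start, end in ranges:
        # int(value, 16) parses the hexadecimal codepoint; the end of each
        # range is treated as inclusive.
        for codepoint in range(int(start, 16), int(end, 16) + 1):
            yield chr(codepoint)


chars = list(iterate_ranges([('4E00', '4E02'), ('3007', '3007')]))
```
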
class cjklib.util.CollationString(length=None, collation=None, **kwargs)

Bases: cjklib.util._CollationMixin, sqlalchemy.types.String

Construct a VARCHAR.

Parameter: collation – Optional, a column-level collation for this string value.
get_col_spec()
class cjklib.util.CollationText(length=None, collation=None, **kwargs)

Bases: cjklib.util._CollationMixin, sqlalchemy.types.Text

Construct a TEXT.

Parameter: collation – Optional, a column-level collation for this string value.
get_col_spec()
class cjklib.util.DictMixin
clear()
get(key, default=None)
has_key(key)
items()
iteritems()
iterkeys()
itervalues()
pop(key, *args)
popitem()
setdefault(key, default=None)
update(other=None, **kwargs)
values()
class cjklib.util.ExtendedOption(*opts, **attrs)

Bases: optparse.Option

Extends optparse by adding:

  • bool type: a boolean that is set explicitly to True or False, rather than switched one way only by the presence of a flag
  • path type, a list of paths given in one string separated by a colon ':'
  • extend action that resets a default value for user specified options
  • append action that resets a default value for user specified options
check_bool(option, opt, value)
check_pathstring(option, opt, value)
take_action(action, dest, opt, value, values, parser)
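
Adding a custom type to `optparse` follows the pattern from the standard library documentation: extend `TYPES` and register a checker in `TYPE_CHECKER`. The sketch below shows only a `bool` type and is not cjklib's code; `BoolOption`, the standalone `check_bool`, and the `--verbose` option are hypothetical, and accepting exactly the strings `True` and `False` is an assumption.

```python
from copy import copy
from optparse import Option, OptionParser, OptionValueError


def check_bool(option, opt, value):
    # Convert the command line string into a real boolean.
    if value == 'True':
        return True
    if value == 'False':
        return False
    raise OptionValueError(
        "option %s: invalid bool value: %r" % (opt, value))


class BoolOption(Option):
    TYPES = Option.TYPES + ('bool',)
    # Copy the checker table so the base class is left untouched.
    TYPE_CHECKER = copy(Option.TYPE_CHECKER)
    TYPE_CHECKER['bool'] = check_bool


parser = OptionParser(option_class=BoolOption)
parser.add_option('--verbose', type='bool', default=False)

options, args = parser.parse_args(['--verbose', 'True'])
defaults, _ = parser.parse_args([])
```
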
class cjklib.util.LazyDict(creator, *args)

Bases: dict

A dict that will load entries on-demand.
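
On-demand loading can be sketched with the `__missing__` hook of `dict`. This is not the class's actual implementation; `LazyDictSketch` is a hypothetical name, and the sketch assumes that `creator` is called with the missing key and that its result is stored.

```python
class LazyDictSketch(dict):
    def __init__(self, creator, *args):
        dict.__init__(self, *args)
        self.creator = creator

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent:
        # create the value, store it, and return it.
        value = self.creator(key)
        self[key] = value
        return value


calls = []


def make_square(key):
    calls.append(key)
    return key * key


squares = LazyDictSketch(make_square)
value = squares[4]
again = squares[4]  # already stored, the creator is not called again
```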

class cjklib.util.OrderedDict(*args, **kwds)

Bases: dict, UserDict.DictMixin

clear()
copy()
classmethod fromkeys(iterable, value=None)
items()
iteritems()
iterkeys()
itervalues()
keys()
pop(key, *args)
popitem(last=True)
setdefault(key, default=None)
update(other=None, **kwargs)
values()
class cjklib.util.UnicodeCSVFileIterator(fileHandle)

Bases: object

Provides a CSV file iterator supporting Unicode.

class DefaultDialect

Bases: csv.Dialect

Defines a default dialect for the case that sniffing fails.

static UnicodeCSVFileIterator.byte_string_dialect(dialect)
UnicodeCSVFileIterator.next()
static UnicodeCSVFileIterator.utf_8_encoder(unicode_csv_data)
class cjklib.util._CollationMixin(collation=None, **kwargs)

Bases: object

Parameter: collation – Optional, a column-level collation for this string value.
get_search_list()
class cjklib.util.cachedmethod(fget)

Bases: object

Decorate a method to memoize its return value. Only applicable for methods without arguments.
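
Because the signature shows a class taking `fget`, one plausible way to build such a decorator is a descriptor. The sketch below follows that assumption and is not cjklib's code; `cached_method_sketch`, `Report`, and the `_method_cache` attribute are hypothetical.

```python
class cached_method_sketch(object):
    def __init__(self, fget):
        self.fget = fget
        self.name = fget.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Results are stored per instance, keyed by method name.
        cache = instance.__dict__.setdefault('_method_cache', {})

        def method():
            if self.name not in cache:
                cache[self.name] = self.fget(instance)
            return cache[self.name]

        return method


class Report(object):
    def __init__(self):
        self.calls = 0

    @cached_method_sketch
    def build(self):
        self.calls += 1
        return 'report body'


report = Report()
first = report.build()
second = report.build()  # returned from the cache
```

The wrapped callable takes no arguments, which mirrors the restriction stated above: without arguments, a single stored value per method is enough.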
