cjklib.build — Build database

Builds the library’s database.

Each table to be created is implemented by subclassing TableBuilder. The DatabaseBuilder is the central instance that manages the build process. Because creating a table can depend on other tables, the DatabaseBuilder keeps track of dependencies so that tables are built in the correct order.

Building is tested on the following storage methods:

  • SQLite
  • MySQL

Examples

The following examples give a quick overview of how to use this package.

  • Create the DatabaseBuilder object with default database settings (read from cjklib.conf, or using cjklib.db in the same directory by default):

    >>> from cjklib import build
    >>> dbBuilder = build.DatabaseBuilder(dataPath=['./cjklib/data/'])
    Removing conflicting builder(s) 'StrokeCountBuilder' in favour of 'CombinedStrokeCountBuilder'
    Removing conflicting builder(s) 'CharacterResidualStrokeCountBuilder' in favour of 'CombinedCharacterResidualStrokeCountBuilder'
    
  • Build the table of Jyutping syllables from a csv file:

    >>> dbBuilder.build(['JyutpingSyllables'])
    building table 'JyutpingSyllables' with builder
    'JyutpingSyllablesBuilder'...
    Reading table definition from file './cjklib/data/jyutpingsyllables.sql'
    Reading table 'JyutpingSyllables' from file
    './cjklib/data/jyutpingsyllables.csv'
    

Functions

cjklib.build.warn(message)

Prints the given message to stderr with the system’s default encoding.

Parameter:message (str) – message to print

Classes

class cjklib.build.DatabaseBuilder(**options)

DatabaseBuilder is the main class for building the database for the cjklib package.

It contains all TableBuilder classes and a dependency graph to handle build requests.

To modify the behaviour of TableBuilder instances, global or local options can be specified, see getBuilderOptions().

Parameters:
  • databaseUrl – database connection setting in the format driver://user:pass@host/database.
  • dbConnectInst – instance of a DatabaseConnector
  • dataPath – optional list of paths to the data file(s)
  • quiet – if True no status information will be printed to stderr
  • rebuildDepending – if True existing tables that depend on updated tables will be dropped and built from scratch
  • rebuildExisting – if True existing tables will be dropped and built from scratch
  • noFail – if True build process won’t terminate even if one table fails to build
  • prefer – list of TableBuilder names to prefer in conflicting cases
  • additionalBuilders – list of externally provided TableBuilders
Raises ValueError:if two different options from two different builders collide.
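
The databaseUrl format can be illustrated with a short sketch. It is not part of cjklib: split_database_url is a hypothetical helper, the URL values are made up, and the code only shows how the pieces of driver://user:pass@host/database map onto a URL when split with Python 3's standard library.

```python
from urllib.parse import urlsplit


def split_database_url(url):
    """Split a driver://user:pass@host/database string into its parts.

    Hypothetical helper for illustration; cjklib may parse the URL
    differently.
    """
    parts = urlsplit(url)
    return {
        'driver': parts.scheme,
        'user': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'database': parts.path.lstrip('/'),
    }


# Made-up connection values, used only to show the mapping.
info = split_database_url('mysql://reader:secret@localhost/cjklib')
print(info)
```

A string in this form could then be passed as the databaseUrl option when creating the DatabaseBuilder.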

build(tables)

Builds the given tables.

Parameter:tables (list) – list of tables to build
Raises IOError:if a table builder fails to read its data; only if noFail is set to False
clearTemporary()

Removes all tables that were built only temporarily to satisfy build dependencies. This method is called before build() terminates. If the build process is interrupted (e.g. by the user pressing Ctrl+C), call this method to make sure that these temporary tables are removed and not included in later builds.
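
The cleanup advice for interrupted builds can be sketched as follows. StubBuilder is a hypothetical stand-in that only mimics the build() and clearTemporary() methods described here, and build_safely is an illustrative wrapper, not a cjklib function.

```python
class StubBuilder:
    """Hypothetical stand-in for a DatabaseBuilder (illustration only)."""

    def __init__(self):
        self.temporary = []
        self.cleaned = False

    def build(self, tables):
        # Pretend a temporary dependency table was created, then
        # simulate the user pressing Ctrl+C in the middle of the build.
        self.temporary.append('TemporaryDependency')
        raise KeyboardInterrupt()

    def clearTemporary(self):
        self.temporary = []
        self.cleaned = True


def build_safely(db_builder, tables):
    """Run a build and clean up temporary tables if it is interrupted."""
    try:
        db_builder.build(tables)
        return True
    except KeyboardInterrupt:
        # build() did not get the chance to clean up, so do it here.
        db_builder.clearTemporary()
        return False


stub = StubBuilder()
finished = build_safely(stub, ['JyutpingSyllables'])
print(finished, stub.temporary)
```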
static getBuildDependencyOrder(tableBuilderClasses)

Determines the order in which the tables have to be created.

Parameter:tableBuilderClasses (list of classobj) – list of TableBuilder classes
Return type:list of classobj
Returns:the given classes ordered in build dependency order
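
Build dependency ordering is essentially a topological sort. The sketch below shows the idea with a depth-first traversal. The builder classes and the PROVIDES and DEPENDS attribute names are assumptions made for this example and may not match how cjklib's TableBuilder classes declare their tables and dependencies.

```python
# Illustrative only: hypothetical builder classes, not cjklib's own.
class Builder:
    PROVIDES = None   # name of the table this builder creates
    DEPENDS = []      # names of tables that must exist first


class SyllablesBuilder(Builder):
    PROVIDES = 'Syllables'


class InitialFinalBuilder(Builder):
    PROVIDES = 'InitialFinal'
    DEPENDS = ['Syllables']


class MappingBuilder(Builder):
    PROVIDES = 'Mapping'
    DEPENDS = ['Syllables', 'InitialFinal']


def build_order(builder_classes):
    """Return the classes ordered so that dependencies come first."""
    by_table = {cls.PROVIDES: cls for cls in builder_classes}
    ordered, done, in_progress = [], set(), set()

    def visit(table):
        if table in done:
            return
        if table in in_progress:
            raise ValueError('circular dependency at %r' % table)
        in_progress.add(table)
        for dep in by_table[table].DEPENDS:
            if dep in by_table:   # ignore tables outside this request
                visit(dep)
        in_progress.discard(table)
        done.add(table)
        ordered.append(by_table[table])

    for cls in builder_classes:
        visit(cls.PROVIDES)
    return ordered


order = build_order([MappingBuilder, InitialFinalBuilder, SyllablesBuilder])
print([cls.PROVIDES for cls in order])
```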
getBuildDependentTables(tableNames)

Gets the names of the tables that need to be built to resolve dependencies.

Parameter:tableNames (list of str) – list of tables to build
Return type:list of str
Returns:names of tables needed to resolve dependencies
getBuilderOptions(builderClass, ignoreUnknown=False)

Gets a dictionary of options for the given builder that were specified to the DatabaseBuilder.

Options included are global options understood by the builder (e.g. 'dataPath') or local options given in the formats '--BuilderClassName-option' or '--TableName-option'. For example '--Unihan-wideBuild' sets the option 'wideBuild' for all builders providing the Unihan table. '--BuilderClassName-option' has precedence over '--TableName-option'.

Parameters:
  • builderClass (classobj) – TableBuilder class
  • ignoreUnknown (bool) – if set to True unknown options will be ignored, otherwise a ValueError is raised.
Return type:dict
Returns:dictionary of options for the given table builder.
Raises ValueError:if unknown option is specified and ignoreUnknown is False
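
The precedence rule between the two local option formats can be sketched like this. resolve_options is a hypothetical function written for this example, and 'UnihanBuilder' is used as an assumed builder class name. The sketch only mirrors the documented rule that '--BuilderClassName-option' overrides '--TableName-option', not cjklib's actual implementation.

```python
def resolve_options(all_options, builder_name, table_name, known,
                    ignore_unknown=False):
    """Collect the options that apply to one builder (illustrative)."""
    resolved = {}
    # Global options apply only when the builder understands them.
    for key, value in all_options.items():
        if not key.startswith('--') and key in known:
            resolved[key] = value
    # Apply the table-level form first and the class-level form second,
    # so that '--BuilderClassName-option' overrides '--TableName-option'.
    for prefix in ('--%s-' % table_name, '--%s-' % builder_name):
        for key, value in all_options.items():
            if not key.startswith(prefix):
                continue
            option = key[len(prefix):]
            if option in known:
                resolved[option] = value
            elif not ignore_unknown:
                raise ValueError('unknown option %r' % key)
    return resolved


options = {
    'dataPath': ['./cjklib/data/'],
    'quiet': True,                       # not understood by this builder
    '--Unihan-wideBuild': False,         # table-level setting
    '--UnihanBuilder-wideBuild': True,   # class-level setting wins
}
result = resolve_options(options, 'UnihanBuilder', 'Unihan',
                         known={'dataPath', 'wideBuild'})
print(result)
```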

getClassesInBuildOrder(tableNames)

Gets the build order for the given table names.

Parameter:tableNames (list of str) – list of names of tables to build
Return type:list of classobj
Returns:TableBuilder classes in build order
Raises UnsupportedError:if an unsupported table is given.
getCurrentSupportedTables()

Gets names of tables supported by this instance of the database builder.

This list can have more entries than the one returned by getSupportedTables(), as additional external builders can be supplied on instantiation.

Return type:list of str
Returns:names of tables
getDependingTables(tableNames)

Gets the names of the tables that need the given tables in order to be built and that are not included in the given set.

Dependencies depend on the choice of table builders and thus may vary.

Parameter:tableNames (list of str) – list of tables
Return type:list of str
Returns:names of tables that depend on given tables
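
Looking up depending tables amounts to walking the dependency graph in reverse. The sketch below assumes a plain dictionary that maps each table name to the tables it needs, and it assumes that indirect dependents are included as well. Both are choices made for this example, and the table names are placeholders.

```python
def depending_tables(dependencies, table_names):
    """Return tables that directly or indirectly depend on table_names.

    dependencies maps a table name to the list of tables it needs
    (an assumed data layout, chosen for this illustration).
    """
    given = set(table_names)
    found = set()
    pending = list(given)
    while pending:
        current = pending.pop()
        for table, needs in dependencies.items():
            if current in needs and table not in given and table not in found:
                found.add(table)
                pending.append(table)
    return sorted(found)


# Placeholder graph: B needs A, C needs B, D needs nothing.
deps = {'B': ['A'], 'C': ['B'], 'D': []}
print(depending_tables(deps, ['A']))
```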
getExternalRebuiltDependingTables(tableNames)

Gets the names of the tables that need the given tables in order to be built and that already exist. Similar to getRebuiltDependingTables(), but only for tables of attached databases.

Parameter:tableNames (list of str) – list of tables
Return type:list of str
Returns:names of tables of attached databases that need to be rebuilt because of dependencies
getRebuiltDependingTables(tableNames)

Gets the names of the tables that need the given tables in order to be built and that already exist, and thus need to be rebuilt.

Parameter:tableNames (list of str) – list of tables
Return type:list of str
Returns:names of tables that need to be rebuilt because of dependencies
static getSupportedTables()

Gets names of supported tables.

Return type:list of str
Returns:names of tables
getTableBuilder(tableName)

Gets the TableBuilder used by this instance of the database builder to build the given table.

Parameter:tableName (str) – name of table
Return type:classobj
Returns:TableBuilder used to build the given table by this build instance.
Raises UnsupportedError:if an unsupported table is given.
static getTableBuilderClasses(preferClassNameSet=None, resolveConflicts=True, quiet=True, additionalBuilders=None)

Gets all classes in module that implement TableBuilder.

Parameters:
  • preferClassNameSet (list of str) – list of TableBuilder class names that will be preferred in conflicting cases; resolveConflicts must be True (the default) for this to take effect
  • resolveConflicts (bool) – if True conflicting builders will be removed so that only one builder is left per table.
  • quiet (bool) – if True no status information will be printed to stderr
  • additionalBuilders (list of classobj) – list of externally provided TableBuilders
Return type:set
Returns:list of all classes inheriting from TableBuilder that provide a table (i.e. non-abstract implementations), with its name as key
Raises ValueError:if two builders are preferred that provide the same table, or if two different options with the same name collide

isOptimizable()

Checks if the current database supports optimization.

Return type:boolean
Returns:True if optimizable, False otherwise
needsRebuild(tableName)

Returns True if either rebuild is turned on by default or the table does not exist yet in any of the databases.

Parameter:tableName (str) – table name
Return type:bool
Returns:True, if table needs to be rebuilt
optimize()

Optimizes the current database.

Raises Exception:if database does not support optimization
Raises OperationalError:if optimization failed
remove(tables)

Removes the given tables from the main database.

Parameter:tables (list) – list of tables to remove
Raises UnsupportedError:if an unsupported table is given.
Return type:list
Returns:names of deleted tables; may contain fewer entries than the given list
static resolveBuilderConflicts(classList, preferClassNames=None, quiet=True)

Returns a subset of TableBuilder classes so that every buildable table is represented by exactly one builder.

Parameters:
  • classList (list of classobj) – list of TableBuilders
  • preferClassNames (list of classobj) – list of TableBuilder class names that will be preferred in conflicting cases
  • quiet (bool) – if True no status information will be printed to stderr
Return type:list of classobj
Returns:mapping of table names to builder classes that provide the given table
Raises ValueError:if two builders are preferred that provide the same table
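
Conflict resolution can be sketched as picking one builder per table and letting preferred names win. The function below is a hypothetical illustration that works on (class name, table name) pairs. Its fallback rule of keeping the first builder seen is an assumption and may differ from the rule cjklib applies when no preference is given.

```python
def resolve_conflicts(builders, prefer=()):
    """Pick one builder per table (illustrative sketch).

    builders is a list of (class_name, table_name) pairs. Without a
    preference, the first builder seen for a table is kept (assumed rule).
    """
    prefer = set(prefer)
    chosen = {}
    for name, table in builders:
        if table not in chosen:
            chosen[table] = name
            continue
        current = chosen[table]
        if name in prefer and current in prefer:
            raise ValueError('builders %r and %r both preferred for %r'
                             % (current, name, table))
        if name in prefer:
            chosen[table] = name
    return chosen


candidates = [
    ('StrokeCountBuilder', 'StrokeCount'),
    ('CombinedStrokeCountBuilder', 'StrokeCount'),
]
picked = resolve_conflicts(candidates, prefer=['CombinedStrokeCountBuilder'])
print(picked)
```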

setBuilderOptions(builderClass, options, exclusive=False)

Sets the specified options for the given builder.

Parameters:
  • builderClass (classobj) – TableBuilder class
  • options (dict) – dictionary of options for the given table builder.
  • exclusive (bool) – if set to True unspecified options will be set to the default value.
Raises ValueError:if unknown option is specified
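
The effect of the exclusive flag can be sketched with plain dictionaries. set_options is a hypothetical helper written for this example, and the option names and default values are made up. It only illustrates that exclusive=True resets unspecified options to their defaults, while exclusive=False leaves them as they are.

```python
def set_options(current, defaults, options, exclusive=False):
    """Return the new option set for a builder (illustrative sketch)."""
    unknown = set(options) - set(defaults)
    if unknown:
        raise ValueError('unknown option(s): %s' % ', '.join(sorted(unknown)))
    # Start from the defaults when exclusive, else from the current values.
    result = dict(defaults) if exclusive else dict(current)
    result.update(options)
    return result


# Made-up option names and values.
defaults = {'wideBuild': True, 'quiet': False}
current = {'wideBuild': False, 'quiet': True}

kept = set_options(current, defaults, {'quiet': False})
reset = set_options(current, defaults, {'quiet': False}, exclusive=True)
print(kept, reset)
```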
