ISO 639-3 changes Macrolanguage table format

I realised a bit late that SIL changed the ISO 639-3 macrolanguage table format two days after the release of the 2007 cycle.

Now that five language codes mapped to a macrolanguage have been deprecated, it seems to have been inevitable to add a column indicating whether a code is still in use or retired, and to put back the deprecated codes, which seem to have been removed earlier. At a quick glance, the change two days later didn't create any new macrolanguage mappings; it just added the deprecated codes back.

See Deprecated language codes mapped to a macrolanguage for a list of these codes.

Here is the new table format:

CREATE TABLE ISO_639_3_Macrolanguages (
   M_Id      char(3) NOT NULL,  -- The identifier for a macrolanguage
   I_Id      char(3) NOT NULL,  -- The identifier for an individual language
                                -- that is a member of the macrolanguage
   I_Status  char(1) NOT NULL,  -- A (active) or R (retired) indicating the
                                -- status of the individual code element
   PRIMARY KEY (I_Id),
   INDEX (M_Id)
);

Deprecated language codes mapped to a macrolanguage

Deprecated language codes mapped to a macrolanguage are still listed in the macrolanguage table; see ISO 639-3 changes Macrolanguage table format. Here is a list of these codes:

Count of deprecated language codes with macrolanguage mappings in ISO 639-3: 6
Query: SELECT COUNT(*) AS count FROM ISO_639_3_Macrolanguages LEFT JOIN ISO_639_3_Retirements ON I_Id = Id WHERE I_Status = 'R'

ISO 639-3 Language  ISO 639-3 Macrolanguage  Effective date
mdo                 gba                      2008-01-14
blu                 hmn                      2008-01-14
mly                 msa                      2008-02-18
ztc                 zap                      2007-07-18
ccx                 zha                      2008-01-14
ccy                 zha                      2007-07-18
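
The count query can be tried without a MySQL server. The following is a minimal sketch using Python's built-in sqlite3 module, not the setup used for the numbers above: the six rows are copied from the listing, the retirements table is cut down to the two columns the example needs, the column alias is dropped from the query, and the function name count_retired_mappings is made up for illustration.

```python
import sqlite3

# (individual code, macrolanguage, effective date), copied from the listing
RETIRED = [
    ("mdo", "gba", "2008-01-14"),
    ("blu", "hmn", "2008-01-14"),
    ("mly", "msa", "2008-02-18"),
    ("ztc", "zap", "2007-07-18"),
    ("ccx", "zha", "2008-01-14"),
    ("ccy", "zha", "2007-07-18"),
]


def count_retired_mappings(rows=RETIRED):
    """Load the rows into an in-memory database and run the count query."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE ISO_639_3_Macrolanguages (
            M_Id     char(3) NOT NULL,
            I_Id     char(3) NOT NULL,
            I_Status char(1) NOT NULL,
            PRIMARY KEY (I_Id)
        );
        -- Assumption: reduced to the columns this example needs.
        CREATE TABLE ISO_639_3_Retirements (
            Id        char(3) NOT NULL,
            Effective date    NOT NULL,
            PRIMARY KEY (Id)
        );
    """)
    # Every listed code is retired, so I_Status is always 'R' here.
    con.executemany(
        "INSERT INTO ISO_639_3_Macrolanguages VALUES (?, ?, 'R')",
        [(macro, code) for code, macro, _ in rows],
    )
    con.executemany(
        "INSERT INTO ISO_639_3_Retirements VALUES (?, ?)",
        [(code, date) for code, _, date in rows],
    )
    (count,) = con.execute(
        "SELECT COUNT(*) FROM ISO_639_3_Macrolanguages "
        "LEFT JOIN ISO_639_3_Retirements ON I_Id = Id "
        "WHERE I_Status = 'R'"
    ).fetchone()
    con.close()
    return count
```

Calling count_retired_mappings() on these rows gives the count reported above.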

Versions of relevant table(s):

Table                     Version
ISO_639_3                 20081110
ISO_639_3_Macrolanguages  20080218
ISO_639_3_Retirements     20081110

Sudoku in Python

A friend of mine asked me to write a Sudoku module in Python so he could compare it to his C implementation and see how Python works.

I wrote a simple class for 9x9 Sudokus. It includes a brute-force solver that simply iterates through all possible combinations until it either finds a solution or aborts because no solution is possible.
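
To give an idea of what such a solver can look like, here is a minimal sketch. It is not the code from the attached module: it uses recursive backtracking, a common variant of brute force that drops a branch as soon as a digit clashes instead of enumerating every combination, and the function names parse, candidates and solve are made up for this example. It is written for Python 3.

```python
def parse(text):
    """Turn rows like '|8|_|6| |_|7|_| |_|_|_|' into lists of ints (0 = empty)."""
    rows = []
    for line in text.splitlines():
        cells = [0 if ch == "_" else int(ch) for ch in line if ch in "_123456789"]
        if cells:
            rows.append(cells)
    return rows


def candidates(grid, r, c):
    """Digits that fit at (r, c) without clashing in its row, column or box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill the empty cells in place; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                grid[r][c] = 0  # nothing fits here: undo and backtrack
                return False
    return True  # no empty cell left, the grid is solved
```

A grid is a list of nine lists of nine integers. solve changes it in place and returns False if the given fields admit no solution; if a task has several solutions, it stops at the first one it finds.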

Simple usage:

>>> import sudoku
>>> s = sudoku.Sudoku()
>>> s.setRandomFields() # generate a task and set some fields
>>> print s
|8|_|6| |_|7|_| |_|_|_|
|_|1|_| |_|_|_| |8|6|3|
|9|_|_| |_|_|_| |_|_|1|

|_|_|_| |_|_|_| |_|_|_|
|_|_|_| |_|_|1| |4|3|8|
|4|5|1| |_|_|_| |_|2|6|

|_|_|_| |_|4|_| |_|1|_|
|_|8|_| |3|_|6| |_|_|_|
|_|_|_| |9|_|_| |_|_|_|
>>> s.solveBruteForce() # might take a looong time
>>> print s
|8|2|6| |1|7|3| |5|4|9|
|5|1|7| |2|9|4| |8|6|3|
|9|3|4| |6|8|5| |2|7|1|

|3|6|8| |4|2|7| |1|9|5|
|2|7|9| |5|6|1| |4|3|8|
|4|5|1| |8|3|9| |7|2|6|

|6|9|5| |7|4|8| |3|1|2|
|7|8|2| |3|1|6| |9|5|4|
|1|4|3| |9|5|2| |6|8|7|

There might be bugs. I release it under the MIT license.

See also Sudoku in TCL.

Attachment     Size
sudoku.py.txt  6.13 KB

Updating Ethnologue codes from the 14th Edition to the 15th Edition

In Mapping from Ethnologue 14 to ISO 639-2 I wrote about mapping Ethnologue 14 codes to ISO 639-2. At that time I didn't know that Ethnologue actually provides a mapping to ISO 639-3 at http://www.ethnologue.com/codes/updating_codes.asp.

The SQL commands to create the tables are given on that page. Load the data using:

LOAD DATA LOCAL INFILE 'Retired_Codes.tab' INTO TABLE Retired_Codes
CHARACTER SET latin1 FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\r\n' IGNORE 1 LINES;

LOAD DATA LOCAL INFILE 'Update_Mappings.tab' INTO TABLE Update_Mappings
CHARACTER SET latin1 FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\r\n' IGNORE 1 LINES;
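
For a quick look at the files without MySQL, the same options can be mirrored in Python. This is only a sketch under the assumptions spelled out in the LOAD DATA statements (latin1 encoding, tab-separated fields, \r\n line endings, one header line); the function name load_tab is made up for this example.

```python
import csv


def load_tab(path):
    """Read a tab-separated latin1 file and return its rows without the header."""
    # newline="" leaves the \r\n line endings to the csv module.
    with open(path, encoding="latin-1", newline="") as handle:
        # QUOTE_NONE: fields are taken literally, as no ENCLOSED BY is given.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        next(reader, None)  # IGNORE 1 LINES: skip the header row
        return [row for row in reader if row]  # drop empty lines
```

load_tab('Update_Mappings.tab') then returns one list of field values per mapping.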

The page gives a step-by-step description of how to upgrade, so I guess there's nothing more to say here.

Update:
Table Update_Mappings has three mappings where the given code doesn't exist in ISO 639-3:

mysql> SELECT * FROM Update_Mappings WHERE New_Code NOT IN
(SELECT Id FROM ISO_639_3 UNION SELECT Id FROM ISO_639_3_Retirements);
+----------+----------+-------------+
| Old_Code | New_Code | Name        |
+----------+----------+-------------+
| DLC      | dlc      | Dalecarlian |
| JMK      | jmk      | Jamtska     |
| SCY      | scy      | Scanian     |
+----------+----------+-------------+

Furthermore, it has 102 mappings to codes that are retired at the time of writing:

SELECT COUNT(*) FROM Update_Mappings WHERE New_Code IN
(SELECT Id FROM ISO_639_3_Retirements);

New ISO 639 codes

I updated the ISO 639 codes in the local database as SIL published a new revision of codes on the 14th of January. The Library of Congress made a change to ISO 639-2 on December the 17th, so all tables have been updated now.

The table definition SIL currently gives for ISO 639-3 code retirements is too restrictive for the column Ret_Remedy, as some values in the data exceed 200 characters. I had to fix that when importing.

CREATE TABLE ISO_639_3_Retirements (
   Id          char(3)      NOT NULL,   -- The three-letter 639-3 identifier
   Ret_Reason  char(1)      NOT NULL,   -- code for retirement: C (change),
                                        -- D (duplicate), N (non-existent),
                                        -- S (split), M (merge)
   Change_To   char(3)      NULL,       -- in the cases of C, D, and M, the
                                        -- identifier to which all instances
                                        -- of this Id should be changed
   Ret_Remedy  varchar(255) NOT NULL,   -- The instructions for updating an
                                        -- instance of the retired (split)
                                        -- identifier
   Effective   date         NOT NULL,   -- The date the retirement became
                                        -- effective
   PRIMARY KEY (Id)
);
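
One way to catch such a problem before importing is to measure the longest value in each column of the data file. The following is a rough sketch, assuming the rows have already been read into lists of strings; the function name max_field_lengths is made up for this example.

```python
def max_field_lengths(header, rows):
    """Return a dict mapping each column name to its longest value length."""
    longest = dict.fromkeys(header, 0)
    for row in rows:
        for name, value in zip(header, row):
            longest[name] = max(longest[name], len(value))
    return longest
```

Any column whose longest value is greater than the width in the CREATE TABLE statement needs a wider type.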

Furthermore, something I might have missed last time: I had to remove some more statements from the given SQL dump, as MySQL 4 doesn't know about them:

cat iso639codes_clean.sql | grep -v "character_set_client"

Finally, these are the local versions now:

Versions of tables:

Table                     Version
ISO_639_3                 20081110
ISO_639_3_Names           20081110
ISO_639_3_Macrolanguages  20080218
ISO_639_3_Retirements     20081110
ISO_639_2                 20081107
