Locale magic (literally)

Another programming-centric post and follow-up on Thursday's post about locale issues with Turkish.

So, I showed some problems with locale-dependent mappings using the case of Turkish, which maps the small Latin letter i to uppercase İ, which also has a dot on top. Now join me on some more magic. On Unix you need to have the proper locale generated, which under Debian works with dpkg-reconfigure locales.
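To make the Turkish mapping concrete, here is a minimal sketch. It assumes Python 3, where str.upper() ignores the locale, so the Turkish rule has to be spelled out by hand. The names TURKISH_UPPER and turkish_upper are made up for this illustration and do not come from any library.

```python
# Hypothetical helper, not a standard-library function.
# Python 3's str.upper() is locale-independent, so the two Turkish
# exceptions are applied first and the default mapping handles the rest.
TURKISH_UPPER = str.maketrans({
    "i": "\u0130",   # i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
    "\u0131": "I",   # LATIN SMALL LETTER DOTLESS I -> I
})


def turkish_upper(text):
    """Uppercase text following the Turkish dotted/dotless i rule."""
    return text.translate(TURKISH_UPPER).upper()


# The default mapping loses the dot; the Turkish one keeps it.
default_result = "i".upper()
turkish_result = turkish_upper("i")
```

Here default_result is the plain ASCII I, while turkish_result is the dotted capital İ (U+0130).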

Python, Unicode and the digital divide

One could say that Unicode is the reflection of globalization in computing. As a computer scientist, I find that this huge project very much gets my attention and fascinates me on a daily basis. And Unicode is not just a feature; it is a foundation that bridges languages and cultures in the digital world.

Followup on "Python doctest and Unicode"

I complained about Python doctest and Unicode some time ago. This was an itch I finally wanted to scratch, so I followed the popular saying: "Luke, read the source".

Turns out the error in question is fixed pretty easily. Python needs to properly encode the output, so a conversion to the output stream's encoding did the trick. Now a new issue came up.
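The idea of converting to the output stream's encoding can be sketched roughly like this. This is not the actual doctest change, just an illustration of the technique; encode_for_stream and the "utf-8" fallback are assumptions made for the example.

```python
import sys


def encode_for_stream(text, stream=None, fallback="utf-8"):
    """Encode text with the stream's declared encoding.

    Hypothetical helper: falls back to `fallback` when the stream does
    not report an encoding, and escapes characters the target encoding
    cannot represent instead of raising UnicodeEncodeError.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding, errors="backslashreplace")


class AsciiStream:
    """Stand-in for a terminal that only understands ASCII."""
    encoding = "ascii"


class Utf8Stream:
    """Stand-in for a UTF-8 terminal."""
    encoding = "utf-8"


ascii_bytes = encode_for_stream("\u0130", AsciiStream())
utf8_bytes = encode_for_stream("\u0130", Utf8Stream())
```

On the ASCII stand-in the character comes out as an escape sequence, while the UTF-8 stand-in gets the real two-byte encoding.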

(Natural) language in the world of programming

When it comes to writing code, directives and commands are dictated by the programming language (e.g. if ... then ... else), and for most programming languages these are in English[1]. When it comes to writing comments, however, the programmer is free to choose which language to use.

Simple image segmenter in Python

口-bw.1.png, 口-bw.2.png, 口-bw.3.png
So I was looking for a simple segmenter to break down images containing several tiles into single pieces. I decided to write one myself, so here it is. It comes with a help page (python --help) which briefly explains the parameters. Most importantly, segmentation can be done either by using a window or by looking for whitespace in the image. Giving the width/height ratio or, more specifically, the tilesize makes guessing more accurate by discarding solutions that don't fit the given sizes. Furthermore, by selecting equaltiles the segmenter will try to find a segmentation that yields tiles of exactly the same size. This can even improve segmentation results.
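The whitespace approach can be illustrated with a small sketch. This is not the segmenter itself, only the basic idea under simplifying assumptions: the image is already a grid of 0/1 values (1 = ink), and tiles are separated by completely empty columns. The names ink_runs and segment_columns are made up for this example.

```python
def ink_runs(profile):
    """Return (start, end) pairs for each run of non-empty entries.

    `end` is exclusive, so profile[start:end] is one run.
    """
    runs = []
    start = None
    for i, filled in enumerate(profile):
        if filled and start is None:
            start = i                    # a new run begins
        elif not filled and start is not None:
            runs.append((start, i))      # whitespace closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_columns(grid):
    """Split a 0/1 grid into column ranges separated by empty columns."""
    width = len(grid[0])
    # A column counts as non-empty if any row has ink in it.
    profile = [any(row[x] for row in grid) for x in range(width)]
    return ink_runs(profile)


# Two tiles separated by two empty columns.
example = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1],
]
tiles = segment_columns(example)
```

For the example grid, tiles holds the column ranges (0, 2) and (4, 6). Running the same function on the transposed grid would give the row ranges.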
