Views on initials and finals of Mandarin in Pinyin

As there are different tables floating around, I wanted to compare them and explore the different views on the syllables' initials and finals. The tables differ not only in the order of the parts, but also in which initial and which final each syllable is placed under. For example, the final of yu is pronounced like ü, though the "dots" (diaeresis) are missing, so this syllable is placed under final ü in some tables.

This started as Pinyin syllable sets compared, and as the interest in different visualisations grew, it became a bigger collection of tables, each with special characteristics.

Without wanting to say too much - here is a list of different views on the syllables of Mandarin Chinese:

Simple grouping according to written form of Pinyin

Finals: a ao ai an ang o ou ong u ü ua uai uan uang ue üe un uo ui e er ei en eng i ia iao iu ie ian in iang ing iong
b: ba bao bai ban bang bo bu bei ben beng bi biao bie bian bin bing
p: pa pao pai pan pang po pou pu pei pen peng pi piao pie pian pin ping
m: ma mao mai man mang mo mou mu me mei men meng mi miao miu mie mian min ming
f: fa fan fang fo fou fu fe fei fen feng
d: da dao dai dan dang dou dong du duan dun duo dui de dei den deng di dia diao diu die dian ding
t: ta tao tai tan tang tou tong tu tuan tun tuo tui te teng ti tiao tie tian ting
n: na nao nai nan nang nou nong nu nuan nüe nun nuo ne nei nen neng ni niao niu nie nian nin niang ning
l: la lao lai lan lang lo lou long lu luan lüe lun luo le lei leng li lia liao liu lie lian lin liang ling
z: za zao zai zan zang zou zong zu zuan zun zuo zui ze zei zen zeng zi
c: ca cao cai can cang cou cong cu cuan cun cuo cui ce cei cen ceng ci
s: sa sao sai san sang sou song su suan sun suo sui se sen seng si
zh: zha zhao zhai zhan zhang zhou zhong zhu zhua zhuai zhuan zhuang zhun zhuo zhui zhe zhei zhen zheng zhi
ch: cha chao chai chan chang chou chong chu chua chuai chuan chuang chun chuo chui che chen cheng chi
sh: sha shao shai shan shang shou shu shua shuai shuan shuang shun shuo shui she shei shen sheng shi
r: rao ran rang rou rong ru rua ruan run ruo rui re ren reng ri
j: ju juan jue jun ji jia jiao jiu jie jian jin jiang jing jiong
q: qu quan que qun qi qia qiao qiu qie qian qin qiang qing qiong
x: xu xuan xue xun xi xia xiao xiu xie xian xin xiang xing xiong
g: ga gao gai gan gang gou gong gu gua guai guan guang gun guo gui ge gei gen geng
k: ka kao kai kan kang kou kong ku kua kuai kuan kuang kun kuo kui ke kei ken keng
h: ha hao hai han hang hou hong hu hua huai huan huang hun huo hui he hei hen heng
w: wa wai wan wang wo wu wei wen weng
y: ya yao yai yan yang yo you yong yu yuan yue yun ye yi yin ying
(none): a ao ai an ang o ou e er ei en eng

Compare the article Pinyin on the German Wikipedia (de).

Missing syllables: hng, hm, n, ng, m, ê

Distinguishing some parts that are pronounced differently, following ISO 7098

Finals: a o e ê -i er ai ei ao ou an en ang eng ong i ia iao ie iu ian in iang ing iong u ua uo uai ui uan un uang ü üe üan ün
(none): a o e e er ai ei ao ou an en ang eng
y: ya yo ye yai yao you yan yang yong yi yin ying yu yue yuan yun
w: wa wo wai wei wan wen wang weng wu
b: ba bo bai bei bao ban ben bang beng bi biao bie bian bin bing bu
p: pa po pai pei pao pou pan pen pang peng pi piao pie pian pin ping pu
m: ma mo me mai mei mao mou man men mang meng mi miao mie miu mian min ming mu
f: fa fo fe fei fou fan fen fang feng fu
d: da de dai dei dao dou dan den dang deng dong di dia diao die diu dian ding du duo dui duan dun
t: ta te tai tao tou tan tang teng tong ti tiao tie tian ting tu tuo tui tuan tun
n: na ne nai nei nao nou nan nen nang neng nong ni niao nie niu nian nin niang ning nu nuo nuan nun nüe
l: la lo le lai lei lao lou lan lang leng long li lia liao lie liu lian lin liang ling lu luo luan lun lüe
z: za ze zi zai zei zao zou zan zen zang zeng zong zu zuo zui zuan zun
c: ca ce ci cai cei cao cou can cen cang ceng cong cu cuo cui cuan cun
s: sa se si sai sao sou san sen sang seng song su suo sui suan sun
zh: zha zhe zhi zhai zhei zhao zhou zhan zhen zhang zheng zhong zhu zhua zhuo zhuai zhui zhuan zhun zhuang
ch: cha che chi chai chao chou chan chen chang cheng chong chu chua chuo chuai chui chuan chun chuang
sh: sha she shi shai shei shao shou shan shen shang sheng shu shua shuo shuai shui shuan shun shuang
r: re ri rao rou ran ren rang reng rong ru rua ruo rui ruan run
j: ji jia jiao jie jiu jian jin jiang jing jiong ju jue juan jun
q: qi qia qiao qie qiu qian qin qiang qing qiong qu que quan qun
x: xi xia xiao xie xiu xian xin xiang xing xiong xu xue xuan xun
g: ga ge gai gei gao gou gan gen gang geng gong gu gua guo guai gui guan gun guang
k: ka ke kai kei kao kou kan ken kang keng kong ku kua kuo kuai kui kuan kun kuang
h: ha he hai hei hao hou han hen hang heng hong hu hua huo huai hui huan hun huang

Missing syllables: hng, hm, n, ng, m, ê

See below for ê.

More consistent view, initials y, w grouped under diphthongs and triphthongs without initial character

As found in Praktisches Chinesisch, Band I, Kommerzieller Verlag, Beijing 2001, ISBN 7-100-01675-4.

Finals: a o e ê -i er ai ei ao ou an en ang eng ong i ia iao ie iu ian in iang ing iong u ua uo uai ui uan un uang ueng ü üe üan ün
(none): a o e e er ai ei ao ou an en ang eng yi ya yao ye you yan yin yang ying yong wu wa wo wai wei wan wen wang weng yu yue yuan yun
b: ba bo bai bei bao ban ben bang beng bi biao bie bian bin bing bu
p: pa po pai pei pao pou pan pen pang peng pi piao pie pian pin ping pu
m: ma mo me mai mei mao mou man men mang meng mi miao mie miu mian min ming mu
f: fa fo fe fei fou fan fen fang feng fu
d: da de dai dei dao dou dan den dang deng dong di dia diao die diu dian ding du duo dui duan dun
t: ta te tai tao tou tan tang teng tong ti tiao tie tian ting tu tuo tui tuan tun
n: na ne nai nei nao nou nan nen nang neng nong ni niao nie niu nian nin niang ning nu nuo nuan nun nüe
l: la lo le lai lei lao lou lan lang leng long li lia liao lie liu lian lin liang ling lu luo luan lun lüe
z: za ze zi zai zei zao zou zan zen zang zeng zong zu zuo zui zuan zun
c: ca ce ci cai cei cao cou can cen cang ceng cong cu cuo cui cuan cun
s: sa se si sai sao sou san sen sang seng song su suo sui suan sun
zh: zha zhe zhi zhai zhei zhao zhou zhan zhen zhang zheng zhong zhu zhua zhuo zhuai zhui zhuan zhun zhuang
ch: cha che chi chai chao chou chan chen chang cheng chong chu chua chuo chuai chui chuan chun chuang
sh: sha she shi shai shei shao shou shan shen shang sheng shu shua shuo shuai shui shuan shun shuang
r: re ri rao rou ran ren rang reng rong ru rua ruo rui ruan run
j: ji jia jiao jie jiu jian jin jiang jing jiong ju jue juan jun
q: qi qia qiao qie qiu qian qin qiang qing qiong qu que quan qun
x: xi xia xiao xie xiu xian xin xiang xing xiong xu xue xuan xun
g: ga ge gai gei gao gou gan gen gang geng gong gu gua guo guai gui guan gun guang
k: ka ke kai kei kao kou kan ken kang keng kong ku kua kuo kuai kui kuan kun kuang
h: ha he hai hei hao hou han hen hang heng hong hu hua huo huai hui huan hun huang

Missing syllables: yo, yai, hng, hm, n, ng, m, ê

y- and w- initials are pronounced as i and u except with yi, wu and thus can be put under the empty initial row (except yo as there is no io).

-i indicates a different pronunciation for zi, zhi and others in contrast to yi, ti....

e is put under two columns, as 俄 is pronounced [ɤ] but 欸 is pronounced [ɛ], which is an exception. ê is left out as it serves the same purpose as e in this context: to mark the irregular pronunciation as [ɛ].

Forms like yu, qu and others should actually be written yü, qü to follow the design of nü, lü, but as u is written wu and there is no [tɕʰu], the diaeresis is omitted.
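The regrouping described in these notes can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `split_syllable` and both lookup structures are made up for this example, the input is a toneless syllable, and forms such as yo, yai or the syllabic nasals are simply passed through unchanged.

```python
# Sketch: map a written, toneless Pinyin syllable to (initial, final) in the
# view where y-/w- forms belong to the empty-initial row.
# All names here are hypothetical helpers, not part of any library.

# Two-letter initials come first so that "zhi" matches "zh" before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "z", "c", "s", "r", "j", "q", "x", "g", "k", "h"]

# Written zero-initial forms and the final they stand for.
ZERO_INITIAL = {
    "yi": "i", "ya": "ia", "yao": "iao", "ye": "ie", "you": "iu",
    "yan": "ian", "yin": "in", "yang": "iang", "ying": "ing", "yong": "iong",
    "wu": "u", "wa": "ua", "wo": "uo", "wai": "uai", "wei": "ui",
    "wan": "uan", "wen": "un", "wang": "uang", "weng": "ueng",
    "yu": "ü", "yue": "üe", "yuan": "üan", "yun": "ün",
}


def split_syllable(syllable):
    """Return (initial, final) for a toneless Pinyin syllable."""
    if syllable in ZERO_INITIAL:
        return "", ZERO_INITIAL[syllable]
    for initial in INITIALS:
        if syllable.startswith(initial):
            final = syllable[len(initial):]
            if initial in ("j", "q", "x") and final.startswith("u"):
                # After j, q, x a written u stands for ü.
                final = "ü" + final[1:]
            elif initial in ("z", "c", "s", "zh", "ch", "sh", "r") and final == "i":
                # The i of zi, zhi, ri ... is kept apart from the i of yi, ti.
                final = "-i"
            return initial, final
    # No initial found: plain vowel syllables such as "a" or "er".
    return "", syllable


print(split_syllable("qu"))    # ('q', 'ü')
print(split_syllable("nu"))    # ('n', 'u')
print(split_syllable("wei"))   # ('', 'ui')
```

Using a lookup table for the y-/w- forms keeps the exceptions (yi, wu, you, wei, wen) explicit instead of hiding them in string rules.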

Transposed table following Pinyin.info

Initials: b p m f d t n l g k h z c s zh ch sh r j q x (none)
a: ba pa ma fa da ta na la ga ka ha za ca sa zha cha sha a
o: bo po mo fo lo o
e: me fe de te ne le ge ke he ze ce se zhe che she re e
ai: bai pai mai dai tai nai lai gai kai hai zai cai sai zhai chai shai ai
ei: bei pei mei fei dei nei lei gei kei hei zei cei zhei shei ei
ao: bao pao mao dao tao nao lao gao kao hao zao cao sao zhao chao shao rao ao
ou: pou mou fou dou tou nou lou gou kou hou zou cou sou zhou chou shou rou ou
an: ban pan man fan dan tan nan lan gan kan han zan can san zhan chan shan ran an
ang: bang pang mang fang dang tang nang lang gang kang hang zang cang sang zhang chang shang rang ang
en: ben pen men fen den nen gen ken hen zen cen sen zhen chen shen ren en
eng: beng peng meng feng deng teng neng leng geng keng heng zeng ceng seng zheng cheng sheng reng eng
ong: dong tong nong long gong kong hong zong cong song zhong chong rong
u: bu pu mu fu du tu nu lu gu ku hu zu cu su zhu chu shu ru wu
ua: gua kua hua zhua chua shua rua wa
uo: duo tuo nuo luo guo kuo huo zuo cuo suo zhuo chuo shuo ruo wo
uai: guai kuai huai zhuai chuai shuai wai
ui: dui tui gui kui hui zui cui sui zhui chui shui rui wei
uan: duan tuan nuan luan guan kuan huan zuan cuan suan zhuan chuan shuan ruan wan
uang: guang kuang huang zhuang chuang shuang wang
un: dun tun nun lun gun kun hun zun cun sun zhun chun shun run wen
ueng: weng
i: bi pi mi di ti ni li zi ci si zhi chi shi ri ji qi xi yi
ia: dia lia jia qia xia ya
ie: bie pie mie die tie nie lie jie qie xie ye
iao: biao piao miao diao tiao niao liao jiao qiao xiao yao
iu: miu diu niu liu jiu qiu xiu you
ian: bian pian mian dian tian nian lian jian qian xian yan
in: bin pin min nin lin jin qin xin yin
ing: bing ping ming ding ting ning ling jing qing xing ying
iang: niang liang jiang qiang xiang yang
iong: jiong qiong xiong yong
ü: ju qu xu yu
üe: nüe lüe jue que xue yue
üan: juan quan xuan yuan
ün: jun qun xun yun

Missing syllables: yo, yai, er, hng, hm, n, ng, m, ê

Compare this table to http://www.pinyin.info/rules/initials_finals.html where there is also a short explanation on different (re-)groupings.

Extended table including all syllables

Initials: (none) b p m f d t n l z c s zh ch sh r j q x g k h
a: a ba pa ma fa da ta na la za ca sa zha cha sha ga ka ha
o: o bo po mo fo lo
e: e me fe de te ne le ze ce se zhe che she re ge ke he
ê: ê
ɿ: zi ci si
ʅ: zhi chi shi ri
er: er
ai: ai bai pai mai dai tai nai lai zai cai sai zhai chai shai gai kai hai
ei: ei bei pei mei fei dei nei lei zei cei zhei shei gei kei hei
ao: ao bao pao mao dao tao nao lao zao cao sao zhao chao shao rao gao kao hao
ou: ou pou mou fou dou tou nou lou zou cou sou zhou chou shou rou gou kou hou
an: an ban pan man fan dan tan nan lan zan can san zhan chan shan ran gan kan han
en: en ben pen men fen den nen zen cen sen zhen chen shen ren gen ken hen
ang: ang bang pang mang fang dang tang nang lang zang cang sang zhang chang shang rang gang kang hang
eng: eng beng peng meng feng deng teng neng leng zeng ceng seng zheng cheng sheng reng geng keng heng
ong: dong tong nong long zong cong song zhong chong rong gong kong hong
i: yi bi pi mi di ti ni li ji qi xi
ia: ya dia lia jia qia xia
iao: yao biao piao miao diao tiao niao liao jiao qiao xiao
ie: ye bie pie mie die tie nie lie jie qie xie
iou: you miu diu niu liu jiu qiu xiu
iai: yai
ian: yan bian pian mian dian tian nian lian jian qian xian
in: yin bin pin min nin lin jin qin xin
iang: yang niang liang jiang qiang xiang
ing: ying bing ping ming ding ting ning ling jing qing xing
io: yo
iong: yong jiong qiong xiong
u: wu bu pu mu fu du tu nu lu zu cu su zhu chu shu ru gu ku hu
ua: wa zhua chua shua rua gua kua hua
uo: wo duo tuo nuo luo zuo cuo suo zhuo chuo shuo ruo guo kuo huo
uai: wai zhuai chuai shuai guai kuai huai
uei: wei dui tui zui cui sui zhui chui shui rui gui kui hui
uan: wan duan tuan nuan luan zuan cuan suan zhuan chuan shuan ruan guan kuan huan
uen: wen dun tun nun lun zun cun sun zhun chun shun run gun kun hun
uang: wang zhuang chuang shuang guang kuang huang
ueng: weng
ü: yu ju qu xu
üe: yue nüe lüe jue que xue
üan: yuan juan quan xuan
ün: yun jun qun xun
m: m hm
n: n
ng: ng hng

This table includes all syllables and furthermore distinguishes between the different pronunciations of '-i' in zhi and zi. The finals iu, ui, un are shown as iou, uei, uen to emphasise their pronunciation and the fact that the forms with prepended y- and w- actually belong to this class.
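To make the iu/ui/un point concrete, here is a minimal Python sketch that expands the contracted spellings to the full finals used as row labels in this table. The function name `full_final` and the two dictionaries are assumptions for this example only; the caller is expected to pass the initial explicitly, and only the three contracted finals are handled.

```python
# Sketch: expand the contracted finals iu, ui, un to iou, uei, uen.
# Hypothetical helper, covering only the three finals discussed above.

CONTRACTED = {"iu": "iou", "ui": "uei", "un": "uen"}

# Zero-initial spellings that belong to the same three classes.
ZERO_INITIAL_FULL = {"you": "iou", "wei": "uei", "wen": "uen"}


def full_final(initial, syllable):
    """Return the uncontracted final of a toneless syllable.

    `initial` is the written initial ("" for y-/w- forms).
    Finals other than iu, ui, un are returned as written.
    """
    if initial == "":
        return ZERO_INITIAL_FULL.get(syllable, syllable)
    final = syllable[len(initial):]
    if initial in ("j", "q", "x") and final == "un":
        # jun, qun, xun belong to the ün row, not to uen.
        return "ün"
    return CONTRACTED.get(final, final)


# liu and you end up in the same row, as do dui and wei, lun and wen.
print(full_final("l", "liu"), full_final("", "you"))   # iou iou
print(full_final("d", "dui"), full_final("", "wei"))   # uei uei
print(full_final("l", "lun"), full_final("", "wen"))   # uen uen
```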

Jyutping syllable table

Following Pinyin syllable sets compared, here is a table of the syllables of Cantonese written in the Jyutping romanisation.

There are two sources: Research Centre for Humanities Computing of the Research Institute for the Humanities (RIH), Faculty of Arts, The Chinese University of Hong Kong - 粵音節表 (Table of Cantonese Syllables) and the Unihan table.

Unihan has two syllables in the kCantonese field whose finals are not listed in the table from the Centre for Humanities Computing: loei (for 唳, 捩) and om (for 媕). The table below thus extends the final set by -oei and -om. Syllables found in the Unihan database are emphasised (italic), syllables from the table of the Centre for Humanities Computing are marked with a 1.

Initials: (none) b p m f d t n l g k ng h gw kw w z c s j
i: mi¹ di ti ni¹ li¹ wi zi¹ ci¹ si¹ ji¹
ip: dip¹ tip¹ nip¹ lip¹ gip¹ kip hip¹ zip¹ cip¹ sip¹ jip¹
it: bit¹ pit¹ mit¹ dit¹ tit¹ nit lit¹ git¹ kit¹ ngit¹ hit¹ zit¹ cit¹ sit¹ jit¹
ik: bik¹ pik¹ mik¹ dik¹ tik¹ nik¹ lik¹ gik¹ gwik¹ kwik wik¹ zik¹ cik¹ sik¹ jik¹
im: dim¹ tim¹ nim¹ lim¹ gim¹ kim¹ him¹ zim¹ cim¹ sim¹ jim¹
in: bin¹ pin¹ min¹ din¹ tin¹ nin¹ lin¹ gin¹ kin¹ hin¹ zin¹ cin¹ sin¹ jin¹
ing: bing¹ ping¹ ming¹ fing ding¹ ting¹ ning¹ ling¹ ging¹ king¹ hing¹ gwing¹ wing¹ zing¹ cing¹ sing¹ jing¹
iu: biu¹ piu¹ miu¹ fiu diu¹ tiu¹ niu¹ liu¹ giu¹ kiu¹ hiu¹ ziu¹ ciu¹ siu¹ jiu¹
yu: zyu¹ cyu¹ syu¹ jyu¹
yut: dyut¹ tyut¹ lyut¹ gyut¹ kyut¹ hyut¹ zyut¹ cyut¹ syut¹ jyut¹
yun: dyun¹ tyun¹ nyun¹ lyun¹ gyun¹ kyun¹ hyun¹ zyun¹ cyun¹ syun¹ jyun¹
u: bu fu¹ gu¹ ku¹ wu¹
up:
ut: but¹ put¹ mut¹ fut¹ gut kut¹ wut¹
uk: uk¹ buk¹ puk¹ muk¹ fuk¹ duk¹ tuk¹ nuk¹ luk¹ guk¹ kuk¹ nguk¹ huk¹ zuk¹ cuk¹ suk¹ juk¹
um:
un: bun¹ pun¹ mun¹ fun¹ gun¹ kwun wun¹ cun
ung: ung¹ bung¹ pung¹ mung¹ fung¹ dung¹ tung¹ nung¹ lung¹ gung¹ kung¹ ngung¹ hung¹ zung¹ cung¹ sung¹ jung¹
ui: bui¹ pui¹ mui¹ fui¹ gui¹ kui¹ kwui wui¹ zui
e: e¹ be¹ pe me¹ fe de¹ ne¹ le¹ ge¹ ke¹ he we ze¹ ce¹ se¹ je¹
ep: gep¹ kep
et: pet
ek: bek¹ pek¹ dek¹ tek¹ lek¹ kek¹ hek¹ zek¹ cek¹ sek¹
em: lem¹
en:
eng: beng¹ peng¹ meng¹ deng¹ teng¹ leng¹ geng¹ heng¹ zeng¹ ceng¹ seng¹ jeng¹
ei: ei¹ bei¹ pei¹ mei¹ fei¹ dei¹ nei¹ lei¹ gei¹ kei¹ hei¹ sei¹
eu: deu¹
eot: deot¹ neot¹ leot¹ zeot¹ ceot¹ seot¹
eon: deon¹ teon¹ leon¹ zeon¹ ceon¹ seon¹ jeon¹
eoi: deoi¹ teoi¹ neoi¹ leoi¹ geoi¹ keoi¹ heoi¹ zeoi¹ ceoi¹ seoi¹ jeoi¹
oe: oe doe¹ toe¹ goe¹ koe hoe¹ zoe
oet: loet
oek: doek¹ loek¹ goek¹ koek¹ zoek¹ coek¹ soek¹ joek¹
oeng: doeng noeng¹ loeng¹ goeng¹ koeng¹ hoeng¹ zoeng¹ coeng¹ soeng¹ joeng¹
oei: loei
o: o¹ bo¹ po¹ mo¹ fo¹ do¹ to¹ no¹ lo¹ go¹ ko¹ ngo¹ ho¹ gwo¹ wo¹ zo¹ co¹ so¹ jo¹
ot: got¹ hot¹
ok: ok¹ bok¹ pok¹ mok¹ fok¹ dok¹ tok¹ nok¹ lok¹ gok¹ kok¹ ngok¹ hok¹ gwok¹ kwok¹ wok¹ zok¹ cok¹ sok¹
om: om
on: on¹ gon¹ ngon¹ hon¹
ong: ong¹ bong¹ pong¹ mong¹ fong¹ dong¹ tong¹ nong¹ long¹ gong¹ kong¹ ngong¹ hong¹ gwong¹ kwong¹ wong¹ zong¹ cong¹ song¹
oi: oi¹ moi doi¹ toi¹ noi¹ loi¹ goi¹ koi¹ ngoi¹ hoi¹ zoi¹ coi¹ soi¹
ou: ou¹ bou¹ pou¹ mou¹ dou¹ tou¹ nou¹ lou¹ gou¹ ngou¹ hou¹ zou¹ cou¹ sou¹
ap: ap dap tap nap¹ lap¹ gap¹ kap¹ ngap hap¹ zap¹ cap¹ sap¹ jap¹
at: at bat¹ pat¹ mat¹ fat¹ dat¹ tat nat¹ lat¹ gat¹ kat¹ ngat¹ hat¹ gwat¹ wat¹ zat¹ cat¹ sat¹ jat¹
ak: ak¹ bak¹ pak mak¹ dak¹ lak¹ gak kak ngak¹ hak¹ wak zak¹ cak¹ sak¹
am: am¹ bam¹ dam¹ tam nam¹ lam¹ gam¹ kam¹ ngam¹ ham¹ zam¹ cam¹ sam¹ jam¹
an: an¹ ban¹ pan¹ man¹ fan¹ dan¹ tan¹ nan¹ lan gan¹ kan¹ ngan¹ han¹ gwan¹ kwan¹ wan¹ zan¹ can¹ san¹ jan¹
ang: ang¹ bang¹ pang¹ mang¹ fang¹ dang¹ tang¹ nang¹ lang gang¹ kang¹ ngang hang¹ gwang¹ wang¹ zang¹ cang¹ sang¹
ai: ai¹ bai¹ pai¹ mai¹ fai¹ dai¹ tai¹ nai¹ lai¹ gai¹ kai¹ ngai¹ hai¹ gwai¹ kwai¹ wai¹ zai¹ cai¹ sai¹ jai¹
au: au¹ bau pau¹ mau¹ fau¹ dau¹ tau¹ nau¹ lau¹ gau¹ kau¹ ngau¹ hau¹ wau zau¹ cau¹ sau¹ jau¹
aa: aa¹ baa¹ paa¹ maa¹ faa¹ daa¹ taa¹ naa¹ laa¹ gaa¹ kaa¹ ngaa¹ haa¹ gwaa¹ kwaa¹ waa¹ zaa¹ caa¹ saa¹ jaa¹
aap: aap¹ daap¹ taap¹ naap¹ laap¹ gaap¹ kaap ngaap haap¹ zaap¹ caap¹ saap¹
aat: aat¹ baat¹ paat maat¹ faat¹ daat¹ taat¹ naat¹ laat¹ gaat¹ kaat¹ ngaat¹ haat gwaat¹ waat¹ zaat¹ caat¹ saat¹
aak: aak¹ baak¹ paak¹ maak¹ faak daak¹ laak¹ gaak¹ kaak¹ ngaak¹ haak¹ gwaak¹ waak¹ zaak¹ caak¹ saak¹ jaak¹
aam: aam¹ daam¹ taam¹ naam¹ laam¹ gaam¹ kaam ngaam¹ haam¹ zaam¹ caam¹ saam¹
aan: aan¹ baan¹ paan¹ maan¹ faan¹ daan¹ taan¹ naan¹ laan¹ gaan¹ kaan ngaan¹ haan¹ gwaan¹ kwaan waan¹ zaan¹ caan¹ saan¹
aang: aang¹ baang¹ paang¹ maang¹ daang taang naang laang¹ gaang¹ ngaang¹ haang¹ gwaang¹ kwaang¹ waang¹ zaang¹ caang¹ saang¹ jaang
aai: aai¹ baai¹ paai¹ maai¹ faai¹ daai¹ taai¹ naai¹ laai¹ gaai¹ kaai¹ ngaai¹ haai¹ gwaai¹ kwaai¹ waai¹ zaai¹ caai¹ saai¹ jaai¹
aau: aau¹ baau¹ paau¹ maau¹ faau daau taau naau¹ laau gaau¹ kaau¹ ngaau¹ haau¹ zaau¹ caau¹ saau¹ jaau
m: m¹ hm¹
ng: ng¹ hng¹

Pinyin syllable sets compared

If you are in need of all syllables of Standard Mandarin (Putonghua) written in Pinyin you might come across some rare ones. The Xiàndài Hànyǔ Cídiǎn (现代汉语词典(第5版)商务印书馆, 北京 2005, ISBN 7-100-04385-9) for example lists the syllables n and ng. The ISO 7098 standard for Pinyin, in turn, has the syllables kei and rua.

Well, if you just need to do some simple processing, you might want to work with the known initials and finals and generate all combinations on the fly. This though introduces forms that are not understood by native speakers and might not help if you want to do some kind of error detection/correction.
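A small Python sketch shows why generating combinations on the fly overshoots. It uses only a tiny, hand-picked subset of initials and finals; the `ATTESTED` set is copied from the b, p, m and f rows of the tables on this site, and all variable names are made up for this example.

```python
from itertools import product

# Tiny subset, for illustration only.
INITIALS = ["b", "p", "m", "f"]
FINALS = ["a", "o", "ong"]

# Combinations of the above that actually appear in the syllable tables.
ATTESTED = {"ba", "bo", "pa", "po", "ma", "mo", "fa", "fo"}

# Naive approach: every initial combined with every final.
generated = {initial + final for initial, final in product(INITIALS, FINALS)}

# Forms produced by the naive approach that the tables do not list.
spurious = sorted(generated - ATTESTED)
print(spurious)  # ['bong', 'fong', 'mong', 'pong']
```

Even with three finals, a third of the generated forms are not real syllables, which is why a fixed list is the safer basis for error detection.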

Where do I find all forms? ISO 7098 has an Annex A containing a table with all syllables covered by the standard. Standards are always good. However, as seen above, some forms like kei might seem a bit awkward. I'll be happy if anybody can point me to a character that is transcribed this way.

Looking at the Unihan table you will find more forms not covered by the ISO standard. This is why I started to compile a table comparing the Unihan syllables with the ones defined in ISO 7098. The shape of the table follows the one in the ISO standard:

Emphasised (italic) syllables are found in the Unihan database. Forms with blue background are found in the ISO norm. You will see some forms not found in the ISO norm and some forms not found in the Unihan table.

Finals: a o e ê -i er ai ei ao ou an en ang eng ong i ia iao ie iu ian in iang ing iong u ua uo uai ui uan un uang ü üe üan ün
(none): a o e e er ai ei ao ou an en ang eng
y: ya yo ye yai yao you yan yang yong yi yin ying yu yue yuan yun
w: wa wo wai wei wan wen wang weng wu
b: ba bo bai bei bao ban ben bang beng bi biao bie bian bin bing bu
p: pa po pai pei pao pou pan pen pang peng pi piao pie pian pin ping pu
m: ma mo me mai mei mao mou man men mang meng mi miao mie miu mian min ming mu
f: fa fo fe fei fou fan fen fang feng fu
d: da de dai dei dao dou dan den dang deng dong di dia diao die diu dian ding du duo dui duan dun
t: ta te tai tao tou tan tang teng tong ti tiao tie tian ting tu tuo tui tuan tun
n: na ne nai nei nao nou nan nen nang neng nong ni niao nie niu nian nin niang ning nu nuo nuan nun nüe
l: la lo le lai lei lao lou lan lang leng long li lia liao lie liu lian lin liang ling lu luo luan lun lüe
z: za ze zi zai zei zao zou zan zen zang zeng zong zu zuo zui zuan zun
c: ca ce ci cai cao cou can cen cang ceng cong cu cuo cui cuan cun
s: sa se si sai sao sou san sen sang seng song su suo sui suan sun
zh: zha zhe zhi zhai zhei zhao zhou zhan zhen zhang zheng zhong zhu zhua zhuo zhuai zhui zhuan zhun zhuang
ch: cha che chi chai chao chou chan chen chang cheng chong chu chua chuo chuai chui chuan chun chuang
sh: sha she shi shai shei shao shou shan shen shang sheng shu shua shuo shuai shui shuan shun shuang
r: re ri rao rou ran ren rang reng rong ru rua ruo rui ruan run
j: ji jia jiao jie jiu jian jin jiang jing jiong ju jue juan jun
q: qi qia qiao qie qiu qian qin qiang qing qiong qu que quan qun
x: xi xia xiao xie xiu xian xin xiang xing xiong xu xue xuan xun
g: ga ge gai gei gao gou gan gen gang geng gong gu gua guo guai gui guan gun guang
k: ka ke kai kei kao kou kan ken kang keng kong ku kua kuo kuai kui kuan kun kuang
h: ha he hai hei hao hou han hen hang heng hong hu hua huo huai hui huan hun huang
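The comparison behind such a table boils down to a few set operations. The following Python sketch is not the script used for the table; the function name `compare_syllable_sets` is hypothetical, and the two sample sets are toy data whose membership is invented for illustration, not the real contents of ISO 7098 or Unihan.

```python
def compare_syllable_sets(iso, unihan):
    """Split two syllable collections into shared and source-specific forms."""
    iso, unihan = set(iso), set(unihan)
    return {
        "both": sorted(iso & unihan),
        "iso_only": sorted(iso - unihan),
        "unihan_only": sorted(unihan - iso),
    }


# Toy input: membership is made up for this example, NOT real data.
iso_sample = {"ba", "kei", "rua"}
unihan_sample = {"ba", "rua", "ng"}

result = compare_syllable_sets(iso_sample, unihan_sample)
for category, syllables in result.items():
    print(category, syllables)
```

With the real syllable lists as input, the three categories correspond to the forms marked in both ways, only with a background colour, and only in italics.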

For an overview of different table schemes, see Views on initials and finals of Mandarin in Pinyin.

Update: The Unihan table's column 'kMandarin' was used as a source here. Actually Unicode 5.1 Unihan comes with two other independent sets 'kHanyuPinlu' for the Xiandai Hanyu Pinlu Cidian and 'kXHC' for the Xiandai Hanyu Cidian, which have some forms not included in the first set. Syllable kei is found for two characters 剋 and 尅 in kXHC.

Sudoku in TCL

After doing the Sudoku implementation in Python I ported the code to TCL to take some first steps in this language. It took me some time to cope with the TCL syntax, but here it is.

Sadly the solution isn't really object oriented: creating the namespace creates only one copy of the Sudoku field. I found a solution at http://www.tcl.tk/man/tcl8.5/tutorial/Tcl31.html but it needs TCL 8.5.

% source sudoku.tcl
% sudoku::create
% sudoku::setRandomFields
% sudoku::toString
|2|_|3| |6|_|1| |_|_|_|
|_|6|_| |9|_|_| |_|_|_|
|_|5|_| |_|_|_| |_|_|3|

|_|_|4| |_|9|_| |8|_|_|
|1|_|_| |_|_|6| |3|_|_|
|_|9|_| |7|_|_| |_|_|2|

|_|_|_| |8|_|4| |_|_|_|
|8|_|_| |_|_|_| |_|6|_|
|_|_|_| |_|6|_| |9|_|5|
% sudoku::solveBruteForce
1
% sudoku::toString
|2|4|3| |6|8|1| |5|9|7|
|7|6|1| |9|3|5| |4|2|8|
|9|5|8| |4|7|2| |6|1|3|

|5|2|4| |1|9|3| |8|7|6|
|1|8|7| |5|2|6| |3|4|9|
|3|9|6| |7|4|8| |1|5|2|

|6|7|9| |8|5|4| |2|3|1|
|8|3|5| |2|1|9| |7|6|4|
|4|1|2| |3|6|7| |9|8|5|

Same as the implementation in Python: released under the MIT license.

Attachment: sudoku.tcl (6.4 KB)

Spell check python source code

How do you spell check Python source code without having to click "Ignore" non-stop for 'def' and other Python keywords?

I have no IDE installed for Python, and a quick Google search didn't turn up anything nice. There's something for Emacs, brr.

On a mailing list somebody proposed writing a small script to extract all comments and strings. I hate small hacks like this, though, and would prefer a proper way, like opening my beloved editor and letting it decide to skip the Python code and check just the comments and strings.

Well there you go, file extractSpellCheckable.py:

#!/usr/bin/env python3
# -*- coding: utf8 -*-

"""
Usage:

cat yoursource.py | python3 extractSpellCheckable.py
"""

import sys
import tokenize

# Tokenize the Python source read line by line from standard input.
tokens = tokenize.generate_tokens(sys.stdin.readline)
for toknum, tokval, _, _, _ in tokens:
    # Print only string literals and comments, the parts worth spell checking.
    if tokenize.tok_name[toknum] in ('STRING', 'COMMENT'):
        print(tokval)

Use it for whatever you like.
