From Wikimedia Commons, the free media repository
Please contact me using my Talk page or my email address artsakenos at GMail.


Contributions in the Chinese Character Decomposition (CCD) project

The Chinese Character Decomposition page: Commons:Chinese_characters_decomposition (CCD Talk)

A more exhaustive guide to CCD: User:Artsakenos/CCD-Guide (replaced in the revision as of 09:32, 22 April 2013)

TSV Version of CCD: User:Artsakenos/CCD-TSV

Software to retrieve CCD statistics and exploit decomposition: [Download]
There have been many downloads lately; if you find it useful, please consider making a donation.

The ISO10646 decomposition legend: User:Artsakenos/CCD-ISO10646

The table of the new Unicode character set: User:Artsakenos/CCD-Table2

Other contributors: User:Yug/Stroke_order2, User:Micheletb

Refactoring of CCD, 12 March 2014

A subset of the CJK Decomposition Data (released under the Apache 2.0 Licence) was considered, after removing from the CJK data all characters that are merely coded and all characters with multiple decompositions. The result:

  • Considered set: 20902 lines;
  • Empty words on CJK: 1788;
  • Differences with CJK: 4785 (see what follows).

Statistics before the refactoring:

Composition Kind Count: 品=48,吅=14958 (71.6%),*=4,+=196,冖=132,叕=4,弼=60,十=1,咒=77,一=283,吕=4725 (22.6%),回=412,?=2

Verification Part1: =19203 (91.9%),?=1699

Verification Part2: =18698 (89.5%),?=2204

Most common compounds: 木: 1025, 口: 725, 金: 706, 氵: 1032, 艹: 968
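The percentages quoted in the composition-kind counts above can be re-derived from the raw counts. The following is only a minimal sketch: the counts are copied from the line above, while the names `pre_counts` and `shares` are illustrative and are not part of the CCD software.

```python
# Composition-kind counts copied from the statistics above (before refactoring).
pre_counts = {
    "品": 48, "吅": 14958, "*": 4, "+": 196, "冖": 132, "叕": 4,
    "弼": 60, "十": 1, "咒": 77, "一": 283, "吕": 4725, "回": 412, "?": 2,
}


def shares(counts):
    """Return (kind, count, percentage) tuples sorted by descending count.

    Hypothetical helper for illustration; percentages are rounded to one decimal.
    """
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(kind, n, round(100.0 * n / total, 1)) for kind, n in ranked]


pre_shares = shares(pre_counts)
for kind, n, pct in pre_shares:
    print(f"{kind}\t{n}\t{pct}%")
```

The counts sum to 20902, which matches the size of the considered set, and the two dominant kinds (吅 and 吕) come out at 71.6% and 22.6%, as stated.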

Statistics after the refactoring:

Composition Kind Count: [吅=14959 (71.6%),吕=4727 (22.6%),回=411 (1.96%), 一=283, +=198, 冖=132, 咒=77, 弼=60, 品=48, *=4, 叕=4, ?=2]

Verification Part1: [=19962 (95.5%), ?=943]

Verification Part2: [=15344 (95.5%), ?=937]
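To see which composition kinds actually changed, the two "Composition Kind Count" lines can be compared directly. This is a sketch under the assumption that both lines are transcribed correctly; `count_changes` is a hypothetical helper, not a function of the CCD software.

```python
# Composition-kind counts copied from the two statistics blocks above.
pre = {
    "品": 48, "吅": 14958, "*": 4, "+": 196, "冖": 132, "叕": 4,
    "弼": 60, "十": 1, "咒": 77, "一": 283, "吕": 4725, "回": 412, "?": 2,
}
post = {
    "吅": 14959, "吕": 4727, "回": 411, "一": 283, "+": 198, "冖": 132,
    "咒": 77, "弼": 60, "品": 48, "*": 4, "叕": 4, "?": 2,
}


def count_changes(before, after):
    """Return {kind: after - before} for every kind whose count changed.

    A kind missing from one side is treated as having a count of 0 there.
    """
    kinds = set(before) | set(after)
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in kinds}
    return {k: d for k, d in deltas.items() if d != 0}


changes = count_changes(pre, post)
for kind, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kind}\t{delta:+d}")
print("net change:", sum(post.values()) - sum(pre.values()))
```

Only five kinds differ between the two lines (吅, 吕, 回, + and 十), and the totals differ by three entries.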


 -> 勤黄错撒共赛; in CCD one finds: 共 6 吕 井? 4  ? 八 2 TC 八


--(Artsakenos (talk)) 09:12, 20 March 2011 (UTC)