Commons talk:Ancient Chinese characters project

From Wikimedia Commons, the free media repository
  1. Archive: Commons talk:Ancient Chinese characters project/archive1

Pending tasks for Ancient Chinese characters project:
  1. This project's initial contributors have retired. You are welcome to take over ! --Yug (talk) 17:33, 27 December 2017 (UTC)
  2. ✓  Done Finished the Commons:Ancient Chinese characters/seal -- 0 missing !!!
  3. The template {{ACClicense}} (example) needs some love and an upgrade, so that it looks as elegant as the {{SOlicense}} template (example).
  4. The templates {{ACClicense}} and {{SOlicense}} must give visitors easy access to the how-to tutorials, to make contributing easier.
  5. Find fonts to programmatically generate the Songti, Kaishu, Lishu sets.
  6. ✓  Done Template:Chinese characters naming : add "priority" column (as for teaching) #f7fcb9 #addd8e #31a354; add coverage column.
  7. Migrate the 215 SVGs_of_Kangxi_radicals_in_the_dictionary’s_own_style to {*}-songti-kangxi.svg
Type | Minimal definition | Time periods | Old convention | New convention | Example | Coverage

Images showing stroke order
•••• 楷書 / 楷书 kǎishū | Row of grey chars | web era | 馬-bw.png | 馬-bw.svg* | 馬-jbw.png | 41,01457
•• 楷書 / 楷书 kǎishū | Black to red progression | web era | 馬-red.png | 馬-red.svg* | 馬-red.png | 1422936
•••• 楷書 / 楷书 kǎishū | Animated calligraphy | web era | 馬-order.gif | 馬-order.gif ** | 馬-order.gif | 853512
楷書 / 楷书 kǎishū | Animated by stroke | web era | 馬-sbs.gif | 馬-sbs.gif ** | 馬-sbs.gif | 4
Do not use "m", "t", or "j" freely in the filenames above. We use them as follows: "m" for "modern", "t" for "traditional", "j" for "Japanese", etc.

Images showing writing styles
•• 金文 jīnwén > Shang | Cast bronze scripts, Shang period | 1300–1046 BCE | 馬-s.svg | 馬-bronze-shang.svg | 馬-bronze-shang.svg | 149
•••• 甲骨文 jiǎgǔwén | Knife-carved oracle scripts | 1300–1046 BCE | 馬-oracle.svg | 馬-oracle.svg | 馬-oracle.svg | 4,551
•••• 金文 jīnwén | Cast bronze scripts, Western Zhou period | 1046–771 BCE | 馬-bronze.svg | 馬-bronze.svg | 馬-bronze.svg | 861
甲骨文 jiǎgǔwén > Western Zhou | Knife-carved oracle scripts, Western Zhou period | 1046–771 BCE | none | 馬-oracle-zhouyuan.svg | N/A | 0
•• 金文 jīnwén > Spring and Autumn | Cast bronze scripts, Spring and Autumn period | 771–476 BCE | 馬-sa.svg | 馬-bronze-spring.svg | 馬-bronze-spring.svg | 285
•• 金文 jīnwén > Warring States | Cast bronze scripts, Warring States period | 476–221 BCE | 馬-w.svg | 馬-bronze-warring.svg | 馬-bronze-warring.svg | 214
••• 簡帛文字 jiǎnbówénzì | Brush on slip or silk scripts by Chu state | 476–221 BCE | none | 馬-silk.svg | 馬-silk.svg | 16613
••• 簡牘文字 jiǎndúwénzì | Brush on slip scripts by Qin state | 476–221 BCE | none | 馬-slip.svg | 馬-slip.svg | 153
•••• 小篆 xiǎo zhuàn | Normalized zhuan scripts by Qin dynasty and some Han dynasty zhuan scripts, collected by Shuowen | 221 BCE–220 CE | 馬-seal.svg | 馬-seal.svg | 馬-seal.svg | 2,879
•• 籀文 zhòuwén | Script from Shizhoupian by late Western Zhou, collected by Shuowen | 221 BCE–220 CE | 馬-zhou.svg | 馬-zhou.svg | 馬-zhou.svg | 11
•• 古文 gǔwén | Script from Zuo zhuan by Warring States, collected by Shuowen | 221 BCE–220 CE | 馬-ancient.svg | 馬-ancient.svg | 馬-ancient.svg | 47
奇字 qízì | Similar to variants, collected by Shuowen | 221 BCE–220 CE | none | 馬-odd.svg | N/A | 2
•• 隸書 / 隶书 lìshū | Early ancient clerical script to be specific, collected by Libian | 221–134 BCE | none | 馬-clerical.svg | 馬-clerical.svg | 47
傳抄古文字 Chuánchāo gǔwénzì | Transcribed various ancient scripts before Qin dynasty, collected by Liushutong | 1627–1644 CE | 馬-bigseal.svg | 馬-bigseal.svg | 馬-bigseal.svg | 3,448

New Modern Chinese Characters*
•• 草書 / 草书 cǎoshū | Brush on paper's fast script, aka Cursive. | 43 BCE–present | none (font available) | 馬-caoshu.svg | 馬-caoshu.svg | 1
•• 行書 / 行书 xíngshū | Brush on paper's fluid writing, aka Semi-cursive. | ca. 100 CE–present | none (font available) | 馬-xingshu.svg | 馬-xingshu.svg | 3
•• 楷書 / 楷书 kǎishū | Brush on paper since Han dynasty, aka Regular scripts | ca. 200 CE–present | none (font available) | 馬-kaishu.svg | 馬-kaishu.svg | 6
•• 宋體 / 宋体 sòngtǐ | Printing-blocks scripts. Aka Songti, Mingti. | 1368 CE–present | none (font available) | 馬-songti.svg | 馬-songti.svg | 4
•• 明體 / 明体 míngtǐ | Printing-blocks scripts. | 1??? CE–present | none (font available) | 馬-mingti.svg |  | 0
康熙 kāngxī | Printing-blocks scripts in Kangxi dictionary | 1716 CE–present | none (font available) | 馬-mingti-kangxi.svg | Kangxi Style Kangxi Radical 187.svg | 0, 215
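
The old-to-new renames listed in the table can be sketched as a small lookup. This is only an illustration, not a project tool: the function name `new_filename` and the dictionary are hypothetical, and only the three suffixes whose old and new conventions differ in the table (`-s`, `-sa`, `-w`) are covered.

```python
# Hypothetical helper: maps old ACC filename suffixes to the new convention.
# Only the three suffix changes listed in the naming table are included.
OLD_TO_NEW_SUFFIX = {
    "s": "bronze-shang",
    "sa": "bronze-spring",
    "w": "bronze-warring",
}


def new_filename(old_name: str) -> str:
    """Return the new-convention name for an old-convention filename.

    Names whose suffix is not in OLD_TO_NEW_SUFFIX are returned unchanged.
    """
    stem, dot, ext = old_name.rpartition(".")
    if not dot:
        return old_name  # no extension: nothing to migrate
    char, dash, suffix = stem.partition("-")
    if not dash or suffix not in OLD_TO_NEW_SUFFIX:
        return old_name
    return f"{char}-{OLD_TO_NEW_SUFFIX[suffix]}.{ext}"


for name in ["馬-s.svg", "馬-sa.svg", "馬-w.svg", "馬-oracle.svg"]:
    print(name, "->", new_filename(name))
```

Names such as 馬-oracle.svg pass through untouched, since the table gives them the same old and new convention.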

Visualizing the work done for the 214 radicals!

  1. Ancient Chinese characters/order ✓  Done , thanks to everyone !!! priority: high
  2. Ancient Chinese characters/red -- 200 missing. priority: low. Resource: File:214_Kangxi_stroke_order_friendly.svg
  3. Ancient Chinese characters/oracle -- 62 missing. priority: high
  4. Ancient Chinese characters/bronze -- 46 missing. priority: high
  5. Ancient Chinese characters/bigseal -- 41 missing. priority: high
  6. Ancient Chinese characters/seal ✓  Done , thanks to Wargaz !!! priority: high
  7. Ancient Chinese characters/clerical -- priority: low, fonts widely available
  8. Ancient Chinese characters/songti -- priority: low, fonts widely available
  9. Ancient Chinese characters/songti-kangxi -- priority: low, fonts widely available
  10. Ancient Chinese characters/kaishu -- priority: low, fonts widely available
  11. Ancient Chinese characters/cursive -- priority: low, fonts widely available

--Yug (talk) 17:35, 27 December 2017 (UTC)

Note : In parallel to this ACC project, User:LiliCharlie has created great SVG datasets for CJK radicals. Images in these datasets, while following a different naming convention, could be :
Yug (talk) 16:00, 28 December 2017 (UTC)
As far as naming conventions are concerned: Are you aware that Unicode will encode Chinese "pre-Hàn" characters on the Tertiary Ideographic Plane and not unify them with 汉字 proper? —LiliCharlie (talk) 16:55, 28 December 2017 (UTC)
I like your question OoO.
(1) I knew it was a technical possibility but didn't know it was on an official Unicode roadmap.
(2) The ACC project stands upon Unihan. The extreme typographic flexibility and multiple glyphs of pre-Li Si scripts are put aside. The ACC project focuses on one glyph per style for each modern character. We have in mind the illustration of learning materials, for example Wikipedia ^^.
Due to diverging national guidelines for modern-day Kaishu, we must also accept variants, generally {t|s|j|k|h} for traditional, simplified, Japanese, Korean and Hong Kong forms. We also add some other variant codes for "atypical" or "historical" variants, which we store only for rare illustrative purposes and do not aim to cover properly (for all characters).
(3) As of now, for the Kangxi and kaishu styles, I think it could be technically OK. I noticed your naming scheme ({0|1|2|3|4}), which could be of some help. Did you notice whether these numbered >0 variants are each attached to a separate country, or did you see them all coming from a single country? Yug (talk) 20:05, 28 December 2017 (UTC)
PS: This is mind flowing ^0^ --Yug (talk) 20:46, 28 December 2017 (UTC)
Yes, and note that not all small seal characters in have a corresponding modern Hàn character (e.g. numbers 93 through 96), so the current naming conventions will not be applicable for those. —LiliCharlie (talk) 23:30, 28 December 2017 (UTC)
It's beautiful....... U.U. They will also design with more regularity than we do. And I have a lead (NodeJS) to extract all these characters into {unicode}-seal.svg files :D Yug (talk) 16:10, 29 December 2017 (UTC)


As of 2018, we now have access via the internet to Chinese books that are helpful resources for drawing our shapes or selecting one among many (oracle and seal). Do you know a helpful, relevant source? Please share it with the team :) --Yug (talk) 13:57, 28 December 2017 (UTC)
Also, on Google, the following searches are insightful:

  • 字體
  • 六體 書法 字典
  • 七體 書法 字典
  • 中國 書法 字典
  • 字體 書法 字典

--Yug (talk) 13:58, 28 December 2017 (UTC)

Covering pictograms

There is a popular list of Chinese pictograms and their evolution on Wikimedia, created in 2007, which needs to be updated to use our higher-quality images. The list is as follows:

  1. 馬 马
  2. 鳥 鸟
  3. 龜 龟
  4. 龍 龙

Help redoing the table is welcome! Yug (talk) 14:04, 28 December 2017 (UTC)

Related: this template partially uses our images while also drawing on tiny custom sets. Harmonisation welcome (needs admin rights, I guess). Yug (talk) 16:32, 28 December 2017 (UTC)
The above mentioned template uses strange files. For example Character Yu3 Trad.svg is called Character Yu3 Trad.svg, but this is a misnomer since the so-called "traditional" character shape in use in Taiwan and Hong Kong is Regular Style CJKV Radical 173 (1).svg. — See also character 96E9 in the Unicode charts which give the shape Regular Style CJKV Radical 173 (0).svg for the Mainland Chinese (G), Japanese (J) and Korean (K) sources, but Regular Style CJKV Radical 173 (1).svg for Hong Kong (H), Taiwan (T) and Vietnam (V). —LiliCharlie (talk) 00:34, 29 December 2017 (UTC)
I found a CJK radicals nerd buddy #<3 + crying-emoji. More seriously, most people are simply not aware of graphic localisation or evolution. It needs knowledge and sharp eyes! That said, we must replace these ugly files and increase consistency. Yug (talk) 16:14, 29 December 2017 (UTC)
I gave it a try, but we cannot simply replace them with the available -seal.svg versions, since those are of lower graphic quality. It will need a slower replacement. --Yug (talk) 17:35, 29 December 2017 (UTC)
Oh god, what is that... Regular Style CJKV Radical 173 (6).svg #6 U_U --Yug (talk) 18:44, 29 December 2017 (UTC) How to handle those *-* ...

Quest for open fonts in Kaishu, Songti, Lishu

Hello all, I just coded an open-source JS / NodeJS script to generate ACC SVGs from fonts. This is expected to help a lot for -kaishu.svg, -lishu.svg, and -songti.svg, which are currently completely missing (aside from LiliCharlie's effort on Kaishu).

  1. Ancient Chinese characters/clerical -- priority: low, fonts widely available
  2. Ancient Chinese characters/songti -- priority: low, fonts widely available
  3. Ancient Chinese characters/kaishu -- priority: low, fonts widely available
  4. Ancient Chinese characters/cursive -- priority: low, fonts widely available

I give the program a list of 500 characters and some settings (size, style), press Enter to run it, and it outputs the 500 associated SVGs with the correct filenames. I can currently use cwTeXQKaiZH-Medium.ttf for kaishu and NotoSerifCJKtc-Medium.otf for mingti.
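
The filename-planning half of that workflow can be sketched as follows. This is not the NodeJS script itself: it is a Python illustration, `plan_outputs` is a hypothetical name, and it only computes the `{character}-{style}.svg` names without rendering any glyph.

```python
def plan_outputs(characters: str, style: str) -> list[str]:
    """Return one '{character}-{style}.svg' filename per distinct character.

    Whitespace is treated as a separator and duplicates are skipped, so a
    pasted character list can be passed in as-is.
    """
    seen = set()
    filenames = []
    for char in characters:
        if char.isspace() or char in seen:
            continue
        seen.add(char)
        filenames.append(f"{char}-{style}.svg")
    return filenames


print(plan_outputs("馬鳥龜龍馬", "kaishu"))
```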

Unfortunately, given the modules I depend on for automatic vectorization:

  1. some font formats fail (Arphic's Ukai.ttc, for example)
  2. cwTeXQKaiZH-Medium only has glyphs for true characters; it has no glyphs for the Kangxi radicals that exist only as radicals, whether in CJK Unified Ideographs (4E00+) or in Kangxi Radicals (2F00–2FDF).
Such "false" characters are not available in cwTeX fonts. They can be done by hand from Ukai, though, or patched with @LiliCharlie's SVGs, but that feels a bit dirty to me (not the same code within the SVG files).
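
To see in advance which inputs fall into the radical blocks mentioned above, the characters can be grouped by Unicode block. This sketch does not inspect any font; it only classifies code points using the published block ranges, and the helper names `block_of` and `split_by_block` are hypothetical.

```python
# Block ranges as published in the Unicode Standard.
BLOCKS = {
    "CJK Radicals Supplement": (0x2E80, 0x2EFF),
    "Kangxi Radicals": (0x2F00, 0x2FDF),
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
}


def block_of(char: str) -> str:
    """Return the name of the block containing char, or 'other'."""
    code = ord(char)
    for name, (first, last) in BLOCKS.items():
        if first <= code <= last:
            return name
    return "other"


def split_by_block(characters: str) -> dict[str, list[str]]:
    """Group characters by block, keeping their input order."""
    groups: dict[str, list[str]] = {}
    for char in characters:
        groups.setdefault(block_of(char), []).append(char)
    return groups


# U+99AC and U+9F8D are unified ideographs; U+2FBA and U+2FD3 are the
# Kangxi radical code points for horse and dragon.
print(split_by_block("\u99AC\u2FBA\u9F8D\u2FD3"))
```

Anything landing under "Kangxi Radicals" or "CJK Radicals Supplement" is a candidate for the by-hand or patched route.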

Also, I need to test more fonts and ask you for help : could you share with me your Kaishu, Songti, Lishu fonts ? To do so, please contact me via email.

Any font will be appreciated. --Yug (talk) 22:31, 9 January 2018 (UTC)

A wide range of fonts is available online, under open or closed licenses. Unfortunately, the technology used means that some formats cannot be processed. Image credit: Micheletb.
1. As for copyright, I think I can legally upload PNGs of copyrighted fonts of standard Chinese styles, and machine-made SVGs of open-license fonts. This indeed opens positive avenues beyond Kaishu, Songti and Lishu, potentially affecting Seal fonts and Semi-cursive. For Cursive, I wonder whether the calligrapher's copyright applies. Yug (talk) 10:15, 10 January 2018 (UTC)
2. As for Xingshu, I found a working font. I wonder why the characters that do not work are different... Yug (talk) 15:21, 10 January 2018 (UTC)
Thanks to Micheletb's listing, I got a nice, working Xingshu font. As before, some radicals are missing, but it should roll fine for characters. I haven't tested traditional + simplified yet.
I just ran it over a traditional kaishu font for 3000 characters and it works * ___ *. (No one cares, but I'm happy ^0^y) --Yug (talk) 14:28, 11 January 2018 (UTC)
I want to mention one thing: Ancient Chinese characters exclude Kaishu, Songti (which is a printed variant of Kaishu), Lishu, Xingshu and Caoshu. Fonts for those scripts are easy to find and commonly used now. In contrast, there is almost no font available for ancient scripts like oracle, bronze and silk, and normally they cannot be found at every corner of modern China or Japan. I think that is the reason why we started this project rather than a Kaishu or Lishu characters project. Lishu is less popular now, and most people don't write it in daily life. Maybe we could upload some Lishu ACC; however, it is still considered a modern script. As for Caoshu and Xingshu, many people still write in those ways. As always, we can use <span style="font-family:宋体;">馬</span> and <span style="font-family:楷体;">馬</span> to display 馬 in Songti and Kaiti. By the way, I wonder what else, besides Songti and Kaiti, can be displayed on a web page directly. --Wargaz (talk) 12:24, 12 January 2018 (UTC)
As for your code for modern glyphs, please note that the call to the font family doesn't work on many computers.
Yeah... I've been bothered by that... I had been leaning on the fact that Lishu, Xingshu and Caoshu were in use under the Han dynasty n__n, but since they are still in active use, they cannot be labelled as "ACC". Will we, my pet script and I, have to create a Commons:Modern_Chinese_characters_project page? Q__Q Yug (talk) 14:11, 12 January 2018 (UTC) (been thinking about it)
"I wonder besides Songti and Kaiti, what else can be displayed on web page directly."
Heiti fonts are installed on numerous systems.
Web page designers should always expect viewers not to have seemingly "universally installed" fonts installed on their various systems, but modern web typography offers solutions for this frequent scenario. So the correct reply is: Any font style can be displayed on web pages without resorting to images if you have a suitable embeddable font. —LiliCharlie (talk) 14:28, 12 January 2018 (UTC)
I would like to consistently cover the Kangxi radicals and supplement in all these Ancient (ACC) and Modern (MCC?) styles at least, for teaching purposes, since they carry the core of the meaningful cases. Files are helpful for books and web{sites|pages|apps} (wp & wiktionary), so they don't have to fight on both battlefronts, images and fonts. Some other meaningful characters would be interesting to cover consistently across ACC and MCC as well. While the priority is low for producing images of MCC styles that are widely available via open-licence fonts, the script I now run opens a cost-efficient avenue for those styles / projects. Yug (talk) 13:06, 12 January 2018 (UTC)

Community vision

Hello all, I feel we need more clarity on some aspects of the ACC project. The Template:Chinese characters naming page and the notes at its bottom gather some helpful indications. Still, I have collected the questions below and would like to discuss them in a focused manner, to clarify which way we are heading. Your input is very welcome, as the community has to set its common vision.

1. Radicals or characters for Kangxi uploads ?

I just noticed it is not clear to me where we should put our Kangxi radical images. See File:⿓-kaishu.svg (radical U+2FD3) and File:龍-kaishu.svg (character U+9F8D). Do we have a policy on that? From Ancient_Chinese_characters/kaishu it seems we are uploading under characters. Did someone notice that and make a policy? Yug (talk) 22:39, 9 January 2018 (UTC)
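
The two titles above look identical but use different code points. A small sketch with Python's standard `unicodedata` module shows the difference, and how NFKC normalization maps the Kangxi radical onto the unified ideograph. Naming files after the character is only an assumption here, since the policy question is still open, and `to_character_filename` is a hypothetical helper.

```python
import unicodedata

radical = "\u2FD3"    # KANGXI RADICAL DRAGON
character = "\u9F8D"  # CJK unified ideograph for dragon

# The glyphs look alike, but the strings are not equal.
print(radical == character)

# The radical has a compatibility mapping to the unified ideograph,
# so NFKC normalization turns U+2FD3 into U+9F8D.
print(unicodedata.normalize("NFKC", radical) == character)


def to_character_filename(symbol: str, style: str) -> str:
    """Name the file after the unified ideograph, even for a radical input."""
    unified = unicodedata.normalize("NFKC", symbol)
    return f"{unified}-{style}.svg"


print(to_character_filename(radical, "kaishu"))
```

With this approach both inputs resolve to the same title, which avoids duplicate uploads under two look-alike names.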

2. Redirect ?

Hello folks, I just noticed the ambiguous case of simplified characters for the ACC project. A few points to discuss, though!

  1. Uploads go under {traditional character}-{period}.svg, de facto and rightfully.
  2. {simplified character}-{period}.svg should generally stay empty (no image), right?
  3. {simplified character}-{period}.svg should be either red, as in no file available, or blue, as in redirect available. We currently do both. I just added one

Yug (talk) 17:43, 5 February 2018 (UTC)
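
For the "blue" option, the redirect pages could be listed programmatically. This is a minimal sketch, assuming redirects are wanted at all (which is exactly what is being asked): `redirect_plan` is a hypothetical helper, the four pairs are the ones from the pictogram list on this page, and nothing is uploaded or edited.

```python
# Simplified -> traditional pairs taken from the pictogram list on this page.
SIMPLIFIED_TO_TRADITIONAL = {"马": "馬", "鸟": "鳥", "龟": "龜", "龙": "龍"}


def redirect_plan(period: str) -> dict[str, str]:
    """Map each simplified-form file title to the wikitext of its redirect."""
    plan = {}
    for simplified, traditional in SIMPLIFIED_TO_TRADITIONAL.items():
        title = f"File:{simplified}-{period}.svg"
        plan[title] = f"#REDIRECT [[File:{traditional}-{period}.svg]]"
    return plan


for title, wikitext in redirect_plan("seal").items():
    print(title, "=>", wikitext)
```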

I think Redirect for simplified characters is unnecessary. In English Wiktionary, {{Han simp}} can redirect users to check the traditional form. In Chinese and Japanese Wiktionary, {{:Han etyl|<the simplified character / Shinjitai>}} can show the same content as the traditional one. --Wargaz (talk) 23:21, 7 February 2018 (UTC)

3. What to do with accidental graphic features and details ?

I wonder if we on the ACC project are storing the exact replicas of archaeological signs, including their accidental features and details. Or if we are storing the conceptual drawing, cleaned up from these accidental features. From the direction we take on this point will flow a series of related decisions. Yug (talk) 19:15, 7 February 2018 (UTC)

4. Documenting today's characters origins or documenting past characters ?

It is still unclear whether our aim is to illustrate the ancestors of today's characters, from the point of view of today's professors teaching today's kids.
Or whether we want to document historical signs, a number of which are now dead, replaced by other signs with the same meaning, and not currently covered by the common CJK(V) Unicode plane. We are now touching on this issue. Yug (talk) 12:12, 29 March 2018 (UTC)

2018 New Year wishes and welcome !

@Micheletb:, @Justinrleung:, @Backinstadiums:, @LiliCharlie:, @Wargaz:, @Wyangbot: happy new year to you all,

As I return to the ACC project, I noticed your activity on its various fronts. It seems we once again have the critical mass for a positive ACC community to gather discussions, share know-how and tools, push for new guidelines (cf. the expanded naming convention) or refresh old ones, increase readability, and, naturally, set priorities and align our efforts to better provide elegant and meaningful graphic resources relating to (Ancient) Chinese characters. For symbolic purposes, I just added myself back to the list of active contributors. Please feel free to do the same, add this page to your watchlist, and move your ongoing ACC discussions here when they may concern and impact the whole ACC community. Micheletb and I have long « historical » experience on the ACC project, and several of you have recently made powerful pushes; the project can surely gain from positive exchanges of ideas. Micheletb already does great mentoring, which is a model to follow (thanks <3). Also, if I forgot some active contributors, feel free to edit my message to include them. This project is super elegant and interesting, and we can be happy to include any new ally.

Also, best 2018 wishes to each of you, Yug (talk) 09:49, 10 January 2018 (UTC)

@Yug:, @Justinrleung:, @Backinstadiums:, @LiliCharlie:, @Wargaz: Best wishes for 2018, and many happy returns. I'm contributing at a low rate right now, but I do keep the project in my sights. And, of course, all new contributors are welcome and heartily encouraged to contribute. Michelet-密是力 (talk) 18:07, 12 January 2018 (UTC)

Clerical script

@Yug, Wargaz: I don't think clerical script characters should be generated from fonts. It should actually be part of the ancient characters section rather than the modern section. The clerical script fonts are modern and do not reflect ancient clerical script forms. I would recommend using forms from 隸辨, which is a great collection for the clerical script. Justinrleung (talk) 14:56, 13 February 2018 (UTC)

Hi, Justinrleung! I'm glad that you asked that. A friend of mine told me that "clerical script fonts produced by mainland China are completely valueless rubbish". Certainly, 隸辨 is a reliable and valuable source, and the early ancient clerical script can be considered as ACC, I believe. --Wargaz (talk) 15:37, 13 February 2018 (UTC)
@Justinrleung, Wargaz: We have a divergence of purpose here between « a will for strict academic and historical accuracy » and « a will for a complete set of indicative and illustrative teaching material ».
If we choose the former, indeed, using fonts is not proper. If we choose the latter, let's settle on a "correct" (consistent) font and output 3000 PNG characters. Yug (talk) 08:33, 8 March 2018 (UTC)
It's kind of related to the issue I raised above in "3. What to do with accidental graphic features and details ?": the trade-off between historical accuracy and teaching clarity, which we haven't clarified. Yug (talk) 08:36, 8 March 2018 (UTC)