Ambrose Li

How to typeset Chinese-language book titles the traditional way on the web

(updated )

In Chinese languages, titles of books and similar works are not italicized.‍[Note 1] Instead, they are given any of the following treatments:

  1. marked with a wavy underline,‍[Note 2]
  2. enclosed in cap-height double guillemets (《,》),‍[Note 3]
  3. enclosed in CJK-style quotation marks (「,」 or 『,』; sometimes called corner brackets), or
  4. left unmarked.

In practice, titles of books are most often left unmarked (option 4).

When titles of books are marked, it’s often with cap-height double guillemets (option 2). As in English, using quotation marks for this purpose is now discouraged.‍[Note 4]

Option 1 is still taught, so why it isn’t used can be a matter of debate. However, I’ll claim that it isn’t used because there’s no way to use it in a word processor, in typesetting software, or on the web. The technology we use is just too Western-centric.


When Unicode was invented, it inherited a one-em “wavy overline” (U+FE4B) from Big5. Unfortunately, unaware of its function, the standard did not create a nonspacing variant or include any control codes that can position it correctly. Even though the punctuation mark exists in Unicode, it is useless.

However, on the web it is possible to typeset it correctly using CSS plus Javascript. If you want a comprehensive solution you can use Han.css,‍[Note 5] but I’ll describe the minimalistic solution I came up with‍[Note 6] that I use on this site.


My solution, like any other workable solution, consists of three parts: two parts in Javascript and one part in CSS.

First we define the Javascript functions that will do the actual work. At the top level, add_citation_marks, which takes an optional selector in case you don’t want to process the entire page, looks for cite elements that are in some kind of traditional Chinese‍[Note 7] and calls add_citation_mark to process each of these cite elements. Because we’re going to break up the text inside the cite, it’s probably a good idea to inject an aria-label into any cite element we’re manipulating.

At the next level, add_citation_mark takes an element that we find, then calls add_citation_mark_to_text to split the text into characters and mark them up with span class=cite.

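function add_citation_mark_to_text (s) {
    var t = '';
    /* Walk the string by code point rather than by UTF-16 code unit, so a
       character stored as a surrogate pair is not split across two spans.
       Note that s is inner HTML: characters serialized as entities (such
       as &amp;) are not handled here. */
    for (var c of s) {
        t += '<span class=cite>' + c + '</span>';
    } /* for */
    return t;
} /* add_citation_mark_to_text */

function add_citation_mark (elem) {
    if ($(elem).is('span.cite')) {
        return; /* already marked up by an earlier call */
    } /* if */
    if ($(elem).children().length) {
        $(elem).children().each(function (i, subelem) {
            add_citation_mark(subelem);
        });
    } else {
        $(elem).html(add_citation_mark_to_text($(elem).html()));
    } /* if */
} /* add_citation_mark */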

function add_citation_marks( root = false ) {
  let langs = ['zh-HK', 'zh-TW', 'zh-yue-hant', 'zh-cmn-hant', 'zh-hant'];
  $(langs.map(x => (root? root + ' ': '') + 'cite:lang(' + x + ')').join()).each(function (i, elem) {
    $(elem).attr('aria-label', $(elem).text());
    add_citation_mark(elem);
  });
  $(langs.map(x => (root? root + ' ': '') + ' cite>[lang=' + x + ']').join()).each(function (i, elem) {
    if (!$(elem).parent().attr('aria-label')) {
      $(elem).parent().attr('aria-label', $(elem).parent().text());
    } /* if */
    add_citation_mark(elem);
  });
} /* add_citation_marks*/
The Javascript functions
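
The per-character step can also be written without jQuery. The sketch below is a hypothetical variant (the name wrap_title_characters is made up for illustration and is not part of the code above). It assumes it is given plain text, such as the result of $(elem).text(), rather than inner HTML. It escapes &, <, and > itself, and it iterates by code point so that a character outside the Basic Multilingual Plane ends up in a single span.

```javascript
/* Hypothetical helper, not part of the code above: wraps each character of
   a plain-text title in <span class=cite>, escaping HTML-special characters */
function wrap_title_characters (text) {
    var escapes = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
    var t = '';
    /* for...of walks the string by code point, so a character stored as a
       surrogate pair (such as U+20BB7) stays in a single span */
    for (var c of text) {
        t += '<span class=cite>' + (escapes[c] || c) + '</span>';
    } /* for */
    return t;
} /* wrap_title_characters */
```

Since the input is treated as text, any markup nested inside the cite element would be lost, so this variant would only suit leaf elements.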

Then, we make sure to call add_citation_marks when everything is loaded.

$(function () {
  add_citation_marks();
});
The Javascript code in the jQuery ready handler
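
If only part of the page should be processed, a selector can be passed instead, for example add_citation_marks('#bibliography') (the id is made up for illustration). The sketch below isolates the expression that add_citation_marks uses to build its first selector, wrapped in a hypothetical helper called build_cite_selector, to show what jQuery ends up being asked to find.

```javascript
/* Hypothetical helper: reproduces the first selector built inside
   add_citation_marks so that it can be inspected on its own */
function build_cite_selector (root = false) {
    let langs = ['zh-HK', 'zh-TW', 'zh-yue-hant', 'zh-cmn-hant', 'zh-hant'];
    /* join() with no argument separates the items with commas, which makes
       the result a selector list with one entry per language tag */
    return langs.map(x => (root? root + ' ': '') + 'cite:lang(' + x + ')').join();
} /* build_cite_selector */

console.log(build_cite_selector('#bibliography'));
```

Each entry in the resulting list is prefixed with the root selector and a space, so only cite elements that are descendants of the root are matched.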

Finally we have the CSS that undoes the usual italics and adds the wavy underlines:

cite:lang(zh-hant),
cite:lang(zh-HK),
cite:lang(zh-TW) {
  font-style: inherit;
  display: inline-flex;
}
cite :lang(zh-hant),
cite :lang(zh-HK),
cite :lang(zh-TW) {
  font-style: normal; /* this is wrong, but can't do better than this */
  display: inline-flex;
}
.cite:lang(zh-hant):after,
.cite:lang(zh-HK):after,
.cite:lang(zh-TW):after {
  position: absolute;
  padding-top: 1em;
  align-self: baseline;
  content: "﹋";
}
The CSS to add the wavy underlines

This CSS isn’t foolproof, though. In multi-column layouts the ˈsy ˌmiŋ ˍhou could become detached from the text it marks. I’m not sure if this is a bug in my code, in the browser, or in CSS itself.

You can see examples of this code in action on this site, for example in the notes on this page or in the bibliography of the Meta-index to etymologically correct written forms of Cantonese words.

Notes

  1. I’ll claim that it is not possible to properly italicize Chinese-language text. Software can only synthesize oblique type, not italic (flowing) type. Also, any synthesized oblique type will be slanted along the wrong axis; this has to do with Chinese languages being originally written vertically.

    A style that fits the definition of italics in Western typography does exist in Chinese-language typography, but it is not usually treated as italic (possibly due to a mistranslation) and is not used in the same way italics are used in modern Western typography.

  2. This is called the ˈsy ˌmiŋ ˍhou (書名號 ‘book title mark’). See Ministry of Education of Taiwan, 重訂標點符號手冊 [Revised handbook of punctuation marks], Revised ed. (2008), http://​language​.moe​.gov​.tw/​001/​Upload/​FILES/​SITE​_​CONTENT/​M0001/​HAU/​h12​.htm.
  3. Confusingly, this is also called the ˈsy ˌmiŋ ˍhou. See MOE Taiwan, Revised handbook.
  4. This is only true in Chinese languages; in Japanese double quotation marks are usual.
  5. Ethan Chen, “Han.css: the CSS typography framework optimised for Hanzi” (2013), https://​github​.com/​ethantw/​Han.
  6. Originally created for a different project. See Ambrose Li, “presentation_template” (2017), https://​github​.com/​acli/​presentation​_​template.
  7. These can only be examples. In principle, it is meaningful to tag something as, for example, zh-cn-hant (Mainland Chinese written in traditional characters; i.e., the vocabulary and/or grammar is Mainland Chinese, but the presentational form is traditional).

Tags

  • #Chinese languages
  • #code snippets
  • #CSS
  • #Javascript
  • #punctuation
  • #typography