Chinese support in figlet

figlet's help says it supports multi-byte characters, such as CJK characters.

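That obviously requires font files. I found out that the package is called figfonts-CJK, but it is not easy to download. Later I discovered that although the Debian binary package could not be fetched, the source could, and after unpacking it, there was gb16fs.flf, heh. Put all the .flf and .flc files from it into /usr/share/figlet and that is it. Note that the subdirectories have to be dropped: copy only the files themselves.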
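The copy step can be sketched in Python as below. This is only an illustration: the function name `flatten_figlet_files` and its arguments are made up here, and the real target on the author's system was /usr/share/figlet, which normally needs root permissions to write to.

```python
import shutil
from pathlib import Path


def flatten_figlet_files(src_root, font_dir):
    """Copy every .flf/.flc file found under src_root directly into font_dir.

    Hypothetical helper: subdirectories of src_root are not recreated,
    only the files themselves are copied.
    """
    font_dir = Path(font_dir)
    font_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_root).rglob("*")):
        if path.is_file() and path.suffix in (".flf", ".flc"):
            shutil.copy2(path, font_dir / path.name)  # drop the subdirectory part
            copied.append(path.name)
    return copied
```

A call would look like `flatten_figlet_files("unpacked-source-dir", "/usr/share/figlet")`, where the first argument is wherever the source archive was unpacked.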

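I thought that would be enough, but "figlet -f gb16fs 王璐" printed garbage. I checked the help again, which says a control file is needed. figfonts-CJK ships a few, named unshift, iso2022, big5 and so on, but I tried them all and none of them worked.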

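I read figfont.txt carefully. It describes the formats of figlet's font files and control files. From it I concluded that the control file should use the b command, but it turned out that using it did not help either.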

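Looking again at the description of commands in figfont.txt, I found that the h command is similar, except that it is meant for the HZ encoding: "~{" is the start marker, after which every two bytes are treated as one Chinese character until "~}" is encountered.

I wrote a small C++ program to look at the GB encoding of "王": it is 0xcdf5, which corresponds to 0x4d75 in gb16fs. Then I looked at unshift.flc, and it seemed to be exactly what I wanted: the conversion it defines clears the highest bit of each byte, so 0xcdf5 should become 0x4d75. On closer inspection, though, it was wrong. The two ranges following each t command should be the same size, but the first range on each line ended with \0x??7E, which is clearly incorrect and should be \0x??FE. So I saved a copy as wl.flc and fixed this bug.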
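The arithmetic above can be checked with a few lines of Python. This is a sketch, not the author's C++ program: it relies on Python's built-in gb2312 codec, and the row 0xCD ranges are only an assumed example of what one corrected t line would cover.

```python
def gb_code(ch):
    """Return the two-byte GB2312 code of one character as an integer."""
    hi, lo = ch.encode("gb2312")
    return (hi << 8) | lo


def unshift(code):
    """Clear the top bit of each of the two bytes."""
    return code & 0x7F7F


def range_size(first, last):
    """Number of codes in an inclusive range."""
    return last - first + 1


wang = gb_code("王")
print(hex(wang), hex(unshift(wang)))  # 0xcdf5 0x4d75

# Assumed example row: with the end fixed to FE, both ranges hold 94 codes.
print(range_size(0xCDA1, 0xCDFE), range_size(0x4D21, 0x4D7E))  # 94 94
```

With the first range ending in 7E instead, its end would lie before its start (0xCD7E is less than 0xCDA1), which is why the two ranges could not match in size.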

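Now I tried "figlet -f gb16fs -C wl 王璐", and it still did not work. My intuition told me the input was still not being handled as multi-byte. I checked gb16fs.flf again, and sure enough, the first garbage glyph was character 205, that is 0xcd, the first byte of "王"!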

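So I switched to the h command to force double-byte handling and changed the input to "~{王璐~}", still with gb16fs as the font and wl as the control file (the h command having been added to wl.flc), and it appeared!!!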
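For reference, the byte sequence that such an input amounts to can be sketched as follows. It assumes the text is encoded with Python's gbk codec (a superset of GB2312, matching a GBK terminal); `hz_wrap` is a made-up name for illustration.

```python
def hz_wrap(text):
    """Wrap the GBK bytes of text in the "~{" and "~}" markers."""
    return b"~{" + text.encode("gbk") + b"~}"


framed = hz_wrap("王璐")
# Two marker bytes, two bytes per character, two marker bytes.
print(len(framed))          # 8
print(framed[2:4].hex())    # cdf5
```

Between the markers the bytes still have their high bits set, so the t lines in the control file are what map them down to the 7-bit codes used by the font.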

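This was a big step towards success. It showed that the h command works, that the font file is fine, and that figlet really is able to handle multi-byte characters. So the target narrowed down to the b command.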

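The documentation says that the b command reads one byte at a time: if it is less than 0x80 it is output as a single-byte character, otherwise one more byte is read and the two together are output as a double-byte character. That looks like it should work, but it simply does not.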
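The documented rule, as paraphrased above, can be sketched in Python like this. It is an illustration of the description only, not figlet's actual code, and `read_dbcs` is a hypothetical name.

```python
def read_dbcs(data):
    """Split bytes into character codes following the documented rule:
    a byte below 0x80 stands alone, any other byte is joined with the next one.
    """
    codes = []
    i = 0
    while i < len(data):
        code = data[i]
        i += 1
        if code >= 0x80 and i < len(data):
            code = (code << 8) | data[i]
            i += 1
        codes.append(code)
    return codes


print([hex(c) for c in read_dbcs(b"A\xcd\xf5")])  # ['0x41', '0xcdf5']
```

Under this rule the GB bytes of "王" would be read as the single code 0xcdf5, which is what the font lookup needs.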

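Then I lost my temper and downloaded the source to read (from the official site; the only source available was 2.2.1, not 2.2.2, but the difference should be small). In figlet.c, around line 742, is the code that interprets the input commands, and it shows that the b command sets multibyte to 1. getinchr documents the mode corresponding to each value of multibyte, and the DBCS mode for multibyte=1 looked fine as well.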

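But on actually reading getinchr(), I was shocked to find that when multibyte==1, only bytes in the ranges 0x80~0x94 and 0xE0~0xEF are treated as the first byte of a double-byte character!! What a big bug. This is presumably the same as SHIFT-JIS (if its description of SHIFT-JIS is correct). It is hardly surprising, though: the authors are not familiar with CJK and would not have tested this, so the bug had gone unnoticed all along (at least the home page does not mention it).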

Naturally I fixed it: changing the range to 0x80~0xFF is enough. I also added one line to wl.flc: t \0xFFA1-\0xFFFE \0x7F21-\0x7F7E
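The effect of the change can be sketched by comparing the two lead-byte tests in Python. This is a model of the behaviour described in the post, not the C source of figlet; the function names are made up for illustration.

```python
def is_lead_original(byte):
    """Lead-byte test as the post describes it for multibyte == 1."""
    return 0x80 <= byte <= 0x94 or 0xE0 <= byte <= 0xEF


def is_lead_patched(byte):
    """Lead-byte test after widening the range to 0x80..0xFF."""
    return 0x80 <= byte <= 0xFF


def split_chars(data, is_lead):
    """Group bytes into character codes using the given lead-byte test."""
    codes, i = [], 0
    while i < len(data):
        code = data[i]
        i += 1
        if is_lead(code) and i < len(data):
            code = (code << 8) | data[i]
            i += 1
        codes.append(code)
    return codes


wang = b"\xcd\xf5"  # GB bytes of "王", as given in the post
print([hex(c) for c in split_chars(wang, is_lead_original)])  # ['0xcd', '0xf5']
print([hex(c) for c in split_chars(wang, is_lead_patched)])   # ['0xcdf5']
```

With the original ranges, 0xcd is not recognised as a lead byte, so it is looked up on its own, which matches the garbage glyph seen earlier. With the widened range the two bytes are combined into one code.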

When compiling, the font directory has to be set. Then compile and install, and after that also copy the fonts and control files out of /usr/share/figlet.

And then... hahaha... it finally appeared!!!

Comments

ESN said…
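"So I switched to the h command to force double-byte handling and changed the input to "~{王璐~}", still with gb16fs as the font and wl as the control file (the h command having been added to wl.flc), and it appeared!!!"

Is the h command mentioned above placed on a line of its own in unshift.flc? Is your terminal locale zh_CN.UTF-8? I get no output here. Any advice would be appreciated, thanks.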
Lu Wang said…
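Ah, to be honest I have forgotten; it has been such a long time.

But judging from the circumstances, my terminal at the time must have been zh_CN.GBK, otherwise the input could not have been read as GB-encoded.

The h should go in unshift.flc.

Also, I see there is now a control file called utf8; I wonder whether it could be put to use.

Where did you download figfonts-cjk from? I cannot find it any more.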
ESN said…
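Thanks for your reply.

I downloaded it from here:
http://archive.debian.net/en/slink/figfonts-cjk
Download link:
http://archive.debian.org/debian/dists/slink/non-free/source/text/figfonts_2.2.orig.tar.gz

I also tried utf8.flc, but there was still no output.

Also, that drawing of the girl is great. What software did you use for it? I like drawing on my phone, though I am not very good at it, heh. And could you send me an email? Chatting here clutters up the page.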
