Skip to main content

再谈pdftohtml支持中文

以前的文章里提到过pdftohtml支持中文的问题, 是用ccmap里面的.cmap文件解决的.

所谓pdftohtml支持中文, 其实根fpdftohtml没什么关系, 而是pdf文件自己需要支持. 我认为它跟pdf支持复制粘贴是一回事.

今天突然发现我的pdftohtml又不支持中文了, 整蛊半天发现我不知什么时候我把那些.cmap文件删了-_-b

于是再去http://lsec.cc.ac.cn/cgi-bin/viewcvs.cgi/cct/ccmap/#dirlist逛逛, 发现原来的cmap的tar包已经删除了, 而多了一个makecmap.tex文件用于生成cmap文件.

我要转的是GBK编码的, 用命令 sudo latex \\def\\cmapEnc{GBK} \\input{makecmap.tex} 即可,
但是虽然弄出来了cmap文件, 编译出来的pdf还是不能复制粘贴, pdftohtml当然也不行.

于是上网搜了搜, 发现了CJKutf8, 说是能代替ccmap, 于是试了下, 真的可以, 只需要加入\usepackage{CJKutf8} 即可, 真不错.

不过,好像有超长行时会出些bug.

以后考虑用utf8了...

Comments

Popular posts from this blog

Determine Perspective Lines With Off-page Vanishing Point

In perspective drawing, a vanishing point represents a group of parallel lines, in other words, a direction. For any point on the paper, if we want a line towards the same direction (in the 3d space), we simply draw a line through it and the vanishing point. But sometimes the vanishing point is too far away, such that it is outside the paper/canvas. In this example, we have a point P and two perspective lines L1 and L2. The vanishing point VP is naturally the intersection of L1 and L2. The task is to draw a line through P and VP, without having VP on the paper. I am aware of a few traditional solutions: 1. Use extra pieces of paper such that we can extend L1 and L2 until we see VP. 2. Draw everything in a smaller scale, such that we can see both P and VP on the paper. Draw the line and scale everything back. 3. Draw a perspective grid using the Brewer Method. #1 and #2 might be quite practical. #3 may not guarantee a solution, unless we can measure distances/p...

[转] UTF-8 and Unicode FAQ for Unix/Linux

这几天,这个东西把我搞得很头疼 而且这篇文章好像太大了,blogger自己的发布系统不能发 只好用mail了 //原文 http://www.cl.cam.ac.uk/~mgk25/unicode.html UTF-8 and Unicode FAQ for Unix/Linux by Markus Kuhn This text is a very comprehensive one-stop information resource on how you can use Unicode/UTF-8 on POSIX systems (Linux, Unix). You will find here both introductory information for every user, as well as detailed references for the experienced developer. Unicode has started to replace ASCII, ISO 8859 and EUC at all levels. It enables users to handle not only practically any script and language used on this planet, it also supports a comprehensive set of mathematical and technical symbols to simplify scientific information exchange. With the UTF-8 encoding, Unicode can be used in a convenient and backwards compatible way in environments that were designed entirely around ASCII, like Unix. UTF-8 is the way in which Unicode is used under Unix, Linux, and similar systems. It is now time to make sure that you are well familiar ...

Moving Items Along Bezier Curves with CSS Animation (Part 2: Time Warp)

This is a follow-up of my earlier article.  I realized that there is another way of achieving the same effect. This article has lots of nice examples and explanations, the basic idea is to make very simple @keyframe rules, usually just a linear movement, then use timing function to distort the time, such that the motion path becomes the desired curve. I'd like to call it the "time warp" hack. Demo See the Pen Interactive cubic Bezier curve + CSS animation by Lu Wang ( @coolwanglu ) on CodePen . How does it work? Recall that a cubic Bezier curve is defined by this formula : \[B(t) = (1-t)^3P_0+3(1-t)^2tP_1+3(1-t)t^2P_2+t^3P_3,\ 0 \le t \le 1.\] In the 2D case, \(B(t)\) has two coordinates, \(x(t)\) and \(y(t)\). Define \(x_i\) to the be x coordinate of \(P_i\), then we have: \[x(t) = (1-t)^3x_0+3(1-t)^2tx_1+3(1-t)t^2x_2+t^3x_3,\ 0 \le t \le 1.\] So, for our animated element, we want to make sure that the x coordiante (i.e. the "left" CSS property) is \(...