Convert collection of JPEGs to single PDF file

Install ImageMagick.

  1. Place all JPEG files to be converted into a single directory.
  2. In that directory, run: convert *.jpg my_pdf_file.pdf

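The pages follow the order in which the shell expands *.jpg, which is the sorted order of the file names. Names with unpadded numbers therefore sort unexpectedly (page10.jpg comes before page2.jpg), so zero-pad the numbers (page02.jpg, page10.jpg) if the pages must appear in numerical order. Conversion can be slow and resource-intensive if the number of JPEG files is large (around 100). The resulting PDF file is typically about the same size as the original files combined.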
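If renaming the files is not convenient, the order can also be fixed at conversion time. The following is a minimal sketch, assuming GNU sort (for its -V natural-order option) and file names that contain no newlines; the function names list_pages and jpegs_to_pdf are made up for illustration and are not part of ImageMagick.

```shell
# Print the JPEG files in a directory in natural order, one per line,
# so that page2.jpg comes before page10.jpg.
# Assumes GNU sort (-V) and file names that contain no newlines.
list_pages() {
    dir=${1:-.}
    for f in "$dir"/*.jpg; do
        # Skip the unexpanded pattern when the directory has no JPEGs.
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort -V
}

# Combine those files, in that order, into one PDF with ImageMagick.
jpegs_to_pdf() {
    dir=$1
    out=$2
    old_ifs=$IFS
    IFS='
'
    set -f                       # no globbing while splitting the list
    set -- $(list_pages "$dir")  # one positional parameter per file name
    set +f
    IFS=$old_ifs
    [ "$#" -gt 0 ] || { echo "no .jpg files in $dir" >&2; return 1; }
    convert "$@" "$out"
}
```

After pasting the functions into a shell, `jpegs_to_pdf . my_pdf_file.pdf` does the same job as step 2, but with the pages in natural order.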

QHJ1.1 – Yin Zhi 尹至 “Yin’s arrival”

The text narrates an encounter and dialog between Yi Yin 伊尹 and the Shang king Tang 商湯. Yi Yin abandons the Xia 夏 king, and reports to Shang Tang the evils of Xia government, the suffering of the people, and celestial omens for the overthrow of Xia. The same narrative appears in the “Shen Da 慎大” chapter of the Lüshi Chunqiu 呂氏春秋, though with only occasional precise textual parallels.

The following loose transcription and translation aim to capture the emerging consensus about what the text means. References to web publications on the text follow.

It was when Yin went from Xia to Bo [Shang Tang’s capital],

Carefully he arrived, and was in Tang’s presence.

Tang said: “Come! You have perhaps some fortunate intent [志 “intent” an error for 言 “news”]?”

Yin said: “My Lord, I come, having been on the road ten days now.

I have scrutinized the common people of Xia, they […] lucky and good,

but as for their Lord, he has lost all [good?] intention, is excessively fond of the two Jade Ladies, and has no sympathy for his common people.

The people indeed said: ‘We will perish together with you.’

惟災:虐德、暴重、亡典。
It is a disaster: he abuses virtue, does violence to [?], and abandons the written codes.

The Xia have omens, in the West and in the East, seeing manifestations in the sky. Their people all say, ‘It is [the sign of] our swift calamity [or “It is we who have invited this calamity”].’

They all say, ‘Why now does the eastern omen not manifest itself? What shall we do now?’”

Tang said, “Is it all so, what you have told me of Xia’s secrets [or “eclipse”, or “agonies”]?” Yin said, “That’s how it is.”

Tang’s covenant was extended to Yin, and he thence busied himself with the great Ying ritual [for warding off meteorological calamity].

Tang went to campaign against those who would not ally with him.

Zhi [i.e. Yin] planned. Zhi’s virtue was not faulty.

From the west they destroyed the western settlement [Xia], and defeated the state of Xia.

Xia counted [i.e. took a census of] its people, entered into Shui, and talked of “battle”.

Di [i.e. Tang] said: “Spare not a single one.”



劉波. 清華簡《尹至》“僮亡典”補說. 复旦大学出土文献与古文字研究中心.
孫飛燕. 試論《尹至》的“至在湯”與《尹誥》的“及湯”. 复旦大学出土文献与古文字研究中心.
朱曉海. 〈尹至〉可能是百篇《尚書》中前所未見的一篇. 复旦大学出土文献与古文字研究中心.
王寧. 清華簡《尹至》《尹誥》中的“衆”和“民”. 复旦大学出土文献与古文字研究中心.
王寧. 清華簡《尹至》釋證四例. 簡帛網.
王寧. 清華簡《尹至》、《尹誥》中“西邑”和“西邑夏”的問題. 簡帛研究.
讀書會. 清華簡《尹至》、《尹誥》研讀札記. 复旦大学出土文献与古文字研究中心.
邢文. 談清華簡《尹至》的“動亡典,夏有祥”. 簡帛網.
黃人二, 趙思木. 清華簡《尹至》餘釋. 簡帛網.
黃人二, 趙思木. 清華簡《尹至》補釋. 簡帛網.
黃懷信. 清華簡《尹至》補釋. 簡帛網.

Chinese OCR on Ubuntu Linux

Convert scanned images of Chinese documents to real, searchable, editable text.

Some information about OCR options on Ubuntu/Linux is available, but it does not explain the setup for Chinese text very well. OCRFeeder can be installed from the Ubuntu Software Center (Applications > Ubuntu Software Center, then click Office). OCRFeeder is a graphical front end for OCR engines such as Tesseract, which do the actual optical character recognition. Tesseract provides language-specific data files on its downloads page. For Chinese, these are chi_tra.traineddata.gz (traditional) and chi_sim.traineddata.gz (simplified).

  1. Download the files and gunzip them.
  2. Move them to the tessdata directory. For me the path is /usr/local/share/tessdata/.
  3. Start OCRFeeder.
  4. Open the OCR Engines dialog (Tools > OCR Engines).
  5. Click “Add”, and fill in the fields as follows:
    • Name: Tesseract – Traditional Chinese
    • Image format: TIFF
    • Failure string: (leave blank)
    • Engine path: /usr/local/bin/tesseract (or whatever the path is for your tesseract installation)
    • Engine arguments: $IMAGE $FILE -l chi_tra; cat $FILE.txt; rm $FILE
  6. The settings above are for traditional Chinese. For simplified Chinese, add a second engine in the same way, changing only these fields:
    • Name: Tesseract – Simplified Chinese
    • Engine arguments: $IMAGE $FILE -l chi_sim; cat $FILE.txt; rm $FILE

It should now be possible to select either form of Chinese when performing OCR.
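Steps 1 and 2 can also be done from a terminal. The following is a minimal sketch, assuming both .gz files have already been downloaded into one directory; the function name install_chinese_traineddata is made up for illustration, and the default destination is simply the tessdata path quoted in step 2, which may differ on other systems.

```shell
# Unpack the Chinese language files into the tessdata directory.
# Arguments (defaults are assumptions; adjust to your system):
#   $1  directory holding the downloaded .gz files (default: current directory)
#   $2  tessdata directory (default: /usr/local/share/tessdata, as in step 2)
install_chinese_traineddata() {
    src=${1:-.}
    dest=${2:-/usr/local/share/tessdata}
    for lang in chi_tra chi_sim; do
        gz="$src/$lang.traineddata.gz"
        if [ ! -f "$gz" ]; then
            echo "missing: $gz" >&2
            return 1
        fi
        # gunzip -c writes the unpacked data to stdout and keeps the .gz file.
        gunzip -c "$gz" > "$dest/$lang.traineddata" || return 1
    done
}
```

Writing to a system directory such as /usr/local/share/tessdata normally requires root privileges, so the function would have to be run from a root shell, or pointed at a directory you can write to. Once the files are in place, Tesseract can be checked outside OCRFeeder with a command such as `tesseract page.tif out -l chi_tra`, where page.tif is a hypothetical scanned image; the recognized text should end up in out.txt.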