Convert collection of JPEGs to single PDF file

Install ImageMagick.

  1. Place all the JPEG files to be converted in a single directory.
  2. In that directory, run: convert *.jpg my_pdf_file.pdf

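The pages appear in the order in which the shell expands *.jpg, that is, the sorted order of the file names. Numbered names therefore sort as intended only if the numbers are zero-padded (page02.jpg rather than page2.jpg, which would otherwise come after page10.jpg). Conversion can be slow and resource-intensive when the number of JPEG files is large (ca. 100). The resulting PDF file is typically about the same size as the sum of the original file sizes.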
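
When the file names carry unpadded numbers, one workaround is to sort them in natural order before handing them to ImageMagick. The following is only a sketch: it assumes GNU sort (for the -V version-sort option), GNU find and xargs, and file names without newlines. The function names sorted_jpegs and jpegs_to_pdf are made up for illustration.

```shell
# List the .jpg files of a directory in natural (version) order,
# so that page2.jpg comes before page10.jpg.
# Assumes GNU sort (-V) and file names that contain no newlines.
sorted_jpegs() {
    find "${1:-.}" -maxdepth 1 -type f -name '*.jpg' | sort -V
}

# Hypothetical helper: pass the sorted names to ImageMagick in one call.
# Usage: jpegs_to_pdf DIRECTORY OUTPUT.pdf
# (ImageMagick 7 installs may provide "magick" in place of "convert".)
jpegs_to_pdf() {
    sorted_jpegs "$1" | tr '\n' '\0' |
        xargs -0 -r sh -c 'convert "$@" "$0"' "$2"
}
```

For example, jpegs_to_pdf . my_pdf_file.pdf would build the PDF from the current directory; if the directory holds no .jpg files, nothing is run.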

QHJ1.1 – Yin Zhi 尹至 “Yin’s arrival”

The text narrates an encounter and dialog between Yi Yin 伊尹 and the Shang king Tang 商湯. Yi Yin abandons the Xia 夏 king, and reports to Shang Tang the evils of Xia government, the suffering of the people, and celestial omens for the overthrow of Xia. The same narrative appears in the “Shen Da 慎大” chapter of the Lüshi Chunqiu 呂氏春秋, though with only occasional precise textual parallels.

The following loose transcription and translation aim to capture the emerging consensus about what the text means. References to web publications on the text follow.

惟尹自夏徂亳,
It was when Yin went from Xia to Bo [Shang Tang’s capital],

逯至,在湯。
Carefully he arrived, and was in Tang’s presence.

湯曰:“格,汝其有吉志。”
Tang said: “Come! You have perhaps some fortunate intent [志 “intent” an error for 言 “news”]?”

尹曰:“后,我來,越今旬日。
Yin said: “My Lord, I come, having been on the road ten days now.

余微其有夏眾□吉好,
I have scrutinized the common people of Xia, they […] lucky and good,

其有后厥志其喪,寵二玉,弗虞其有眾。
but as for their Lord, he has lost all [good?] intention, is excessively fond of the two Jade Ladies, and has no sympathy for his common people.

民允曰:‘余及汝皆亡。’
The people indeed said: ‘We will perish together with you.’

惟災:虐德、暴重、亡典。
It is a disaster: he abuses virtue, does violence to [?], and abandons the written codes.

夏有祥,在西在東,見章于天,其有民率曰:‘惟我速禍。’
The Xia have omens, in the West and in the East, seeing manifestations in the sky. Their people all say, ‘It is [the sign of] our swift calamity [or “It is we who have invited this calamity”].’

咸曰:‘曷今東祥不章?今其如台?’”
They all say, ‘Why now does the eastern omen not manifest itself? What shall we do now?’”

湯曰:“汝告我夏隱率若時?”尹曰:“若時。”
Tang said, “Is it all so, what you have told me of Xia’s secrets [or “eclipse”, or “agonies”]?” Yin said, “That’s how it is.”

湯盟誓及尹,茲乃務大禜。
Tang’s covenant was extended to Yin, and he thence busied himself with the great Ying ritual [for warding off meteorological calamity].

湯往征弗附。
Tang went to campaign against those who would not ally with him.

摯度,摯德不僭。
Zhi [i.e. Yin] planned. Zhi’s virtue was not faulty.

自西翦西邑,戡其有夏。
From the west they destroyed the western settlement [Xia], and defeated the state of Xia.

夏料民,入于水,曰戰。
Xia counted [i.e. took a census of] its people, entered into Shui, and talked of “battle”.

帝曰:“一勿遺。”
Di [i.e. Tang] said: “Spare not a single one.”

References

劉波. 清華簡《尹至》“僮亡典”補說. 复旦大学出土文献与古文字研究中心
孫飛燕. 試論《尹至》的“至在湯”與《尹誥》的“及湯”. 复旦大学出土文献与古文字研究中心. Available from: http://www.gwz.fudan.edu.cn/SrcShow.asp?Src_ID=1373
朱曉海. 〈尹至〉可能是百篇《尚書》中前所未見的一篇. 复旦大学出土文献与古文字研究中心
王寧. 清華簡《尹至》《尹誥》中的“衆”和“民”. 复旦大学出土文献与古文字研究中心. Available from: http://www.gwz.fudan.edu.cn/SrcShow.asp?Src_ID=1396
王寧. 清華簡《尹至》釋證四例. 簡帛網
王寧. 清華簡《尹至》、《尹誥》中“西邑”和“西邑夏”的問題. 簡帛研究
讀書會. 清華簡《尹至》、《尹誥》研讀札記. 复旦大学出土文献与古文字研究中心
邢文. 談清華簡《尹至》的“動亡典,夏有祥”. 簡帛網
黃人二, 趙思木. 清華簡《尹至》餘釋. 簡帛網
黃人二, 趙思木. 清華簡《尹至》補釋. 簡帛網
黃懷信. 清華簡《尹至》補釋. 簡帛網

Chinese OCR on Ubuntu Linux

Convert scanned images of Chinese documents to real, searchable, editable text.

Some information on OCR options for Ubuntu/Linux is available, but it doesn’t explain the setup for Chinese text very well. OCRFeeder can be installed from the Ubuntu Software Center (Applications > Ubuntu Software Center, then click on Office). OCRFeeder works as a graphical front end for OCR engines such as Tesseract, which do the actual optical character recognition. The Tesseract project provides files for language-specific OCR on its downloads page. For Chinese, these are chi_tra.traineddata.gz and chi_sim.traineddata.gz, for traditional and simplified Chinese respectively.

  1. Download the files and gunzip them.
  2. Move them to the tessdata directory. For me the path is /usr/local/share/tessdata/.
  3. Start OCRFeeder.
  4. Open the OCR Engines dialog (Tools > OCR Engines).
  5. Click “Add”, and fill in the fields as follows:
    • Name: Tesseract – Traditional Chinese
    • Image format: TIFF
    • Failure string: (leave blank)
    • Engine path: /usr/local/bin/tesseract (or whatever the path is for your tesseract installation)
    • Engine arguments: $IMAGE $FILE -l chi_tra; cat $FILE.txt; rm $FILE
  6. That was for traditional Chinese. For simplified Chinese, add another engine. The following fields will be different:
    • Name: Tesseract – Simplified Chinese
    • Engine arguments: $IMAGE $FILE -l chi_sim; cat $FILE.txt; rm $FILE

It should now be possible to select either form of Chinese when performing OCR.
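
Steps 1 and 2, and what the engine arguments do for a single image, can be sketched on the command line roughly as follows. This is an illustration, not OCRFeeder's own code: the shell variable TESSDATA_DIR and the function names install_traineddata and ocr_chinese are made up here, and the tessdata path is the one from my installation, so adjust it to yours. Writing to that directory will usually require root privileges.

```shell
# Assumed location of the tessdata directory; override it as needed.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# Steps 1-2: unpack a downloaded language file into tessdata.
# $1: path to a file such as chi_tra.traineddata.gz
install_traineddata() {
    name=$(basename "$1" .gz)
    gunzip -c "$1" > "$TESSDATA_DIR/$name"
}

# Roughly what the engine arguments do: tesseract writes its result to
# <output base>.txt, which is printed and then cleaned up.
# $1: image file, $2: language code (chi_tra or chi_sim)
ocr_chinese() {
    base=$(mktemp)
    tesseract "$1" "$base" -l "$2" >/dev/null 2>&1 && cat "$base.txt"
    rm -f "$base" "$base.txt"
}
```

For example, after install_traineddata chi_tra.traineddata.gz, a call such as ocr_chinese scan.tif chi_tra (scan.tif being a placeholder name) should print the recognized text, which is a quick way to check the language data before configuring OCRFeeder.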