SSSOC – Guang yun data

Data imported from the RhymeDict project. Four tables: entries, niu, initials and finals.

Page number data imported from the UniHan database: a single table with graph, page number, and number on page. The page numbers and the numbers within each page need to be combined with the entries table from RhymeDict, but this is not a trivial task:

  • graphs do not provide a unique index
  • 25361 rows in entries table vs 25337 from UniHan, a difference of 24. The last 28 rows of the RhymeDict entries (and the corresponding last 9 rows of niu) seem to be some kind of supplement, perhaps for syllables missing in the dictionary.
  • There are three duplicates in the UniHan data, i.e. three pairs of graphs with the same page number and number on page:
    • 曅 and 𬀽 at 540.47
    • 匨 and 𫧔 at 183.9
    • 𦶎 and 𬝨 at 93.48

For each duplicate pair, the first graph listed above was retained and the second was deleted.
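
The duplicate check and the keep-first rule can be sketched as follows. This is only a minimal illustration, not the procedure actually used: the function names and the (graph, page, number) tuple layout are assumptions, and the last sample row is a made-up placeholder rather than real data.

```python
from collections import defaultdict

# Sample rows as (graph, page, number_on_page). The three pairs are the
# duplicates listed above; the last row is a made-up placeholder.
rows = [
    ("曅", 540, 47),
    ("𬀽", 540, 47),
    ("匨", 183, 9),
    ("𫧔", 183, 9),
    ("𦶎", 93, 48),
    ("𬝨", 93, 48),
    ("甲", 1, 1),  # placeholder, not real data
]


def find_duplicates(rows):
    """Return {(page, number): [graphs]} for positions claimed by more than one graph."""
    by_position = defaultdict(list)
    for graph, page, number in rows:
        by_position[(page, number)].append(graph)
    return {pos: graphs for pos, graphs in by_position.items() if len(graphs) > 1}


def keep_first(rows):
    """Keep only the first row seen for each (page, number) position."""
    seen = set()
    kept = []
    for graph, page, number in rows:
        if (page, number) in seen:
            continue  # a later graph at an already used position is dropped
        seen.add((page, number))
        kept.append((graph, page, number))
    return kept


duplicates = find_duplicates(rows)
deduplicated = keep_first(rows)
```

Note that `keep_first` depends on row order, so the graph to be retained has to come first in the input.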

There were also a few other graphs that were missing from, or incorrectly present in, one source or the other. These were corrected by hand.
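
One possible way to locate such mismatches before correcting them by hand is to compare the two sources as multisets of graphs, since graphs are not unique and a plain set comparison would hide differences in how often a graph occurs. The sketch below is an assumption about how this could be done, not a record of what was done; `graph_mismatches` and the sample lists are hypothetical.

```python
from collections import Counter


def graph_mismatches(entries_graphs, unihan_graphs):
    """Compare two lists of graphs as multisets.

    Returns two Counters: graphs (with surplus counts) that occur more often
    in the entries table, and graphs that occur more often in the UniHan data.
    """
    entries_counts = Counter(entries_graphs)
    unihan_counts = Counter(unihan_graphs)
    # Counter subtraction keeps only positive counts.
    only_in_entries = entries_counts - unihan_counts
    only_in_unihan = unihan_counts - entries_counts
    return only_in_entries, only_in_unihan


# Made-up placeholder data, for illustration only.
surplus_entries, surplus_unihan = graph_mismatches(
    ["甲", "甲", "乙"],
    ["甲", "丙"],
)
```

Each graph reported in either result would then still need to be checked against the dictionary itself.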
