Pao K Wang1, Kuan-Hui Elaine Lin1, Yi-Chun Liao2, Hsiung-Ming Liao3, Yu-Shiuan Lin1, Ching-Tzu Hsu1, Shih-Ming Hsu1, Chih-Wei Wan1, Shih-Yu Lee1, I-Chun Fan3, Pei-Hua Tan4, Te-Tien Ting5.
Abstract
This paper describes the methodology of an ongoing project to construct an East Asian climate database, REACHES, based on Chinese historical documents. The record source is the Compendium of Meteorological Records of China in the Last 3000 Years, which collects meteorology- and climate-related records, mainly from official and local chronicles along with a small number of other documents. We report the digitization of the records covering the period 1644-1795. An example of the original records is translated to illustrate the typical contents, which include the time, location, and type of events. Chinese historical dates and place names are converted into Gregorian calendar dates and latitudes and longitudes. A hierarchical database system is developed that consists of domains, main categories, subcategories, and further details. Historical events are then digitized and categorized into this system. Code systems are developed at all levels so that the original descriptive entries are converted into digitized records suitable for computer processing. Statistics and characteristics of the digitized records in the database are described.
Year: 2018 PMID: 30561430 PMCID: PMC6298253 DOI: 10.1038/sdata.2018.288
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. The image of the first record as published in Compendium Vol. III 2nd Edition.
Figure 2. A flow chart of the REACHES database construction, showing the digitization process of the historical records in Compendium and how it interacts with the historical GIS and Lunar-Gregorian Calendar Conversion systems of Academia Sinica.
Hierarchical data structure of event categorization system in REACHES database.
| Domain | Main Category | Subcategory (some examples) |
|---|---|---|
| Meteorology | Precipitation | rain, snow, hail, etc. |
| | Temperature | warm/sultry/freezing/cold |
| | Visibility | haze/fog/darkness |
| | Thunder, Lightning | thunder/lightning |
| | Optical | corona/rainbow/halo |
| | Wind | typhoon/tornado/wind |
| | Cloud | colored clouds |
| | Gas, Air | gas/colored gas |
| Hazard | Drought | drought/dry riverbank/dry well |
| | Flood | flood/tidal surge/surge/tsunami/debris |
| | Pest/Vermin | locust/other pests |
| | Crops | bumper harvest/harvest failure; different species are distinguished |
| | Disease | plague/malaria/herpes/dysentery |
| | Famine | famine/cannibalism |
| Unusual phenomena | Geophysical abnormities | earthquake/river course change/land cover change |
| | Precipitation abnormities: color | black rain/red rain/yellow rain |
| | Precipitation abnormities: plants | rained grains, nuts, flowers, etc. |
| | Precipitation abnormities: animal | rained fish, bugs, etc. |
| | Precipitation abnormities: metal | rained sulphur, black particles, etc. |
| | Acoustical abnormities | thunder without clouds |
| | Sun-related phenomena | solar eclipse, solar prominence, sunspot |
| | Moon-related phenomena | lunar eclipse, red moon |
| | Astronomical phenomena | new star (nova or supernova) |
| | Plant abnormities | abnormal growing conditions of plants; species are distinguished |
| | Animal abnormities | abnormal animal conditions; species are distinguished |
| | Socioeconomic turmoil | wars, unrest, human migration, disaster relief, human trafficking |
| Others | Unrecognized vocabularies | unclear ‘dragon’ description |
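The domain, main category, and subcategory levels listed in the table lend themselves to a nested mapping. The following is a minimal sketch only: the names `HIERARCHY` and `find_path` are hypothetical, only a small subset of the table is included, and it does not reproduce the code system actually used in REACHES.

```python
from typing import Optional, Tuple

# Illustrative subset of the hierarchy: domain -> main category -> subcategories.
# The structure and names here are assumptions, not the REACHES code system.
HIERARCHY = {
    "Meteorology": {
        "Precipitation": ["rain", "snow", "hail"],
        "Temperature": ["warm", "sultry", "freezing", "cold"],
    },
    "Hazard": {
        "Drought": ["drought", "dry riverbank", "dry well"],
        "Flood": ["flood", "tidal surge", "tsunami"],
    },
}


def find_path(subcategory: str) -> Optional[Tuple[str, str, str]]:
    """Return (domain, main category, subcategory), or None if not found."""
    for domain, categories in HIERARCHY.items():
        for category, subcategories in categories.items():
            if subcategory in subcategories:
                return (domain, category, subcategory)
    return None


print(find_path("hail"))
```

Keeping the levels explicit in this way means an event description only has to be matched at the subcategory level; the main category and domain follow from the lookup.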
Figure 3. Map of geographical sites mentioned in the records in Compendium Vol. III.
Colors represent the number of records of the site (n = 1,435).
Figure 4. The number of local chronicles published as a function of time in the period 1403–1911.
Gray bars indicate eras of the Ming dynasty; yellow bars indicate those of the Qing dynasty. The Chinese characters at the bottom are Chinese era names.
Figure 5. The number of L1-records in Sichuan Provincial Chronicles as a function of time in the period 1644–1795 (n = 906).
Figure 6. The number of L2-records as a function of time in the period 1644–1795 (n = 49,714).
Figure 7. Number of L2-records in the six most populated categories.
One record can contain more than one event (such as the record shown in Figure 1). Each percentage is therefore the share of records that mention the category, and the percentages sum to more than 100%.
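The arithmetic behind a total above 100% can be illustrated with a small sketch. The four records below are made-up examples, not data from the database, and `category_rates` is a hypothetical helper name.

```python
from collections import Counter

# Made-up records for illustration; each record is the set of categories it mentions.
records = [
    {"Precipitation", "Flood"},
    {"Drought"},
    {"Precipitation", "Crops"},
    {"Flood"},
]


def category_rates(records):
    """Percentage of records that mention each category."""
    counts = Counter(category for record in records for category in record)
    n = len(records)
    return {category: 100.0 * count / n for category, count in counts.items()}


rates = category_rates(records)
total = sum(rates.values())
print(rates, total)
```

Because two of the four records carry two categories each, the per-category rates here add up to 150% rather than 100%.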
Figure 8. Network analysis chart of categories.