BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Fairbank Center for Chinese Studies - ECPv6.15.12.2//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Fairbank Center for Chinese Studies
X-ORIGINAL-URL:https://fairbank.fas.harvard.edu
X-WR-CALDESC:Events for Fairbank Center for Chinese Studies
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241107T120000
DTEND;TZID=America/New_York:20241107T130000
DTSTAMP:20260503T044051
CREATED:20241029T172857Z
LAST-MODIFIED:20241029T172858Z
UID:38083-1730980800-1730984400@fairbank.fas.harvard.edu
SUMMARY:Transforming Classical Chinese Texts into Searchable Databases with AI
DESCRIPTION:Speaker: Guenther Lomas\, Founder\, Sigtica\n\nAs artificial intelligence becomes integral to the digital humanities\, it offers innovative methods that transform research capabilities and uncover new insights into historical texts and cultural narratives. This talk will demonstrate how AI-powered pipelines can process large volumes of unstructured classical Chinese texts\, such as genealogies and Qing dynasty government employee records\, including those from the Da Qing jin shen quan shu\, into organized\, searchable databases.\n\nThe pipeline addresses a longstanding challenge in classical Chinese studies: the labor-intensive manual data entry process. It is designed to efficiently process millions of pages from historical Chinese texts\, tackling complexities like layout identification and precision in text extraction. Central to this effort is customized Optical Character Recognition (OCR)\, which enhances data extraction accuracy and identifies key fields using Named Entity Recognition (NER) models. The result is clean\, tabular databases that improve accessibility\, allowing researchers to analyze Chinese historical content with unprecedented efficiency. Furthermore\, this methodology holds potential applications for other languages\, including Japanese\, Korean\, Arabic and Latin\, broadening its impact.\n\nBy exploring these methodologies and their implications\, this presentation aims to show how integrating advanced technological tools enriches scholarly inquiry in the digital humanities\, providing deeper insights into patterns and narratives within Chinese history and beyond. This approach promises to revolutionize data collection\, paving the way for alternative research practices across various linguistic contexts.\n\nLunch will be provided. Registration required.
URL:https://fairbank.fas.harvard.edu/events/transforming-classical-chinese-texts-into-searchable-databases-with-ai/
LOCATION:CGIS South Room S354\, 1730 Cambridge St\, Cambridge\, MA\, 02138\, United States
CATEGORIES:Special Event
ATTACH;FMTTYPE=image/jpeg:https://fairbank.fas.harvard.edu/wp-content/uploads/2024/09/Digital-China-LOGO.jpg
END:VEVENT
END:VCALENDAR