Lessons from the First Digital China Bootcamp

Teaching Digital Tools for Chinese Studies

Digital tools provide new techniques for answering key questions in the humanities and social sciences. The Fairbank Center’s ongoing support for digital projects – including creating and housing databases, funding digital scholarship, and providing training – is helping to push the boundaries of our disciplines.


Dozens of scholars of ancient Chinese literature, history, and contemporary China flocked to the Fairbank Center in June for a first-ever digital bootcamp on new research tools revolutionizing the field of Chinese studies. All came to learn from the Harvard experts leading the way in devising new digital approaches to deep historical and sociological research.

Dozens of scholars from around the world flocked to the 2022 Digital China Bootcamp to study cutting-edge digital humanities techniques from Harvard experts, including China Biographical Database Project Manager Hongsu Wang, Digital China Fellow Kwok-leong Tang, and Professor Peter Bol. Photo Credit: Kwok-leong Tang

Participants in the intensive, two-week Digital China Workshop attended lectures, lunch talks, and small-group collaborative exercises led by Kwok-leong Tang, who heads the Fairbank Center’s work on digital humanities, and Hongsu Wang, who manages Fairbank’s innovative China Biographical Database (CBDB). The session kicked off with a talk by Harvard Professor Peter Bol, one of the godfathers of digital humanities in China studies; see below for the full workshop program.

Frank Zhou talks to Fairbank Center Digital China Fellow Kwok-Leong Tang and China Biographical Database (CBDB) Senior Manager Hongsu Wang about lessons from the Workshop and its larger implications for digital scholarship across the world.


Frank Zhou:  How would you explain the Digital China workshop in 30 seconds—an elevator pitch, if you will?

Hongsu Wang: More and more people are very interested in the digital humanities. But a lot of them don’t know where or how to start. We wanted to give them a map, instead of all the details. After they get the map, they can find the destinations that they want to reach in terms of how digital tools and techniques can enhance their understanding of their own research.

Digital China Fellow Kwok-leong Tang emphasizes that rigor and vigilance—whether in digital humanities techniques or COVID-19 precautions—are key to the bootcamp’s success. Photo Credit: Dorinda Elliott

FZ: And it’s quite the map—what a wide swath of skill sets and methodologies! How did you decide on the arc of the workshop?

Kwok-Leong Tang: From the beginning, we decided that this workshop would be for beginners. Since it’s for beginners, we needed to think about what participants should take away from our workshop—and we needed to think about our own strengths. Our strengths are the China Biographical Database (CBDB) and the China Historical Geographic Information System (CHGIS), the two megaprojects in digital scholarship in Chinese Studies at Harvard.

Professor Michael Szonyi, former Director of the Fairbank Center for Chinese Studies (2016-2022), captured highlights from lectures and lunch talks in real time.

The first week of the workshop walked participants through the workflow for CBDB: how to extract data from analog sources, books, or even archives or other material into digital form and then extract useful information to construct a database. In the second week, we structured the course thematically, including sessions devoted to network analysis, natural language processing, and visualizations.

We gave students three paragraphs on the first day and asked them, “if you need to put this into a tabular data form, what would you do?” On the last day, we gave them the same three paragraphs and asked them the same thing. We saw a dramatic difference in how participants approached the question. It is through real-world practice that we can see students experience a transformation in how they use digital tools to think differently about research questions.

HW: We picked content from our real work because we also use these techniques to work on our data. We wanted participants to know how to start their own project and to apply the techniques that they had learned to their research.
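
The exercise described above, turning paragraphs of prose into tabular data, can be illustrated with a minimal Python sketch. It is not the workshop's actual material or the CBDB workflow: the sample sentences, the regular expression, the column names, and the helper functions `extract_rows` and `rows_to_csv` are all assumptions made up for illustration.

```python
import csv
import io
import re

# Hypothetical sample prose (not the workshop's three paragraphs).
SAMPLE_TEXT = (
    "Wang Anshi (1021-1086) served as chancellor. "
    "Su Shi (1037-1101) was a poet and official. "
    "Sima Guang (1019-1086) compiled a major history."
)

# Assumed pattern: a two-word romanized name followed by "(birth-death)".
PERSON_PATTERN = re.compile(r"([A-Z][a-z]+ [A-Z][a-z]+) \((\d{4})-(\d{4})\)")


def extract_rows(text):
    """Return one dict per person mentioned in the text."""
    rows = []
    for name, birth, death in PERSON_PATTERN.findall(text):
        rows.append(
            {"name": name, "birth_year": int(birth), "death_year": int(death)}
        )
    return rows


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "birth_year", "death_year"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_TEXT)))
```

Real sources are rarely this regular, so a pattern like this would only be a starting point. The hard part is deciding which columns the research question needs.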

FZ: What was your overall assessment of the workshop?

KT: We got feedback from over 60% of our students, and all of those respondents said they would strongly recommend this workshop to their friends.

We also got a lot of feedback about our syllabus design. In the future, we also plan to offer a course for more advanced students. I would say it was a success, but there’s still a lot of room for improvement and for adding new courses on specific techniques.

FZ: Why is the Fairbank Center’s Digital China project important?

KT: From a bird’s eye perspective, the Digital China project is not just about using digital tools. It is fundamentally about big questions that concern everyone: the production of knowledge, the media through which that knowledge is disseminated, and the means by which we understand it. One small example of this digital transformation: when I was an undergraduate, most of our time was spent in the library. Now, the first step is usually to find out whether a digital version of a book exists before going to the library to look for it.

HW: For humanities scholars, using digital humanities tools means that they can quickly get an overview of—and change their opinions towards—different fields and disciplines. For example, if students have certain keywords or information about a topic, they can do data mining on all kinds of materials to reach broad conclusions about the core questions in the field.

Computer science brings us a lot of knowledge from math, statistics, and information science. This leads us to think, “if we want to work on some topics in the humanities, maybe we can borrow techniques from computer science to enhance our understanding.” This can really help to give us a lot of new tools to rethink key research questions in the humanities.

KT: We’re living through a transition from the analog to the digital age. It’s uncharted waters. We are, to use a Chinese saying, 摸着石头过河 [crossing the river by touching the stones]. I think everyone in the humanities has the same feeling. In the course, we mentioned that we need to think about a central question: how do our new methods compare with more traditional approaches?

At the heart of the bootcamp’s collaborative vision is pairing beginners with experts as peers. Professors Peter Bol and Michael Szonyi introduce the experts and mentors who joined participants in the trenches, asked hard questions, and made that vision possible. Photo credit: Dorinda Elliott

FZ: What are the particular challenges for developing digital tools in Chinese studies?

KT: Most of the methods and technologies developed in digital scholarship originated in the Western world. Computing machines themselves originated in the European and Anglophone world, which predominantly works with alphabetic writing systems. For Chinese and other East Asian languages that are not written alphabetically, we need different ways to deal with text as data.

For example, English puts a space between words as a clear boundary marker, so each word is easy to tokenize. In Chinese, especially classical Chinese, however, a word or a token can be one or multiple characters. The boundary between words is obscure; we need new techniques to tokenize accurately and reliably. This is but one example of the innovations that digital humanities scholarship on East Asian languages demands. We need specific techniques for Chinese distinct from those that were developed for alphabetic languages.
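
The contrast described above can be made concrete with a minimal Python sketch. It is only an illustration, not a method taught at the workshop: the forward maximum-matching approach, the tiny word list `TOY_LEXICON`, and the function names are assumptions chosen for brevity. English text is split on whitespace, while the Chinese saying quoted earlier has no spaces and is segmented by greedily matching the longest entry found in the word list.

```python
def tokenize_english(text):
    """Split on whitespace, which works because English marks word boundaries."""
    return text.split()


def tokenize_max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching against a word list.

    At each position, try the longest candidate first and fall back to a
    single character when nothing in the lexicon matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        longest = min(max_len, len(text) - i)
        for size in range(longest, 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real project would need a far larger,
# period-appropriate word list or a trained segmentation model.
TOY_LEXICON = {"摸着", "石头", "过河"}

if __name__ == "__main__":
    print(tokenize_english("crossing the river by touching the stones"))
    print(tokenize_max_match("摸着石头过河", TOY_LEXICON))
```

With an empty lexicon the same function falls back to one token per character, which shows how much the result depends on the word list. That dependence is one reason classical Chinese, where word boundaries are harder to pin down, calls for more careful techniques.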


The Digital China Workshop thanks Japan Digital Fellow Jungeun Lim, Postdoctoral Fellow Keyao Pan, and the Edwin O. Reischauer Institute of Japanese Studies for offering technical expertise for the workshop; Mark Grady for his support for all aspects of our programming; and the Fairbank Center for subsidizing all workshop participants—Harvard affiliates and non-affiliates alike—to make up the difference between the $1,500 program tuition and the per-capita costs of the program. The Digital China Initiative Workshop is sponsored by the Fairbank Center’s Digital China Initiative and China Biographical Database Project (CBDB).