Kwok-leong Tang, formerly a Digital China Fellow at the Fairbank Center for Chinese Studies, was named Managing Director of the Digital China Initiative last year.

Transforming the Work of Literary Sinitic Studies Through Generative AI: A Q+A with the Digital China Initiative’s Kwok-leong Tang

On most Fridays this Fall semester, the Digital China Initiative, which studies the impact of digital technologies on the field of China Studies, is offering workshops on Generative AI for Literary Sinitic Studies. Two sessions from the initial round of workshops, which focuses on bringing GenAI into the everyday workflow of researchers in China Studies, have already taken place; a third and final introductory-level session is scheduled for this Friday, and spaces are still open.

The Digital China Initiative’s GenAI for Literary Sinitic Studies workshops are being taught by Managing Director Kwok-leong Tang, formerly a Digital China Fellow at the Fairbank Center for Chinese Studies, who says they are designed to “build on each other, moving from basic adoption to advanced techniques, and finally, to a concrete research application.” Sign-ups are currently open for the next round of workshops, focusing on expanded applications for Large Language Models (LLMs), while sign-ups for the final, more advanced round of workshops will open in mid-October.

We recently caught up with Dr. Tang to ask him how the workshops have been going so far, what level of experience and familiarity with these technologies he’s been encountering, and whether or not any of his students have surprised him with their uses of this exciting new technology.

How have the workshops been going so far? While you advertise these as being for students with “no prior experience in AI,” AI has become such a big part of a lot of people’s lives that I wonder if most of the students really have been novices by your own definition? 

The workshops have been going well so far. In my experience, familiarity with AI varies widely among students and faculty in the humanities. Some tried ChatGPT when it first came out in 2022, found the answers too general or inaccurate for their specialized fields, and haven’t touched AI tools since. Others use AI daily for writing improvement, translation, and language tasks. Then, there’s a smaller group of advanced users who are building tools and conducting research with AI.

Our workshops are aimed at the first two groups, particularly those who might have given up after an initial, disappointing experience. While I advertise the workshops as being for those with “no prior experience,” in reality, only a handful of participants fall into that category. Most attendees have some familiarity with AI tools but are eager to explore their potential further. 

What I’m really trying to do is reach out to those who tried AI once or twice, got discouraged, and stopped. That’s why the first workshop topic goes beyond basic writing assistance and translation. I want to show how we can leverage GenAI in humanities research contexts in ways that might not be immediately obvious. It’s about opening eyes to the possibilities and overcoming initial skepticism.

Kwok-leong Tang (middle) with graduate students from the Department of East Asian Languages and Civilizations (EALC) and the Regional Studies East Asia (RSEA) program at the Tools of the Trade Conference, March 2023.

The workshops are divided into a “Topic A,” “Topic B,” and “Topic C.” Could you break down a bit what these three sections address and why they’re each important?

The workshops are structured into three main topics, each addressing a crucial aspect of GenAI in our field.

Topic A, “Adopting GenAI for Literary Sinitic Studies,” focuses on integrating AI into our daily research tasks. One example that really resonated with participants is using GenAI for formatting references. As researchers in our field, we often grapple with different style requirements across disciplines and publishers. It’s especially challenging with non-English materials, where we need to provide transliterations, translations, and original scripts. Many of us have spent countless hours getting references just right. Traditional software like Zotero or EndNote can’t fully automate this process because of the multilingual complexity, but GenAI tools can format references accurately based on the examples we provide. It’s a real time-saver.
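
As an editorial illustration of the example-driven formatting described above, here is a minimal Python sketch. It is not taken from the workshop materials: the function name `build_reference_prompt`, the record layout, and the bibliographic entries are all hypothetical placeholders. It only assembles a few-shot prompt, with worked examples followed by the new record, which could then be pasted into whichever chatbot a researcher uses.

```python
# Illustrative sketch only. The record, the formatted example, and the
# function name are placeholders, not material from the workshops.

def build_reference_prompt(examples, new_record):
    """Build a few-shot prompt: worked examples first, then the new record."""
    lines = [
        "Format the last record as a reference in the same style as the examples.",
        "Keep the romanization, the original script, and the English translation.",
        "",
    ]
    for raw, formatted in examples:
        lines += [f"Record: {raw}", f"Reference: {formatted}", ""]
    # Leave the final "Reference:" open for the model to complete.
    lines += [f"Record: {new_record}", "Reference:"]
    return "\n".join(lines)


# Placeholder data ("Zhang San" and "Li Si" are generic stand-in names).
examples = [
    (
        "author=Zhang San 張三; title=Shili shu 示例書 (Example Book); "
        "place=Beijing; publisher=Example Press; year=2000",
        "Zhang San 張三. Shili shu 示例書 [Example Book]. Beijing: Example Press, 2000.",
    ),
]
new_record = (
    "author=Li Si 李四; title=Shili ji 示例集 (Example Collection); "
    "place=Shanghai; publisher=Sample Press; year=2010"
)

prompt = build_reference_prompt(examples, new_record)
print(prompt)
```

Supplying one or two correctly formatted entries in the target style is what carries the style information; the instructions at the top of the prompt only remind the model which multilingual elements to preserve. Any output should still be checked against the publisher's style guide.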

In Topic B, “Beyond Chatbots: RAG and Agents,” we delve into two cutting-edge approaches in GenAI. Retrieval-Augmented Generation (RAG) allows us to generate responses based on specific knowledge bases. This addresses common issues like hallucination and overly general answers, while also offering enhanced security and privacy. It’s been a hot topic in the GenAI industry over the past year. We also explore agents, which enable language models to execute a sequence of pre-defined actions. This expands the capabilities of language models beyond simple question-answering.
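
To make the RAG idea above concrete, the following is a toy Python sketch added for illustration, not the workshop's implementation. It assumes a tiny in-memory "knowledge base" of two passages from the Analects and swaps in simple character-bigram overlap for retrieval; production systems typically use embedding-based search instead. The helper names `bigrams`, `retrieve`, and `build_grounded_prompt` are hypothetical, and the sketch stops at building the prompt rather than calling any model.

```python
# Toy retrieval-augmented prompt builder. Character-bigram overlap stands in
# for the embedding search a real RAG system would normally use.

def bigrams(text):
    """Return the set of adjacent character pairs, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + 2]) for i in range(len(chars) - 1)}


def retrieve(query, passages, k=1):
    """Return the k passages sharing the most character bigrams with the query."""
    q = bigrams(query)
    ranked = sorted(passages, key=lambda p: len(q & bigrams(p)), reverse=True)
    return ranked[:k]


def build_grounded_prompt(query, passages, k=1):
    """Put the retrieved passages in front of the question as the only context."""
    context = "\n".join(retrieve(query, passages, k))
    return (
        "Answer using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# A two-passage knowledge base (Analects 1.1 and 2.11).
passages = [
    "學而時習之，不亦說乎",
    "溫故而知新，可以為師矣",
]
query = "溫故而知新是什麼意思？"

grounded_prompt = build_grounded_prompt(query, passages)
print(grounded_prompt)
```

Because the model is asked to answer from the retrieved passage rather than from memory, its response can be traced back to a specific source text, which is the property that helps with hallucination and overly general answers.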

The final topic, “Building a Digital Collection with GenAI Tools,” is where we put theory into practice. We walk through a real-life use case, demonstrating how researchers can combine various GenAI tools to construct a digital collection. It’s a practical application that ties together everything we’ve learned.

These topics are important because they progressively build on each other, moving from basic adoption to advanced techniques, and finally, to a concrete research application. The sequence is designed to give participants a comprehensive understanding of how GenAI can transform our work in Literary Sinitic Studies.

What’s been the biggest surprise from these two workshops so far? Have any of the students taught you something that you didn’t know, or introduced an application for these tools that you had not thought about before?

The biggest surprise from these workshops has definitely been the constant learning experience—for me as much as for the participants. GenAI is such a rapidly evolving field that there are new tools, skills, and applications popping up almost daily. It’s created an environment where we’re all learning from each other.

One particular moment that stands out was in a recent workshop. We had a retired associate from the Fairbank Center who had never used a chatbot before. I was deeply impressed by his curiosity and courage in approaching this new technology. It was a powerful reminder that the desire to learn and adapt knows no age limit.

The moments of reciprocal learning are what make these workshops so rewarding. They’re not just about me teaching; they’re about creating a collaborative space where we can all explore the potential of GenAI in our field. It’s exciting to see how different minds approach these tools and come up with innovative ways to apply them to Literary Sinitic Studies.