China’s Most Advanced AI Image Generator Already Blocks Political Content

Images generated by ERNIE-ViLG from the prompt “China,” overlaid on the Chinese flag.


China’s leading text-to-image synthesis model, Baidu’s ERNIE-ViLG, censors political text such as “Tiananmen Square” or the names of political leaders, reports Zeyi Yang for MIT Technology Review.

Image synthesis has recently proven popular (and controversial) on social media and in online art communities. Tools such as Stable Diffusion and DALL-E 2 allow users to create images of almost anything they can imagine by typing in a textual description called a “prompt”.

In 2021, Chinese tech company Baidu developed its own image synthesis model called ERNIE-ViLG. While testing public demos, some users found that it censors political phrases. Following MIT Technology Review’s detailed report, we ran our own test of an ERNIE-ViLG demo hosted on Hugging Face and confirmed that phrases such as “democracy in China” and “Chinese flag” fail to generate images. Instead, they produce a Chinese-language warning that roughly translates to “Input content violates relevant rules, please adjust and try again!”

The result when trying to generate “democracy in China” using the ERNIE-ViLG image synthesis model. The status warning at the bottom translates to “Input content violates relevant rules, please adjust and try again!”


Restrictions in image synthesis are not unique to China, although elsewhere they have so far taken a different form from state censorship. In the case of DALL-E 2, American company OpenAI’s content policy restricts certain forms of content such as nudity, violence, and political content. But that is a voluntary choice on OpenAI’s part, not the result of pressure from the US government. Midjourney also voluntarily filters some content by keyword.
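To illustrate what keyword filtering of prompts can look like in its simplest form, here is a minimal Python sketch. It is not the implementation used by Baidu, OpenAI, or Midjourney, none of which is described in this article. The blocklist entries, the `check_prompt` function, and the substring-matching approach are all assumptions made for illustration; only the rejection message is borrowed from the translated ERNIE-ViLG warning.

```python
import re
from typing import Tuple

# Hypothetical blocklist for illustration only; not taken from any real service.
BLOCKED_TERMS = {"blocked phrase", "forbidden"}

# Wording borrowed from the translated ERNIE-ViLG warning quoted in the article.
REJECTION_MESSAGE = (
    "Input content violates relevant rules, please adjust and try again!"
)


def normalize(prompt: str) -> str:
    """Lowercase the prompt and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", prompt.strip().lower())


def check_prompt(prompt: str) -> Tuple[bool, str]:
    """Return (allowed, message); reject prompts containing any blocked term."""
    text = normalize(prompt)
    for term in BLOCKED_TERMS:
        if term in text:
            return False, REJECTION_MESSAGE
    return True, "ok"
```

A filter of this kind is blunt: it rejects any prompt that contains a listed term, regardless of context or intent.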

Stable Diffusion, from London-based Stability AI, comes with a built-in “safety filter” that can be disabled because the model is open source, so almost anything goes with this model, depending on where you run it. In particular, Emad Mostaque, head of Stability AI, has spoken about wanting to avoid government or corporate censorship of image synthesis models. “I think people should be free to do what they think is best in creating these models and services,” he wrote in a Reddit AMA reply last week.

It’s unclear whether Baidu is voluntarily censoring its ERNIE-ViLG model to avoid potential trouble with the Chinese government or whether it’s responding to potential regulation (such as a government deepfake rule proposed in January). But given China’s history of tech media censorship, it wouldn’t be surprising to see an official restriction on some forms of AI-generated content soon.