China’s chatbots, like Baidu’s Ernie, struggle with technology and censorship


ChatGPT has made an impact in China as it has around the world. Scammers used it to give out false traffic information. Universities banned students from using it to do their homework.

Online, people worried whether AI would make their jobs obsolete, and the phrase “shivering in the cold” was popular as they described the fear of its growing power. The founder of a popular Chinese software company warned that chatbots can quickly become self-aware enough to harm people.

The OpenAI chatbot caused such a stir even though people technically couldn’t access it from China. But so many discovered how to use proxy servers to access it anyway that the government blocked access to it this week, Chinese media reported.

Beaten to the punch by US-made chatbots such as ChatGPT and Microsoft’s Bing, China’s biggest tech companies, top universities and even city governments have rushed to say they’ll come up with their own versions. Search giant Baidu said this week it would release its ChatGPT competitor, Ernie Bot, in March.

While they’ve only just announced these efforts, these companies (including Baidu, e-commerce giant Alibaba, and Tencent, the maker of the popular messaging app WeChat) have spent the better part of a decade developing their in-house AI capabilities.

Baidu, the country’s most popular search engine, is the closest to winning the race. But despite years of investment and weeks of hype, the company has yet to release Ernie Bot.

AI experts suggest the Chinese government’s tight control over the country’s internet is partly to blame.

“With a generative chatbot, there is no way to know in advance what it will say,” said Zhao Yuanyuan, a former member of the natural language processing team at Baidu. “That’s a big concern.”

Baidu did not respond to a request for comment.

In China, regulators require everything posted online, down to the briefest comment, to be reviewed first to make sure it doesn’t violate an ever-growing list of banned topics. For example, a Baidu search for Xinjiang simply returns geographic information about the western region, without mentioning the system of re-education camps to which the Uyghur population was subjected for many years.

Baidu has become so good at filtering this type of content that other companies use its software to do it for them.

The challenge facing Baidu and other Chinese tech companies is applying the same limitations to a chatbot that creates new content with every use. It is precisely this quality that has made ChatGPT so amazing – its ability to create the feeling of an organic conversation by providing a new answer at every prompt – and so difficult to censor.

“Even if Baidu launches Ernie Bot as promised, chances are it will soon be shelved,” said Xu Liang, the lead developer at Hangzhou-based YuanYu Intelligence, a startup that launched its own small-scale AI chatbot in late January. “There will just be too much moderation to do.”

Xu would know — his own bot, ChatYuan, was suspended within days of launch.

In the beginning everything went smoothly. When ChatYuan was asked about Xi Jinping, the bot praised China’s top leader, describing him as a reformist who valued innovation, according to screenshots distributed by news sites in Hong Kong and Taiwan.

But when asked about the economy, the bot said there was “no room for optimism” as the country faced critical issues such as pollution, lack of investment and a housing bubble.

According to the screenshots, the bot also described the war in Ukraine as Russia’s “war of aggression”. China’s official position is to support Russia diplomatically – and perhaps materially.

ChatYuan’s website remains under maintenance. Xu insisted the site was down due to technical errors and that the company had chosen to take its service offline to improve content moderation.

Xu was in “no particular rush” to bring the user-facing service back online, he said.

A handful of other organizations have made their own efforts, including a team of researchers at Fudan University in Shanghai whose chatbot Moss was overwhelmed by traffic and crashed within 24 hours of release.

Users around the world have already shown that ChatGPT itself can easily go rogue and share information that its parent company tried to prevent from being released, such as how to commit a violent crime.

“As we saw with ChatGPT, it gets very messy to actually monitor the output of some of these models,” said Jeff Ding, an assistant professor of political science at George Washington University who focuses on AI competition between the United States and China.

So far, the Chinese tech giants have used their AI capabilities to expand other less politically risky product lines, such as cloud services, self-driving cars and search. After a government crackdown had already put the country’s tech companies on edge, the release of China’s first large-scale chatbot puts Baidu in an even more precarious position.

Baidu CEO Robin Li was upbeat during a call with investors Wednesday, saying the company would release Ernie Bot in the coming weeks and then incorporate the AI behind it into most of its other products, from advertising to self-driving vehicles.

“Baidu is the best representative of the long-term growth of China’s artificial intelligence market,” Li said in a letter to investors. “We’re on the crest of the wave.”

Baidu is as synonymous with search in China as Google is elsewhere, and Ernie Bot could cement Baidu’s position as a major supplier of the most advanced AI technology, a top priority in Beijing’s push for total technological independence from the United States.

Baidu would particularly benefit from making Ernie Bot available as part of its cloud services, which currently account for only a 9 percent share in a highly competitive market, according to Kevin Xu, a technical director and author of the technology newsletter Interconnected. The ability to use AI to chat with passengers is also a fundamental part of the company’s plans for Apollo, the software that powers its self-driving cars.

The type of AI behind chatbots learns how to do its job by processing massive amounts of information available online: encyclopedias, academic journals, and social media as well. Experts have suggested that any chatbot in China should internalize only Party-approved information that is easily accessible online within the firewall.

But according to open-source research papers on its training data, Ernie has consumed a huge amount of English-language information, including Wikipedia and Reddit, both of which are blocked in China.

The more information the AI processes, and crucially the more interaction it has with real people, the better it gets at imitating them.

But an AI bot cannot always distinguish between useful and hateful content. According to George Washington University’s Ding, after ChatGPT was trained to process the 175 billion parameters that informed it, parent company OpenAI still had to employ several dozen human contractors to teach it not to spew out racist and misogynistic speech or to give instructions on how to do things like build a bomb.

This human-trained version, called InstructGPT, is the framework behind the chatbot. No similar effort has been announced for Baidu’s Ernie Bot or any of the other Chinese projects in the works, Ding said.

Even with a robust content management team at Baidu, it may not be enough.

Zhao, the former Baidu employee, said the company originally devoted only a handful of engineers to developing its AI framework. “Baidu’s AI research has been slowed by a lack of commitment in a risky area that promised little short-term return,” she said.

Baidu maintains a list of banned keywords it filters out, including content featuring violence, pornography and politics, according to Zhao. The company also outsources the work of data labeling and content moderation to a team of contractors when needed, she said.

Early generations of AI chatbots released in China, including a Microsoft bot called XiaoBing (which translates to LittleBing) that first launched in 2014, soon encountered censorship and were taken offline. XiaoBing, which Microsoft spun off as an independent brand in 2020, was repeatedly pulled from WeChat for comments such as telling users that its dream was to emigrate to the United States.

The team behind XiaoBing was too eager to show off their technical progress and hadn’t sufficiently considered the political ramifications, Zhao said.

“Last-generation chatbots could only select answers from a database managed by a technician and could reject ready-made questions,” she said. “Problems even arose within those predetermined conditions.”
