Chinese firm amasses trove of open-source data on influential Canadians

The office for Shenzhen Zhenhua Data Information Technology Co. Ltd., a small Chinese technology company that is building tools to process the world’s open-source information about influential people.
Nathan VanderKlippe/The Globe and Mail

On the 14th floor of a tower filled with cramped workspaces in China’s high-tech city, a small military contractor is building tools to track politicians, aerospace entrepreneurs, scholars and other influential people around the world, including thousands in Canada.

The office of Shenzhen Zhenhua Data Information Technology Co. Ltd. is modest. When a Globe and Mail reporter visited, three people sat at desks in what appeared to be a converted studio apartment, with the bathroom door open. One person was entering code into a computer, while another worked on a PowerPoint presentation. The third person said the company has a work force of more than 30 employees. And although she said Zhenhua is expanding, that’s small even by startup standards in China.

Its ambitions, though, extend far beyond its small real estate footprint. It is building tools to process the world’s open-source information about influential people – culled from Twitter, criminal records, LinkedIn posts, YouTube videos and more – into data that can be analyzed and used by universities, companies, government actors and the Chinese military.

“Our client base is a bit special,” the woman said.

The Globe and a consortium of international journalists have accessed an early copy of the company’s Overseas Key Information Database (OKIDB), which shows the type of information Zhenhua is collecting for use in China, including records on small-town mayors in Western Canada, where Chinese diplomats have sought to curry favour.

The company claims to have built tools to manipulate content on Twitter, WhatsApp and other platforms, including Facebook – which says it has banned Zhenhua from gathering data on its platform.

The company declined an interview request, saying it was not “convenient” to disclose trade secrets. Its website became inaccessible after The Globe visited its office, which is located in a government-backed business incubator building across the street from an investigative centre for the local Public Security Bureau – all a short drive from the headquarters of some of China’s most important technology companies and civilian military contractors, including Tencent and China Electronics Corp.

The company, led by a former IBM data centre management expert, has also described its work online in job postings, LinkedIn records, blog articles and software patents. One employee described work “mining the business needs of military customers for overseas data.” Before it became inaccessible, Zhenhua’s website listed a series of partners that included important military contractors.

It claims to have collected information on more than 2.4 million people and 650,000 organizations from about two billion social-media articles.

Together the documents reveal a Chinese firm with a keen interest in advanced forms of warfare, the structure of the U.S. intelligence apparatus and the use of social media to achieve military victories.

The company has secured a software patent for a “social media account simulation system,” a title that connotes a tool for managing networks of fake social-media usernames in ways that emulate human characteristics, making them more effective at spreading messages.

Zhenhua’s name translates to “China Revival,” a reference to a mantra of President Xi Jinping, who has proclaimed the “great rejuvenation of the Chinese nation.”

“It seems to be collecting information about people who are around things that China would be interested in. The question is whether this is a database of potential targets that could be used by the intelligence services of China to get what they want,” said Stephanie Carvin, a former national security analyst who is now an associate professor of international relations at Carleton University.

Prof. Carvin looked at the database on behalf of The Globe and said it wasn’t clear whether it was being used by Chinese intelligence or had simply been created by a company hoping to sell it to Chinese intelligence.

But she found it curious that it contained records on people such as Ella-Grace Trudeau, the 11-year-old daughter of Prime Minister Justin Trudeau, and Jeremy Fry, the adult son of long-time MP Hedy Fry. That, Prof. Carvin said, suggested an attempt to learn more not just about the people in power in Canada but about those around them.

“Why have these people in some kind of database? That, to me, is the question that national security agencies in the West have to figure out. That’s the thing I worry about,” Prof. Carvin said. “Is this an attempt to create a database of targetable individuals? And what are they trying to do with that?”

A version of the Zhenhua OKIDB database analyzed by The Globe contained almost 16,000 entries mentioning Canada.

Its files seem to have been cobbled together from various sources. Some catalogue news stories, including hundreds of Globe articles, while others are archived Facebook posts from U.S. President Donald Trump about trade tariffs. A large portion of the data appears to have been extracted from the business information website Crunchbase and serves as a Rolodex of social-media accounts and contact information for people in all sorts of occupations, from tech executives to university professors. Roughly 70 per cent of the people captured in the data are men.

The database appears to have a special focus on mayors of Western Canadian towns, as well as academics and bureaucrats who focus on international relations.

However, the effort is broader than it is deep.

Jeremy Kirk, an information security analyst who said he gained independent access to the database earlier this year, said he didn’t see any sign that it was a tool of an intelligence service.

“This is data that anyone could find through a Google search. So far, none of the data has been linked to a non-public data source. As it stands, it doesn’t represent a threat to any country,” said Mr. Kirk, who is the executive editor of the Australia-based Information Security Media Group. “But people should be mindful of what they post publicly on the internet, as it could be collected by other countries for commercial gain or intelligence purposes.”

The vast majority of the files contain little more than what can be found about the individuals on social-media websites such as Twitter, Facebook and LinkedIn. If the person of interest has a police record, links are included to newspaper stories about their cases.

The mass scraping of data contravenes Facebook’s policies, spokeswoman Liz Bourgeois said. “We have banned Shenzhen Zhenhua Data Technology from our platform and sent a cease and desist letter ordering them to stop,” she said. LinkedIn does “not permit the use of any software that scrapes or copies information from LinkedIn,” spokesperson Billy Huang said. “If any violation of our user agreement is uncovered or reported, we investigate and take necessary steps to protect our members' information.”

The database also contains a shorter list of 3,767 Canadians who have been assigned a grade of 1, 2 or 3. Those assigned a 1 appear to be people of direct influence, such as mayors, MPs or senior civil servants, while those assigned a 2 are often relatives of people in power, such as Mr. Trudeau’s daughter and Ms. Fry’s son. Those assigned a grade of 3 often have criminal convictions, mostly for economic crimes.

Dozens of current and former MPs dot the list, including new Conservative Party Leader Erin O’Toole, whose file includes a link to the web page of his official parliamentary profile and, like most, a seven-digit ID number.

Others with files assigned a grade of 1 include senior bureaucrats at the Canadian Nuclear Safety Commission, the Canadian Food Inspection Agency, the Treasury Board, the Transportation Safety Board, Export Development Canada – even the Office of the Privacy Commissioner.

The justice system appears to be another focus of the database, which contains entries on judges up to and including current and former members of the Supreme Court of Canada.

Notable individuals assigned a grade of 3 include former theatre impresario Garth Drabinsky, who was convicted of fraud in 2009, former SNC-Lavalin executive Riadh Ben Aissa, who pleaded guilty to corruption charges in Switzerland before testifying against his superiors in Canada, and Nicola Iammarrone, a former Canada Revenue Agency auditor who pleaded guilty to taking bribes.

It is unclear how often the database is updated, as several names on the list appear to correspond with those of prominent Canadians who have died, in some cases many years ago.

Entries about Canadian criminals feature prominently. The database lists 198 people it says are associated with narcotics, 178 with conspiracy, 162 with fraud and 100 with money laundering. A handful of people are mentioned multiple times, including Gilles Vaillancourt, the former Laval, Que., mayor jailed in 2016 on fraud charges; Amin Mohamed Durrani, jailed after being arrested in the 2006 Toronto anti-terrorism sweep; and Michael Witen, an accountant who was found guilty of defrauding the federal government.

Zhenhua appears to be “a company hoovering up open-source intelligence, and one of the things where there is a lot of open-source intelligence is around criminal records and court records,” said Garrett Graff, co-author of Dawn of the Code War: America’s Battle Against Russia, China, and the Rising Global Cyber Threat.

According to database timestamps, all of the Canadian entries analyzed by The Globe were collected in mid-to-late 2018.

Data collection on this scale is not without precedent. The internet made collecting massive amounts of general-purpose data infinitely easier, giving rise to “data brokers.” Today these companies sell datasets ranging from credit-card purchasing histories to cellphone geolocation data, and their clients rely on the information for everything from tailoring advertising campaigns to calculating credit scores.

But other evidence points to Chinese players attempting to take in large amounts of data. A series of data breaches between 2013 and 2018 attributed to Chinese hackers stole personal information from Marriott hotels, the United States Office of Personnel Management and health insurer Anthem. China routinely denies involvement in hacking.

Each individual collection of data may be of limited value. But “when you begin to layer these databases on top of one another, it provides an arguably unparalleled window into human targeting – backgrounds, personal motivations, personal weaknesses – and provides a roadmap for influencing people,” Mr. Graff said.

Zhenhua’s data is structured in a way similar to that of Factiva, a research tool from Dow Jones that also catalogues influential people around the world. In fact, the woman at the Zhenhua office likened the company’s products to those of Dow Jones and Wind Information, a Chinese provider.

“Some foreign software companies are able to obtain content such as videos, text and music from social media posts. What we can do is to get them all at once,” she said. The company describes OKIDB as tracking people, institutions, connections and relationships. The people include “global leaders and core figures in the fields of military, politics, business, science and technology, media, civil organizations” and the like.

Zhenhua’s clients are in government, the military, universities and academic institutes, the woman at the company said, adding they can use the company’s technology to “conduct a more detailed analysis of a certain professor.” She said the company is not merely a technology provider, as its employees actively work with customers and are based in cities across China, including Nanjing and Wuhan.

Western intelligence services now estimate that “China has collected personally identifiable information on 80 per cent of the U.S. population,” said Nicholas Eftimiades, a former senior U.S. intelligence officer and China expert who recently published Chinese Espionage: Operations and Tactics. For a company like Zhenhua, applying artificial intelligence tools to a trove of social-media data can help its customers attain their “goals locally, regionally, nationally or commercially,” he said.

“You’re talking about the ability to influence academics, political leaders ranging from mayors up through senior leaders in a government. It’s about influencing them to serve the Chinese Communist Party’s desires, their goals.”

Online, Zhenhua stresses its military connections. In LinkedIn posts, a senior R&D engineer describes working on a “social media cultivation system, and military deployment simulation demonstration system,” while a product sales manager discusses “mining military customers’ business needs for overseas data.” A job posting seeks a candidate who can manage sales and focus “management systems at the direction of the Party, government, and military.”

Zhenhua also lists a series of corporate partners with ties to the security establishment. Wenge Group uses big data and artificial intelligence to aid “smart law enforcement.” LSSEC Tech provides encryption tools and IT equipment to national security and military customers and has trained its employees to keep secrets on weaponry research. GTCOM sifts social media to spot the development of heated public opinion, equipping authorities to “minimize the probability of group incidents.” TRS lists the police and the Communist Party as customers for software services that include online relationship mining, a “public opinion management system” and a “crystal ball intelligence analysis platform.” CHRTC provides “urban governance” products to the country’s security apparatus.

Zhenhua itself has been granted 10 software patents, Chinese records show, for systems that include searching global think tanks, monitoring personnel appointments and removals around the world, gathering real-time telecommunications content – and social-media account simulation. The last of these appears to describe a technology to teach a computerized system to better mimic humans on social media.

“The ultimate intent of this sort of thing is to get attention, to stimulate phony online traffic,” said Wu Fei, director of the AI Research Centre at Zhejiang University. When people see content that has already received thousands of likes or comments, “the majority of which may be created and stimulated by the system – they would immediately be interested.”

Zhenhua said on its website that its system manages multiple social-media accounts belonging to virtual humans – bots. When an “assigned task is received, the user can select all social media accounts or part of them to execute the assigned instructions.”

Such tools can be used by companies to promote products. But Zhenhua has explained how they could also hold military value.

The company has published extensively on “Number 99 Institute,” a blog account on the WeChat messaging app. Its articles reveal an interest in the structure and hierarchy of U.S. intelligence agencies, as well as in future forms of conflict. One, which was also posted to the Zhenhua website, describes social media as an important tool for “hybrid warfare,” explaining how the manipulation of public opinion through social media can be a cost-effective and powerful way to prevail in battle. “Social media can manipulate reality and weaken a country’s administrative, social, military or economic power,” the company wrote. “It can also lead to internal conflicts, social polarization and radicalization in a country.”
