Facebook’s Language Gaps Weaken Screening of Hate, Terrorism

In Gaza and Syria, journalists and activists feel Facebook censors their speech, flagging inoffensive Arabic posts as terrorist content. In India and Myanmar, political groups use Facebook to incite violence. All of it frequently slips through the company’s efforts to police its social media platforms because of a shortage of moderators who speak local languages and understand cultural contexts.

Internal company documents from the former Facebook product manager-turned-whistleblower Frances Haugen show the problems plaguing the company’s content moderation are systemic, and that Facebook has understood the depth of these failings for years while doing little about it.

The company has failed to develop artificial-intelligence solutions that can catch harmful content in different languages. As a result, terrorist content and hate speech proliferate in some of the world’s most volatile regions. Elsewhere, the company’s language gaps lead to overzealous policing of everyday expression.

This story, along with others published Monday, is based on Haugen’s disclosures to the Securities and Exchange Commission, which her legal team also provided to Congress in redacted form. The redacted versions received by Congress were obtained by a consortium of news organizations, including The Associated Press.

In a statement to the AP, a Facebook spokesperson said that over the last two years the company has invested in recruiting more staff with local dialect and topic expertise to bolster its review capacity globally.

When it comes to Arabic content moderation, in particular, the company said, “We still have more work to do.”

But the documents show the problems are not limited to Arabic. In Myanmar, where Facebook-based misinformation has been linked repeatedly to ethnic violence, the company’s internal reports show it failed to stop the spread of hate speech targeting the minority Rohingya Muslim population.

In India, the documents show moderators never flagged anti-Muslim hate speech broadcast by Prime Minister Narendra Modi’s far-right Hindu nationalist group because Facebook lacked moderators and automated filters with knowledge of Hindi and Bengali.

Arabic, Facebook’s third-most common language, does pose particular challenges to the company’s automated systems and human moderators, each of which struggles to understand spoken dialects unique to each country and region, their vocabularies salted with different historical influences and cultural contexts. The platform won a vast following across the region amid the 2011 Arab Spring, but its reputation as a forum for free expression in a region full of autocratic governments has since changed.

Scores of Palestinian journalists have had their accounts deleted. Archives of the Syrian civil war have disappeared. During the 11-day Gaza war last May, Facebook’s Instagram app briefly banned the hashtag #AlAqsa, a reference to the Al-Aqsa Mosque in Jerusalem’s Old City, a flashpoint of the conflict. The company later apologized, saying it confused Islam’s third-holiest site for a terrorist group.

Criticism, satire and even simple mentions of groups on the company’s Dangerous Individuals and Organizations list — a docket modeled on the U.S. government equivalent — are grounds for a takedown.

“We were incorrectly enforcing counterterrorism content in Arabic,” one document reads, noting the system “limits users from participating in political speech, impeding their right to freedom of expression.”

The Facebook blacklist includes Gaza’s ruling Hamas party, as well as Hezbollah, the militant group that holds seats in Lebanon’s Parliament, along with many other groups representing wide swaths of people and territory across the Middle East.

The company’s language gaps and biases have led to the widespread perception that its reviewers skew in favor of governments and against minority groups.

Israeli security agencies and watchdogs also monitor Facebook and bombard it with thousands of orders to take down Palestinian accounts and posts as they try to crack down on incitement.

“They flood our system, completely overpowering it,” said Ashraf Zeitoon, Facebook’s former head of policy for the Middle East and North Africa region, who left in 2017.

Syrian journalists and activists reporting on the country’s opposition also have complained of censorship, with electronic armies supporting embattled President Bashar Assad aggressively flagging dissident posts for removal.

Meanwhile in Afghanistan, Facebook does not translate the site’s hate speech and misinformation pages into Dari and Pashto, the country’s two main languages. The site also doesn’t have a bank of hate speech terms and slurs in Afghanistan, so it can’t build automated filters that catch the worst violations.

In the Philippines, homeland of many domestic workers in the Middle East, Facebook documents show that engineers struggled to detect reports of abuse by employers because the company couldn’t flag words in Tagalog, the major Philippine language.

In the Middle East, the company over-relies on artificial-intelligence filters that make mistakes, leading to “a lot of false positives and a media backlash,” one document reads. Largely unskilled moderators, in over their heads and at times relying on Google Translate, tend to passively field takedown requests instead of screening proactively. Most are Moroccans and get lost in the translation of Arabic’s 30-odd dialects.

The moderators flag inoffensive Arabic posts as terrorist content 77% of the time, one report said.

Although the documents from Haugen predate this year’s Gaza war, episodes from that bloody conflict show how little has been done to address the problems flagged in Facebook’s own internal reports.

Activists in Gaza and the West Bank lost their ability to livestream. Whole archives of the conflict vanished from newsfeeds, a primary portal of information. Influencers accustomed to tens of thousands of likes on their posts saw their outreach plummet when they posted about Palestinians.

“This has restrained me and prevented me from feeling free to publish what I want,” said Soliman Hijjy, a Gaza-based journalist.

Palestinian advocates submitted hundreds of complaints to Facebook during the war, often leading the company to concede error. In the internal documents, Facebook reported it had erred in nearly half of all Arabic language takedown requests submitted for appeal.

Facebook’s internal documents also stressed the need to enlist more Arab moderators from less-represented countries and to assign them only to content in dialects they know well.

“It is surely of the highest importance to put more resources to the task to improving Arabic systems,” said the report.

Meanwhile, many across the Middle East worry the stakes of Facebook’s failings are exceptionally high, with potential to widen long-standing inequality, chill civic activism and stoke violence in the region.

“We told Facebook: Do you want people to convey their experiences on social platforms, or do you want to shut them down?” said Husam Zomlot, the Palestinian envoy to the United Kingdom. “If you take away people’s voices, the alternatives will be uglier.”

Source: Voice of America