# Comprehensive News Summary: Scale AI Data Exposure

---

* **News Title:** Scale AI exposed sensitive data about clients like Meta and xAI in public Google Docs, BI finds
* **Report Provider:** Business Insider
* **Authors:** Charles Rollet, Effie Webb, Shubhangi Goel, Hugh Langley
* **Date Published:** 2025-06-24 16:42:16 (UTC)
* **Topic/Sub-Topic:** Technology / AI
* **Keywords:** Meta, Cybersecurity, Exclusive
* **URL:** [https://www.businessinsider.com/scale-ai-public-google-docs-security-2025-6](https://www.businessinsider.com/scale-ai-public-google-docs-security-2025-6)

---

## Summary of Findings

Business Insider (BI) has uncovered significant security vulnerabilities at Scale AI, a prominent AI data-labeling startup, revealing that the company routinely used public Google Docs to manage work for high-profile clients like Google, Meta, and xAI. This practice exposed thousands of confidential AI training documents and sensitive contractor data, raising serious cybersecurity and confidentiality concerns. The revelations come in the wake of Meta's recent $14.3 billion investment in Scale AI and the planned move of Scale AI cofounder Alexandr Wang to Meta.

### Core Issue: Public Google Docs Usage

Scale AI, which relies on a vast network of at least **240,000 contractors**, used public Google Docs as an efficient, albeit risky, method for sharing internal files and tracking work. BI reviewed thousands of these files, finding many marked "confidential" and accessible to anyone with the link. Some documents were even editable by external parties.

### Details of Data Exposure

1. **Client Confidential AI Projects:**
    * BI viewed **thousands of pages** of project documents across **85 individual Google Docs** related to Scale AI's work with major tech clients.
    * **Google:** At least **seven instruction manuals** marked "confidential" by Google were publicly accessible. These documents detailed issues with Google's chatbot, then called Bard (e.g., difficulties answering complex questions), outlined how Scale contractors should improve it, and revealed that Google used ChatGPT in that effort.
    * **xAI:** Public Google documents and spreadsheets exposed details of "Project Xylophone"; Scale ran at least **10 generative AI projects** for xAI as of April. The exposed material included training documents and a list of **700 conversation prompts** focused on improving the AI's conversation skills across diverse topics.
    * **Meta:** Confidential Meta training documents, including links to accessible audio files with examples of "good" and "bad" speech prompts, were publicly available. These projects aimed to train Meta's chatbots to be more conversational and emotionally engaging while handling sensitive topics safely. Meta had at least **21 generative AI projects** with Scale as of April.
    * Contractors reported easily identifying clients or products, even when projects were codenamed, sometimes from client logos or by directly prompting the AI model.
2. **Contractor Sensitive Information:**
    * Unsecured spreadsheets listed the names and private Gmail addresses of **thousands of Scale AI workers**.
    * Documents detailed contractor work performance, including a spreadsheet titled "Good and Bad Folks" that categorized workers as "high quality" or suspected of "cheating."
    * Another list of hundreds of personal email addresses was titled "move all cheating taskers" and flagged workers for "suspicious behavior."
    * One document named nearly **1,000 contractors** who were "mistakenly banned" from Scale AI's platforms.
    * Other documents showed individual contractor pay rates, along with notes on pay disputes and discrepancies.

### Context and Background

* **Meta's Investment:** The findings emerge shortly after Meta's substantial **$14.3 billion investment** in Scale AI, which also involves Scale AI cofounder Alexandr Wang joining Meta.
* **Client Reactions:** Following Meta's investment, clients such as Google, OpenAI, and xAI reportedly **paused work** with Scale AI.
* **Scale AI's Reassurance:** In a recent blog post, Scale AI sought to reassure Big Tech clients, emphasizing its neutrality, independence, and commitment to "robust technical and policy safeguards" and "strict security standards."

### Risks and Concerns

* **Cybersecurity Vulnerabilities:** Cybersecurity experts Joseph Steinberg (Columbia University) and Stephanie Kurtz (Trace3) confirmed that organizing internal work through public Google Docs creates serious risks.
* **Social Engineering:** The exposed contractor data facilitates "social engineering" attacks, in which hackers impersonate employees or contractors to gain unauthorized access.
* **Malware Insertion:** Because some Google Docs were editable by anyone, bad actors could insert malicious links or malware.
* **Operational Weaknesses:** Five current and former Scale AI contractors described the Google Docs system as "incredibly janky," noting that they retained access to old projects, sometimes updated with new client requests, even after their work on them had concluded.
* **Growth vs. Security:** Steinberg highlighted the dilemma for growth-oriented startups, where prioritizing security can slow market entry.

### Company Responses

* **Scale AI:** A spokesperson stated, "We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems." The company reiterated its commitment to data security and to strengthening its practices.
* **Meta:** Declined to comment on the findings.
* **Google and xAI:** Did not respond to requests for comment.

### Implications

BI's findings raise significant questions about the adequacy of Scale AI's security measures and whether Meta was aware of these vulnerabilities before its substantial investment. While there is no indication that Scale AI has suffered a breach because of these practices, the exposed data and lax sharing practices leave the company and its high-profile clients vulnerable to future attacks.
---

## Scale AI exposed sensitive data about clients like Meta and xAI in public Google Docs, BI finds
*Image: Scale AI cofounder Alexandr Wang, who is joining Meta following a $14.3 billion investment. (Scale AI)*

* Scale AI routinely uses public Google Docs for work with Google, Meta, and xAI.
* BI reviewed thousands of files, some marked confidential and others exposing contractor data.
* Scale AI says it's conducting a "thorough investigation."
As Scale AI seeks to reassure customers that their data is secure following Meta's $14.3 billion investment, leaked files and the startup's own contractors indicate it has some serious security holes.

Scale AI routinely uses public Google Docs to track work for high-profile customers like Google, Meta, and xAI, leaving multiple AI training documents labeled "confidential" accessible to anyone with the link, Business Insider found.
Contractors told BI the company relies on public Google Docs to share internal files, a method that's efficient for its vast army of at least 240,000 contractors but presents clear cybersecurity and confidentiality risks.

Scale AI also left public Google Docs with sensitive details about thousands of its contractors, including their private email addresses and whether they were suspected of "cheating."
Some of those documents can be viewed, and even edited, by anyone with the right URL.

There's no indication that Scale AI has suffered a breach because of this. However, two cybersecurity experts told BI that such practices could leave the company and its clients vulnerable to various kinds of hacks, such as hackers impersonating contractors or uploading malware into accessible files.
Scale AI told Business Insider it takes data security seriously and is looking into the matter.

"We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems," a Scale AI spokesperson said. "We remain committed to robust technical and policy safeguards to protect confidential information and are always working to strengthen our practices."
Meta declined to comment. Google and xAI didn't respond to requests for comment.

In the wake of Meta's blockbuster investment, clients like Google, OpenAI, and xAI paused work with Scale. In a blog post last week, Scale reassured Big Tech clients that it remains a neutral and independent partner with strict security standards.
The company said that "ensuring customer trust has been and will always be a top priority," and that it has "robust technical and policy safeguards to protect customers' confidential information."

BI's findings raise questions about whether Scale AI did enough to ensure security and whether Meta was aware of the issue before writing the check.
### Confidential AI projects were accessible

BI was able to view thousands of pages of project documents across 85 individual Google Docs tied to Scale AI's work with Big Tech clients. The documents include sensitive details, such as how Google used ChatGPT to improve its own struggling chatbot, then called Bard.
Scale also left public at least seven instruction manuals marked "confidential" by Google, which were accessible to anyone with the link. Those documents spell out what Google thought was wrong with Bard — that it had difficulties answering complex questions — and how Scale contractors should fix it.
For Elon Musk's xAI, for which Scale ran at least 10 generative AI projects as of April, public Google documents and spreadsheets show details of "Project Xylophone," BI reported earlier this month. Training documents and a list of 700 conversation prompts revealed how the project focused on improving the AI's conversation skills about a wide array of topics, from zombie apocalypses to plumbing.
Meta training documents, marked confidential at the top, were also left public to anyone with the link. These included links to accessible audio files with examples of "good" and "bad" speech prompts, suggesting the standards Meta set for expressiveness in its AI products.

Some of those projects focused on training Meta's chatbots to be more conversational and emotionally engaging while ensuring they handled sensitive topics safely, BI previously reported.
As of April, Meta had at least 21 generative AI projects with Scale.

Several Scale AI contractors interviewed by BI said it was easy to figure out which client they worked for, even though projects were codenamed, often just from the nature of the task or the way the instructions were phrased. Sometimes it was even easier: one presentation seen by BI had Google's logo.
Even when projects were meant to be anonymized, contractors across different projects described instantly recognizing clients or products. In some cases, simply prompting the model or asking it directly which chatbot it was would reveal the underlying client, contractors said.

### Scale AI left contractor information public

Other Google Docs exposed sensitive personal information about Scale's contractors.
BI reviewed spreadsheets that were not locked down and that listed the names and private Gmail addresses of thousands of workers. Several contacted by BI said they were surprised to learn their details were accessible to anyone with the URL of the document.

Many documents include details about contractors' work performance.
One spreadsheet titled "Good and Bad Folks" categorizes dozens of workers as either "high quality" or suspected of "cheating." Another list, titled "move all cheating taskers," contains hundreds of personal email addresses of workers flagged for "suspicious behavior."

Another sheet names nearly 1,000 contractors who were "mistakenly banned" from Scale AI's platforms.
Other documents show how much individual contractors were paid, along with detailed notes on pay disputes and discrepancies.

### The system seemed 'incredibly janky'

Five current and former Scale AI contractors who worked on separate projects told BI that the use of public Google Docs was widespread across the company.
Contractors said that using them streamlined operations for Scale, which relies mostly on freelance contributors. Managing individual access permissions for each contractor would have slowed down the process.

Scale AI's internal platform requires workers to verify themselves, sometimes using their camera, contractors told BI.
At the same time, many documents containing information on training AI models can be accessed through public links or links in other documents without verification.

"The whole Google Docs system always seemed incredibly janky," one worker said.

Two other workers said they retained access to old projects they no longer worked on, which were sometimes updated with requests from the client company regarding how the models should be trained.
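That gap, a verified internal platform on one side and unauthenticated document links on the other, is straightforward to test. As a hypothetical sketch (the helper and document IDs are illustrative, not from BI's reporting), an anonymous HTTP request can show whether a given Google Doc is readable without any sign-in:

```python
# A rough, hedged check for whether a Google Doc is readable anonymously.
# Assumption: doc IDs are harvested from links found in other documents.
# A link-shared doc serves its export directly; a restricted one redirects
# the anonymous request to a Google sign-in page.
import requests

def is_publicly_readable(doc_id: str) -> bool:
    url = f"https://docs.google.com/document/d/{doc_id}/export?format=txt"
    resp = requests.get(url, allow_redirects=True, timeout=10)
    # Landing on accounts.google.com means authentication was required.
    return resp.status_code == 200 and "accounts.google.com" not in resp.url
```

A heuristic like this, run over every link referenced in a document set, is one way an auditor could measure how much material sits outside any verification step.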
### 'Of course it's dangerous'

Organizing internal work through public Google Docs can create serious cybersecurity risks, Joseph Steinberg, a Columbia University cybersecurity lecturer, told BI.

"Of course it's dangerous. In the best-case scenario, it's just enabling social engineering," he said.

Social engineering refers to attacks where hackers trick employees or contractors into giving up access, often by impersonating someone within the company.
Leaving details about thousands of contractors easily accessible creates many opportunities for that kind of breach, Steinberg said.

At the same time, investing more in security can slow down growth-oriented startups. "The companies that actually spend time doing security right very often lose out because other companies move faster to market," Steinberg said.
The fact that some of the Google Docs were editable by anyone creates risks, such as bad actors inserting malicious links into the documents for others to click, Stephanie Kurtz, a regional director at cyber firm Trace3, told BI.

Kurtz added that companies should start with managing access via invites.

"Putting it out there and hoping somebody doesn't share a link, that's not a great strategy there," she said.
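Kurtz's "access via invites" suggestion can be made concrete with the Google Drive v3 API. The sketch below is a minimal illustration, not Scale AI's actual remediation: it assumes an authorized credential object (`creds`) with the full Drive scope, finds files shared with anyone who has the link, and optionally replaces that grant with an explicit per-user invite.

```python
# Minimal sketch: audit a Drive for "anyone with the link" sharing and
# revoke it, optionally re-sharing by explicit invite instead.
# Assumption: `creds` is an authorized google.oauth2 credential with the
# https://www.googleapis.com/auth/drive scope.
from googleapiclient.discovery import build

def revoke_link_sharing(creds, invite=None, dry_run=True):
    """Report (and with dry_run=False, remove) link-sharing grants."""
    drive = build("drive", "v3", credentials=creds)
    page_token = None
    while True:
        # Drive search term for files visible to anyone holding the link.
        resp = drive.files().list(
            q="visibility = 'anyoneWithLink'",
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        for f in resp.get("files", []):
            perms = drive.permissions().list(
                fileId=f["id"], fields="permissions(id, type, role)"
            ).execute()
            for p in perms.get("permissions", []):
                if p["type"] == "anyone":  # the link-sharing grant itself
                    print(f"{f['name']}: anyone-with-link ({p['role']})")
                    if not dry_run:
                        drive.permissions().delete(
                            fileId=f["id"], permissionId=p["id"]
                        ).execute()
            if invite and not dry_run:
                # Grant a named user read access instead, per Kurtz's advice.
                drive.permissions().create(
                    fileId=f["id"],
                    body={"type": "user", "role": "reader",
                          "emailAddress": invite},
                ).execute()
        page_token = resp.get("nextPageToken")
        if page_token is None:
            break
```

Running with the default `dry_run=True` only reports what would change; scheduling the same audit regularly would catch newly link-shared files as they appear.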