# Comprehensive News Summary: Scale AI Data Exposure

---

* **News Title:** Scale AI exposed sensitive data about clients like Meta and xAI in public Google Docs, BI finds
* **Report Provider:** Business Insider
* **Authors:** Charles Rollet, Effie Webb, Shubhangi Goel, Hugh Langley
* **Date Published:** 2025-06-24 16:42:16 (UTC)
* **Topic/Sub-Topic:** Technology / AI
* **Keywords:** Meta, Cybersecurity, Exclusive
* **URL:** [https://www.businessinsider.com/scale-ai-public-google-docs-security-2025-6](https://www.businessinsider.com/scale-ai-public-google-docs-security-2025-6)

---

## Summary of Findings

Business Insider (BI) has uncovered significant security vulnerabilities at Scale AI, a prominent AI data-labeling startup, revealing that the company routinely used public Google Docs to manage work for high-profile clients like Google, Meta, and xAI. This practice exposed thousands of confidential AI training documents and sensitive contractor data, raising serious cybersecurity and confidentiality concerns. The revelations come in the wake of Meta's recent $14.3 billion investment in Scale AI and the planned move of Scale AI cofounder Alexandr Wang to Meta.

### Core Issue: Public Google Docs Usage

Scale AI, which relies on a vast network of at least **240,000 contractors**, used public Google Docs as an efficient, albeit risky, method for sharing internal files and tracking work. BI reviewed thousands of these files, finding many marked "confidential" and accessible to anyone with the link. Some documents were even editable by external parties.

### Details of Data Exposure

1. **Client Confidential AI Projects:**
    * BI viewed **thousands of pages** of project documents across **85 individual Google Docs** related to Scale AI's work with major tech clients.
    * **Google:** At least **seven instruction manuals** marked "confidential" by Google were publicly accessible. These documents detailed issues with Google's chatbot, then called Bard (e.g., difficulties answering complex questions), outlined how Scale contractors should improve it, and revealed that Google used ChatGPT in that effort.
    * **xAI:** Public Google documents and spreadsheets exposed details of "Project Xylophone"; Scale ran at least **10 generative AI projects** for xAI as of April. The exposed material included training documents and a list of **700 conversation prompts** focused on improving the AI's conversation skills across diverse topics.
    * **Meta:** Confidential Meta training documents, including links to accessible audio files with examples of "good" and "bad" speech prompts, were publicly available. These projects aimed to train Meta's chatbots to be more conversational and emotionally engaging while handling sensitive topics safely. Meta had at least **21 generative AI projects** with Scale as of April.
    * Contractors reported easily identifying clients or products, even when projects were codenamed, sometimes from client logos or by directly prompting the AI model.
2. **Contractor Sensitive Information:**
    * Unsecured spreadsheets listed the names and private Gmail addresses of **thousands of Scale AI workers**.
    * Documents detailed contractor work performance, including a spreadsheet titled "Good and Bad Folks" that categorized workers as "high quality" or suspected of "cheating."
    * Another list of hundreds of personal email addresses was titled "move all cheating taskers" and flagged workers for "suspicious behavior."
    * One document named nearly **1,000 contractors** who were "mistakenly banned" from Scale AI's platforms.
    * Other documents showed individual contractor pay rates, along with notes on pay disputes and discrepancies.

### Context and Background

* **Meta's Investment:** The findings emerge shortly after Meta's substantial **$14.3 billion investment** in Scale AI, which also involves Scale AI cofounder Alexandr Wang joining Meta.
* **Client Reactions:** Following Meta's investment, clients such as Google, OpenAI, and xAI reportedly **paused work** with Scale AI.
* **Scale AI's Reassurance:** In a recent blog post, Scale AI sought to reassure Big Tech clients, emphasizing its neutrality, independence, and commitment to "robust technical and policy safeguards" and "strict security standards."

### Risks and Concerns

* **Cybersecurity Vulnerabilities:** Cybersecurity experts Joseph Steinberg (Columbia University) and Stephanie Kurtz (Trace3) confirmed that organizing internal work through public Google Docs creates serious risks.
* **Social Engineering:** The exposed contractor data facilitates "social engineering" attacks, in which hackers impersonate employees or contractors to gain unauthorized access.
* **Malware Insertion:** Because some Google Docs were editable by anyone, bad actors could insert malicious links or malware.
* **Operational Weaknesses:** Five current and former Scale AI contractors described the Google Docs system as "incredibly janky," noting that they retained access to old projects, sometimes updated with new client requests, even after their work on them had concluded.
* **Growth vs. Security:** Steinberg highlighted the dilemma for growth-oriented startups, where prioritizing security can slow market entry.

### Company Responses

* **Scale AI:** A spokesperson stated, "We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems." The company reiterated its commitment to data security and to strengthening its practices.
* **Meta:** Declined to comment on the findings.
* **Google and xAI:** Did not respond to requests for comment.

### Implications

BI's findings raise significant questions about the adequacy of Scale AI's security measures and whether Meta was aware of these vulnerabilities before its substantial investment. While there is no indication that Scale AI has suffered a breach because of these practices, the exposed data and lax sharing practices leave the company and its high-profile clients vulnerable to future attacks.
---

## Scale AI exposed sensitive data about clients like Meta and xAI in public Google Docs, BI finds
*Image: Scale AI cofounder Alexandr Wang, who is joining Meta following a $14.3 billion investment. (Scale AI)*

* Scale AI routinely uses public Google Docs for work with Google, Meta, and xAI.
* BI reviewed thousands of files, some marked confidential and others exposing contractor data.
* Scale AI says it's conducting a "thorough investigation."
As Scale AI seeks to reassure customers that their data is secure following Meta's $14.3 billion investment, leaked files and the startup's own contractors indicate it has some serious security holes.

Scale AI routinely uses public Google Docs to track work for high-profile customers like Google, Meta, and xAI, leaving multiple AI training documents labeled "confidential" accessible to anyone with the link, Business Insider found.
Contractors told BI the company relies on public Google Docs to share internal files, a method that's efficient for its vast army of at least 240,000 contractors but presents clear cybersecurity and confidentiality risks.

Scale AI also left public Google Docs with sensitive details about thousands of its contractors, including their private email addresses and whether they were suspected of "cheating."
Some of those documents can be viewed, and even edited, by anyone with the right URL.

There's no indication that Scale AI has suffered a breach because of this. However, two cybersecurity experts told BI that such practices could leave the company and its clients vulnerable to various kinds of hacks, such as hackers impersonating contractors or uploading malware into accessible files.
Scale AI told Business Insider it takes data security seriously and is looking into the matter.

"We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems," a Scale AI spokesperson said. "We remain committed to robust technical and policy safeguards to protect confidential information and are always working to strengthen our practices."
Meta declined to comment. Google and xAI didn't respond to requests for comment.

In the wake of Meta's blockbuster investment, clients like Google, OpenAI, and xAI paused work with Scale. In a blog post last week, Scale reassured Big Tech clients that it remains a neutral and independent partner with strict security standards.
The company said that "ensuring customer trust has been and will always be a top priority," and that it has "robust technical and policy safeguards to protect customers' confidential information."

BI's findings raise questions about whether Scale AI did enough to ensure security and whether Meta was aware of the issue before writing the check.
### Confidential AI projects were accessible

BI was able to view thousands of pages of project documents across 85 individual Google Docs tied to Scale AI's work with Big Tech clients. The documents include sensitive details, such as how Google used ChatGPT to improve its own struggling chatbot, then called Bard.
Scale also left public at least seven instruction manuals marked "confidential" by Google, which were accessible to anyone with the link. Those documents spell out what Google thought was wrong with Bard — that it had difficulties answering complex questions — and how Scale contractors should fix it.
For Elon Musk's xAI, for which Scale ran at least 10 generative AI projects as of April, public Google documents and spreadsheets show details of "Project Xylophone," BI reported earlier this month. Training documents and a list of 700 conversation prompts revealed how the project focused on improving the AI's conversation skills about a wide array of topics, from zombie apocalypses to plumbing.
Meta training documents, marked confidential at the top, were also left public to anyone with the link. These included links to accessible audio files with examples of "good" and "bad" speech prompts, suggesting the standards Meta set for expressiveness in its AI products.

Some of those projects focused on training Meta's chatbots to be more conversational and emotionally engaging while ensuring they handled sensitive topics safely, BI previously reported.
As of April, Meta had at least 21 generative AI projects with Scale.

Several Scale AI contractors interviewed by BI said it was easy to figure out which client they worked for, even though projects were codenamed, often just from the nature of the task or the way the instructions were phrased. Sometimes it was even easier: one presentation seen by BI had Google's logo.
Even when projects were meant to be anonymized, contractors across different projects described instantly recognizing clients or products. In some cases, simply prompting the model or asking it directly which chatbot it was would reveal the underlying client, contractors said.

### Scale AI left contractor information public

Other Google Docs exposed sensitive personal information about Scale's contractors.
BI reviewed spreadsheets that were not locked down and that listed the names and private Gmail addresses of thousands of workers. Several contacted by BI said they were surprised to learn their details were accessible to anyone with the URL of the document.

Many documents include details about contractors' work performance.
One spreadsheet titled "Good and Bad Folks" categorizes dozens of workers as either "high quality" or suspected of "cheating." Another list, titled "move all cheating taskers," contains hundreds of personal email addresses of workers flagged for "suspicious behavior."

Another sheet names nearly 1,000 contractors who were "mistakenly banned" from Scale AI's platforms.
Other documents show how much individual contractors were paid, along with detailed notes on pay disputes and discrepancies.

### The system seemed 'incredibly janky'

Five current and former Scale AI contractors who worked on separate projects told BI that the use of public Google Docs was widespread across the company.
Contractors said that using them streamlined operations for Scale, which relies mostly on freelance contributors. Managing individual access permissions for each contractor would have slowed down the process.

Scale AI's internal platform requires workers to verify themselves, sometimes using their camera, contractors told BI.
At the same time, many documents containing information on training AI models can be accessed through public links or links in other documents without verification.

"The whole Google Docs system always seemed incredibly janky," one worker said.

Two other workers said they retained access to old projects they no longer worked on, which were sometimes updated with requests from the client company regarding how the models should be trained.
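That gap, a verified internal platform on one side and unauthenticated document links on the other, is straightforward to test. As a hypothetical sketch (the helper and document IDs are illustrative, not from BI's reporting), an anonymous HTTP request can show whether a given Google Doc is readable without any sign-in:

```python
# A rough, hedged check for whether a Google Doc is readable anonymously.
# Assumption: doc IDs are harvested from links found in other documents.
# A link-shared doc serves its export directly; a restricted one redirects
# the anonymous request to a Google sign-in page.
import requests

def is_publicly_readable(doc_id: str) -> bool:
    url = f"https://docs.google.com/document/d/{doc_id}/export?format=txt"
    resp = requests.get(url, allow_redirects=True, timeout=10)
    # Landing on accounts.google.com means authentication was required.
    return resp.status_code == 200 and "accounts.google.com" not in resp.url
```

A heuristic like this, run over every link referenced in a document set, is one way an auditor could measure how much material sits outside any verification step.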
### 'Of course it's dangerous'

Organizing internal work through public Google Docs can create serious cybersecurity risks, Joseph Steinberg, a Columbia University cybersecurity lecturer, told BI.

"Of course it's dangerous. In the best-case scenario, it's just enabling social engineering," he said.

Social engineering refers to attacks where hackers trick employees or contractors into giving up access, often by impersonating someone within the company.
Leaving details about thousands of contractors easily accessible creates many opportunities for that kind of breach, Steinberg said.

At the same time, investing more in security can slow down growth-oriented startups. "The companies that actually spend time doing security right very often lose out because other companies move faster to market," Steinberg said.
The fact that some of the Google Docs were editable by anyone creates risks, such as bad actors inserting malicious links into the documents for others to click, Stephanie Kurtz, a regional director at cyber firm Trace3, told BI.

Kurtz added that companies should start with managing access via invites.

"Putting it out there and hoping somebody doesn't share a link, that's not a great strategy there," she said.
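Kurtz's "access via invites" suggestion can be made concrete with the Google Drive v3 API. The sketch below is a minimal illustration, not Scale AI's actual remediation: it assumes an authorized credential object (`creds`) with the full Drive scope, finds files shared with anyone who has the link, and optionally replaces that grant with an explicit per-user invite.

```python
# Minimal sketch: audit a Drive for "anyone with the link" sharing and
# revoke it, optionally re-sharing by explicit invite instead.
# Assumption: `creds` is an authorized google.oauth2 credential with the
# https://www.googleapis.com/auth/drive scope.
from googleapiclient.discovery import build

def revoke_link_sharing(creds, invite=None, dry_run=True):
    """Report (and with dry_run=False, remove) link-sharing grants."""
    drive = build("drive", "v3", credentials=creds)
    page_token = None
    while True:
        # Drive search term for files visible to anyone holding the link.
        resp = drive.files().list(
            q="visibility = 'anyoneWithLink'",
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        for f in resp.get("files", []):
            perms = drive.permissions().list(
                fileId=f["id"], fields="permissions(id, type, role)"
            ).execute()
            for p in perms.get("permissions", []):
                if p["type"] == "anyone":  # the link-sharing grant itself
                    print(f"{f['name']}: anyone-with-link ({p['role']})")
                    if not dry_run:
                        drive.permissions().delete(
                            fileId=f["id"], permissionId=p["id"]
                        ).execute()
            if invite and not dry_run:
                # Grant a named user read access instead, per Kurtz's advice.
                drive.permissions().create(
                    fileId=f["id"],
                    body={"type": "user", "role": "reader",
                          "emailAddress": invite},
                ).execute()
        page_token = resp.get("nextPageToken")
        if page_token is None:
            break
```

Running with the default `dry_run=True` only reports what would change; scheduling the same audit regularly would catch newly link-shared files as they appear.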