May 30, 2024

Tech Policy Experts at NYU, UCLA, and USC Call On CPPA to Protect Personal Data Against GenAI

Engelberg Center

By Nicole Jaiyesimi (‘26) and Yanan Wang (‘24)

Was ChatGPT trained on your name, phone number, and other personal details? When NYU’s Technology Law and Policy Clinic teamed up with California researchers to find out, they discovered gaps in both business compliance and consumer understanding around key provisions of the California Consumer Privacy Act (CCPA).

The CCPA secures Californians’ right to know about the personal information a business collects about them and how it is used and shared. But it is not currently clear how consumers can exercise that right in the context of generative AI systems (GenAI).

On behalf of the clinic, USC’s Knowing Machines Project, and the Institute for Technology, Law & Policy at UCLA, law students Nicole Jaiyesimi (JD/MBA ‘26) and Yanan Wang (JD ‘24) drafted an informal public comment to the California Privacy Protection Agency, the state government agency dedicated to enforcing the CCPA and its supplement, the California Privacy Rights Act of 2020.

The comment urges the agency to explicitly affirm that the CCPA gives Californians the right to know whether businesses have used their personal information to train GenAI. It additionally proposes practical measures for the agency to ensure that this right is protected.

The catalyst for the comment was a series of Data Subject Access Requests (DSARs) that UCLA and USC researchers sent to businesses prominent in the GenAI field, including OpenAI, Meta, and Microsoft. For the clinic, the businesses’ thin responses—none of which addressed the specific request for GenAI training data—underscored the need for greater clarity about how the CCPA applies in this realm. Notably, the responses exposed significant gaps in the businesses’ understanding of proper identity verification and of what a comprehensive response requires.

As a foundational matter, the comment highlights the mission of consumer control at the heart of the CCPA, which assures Californians that their personal data belongs to them—not to businesses. To keep that control in consumers’ hands, Californians must first be able to learn how their personal information is being used.

“Companies like OpenAI acknowledge that personal details generally are part of their training datasets,” said Wang. “What we don’t know are the individual pieces of information that are being funneled into the systems, which the CCPA should require businesses to disclose.”

The comment also emphasizes that, under the CCPA, the burden falls on the business either to provide the requested information or to explain why it is exempt from compliance. It criticizes the businesses’ inadequate responses and urges the agency to establish clear, concise guidelines for handling DSARs, with a specific focus on identity verification and the requirements of a sufficient response.

Jaiyesimi points out, “Having the right to know how one’s information is used becomes meaningless without knowledge of how to exercise that right. Furthermore, it is unacceptable for businesses to impede an individual’s access to their own information by complicating the retrieval process.”

The comment closes with a list of overall recommendations to the agency.