Why “No Model Training” Isn’t Protecting You

There has been an increase in AI products touting a “no model training” option, typically for a fee under a higher tier of the product. In a vacuum, this is a crucial option for businesses and consumers that want to be privacy-conscious. However, there has also been an increase in confusion between, and conflation of, no model training and safe data processing mechanisms.

What does it mean when an AI product offers “no model training”?

This is not an uncommon practice in non-AI products. You may often see language within a privacy policy that discusses utilizing data to improve services and provide analytics. Within the sphere of AI, no model training means the product will not leverage users’ data to train the way the model responds to requests or to improve the way the product functions overall. This offering allows users to feel at ease, knowing that any information provided to the product, sensitive or not, will not be incorporated into how the model functions moving forward or into its training datasets.

What is data processing?

Data processing is the act of collecting, storing, utilizing, deleting, and anonymizing (not an exhaustive list, just examples) data provided to a company or product. It can also include sharing or selling data to third parties. Typically, there are two distinct documents offered by organizations that dictate data processing: the privacy policy and the data processing addendum. The privacy policy, more informally, covers the collection and use of data, outlining privacy rights and ways to reach a company with questions, concerns, or requests. Privacy policies may also have attachments outlining rights on a jurisdictional basis, with the GDPR and CCPA at the forefront and the legal patchwork of US states following behind.

The data processing addendum is a more formal document for the controller/processor relationship. It goes into depth on processing details, such as purpose, specific data types, retention periods, legal basis, and more. No matter what product is utilized, these two legal documents are the source of truth for data processing.

How are no model training and data processing different?

While no model training restricts a product’s ability to feed user data into the algorithms that make up AI products, data processing has a much broader reach of data and use cases. Data processing can span from being leveraged to provide the contracted product or services, to enabling marketing outreach, including targeted outreach. This simply means that where no model training stops, companies and products can still leverage data for a myriad of other use cases, unless restricted within the legal documentation.

How does the focus on one and not the other hurt privacy?

Fixating on one of these protections and not the other will leave you with incomplete protection. When only focusing on no model training (which has been the pattern as of late), data subjects leave themselves vulnerable to difficulties exercising their data subject rights due to a lack of awareness or a lack of clarity within the legal documents. They are also susceptible to consent concerns, with no true understanding of what they have implicitly consented to, including model training in the future. On the other hand, if the focus is solely on data processing, with no consideration for model training, the biggest concern becomes accidental exposure of PII that has made its way into the model and future outputs.

As individuals, what can we do?

The goal is to be mindful of the products that we engage with. Individuals (those in the B2C relationship) are often beholden to the boilerplate terms of a company and its products. It may seem arduous, but take time to consider what the privacy policy and data processing addendum (if available) have to say about what you are sharing, how data is being collected, how the information is utilized, and the mechanisms to remove the data from the company’s hands, if needed. Also, it’s important to be aware of your data rights, as defined by your jurisdiction’s regulators.

As businesses, what can we do?

Utilize the spend for the product as leverage for negotiation. In B2B relationships, a lot can be redlined in the name of a sale. This helps protect the business and its end users. Note that privacy is a right, a selling point, and a sign of respect to the end users as well as to the business’s internal function. At the bare minimum, work to protect the business’s intellectual property and trade secrets that may come in contact with AI products. Finally, note your contractual obligations with your end users. There are always terms that discuss the standards for subprocessors. You want to ensure that whatever privacy standards you set with an AI product maintain alignment with contractual requirements to avoid a breach of contract, loss of revenue, and regulatory investigation or fines.