Insights on the Legal Impacts of AI in the Marketplace

AI Licensing Tip - Use Data Definitions and Rights Allocations to Demystify AI Blackboxes

By Vorys

On Applied AI, we will regularly share AI licensing tips from Craig Auge, a partner in the Vorys Columbus office and a member of the firm’s technology and commercial transactions practice. These tips offer brief insights into AI-related provisions that could be used in licensing contracts.

Use Data Definitions and Rights Allocations to Demystify AI Blackboxes

Example contract language:

“Initial Training Data” means information and data that initially trained the applicable Foundation Model or Foundation Model Application.

“Enhanced Training Data” means information or data submitted, inputted, uploaded, or otherwise provided by or on behalf of Customer to Provider or systems operated by or for the benefit of Provider in connection with Services to perform enhanced training or the like (e.g., for “fine tuning”) of a Foundation Model or Foundation Model Application, including as a part of developing the AI Solution for Customer.

“Prompts” means information or data that are prompts, instructions, queries, input, or other data (such as Internet-of-Things sensor data) submitted or otherwise provided by or through Customer to the AI Solution.

“Output” means information or data resulting from or outputted in connection with the submission or provision of Prompts to the AI Solution.

As between Provider and Customer:

  • Provider owns all right, title, and interest in or has superior rights to Initial Training Data; and
  • Customer owns all right, title, and interest in or has superior rights to Enhanced Training Data, Prompts, and Output (collectively, “Customer’s Data”).

Customer grants to Provider a limited, non-exclusive license during the term of the Agreement to use and reproduce Customer’s Data solely in the performance of the Services.

Provider shall treat Customer’s Data as Customer’s Confidential Information.
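
For teams that need to carry these definitions into their systems (for example, to tag records by data type), the allocation above can be sketched as a small lookup table. This is only an illustration of the example language, not a substitute for it: the names `DataCategory`, `Party`, `OWNER`, `is_customer_data`, and `provider_may_use` are hypothetical, and the handling of Initial Training Data in `provider_may_use` is an assumption, since the example clause does not restrict Provider’s use of its own data.

```python
from enum import Enum


class DataCategory(Enum):
    """The four data types defined in the example contract language."""
    INITIAL_TRAINING_DATA = "Initial Training Data"
    ENHANCED_TRAINING_DATA = "Enhanced Training Data"
    PROMPTS = "Prompts"
    OUTPUT = "Output"


class Party(Enum):
    PROVIDER = "Provider"
    CUSTOMER = "Customer"


# Ownership (or superior rights) "as between Provider and Customer",
# following the example allocation clause.
OWNER = {
    DataCategory.INITIAL_TRAINING_DATA: Party.PROVIDER,
    DataCategory.ENHANCED_TRAINING_DATA: Party.CUSTOMER,
    DataCategory.PROMPTS: Party.CUSTOMER,
    DataCategory.OUTPUT: Party.CUSTOMER,
}


def is_customer_data(category: DataCategory) -> bool:
    """True if the category falls within "Customer's Data"."""
    return OWNER[category] is Party.CUSTOMER


def provider_may_use(category: DataCategory,
                     for_services: bool,
                     within_term: bool) -> bool:
    """Hypothetical check mirroring the example license grant.

    Customer's Data may be used by Provider solely to perform the
    Services and only during the term of the Agreement.
    """
    if not is_customer_data(category):
        # Assumption: the example clause places no limit on Provider's
        # use of its own Initial Training Data.
        return True
    return for_services and within_term


if __name__ == "__main__":
    for category in DataCategory:
        print(f"{category.value}: owned by {OWNER[category].value}")
```

Under this sketch, every category for which `is_customer_data` returns true would also be handled as Customer’s Confidential Information, in line with the confidentiality sentence above.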

Insights on applying:

  • Cracking open blackbox AI solutions may not be possible, but defining data types and establishing associated rights are possible – and should be done.
  • Data trains, fuels, guides, emanates from, and modifies AI Solutions.
  • Raw data is not copyrightable. But declaring ownership “as between the parties” or declaring which party has superior rights can protect key data assets.
  • Data definitions enable the parties to memorialize critical representations and warranties, scope of licenses, and IP indemnification for different types of data, including in more granular ways.
  • Providers are hesitant to “assign title” in data to Customer, but Providers may expressly disavow their ownership.
  • Alternatively and short of “ownership,” a robust license grant can also get Customer much of what it needs.
  • Deeming Customer’s Data to be “Confidential Information” is a way to buttress protections.
  • The treatment of “Output” is the stickiest among the types of data – due to its amalgamated or derivative nature.
  • Providers may ask to repurpose some data provided by or through Customer if it is in anonymized or aggregated form.

By: Craig Auge

Tags: AI Licensing
