AI applications are increasingly used in nearly every aspect of our lives. However, AI carries many potential risks arising from how data is collected, accessed, made available, processed, and shared.
Fortunately, proper enforcement of the law and greater transparency can mitigate these risks to a great extent without disrupting the further development of AI.
Therefore, analyzing the legal implications of using data for AI is important. Access to and use of data can be limited in two ways: by privacy laws or by intellectual property rights that protect the commercial value of data for firms and individuals.
These restrictions aim to balance open data access, which enables the development of AI tools, against the rights of data holders and data subjects. These rights must be respected both by the people and companies that use AI products and by fully automated algorithmic tools that operate without human intervention.
Privacy and data protection
Privacy and data protection laws require that certain safeguards protecting individuals be in place before data can flow and be used freely. These rules vary among countries and regions, but they generally limit AI companies' access to datasets containing personal data.
Under the GDPR, for example, the use of such datasets must comply with a set of data protection principles and requirements (e.g., processing for a specified and legitimate purpose, data minimization, and a lawful basis such as consent). Likewise, privacy regulations require companies to give people access to their data and information about how it is processed.
Compliance with these requirements in an AI context is difficult, and the lack of explicit answers to AI-related privacy issues leads to uncertainties and costs for AI companies. Increased costs may then limit their access to data and hinder the development of AI tools.
Despite some similarities, privacy regulations worldwide take different approaches to data access and data use because understandings of privacy differ. The lack of a unified global data privacy regime can also hinder the flow of data across jurisdictional borders. Harmonizing regional and domestic privacy frameworks through international collaboration could make data rapidly and equitably available for developing AI-driven tools. Otherwise, the development of AI tools will depend on data availability in each separate region or country.
The differences in data access regulation across regions can also influence the quality and competitiveness of AI tools. AI companies may be interested in developing their tools in countries with less restrictive privacy laws. Fewer privacy restrictions enable AI development by increasing data availability and reducing compliance costs.
Intellectual property rights (IPR)
Intellectual property rights, specifically copyright and database rights, are another legal framework that directly impacts AI data access. These rights have the potential to both enable and hinder AI development.
One of the main reasons is that intellectual property rights, particularly rights in databases, are treated differently in national laws, with significant differences in the level of protection they give databases. International agreements apply, but they only establish minimum standards of protection that must be guaranteed in domestic law. Database rights offer a useful illustration of how IP regulations affect data accessibility.
In the United States, a database can only be protected with compilation rights, which give it copyright if the data has been “selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship.” Compilation rights, in essence, only protect databases that have been assembled or curated with a minimum level of originality or creativity; they do not provide protection to a creator solely because of the time and/or effort put into a database’s creation.
Databases that aren’t eligible for compilation rights aren’t protected by IP laws in the United States, and their creators can’t rely on IP law to prevent others from accessing or using them. Other jurisdictions that provide only this level of protection include Australia, Brazil, China, Hong Kong, Japan, and Singapore. As a result, it is more difficult to protect databases created in these jurisdictions under IP law, and unprotected databases can be accessed more easily for AI development.
Sui generis database rights are a type of protection that exists in Europe in addition to compilation rights. A sui generis database right protects the creator’s investment in obtaining, verifying, or presenting the database’s contents. To be eligible for protection, the investment must be substantial, whether qualitatively or quantitatively. In essence, this right recognizes and protects the time and resources invested in creating and/or maintaining a database. This broader protection allows the owner to prevent others from extracting or reusing substantial parts of the database, potentially limiting how AI can be developed using the data.
A protected database can still be used lawfully without the owner’s permission, but only for purposes identified in the law (e.g., for the sole purpose of illustration for teaching or scientific research). Similar protections are in place in India, South Africa, and South Korea. Databases created in these countries are easier to protect under intellectual property law and may be more difficult to access for AI development.
However, IP law is not the only way for database creators to protect their work. Under both types of legal regime, with or without sui generis database rights, creators can lawfully restrict access to a database through contract law, for example through licensing or confidentiality provisions.
Therefore, the presence or absence of legal frameworks governing access and rights to data and datasets can significantly affect how available that data is for developing and using AI tools. Stricter privacy and IP rules can limit the use of data, but they can also create incentives for individuals and organizations to share their data or to build new tools and/or datasets. The lack of harmonized global privacy or IP regimes may hinder the cross-border flow of data and thus confine AI development to the regional or national level. Clarifying the rights of data holders and establishing concrete rules for data sharing among public and private actors can increase the flow and availability of data for AI.