CLS Blue Sky Blog

Why Boards Should Care About Data Supply Chains

Firms need data to perform core business functions. Yet, as we discuss in a new paper, many do not pay enough attention to data sources, tending instead to focus on how data are used. Most regulations address downstream uses – how data are processed, shared, or deployed – but complying with these regulations is not the only challenge that data create for firms. Data can also create upstream risks that originate from how data are obtained. If unaddressed, these risks can expose firms to operational, legal, regulatory, and reputational costs.

Companies do not simply “have” data. Instead, they obtain data through their “data supply chain”: networks of transactions that transfer data as an intermediate good between individuals, organizations, and technology to create a product or service for an end user. Like any supply chain, a data supply chain creates risks familiar to board members and corporate managers: operational, environmental, legal, geopolitical, and reputational risks that emanate from both direct suppliers of data and their sub-suppliers. Disruptions in data supply chains – whether caused by litigation, regulatory changes, or geopolitical conflict – can jeopardize the delivery of products and services just as severely as disruptions in physical supply chains can.

To manage these risks, executives must recognize that data are obtained through supply chains and should identify the specific data acquisition methods that their companies use.

Companies obtain data in five distinct ways: build, buy, scrape, surveil, or generate. In practice, most firms rely on multiple methods simultaneously, often without treating them as part of an integrated system. In a build-it model, a company uses its own technology to acquire data that it then incorporates into its goods and services. In a buy-it model, a company purchases data from another company, such as a data broker. The purchaser may even acquire or merge with the other company to harvest its data. In a scrape-it model, a company “scrapes” data from the internet. A scraper develops a computer program that directly extracts, copies, and aggregates content from the source code of various websites for later use or sale. In a surveil-it model, a company generally employs physical devices to collect information on individual users, such as through smart home devices, watches, apps, and other technologies. Users employ this technology to “self-surveil” for the purpose of tracking personal goals – such as miles run or calories consumed – but the companies that sell these devices also collect the data. Moreover, even internet service providers (ISPs) may collect a surprising amount of data from homes and people without the users’ knowledge. Finally, in a generate-it model, a company relies on generative AI outputs as data inputs for other software uses. Alternatively referred to as synthetic data or “AI slop,” these data are produced when users employ generative AI to produce outputs, such as text, images, or video, that are then turned into inputs for future AI applications.

Each of these acquisition methods carries its own risks. The more serious – and often overlooked – problem arises when companies combine them.

Governance Implications for Boards

Viewing data acquisition as a supply chain has immediate implications for corporate governance. First, it reframes data governance from purely a compliance issue into one that necessitates oversight of upstream data sourcing practices. Enhanced upstream scrutiny can help to address downstream harms, whether privacy violations, operational failures, or regulatory exposure.

Second, identifying data acquisition practices as a data supply chain means that data can be managed, governed, and regulated like supply chains for physical goods. Applying similar oversight to the source of data would require management to answer some of the following questions about their own data supply chains:

Third, corporate management should prioritize supply chain resilience by investing in specific capabilities that can strengthen their data supply chains, including flexibility, adaptability, redundancies, visibility, and collaboration. Not all of the data acquisition methods will be equally amenable to resilience measures. Prioritizing resilience will allow management to evaluate its acquisition choices and to improve these methods through enhanced capabilities. Just as firms learned – often painfully – that lean physical supply chains can magnify systemic risk, data supply chains exhibit similar fragilities. Yet few firms have invested in resilience measures such as diversification of data sources, contractual safeguards, or contingency planning for data access disruptions.

Fourth, overseeing data supply-chain risks is not the same as overseeing cyber risks. Cybersecurity focuses on protecting data once they are acquired by the company. Data supply-chain governance focuses on how data are acquired by the company. Addressing the former does very little to guard against the risks created by the latter.

Finally, data supply-chain management is particularly important to companies that support critical digital infrastructure. The management of these companies must pay particular attention to data supply chains because any disruptions to them not only jeopardize the operations of the company but can also endanger the well-being of the nation.

For corporate management, the implication is straightforward: because data are a critical resource, their sources deserve the same level of oversight as anything else in a critical supply chain. Identifying data acquisition methods, evaluating suppliers, incorporating resilience, and addressing risks will increasingly determine which companies will innovate successfully and which will struggle when upstream risks materialize.

Carla L. Reyes is an associate professor at SMU’s Dedman School of Law, and Kish Parella is the James P. Morefield Professor of Law at Washington and Lee School of Law. This post is based on their recent paper, “Data Supply Chains,” available here.
