Healthcare AI Procurement Guide


Design Principles for Healthcare AI 101

Key considerations for enabling responsible healthcare AI procurement.

⁠

Figure 1: Uses of healthcare AI applications (graphic courtesy of GAO; copyright policy)

This section elaborates on the six principles that underlie the clauses found in the RFP procurement language. The RFP procurement tool uses the World Health Organization’s list of guiding principles as a framework for healthcare AI use and design.

Each principle is individually discussed with a key background summary, along with additional recommendations to help procurement officers apply these concepts.

Key Principles

In June 2021, the World Health Organization (WHO) released the WHO AI Guiding Principles, which outline six guiding principles for healthcare AI use and design. When purchasing algorithmic products, procurement officials can seek to abide by these principles by building the procurement template clauses from the language generator tool into their RFPs and bid evaluation criteria.

Ensuring Transparency, Explainability, and Intelligibility

Ensuring Inclusiveness and Equity

Promoting Human Safety and Well-being

Protecting Human Autonomy and Privacy

Fostering Responsibility and Accountability

Promoting Responsive and Sustainable AI

Applying the Healthcare AI Principles to Procurement

Each of the sections below describes one of the WHO’s six guiding principles for responsible healthcare AI design and usage, and explains how it can be applied to procurement. Each section has the following format:

A recommendation for how procurement officials can translate these principles into actions by using the procurement template language tool. These actions can be included in future requests for proposals.

Background on the recommendation, guiding principle, and relevant issues that motivate the product development.

A description of the actions that procurement officers can use to practice the guiding principles in their purchasing processes.

⁠

Ensuring Transparency, Explainability, and Intelligibility

Recommendation: Procurement officials should prioritize vendor products and services that can provide transparent explanations in plain language to healthcare provider teams and patients regarding algorithms involved in their care delivery.

Background: The ability of an AI model to explain itself in reasonable, logical, and transparent terms is pivotal in assuring medical device regulators and healthcare professionals that its recommendations are medically sound and trustworthy. In a December 2020 report from The Great Democracy Initiative focused on government procurement of AI, the authors state: “If the government cannot explain how an AI system works… then it may violate constitutional due process or administrative law. Even if an AI system clears those hurdles, it may still violate federal anti-discrimination laws, privacy laws, and domain-specific laws and regulations.”

The Food & Drug Administration (FDA) has specified that, for an AI-based product to avoid stricter scrutiny as a high-risk medical device, doctors should be able to independently confirm the logical reasoning behind its recommendations. The IMDRF medical risk framework, which the FDA is considering as a risk assessment framework, measures medical device risk by assessing the severity of any medical conditions involved, in addition to the degree of autonomy an algorithm possesses within a patient’s medical decision-making process. For example, an algorithmic system that could take fully automated actions for patient care without any human input, or algorithms that affect patients with terminal illnesses, would fall into the highest risk level. Both of these regulatory policies depend on healthcare professionals being able to understand and trust an algorithm’s internal reasoning for how it produced its recommendations.

According to Google’s definition of explainable AI standards, there are two levels of AI model explainability: global explainability, which describes the higher-level variables that determine a model’s overall logical reasoning, and local explainability, which describes the most influential variables for individual predictions. Global explainability offers a means for auditors and users to verify the medical soundness of an AI model’s reasoning and flag questionable assumptions; this is useful in procurement for assuring the quality and validity of an AI model before deployment. Local explainability offers a way for healthcare professionals to understand why a recommendation is being made, whether to use it, and how to avoid over-reliance on recommendation systems. This can be valuable when a patient is requesting more information about their diagnosis or treatment, and is a key component in trusting AI recommendations.
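The difference between the two levels can be illustrated with a toy linear risk score. The sketch below is only an illustration under stated assumptions: the feature names, weights, and patient values are hypothetical, and real products typically use more sophisticated attribution methods. For a linear model, a global view can rank features by the size of their weights, while a local view looks at each feature's contribution (weight times value) to one patient's score.

```python
# Hypothetical linear risk model; weights are illustrative, not clinical values.
WEIGHTS = {"age_decades": 0.30, "systolic_bp_z": 0.50, "smoker": 0.80}


def global_importance(weights):
    """Global view: rank features by the magnitude of their model-wide weight."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)


def local_contributions(weights, patient):
    """Local view: contribution of each feature (weight * value) to one prediction."""
    return {name: weights[name] * patient[name] for name in weights}


# One hypothetical patient: a 70-year-old non-smoker with slightly raised blood pressure.
patient = {"age_decades": 7.0, "systolic_bp_z": 0.5, "smoker": 0.0}

ranking = global_importance(WEIGHTS)
contribs = local_contributions(WEIGHTS, patient)
top_local = max(contribs, key=lambda name: abs(contribs[name]))
```

In this toy case, smoking status carries the largest weight in the model overall, but for this particular non-smoking patient the score is driven mostly by age. An auditor would review the first view; a clinician explaining one recommendation would look at the second.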

Future Actions: In order to ensure transparency in AI products, procurement officers should:

Prioritize vendors that have strong global explainability capabilities in their products. Global explainability capabilities are important because they allow others to verify the medical validity and explainability of the AI model’s internal logic before the model is approved for use in the real world.

Hire third-party auditors to use these global explainability functions to verify the AI models. Doing so takes the onus of verifying the validity of the model’s internal logic off physicians, who already report high burnout rates linked to IT tool adoption, bureaucratic reporting requirements, and business pressure to increase patient visit volume.

⁠

Ensuring Inclusiveness and Equity

Recommendation: Procurement officials should require evidence from vendors that their products and services do not violate anti-discrimination laws governing protected classes like race, gender, sexuality, and religion.

Background: The Federal Trade Commission released guidance in April 2021 regarding equity and fairness in AI products. This guidance specifically mentions three laws that AI developers should be aware of when creating algorithms:

Section 5 of the Federal Trade Commission (FTC) Act. Section 5 of the FTC Act prohibits unfair or deceptive practices. This includes the sale or use of, for example, racially biased algorithms.

Fair Credit Reporting Act (FCRA). The FCRA comes into play in circumstances when an algorithm is used to deny people employment, housing, credit, insurance, or other benefits.

Equal Credit Opportunity Act (ECOA). The ECOA makes it illegal for a company to use a biased algorithm that results in credit discrimination on the basis of race, color, religion, national origin, sex, marital status, age, or whether a person receives public assistance.

Future actions: In order to ensure inclusivity and equity in AI products, procurement officers should:

Solicit evidence from AI vendors that their algorithmic systems are compliant with these anti-discrimination policies and deliver unbiased results across protected identity classes. Because Section 5 of the FTC Act prohibits the sale or use of racially biased algorithms, both the procuring organization and the contract vendor may face legal liability if a purchased AI tool is found to be racially biased. A contract vendor can provide evidence that satisfies this condition by delivering a report before tool deployment that describes the AI model’s predictive performance across multiple protected classes, like race, gender, religion, nationality, sex, marital status, age, or social benefits status, and demonstrates that the model delivers equal results.

Ensure that vendors continuously monitor the model’s performance and that the model shows consistent and equal performance levels across protected classes over time. These reports should indicate the AI model’s performance across protected classes — including true positives, false positives, true negatives, and false negatives — and consider any histories of systemic inequality that might be exacerbated by the model’s behavior when evaluating fairness and equity.
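A per-group performance report of this kind can be sketched in a few lines. The example below is a minimal illustration, not a prescribed fairness methodology: the group labels and records are made up, and comparing true positive rates is just one of several metrics a report could use.

```python
from collections import defaultdict


def confusion_by_group(records):
    """Tally TP/FP/TN/FN per group from (group, actual, predicted) boolean records."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, actual, predicted in records:
        if predicted and actual:
            key = "tp"
        elif predicted and not actual:
            key = "fp"
        elif not predicted and not actual:
            key = "tn"
        else:
            key = "fn"
        counts[group][key] += 1
    return dict(counts)


def true_positive_rate(c):
    """Share of actual positives the model caught; None if the group has no positives."""
    positives = c["tp"] + c["fn"]
    return c["tp"] / positives if positives else None


# Hypothetical evaluation records: (protected-class group, actual label, model prediction).
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", False, False), ("group_a", True, False),
    ("group_b", True, True), ("group_b", True, False),
    ("group_b", True, False), ("group_b", False, True),
]

report = confusion_by_group(records)
tpr = {group: true_positive_rate(c) for group, c in report.items()}
tpr_gap = max(tpr.values()) - min(tpr.values())
```

Here the model catches two of three true cases in one group but only one of three in the other. A gap like that is the kind of disparity a procurement officer would want the vendor to explain before deployment and to track over time.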

⁠

Promoting Human Safety and Well-being

Recommendation: Procurement officials should set minimum requirements for key safety and reliability standards for AI products to ensure patient safety and high standards of care through service-level agreements and algorithm change protocols.

Background: The use of “service level agreements” (SLAs) is common in software development, where technology vendors guarantee a minimum level of performance for such metrics as availability (uptime) or “time-to-resolution” for any product issues. Failure to meet these performance agreements may result in a credit or refund from the vendor to the customer.

Regulatory bodies like the Food & Drug Administration (FDA) help ensure high-quality, secure medical devices through an approval process that relies on data, careful documentation, and risk assessment to protect patient safety. The FDA regulates medical devices, which it defines as anything “intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment, or prevention of disease, in man.”

Similar to how the FDA uses risk assessment and patient safety standards to ensure high-quality medical devices, AI products can be accompanied by SLAs that promise minimum thresholds of performance and reliability tailored to the risk level of a product’s use. SLAs that are commonly used for AI models include:

Greater than 95% confidence in predictions provided by the product. If a product is unable to meet the confidence threshold, the prediction is either withheld or provided with additional context.

Usability and availability more than 99.9% of the time. Any downtime from product failure or maintenance should stay within this expectation range.

Less than 100 milliseconds of latency for obtaining predictions from the product. This means that the end-to-end process of submitting a prediction request and receiving the prediction should take less than 100 milliseconds.

Less than 2 hours of time-to-response for issue reporting. This means that any issues reported to the vendor should receive a response in less than two hours.

Less than 24 hours of time-to-resolution for critical issues. This means that any critical issues that render the product unavailable should be resolved within 24 hours.
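To make these thresholds concrete, the sketch below checks a set of observed figures against four of the targets listed above and converts the availability target into a downtime budget. It is a minimal illustration: the class and function names are hypothetical, the 30-day period is an assumption, and how latency is aggregated (for example, average or worst case) is something the contract would need to define. The prediction-confidence target is left out because it applies per prediction rather than per reporting period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTargets:
    """Thresholds taken from the example SLAs above."""
    min_availability: float = 0.999      # available more than 99.9% of the time
    max_latency_ms: float = 100.0        # end-to-end prediction latency
    max_response_hours: float = 2.0      # time-to-response for reported issues
    max_resolution_hours: float = 24.0   # time-to-resolution for critical issues


def allowed_downtime_minutes(min_availability, period_days=30):
    """Downtime budget implied by an availability target over a period (assumed 30 days)."""
    return (1.0 - min_availability) * period_days * 24 * 60


def sla_violations(targets, observed):
    """Return the names of the targets that the observed figures fail to meet."""
    violations = []
    if observed["availability"] <= targets.min_availability:
        violations.append("availability")
    if observed["latency_ms"] >= targets.max_latency_ms:
        violations.append("latency")
    if observed["response_hours"] >= targets.max_response_hours:
        violations.append("time_to_response")
    if observed["resolution_hours"] >= targets.max_resolution_hours:
        violations.append("time_to_resolution")
    return violations


# Hypothetical monthly figures reported by a vendor.
observed = {
    "availability": 0.9995,
    "latency_ms": 120.0,
    "response_hours": 1.5,
    "resolution_hours": 30.0,
}
budget = allowed_downtime_minutes(SlaTargets().min_availability)
failed = sla_violations(SlaTargets(), observed)
```

A 99.9% availability target over a 30-day month leaves a downtime budget of roughly 43 minutes, including maintenance. With the hypothetical figures above, the vendor would meet the availability and time-to-response targets but miss the latency and time-to-resolution targets.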

In addition to SLAs, the process of safely deploying updates and patches to the AI model is critical for ensuring stability and uninterrupted patient care. To target this problem, the FDA has proposed an “algorithm change protocol” (ACP) for algorithmic products that seek pre-market authorization. ACPs would require algorithm developers to include a proposed plan of anticipated changes that they foresee being made to the algorithmic system, all of which would be included in the market authorization. Any deviation would require re-approval from the FDA. If enacted, these requirements will likely obligate procurement processes to evaluate the risk severity of a desired tool for FDA premarket clearance, as well as to build out an algorithm change protocol document with the vendor.

⁠

FDA recommendations for building an algorithm change protocol plan (graphic courtesy of FDA; copyright policy)

Future Actions: In order to promote human safety and well-being in AI products, procurement officers should:

State expected SLAs for AI products within RFP documents and ask for AI vendors to meet or exceed those standards in order to be competitive bidders. Having defined performance expectations obligates vendors to prioritize patient safety and health outcomes in order to meet their contractual performance obligations.

Ask AI vendors to submit algorithm change protocols (ACPs), a documentation artifact first proposed by the FDA that consists of lists of anticipated changes that vendors intend to make for a product in the future. Soliciting ACPs from vendors provides assurance that the vendor has carefully considered necessary maintenance or upgrade work required for the product, and that it approaches changes in a deliberate manner, without risking patient safety.

⁠

Protecting Human Autonomy and Privacy

Recommendation: Procurement officials should restrict vendors from selling or sharing data provided by the organization, even when the data is anonymized under HIPAA’s “safe harbor” clause. This may entail preventing vendors from selling or sharing data or products to third parties.

For less strict standards, procurement officers should require certain fields in datasets to be deleted, rather than merely anonymized, to prevent re-identification. For jurisdictions where “right to be forgotten” legislation applies, procurement officials should require vendors to provide individual data deletion capabilities, as well as evidence that the deletion took place.

Background: Obtaining informed consent is an important tool in protecting patients’ rights and privacy, but has become increasingly difficult given the complexity of the medical supply chain. Physicians may be reluctant to spend time-sensitive visits with patients to explain the underlying systems involved in a medical diagnosis or decision.

Many types of health data fall outside of existing health data protections and risk impeding patient trust. For instance, health data collected by tech industry companies currently falls into a regulatory gray zone. This class of sensitive yet unregulated health data consists of behavioral information like social media post activity, internet search history, and vitals data from wearable devices. With such data, companies may be able to infer medical information about patients but are not subject to health privacy laws.

Physicians also may not be aware of the extended network of institutions that have access to a patient’s data and how healthcare algorithms have utilized that information (either for building the algorithm or generating a prediction). Advanced analytics products often use data to iteratively explore many potential use cases, making it difficult to reliably inform patients about how their data is used. Even when those use cases are identified, the underlying technical systems are so complex that it puts a large burden of technical literacy on the patient to understand the benefits and risks. This literacy gap and informed consent problem is exacerbated for vulnerable populations.

Procurement officers may be tempted to follow federal guidelines that allow vendors to share information once it has been anonymized. The Safe Harbor method in the HIPAA Privacy Rule allows protected health information (PHI) to be used for purposes other than the delivery and administration of patient healthcare once it has been de-identified. (The separate HIPAA Safe Harbor law signed in January 2021 concerns recognized security practices rather than de-identification.)

Under the Safe Harbor method, companies must remove 18 specific types of personal data in order to use PHI for research, sharing, or commercial purposes.

The 18 types of fields are:

Names (Full names, birth names, or initials)

Address (any information more specific than state-level, e.g., street address, county, and city)

Dates (other than year) directly related to an individual (e.g., birth date, hospital admission date, discharge date, and date of death)

Phone numbers

Fax numbers

Email addresses

Social Security numbers

Medical record numbers

Health insurance beneficiary numbers

Account numbers

Certificate/license numbers

Vehicle identifiers (serial numbers, license plate numbers)

Device identifiers and serial numbers

Web addresses/uniform resource locators (URLs)

Internet Protocol (IP) address numbers

Biometric identifiers, including finger, retinal, and voice prints

Full-face photographic images and any comparable images

Any other unique identifying number, characteristic, or code, except the unique code assigned by the investigator to code the data
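As a rough illustration of what removing these fields looks like for a structured record, the sketch below drops identifier fields and reduces dates to the year. It is a simplified sketch under stated assumptions, not a compliance tool: the field names are hypothetical, dates are assumed to be ISO-formatted strings, only a subset of the 18 categories is represented, and identifiers buried in free text or images are not handled.

```python
# Hypothetical field names standing in for some of the identifier categories above.
IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "county", "phone", "fax", "email",
    "ssn", "medical_record_number", "insurance_beneficiary_number",
    "account_number", "license_number", "vehicle_id", "device_serial",
    "url", "ip_address",
}
# Dates directly related to the individual are reduced to the year only.
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date"}


def strip_identifiers(record):
    """Return a copy of the record without identifier fields and with dates cut to the year."""
    cleaned = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            continue  # delete the field entirely rather than masking it
        if field in DATE_FIELDS:
            # Assumes an ISO "YYYY-MM-DD" string; keep only the year.
            cleaned[field.replace("_date", "_year")] = value[:4]
        else:
            cleaned[field] = value
    return cleaned


record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "birth_date": "1980-05-17",
    "state": "CA",
    "diagnosis_code": "E11.9",
}
deidentified = strip_identifiers(record)
```

The name and email are removed, the birth date is reduced to the year 1980, and the state and diagnosis code pass through unchanged.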

Anonymization under HIPAA’s safe harbor provision brings its own set of liability risks for algorithmic products. Even after the data is stripped of personally identifying information, sophisticated algorithmic techniques can infer protected health information from anonymized data, which represents a persistent privacy risk.

A research study flagged potential re-identification risks for patients even after their data had been anonymized. Re-identification can also become possible through cross-referencing with other datasets that share a common identifier, or through AI extraction techniques. Anonymized data can also be less useful for analysis when patients are more difficult to track across datasets and institutions.

Future Actions: In order to protect human autonomy and privacy in AI products, procurement officers should:

Require that a vendor seeking to re-share or sell anonymized organization data to third parties conduct a risk assessment analysis that determines the specific fields that should be deleted, and not merely anonymized, to prevent re-identification.

Require that a vendor retain the ability to delete an individual user’s data from their organization, in compliance with any “right to be forgotten” regulation surrounding data privacy standards in the jurisdiction.

⁠

Fostering Responsibility and Accountability

Recommendation: Procurement officials should require human-in-the-loop frameworks in which a human agent makes the final decision on whether to act on an AI recommendation. There should be clear specifications in the procurement requirements about when the procuring organization assumes liability, and when the vendor assumes liability, for an AI system failure.

Background: AI systems subject users to unique liability scenarios, and the number of people who are involved in medical decision-making for a patient's care multiplies as AI systems become involved. Moreover, the black box nature of certain AI models — and the complexity/technical literacy involved in AI explanations — make it difficult to explain exactly how an AI system has reached a decision or prescribed a treatment, which in turn makes it difficult to determine liability. Is the issue with the encoded rules and biases in the algorithm, or with the medical provider that interprets those recommendations in the context of their medical training?

Future Actions: In order to foster responsibility and accountability in AI products, procurement officers should:

Require human-in-the-loop AI systems in which a human is the final decision-maker for accepting recommendations, rather than relying on a fully automated prediction process. This requirement should clearly define the responsibilities of and liabilities for human decision-makers in the purchasing organization. The AI vendor has a responsibility to give human decision-makers as much information as possible to be able to safely accept or reject recommendations made by the AI-based tool. Additionally, as with any other medical device, fixing any manufacturing and product defects in a timely manner is the responsibility of the device supplier.
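One way to picture a human-in-the-loop requirement is as a gate that never acts on a recommendation until a named reviewer has recorded a decision. The sketch below is a hypothetical illustration of that pattern, not a reference design: the class names, fields, and status strings are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    patient_id: str
    action: str
    rationale: str  # explanation the vendor's tool supplies to the reviewer


@dataclass(frozen=True)
class Decision:
    reviewer_id: str  # the accountable human decision-maker
    accepted: bool


audit_log = []  # records who decided what, to support later liability questions


def resolve(recommendation: Recommendation, decision: Optional[Decision]) -> str:
    """Act on a recommendation only after a human has explicitly accepted it."""
    if decision is None:
        return "pending_review"  # nothing happens automatically
    audit_log.append((recommendation.patient_id, decision.reviewer_id, decision.accepted))
    return "execute" if decision.accepted else "rejected"


rec = Recommendation("patient-001", "order follow-up scan", "lesion growth above threshold")
status_before = resolve(rec, None)
status_after = resolve(rec, Decision(reviewer_id="dr-smith", accepted=True))
```

Without a recorded decision the recommendation stays pending. Once a reviewer accepts or rejects it, the outcome and the reviewer's identity are logged, which gives both the procuring organization and the vendor a trail to consult when responsibility for an outcome is in question.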

⁠

Promoting Responsive and Sustainable AI

Recommendation: Procurement officials should require continuous monitoring plans and regularly scheduled third-party audits from vendors to flag product issues over the duration of the contract period.

Background: The process of developing a healthcare technology platform involves significant work, including data collection, storage, analysis, monitoring, and sharing. These pipelines can involve many different parties in both the production and consumption of data, including patients, developers, physicians, advocacy groups, academic researchers, and regulatory government bodies. Currently, no one institution manages and oversees the whole process, which can lead to issues with ensuring accountability throughout the medical pipeline. Such transparency issues in health technology are often exacerbated by the degree of data siloing between different healthcare institutions. As a result of information asymmetry and lack of process transparency, patients and providers are unable to manage the entire healthcare technology pipeline.

Relying on technology vendors to police themselves can also be an imperfect solution. Research ethics boards (REBs) at tech companies, such as Meta’s Internal Review Board, often face challenging internal politics when holding their own institutions accountable. Tech workers commonly do not receive education on handling sensitive health data, and healthcare providers are not always provided with the resources they need to protect patient privacy and rights.

Future Actions: To promote meaningful and sustainable accountability in purchased AI tools, procurement officials should:

Seek out third-party auditor organizations with the necessary skill sets, experience, and technical access to perform regular compliance checks and verify the system still operates according to the contract specifications. When intellectual property is a concern, the technology vendors can request a non-disclosure agreement in order to protect trade secrets.

Require that vendors continuously monitor their products through monitoring plans. These plans should include minimum acceptable performance and data quality standards that the vendor should guarantee over time. The vendor should have a process in place for notifying the procuring organization about changes to their products in advance of deployment, as well as evidence that these changes will not impact product performance.

Include a discontinuation provision in the contract for off-boarding a procured tool if it consistently exhibits unacceptable performance.
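The monitoring and discontinuation actions above can be tied together with a simple rule. The sketch below is a hypothetical example: the metric, the minimum threshold, and the choice of three consecutive reporting periods as the meaning of "consistently" are all assumptions that a real contract would need to define.

```python
def review_performance(history, minimum, max_consecutive_breaches=3):
    """Classify the latest state of a monitored metric.

    history: metric values per reporting period, oldest first.
    Returns "ok", "breach", or "discontinuation_review".
    """
    streak = 0  # number of most recent consecutive periods below the minimum
    for value in history:
        streak = streak + 1 if value < minimum else 0
    if streak >= max_consecutive_breaches:
        return "discontinuation_review"
    if streak > 0:
        return "breach"
    return "ok"


# Hypothetical monthly accuracy figures against a contractual minimum of 0.85.
recovered = review_performance([0.91, 0.84, 0.90], minimum=0.85)
slipping = review_performance([0.91, 0.90, 0.84], minimum=0.85)
degraded = review_performance([0.91, 0.88, 0.84, 0.83, 0.82], minimum=0.85)
```

A single dip that recovers is treated as resolved, a current dip is flagged as a breach, and three consecutive periods below the minimum trigger a review under the discontinuation provision.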

Now that you have reviewed these design principles, head over to the Procurement Language Generator to generate your template and put these principles into practice.
