HEALTH TECHNOLOGY ASSESSMENT METHODS GUIDELINES FOR MEDICAL DEVICES: HOW CAN WE ADDRESS THE GAPS? THE INTERNATIONAL FEDERATION OF MEDICAL AND BIOLOGICAL ENGINEERING PERSPECTIVE

Objectives: Current health technology assessment (HTA) methods guidelines for medical devices may benefit from contributions by biomedical and clinical engineers. Our study aims to: (i) review and identify gaps in the current HTA guidelines on medical devices, (ii) propose recommendations to optimize the impact of HTA for medical devices, and (iii) reach a consensus among biomedical engineers on these recommendations. Methods: A gray literature search of HTA agency Web sites for assessment methods guidelines on devices was conducted. The International Federation of Medical and Biological Engineers (IFMBE) then convened a structured focus group, with experts from different fields, to identify potential gaps in the current HTA guidelines, and to develop recommendations to fill these perceived gaps. The thirty recommendations generated from the focus group were circulated in a Delphi survey to eighty-five biomedical and clinical engineers. Results: Thirty-two panelists, from seventeen countries, participated in the Delphi survey. The responses showed a strong agreement on twenty-seven of thirty recommendations. Some uncertainties remain about the methods to accurately assess the effectiveness and safety, and interoperability of a medical device with other devices or within the clinical setting. Conclusions: As medical devices differ from drug therapies, current HTA methods may not accurately reflect the conclusions of their assessment. Recommendations informed by the focus group discussions and Delphi survey responses aimed to address the perceived gaps, and to provide a more integrated approach in medical device assessments in combining engineering with other perspectives, such as clinical, economic, patient, human factors, ethical, and environmental.

The World Health Organization describes medical devices (MDs) as essential tools for the prevention, diagnosis, and treatment of illnesses and diseases, and for patient rehabilitation (1). The European Parliament and the Council of the European Union define a MD as "any instrument, apparatus, appliance, software, implant, reagent, material or other article intended by the manufacturer to be used, alone or in combination, for human beings" (2). Such devices are used for purposes including diagnosis, prevention, monitoring, prediction, prognosis, treatment, alleviation, or compensation of disease, injury, or disability (2). Moreover, the U.S. Food and Drug Administration specifies that MDs are "intended to affect the structure or any function of the body of man or other animals, and which does not achieve any of its primary intended purposes through chemical action within or on the body of man or other animals and which is not dependent upon being metabolized for the achievement of any of its primary intended purposes" (3).
There are important differences between drug therapies and medical devices. These differences can impact health technology assessment (HTA) methods and can be grouped into five categories: product lifecycle, clinical evaluation, user issues, costs and economic evaluation, and intellectual property (4). This diversity among drugs and MDs results in different assessment needs, evaluation criteria, and approaches.
The perspective of clinical and biomedical engineers is particularly relevant in the assessment of MDs and, given the drug-oriented development of HTA methods and processes in the past decades, it is unlikely to be addressed in many HTA guidelines (5). Some HTA agencies or networks have developed guidelines specific to MDs and diagnostics (6)(7)(8)(9)(10)(11) (Supplementary Table 1). However, it remains uncertain whether these guidelines considered all the unique features of MDs that are relevant to the clinical and biomedical engineering community.
In 2017, Schnell-Inderst et al. (12) developed recommendations for the planning and conduct of systematic reviews of therapeutic MDs. A targeted literature review of methodological publications and guidelines on the design, conduct, analysis, and reporting of primary studies on the clinical evaluation of therapeutic MDs and surgical procedures was performed (12). The recommendations were based primarily on the information from the literature review, judgement and experience of the authors, and discussions with their project partners at MedTech and the European Network for Health Technology Assessment (EUnetHTA) (12). The authors highlighted the necessity to define the intervention and describe the technical characteristics and to identify and measure the potential effect modifiers, such as the incremental development of the device, the experience and learning curve of the user, and contextual factors (12).
The involvement of biomedical and clinical engineers in HTA can provide insights on the technical characteristics, usability, safety, user setting dependences, organizational impact, and maintenance of devices throughout their lifecycle (13)(14)(15). In a 2013 study, Margotti et al. interviewed clinical engineers, healthcare providers, and managers in four public hospitals in Brazil to inquire about their perspectives on the decision-making process in acquiring new medical equipment, HTA, and the identification of aspects to guide HTA in their institutions. Based on the participants' responses, the authors concluded that the decision-making process to introduce new medical equipment was not based on evidence and, therefore, recommended that the hospitals establish a formal HTA process that would involve clinical engineers to integrate medical equipment in the hospitals. The HTA process would be separate from a technical evaluation for procurement decisions (16).
In November 2016, as part of the 2015-18 HTA program, the International Federation of Medical and Biological Engineers (IFMBE)-HTA Division held a focus group to contextualize the differences between MDs and drugs and their impact on HTA. The study objectives were to: (i) review and identify the gaps in the current HTA guidelines on MDs; (ii) propose recommendations (4) to improve the accuracy of MD assessments and compare current HTA MD guidelines with the proposed recommendations; and (iii) reach a consensus among clinical and biomedical engineers on the proposed recommendations with a modified Delphi survey.

METHODS
We conducted a structured focus group to review and identify the gaps in HTA guidelines for MDs, and we developed a modified Delphi survey to establish consensus within the community of biomedical and clinical engineers on the recommendations proposed to address the perceived gaps. The study received full ethics approval by University of Warwick Ethical BSREC Committee (REGO-2017-2072).

Focus Group
Health Technology Guidelines. Guidelines were obtained from a limited gray literature search of HTA agency Web sites and from contributions among the study investigators (7)(8)(9)(10)(11). They were identified through a focused Internet search on HTA organizations and research initiatives up until May 2018. These searches were supplemented by reviewing the bibliographies of the guidelines and through contacts with appropriate experts and manufacturers. HTA guidelines were selected for inclusion if they were specific to any type of MD and described methods to evaluate medical technology. Guidelines that presented solely the clinical review methods, were focused on HTA processes only, or were not explicit about the appropriate methods for medical device assessments were excluded.
Six HTA guidelines (6)(7)(8)(9)(10)(11), published from 2011 (7) to 2018 (10), were selected for inclusion in the review. A standardized data extraction form was designed a priori to document and tabulate all relevant information from the included guidelines. One investigator (J.P.) extracted information for each HTA component in the guidelines, and a second investigator (R.C.) reviewed the extracted information. Another investigator (L.P.) prepared a table to extract the relevant information from the guidelines (17). Accordingly, descriptions of how the HTA guidelines addressed the product lifecycle, clinical evaluation, user issues, and costs and economic evaluation for medical devices were extracted.
A narrative summary of the HTA methods described in the guidelines was prepared. The methods described in the HTA guidelines were contrasted with the proposed recommendations developed by the focus group to determine which recommendations were addressed in full or partially in the existing guidelines and which ones were not addressed at all. The aims of the focus group were to: (i) contextualize the differences between MDs and drugs in terms of their characteristics and functionality (18;19); (ii) review and expand the list of the main differences between MDs and drugs that can impact the HTA methods identified in previous studies (4); and (iii) develop a set of recommendations to fill any identified gaps in HTA guidelines to assess MDs.
Using the narrative summary of the guidelines, the focus group discussed and developed an initial set of recommendations. These initial recommendations were then mapped onto the European Network for Health Technology Assessment (EUnetHTA) (4) domains (i.e., health problem and current use of the technology, description and technical characteristics of technology, safety, clinical effectiveness, costs and economic evaluation, ethical analysis, organizational aspects, patients, social, and legal aspects).

Delphi Survey
Survey Design. The Delphi method is a process that uses a series of rounds of questionnaires to gather information in a structured manner until consensus is obtained (20). For our study, we conducted an online Delphi survey. This approach (i) preserved the anonymity of each panelist; (ii) allowed us to reach panelists from various geographic locations; and (iii) reduced the risk of one or more panelists dominating the consensus process (20)(21)(22). The aim of the Delphi process for the present study was to obtain consensus from panelists with different perspectives and expertise, and not to achieve statistical power (21). In particular, the objective of the Delphi survey was to achieve consensus on the thirty recommendations proposed by the focus group to fill the perceived gaps in the HTA process to evaluate MDs.
The online Delphi survey (Supplementary File 1) consisted of a demographic section and a section to inquire about the panelist's agreement level with the thirty recommendations proposed by the focus group to enhance the HTA guidelines for MDs.
Before a wider distribution, the survey was piloted among a handful of clinical and biomedical engineers to estimate the time to complete the survey, and to review the flow of interaction, coherence, and appropriateness of format and contents.
Delphi Process. We invited a purposive sample of approximately eighty-five international professionals through an email invitation. The panelists were identified through the IFMBE associates, study investigators' networks, and scientific and professional societies. They were drawn from various areas of the healthcare sector and selected based on their experience with health technology assessments, including design, development, testing, implementation, selection, procurement, reimbursement, and maintenance.
Our email invitation included a letter inviting the individual to participate, and described the study objectives, an overview of the Delphi methods, the expected time to complete the survey, and a link to the survey. The invitation also indicated that any information provided would be used in a publication. If the invitees were unable to participate in the Delphi process, we asked them to suggest alternates whom they believed would be appropriate for the study.
A 5-point Likert scale was applied, where "1" was strongly disagree and "5" was strongly agree. Median scores were calculated per recommendation to characterize the answer category above and below which 50 percent of the answers fall. Interquartile ranges (IQRs) were used to represent the spread of the data and to assess the level of consensus per recommendation. Ratings with a median ≤ 2 (i.e., high level of disagreement with the proposed recommendation) and a narrow IQR (i.e., IQR between 1 and 2) were considered to have reached consensus on a strong disagreement. Those with a median ≥ 4 and an IQR between 4 and 5 were considered to have reached consensus on a strong agreement with the proposed recommendation. In addition, the free text comments were assessed, especially for ratings that seemed to be equivocal, as they can provide some insights on the respondents' thought process (23).
As consensus (i.e., median ≥ 4) was reached for all proposed recommendations in Round 1, another round of Delphi iterations was not conducted.
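The consensus rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name `classify_consensus`, the label strings, and the use of inclusive quartiles are all hypothetical choices, since the quartile method is not reported. The example scores are invented for illustration only.

```python
from statistics import median, quantiles


def classify_consensus(scores):
    """Classify one recommendation's 1-5 Likert scores by median and IQR."""
    med = median(scores)
    # Assumption: inclusive quartiles; the quartile method is not reported.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    if med >= 4 and q1 >= 4 and q3 <= 5:
        label = "strong agreement"
    elif med <= 2 and q1 >= 1 and q3 <= 2:
        label = "strong disagreement"
    elif med >= 4:
        # Median meets the agreement threshold but the IQR is wider (e.g., 3 to 5).
        label = "agreement, wider spread"
    else:
        label = "no consensus"
    return med, (q1, q3), label


# Hypothetical ratings for two recommendations (not study data).
narrow = [4, 4, 5, 5, 4, 5, 4, 5]
wide = [3, 3, 4, 4, 4, 5, 5, 5]
print(classify_consensus(narrow))
print(classify_consensus(wide))
```

Panelists who skipped an item would simply be left out of that item's score list, which is why the number of panelists can differ per recommendation.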
Table 1 summarizes the output of the focus group in terms of potential recommendations to improve the HTA process to evaluate MDs. The table presents the proposed recommendations and their association with the EUnetHTA domains, the MD characteristics, and the impact of those characteristics on the application of HTA methods. The results of the focus group discussions were organized into four main MD characteristics: product lifecycle, clinical evaluation, issues in use, and costs and economic evaluation.

Focus Group
Product Lifecycle. Compared with drug therapies, MDs usually have a shorter lifecycle due to iterative innovations, and the ease of overcoming patent protections often results in previous generations of devices becoming obsolete. One direct consequence of a shorter product lifecycle is that the evaluation of MDs at market launch usually relies on limited evidence to measure their safety, efficacy, effectiveness, and value for money. Additional items related to the medical device lifecycle include the maintenance required, potential instability of individual parts that can impact the overall performance of the device, and possible interferences with or dependency on other devices. A lack of consideration for any of these items can result in a risk of inaccurate estimates related to the efficacy, effectiveness, safety, and cost-effectiveness of the device. Some recommendations proposed to help address these perceived gaps include obtaining as much data as feasible to more accurately predict the medical device lifespan, using appropriate statistical methods to analyze the different types of evidence, understanding the maintenance requirements with validated tools (e.g., ISO and IEC standards), and conducting risk assessments and technical analyses of the minimum requirements for possible interferences with other medical devices (Table 1).
Clinical Evaluation. The use of MDs involves a longer learning curve if the design is more complex, which affects the estimation of the clinical efficacy and effectiveness, safety, cost-effectiveness, and service provision of the device. To help ensure the accuracy of the estimates, it is suggested to use both pre- and postmarket and registry data that capture the impact of the learning curve on outcomes, ensure that device users receive appropriate training to reduce the risk of bias in measuring clinical efficacy and effectiveness, and apply appropriate statistical methods to incorporate the impact of the learning curve into the estimates of outcomes and costs (Table 1).
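To make the idea of estimating a learning curve from case-level data concrete, the sketch below fits a power-law curve, outcome = a * n^(-b), by ordinary least squares on the log-log scale. The power-law form is an assumption chosen for illustration (it is one common parameterization, and no specific model is prescribed here); the function name `fit_power_law` and the procedure times are hypothetical.

```python
import math


def fit_power_law(case_numbers, outcomes):
    """Fit outcome = a * n**(-b) by least squares on log-transformed data.

    Returns (a, b). Both inputs must be strictly positive.
    """
    xs = [math.log(n) for n in case_numbers]
    ys = [math.log(y) for y in outcomes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical procedure times (minutes) for an operator's first eight cases.
cases = [1, 2, 3, 4, 5, 6, 7, 8]
minutes = [62.0, 50.0, 45.0, 41.0, 39.0, 37.0, 35.5, 34.0]
a, b = fit_power_law(cases, minutes)
print(f"estimated first-case time: {a:.1f} min, learning exponent: {b:.2f}")
```

A fitted curve of this kind could then feed an economic model, for example by letting procedure time or complication probability depend on the operator's case number rather than assuming a constant value.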
Issues in Use. Most of the guidelines reviewed did not provide any recommendations on how to identify and assess the technical requirements and logistics, setting, training, and accreditation considerations for the user. The NICE-DAP (6) mentioned that in some instances special implementation issues and recommendations for use of a diagnostic test were identified in their reports, and Health Quality Ontario (HQO) considers policies or legislation that can impact the implementation of the health technology in their jurisdiction (10). The recommendations in Table 1 indicate the need for learning curve estimations and the collection of data that reflect the setting in which the MD will be used.
Costs and Economic Evaluation. Most guidelines discussed costs and economic evaluations. The EUnetHTA guidelines focused on the methods to assess the relative clinical effectiveness of therapeutic medical devices, but did not discuss cost-effectiveness and other non-clinical benefits or harms (8). In addition to primary economic evaluations, HQO also indicated that budget impact analyses may be included in their HTA reports, and that they conduct systematic reviews of the economic evidence for the health technology (10).
Costs associated with the supply, installation, training of users, supply of consumables, maintenance, and ongoing facilities are important considerations to appropriately evaluate the budget impact and cost-effectiveness of an MD. Compared with drugs, different purchasing methods and financial schemes may be available to the payers willing to adopt a new medical device. For example, manufacturers may offer to supply infusion devices free of charge to a hospital provided that they also contract to purchase a given volume of giving sets. Such arrangements are often referred to as "consumables deals." As well, hospitals use different ways to consolidate their purchasing demand, sometimes at the interdepartmental, inter-organizational, or system level. In European and other countries, most highly innovative MDs are reimbursed through diagnosis-related group-based payment systems, which cover expenditures of hospital providers and encourage lowering costs. Long-term tenders that procure large quantities of specific products are common and facilitate interaction with the suppliers while allowing clinicians to gain experience with the devices; however, short-term targeted contracts are also used in practice (24,25).
Consideration of such schemes is relevant when conducting an economic evaluation (26). The proposed recommendations suggest incorporating all operating costs and the appropriate financial model to acquire the device in economic evaluations and budget impact analyses. The operating costs will be different for single-use devices.
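How the acquisition scheme can change the cost side of an evaluation may be easier to see with a small sketch. The code below compares the discounted lifetime cost of an outright purchase with a "consumables deal" in which the device is supplied free of charge against a contracted consumables volume. All function names, cost categories as parameters, and figures are hypothetical and chosen for illustration; a real analysis would follow the costing conventions of the relevant HTA agency.

```python
def discounted_total(cash_flows, rate):
    """Present value of yearly costs; year 0 is undiscounted."""
    return sum(cost / (1 + rate) ** year for year, cost in enumerate(cash_flows))


def purchase_flows(price, install, training, consumables, maintenance, years):
    """Yearly costs when the device is bought outright in year 0."""
    first_year = price + install + training + consumables + maintenance
    return [first_year] + [consumables + maintenance] * (years - 1)


def consumables_deal_flows(install, training, contracted_consumables, maintenance, years):
    """Yearly costs when the device is free but consumables are contracted."""
    first_year = install + training + contracted_consumables + maintenance
    return [first_year] + [contracted_consumables + maintenance] * (years - 1)


# Hypothetical figures for one device over a five-year horizon.
buy = purchase_flows(price=10000, install=500, training=300,
                     consumables=1000, maintenance=200, years=5)
deal = consumables_deal_flows(install=500, training=300,
                              contracted_consumables=2500, maintenance=200, years=5)
rate = 0.03  # assumed annual discount rate
print(f"purchase: {discounted_total(buy, rate):.0f}")
print(f"consumables deal: {discounted_total(deal, rate):.0f}")
```

Which option is cheaper depends on the horizon, the contracted volume, and the discount rate, which is the reason the financial model needs to be stated explicitly in the evaluation.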
The focus group prepared a list of recommendations for each of the ten identified gaps in the HTA guidelines on medical devices. These recommendations and associated gaps were then used to develop the Delphi survey (Supplementary File 1).

Delphi Survey
Characteristics of Panelists. We invited eighty-five professionals to participate in the study, and thirty-two completed the survey (37.6 percent) (Table 2). Between one and three panelists did not provide scores for several recommendations. Panelist characteristics are presented in Table 2. The panelists represented seventeen countries, ranging from five panelists from Mexico (15.6 percent) to one each from Belgium, Chile, China, Cuba, Ecuador, Egypt, Italy, the Netherlands, Spain, and the United Kingdom. Clinical engineers (n = 12; 37.5 percent) and managers (n = 9; 28.1 percent) each accounted for over 25 percent of panelists. Almost 70 percent completed graduate studies (masters: n = 10; 31.1 percent and doctorate: n = 12; 37.5 percent), and approximately 85 percent had 11 years or more of service. The majority of respondents were involved in HTA (n = 22; 68.8 percent) or health technology management (n = 18; 56.3 percent), and over 20 percent indicated an acute hospital (n = 8; 25 percent) or university (n = 7; 21.9 percent) as their principal employer.

Table 1 (continued)
• Potential instability of individual parts (e.g., software)
• Risk of inaccurate data and device failure may impact safety, efficacy, effectiveness, and cost-effectiveness
• Possible interferences with other MDs (e.g., radiofrequency)
• There may be minimum requirements in terms of organization (e.g., personnel), technology (e.g., radiofrequency interferences) and structure (e.g., physical spaces)
• Data exchange and interoperability
• Organizational aspects (ORG)
• Possible dependency on other MDs
• Risk of inaccurate data if other devices fail or are ineffective (e.g., working sub-optimally)
• Understand the setting and map the process of the device use
• Conduct a risk assessment to identify and assess level of dependency on other devices
• Clinical Effectiveness (EFF)
• Costs and economic evaluation (ECO)
• Longer learning curve
• Impact on the estimation of efficacy, effectiveness, user satisfaction, safety, cost-effectiveness, and service provision
• Use both pre-market (e.g., usability and risk assessment of the device use; ISO 62336-1 2015) and post-market data to capture impact of learning curve on outcomes
• Collect and report data on the effects of learning on relevant procedural and clinical outcomes during clinical trials, both at the physician and centre levels
• Collect registry data that allow the estimation of the learning curve based on routine use of the MD once it has been adopted in clinical practice
• Ensure that device users enrolled in a trial have received appropriate material for the device use (guidelines and service process of use) and training to reduce the risk of bias in measuring clinical efficacy and effectiveness
• Use appropriate statistical methods to incorporate the learning curve into the measurement of relevant outcomes and costs
• Costs and economic evaluation (ECO)
• Organizational aspects (ORG)
• Safety (SAF)
• Clinical Effectiveness (EFF)
• Designing a randomised control trial (RCT) for a MD is more challenging than for drugs
• Blinding is a challenge in a study with MDs.
• Unlike drugs, MDs are diagnostic and therapeutic, and they can influence the clinical decision making process and the patient's clinical care pathway.
• Adopt appropriate study designs for MDs. The design can include preliminary phases of clinical pathway mapping and qualitative analysis to identify the most appropriate setting, comparators and variables to be considered in a trial.
• Reinforce the use of simulation in case of incremental innovation
• See above (i.e., longer learning curve)
Issues in use
• Performance is more strongly dependent on user and context of use
• See above (i.e., learning curve)
• Efficacy and effectiveness, satisfaction (i.e., usability) are also dependent on user workload, stress level, etc. The performance in the use of a device is a context dependent factor. In this instance, the context is defined as "users, tasks, equipment (hardware, software and materials), and the physical and social environments in which a product is used" (ISO 9241-11:1998) (37)
• See above (i.e., longer learning curve)
• Identify the setting and collect data on outcomes in the user training phase
• See above (i.e., longer learning curve)
• Often requires intensive training
• See above (i.e., learning curve)
• See above (i.e., longer learning curve)
Consensus on Recommendations. The median, the IQR, and the number of panelists for each recommendation are presented in Table 3.
In Round 1, most of the recommendations reached a consensus with a strong agreement (i.e., median ≥ 4 and IQR between 4 and 5). Two recommendations had a median ≥ 4 but an IQR between 3 and 5: one on the application of sensitivity analyses in economic models to assess the impact on the results of varying lifespans or incremental innovations, and one on the processes of data exchange and the interoperability of the device within the hospital system or with other devices. As well, one recommendation centered on the appropriate methods to assess the evidence on medical device effectiveness and safety had a median equal to 4, but with an IQR between 3 and 4.
Although the panelists agreed with the proposed recommendations, several commented on their feasibility based on the availability of high-quality evidence and questioned whether existing statistical methods and economic models used in HTAs can accurately assess the effectiveness and safety of more complex health technologies. In addition, there was some uncertainty on whether an HTA is the appropriate approach to evaluate the interoperability of a MD with other devices or within the hospital system and if a more technical assessment conducted by entities with technical expertise is warranted.
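The recommendation on sensitivity analyses for varying device lifespans can be illustrated with a one-way sensitivity sketch. The example below computes an equivalent annual cost using the standard annuity formula and varies only the assumed lifespan. The function name `equivalent_annual_cost`, the discount rate, and all cost figures are hypothetical; this is an illustration of the technique, not a method prescribed by the focus group.

```python
def equivalent_annual_cost(price, lifespan_years, rate, annual_maintenance=0.0):
    """Spread the purchase price over the lifespan as a constant yearly cost.

    Uses the annuity factor (1 - (1 + r)**-n) / r; falls back to a simple
    average when the discount rate is zero.
    """
    if rate == 0:
        annualized_price = price / lifespan_years
    else:
        annuity_factor = (1 - (1 + rate) ** -lifespan_years) / rate
        annualized_price = price / annuity_factor
    return annualized_price + annual_maintenance


# One-way sensitivity analysis on lifespan (hypothetical device and costs).
for lifespan in (3, 5, 8, 10):
    eac = equivalent_annual_cost(price=10000, lifespan_years=lifespan,
                                 rate=0.03, annual_maintenance=300)
    print(f"lifespan {lifespan:>2} years: equivalent annual cost {eac:.0f}")
```

If a newer device generation is expected to replace the current one early, the shorter lifespan raises the annualized cost, which can change a cost-effectiveness conclusion even when clinical outcomes are unchanged.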

DISCUSSION
This study aimed to reach a consensus via an online Delphi survey on recommendations proposed by the IFMBE-HTA Division to address the perceived gaps in current HTA methods guidelines for MDs. The thirty-two participants were experts with numerous years of experience in the field of HTA or health technology management in an academic hospital or university. Consensus with a strong agreement was achieved in the first round for 90 percent (27/30) of the recommendations related to the product lifecycle, clinical evaluation, issues in use, and costs and economic evaluations. For the remaining recommendations, consensus was reached but the IQR was wider (e.g., 3 to 5). Comments indicated that panelists were uncertain about the feasibility of accurately measuring the effectiveness and safety of a medical device throughout the product lifecycle given the available evidence and existing statistical methods and economic models used in HTAs. Furthermore, the appropriateness of an HTA to assess the interoperability of a medical device with other devices or within a hospital system was questioned and a more technical assessment was proposed instead.
Unlike drugs, the permission to market MDs may not be based solely on the evaluation of efficacy and safety data from randomized clinical trials. Although manufacturers must perform clinical studies on human subjects for devices labeled as high risk by the regulators, there are no explicit standards on the sample size, design, or follow-up period required (27)(28)(29). Panelists in the Delphi survey commented that to accurately assess the effectiveness and safety of MDs, more appropriate evidence is required as well as the need for relevant statistical and economic methods to measure the outcomes.
Table 1 (continued)
Timeframe to perform a complete HTA is much reduced in the MD lifecycle compared with drugs. Spending several months or years to conduct an HTA may result in an outdated or obsolete report (e.g., a newer version of the device is available).
• Limited evidence is available to meet the objectives of HTA
• The time horizon of the economic evaluation may be inaccurate
• Estimates of cost recovery may be inaccurate in the cost-effectiveness analyses
1. Use the available evidence to accurately estimate the cost-effectiveness of the MD and to quantify its uncertainty. When evidence is lacking, HTA experts could run clinical performance and usability analysis to gather relevant insights for their analysis.
Maintenance of the device and the characteristics of the services for the device may impact costs, efficacy, effectiveness, and safety of the MD over its lifespan.
4. Obtain additional insights about the maintenance required, and capture maintenance impact in the HTA by using appropriate methods of contextual inquiry. The contextual inquiry can include, for instance, gathering information about the service requirements in terms of preventive maintenance planning, costs, and downtime, and gathering qualitative data about the needs of the stakeholders of the service.

[...] process because they can identify durability and rare serious adverse events from long-term use of the device (32). Given the important methodological issues and contextual considerations that require attention in the assessment of MDs, research initiatives are under way to address them (18,19,31,32,33).
Delphi panelists also agreed with the focus group results that the adoption of a medical device has a more significant impact on an organization than the introduction of a new drug therapy, owing to implementation considerations and to maintenance issues for devices that last longer than single-use devices. Furthermore, a medical device can have numerous applications and a shorter lifecycle, and its effectiveness is affected by the user interaction, setting, and the learning curve to operate it (27)(28)(29).
An important gap in HTAs, identified by focus group experts and confirmed by panelists, is the need for methods to estimate the impact of the learning curve on the effectiveness, safety, and costs associated with the use of MDs; relevant evidence is also lacking. Methods for the estimation of the learning curve are well established, although these have not yet been incorporated in guidelines or used to inform postmarket data collection (34). The incorporation of the learning curve and contextual factors into economic evaluations still represents a great challenge in MD assessments (35).
In terms of risk assessment, some panelists believed that the original manufacturers of the device should be responsible for reporting risks, including the required maintenance, and provide user, technical, and maintenance training to professionals accountable for the application and maintenance of the MD.
Panelists identified value-based procurement as a relevant forum to assess the impact of the maintenance required and of the financial models (e.g., leasing agreements) represented in economic evaluations, because an HTA in its current form would be unlikely to address these items. Value-based procurement is a process that focuses on health system performance, patient outcomes, and longer-term cost efficiencies, and on working with suppliers to identify opportunities to develop innovative products and services (36).

Limitations
The interpretation and application of our study may be influenced by several limitations. The survey presented a description of each gap in the current HTA methods guidelines for medical devices and proposed recommendations to address them. Although none of the panelists requested further information or clarifications, it is uncertain how the panelists interpreted the descriptions and the items of the survey. Some element of complexity due to the textual presentations of gaps and associated recommendations may also have led some respondents to drop out of the Delphi process. Finally, the panelists represented a variety of perspectives, but it is not feasible to identify all clinical and biomedical engineering perspectives based on the number of survey respondents (n = 32).

Directions for Future Research
The recommendations put forth will help to provide a more integrated and enhanced approach to MD assessments. Based on the scores, the panelists agreed with the proposed recommendations, but remained uncertain about the feasibility of some of them, such as assessing the effectiveness and safety of a medical device in the product lifecycle based on the evidence and current statistical methods and economic models. A more technical assessment was proposed to evaluate the interoperability of a MD with other devices or within a hospital system. Future research is warranted on which proposed recommendations are more feasible for an HTA and which are more suitable for a study design or technical assessment of medical device use. Another follow-up study to this research initiative would be to collaborate with HTA units at different levels in the healthcare system to assess the feasibility of implementing these recommendations based on the available data and evidence and to adapt existing methods or develop new ones accordingly.
In conclusion, as the MD characteristics and functionality differ from drug therapies, current HTA methods may not accurately reflect the conclusions of their assessment. Our paper presents recommendations informed by the focus group discussions and Delphi survey responses that aim to address the perceived gaps in the HTA guidelines and to provide a more integrated and improved approach to HTAs on MDs. They were organized into four categories: product lifecycle, clinical evaluation, issues in use, and costs and economic evaluation. According to the scores from the survey responses, consensus was achieved for all recommendations among thirty-two international panelists in one round. As there is uncertainty about the feasibility of implementing some of the recommendations, future research can involve an evaluation of how to implement them in an assessment based on the available data, evidence, statistical methods, and economic models used in HTAs.