Key Points for Decision Makers

Ramucirumab (RAM) monotherapy and combination therapy appear to be clinically effective for patients with advanced gastric cancer or gastro-oesophageal junction adenocarcinoma previously treated with chemotherapy: in the REGARD and RAINBOW trials, monotherapy and combination therapy led to limited increases in overall survival and progression-free survival compared with best supportive care (BSC) and paclitaxel alone, respectively.

The evidence review group (ERG) considered that comparators other than BSC and docetaxel that were listed in the scope (e.g. paclitaxel, irinotecan) should also have been included. Furthermore, the ERG noted that the results of the indirect treatment comparison (ITC) should be interpreted with caution, owing to considerable heterogeneity between the trials and the company's exclusion of potentially relevant trials from the ITC. Finally, the ERG had concerns about the generalizability of the evidence to the UK patient population.

The Appraisal Committee (AC) considered that paclitaxel is the most appropriate comparator for the combination therapy, as its comparative effectiveness is based on direct evidence. The AC also considered some of the inputs should be adjusted for the UK population.

In the end, it was concluded that ramucirumab alone or with paclitaxel could not be considered a cost-effective use of National Health Service resources.

In this appraisal, all incremental cost-effectiveness ratios (ICERs) from the company submission (base-case, sensitivity and scenario analyses) were far above the currently accepted threshold. For such cases, a faster appraisal procedure might be more efficient.

1 Introduction

The National Institute for Health and Care Excellence (NICE) is an independent organization providing national guidance on promoting good health and preventing and treating ill health [1]. The single technology appraisal (STA) process is designed to provide recommendations on a single product, device or other technology with a single indication. The process covers new technologies and enables NICE to produce guidance shortly after the technology is introduced into the UK. The NICE Appraisal Committee (AC) obtains relevant evidence from several sources: the company submission (CS), a report from the appointed independent Evidence Review Group (ERG) and advice from consultees (i.e. patients, experts and other stakeholders). The CS includes a written report and a mathematical model that describes the clinical and cost effectiveness of the technology under investigation. The ERG, an external organization independent of NICE, reviews the CS and produces a report that summarizes and critiques the submitted evidence. After consideration of all the relevant evidence, the AC formulates preliminary guidance in the form of the Appraisal Consultation Document (ACD) as to whether or not to recommend the intervention. The stakeholders are invited to comment on this ACD and the submitted evidence. A subsequent ACD may then be produced, or a Final Appraisal Determination (FAD) issued. Once published, NICE technology guidance places a legal obligation on NHS providers to reimburse technologies that have been approved. This paper presents a summary of the ERG report and the development of NICE guidance based on the findings of the AC for the STA of ramucirumab (alone or in combination with paclitaxel), for treating advanced gastric cancer or gastro-oesophageal junction (GC/GOJ) adenocarcinoma previously treated with chemotherapy. Full details of all the relevant appraisal documents can be found on the NICE website [2].

2 The Decision Problem

The indication GC/GOJ refers to cancers that originate in the lining of the stomach and the gastro-oesophageal junction. GC/GOJs are rare (designated orphan status by the European Medicines Agency [EMA]) and aggressive cancers. The annual incidence of GC/GOJ is low in the UK; in 2012, there were 5637 new cases of GC and 3085 new cases of GOJ [3]. Of patients diagnosed with GC, approximately 80% are diagnosed with advanced, metastatic disease [4]. The prognosis in this group is very poor with a 5-year survival rate of approximately 5% [5].

For inoperable patients with advanced GC/GOJ, chemotherapy is administered. The standard first-line treatment in the UK is a regimen comprising a fluoropyrimidine and a platinum agent, with or without an anthracycline [6]. For GC/GOJ, there are two NICE guidance documents available, capecitabine in combination with a platinum-based regimen [7], and trastuzumab in combination with fluoropyrimidine [8], both recommended for first-line treatment.

There is currently no established second-line treatment and as a consequence, second-line treatments for GC/GOJ patients vary [6]. Docetaxel, paclitaxel (PAC) and irinotecan are among the therapies that are administered.

Ramucirumab is a human receptor-targeted monoclonal antibody that specifically binds to the vascular endothelial growth factor receptor. It is approved and has been designated orphan status by the EMA [10, 11] for the treatment of adult patients with advanced GC/GOJ with disease progression after prior platinum or fluoropyrimidine chemotherapy:

  • in combination with paclitaxel (RAM + PAC);

  • as monotherapy (RAM), in patients for whom treatment in combination with PAC is not appropriate.

The remit of this appraisal was specified by NICE’s final scope [2], which was to assess the clinical and cost effectiveness of ramucirumab within its licensed indication. In line with its licence, the scope addressed the use of RAM + PAC as well as RAM, the latter option being for patients for whom treatment in combination with PAC is inappropriate. The comparators listed in the scope were docetaxel monotherapy, irinotecan monotherapy, irinotecan- and fluorouracil-based therapy (FOLFIRI), PAC monotherapy and best supportive care (BSC).

3 The Independent Evidence Review Group (ERG) Report

Kleijnen Systematic Reviews Ltd (KSR), in collaboration with Erasmus University Rotterdam, acted as the ERG, and reviewed the evidence on the clinical and cost effectiveness of RAM + PAC and RAM among adults with advanced GC/GOJ, who were previously treated with chemotherapy as submitted by the company (Eli Lilly and Company).

The review embodied three aims:

  • to assess whether the CS conformed to the methodological guidelines issued by NICE [1];

  • to assess whether the company’s interpretation and analysis of the evidence were appropriate;

  • to indicate the presence of other sources of evidence or alternative interpretations of the evidence that could inform NICE guidance.

The ERG critically reviewed the evidence in the CS, in the response to clarification questions and evidence provided after the publication of the ACD. Furthermore, it conducted additional literature searches, explored the impact of assumptions on the incremental cost-effectiveness ratio (ICER), revised the economic model and explored additional scenario analyses.

3.1 Summary of the Clinical Evidence

The CS included a systematic review of the literature on the clinical/cost effectiveness of ramucirumab.

3.1.1 Ramucirumab Monotherapy

Estimation of the efficacy of RAM compared with BSC relied on the REGARD trial (Ramucirumab monotherapy for previously treated advanced gastric or gastro-oesophageal junction adenocarcinoma) [9]. In this global, multicentre trial, adult patients with advanced GC/GOJ, who progressed after chemotherapy (n = 355), were randomized (2:1) to receive BSC and either ramucirumab (8 mg/kg administered intravenously [IV] every 2 weeks) or placebo.

Randomization was stratified by weight loss, geographic location and location of the primary tumour. Overall survival (OS) was the primary outcome. Key secondary outcomes included progression-free survival (PFS), overall response rate (ORR) and quality of life (QoL). All analyses were performed using the intention-to-treat (ITT) population. The OS and PFS results are given in Table 1.

Table 1 OS and PFS results from the REGARD study

Health-related QoL in REGARD was assessed using the European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 instrument [10]. At 6 weeks, the proportion of patients with improved or stable QoL was higher in the ramucirumab arm (34.1%) than in the placebo arm (13.7%), but the difference between the two arms was not statistically significant (p = 0.23).

Overall safety results from REGARD showed that 45% of the patients in the ramucirumab arm had at least one serious adverse event (AE), compared with 44% of the patients in the placebo arm. The proportion of patients who stopped treatment was 10.5% in the ramucirumab arm and 6% in the placebo arm.

3.1.2 Ramucirumab plus Paclitaxel

Estimation of the efficacy of RAM + PAC compared with PAC relied on one global, multicentre trial (RAINBOW; Ramucirumab plus paclitaxel versus placebo plus paclitaxel in patients with previously treated advanced gastric or gastro-oesophageal junction adenocarcinoma) [11], in which patients with advanced GC/GOJ who had disease progression after chemotherapy (n = 665) were randomized (1:1) to receive RAM 8 mg/kg plus PAC 80 mg/m², or placebo plus PAC 80 mg/m², all administered IV. RAM was given every 2 weeks, whereas PAC was given on Days 1, 8 and 15 of each 28-day cycle.

Randomization was stratified by geographic location, time to progression from the start of first-line chemotherapy (<6 months or not) and disease measurability (measurable or not). The primary outcome was OS, key secondary outcomes included PFS, ORR and QoL. All analyses were performed on the ITT population. The OS and PFS results are given in Table 2.

Table 2 OS and PFS results from the RAINBOW study

Health-related QoL in RAINBOW was also assessed using the EORTC QLQ-C30 instrument [10]. Compared with PAC, RAM + PAC was associated with statistically significantly improved outcomes on two scales (i.e. emotional function and nausea and vomiting).

A similar percentage of patients in both arms stopped treatment because of AEs (11.8% in the RAM + PAC arm and 11.3% in the PAC arm). The most frequently reported treatment-emergent serious AE was neutropenia, 54.4% in the RAM + PAC arm and 31.0% in the PAC arm.

3.1.3 Network Meta-Analysis (NMA)

The company carried out an NMA to compare the OS of RAM + PAC with BSC and docetaxel. The company identified 23 trials for inclusion in the systematic review; however, only the five trials listed in Table 3 were included in the NMA of OS. The other 18 trials were not included as they were considered to compare treatments that were not in the scope [2]. However, among these excluded studies, Sym et al. [12] compared irinotecan with FOLFIRI, both of which were listed as comparators in the scope. At the ERG's request, the company incorporated FOLFIRI into the evidence network, using Sym et al. [12], in its response to the clarification letter.

Table 3 Trials that were included in the network meta-analysis

The hazard ratio (HR) results from the OS NMA are given in Table 4.

Table 4 Hazard ratio results from the overall survival network meta-analysis

In order to compare the PFS of RAM + PAC with BSC and docetaxel, additional assumptions were needed since some studies did not report PFS HRs [13, 14, 16]. Using the PFS HRs reported [9, 11, 15] and assuming HR = 1 (standard error [SE] = 0.01) between RAM, irinotecan and docetaxel, the results from the NMA suggested that RAM + PAC was associated with a statistically significantly improved PFS compared with BSC (HR 0.27; 95% CI 0.14–0.53) and docetaxel (HR 0.56; 95% CI 0.41–0.76).

3.2 Critique of the Clinical Evidence

3.2.1 Ramucirumab Monotherapy

REGARD was deemed to be a good-quality randomized controlled trial. The uncertainty about long-term efficacy was considered to be small, because both OS and PFS data were mature.

In the final scope, RAM monotherapy was indicated for patients who were not suitable for PAC. However, in REGARD’s eligibility criteria, nothing was mentioned on patients’ suitability for PAC treatment. Therefore, the clinical evidence for ramucirumab monotherapy was not necessarily based on the indicated population (patients not suitable for PAC).

Furthermore, in the submission, the company assumed that if a patient is not suitable for PAC, s/he is not suitable for any other cytotoxic chemotherapy either. If this is plausible, the comparison of RAM versus BSC is in line with the NICE final scope. Otherwise, comparisons with cytotoxic chemotherapy other than PAC (i.e. docetaxel, irinotecan and FOLFIRI) should have been included.

An indirect comparison with docetaxel, using the COUGAR-02 [13] trial (Docetaxel versus active symptom control for refractory oesophagogastric adenocarcinoma), showed that the hazard ratio of OS of monotherapy ramucirumab versus docetaxel was not significantly different from 1 (HR 1.16; 95% CI 0.77–1.73).
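The statement that this hazard ratio does not differ significantly from 1 can be checked approximately from the reported interval alone. The following is a minimal sketch, assuming the 95% CI is a Wald interval that is symmetric on the log scale; the function names are illustrative and are not taken from the CS or the ERG report.

```python
import math

def log_hr_se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR), assuming a symmetric Wald CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def two_sided_p(hr, se):
    """Two-sided p-value for H0: HR = 1 under a normal approximation."""
    z = abs(math.log(hr)) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return 2 * (1 - cdf)

# Values reported in the text for RAM versus docetaxel (OS).
hr, lower, upper = 1.16, 0.77, 1.73
se = log_hr_se_from_ci(lower, upper)
p_value = two_sided_p(hr, se)
print(f"SE(log HR) = {se:.3f}, p = {p_value:.2f}")
```

Under these assumptions the p-value is well above 0.05, which agrees with the interval spanning 1.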

3.2.2 Ramucirumab Combination Therapy

RAINBOW was deemed to be a good-quality randomized controlled trial. Uncertainty about long-term efficacy was considered to be small, because both OS and PFS data were mature. There were few (n = 15) UK patients in the trial, in the region 1 stratum (includes patients from Europe, Israel, USA and Australia).

The ERG concluded that the NMA results should be interpreted with caution, due to considerable heterogeneity resulting from the inclusion of predominantly Asian studies (e.g. the Hironaka et al. [15] study was based only on Japanese patients). There are substantial differences between Asian and Western countries in terms of gastric cancer incidence, histology, screening and treatment approaches [17,18,19].

Furthermore, the study by Thuss-Patience et al. [16], which was included in the NMA, was underpowered (N = 40) since it was closed prematurely because of poor recruitment. The ERG also stated that the NMA would have been more reliable if the results from Roy et al. [14], which also included an irinotecan arm, were included in the base case.

3.3 Summary of the Cost-Effectiveness Evidence

The company submitted two separate partitioned survival models to assess the cost effectiveness of RAM and RAM + PAC. The structures of both models were the same, comprising three states: pre-progression, post-progression and death. Patients entered the model in the pre-progression state. The cycle length was 1 week, and a half-cycle correction was applied. A lifetime horizon was used in both models. Both models adopted a National Health Service (NHS) perspective, and costs and benefits were discounted at a rate of 3.5%.
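The model structure described above can be illustrated with a short sketch of a three-state partitioned survival calculation using weekly cycles, a half-cycle correction and 3.5% annual discounting. This is not the company's model: the exponential curves, hazards, utilities and the 52-week year are hypothetical placeholders chosen only to make the example run.

```python
import math

WEEKS_PER_YEAR = 52      # simplifying assumption
DISCOUNT_RATE = 0.035    # annual rate stated in the text

def exp_survival(hazard_per_week, week):
    """Exponential survival curve; the hazards used below are hypothetical."""
    return math.exp(-hazard_per_week * week)

def state_occupancy(week, os_hazard, pfs_hazard):
    """Split the cohort into pre-progression, post-progression and death."""
    os_t = exp_survival(os_hazard, week)
    pfs_t = min(exp_survival(pfs_hazard, week), os_t)  # PFS cannot exceed OS
    return pfs_t, os_t - pfs_t, 1.0 - os_t

def discounted_qalys(os_hazard, pfs_hazard, u_pre, u_post, n_weeks):
    """Discounted QALYs; half-cycle correction averages cycle start and end."""
    total = 0.0
    for week in range(n_weeks):
        pre0, post0, _ = state_occupancy(week, os_hazard, pfs_hazard)
        pre1, post1, _ = state_occupancy(week + 1, os_hazard, pfs_hazard)
        pre, post = (pre0 + pre1) / 2, (post0 + post1) / 2
        years = (week + 0.5) / WEEKS_PER_YEAR
        discount = (1 + DISCOUNT_RATE) ** (-years)
        total += (pre * u_pre + post * u_post) * discount / WEEKS_PER_YEAR
    return total

# Hypothetical inputs for illustration only (not trial or model values).
qalys = discounted_qalys(os_hazard=0.03, pfs_hazard=0.08,
                         u_pre=0.7, u_post=0.6, n_weeks=520)
print(f"Discounted QALYs: {qalys:.3f}")
```

In a partitioned survival model the post-progression state is never modelled directly; its occupancy is simply the gap between the OS and PFS curves, which is why the choice of those two curves drives the results.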

The only comparator in the monotherapy economic model was BSC since the company claimed that people who are ineligible for PAC are not eligible for any cytotoxic chemotherapy.

The comparators in the combination therapy model were BSC and docetaxel. According to the company, a comparator was eligible for inclusion if it was used sufficiently (i.e. above an arbitrary threshold of 10% use). PAC was only included as a means of validating the model results by comparing model outcomes with the clinical evidence from RAINBOW.

Transition probabilities between the health states for either RAM or RAM + PAC were determined from parametric survival functions fitted to the OS and PFS data from the RAINBOW and REGARD trials. Transition probabilities for the comparators docetaxel and BSC were estimated using HRs from the NMA.
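One common way to derive a comparator curve from an NMA hazard ratio is to assume proportional hazards and raise the reference survival curve to a power. The sketch below assumes that approach; the text does not specify the exact implementation in the company's model, and the function name and the reference survival value are hypothetical. The PFS hazard ratios are those reported in Section 3.1.3.

```python
def comparator_survival(reference_survival, hr_reference_vs_comparator):
    """Comparator survival under proportional hazards.

    The NMA reports HRs for the reference (RAM + PAC) versus each comparator,
    so the comparator hazard is the reference hazard divided by that HR.
    """
    if not 0.0 <= reference_survival <= 1.0:
        raise ValueError("survival must lie in [0, 1]")
    if hr_reference_vs_comparator <= 0:
        raise ValueError("hazard ratio must be positive")
    return reference_survival ** (1.0 / hr_reference_vs_comparator)

# PFS hazard ratios for RAM + PAC versus each comparator (Section 3.1.3).
PFS_HR = {"BSC": 0.27, "docetaxel": 0.56}

# Hypothetical RAM + PAC PFS at some time point (not a trial value).
ram_pac_pfs = 0.50
comparator_pfs = {name: comparator_survival(ram_pac_pfs, hr)
                  for name, hr in PFS_HR.items()}
print(comparator_pfs)
```

Because the hazard ratio acts on the whole curve, any uncertainty or bias in the NMA estimate carries through to every cycle of the comparator arm.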

In the monotherapy model, the company used the gamma distribution to model OS and the interval-censored log-normal distribution to model PFS.

For the combination therapy model, the company used the OS Kaplan–Meier curve from RAINBOW until the end of the follow-up period and afterwards an exponential extrapolation was assumed. An interval censored Weibull distribution was chosen for PFS.
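The hybrid approach described for OS, an observed Kaplan–Meier curve followed by an exponential tail, can be sketched as a piecewise survival function. The Kaplan–Meier steps and the tail hazard below are hypothetical and are not RAINBOW data; how the company estimated the tail hazard is not stated in the text.

```python
import math
from bisect import bisect_right

def km_then_exponential(t, km_times, km_surv, tail_hazard):
    """Survival at time t: KM step function up to the last KM time,
    exponential extrapolation from the last KM estimate afterwards."""
    if t < 0:
        raise ValueError("time must be non-negative")
    t_end = km_times[-1]
    if t <= t_end:
        idx = bisect_right(km_times, t) - 1
        return 1.0 if idx < 0 else km_surv[idx]
    return km_surv[-1] * math.exp(-tail_hazard * (t - t_end))

# Hypothetical Kaplan-Meier steps (months, survival); not trial data.
KM_TIMES = [2.0, 6.0, 12.0, 24.0]
KM_SURV = [0.90, 0.65, 0.35, 0.10]
TAIL_HAZARD = 0.08  # per month, hypothetical

for month in (0, 6, 24, 30):
    s = km_then_exponential(month, KM_TIMES, KM_SURV, TAIL_HAZARD)
    print(month, round(s, 3))
```

The curve is continuous at the switch point by construction, so the extrapolated tail only affects the small proportion of patients still alive at the end of follow-up.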

Utility values for the pre-progression and the post-progression health states were derived from EQ-5D data from RAINBOW in both models. For the pre-progression state, the average utility of the patients at baseline was used, whereas for the post-progression state, the average utility of the patients who discontinued treatment due to progressive disease was used. According to the company, data from the monotherapy trial (REGARD) could not be used since only EORTC QLQ-C30 data were collected and post-baseline data were insufficient due to rapid disease progression in both arms.

Utility decrements were applied for AEs in both models. AEs were included based on grade (i.e. 3 and 4) and occurrence (i.e. >5% in any of the relevant trials). Utility decrements were taken from the literature in other cancer areas [20,21,22,23]. Duration of AEs was taken from the NICE STA for pixantrone in non-Hodgkin’s B cell lymphoma [24].

A utility increment was applied in the combination therapy model to the responders. For RAM + PAC, the ORR was taken from RAINBOW (27.9%). The company assumed the ORR for docetaxel was similar to PAC in RAINBOW (16.1%). Since ORR for BSC and RAM in REGARD was very low (2.6 and 3.4%), no response was assumed for BSC and RAM in the monotherapy model.

The costs included in the model were for drug acquisition, drug administration, monitoring/testing and follow-up care. All costs used in the model calculations were based on their 2014 values. The prices of generic chemotherapies were taken from the electronic market information tool (eMIT) [25]. Non-generic drug prices were taken from the British National Formulary (BNF) [26]. The drug dosages, treatment durations and relative dose intensity data for RAM, RAM + PAC and docetaxel were derived from REGARD, RAINBOW and COUGAR-02 [27]. In the base case, drug wastage from open vials was assumed, and pre-medication for RAM + PAC and docetaxel was based on the respective summaries of product characteristics. The cost components of BSC were identified from a review of hospital medical records [6]. Drug administration costs were based on NHS reference costs.

Costs for tests and monitoring were based on expert opinion. Costs of grade 3 or 4 AEs were based on NHS reference costs. Hospitalization costs were based on trial data. It was assumed that only 12% of patients receive a third-line therapy and relevant costs (drug acquisition, administration and follow-up care) were applied in the first cycle after progression. Inflation-adjusted terminal care costs from Coyle et al. [23] were applied in the base case.

The base-case deterministic ICER for RAM versus BSC was £188,640 per QALY gained. Deterministic sensitivity analysis showed that the ICER was most sensitive to hospital admission rates, length of hospital stay, assumptions on vial waste and extrapolation of post-progression survival. Assuming a log-normal distribution (the distribution with a better fit using the goodness-of-fit diagnostic tests) instead of gamma distribution for OS reduced the ICER to £174,485 per QALY gained.

The base-case deterministic ICER for RAM + PAC versus BSC was £118,209 per QALY gained. The deterministic sensitivity analysis showed that the ICER was most sensitive to the assumptions surrounding source of drug prices (eMIT vs BNF), length of hospital stay, relative dose intensity and body surface area/weight. Assuming a Weibull distribution for OS gave similar results to the base-case analysis (£117,236 per QALY gained), whereas the log-logistic distribution reduced the ICER to £96,103 per QALY gained. In a scenario analysis, the OS, PFS and time on treatment were adjusted based on region 1 specific data, which led to similar ICER results.

3.4 Critique of the Cost-Effectiveness Evidence

The ERG considered it inappropriate to exclude comparators on the basis of an arbitrary threshold (10%) and found that the other treatments listed in the scope for the combination therapy model (PAC, irinotecan and FOLFIRI) should have been included. Furthermore, even under the 10% threshold rule, PAC should have been included, since 10.5% of second-line GC/GOJ patients received PAC [6].

In general, the process for extrapolating survival curves was clear, but the choice of the survival regression model was not always consistent. It was also not clear which approach was followed for interval-censoring adjustments. In the model, mortality in the pre-progression state was neglected while calculating the number of ‘newly progressed patients’. However, its impact on incremental results was expected to be low because further-line treatment costs, derived from the number of ‘newly progressed patients’, constitute a minor part of the total costs.

There were also some issues concerning the generalizability of results from RAINBOW and REGARD to UK patients. For instance, the ERG believed that region 1 data better reflect the weight and body surface area of the UK population for calculating the drug costs for RAM + PAC, as the whole RAINBOW population included many patients from Asian countries, who had relatively lower weights and body surface areas.
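To see why body weight matters for drug costs when wastage from open vials is assumed, consider the following sketch of a weight-based dose rounded up to whole vials. Only the 8 mg/kg RAM dose comes from the text; the vial size, price and patient weights are hypothetical, and the function names are illustrative.

```python
import math

def vials_needed(dose_mg_per_kg, weight_kg, vial_mg, allow_vial_sharing=False):
    """Vials used per administration.

    With wastage (no vial sharing), part-used vials are discarded,
    so the count is rounded up to a whole number.
    """
    vials = dose_mg_per_kg * weight_kg / vial_mg
    return vials if allow_vial_sharing else math.ceil(vials)

def cost_per_administration(dose_mg_per_kg, weight_kg, vial_mg,
                            price_per_vial, allow_vial_sharing=False):
    """Drug acquisition cost for one administration."""
    n = vials_needed(dose_mg_per_kg, weight_kg, vial_mg, allow_vial_sharing)
    return n * price_per_vial

RAM_DOSE_MG_PER_KG = 8   # dose stated in the text
VIAL_MG = 100            # hypothetical vial size
PRICE_PER_VIAL = 500.0   # hypothetical price (GBP)

# Hypothetical mean weights for a lighter and a heavier trial population.
lighter = cost_per_administration(RAM_DOSE_MG_PER_KG, 60, VIAL_MG, PRICE_PER_VIAL)
heavier = cost_per_administration(RAM_DOSE_MG_PER_KG, 75, VIAL_MG, PRICE_PER_VIAL)
print(lighter, heavier)
```

A full analysis would apply this calculation across the weight distribution rather than at the mean, because rounding up to whole vials is not linear in weight.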

Also, double counting of hospitalization costs was identified because, in addition to the modelled costs based on observed hospitalization rates, Health Resource Groups’ codes referring to AEs in the model also included hospitalizations. In its response to the clarification letter, the company provided a scenario that reduced the rate of hospitalizations by an estimate of the proportion of hospitalizations due to AEs. The ERG used these adjusted hospitalization rates in its exploratory analyses for its base case. In addition, the ERG found a few minor programming errors in the original company model; correcting those had negligible impact on the ICER.

3.5 Additional Exploratory Analyses Conducted by the ERG

3.5.1 Additional Comparators

In exploratory analyses, the ERG included the comparators defined in the final scope for the combination therapy model. These analyses were presented using the company’s base-case assumptions (except confirmed programming errors). The results of these exploratory analyses are presented in Table 5. They should be interpreted with caution because they relied on the NMA, which was associated with significant uncertainty as a result of heterogeneity between the studies.

Table 5 Pairwise base-case results for additional comparators compared with ramucirumab plus paclitaxel using the company’s base-case assumptions

3.5.2 ERG Base-Case and Scenario Analyses

The ERG base case included the adjustments listed in Table 6, and the results are given in Table 7. In addition, the ERG explored three different scenarios in the combination therapy model, as listed in Table 8.

Table 6 List of adjustments on company submission base case

Table 7 The results of the ERG and CS base-case ICERs (Cost per QALY gained)

Table 8 Description and impact of the ERG scenarios

3.5.3 End-of-Life Considerations

At the time of this appraisal, NICE’s end-of-life (EOL) supplementary advice applied only when all of the criteria listed in Table 9 were satisfied.

Table 9 The end-of-life criteria for the National Institute for Health and Care Excellence (NICE)

For RAM + PAC, the company claimed that EOL criteria should be applied based on the mean additional survival in comparison with BSC (6.03 months) and docetaxel (4.13 months). However, the additional survival of RAM + PAC versus other comparators in the scope was 1.44 months for PAC, 2.27 months for irinotecan and 1.1 months for FOLFIRI. Therefore, the ERG argued the EOL criteria were not fulfilled.
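The ERG's argument can be made concrete by checking each reported survival gain against a minimum extension-to-life requirement. The sketch below assumes a 3-month minimum, which is how this criterion is commonly summarized; Table 9 remains the authoritative statement of the criteria, and the names used here are illustrative.

```python
# Mean additional survival (months) for RAM + PAC versus each comparator,
# as reported in the text.
ADDITIONAL_SURVIVAL_MONTHS = {
    "BSC": 6.03,
    "docetaxel": 4.13,
    "PAC": 1.44,
    "irinotecan": 2.27,
    "FOLFIRI": 1.1,
}

# Assumption: a 3-month minimum extension to life (see Table 9 for the
# actual criteria applied by NICE).
MIN_EXTENSION_MONTHS = 3.0

def meets_extension_criterion(gain_months, minimum=MIN_EXTENSION_MONTHS):
    """True if the survival gain reaches the assumed minimum extension."""
    return gain_months >= minimum

meets = {name: meets_extension_criterion(gain)
         for name, gain in ADDITIONAL_SURVIVAL_MONTHS.items()}
all_met = all(meets.values())
print(meets, all_met)
```

Under this assumption the criterion is met only against BSC and docetaxel, so the conclusion depends entirely on which comparators are considered relevant.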

3.6 Conclusions of the ERG Report

In REGARD, RAM was associated with a slightly higher OS and PFS compared with BSC, and RAINBOW showed more favourable OS and PFS results for RAM + PAC in comparison with PAC. The NMA suggested some gains in OS and PFS for RAM + PAC compared with BSC, and gains in PFS for RAM + PAC in comparison with docetaxel. The ERG considered that the results based on indirect comparisons should be interpreted with caution due to significant heterogeneity between studies. For instance, some studies were predominantly Asian, and the histology and treatment pattern of gastric cancer between Western patients and Asian patients might be very different [17,18,19].

By correcting issues in the model and changing a few input parameters, a new ERG base case was defined for both monotherapy and combination therapy models. The ERG conducted some exploratory analyses in which additional comparators from the NICE scope were included. Furthermore, additional scenario analyses were conducted in the combination therapy model. In all analyses, the ICER of RAM + PAC or RAM compared with any one of the comparators was never below £90,000 per QALY gained. Similarly, the probability of RAM + PAC or RAM becoming the most cost-effective therapy was negligible for thresholds below £100,000 per QALY gained in all analyses.

In order to improve the robustness of the health economic results for the UK, a direct comparison of RAM and RAM + PAC with all of the relevant comparators among a predominantly western patient population would be necessary. In addition, QoL data for RAM monotherapy and for comparators among the targeted population (i.e. patients ineligible for PAC combination therapy) would reduce the uncertainty around utilities.

Nevertheless, regardless of any problems with the evidence and the model, the ICERs from even the company base case far exceeded the usual threshold of £20,000 to £30,000 per QALY gained and even the EOL threshold of £50,000 per QALY gained.

3.6.1 Key Methodological Issues

For many of the key cost-related inputs (i.e. body weight, hospitalization rate, length of stay, etc.), data from all patients in the REGARD and RAINBOW trials were used, but average estimates from these trials may not be generalizable to the UK population as the body weight/surface as well as the gastric cancer histology and treatment patterns in the Asian population might differ from those in the Western population [17,18,19].

If direct comparisons based on phase III randomized controlled trials for all relevant comparators from the scope are not available, STAs often rely on indirect treatment comparisons. As was the case in this STA, these analyses sometimes involve assumptions that are not evidence based (e.g. assuming PFS HR = 1 for RAM vs docetaxel). This STA also revealed that decisions regarding the exclusion of trials (e.g. Roy et al. [14]) might influence outcomes substantially. Finally, results of indirect treatment comparisons should be interpreted with extreme caution if they are obtained by pooling data from heterogeneous trials.

4 National Institute for Health and Care Excellence Guidance

4.1 Preliminary Guidance

The Committee considered the company’s decision problem, and noted that it was in line with the NICE scope, with the exception of the choice of comparators.

While BSC would be the only comparator for RAM, the Committee concluded that for people for whom RAM + PAC is appropriate, PAC and docetaxel are both relevant comparators.

The Committee concluded that the REGARD and RAINBOW trials formed suitable evidence on which it could base its decision on the clinical efficacy of RAM and RAM + PAC.

For combination therapy, the Committee considered that there was no reason to use the NMA results, rather than using relevant head-to-head data from a good-quality, international, randomized controlled trial (RAINBOW) with mature OS and PFS data.

For cost effectiveness, the Committee agreed with the error corrections and adjustments carried out by the ERG to use region 1 data for body surface area and body weight, to correct for double counting of hospitalizations, and to adjust length of hospitalization stay for region 1. It concluded that the model submitted by the company was robust and suitable for the purposes of its decision making and that the ERG’s suggested amendments to the model were appropriate.

The Committee considered the most robust analysis was the ERG’s exploratory analysis, which used RAINBOW trial data for RAM + PAC compared with PAC, and used time-varying utility values from RAINBOW, collected during the pre-progression period. This analysis provided the most plausible ICER of RAM + PAC versus PAC for people with GC/GOJ for whom treatment in combination with cytotoxic chemotherapy is appropriate: £408,200 per QALY gained (with incremental costs of £35,100 and incremental QALYs of 0.09).

The Committee concluded that the most plausible ICER for people with GC/GOJ adenocarcinoma for whom further cytotoxic chemotherapy is not appropriate (i.e. RAM vs BSC) was £188,100 per QALY gained (representing incremental costs of £22,500 and incremental QALYs of 0.12).
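The two ICERs preferred by the Committee can be related to the reported incremental costs and QALYs with a short calculation. This is a minimal sketch using only the rounded figures quoted in the text; the function names are illustrative.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    if incremental_qalys <= 0:
        raise ValueError("this sketch expects positive incremental QALYs")
    return incremental_cost / incremental_qalys

def implied_qalys(incremental_cost, reported_icer):
    """Incremental QALYs implied by a reported ICER and incremental cost."""
    return incremental_cost / reported_icer

# Rounded figures reported in the text: (GBP, QALYs, GBP per QALY gained).
REPORTED = {
    "RAM + PAC vs PAC": (35_100, 0.09, 408_200),
    "RAM vs BSC": (22_500, 0.12, 188_100),
}

for label, (d_cost, d_qaly, reported) in REPORTED.items():
    recomputed = icer(d_cost, d_qaly)
    implied = implied_qalys(d_cost, reported)
    print(f"{label}: recomputed {recomputed:,.0f}, implied QALYs {implied:.3f}")
```

Dividing the rounded inputs gives about £390,000 and £187,500 per QALY gained, slightly different from the reported ICERs. The implied incremental QALYs round to 0.09 and 0.12, so the differences are consistent with the reported ICERs having been calculated from unrounded values. With QALY gains this small, rounding to two decimal places shifts the ratio noticeably.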

According to the Committee, the EOL criteria were not met, and the overall conclusion was that, for the treatment of adults with advanced GC/GOJ adenocarcinoma with disease progression after platinum and fluoropyrimidine chemotherapy, neither RAM + PAC nor RAM (where paclitaxel is not appropriate) was a cost-effective use of NHS resources at the usual range of ICERs (£20,000–£30,000 per QALY gained).

4.2 Response to the Preliminary Guidance

The response of the company to the ACD first focused on the unwarranted variation and inequity in the provision of second-line GC/GOJ treatments in clinical practice, which might be aggravated by the lack of a licensed second-line treatment. Secondly, the company expressed that the weaknesses of the NMA were overestimated (arguing that the Thuss-Patience et al. study [16] should be considered as systematically unbiased, and that the Hironaka et al. study [15] results are transferable to the UK), and that the NMA results should be considered sufficiently plausible to permit its use. Thirdly, concerning the comparators, the company did not agree with the Committee’s decision that the comparison of PAC with RAM + PAC based on RAINBOW data provided a good basis for assessing the EOL criteria and the most plausible ICER estimates, because the company considered BSC and docetaxel were more relevant and commonly used comparators. Finally, the company discussed a number of potential factual inaccuracies and inconsistencies, which were clarified before the final guidance.

4.3 Final Guidance

The Committee considered the comments raised by the company in its response to the ACD. However, these did not lead to any change in the ACD, and the Committee concluded that RAM or RAM + PAC could not be recommended within its marketing authorization.

5 Conclusions

This appraisal demonstrated that the selection of the relevant comparators is crucial and might have a substantial impact on the ICER. Even though the comparators were identified in the scope, the company and the decision maker had different views on the relevance of some of the comparators for inclusion in the cost-effectiveness analysis.

Similarly, the inclusion of available evidence for the NMA is another crucial decision that may have a considerable impact on ICER, and different parties may have differing views on this decision as well.

Another important issue is the generalizability of the evidence to the UK patient population. This STA showed that when differences are expected between the UK population and the population on which the evidence is built (e.g. weight/body surface area, histology and treatment patterns of the Asian patients), it might be important to adapt these analyses for the UK.

Finally, this appraisal shows that decision makers may prefer to base their decision on direct evidence rather than indirect evidence, especially if the latter has the potential to be heterogeneous and biased.

Despite these differing opinions between the company and the decision maker, in all of the analyses from the company and the ERG, the ICERs of monotherapy and combination therapy were always far above the currently accepted thresholds. This review described an STA process without a patient access scheme (PAS), with unacceptably high ICERs, even in the best-case scenarios of the company. In the case of no PAS, if the best-case scenario from the company is not within reach of an acceptable cost-effectiveness ratio, one may wonder to what extent the current STA process, in its full scale, is a cost-effective way of spending NHS resources.