Introduction

Molecular imaging with [18F]fluorodeoxyglucose positron emission tomography ([18F]FDG PET) has an established role in the staging of patients with non-small cell lung cancer (NSCLC). In addition, an increasing number of studies have shown that [18F]FDG PET is useful for early response assessment in NSCLC patients treated with cytotoxic chemotherapy [1–4].

Tumor hypoxia is a common phenomenon in lung cancer and is related to poor prognosis due to treatment resistance [5–11]. Preclinical studies have shown that nitric oxide (NO)-donating drugs increase blood flow and thereby decrease hypoxia [12]. Nitroglycerin (NTG), a vasodilator, is such a drug. By increasing tumor blood flow, NTG augments antitumor drug delivery and inhibits hypoxia-inducible factor 1α (HIF-1α) [13]. In preclinical models, administration of low doses of NTG reverses, at least partially, the hypoxia-induced resistance to anticancer drugs [14].

Yasuda et al. [15] showed that the combination of platinum-based chemotherapy and NTG improves overall survival (OS) in patients with stage IIIb/IV NSCLC. However, two recent randomized studies, including the Dutch NVALT12 study, could not confirm these results and no clinical effect was observed by the addition of NTG [16, 17].

A negative correlation between perfusion computed tomography (CT) and hypoxia PET on a population basis has also been described in the literature [18], suggesting that hypoxia is negatively correlated with tumor blood flow. Consequently, if treatment with NTG improved tumor perfusion, this could translate into a change in FDG uptake [13]. This concept was tested in the context of the randomized NVALT12 study, which sought to investigate whether the addition of NTG to first-line paclitaxel-carboplatin-bevacizumab (PCB) chemotherapy would improve progression-free survival (PFS).

In clinical practice, tumor response assessment is based on changes in tumor size, according to the Response Evaluation Criteria in Solid Tumors (RECIST) at week 6 [19, 20]. However, response monitoring is complex because the tumor has to change significantly in size and shape before a response is reliably detected by CT [21, 22]. This leads to an underestimation of the efficacy of cytostatic therapeutic agents that stabilize the disease, in contrast to conventional cytotoxic drugs, which shrink the tumor in the case of response [19]. Metabolic changes, measured by [18F]FDG PET, occur earlier than changes in size and may, therefore, be used for early treatment response assessment. A decrease in metabolic activity of the primary tumor after one cycle of chemotherapy is predictive of a better outcome [1, 18, 23, 24].

In this paper, we investigated the feasibility of [18F]FDG PET for assessing response to PCB treatment with and without NTG patches. Furthermore, we compared the [18F]FDG PET response with both the commonly used RECIST assessment and survival.

Materials and methods

Patient characteristics

In the multicenter NVALT12 trial (NCT01171170), 223 patients with metastatic non-squamous NSCLC were randomized between PCB with or without NTG (see [17] for patient inclusion criteria and treatment specifications); the primary endpoint of that trial was PFS. Response was assessed every two cycles by the local investigator according to RECIST 1.1 based on CT imaging [20]. In patients undergoing [18F]FDG PET/CT at baseline as part of the standard work-up (median interval between baseline scan and start of treatment 17 days; range from 73 days before treatment to 1 day after the start of treatment, the latter in a single scan), the protocol pre-specified a second [18F]FDG PET/CT between day 22 and 24 (after the second chemotherapy infusion and with NTG application for patients in the experimental group; Fig. 1). To include more patients (17) in the analysis presented here, scans acquired with a time interval of less than 35 days between the first chemotherapy and the second [18F]FDG PET/CT scan were accepted. This study was approved by the medical ethical committee and all patients provided written informed consent prior to any study procedure.

Fig. 1

NVALT12 trial timeline. At day one of the 21-day cycle, the paclitaxel-carboplatin-bevacizumab therapy is administered (grey square). The patients in the experimental arm wear the nitroglycerin (NTG) patch from day −3 to +2. The baseline [18F]FDG PET/CT is performed before the start of chemotherapy and the second [18F]FDG PET/CT is performed between day 22 and 24 (black arrow). The baseline diagnostic CT is performed before the start of chemotherapy and repeated after every two cycles of chemotherapy (grey arrow)

Scan protocol

Injected [18F]FDG activity depended on individual patient and scanner characteristics, following the Netherlands protocol for standardization of [18F]FDG whole-body PET studies in multi-center trials (NEDPAS) [25], the precursor of the EANM guidelines. Images were reconstructed according to institutional standards. Typically, a low-dose CT scan was acquired as part of the [18F]FDG PET/CT, according to institutional standards, and used for attenuation correction. Because of variations between the institutes, a spherical volume of interest (VOI) with a diameter of 3 cm was delineated in the right lobe of the liver for quality control purposes [26]. The mean standardized uptake value (SUV) in this VOI was used as a quality index, and scans with a mean liver SUV below 1.3 or above 3.0 were excluded from further analysis [27].
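The liver-based quality check described above can be illustrated with a minimal Python sketch. The function names and the flat-list input are hypothetical and are not taken from the study software. Two assumptions are made for illustration: values exactly equal to 1.3 or 3.0 are treated as acceptable, and both scans of a patient must pass for the pair to be retained.

```python
from statistics import mean

# Bounds quoted in the text: scans with a mean liver SUV below 1.3
# or above 3.0 were excluded from further analysis.
LIVER_SUV_MIN = 1.3
LIVER_SUV_MAX = 3.0


def liver_suv_mean(voi_suv_values):
    """Mean SUV over the voxels of the liver VOI (hypothetical input: flat list)."""
    return mean(voi_suv_values)


def passes_liver_qc(liver_mean_suv, low=LIVER_SUV_MIN, high=LIVER_SUV_MAX):
    """Return True when the mean liver SUV lies within the accepted range.

    Assumption: the bounds themselves are accepted, because the text only
    excludes values below 1.3 or above 3.0.
    """
    return low <= liver_mean_suv <= high


def patient_is_evaluable(baseline_liver_suv, followup_liver_suv):
    """Assumption: both scans must pass the check for the pair to be kept."""
    return passes_liver_qc(baseline_liver_suv) and passes_liver_qc(followup_liver_suv)
```

For example, `patient_is_evaluable(2.0, 3.2)` returns `False` because the follow-up scan falls outside the accepted range.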

Early prediction of survival

The primary tumor was manually delineated by experienced radiation oncologists using a treatment planning system (Eclipse Version 11.0, Varian Medical Systems, Inc.) and used as the region of interest (ROI). A standard delineation protocol was used, which included fixed window/level settings of CT (lung: 1700/-300; mediastinum: 600/40). Patients without a measurable primary tumor on the baseline [18F]FDG PET/CT scan were excluded from analysis.

The maximum standardized uptake value (SUVmax), mean SUV (SUVmean), peak SUV (SUVpeak; mean uptake in a sphere with a diameter of 1.2 cm [21]), total lesion glycolysis (TLG; defined as SUVmean multiplied by the tumor volume), maximal CT diameter, and CT volume (number of voxels within the delineated ROI multiplied by the voxel size) were calculated in our institute on the [18F]FDG PET/CT scan (Matlab R2013a, The MathWorks, Natick, MA, USA) using an adapted version of CERR (Computational Environment for Radiotherapy Research) extended with in-house developed Radiomics image analysis software to extract imaging features [28, 29]. Early metabolic response was defined using relative changes in [18F]FDG PET uptake parameters of the primary tumor, expressed as a percentage change from baseline. Patients were grouped according to a 30 % decrease in CT and PET parameters in the primary tumor ROI of the [18F]FDG PET/CT scan [26, 30, 31]. SUVpeak was used for the PET response assessment, and CT diameter (the CT being part of the [18F]FDG PET/CT) was used for the CT response assessment [26, 30]. The RECIST analysis performed during week 6 by the local investigators was used to separate patients into responders and non-responders. The 30 % CT and PET response assessments, performed after 3 weeks of therapy, were compared against the RECIST response assessment performed during week 6 by means of a sensitivity and specificity analysis.
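The derived quantities in this section (CT volume, TLG, percentage change from baseline and the 30 % response grouping) reduce to simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, it is not the CERR/Matlab code used in the study, and it assumes that a decrease of exactly 30 % counts as a response.

```python
def ct_volume_cm3(n_voxels, voxel_volume_cm3):
    """CT volume: number of voxels in the delineated ROI times the voxel size."""
    return n_voxels * voxel_volume_cm3


def total_lesion_glycolysis(suv_mean, tumor_volume_cm3):
    """TLG: SUVmean multiplied by the tumor volume."""
    return suv_mean * tumor_volume_cm3


def percent_change(baseline, followup):
    """Relative change from baseline in percent (negative values mean a decrease)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (followup - baseline) / baseline


def is_responder(baseline, followup, threshold_pct=30.0):
    """Group a patient as responder when the parameter fell by at least 30 %.

    Assumption: a decrease of exactly 30 % is counted as a response.
    The same rule is applied to SUVpeak (PET) and CT diameter (CT).
    """
    return percent_change(baseline, followup) <= -threshold_pct
```

With made-up numbers, a SUVpeak falling from 8.0 to 4.0 gives `percent_change(8.0, 4.0) == -50.0`, so `is_responder(8.0, 4.0)` is `True`, whereas a CT diameter shrinking from 10.0 to 8.0 (a 20 % decrease) is not classified as a response.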

Statistics

Since normality checks suggested a non-normal distribution for the changes in CT and PET parameters from baseline, non-parametric tests were used for the analysis of these variables. The changes in CT and PET parameters from baseline were compared between responders and non-responders using an independent samples Mann–Whitney U test. PFS was defined as the interval from randomization to progressive disease or death, whichever occurred first, and OS was defined as the interval from randomization to death from any cause. Differences in PFS and OS were investigated using Cox regression. For calculating the hazard ratio (HR), each response assessment criterion was entered as a binary variable. To compare CT diameter and SUVmax response with survival, a survival cut-off of 6 months was used in the waterfall plots; this is the median PFS of the combined group (NTG group combined with control group). Statistical tests were two-sided, and the level of significance was set at 0.05. All statistics were performed in SPSS v.21 (IBM Corp. Released 2012, IBM SPSS Statistics for Windows, Version 21.0, Armonk, NY, USA).
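As an illustration of the rank-based comparison, the following Python sketch computes the Mann–Whitney U statistics with average ranks for ties and a two-sided p value from the plain normal approximation. It is not the SPSS procedure used in the study: the function names are hypothetical, and the approximation deliberately omits tie and continuity corrections, so its p values may differ slightly from those of a statistics package.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U1, U2) for samples x and y, where U1 + U2 = len(x) * len(y)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    return u1, u2


def two_sided_p_normal(x, y):
    """Two-sided p value from the normal approximation.

    Assumption: no tie correction and no continuity correction are applied.
    """
    n1, n2 = len(x), len(y)
    u1, _ = mann_whitney_u(x, y)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2.0))
```

In practice, `x` and `y` would hold the percentage changes from baseline of the two groups being compared; for small samples an exact test is preferable to the normal approximation sketched here.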

Results

Patients

Of the 223 patients included in the randomized phase II study, 87 had two [18F]FDG PET/CT scans available with a measurable primary tumor; however, 27 patients were subsequently excluded from analysis due to poor image quality (see methods). Hence, 60 patients (characteristics in Table 1) had two evaluable consecutive [18F]FDG PET/CT scans (Fig. 2) with a median interval of 42 days. PFS and OS were similar for patients treated with PCB and PCB + NTG (Table 1).

Table 1 Patient characteristics
Fig. 2

CONSORT diagram. SUV: standardized uptake value

Image characteristics

Experimental vs. control arm

The mean decrease in SUVmax did not differ significantly between the 31 patients treated with PCB (46 ± 27 %) and the 29 patients treated with PCB + NTG (42 ± 29 %; p = 0.510). The other PET parameters (SUVmean, SUVpeak and TLG) showed on average a > 40 % decrease from baseline, but this also did not differ significantly between the experimental arm and the control arm (Fig. 3). For CT (acquired as part of the [18F]FDG PET/CT), however, the CT diameter decreased significantly more in the control arm than in the experimental arm (19 ± 14 % vs. 7 ± 23 %; p = 0.028).

Fig. 3

Mean values and standard deviations for the CT- and PET-derived image parameters for the experimental arm and the control arm. p values of the independent samples Mann–Whitney U test of the mean change from baseline of the control arm vs. the mean change from baseline of the experimental arm (*significantly different for the experimental arm compared to the control arm with a significance level of 5 %). SUV: standardized uptake value; TLG: total lesion glycolysis

Early prediction of survival

According to the 30 % PET criteria, 74 % of patients in the control arm and 72 % of the patients in the experimental arm showed response after 3 weeks (median time interval 42 days). According to the 30 % CT criteria, 26 % of the patients in the control arm and 10 % of the patients in the experimental arm had a response. According to the RECIST analysis performed after 2 cycles (median time interval 56 days) by the local investigator, 29 % of the patients in the control arm had a response and 17 % of the patients in the experimental arm had a response (Table 1).

The predictive value of the 30 % CT-based and 30 % PET-based response assessments performed after 3 weeks (on the primary tumor) was assessed for response according to RECIST after 2 cycles (Table 2). The 30 % PET-based response assessment had a higher sensitivity compared to the 30 % CT-based response assessment but a lower specificity (Table 2).
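The comparison of an early 30 % response call against the later RECIST call amounts to a standard sensitivity and specificity calculation with RECIST as the reference. The Python sketch below uses hypothetical function names and made-up boolean lists, not trial data.

```python
def sensitivity_specificity(early_response, recist_response):
    """Sensitivity and specificity of an early response call vs. RECIST.

    Both arguments are equal-length sequences of booleans (True = responder),
    with RECIST taken as the reference standard.
    """
    if len(early_response) != len(recist_response):
        raise ValueError("inputs must have the same length")
    tp = fn = fp = tn = 0
    for early, reference in zip(early_response, recist_response):
        if reference and early:
            tp += 1  # RECIST responder, also called responder early
        elif reference and not early:
            fn += 1  # RECIST responder missed by the early assessment
        elif not reference and early:
            fp += 1  # early responder without a RECIST response
        else:
            tn += 1  # non-responder on both assessments
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example for five patients (not trial data).
early_pet = [True, True, False, True, False]
recist = [True, False, False, True, True]
sens, spec = sensitivity_specificity(early_pet, recist)
```

An early assessment that labels many patients as responders, as the PET criterion did here, tends toward high sensitivity and low specificity, which matches the pattern reported in Table 2.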

Table 2 Comparison of 30 % CT-based and 30 % PET-based response assessment performed after 3 weeks with the RECIST response assessment of week 6

In neither arm were the 30 % CT-based and 30 % PET-based response assessments predictive of PFS or OS (Table 3).

Table 3 The hazard ratios (HR) for 30 % PET- and CT-based response assessment with 95 % confidence interval and corresponding p values for OS and PFS are shown per parameter

The changes in CT diameter and SUVmax between baseline and early response assessment are depicted in a waterfall plot, which shows that PET defined more patients as responders than CT (Fig. 4 and Table 2). However, this decline was not predictive of a PFS longer than 6 months.

Fig. 4

Change in CT diameter (upper) and SUVmax (lower) from baseline in individual patients. Patients of the experimental arm are plotted in red, patients of the control arm in blue. The pattern-filled bars represent patients with a progression-free survival longer than 6 months. The black line represents the response threshold of 30 %. SUV: standardized uptake value; PFS: progression-free survival

Discussion

The hypothesis of the NVALT12 trial was that the addition of NTG, by increasing tumor blood flow and oxygenation status, would improve outcome. While the clinical study of the NVALT12 already showed that NTG did not improve outcome, in the current study, we investigated whether we could predict outcome based on early response assessment using [18F]FDG PET imaging [17]. This image analysis study of the NVALT12 trial could not show a predictive value of [18F]FDG PET imaging for the evaluation of the addition of NTG to bevacizumab-containing chemotherapy when compared to control patients. In a previous study, the administration of NO-donating drugs decreased hypoxia-induced resistance to anticancer drugs in cancer cell lines [14]. In the NVALT12 trial, this could not be confirmed based on [18F]FDG PET analysis. This could be due to a lower NTG dose than that used in the Yasuda study [15], or to an interference with bevacizumab. From recent studies, it is known that FDG is only a moderate surrogate for hypoxia [32]. The study of Zegers et al. [33] showed that 42 ± 21 % of the primary tumor volume has a high FDG uptake (SUV > 50 % of SUVmax) of which 10 ± 12 % is hypoxic (high [18F]HX4 uptake, TBR > 1.4), and that 3 % of the primary tumor volume outside the high FDG uptake volume is hypoxic as depicted by [18F]HX4 PET. In our study, we therefore only measured the effects of NTG on tumor metabolism and survival, not on hypoxia directly. Surprisingly, in nearly all patients, irrespective of treatment arm, a major decrease in FDG uptake was observed in the [18F]FDG PET scan performed after 3 weeks. Importantly, this [18F]FDG PET scan was acquired within 3 days after administration of the second cycle of chemotherapy. A study by van der Veldt et al. [34] showed that bevacizumab reduces tumor perfusion and [11C]docetaxel uptake in NSCLC, which was accompanied by a rapid reduction in circulating levels of VEGF. This decrease in tumor blood flow after bevacizumab administration may explain the lower uptake of FDG in the tumor. Consequently, our results do not exclude the possibility that NTG decreases hypoxia.

A number of studies have demonstrated that changes in SUV parameters as early as the third week after the start of treatment are predictive for response to chemotherapy and PFS [1, 23, 24, 35]. A variety of approaches have been developed to measure the response, starting with the World Health Organization (WHO) criteria and continuing to RECIST and RECIST 1.1 [20, 26, 36]. These criteria refer to an anatomical decrease in tumor diameter. However, this response must be viewed with some caution when one is trying to predict outcomes in therapies that may be more cytostatic than cytotoxic. With such therapies, lack of progression may be associated with a good improvement in outcome, even in the absence of major shrinkage of tumors [37]. Newer metrics such as PET may be more informative [38]. PET/CT-based response evaluation has proven to be valuable in chemotherapy [39]. Currently, two sets of treatment response criteria for PET are available: EORTC and PET response criteria in solid tumors (PERCIST) [30]. PERCIST operates with a fixed ROI of 1 cm3 in the most [18F]FDG-avid part of the single most metabolically active tumor in the patient at each PET/CT scan. In the current study, a specific ROI, defining the primary tumor, was used for response evaluation. A consideration for anatomic and functional imaging is that many of the changes in response are at the border zones between response groups.

These border zones are quite artificial, as changes in tumor size are on a continuous scale (Fig. 3). The comparison of the 30 % CT-based and 30 % PET-based response assessments performed after 3 weeks (median time interval between scans 42 days) with the RECIST analysis performed in week 6 (median time interval between scans 56 days) showed that the RECIST analysis defined more patients as responders than the 30 % CT-based analysis performed after 3 weeks. This may be caused by the difference in timing, but also by the fact that only one lesion was measured for the 30 % CT-based analysis, whereas multiple lesions were measured in RECIST. The 30 % PET-based response assessment performed after 3 weeks showed more responders than the RECIST analysis, which is probably caused by decreased perfusion due to the bevacizumab treatment, leading to a decrease in FDG uptake in both treatment arms. The response assessment for PET was not influenced by NTG. A previous study by our group showed that after 3 weeks of treatment, five of nine patients were classified as responders by CT while six of nine were classified as responders by [18F]FDG PET [40]. In the same study, patients with a metabolic response (decrease in SUV > 20 %) at week 3 had a longer PFS than those without (9.7 months vs. 2.8 months), while patients with a response on CT at week 3 did not have a significantly longer PFS than those without. These two findings combined suggested that PET may be able to show treatment response earlier than CT. In that earlier study, [18F]FDG PET scans were performed before bevacizumab infusion, while in our study the [18F]FDG PET scan was performed shortly after bevacizumab infusion. This might have affected the uptake of FDG. A study by Hoekstra et al. [41] also showed that [18F]FDG PET has additional value over conventional radiologic techniques for monitoring response in locally advanced NSCLC patients.

The scans used for this study were made within the scope of the Dutch multicenter NVALT12 phase II trial and 60/223 patients underwent 2 [18F]FDG PET scans with the second scan after cycle 2, but before day 35. For quality control purposes, only scans with a mean SUV in the liver between 1.3 and 3 were used, reducing the number of assessed patients in this analysis to only 60 of the 223 original patients. For the analysis, these 60 patients were also divided between the control and the experimental arm, which means the study cohort was limited in size, hampering in-depth subgroup analyses.

Conclusion

The addition of NTG did not lead to an enhanced reduction in FDG uptake compared to the control arm. Although PET-based response assessment identified more responders than CT-based response assessment, this did not correlate with progression-free survival or overall survival. This might be due to the timing of the [18F]FDG PET shortly after the bevacizumab infusion.