INTRODUCTION

This article is part of a series of reports from the “Orlando Inhalation Conference—Approaches in International Regulation” which was held in March 2014. In vitro testing for orally inhaled products (OIPs) is considered from the perspectives of the following: (i) the fundamental science that underpins understanding of the drug delivery system and the subsequent fate of drugs inhaled into the lungs and (ii) current and emerging methods for in vitro testing and their applications. The focus of the latter is the extent to which in vitro methods correlate with in vivo outcomes and the impact on OIP developers and manufacturers of current techniques and their interpretation. The distinction between fundamental science and current practice is made simply to facilitate the organization of the article—clearly developments in each should inform and influence the other. The Orlando meeting builds upon, among other things, the published outputs from the “Thousand Years of Pharmaceutical Aerosols” meeting in Iceland 5 years ago which considered the question “what remains to be done in the field of pharmaceutical aerosol science” (1). In vitro testing of OIP was reviewed in outputs from the meeting in Iceland (2,3). In this article, we consider subsequent developments and the extent to which research priorities are being addressed.

In vitro-in vivo correlation (IVIVC) is critical if in vitro methods are to be used to predict in vivo performance, e.g., total lung deposition (4). To predict clinical outcomes, however, realistic test conditions with regard to the patient are necessary, e.g., the realistic oropharyngeal geometry for the testing apparatus, matching the inspiratory flow rate to a wide range of different patients (children to adults; different types and extent of lung disease) and consideration of how devices are used clinically, e.g., with a spacer. Despite their shortcomings, in vitro tests are used extensively and are a key element in the development and regulation of OIPs. Currently, comparison of OIPs for regulatory purposes is performed in vitro primarily by evaluating emitted doses and comparing particle size distribution profiles as assessed by impactors or impingers. In vitro data are used alone or in conjunction with PK/PD data in applications for marketing authorization (5) and for tracking a product through its lifecycle management and change history. The use of in vitro data to predict in vivo performance, when no correlation is established, gives rise to issues such as the validity of predefined limits for product performance based on observed variation in the reference product. At present, performance limits are serving as regulatory targets, e.g., to obtain a defined numerical and statistical overlap between test and reference data sets.

Recognizing bioequivalence as an issue of increasing importance for orally inhaled products (6–8), key conference themes were (i) new approaches to in vitro testing methods, (ii) how in vitro methods are applied to generic product development, and (iii) how in vitro data augment interpretation of pharmacokinetic, pharmacodynamic, clinical, and device data (Table I).

Table I Important Issues for In Vitro Testing of Orally Inhaled Products Considered at the IPAC-RS Inhalation Conference 2014. Presentation titles, reference to IPAC-RS archive [in brackets], and Key Questions Considered

UNDERSTANDING THE FATE OF DRUGS IN THE LUNGS

The link between in vitro properties of locally acting OIPs and their clinical performance is complex. Aerosol deposition, particle dissolution, permeation into lung tissue, tissue binding, and transfer into the systemic circulation are all phenomena that influence the concentration-time profile of drug at the relevant local effect compartment (Fig. 1). The latter, in most cases, cannot be measured directly in humans. Hence, demonstrating equivalent clinical performance between different OIPs during development of innovator products or for approval of generics is not straightforward, as shown by the current lack of harmonization between regulatory guidelines and practices in the area, i.e., differences between EMA and FDA guidelines (5). Discussion at the conference covered the use of in vitro techniques to (i) estimate lung deposited dose and its lung distribution, and (ii) predict post-deposition events such as dissolution and permeation into lung tissue.

Fig. 1

Schematic of particle deposition, mucociliary clearance (MCC), dissolution, absorption, target engagement, and diffusion into the system. Reproduced with permission from (21)

Aerosol Deposition

The airway deposition pattern of an OIP is related to patient-dependent variables, such as the physiology of the respiratory tract and the inhalation maneuver, and product-dependent variables such as emitted dose (ED), aerodynamic particle size distribution (APSD), and the device resistance. In vitro techniques for measuring these OIP qualities include standard filter, Dose Unit Sampling Apparatus, or impinger methods to collect the ED and various impactor methods to measure APSD. Delvadia (9) described how these methods may incorporate physiologically realistic models of the mouth and throat region and be used in combination with a breath simulator to estimate the oropharyngeal and lung deposited doses (4,22–24). The implications of these developments for IVIVC are considered further in “In Vitro-In Vivo Correlation”.

Lung deposited doses estimated using current oropharyngeal models correlate well with total deposited lung dose measured by gamma scintigraphy (24) or pharmacokinetic methods (systemic exposure) for compounds with complete lung bioavailability (4). However, for compounds with low water solubility and less than 100% lung bioavailability, the deposited lung dose is under-predicted by systemic exposure due to mucociliary clearance of the dose fraction deposited in the tracheo-bronchial region (25). The impact of mucociliary clearance in such circumstances is greater for aerosols which favor deposition in the bronchial region (26,27). The influence of the aerosol deposition pattern on the efficacy of an OIP is well established, as demonstrated by Usmani and co-workers (28) for an inhaled bronchodilator drug given as discrete monodisperse aerodynamic particle sizes in the respirable size range. These observations indicate clearly that the pattern of deposition, as well as measures of total lung dose, is critical for the prediction of therapeutic effect.

In silico methods may be used to estimate aerosol deposition in airways. One-dimensional (1D) algebraic approaches treat the airways as a series of filters in which gravitational settling, diffusion, and impaction are competing deposition processes. Currently, 1D models such as ICRP-96 (29) may provide the most accessible means of approximating total, as well as central:peripheral, deposition to help interpretation of drug exposure data. More recently, three-dimensional (3D) computational fluid dynamics (CFD) models have been developed to provide improved predictions of aerosol deposition on airway surfaces (30,31) and combined with in vitro measurements as described by Longest at this conference (10) (see “In Vitro-In Vivo Correlation”). Although CFD models may be useful in device design and provide a better understanding of device and mouth-throat deposition, validation of CFD predictions by in vivo performance is hampered by poor resolution of techniques such as gamma scintigraphy. In future, the combination of CFD models and CT imaging has the potential to provide better understanding of how variations in airway geometry, including the influence of lung disease, impact on aerosol deposition pattern.
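The "series of filters" idea behind 1D approaches can be illustrated with a minimal sketch. It is not the ICRP model: the three regions, their per-mechanism removal fractions, and the inhalation-only treatment are all hypothetical assumptions chosen for illustration. Within each region, the competing mechanisms are combined by assuming that a particle passes the region only if it escapes every mechanism.

```python
from dataclasses import dataclass


@dataclass
class AirwayRegion:
    """One 'filter' in the series. All removal fractions are hypothetical."""
    name: str
    impaction: float      # fraction removed by inertial impaction
    sedimentation: float  # fraction removed by gravitational settling
    diffusion: float      # fraction removed by diffusion

    def efficiency(self) -> float:
        # A particle escapes the region only if it escapes every mechanism,
        # so the mechanisms combine as independent filters.
        escape = (1 - self.impaction) * (1 - self.sedimentation) * (1 - self.diffusion)
        return 1 - escape


def deposit_in_series(regions, inhaled_fraction=1.0):
    """Pass the aerosol through each region in turn (inhalation only)."""
    remaining = inhaled_fraction
    deposited = {}
    for region in regions:
        deposited[region.name] = remaining * region.efficiency()
        remaining -= deposited[region.name]
    deposited["exhaled"] = remaining  # whatever is left after the last filter
    return deposited


# Illustrative values only, not measured or published efficiencies.
REGIONS = [
    AirwayRegion("extrathoracic", impaction=0.30, sedimentation=0.00, diffusion=0.00),
    AirwayRegion("bronchial", impaction=0.10, sedimentation=0.05, diffusion=0.00),
    AirwayRegion("alveolar", impaction=0.00, sedimentation=0.40, diffusion=0.10),
]

result = deposit_in_series(REGIONS)
central_to_peripheral = result["bronchial"] / result["alveolar"]
```

In a real 1D model the removal fractions would be functions of particle size, flow rate, and airway dimensions rather than fixed numbers, and deposition during breath-hold and exhalation would also be considered.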

Dissolution, Permeation, Particle Clearance, and Tissue Exposure

Inhalation delivers locally acting inhaled drugs to the airways, thus generating a high local concentration leading to improved dose potency and optimized therapeutic ratio. Common means of retaining inhaled drugs in the lungs include tissue entrapment (e.g., water soluble di-bases) and slow dissolution of particles of poorly soluble drugs such as corticosteroids (Fig. 2) (21). For compounds with high water solubility such as the di-bases, sustained local tissue concentration, and hence therapeutic effect, is influenced mainly by the extent of tissue binding or tissue entrapment which is governed by molecular properties rather than material or formulation properties. For slowly dissolving compounds, therapeutic effect is sensitive to material properties governing solubility and/or dissolution rate. This has been demonstrated experimentally: a difference in systemic exposure between two formulations with similar aerodynamic particle size distributions was consistent with predicted difference in the dissolution rate (21). A relationship between dissolution rate and appearance of drug in plasma has also been reported (35).

Fig. 2

Mean absorption time (MAT) as a function of water solubility (PBS pH 7.4) for fluticasone furoate (FF), fluticasone propionate (FP); an inhaled selective glucocorticoid receptor modulator (SGRM) and budesonide (BUD). 1 Thorsson et al. 2001 (32); 2 Allen et al. 2013 (33); 3 Prothon et al., unpublished data (34)

Despite the potential impact of dissolution rate on clinical performance, there is currently no regulatory recommendation for in vitro dissolution testing for OIPs. A working group of IPAC-RS concluded in 2012 that any attempts to standardize a dissolution method for compendial inclusion would be premature since there is insufficient knowledge to translate dissolution data into statements about quality, safety, and efficacy (36). Forbes (11) presented on the difficulty in relating in vitro dissolution data to therapeutic effect. Dissolution in vivo is influenced not only by drug substance properties such as solubility and specific surface area—which can be controlled and measured in an in vitro setting—but also by physiological factors including the composition of airway lining fluid, permeability of the airway epithelium, and rate of particle clearance, all of which vary between different regions of the lung. For example, epithelial permeability for many compounds (depending upon molecular properties) is significantly higher in respiratory regions compared to the conducting airways and permeation could conceivably replace dissolution as the rate limiting step as the deposition site moves from peripheral to central lung.

Causality between rate of dissolution and the clinical performance of an OIP is complex, but differences in dissolution rate between different OIPs with otherwise similar aerosol performance (with potential effects on local and systemic bioavailability, and drug safety) have been demonstrated in several recent publications (11,25,35–38). A variety of dissolution test methods have been developed (35–41), and it is generally accepted that these methods may have value, even if they are not strictly in vivo predictive, by discriminating between formulations with similar aerodynamic properties but different release properties.

The absorption of drugs from the lungs to the systemic circulation is generally rapid, with high bioavailability for compounds with a wide range of physicochemical properties (42,43). Despite interest in lung transporters (44,45), passive diffusion appears to be the dominant mechanism. Permeability to drugs in different regions of the lungs is thought to vary according to how the properties of molecules dictate their interaction with the extracellular environment, i.e., lining fluid composition, cellular type and thickness, the extent and dimensions of intercellular junctions, blood flow, and competing clearance mechanisms. For example, absorption of a small, hydrophilic molecule in human subjects was enhanced to a greater extent by co-application of an absorption enhancer when delivered more peripherally (46). Although they may deliver predominantly to central or peripheral regions, current inhaled medicines do not exclusively target a particular region of the lungs, and clinical absorption profiles are a composite of the drug deposited and absorbed in different regions.

In vitro methods are available for measurement of lung permeation and tissue binding, although these are mainly used as research tools. Cell culture models of the epithelium, the principal respiratory absorption barrier, are typically used to characterize drug permeability (47). Although alternative cell lines are being developed, most studies utilize the cell types for which in vitro-in vivo relationships have been reported, namely, 16HBE14o- (48), Calu-3 cell lines (42), and primary alveolar cell cultures (49). The isolated perfused lung is an ex vivo technique which has been used to evaluate permeability in the intact lung and has been correlated with permeability in cell cultures and in vivo (50–52). Increasingly, there is interest in methods for realistically depositing aerosol particles on the air-interfaced surface of in vitro cell models (53,54). These provide integrated dissolution-absorption models in which aerosol particles dissolve in the fluid lining air-interfaced cell cultures, and absorptive transport across the cell layer is measured (sometimes confusingly referred to as “uptake”). Models for assessing tissue binding are also in development (55). At present, permeability and tissue binding techniques are used as development tools rather than validated assays providing data for regulatory submissions.

Bäckman (20) described how recent advances in the design and use of mechanistic models (25,56) may help to identify scientifically-based principles upon which to base regulatory goal posts by providing a better understanding of how deposition, dissolution, permeation, tissue binding, and clearance influence therapeutic effect. A significant portion of a poorly water soluble inhaled drug is cleared from the central lung by the mucociliary escalator, resulting in a reduced pulmonary bioavailability (27). In the peripheral regions of the lung, clearance by alveolar macrophages dominates, although insoluble particles in the lower airways can give rise to adaptive reversible alveolar macrophage responses, or at higher doses, irreversible alveolar macrophage related adverse events (57). The interplay between dissolution and permeability in the lungs is not well understood, particularly in central regions where low permeation of compounds with low water solubility could result in non-sink conditions for dissolution, making permeation through epithelium the rate-limiting step. The complexity presented by competing particle clearance mechanisms occurring concomitantly is compounded by the influence of different lung diseases (type and severity) which are known to alter to a different extent the lung lining fluid composition, the mucociliary system, macrophage function, and the permeability of the epithelium (3). Physiologically-based pharmacokinetic modeling incorporates deposition patterns and rate-limiting steps that occur after aerosol particle deposition in the lungs. Such techniques provide insight into in vitro data and have the potential to be used to set science-based regulatory specifications.
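The competition between dissolution and mucociliary clearance described above can be made concrete with a deliberately simplified sketch. It is not the mechanistic model presented at the conference. It assumes two parallel first-order processes acting on centrally deposited particles (dissolution with rate constant `k_diss`, mucociliary clearance with rate constant `k_mcc`), sink conditions, no permeation limit, and complete absorption of the peripherally deposited dose, so macrophage clearance is ignored. All names and numerical values are hypothetical.

```python
def fraction_absorbed(k_diss: float, k_mcc: float) -> float:
    """Fraction of a centrally deposited dose that dissolves before it is
    cleared, for two competing first-order processes (rate constants in 1/h)."""
    if k_diss < 0 or k_mcc < 0 or (k_diss + k_mcc) == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_diss / (k_diss + k_mcc)


def pulmonary_bioavailability(central_fraction: float, k_diss: float, k_mcc: float) -> float:
    """Absorbed fraction of the lung deposited dose.

    Simplifying assumption: the peripheral fraction is absorbed completely,
    and only the central fraction is exposed to mucociliary clearance.
    """
    if not 0.0 <= central_fraction <= 1.0:
        raise ValueError("central_fraction must lie between 0 and 1")
    peripheral_fraction = 1.0 - central_fraction
    return peripheral_fraction + central_fraction * fraction_absorbed(k_diss, k_mcc)


# Hypothetical rate constants (1/h) for a fast and a slow dissolving drug.
fast = pulmonary_bioavailability(central_fraction=0.6, k_diss=10.0, k_mcc=0.5)
slow = pulmonary_bioavailability(central_fraction=0.6, k_diss=0.1, k_mcc=0.5)
```

Under these assumptions the slowly dissolving drug loses most of its centrally deposited dose to the mucociliary escalator, and the loss grows as the central fraction increases. This is the qualitative behavior the text describes for poorly soluble compounds.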

In conclusion, dissolution is likely to be a key rate-limiting step for systemic absorption for OIPs containing drugs or formulations with low aqueous solubility, and there is the potential for appropriately designed dissolution tests to provide additional supportive information for establishing bioequivalence for this class of compound. However, the impact of dissolution on local tissue concentration-time profiles, and hence on efficacy and duration of effect, is less well evidenced and may vary between different regions of the lungs. More data on the impact of dissolution on local exposure, and specifically on the relationship between dissolution, permeation, and particle clearance, are thus required before any recommendations can be made with respect to the use of dissolution test methods to predict therapeutic performance.

PREDICTING IN VIVO PERFORMANCE FOR ORALLY INHALED PRODUCTS

In vitro measurements characterizing performance of OIPs are the first steps towards obtaining regulatory approval to market a therapeutic aerosol. The ED of drug and APSD of an aerosol provide metrics of product performance during the various stages of development of an OIP. Once the inhaler design meets defined specifications and production begins, these same in vitro tests are implemented for quality control of the product and are performed on product sampled from manufactured batches (58,59).

In Vitro-In Vivo Correlation

In vitro techniques designed for quality control purposes have been investigated as models for prediction of in vivo performance of an OIP (4). A good in vitro-in vivo correlation (IVIVC) would permit an in vitro model to be used as a development tool (24) and potentially replace in vivo tests for product registration. The incorporation of physiologically realistic models of the mouth and throat region and use of inhalation profiles has the potential to improve IVIVC. However, a number of issues require further discussion in this area if a consensus is to be reached with respect to realistic in vitro testing, including the selection of both the oropharyngeal model and the appropriate test flow rate to simulate patient use. The goal is to use in vitro data to simulate the delivery of the aerosol to the patient. However, even in a healthy subject population, a distribution of airway sizes and volumes exists, which then gives rise to a distribution of “in use” flow rates for each inhaler.

Delvadia advocated that airway models should be simplified or idealized, as long as simplification does not compromise the predictability of the model (9). However, if the range of oropharyngeal geometries in the population is to be considered, a single-sized model would not be expected to be predictive of the in vivo population. Perhaps what is necessary to capture this range of variability is a “minimal” number of models to represent inter-subject variation. The selection of the model geometry should be based on the ability to predict in vivo deposition. Similarly, “in use” flow rates will vary across the population. Only when in vitro testing is performed using a number of different flow profiles would the in vitro differences mimic the kind of variation observed in vivo. Key characteristics of the inhalation flow profiles to vary include the peak flow rate, inhalation volume, time to peak flow, and inhalation time. Ranges to study for these factors may be obtained from different sources depending upon the particular product being developed, including literature sources, predictions based upon device airflow resistance, or initial in vivo measurements with device prototypes. The number of in vitro experiments that need to be performed to obtain a good IVIVC is currently a highly debated issue as the complexity of the in vitro methods increases. A clearer justification for expanding the in vitro testing regimes, and evidence of its value (i.e., a good IVIVC), is needed if industry is to continue to pursue this development strategy.
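The profile characteristics listed above (peak flow rate, inhalation volume, time to peak flow, and inhalation time) are not independent, which matters when a test matrix is designed. The sketch below assumes a hypothetical triangular profile shape (linear rise to peak, linear decay to zero) purely for illustration. Real breath-simulator profiles are measured or fitted curves, and the function names and numerical ranges here are not taken from the presentations.

```python
from itertools import product


def triangular_profile(peak_flow_l_min, time_to_peak_s, inhalation_time_s, n_points=101):
    """Return (times in s, flows in L/min) for an idealized triangular inhalation."""
    if not 0 < time_to_peak_s < inhalation_time_s:
        raise ValueError("time to peak must lie inside the inhalation time")
    times = [inhalation_time_s * i / (n_points - 1) for i in range(n_points)]
    flows = []
    for t in times:
        if t <= time_to_peak_s:
            # Linear acceleration from rest to the peak flow
            flows.append(peak_flow_l_min * t / time_to_peak_s)
        else:
            # Linear decay from the peak back to zero flow
            flows.append(peak_flow_l_min * (inhalation_time_s - t)
                         / (inhalation_time_s - time_to_peak_s))
    return times, flows


def inhaled_volume_l(times, flows):
    """Trapezoidal integration of flow (L/min) over time (s), giving litres."""
    volume = 0.0
    for i in range(1, len(times)):
        volume += 0.5 * (flows[i] + flows[i - 1]) / 60.0 * (times[i] - times[i - 1])
    return volume


# Hypothetical test matrix: every combination of peak flow and time to peak.
peak_flows = [30.0, 60.0, 90.0]   # L/min
times_to_peak = [0.2, 0.4]        # s
inhalation_time = 2.0             # s

test_matrix = []
for pif, ttp in product(peak_flows, times_to_peak):
    t, q = triangular_profile(pif, ttp, inhalation_time)
    test_matrix.append({"peak_flow": pif, "time_to_peak": ttp,
                        "volume_l": inhaled_volume_l(t, q)})
```

For this idealized shape the inhaled volume depends only on peak flow and inhalation time, not on time to peak. This illustrates why the four characteristics cannot all be varied freely and why the number of profiles needed to span a population grows quickly.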

Longest (10) presented the use of a combined in silico and in vitro approach for predicting the fate of pharmaceutical aerosols in the lungs. The in vitro component of this concurrent approach was used to characterize the initial aerosol exiting the inhaler and provide deposition data in the upper airway models (60). Computational fluid dynamics (CFD) simulations began with the in vitro measured polydisperse aerosol size distribution and were used to capture the effects of the inhaler on the flow field and predict deposition throughout the lungs (61). The in vitro deposition data were used to validate model performance, and the CFD model could be applied to predict regional lung deposition and the effects of variability. In this manner, regional lung deposition can be predicted with the inclusion of the transport complexity associated with pharmaceutical aerosols, which goes beyond what can be achieved with whole-lung or algebraic correlation-based models (31). As an example of this combined in silico-in vitro approach, a case study was presented illustrating similar total lung deposition for a pMDI and DPI, but different regional deposition and different responses to errors in the inhalation flow profile (62). Current efforts in this approach are focused on comparisons with in vivo data for aerosols from multiple inhalers and resolutions using 2D and 3D gamma scintigraphy.

In vitro data are usually compared to in vivo performance, measured as deposited lung dose by scintigraphic and/or pharmacokinetic methods (63,64). The most common imaging technique is 2D planar imaging, which does not provide depth information as only one 2D coronal view is available to estimate the total deposition of the inhaled radioactivity in the lungs (65,66). Separation into inner (large, central airways) and outer (small, peripheral airways) regions to estimate regional deposition introduces inaccuracies in 2D due to overlapping airway geometries (67), and thus, only total deposited lung dose can be measured with accuracy. Estimates of regional deposition can be made from the APSD for a particular OIP, e.g., by applying the dose fraction <2 μm to the deposited dose to estimate the portion of aerosol deposited in peripheral airways. However, these deposition calculations only apply to healthy, patent airways.
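The APSD-based estimate mentioned above amounts to simple arithmetic, sketched here under stated assumptions. The size bins, masses, and deposited lung dose are hypothetical, and the bin edges are assumed to coincide with the 2 μm cut-off so that no interpolation between impactor stages is needed.

```python
def fraction_below(size_bins, cutoff_um):
    """Mass fraction of the aerosol in bins lying entirely below the cut-off.

    size_bins: list of (lower_um, upper_um, mass) tuples, e.g. derived from
    impactor stage cut-off diameters. Bins that straddle the cut-off are not
    counted, so bin edges should align with it.
    """
    total = sum(mass for _, _, mass in size_bins)
    if total <= 0:
        raise ValueError("total mass must be positive")
    below = sum(mass for _, upper, mass in size_bins if upper <= cutoff_um)
    return below / total


# Hypothetical APSD: (lower bound um, upper bound um, mass in ug)
apsd = [
    (0.0, 1.0, 5.0),
    (1.0, 2.0, 15.0),
    (2.0, 3.0, 30.0),
    (3.0, 5.0, 35.0),
    (5.0, 10.0, 15.0),
]

deposited_lung_dose_ug = 40.0  # hypothetical, e.g. from scintigraphy
fine_fraction = fraction_below(apsd, cutoff_um=2.0)
peripheral_dose_ug = fine_fraction * deposited_lung_dose_ug
central_dose_ug = deposited_lung_dose_ug - peripheral_dose_ug
```

As the text notes, a calculation of this kind is only meaningful for healthy, patent airways. It says nothing about the non-uniform deposition seen in obstructed lungs.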

Other imaging modalities such as 3D single photon emission computed tomography (SPECT) and positron emission tomography (PET) are more complex, both technically and in the analysis of the imaging data, but can be used to assess regional deposition (68,69). A computed tomography (CT) scanner is now integrated into many SPECT and PET scanners, providing airway structure and morphology to perhaps the 7th generation airway, depending on the CT resolution, and in addition, tissue attenuation correction factors that are applied to the lung imaging data (70). Limitations with scintigraphic imaging include the choice of labeling compound, preparation and validation of the radiolabeled product, variability in the inhalation technique used by the subjects, the positioning of the subject in the scanner, subject movement during image acquisition, sensitivity and resolution of the scanner, background build-up of tracer during acquisition, and techniques to segment the lung and define regions. Alteration of the original OIP by introducing a radioactive tracer to the formulation is a concern that has limited the acceptance of imaging studies. An assessment of the radiolabeled formulation APSD versus the original non-radiolabeled product is first performed to establish if the radiolabeling could inherently affect deposition (6). Most 2D studies use a tracer that is not firmly bound to the test drug and hence, once deposited on the airway surface, the drug separates from the tracer and subsequent images reflect the kinetics of the radioactive tracer only (71). 3D SPECT studies are performed with non-absorbable tracers because acquisition times can be lengthy, but the issue of tracer binding to the drug remains (72). PET has the advantage that the tracer, if one can be synthesized, is a component of the inhaled drug itself and measurements reflect the fate of the deposited drug (73,74).

Realistic In Vitro Studies—Product Use and Patients

In vitro techniques have been adapted to add features that account for conditions that mimic “real-life” use (Table II). Some of these models are now fairly sophisticated, with design of impactor inlets based on actual human oropharyngeal geometries as measured by MRI, simulated inspiratory flow patterns and test face models built with realistic features, e.g., to accommodate facemasks designed for pediatric patients. It is well known that inhalation technique can change the delivered aerosol particle size distribution, as demonstrated in several conference presentations (9,18,75). Using inspiratory flow patterns that mimic patient effort during inhalation from the OIP being tested can improve the IVIVC (18). However, “man” is not “model” and for actual OIP delivery to target populations with asthma, chronic obstructive pulmonary disease, and cystic fibrosis, the effect of lung disease needs to be incorporated into models as these conditions affect ventilation patterns and thus, the dose and pattern of deposition of aerosol in the lungs and, potentially, the clinical outcome.

Table II A Comparison of the Variables Associated with In Vitro Testing Methods for OIPs Versus the Reality of Factors that Influence Delivered Dose and Particle Size Distribution In Vivo

Predictions of lung deposition based solely on in vitro data are often incorrect in patients due to factors related to the severity of their illness and impairment of lung function. Airway disease is diverse and heterogeneous in severity and pathology and classifications of lung generations, i.e., model airway geometry, are altered in disease states. In the healthy, non-smoking subject, deposition of aerosols with a volume mean diameter <5 μm and high fine particle fraction (FPF) is uniform throughout the lung, whereas a patient with airway disease due to bronchoconstriction or mucus accumulation or inflammation (edema) will present with a non-uniform image, with “hot” spots of radioactivity throughout the lung, denoting impaction of aerosol at areas of airway narrowing (76–78). These regional deposition differences are difficult to model in the laboratory, and thus while total deposited dose measurements from imaging the patient may correlate with the in vitro measurements, the clinical response is also a function of regional deposition, dissolution, and uptake of drug. It may be difficult, but not impossible by using CFD, to build some of these real-life conditions into lung models (79–81).

Factors such as realistic inlet geometries to the cascade impactor, use of realistic inhalation profiles, and metrics for different dose strengths are important considerations (82). Conference presentations described the use of in vitro methods that mimic product use by patients, but the disease aspects that influence response to OIP therapy were not addressed. Using conventional in vitro test methods, Reisner presented a case study examining differences in in vitro fine particle dose (FPD) and ED between a formoterol fumarate DPI and pMDI (75). The ED was more predictive than FPD with respect to demonstrating PK bioequivalence and PD non-inferiority. This suggested that the FPD, while not equivalent for the two inhalers, was sufficient to provide comparable clinical response, despite the very different inspiratory flow rate required to deliver the OIPs. Wachtel presented a realistic oropharyngeal and upper airway model designed for various pediatric age ranges. These upper airway models were coupled with a mixing inlet and lung simulator in an attempt to mimic the inhalation airflow profile of children (18). The apparatus was attached to the test inhalation device and spacer, and in vitro particle size distribution data were collected and compared to published in vivo data to determine how well the in vitro apparatus performed in terms of lung deposition and deposition on components such as the actuator and valved holding chamber or face mask (83). Although data were not statistically analyzed, there appeared, visually, to be similarity between in vitro and in vivo results, showing the potential for the in vitro approach to avoid exposing children in a clinical trial setting. Device-formulation interactions were studied for formulations of albuterol sulfate blended with two different particle sizes of lactose, 34 or 19 μm, aerosolized using the Easyhaler or Novolizer device and evaluated using three different oropharyngeal models. Easyhaler delivered a more consistent dose to the lung regardless of oropharyngeal model configuration or carrier particle size. These case studies illustrated how the use of in vitro test platforms based on realistic patient throat and mouth configurations and inspiratory flow profiles potentially provides greater understanding of device-formulation interaction in delivery of drug to the lungs during product development and may decrease product development timelines and costs.

Currently, in vitro data are used primarily for the following: (i) compendial testing and regulatory compliance or (ii) simulation of device use by patients. The introduction of a valved holding chamber (VHC) or spacer into such tests adds complexity as these components can impact critical product characteristics such as delivered dose (84,85). VHCs are not interchangeable, and the European Medicines Agency and Health Canada recommend that a specific spacer or VHC be used with a particular product (58,59). Dolovich (19) showed the complexity of assessing equivalence when using VHCs or spacers as there are multiple comparisons that need to be made: (1) equivalence of the test product to the reference with no spacer or VHC, (2) equivalence of the test product with and without the spacer or VHC, (3) equivalence of the reference product with and without the spacer or VHC, and potentially (4) equivalence of the reference product with spacer or VHC to the test product with spacer or VHC if this is not identical between products.

To obtain IVIVC, investigators are building into their in vitro models the factors that are known to affect the delivery of aerosolized drug to the lungs. Measurements of ED and APSD with these factors taken into account are more likely to parallel the variability introduced by these factors in vivo. Although 2D and 3D SPECT imaging studies provide limited information, this may be useful for IVIVC provided the stringent conditions for reliable and accurate data acquisition and analysis are met.

IN VITRO TESTING IN THE DEVELOPMENT AND REGULATION OF ORALLY INHALED PRODUCTS

In Vitro Data Decisions

In vitro data are a critical component of OIP development. This was reflected in the number and diversity of conference presentations focused on the design and analysis of in vitro data. Understanding how changes in product characteristics impact in vitro data and how those changes may impact the patient response to therapy is becoming more important as more product development and product quality decisions, such as some product design changes and product comparison assessments, are made on the basis of in vitro data alone. The industry is still at an early stage in determining how best to utilize this information in product development decisions, improved product lifecycle management, and overall product comparisons. In vitro studies may detect change in product performance for less financial investment compared to PK or PD studies; however, the implications of those in vitro findings for product quality and patient safety and efficacy are still not well understood. An important component in understanding formulation and device development is how scientifically valid statistical conclusions can be drawn from in vitro data. Research into measures of product quality and equivalence, statistical methods to interpret the results, and regulatory aspects of product change were reflected in the conference presentations (Table I).

Understanding the Formulation and Device

Device and formulation controls are critical in achieving in vitro comparability. Price (12) emphasized that understanding formulation and device characteristics is key to demonstrating in vitro comparability and delivering consistent products. Various statistically-based experimental approaches and data analysis techniques, such as design of experiment (DOE), analysis of variance (ANOVA), and multivariate analysis (MVA), have been useful to demonstrate that in vitro comparability is dependent on control of particle size and surface properties of the API and particle size of the excipient (86–89). However, this empirical knowledge must be extended to understand how raw material properties relate to powder blend parameters for DPIs and adhesion and cohesion properties of the particles for all suspension or powder-based OIP formulations, while accounting for the device and the measurement system (90–93). Gaps in this knowledge impede efficient product development and generally lead to iterative development programs requiring the collection and interpretation of large amounts of in vitro data. This is resource intensive and time consuming, can yield different interpretations if different data analysis techniques are used, and detracts from progressing understanding based on first principles. In vitro data can be generated relatively quickly compared to in vivo studies but, while informative, in vitro data alone do not explain relationships between formulation and device properties.

Product Quality and Equivalence

Design of a product development program to meet the different regulatory frameworks for different markets is complex. The requirement for collection of in vitro data to serve as quality performance indicators extends project timelines, slows product development, and therefore increases cost. Unless relationships between in vitro data, PK data and clinical outcomes can be better understood and made predictable across different respiratory formulations, these challenges will remain an inherent part of the process for introducing second entry OIPs. Rebello (94) highlighted the challenge of developing affordable second entry OIPs, i.e., the complexity of the strategy required to minimize resource demands and yet meet the varied regulatory recommendations for different regions of the globe (57,95). Appropriate selection of medical centers with respiratory expertise and analytical skills to conduct phase I studies and proper study design to ensure both inclusion of vulnerable populations and adequate sample size and statistical power for the populations studied are key to designing studies for several markets. Demonstrating in vitro comparability brings a further challenge in that reference product variability must be understood while the test product itself is being developed. For stages of an impactor where there is very low deposition, it is difficult to measure the small amounts of drug accurately. Under these conditions, even different batches of a reference product may not demonstrate in vitro equivalence on stages with very low deposition. Furthermore, stages with low deposition may not be clinically relevant.
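The difficulty with low-deposition stages can be illustrated with a small calculation. The sketch below is not taken from any conference presentation; it assumes a fixed absolute measurement standard deviation per stage (a hypothetical 0.5 µg) and uses a first-order approximation for the coefficient of variation (CV) of a test/reference ratio with independent errors. The approximation is crude when the CV is large, so the numbers are indicative only.

```python
import math

def ratio_cv(mean_test, mean_ref, abs_sd):
    """Approximate CV of a T/R ratio (first-order, independent errors)."""
    cv_test = abs_sd / mean_test
    cv_ref = abs_sd / mean_ref
    return math.sqrt(cv_test ** 2 + cv_ref ** 2)

# Hypothetical absolute measurement SD per stage, in micrograms (assumption).
abs_sd = 0.5

# Hypothetical mean stage depositions, in micrograms (assumption).
stages = [("high-deposition stage", 50.0), ("low-deposition stage", 1.0)]

for label, deposition in stages:
    cv = ratio_cv(deposition, deposition, abs_sd)
    print(f"{label}: approximate CV of T/R = {cv:.1%}")
```

With the same absolute measurement error, the relative uncertainty of the ratio grows as the amount deposited shrinks, which is one way to see why two batches of the same reference product can appear different on a sparsely loaded stage.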

A case study based on the Flutiform® pMDI by Venthoye (96) illustrated the complexity of lifecycle management after a product is licensed. It is clear that achieving consistent in vitro product performance over time is an investment that continues throughout the life of the product. Continuous monitoring and trending of in vitro product quality data, from research and development into commercial manufacturing and through product evolution, using statistical and data analysis tools, allows the impact of product changes on product quality to be better quantified and understood (97–100). Working across organizations in a multi-disciplinary manner to rapidly share this learning and product knowledge between manufacturing, supply chain and R&D groups allows development time and costs to be reduced (101,102).

Test Methods and Their Application

The measurement systems for determining APSD and ED are inherent components of monitoring product quality. Research efforts over the years have focused on developing in vitro testing apparatus that can predict or correlate with in vivo responses, in an effort to minimize patient exposure during drug development while still ensuring a final drug product that is safe and efficacious. Conference presentations are summarized in the context of this wider research arena in “PREDICTING IN VIVO PERFORMANCE FOR ORALLY INHALED PRODUCTS,” while emerging techniques such as dissolution testing were considered in “UNDERSTANDING THE FATE OF DRUGS IN THE LUNGS”.

Less emphasis was given within the conference presentations on the role of in vitro data in the long-term management of the product lifecycle. Within the process validation framework, additional samples are collected and tested initially on a limited number of batches. This degree of sampling and testing can be reduced over the life of the product as more product quality data are collected and product performance is better understood. Similar concepts could be applied to the generation and use of the APSD data (102) and have been introduced through work in developing quality control metrics for use with abbreviated impactor testing (103). This approach, together with efficient management of analytical changes across a broad network and diverse markets, is pertinent to many issues raised by Venthoye regarding product lifecycle management (96) and is an area for further debate and discussion.

Statistical Approaches

Insights gained from empirical in vitro data constitute observational learning that can be used to focus research effort. Empirical studies, however, can be influenced by how a study was conducted. “Noise” factors that are not accounted for during data collection and analysis can increase the risk of erroneous product development decisions and add complexity in interpreting outcomes across different studies. “Noise” factors include measurement system set-up, environmental or laboratory conditions, analyst training, and day-to-day variation. Additionally, different treatment of the data through the application of different statistical methods can lead to different conclusions. The science of data analysis in correctly interpreting product changes is as critical as the science of the formulation and device in the development of OIPs, but is at a very early stage.

The importance of data analysis was highlighted by Sandell (16) in a case study comprising a test and a reference version of a DPI product with three strengths, each tested at three different pressure drops, in which the output from different statistical methods was compared. The data presented showed that achieving comparable APSD is challenging when the effect of airflow through the device on the delivered dose is not the same between products. For the reference product, the delivered dose varied more as a function of airflow compared to the test product. This led to differences observed in the coarse particles (125–175% for the test to reference ratio) and marginal differences in total dose and fine particle dose (105–140% test to reference ratio for delivered dose and approximately 80–110% test to reference ratio for FPD) as analyzed by average bioequivalence method (ABE) or population bioequivalence method (PBE). Product differences were generally large at the lowest flow rate. For the data presented, for most in vitro parameters, the ABE and PBE produced consistent conclusions, but the differences in the data between products were pronounced. Different statistical methods reaching the same conclusion should be expected when differences are larger. These scenarios, however, do not provide as much insight on the performance properties of the statistical methods for marginal or borderline cases. Through simulation work, based on this set of reference data but considering a range of smaller differences, the PBE was shown to be, on average, more stringent compared to ABE for allowable differences in the mean based on current regulatory constants (16). Such observation is limited to the dataset chosen in the current study. 
The probability that the reference product would be shown equivalent to itself through the use of PBE varied between 52 and 100%, depending on the type of in vitro comparison being made, leading to a question of whether the regulatory constants should be reviewed. The case study provided an opportunity to reflect on the totality of data collected for a second entry product being developed for the market. How much influence should in vitro data have in considering the next phase of product development as the product moves into the clinic and, for second entry products, should the in vitro data carry more weight given that less clinical data is generated? This also underscores the importance of developing and studying statistical methods specifically for in vitro data.

Statistical methods for comparison of particle size distribution are another area of current focus. The development of these methods has yet to reach a stage where comparisons among them can be considered. Once methods are developed fully, additional work to understand their performance will be required. One such method being developed is the modified chi-square ratio statistic considered by Weber (15,104). In vitro data analysis has, over time, been viewed in two different ways. Data may be analyzed univariately (i.e., based on a single metric) to determine equivalence using the ABE or PBE (105–108). This is the case for delivered dose (the sum of grouped stages of cascade impaction test data) and can also be applied to individual stages, although this is not recommended as the error rate increases due to multiple comparisons. Further, comparison of individual stages based on the test/reference (T/R) ratio is generally inappropriate for stages with low deposition as the ratio becomes notoriously unstable, resulting in unreliable estimates and wider confidence intervals. Alternatively, data may be treated multivariately and for this, another technique is being developed in the form of the modified chi-square ratio statistic (109,110). Both the PBE and modified chi-square ratio metrics are based on the principle of quantifying the difference between test product and reference product relative to reference product variation. The modified chi-square ratio statistic measures the difference of a test profile from the mean of the reference product and compares this measure to the difference of a reference profile from the mean of the reference product. For a given set of test product and reference product data, this ratio is computed for every pairing of a test product profile with a reference product profile, and the median of these ratios is reported as the modified chi-square ratio test statistic.
The modified chi-square ratio test is non-parametric (i.e., does not assume an underlying distribution of the data) and does not use a log-transformation. The PBE is the sum of the squared difference between test and reference means and the difference between test product variance and reference product variance, scaled by the reference product variance or a regulatory constant for data in the log scale, and assumes a log-normal distribution. The ABE metric is the difference between reference product and test product means, again in the log scale, assuming a log-normal distribution. However, each of these methods defines equivalence in a different way based on a different statistical metric. The modified chi-square ratio and PBE methods directly incorporate reference product variation in the statistical claim and “reward” less variation in the newer product by allowing greater differences in the mean and still conclude equivalence based on the degree of overlap between different distributions.
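The ABE and PBE metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the regulatory procedure: it computes point estimates only (no confidence bound), uses simple sample variances of the log-transformed data rather than separate between-batch and within-batch components, and takes `sigma0 = 0.1` as an illustrative value for the constant that replaces the reference variance when that variance is small. All data values are hypothetical.

```python
import math
from statistics import mean, variance

def log_values(values):
    """Natural-log transform; both metrics are described on the log scale."""
    return [math.log(v) for v in values]

def abe_metric(test, ref):
    """Difference between test and reference means on the log scale."""
    return mean(log_values(test)) - mean(log_values(ref))

def pbe_metric(test, ref, sigma0=0.1):
    """Point estimate of a PBE-style metric on the log scale.

    Numerator: squared mean difference plus (test variance - reference variance).
    Denominator: the reference variance, or sigma0**2 when the reference
    variance is smaller (sigma0 is an illustrative assumption).
    """
    log_test, log_ref = log_values(test), log_values(ref)
    mean_diff = mean(log_test) - mean(log_ref)
    var_test, var_ref = variance(log_test), variance(log_ref)
    scale = max(var_ref, sigma0 ** 2)
    return (mean_diff ** 2 + var_test - var_ref) / scale

# Hypothetical delivered-dose values in micrograms (not conference data).
reference = [98.0, 102.0, 95.0, 105.0, 100.0, 101.0]
test_tight = [104.0, 105.0, 103.0, 106.0, 104.0, 105.0]  # shifted mean, low spread
test_wide = [90.0, 118.0, 96.0, 116.0, 99.0, 110.0]      # similar shift, high spread

for name, data in [("tight", test_tight), ("wide", test_wide)]:
    gmr = math.exp(abe_metric(data, reference))
    pbe = pbe_metric(data, reference)
    print(f"{name}: geometric mean T/R = {gmr:.3f}, PBE metric = {pbe:.3f}")
```

The two hypothetical test products have similar mean shifts, so the ABE metric treats them alike, whereas the PBE-style metric is smaller for the less variable product because the variance term in the numerator becomes negative. This is the "reward" for lower variability referred to in the text.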

Within the multivariate arena, this type of comparative research across different statistical methods for decision making properties has not yet begun in earnest. One reason is that, for the modified chi-square approach, research was still needed for defining an algorithm for an appropriate critical value. The work that Weber (15) presented in the conference is a significant step forward. This initial research work may be at a stage to be considered for standardization for consistent implementation in the industry, at which point this approach could be reviewed across other multivariate comparative statistical approaches for its performance characteristics.
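The construction of the modified chi-square ratio statistic, as described in words in this section, can be sketched as follows. The chi-square-type distance used here (squared deviation from the reference mean divided by the reference mean, summed over stages) is an assumption made for illustration; the published statistic may use a different scaling, and no critical value is computed. The profiles are hypothetical percentages of impactor mass on four stage groupings.

```python
from statistics import mean, median

def profile_mean(profiles):
    """Stage-by-stage mean of a list of deposition profiles."""
    return [mean(stage) for stage in zip(*profiles)]

def chi_square_distance(profile, centre):
    """Assumed chi-square-type distance between a profile and a mean profile."""
    return sum((x - m) ** 2 / m for x, m in zip(profile, centre))

def median_chi_square_ratio(test_profiles, ref_profiles):
    """Median, over all test/reference pairs, of
    distance(test, reference mean) / distance(reference, reference mean)."""
    centre = profile_mean(ref_profiles)
    ratios = [
        chi_square_distance(t, centre) / chi_square_distance(r, centre)
        for t in test_profiles
        for r in ref_profiles
    ]
    return median(ratios)

# Hypothetical profiles (percent of impactor mass per stage grouping).
ref_profiles = [
    [20.0, 35.0, 30.0, 15.0],
    [22.0, 33.0, 31.0, 14.0],
    [19.0, 36.0, 28.0, 17.0],
    [21.0, 34.0, 32.0, 13.0],
]
similar_test = [
    [21.0, 34.0, 30.0, 15.0],
    [20.0, 35.0, 31.0, 14.0],
    [22.0, 34.0, 29.0, 15.0],
]
shifted_test = [
    [30.0, 35.0, 25.0, 10.0],
    [31.0, 34.0, 24.0, 11.0],
    [29.0, 36.0, 26.0, 9.0],
]

print("similar test:", round(median_chi_square_ratio(similar_test, ref_profiles), 2))
print("shifted test:", round(median_chi_square_ratio(shifted_test, ref_profiles), 2))
```

No distributional assumption or log transformation is involved, in line with the non-parametric character of the method. Deciding how large a median ratio may be before equivalence is rejected is the critical-value question discussed above and is not addressed by this sketch.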

For all these approaches, to date, understanding is limited regarding exactly how disparate the dose uniformity and APSD can be between products that are concluded to be equivalent or non-equivalent, how the performance of these tests compares to more standard statistical approaches, and whether there is consensus for the claims to be made from the data. Summaries of current statistical approaches proposed for equivalence assessment are provided in Tables III and IV.

Table III Current Univariate Statistical Methods Proposed for Equivalence Assessment of In Vitro Data for Orally Inhaled and Nasal Drug Products
Table IV Current Multivariate Statistical Method Proposed for Equivalence Assessment of In Vitro Data for Orally Inhaled and Nasal Drug Products

Regulatory Complexities

The challenge of establishing bioequivalence of OIPs can be attributed to the complexity of this type of drug product, i.e., a formulation integrated with a device, as well as the location of the drug target, i.e., the majority of the drug products in this category are locally acting. Due to this complexity, the regulatory standards for bioequivalence of OIPs set by various regulatory agencies throughout the world contain many differences (111). For locally acting OIPs, FDA recommends an approach where in vitro bioequivalence tests, PK bioequivalence studies and PD/clinical bioequivalence studies are considered in their entirety, i.e., the “weight-of-evidence” approach. European jurisdictions adopt an approach under which bioequivalence is established as soon as criteria are met in a test category, with in vitro, PK and PD/clinical BE studies considered in a step-wise order (5).

It should be noted that there is a difference in the bioequivalence definition between the USA and that of many European jurisdictions. In the USA, bioequivalence is defined as the absence of a significant difference in the rate and extent to which the active ingredient or active moiety in a pharmaceutically equivalent presentation becomes available at the site of drug action when administered at the same molar dose under similar conditions in an appropriately designed study (112). Under this definition, PD and clinical studies which are direct indices of drug availability at its action sites, as well as PK studies, which are generally considered reflective of drug availability at its action site, are all included under the USA bioequivalence context. In contrast, under the EMA definition, the emphasis is primarily on therapeutic equivalence.

In the presentation by Garcia-Arieta (13), the EU step-wise approach for second entry OIPs was elaborated using examples from various categories of OIPs. For nebulized solution products, if the formulation is qualitatively (Q1) and quantitatively (Q2) identical, product approval can be granted without in vitro testing. However, if the composition is Q1 and Q2 different, in vitro testing is necessary. Bioequivalence for a budesonide suspension product for nebulization was based on in vitro data ensuring similar particle size distribution of the particles in the suspension, and similar aerodynamic particle size distribution of the nebulized droplets. In US regulatory history, budesonide inhalation suspension has also been approved based on an in vitro bioequivalence study (113). Historically, the key uncertainty for FDA regarding the in vitro only approach to demonstrate bioequivalence for suspension drug products was that typical suspension products contain insoluble excipients in the product formulation, which makes comparative physical measurement of the API particles in the test and reference formulation unfeasible. Therefore, a confirmatory in vivo clinical bioequivalence study is recommended to address such uncertainty. In the budesonide inhalation suspension formulation, the active ingredient forms the only undissolved particles in the formulation and all excipients are present in their soluble form. Given that the API is the only particle in the drug product formulation, i.e., the particle size measurement is not confounded by the presence of other insoluble inactive ingredients and is a true representation of the API, the FDA accepted an in vitro test only approach for bioequivalence of this product (105).

For mometasone furoate nasal spray, a suspension product recently approved in the EU based on in vitro tests only, it was interesting to note that, although the EU adopted a different overall approach (in vitro tests only) from that of the FDA (weight-of-evidence approach), the in vitro test package supporting bioequivalence in the EU-approved application was the same package described in the FDA draft guidelines. However, the statistical methods for evaluating the in vitro data in the EU are based on average bioequivalence (ABE), not population bioequivalence (PBE) as recommended by FDA. For ipratropium bromide HFA pMDI, an orally inhaled solution pMDI approved in the EU based on in vitro data only, the considerations of the in vitro tests were similar to those for the nasal spray suspension product outlined above.

These generic products were approved by the EU based on in vitro data alone using the rationale that these are minor variations of the reference product, and IVIVC is not necessary since the in vitro test is more discriminatory. A stringent limit was set for the in vitro test comparison; thus, there was no need to utilize a clinically relevant difference as acceptance criteria. To date, the US FDA has not considered an in vitro only option for an orally inhaled pMDI or DPI product.

FUTURE DEVELOPMENTS

Progress in addressing previously published consensus research priorities emerged at the conference (Table V). A higher total lung dose alone does not necessarily denote a better OIP as deposition pattern also influences clinical outcome, although these are not the only factors. Furthermore, total lung deposition does not equate to 100% bioavailability, e.g., for poorly soluble drugs, such as fluticasone, the total lung dose is not all systemically available, and may not be available locally. Here, recent developments in computational mechanistic models (20,25,56) may provide valuable insights into how deposition profiles, together with dissolution, local clearance processes, and permeation, influence local and systemic exposure and thus therapeutic performance.

Table V Mapping of Some of the Key Areas Highlighted as Being in Need of Attention at the Iceland Meeting in 2009 (1), to Developments with In Vitro Methodology Reported and Discussed at the IPAC-RS Meeting in Orlando, 2014

In vitro methods for measuring particle size profiles, which were originally developed for quality control testing rather than IVIVC, are being adapted and novel systems are being developed to provide tests that are more clinically relevant and capable of improving IVIVC (as discussed in “PREDICTING IN VIVO PERFORMANCE FOR ORALLY INHALED PRODUCTS”). Validation of these approaches is a key challenge before moves towards standardization can be considered. In performing IVIVC, differences between healthy volunteers and patients should be considered, i.e., healthy volunteers are trained in device use, typically less diverse in terms of age, size and ethnicity, and do not have lung disease. For patients with lung disease, the varying severity and impact on lung function will affect both dose deposited and deposition pattern; thus, it is important to recognize the limitations of relying on total lung deposition in that this measurement does not accommodate targeting in the lungs. In Reisner’s presentation (75), investigators designing a clinical trial to demonstrate bioequivalence used in vitro findings of similar ED for the two study inhalers rather than FPD to select the in vivo dose. The test inhalers, a DPI and a pMDI, demonstrated very different inspiratory flow rate dependency in terms of FPD, as might be expected from such different devices, and the data affirm the need to obtain in vitro and in vivo measurements over a range of realistic flow rates when inhalers are being compared.

Incorporating features that mimic patient-relevant conditions into laboratory testing of OIPs has the potential to improve the IVIVC and provide a more realistic dataset representing drug delivered to the lung (Table II). The “clinical” inputs being utilized in laboratory tests include pressure-flow profiles which can simulate inhalation by patients through a variety of inhalers and can be defined for various age groups and disease severities. Other clinically realistic features that can be integrated into an in vitro model include the use of humidified air through the impactor when measuring particle size, building in a delay between the actuation of a pMDI ± valved spacer and deposition of the aerosol in the impactor, and using “inlets” or throat models that more closely resemble the geometry of the patients being tested. Data produced with these refinements may provide a more accurate assessment of how inhalers will perform “in the field”.

Several emerging techniques may provide improved insights into IVIVC, including PET studies (direct labeling or ligand displacement) to indicate target engagement as performed for the neurological research area and CT with CFD, termed functional respiratory imaging (FRI) by de Backer (114). FRI can be used to assess OIPs by comparing morphologic (airway volume and geometry) and ventilation changes in the segmented airways (up to the 7th generation) with clinical responses measured using conventional tests (115,116). With FRI, improvements in both airway volume and distribution of ventilation in responders to test drug have been shown (117). No data supporting the use of this approach to establish bioequivalence between a generic product and its innovator OIP are currently available.

Analysis and interpretation of in vitro data is increasingly important for decision-making. A consequence of industry and regulators focusing more attention on in vitro testing for product development and product comparisons is the need for further work to address questions in the area of statistical equivalence and analysis, which was reflected across multiple presentations. First, more research is needed to understand the most appropriate metrics: should data be analyzed in terms of original units, such as micrograms, or as fractions, such as the percent total impacted? It should be noted that inhaled medicines are prescribed in terms of their ED; thus, deposition fractions alone can be misleading as a high FPF, for example, can equate to a low FPD if the inhaler yields a low ED. Should the data from particular in vitro tests be analyzed in its original form or after transformation (e.g., log transformed or some other transformational approach)? What are the most robust statistical models for analysis and do those models describe adequately univariate data, such as delivered dose, FPD, and spray pattern, and multivariate data, such as cascade impaction or droplet size distribution test data? The models also need to be applicable to data that span a large range numerically. Clarity is needed regarding the appropriateness of regulatory constants applied to data that span such a wide range numerically and data with a wide range of variability. Finally, the statistical hypotheses being tested, and the statistically-based conclusions claimed, need to be clear and well understood by scientists and regulators to ensure that the claims are appropriate, scientifically-sound, and provide the most beneficial outcomes for patients and the developers and manufacturers of OIPs.
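The point about fractions versus original units can be made concrete with a short calculation. The sketch assumes that FPF is expressed as a fraction of the emitted dose, so that FPD = FPF × ED; the two inhalers and their values are hypothetical.

```python
def fine_particle_dose(emitted_dose_ug, fine_particle_fraction):
    """FPD in micrograms, taking FPF as a fraction of the emitted dose."""
    return emitted_dose_ug * fine_particle_fraction

# Hypothetical inhalers, not products discussed at the conference.
inhalers = {
    "A (high FPF, low ED)": {"ed_ug": 40.0, "fpf": 0.50},
    "B (lower FPF, higher ED)": {"ed_ug": 100.0, "fpf": 0.30},
}

for name, spec in inhalers.items():
    fpd = fine_particle_dose(spec["ed_ug"], spec["fpf"])
    print(f"Inhaler {name}: FPD = {fpd:.1f} ug")
```

Ranked by FPF alone, inhaler A looks superior, yet it delivers the smaller fine particle dose, which is why the choice of metric matters before any equivalence statistic is applied.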