1 Introduction

Coronal mass ejections (CMEs) represent an important source of solar variability from the point of view of plasma and magnetic field: CMEs remove billions of tons of magnetized plasma from the Sun and dump them into the Sun-Earth connected space once every other day during solar minimum and several times per day during solar maximum. CMEs also provide dramatic variable energy input to the magnetosphere, in addition to and sometimes in combination with the high speed streams that originate from coronal holes. CMEs are the source of major disturbances in the interplanetary medium, and can be directly observed up to 32 solar radii (R) from the Sun, thanks to the sensitive coronagraphs on board the Solar and Heliospheric Observatory (SOHO) mission launched in 1995. The Large Angle and Spectrometric Coronagraph (LASCO) on board SOHO has unprecedented dynamic range and large field of view obtaining coronal images of very high quality (Brueckner et al. 1995). Although we know that CMEs originate from closed magnetic field regions such as active regions, filament regions, or a combination thereof, we do not yet know how CMEs are initiated. Several groups are involved in developing models of CMEs to understand CME initiation and interplanetary propagation with the ultimate aim of quantifying the CME impact on Earth and hence on the society. The CME catalog will provide the necessary data sets to test both empirical and physics-based models of CMEs. Although information on CMEs has been available sporadically since their discovery in the early 1970s, it is only after the advent of the SOHO mission that we have continuous and uniform data on CMEs (see Gopalswamy 2004 for a review).

The CME Catalog we describe in this paper grew out of a coordinated data analysis workshop (CDAW) organized in 1999 to study the properties of CMEs associated with interplanetary radio bursts observed by the Radio and Plasma Wave (WAVES) Experiment (Bougeret et al. 1995) on board the Wind spacecraft during the period of overlap with SOHO observations from 1996 to June 1998 (Gopalswamy et al. 2001a). During the course of analyzing these events, it was realized that creating a set of plots and tables for each CME is a convenient way of comparing CME data with those of related phenomena. An effort to compile the properties of all LASCO CMEs was already underway (St Cyr et al. 2000), which we expanded to include plots and tables in the form of the SOHO/LASCO CME Catalog. The Catalog resides in the CDAW Data Center at the Goddard Space Flight Center (http://cdaw.gsfc.nasa.gov). Since it was opened to the public, the Catalog has been improved and expanded based on input from the scientific community. Currently the Catalog contains many types of movies in addition to the measurements, thus greatly enhancing its utility. An earlier description of the Catalog can be found in Yashiro et al. (2004); the present paper updates that description to cover the new entries.

2 General Layout of the Catalog

The top level of the Catalog is a year-month matrix, with each element linked to the monthly list of CMEs (see Fig. 1). A set of links is provided below the table to (i) a complete description of the Catalog, (ii) a search engine to search the entire Catalog, (iii) a text-only version of the Catalog, and (iv) related links useful for CME research. The text-only version lists all the basic information about the CMEs, except for plots, movies and tables of measured parameters.

Fig. 1

The top page of the SOHO/LASCO CME Catalog. One row corresponds to all the months in the year. Months not underlined (no link) have no data. The inset (bottom right) shows the text-only version of the Catalog

3 Monthly Lists

The monthly list is a 13-column html table containing CME measurements, movies, and plots for each CME identified in a given month. It also contains information on associated phenomena such as solar energetic particles (SEPs) and geomagnetic storms compiled from online data sources. Entries in the monthly list have links to additional information on CMEs. At the top of the monthly lists, a simple explanation is provided for getting information from additional layers. A link to the list of data gaps (≥3 h) during the month is also provided. The data-gap list must be consulted before deciding whether a CME occurred: if there is a data gap, it is difficult to say whether or not there was a CME during the gap.

Each row in the monthly list corresponds to one CME. The first three columns of the monthly list serve as identification (ID) for each CME: the date and time of first appearance in the LASCO/C2 field of view (FOV) and the central position angle (CPA). More than 10 CMEs can occur on a single day, and many CMEs can appear at the same time in the C2 FOV. The CPA can essentially distinguish these CMEs appearing simultaneously. CMEs with an apparent width of 360° (Howard et al. 1982) are marked as “Halo” in the CPA column. Halo CMEs can be symmetric (S) or asymmetric with respect to the occulting disk. Brightness asymmetry (BA) and outline asymmetry (OA) can be recognized in most halos. The halo CMEs are accordingly labeled as “Halo (S)”, “Halo (BA)”, and “Halo (OA)”. The first column is linked to direct and difference LASCO/C2 movies with direct and difference 195 Å EUV images superposed. The EUV images are obtained by SOHO’s Extreme-ultraviolet Imaging Telescope (EIT, Delaboudinière et al. 1995). These movies provide a complete view of the CME in question (including CMEs that cross day boundaries). The superposed EIT images, especially the difference images, are very useful in locating the solar sources of CMEs (eruption region on the disk, at the limb or behind the limb). Column 4 lists the sky-plane width of the CMEs, typically measured within the C2 FOV after the width becomes stable (early on, the width often increases). Information as to when the width was measured (#WDATA) is available in the text data containing original measurements as a sub-layer of column 2.

Each CME is characterized by three speeds (columns 5–7): (1) the linear speed obtained by fitting a straight line to the height–time measurements made at the fastest section of the CME, (2) the quadratic speed obtained by fitting a parabola and evaluating the speed at the time of the final height measurement, and (3) the speed obtained as in (2) but evaluated when the CME is at a height of 20 R. Acceleration (column 8) is obtained from the quadratic fit to the height–time measurements. All quantities refer to the sky plane. Caution must be exercised in dealing with CMEs that fade away before reaching 20 R. For some CMEs that show significant acceleration, the linear fit is not suitable. However, the linear speed serves as an average speed within the LASCO FOV. Clicking on any of the speeds brings up the height–time plots with the fitted curves superposed (see Fig. 2). The actual height–time measurements are also available in a text file linked to the first-appearance column. It must be pointed out that the measurement is made at a single position angle (PA) in the two-dimensional images. This means there is more information in the original data than is presented in the Catalog. Users should consult the original data for additional information.
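The three speeds and the acceleration described above can be illustrated with a short least-squares sketch. This is not the Catalog's own fitting code: the function names, the nominal solar radius, and the choice of hours and solar radii as working units are assumptions made for illustration, and the speed at 20 R is taken from the fitted parabola via v² = c1² + 2a(h − c0).

```python
import math

R_SUN_KM = 695_700.0  # nominal solar radius in km (assumed value)


def polyfit(x, y, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    m = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= f * m[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (b[i] - s) / m[i][i]
    return c


def cme_kinematics(t_hr, h_rsun):
    """t_hr: hours since the first measurement; h_rsun: heights in solar radii.

    Needs at least three points for the quadratic fit.
    """
    to_km_s = R_SUN_KM / 3600.0                 # R/hr   -> km/s
    to_m_s2 = R_SUN_KM * 1000.0 / 3600.0 ** 2   # R/hr^2 -> m/s^2
    _, v_lin = polyfit(t_hr, h_rsun, 1)         # slope of the straight line
    c0, c1, c2 = polyfit(t_hr, h_rsun, 2)       # h = c0 + c1*t + c2*t^2
    accel = 2.0 * c2
    v_final = c1 + accel * t_hr[-1]             # speed at the last measurement
    v20_sq = c1 ** 2 + 2.0 * accel * (20.0 - c0)
    # None when the fitted parabola never reaches 20 R.
    v20 = math.sqrt(v20_sq) * to_km_s if v20_sq > 0 else None
    return {
        "v_linear_km_s": v_lin * to_km_s,
        "v_final_km_s": v_final * to_km_s,
        "v_20rsun_km_s": v20,
        "accel_m_s2": accel * to_m_s2,
    }
```

Working in hours and solar radii keeps the normal equations well conditioned; the results are converted to km/s and m/s² only at the end.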

Fig. 2

Linear fit (left) and quadratic fit (right) to the height–time measurements. In the height–time plots, the C2 and C3 data points are distinguished by asterisk and diamond symbols, respectively. For the linear fit, measured feature, position angle, and velocity are noted on the plot. For quadratic fit, there are two plots: the top one is the height–time plot and the bottom one is the speed-height plot derived from the quadratic fit. The CME speed at 20 R listed in the CME table is obtained from this plot. The position angle and the acceleration from the fit are also noted on this plot

The acceleration of a CME can be positive, negative, or close to zero, meaning that the CME speeds up, slows down, or moves with constant speed within the LASCO FOV, respectively. A minimum of three height–time measurements is needed for an estimate of the acceleration, but the accuracy increases when there are more measurements. Accelerations with just three measurements are not reliable and are marked with a superscript, *1.

Each CME is also characterized by a mass (in g, column 9) and a kinetic energy (in erg, column 10). There are generally large uncertainties in these numbers. Estimation of CME mass involves a number of assumptions, so the values given should be taken as representative. For example, most CMEs show an increase in mass when they traverse the first several solar radii, and then the mass reaches a quasi-constant value. This constant value is taken as the representative mass listed in the Catalog. Some CMEs fade within the first few solar radii. In these cases the mass corresponds to the time of last measurement. The mass estimates of halo CMEs are also very uncertain. The kinetic energy is obtained from the linear speed and the representative mass. Mass and kinetic energy values subject to such uncertainties are superscripted with *2.
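As a small illustration of how the kinetic energy follows from the linear speed and the representative mass, the sketch below applies E = ½ m v² in CGS units. The function name and the example values are hypothetical; only the unit convention (g, km/s, erg) comes from the text.

```python
def kinetic_energy_erg(mass_g, linear_speed_km_s):
    """Kinetic energy in erg from mass in g and sky-plane linear speed in km/s."""
    speed_cm_s = linear_speed_km_s * 1.0e5  # km/s -> cm/s
    return 0.5 * mass_g * speed_cm_s ** 2


# Hypothetical example: a 1e15 g CME moving at 1000 km/s.
energy = kinetic_energy_erg(1.0e15, 1000.0)
print(f"{energy:.1e} erg")
```

Because the speed enters squared, uncertainties in the sky-plane speed affect the energy estimate more strongly than comparable fractional uncertainties in the mass.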

Column 11 gives the position angle at which the height–time measurements are made (MPA for measurement position angle). Ideally, the MPA and CPA must be the same. However, some CMEs move nonradially, so the two do not always coincide. Even though there is no CPA for a halo CME, there is an MPA, corresponding to the PA of the fastest moving segment of the CME leading edge.

Column 12 has links to a number of movies and composite plots related to the CME in question. C2, C3, and 195 denote links to LASCO and EIT daily movies available at the Naval Research Laboratory (NRL) (http://lasco-www.nrl.navy.mil/daily_mpg/). ‘PHTX’ (proton, height-time, X-ray) links to three-day overview plots of SEP events (protons in the >10, >50 and >100 MeV energy channels of the particle detectors on board the Geostationary Operational Environmental Satellite (GOES)), CME height–time, and GOES soft X-ray flares. These are useful in identifying the CMEs and flares associated with SEP events. ‘DST’ links to composite 6-day plots of CME height–time and the geomagnetic storm index (Dst), useful for recognizing the CMEs responsible for intense geomagnetic storms. The 6-day interval is chosen because CMEs typically take <1 day to ~5 days to reach Earth. Finally, links are provided to several Javascript daily movies. LASCO/C2 daily movies with the EIT images superposed (c2eit) and LASCO/C2 difference daily movies with EIT difference images superposed (c2_rdif) are useful in identifying the solar sources of CMEs. The difference movies use running difference (rdif) images (i.e., each frame has its previous frame subtracted so only the changes are seen in the movies). Additional versions of these two movies including GOES soft X-ray plots in the 1–8 Å band with the times of the LASCO frames marked (c2eit_gxray, c2rdif_gxray) are helpful in identifying the flare association of CMEs. The LASCO/C3 running difference movies (c3_rdif) provide information on CMEs propagating beyond the C2 FOV. Similarly, there are movies combining LASCO images with Wind/WAVES dynamic spectra (c2eit_waves, c3rdif_waves). For the movies involving GOES X-ray light curves and Wind/WAVES dynamic spectra, there are also modified versions to suit the screens of laptop computers. 
The times of LASCO frames are marked on GOES light curves and WAVES dynamic spectra so that the heliocentric distance of the CME at various phases of flares and radio bursts can be identified. The EIT images have a different cadence from those of C2 and C3, so the EIT images nearest in time are superposed on C2 images. All the images that form the frames of the Javascript movies (for the whole day) are also given under the movie link. The dynamic spectra and direct images are in color. The last column (13) of the monthly list contains some remarks regarding the number of data points and other limitations, as well as links to the halo CME alerts from the LASCO operator. Table 1 shows details of all the movies listed in the Catalog.

Table 1 List of daily Javascript movies available in the Catalog

We regard the linear speed, width, CPA, and acceleration as the basic attributes of a CME. The text file linked to the first-appearance time contains the actual height–time measurements, which may be useful for overplotting with other data. The text file also contains the CME onset times obtained by extrapolating the linear fit (#ONSET1) or quadratic fit (#ONSET2) to the solar surface (height = 1 R). Note that these extrapolations are accurate only for limb events. For disk events, the estimated onset is likely to be after the actual onset. There is a quality index (1 to 5) listed in the text file for each CME, 1 being poor and 5 being excellent.
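The back-extrapolation to 1 R can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure used to produce #ONSET1 and #ONSET2: the function names, the nominal solar radius, and the use of a single reference measurement (time, height, speed) as the starting point are all hypothetical.

```python
import math
from datetime import datetime, timedelta

R_SUN_KM = 695_700.0  # nominal solar radius in km (assumed value)


def onset_linear(t_ref, h_ref_rsun, v_km_s):
    """Extrapolate constant-speed motion back to a height of 1 R."""
    dt_s = (h_ref_rsun - 1.0) * R_SUN_KM / v_km_s
    return t_ref - timedelta(seconds=dt_s)


def onset_quadratic(t_ref, h_ref_rsun, v_ref_km_s, accel_m_s2):
    """Extrapolate constant-acceleration motion back to 1 R.

    Returns None if the fitted parabola never reaches 1 R.
    """
    a = accel_m_s2 / 1000.0  # m/s^2 -> km/s^2
    if abs(a) < 1e-12:
        return onset_linear(t_ref, h_ref_rsun, v_ref_km_s)
    dh = (h_ref_rsun - 1.0) * R_SUN_KM
    # Solve 0.5*a*dt^2 + v*dt + dh = 0 for the root just before t_ref.
    disc = v_ref_km_s ** 2 - 2.0 * a * dh
    if disc < 0:
        return None
    dt_s = (-v_ref_km_s + math.sqrt(disc)) / a
    return t_ref + timedelta(seconds=dt_s)
```

For a decelerating CME the quadratic onset is later than the linear one computed from the same reference speed, because the parabola implies higher speeds closer to the Sun.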

The Catalog is searchable from the Virtual Solar Observatory (VSO) site (http://vso.nascom.nasa.gov/cgi-bin/vso/catalog.pl) and from the Catalog itself (see Fig. 3). The Catalog can also be searched by the SolarSoft (SSW) routine ssw_getcme_list.pro. SSW users may use that function directly for application development and to access the Catalog within an SSW session. Optionally, output from that function may be input into ssw_cme2files.pro to map from events to the links on the Catalog website (http://cdaw.gsfc.nasa.gov). The Catalog search returns an html table similar to the monthly list in the Catalog and an additional “Event Summary”. The Event Summary contains (1) the LASCO/C2 image of first CME appearance (difference image with EIT images superposed), (2) the GOES (1–8 Å) soft X-ray profile with the time of the LASCO/C2 frame marked by a vertical line, (3) the height–time plot with linear fit, and (4) a simple html table giving the time of first appearance, the extrapolated onset time at 1 R, the number of CMEs in the range searched, the search criteria used, and the number of CMEs meeting the search criteria.

Fig. 3

The Search Engine form available online, which can be used to group CMEs based on time range (start and stop times), linear speed, angular width, acceleration, mass, and central position angle. The inset provides simple guidance for using the search engine

4 Data Products for Space Weather Research

The PHTX plots, DST plots and the Javascript movies are designed to facilitate correlation studies of Sun–Earth Events and associated phenomena. In this section we describe these with particular emphasis on their geospace consequences.

4.1 SEP Events and Shocks

Figure 4 shows an example of the PHTX plot for 2005 May 13. The plot extends from a day before to a day after the given day. The red, blue and green plots on the top show the proton intensity observed by the GOES satellite in the >10 MeV, >50 MeV, and >100 MeV energy channels. The middle panel shows CME height–time plots, with the LASCO data gaps indicated by the slanted lines. CMEs heading predominantly in the North, East, South, and West directions are shown in different colors as marked. The line styles of the height–time plots indicate the width (W) of the CMEs: halo CMEs (solid, W = 360°), partial halos (120° ≤ W < 360°), wide CMEs (60° ≤ W < 120°) and normal CMEs (W < 60°). Fast and wide CMEs are important for space weather. The bottom panel shows the light curves in the two GOES soft X-ray channels (1–8 Å: upper curve and 0.5–4 Å: lower curve). The heliographic locations of the flares are indicated on the 1–8 Å curve whenever available. For example, the largest flare on the plot is marked N12E11 because the flare originated from northern (N) latitude (12°) and eastern (E) longitude (11°) on 2005 May 13. This long duration flare was associated with the halo CME at 17:12 UT, which gave rise to the large SEP event in the top panel. It is easy to identify the flare and CME associated with isolated events like this. The SEP intensity profile rises very slowly because the eruption site is not well connected to the GOES satellite. For events occurring in quick succession, one has to play the Javascript movies to distinguish the events. One of the useful tools is the Javascript movie that combines LASCO images with the Wind/WAVES dynamic spectrum. From these movies one can easily recognize a type II radio burst, which is indicative of a fast mode shock in the corona and interplanetary medium. The same CME-driven shock is thought to accelerate electrons (to produce the type II burst) and ions (observed as SEP events). 
Figure 5 shows one of the frames of the Javascript movie with the radio dynamic spectrum. The onset of the eruption is usually marked by a group of type III bursts (vertical features marked). The type II burst is the slanted feature, which drifts from higher to lower frequencies. The radio emission occurs at the plasma frequency in the corona and interplanetary medium ahead of the CME-driven shock, so higher frequency corresponds to distances closer to the Sun. For very strong shocks, the radio burst can be seen all the way from the Sun to the observing spacecraft near Earth. In these cases, the shock, the type II burst, and the locally produced SEPs can all be seen at the same time. SEP events are observed in approximately 50% of the cases with type II bursts. This is easily understood because SEPs need good magnetic connectivity to the observer, whereas the type II bursts can be seen from eruptions occurring anywhere on the disk (Gopalswamy et al. 2008b). The LASCO image in Fig. 5 shows the 2005 May 13 halo CME at 17:22 UT. The white circle is the size of the optical Sun. The gray disk represents the coronagraph’s occulting disk, which is necessary to block the bright photospheric light so that we can observe the faint corona. Overlaid on the occulting disk is the nearest EUV difference image, which helps identify the surface activity associated with the CME. The EUV disturbance has spread over the entire disk by 17:07 UT. A faint envelope of the CME is actually outside the field of view (in the south) at 17:22 UT, so the C2 image was not used for height–time measurements. Just two LASCO/C3 frames were available to measure the speed. The speed was 1,689 km/s. At 17:22 UT, the type II radio emission occurs at ~1 MHz, consistent with the CME height ≥6 R. The type II burst continued to drift to longer and longer wavelengths all the way to the local plasma frequency in the vicinity of the Wind spacecraft. 
The shock arrived at the SOHO spacecraft on May 15 at 02:19 UT and was accompanied by a huge energetic storm particle (ESP) event. These are also SEPs, but accelerated locally when the shock arrives at 1 AU.
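The Sun-to-1 AU transit of the 2005 May 13 event can be worked out from the two times quoted above. The sketch below is only a rough estimate under stated assumptions: it takes the first appearance of the halo CME in LASCO/C2 (17:12 UT) as the start time, uses a distance of exactly 1 AU even though SOHO sits slightly sunward of Earth, and the function name is hypothetical.

```python
from datetime import datetime

AU_KM = 1.495978707e8  # astronomical unit in km


def shock_transit(first_seen, shock_arrival, distance_km=AU_KM):
    """Return (transit time in hours, mean transit speed in km/s)."""
    dt_s = (shock_arrival - first_seen).total_seconds()
    return dt_s / 3600.0, distance_km / dt_s


hours, mean_speed = shock_transit(
    datetime(2005, 5, 13, 17, 12),  # halo CME first seen in C2 (from the text)
    datetime(2005, 5, 15, 2, 19),   # shock arrival at SOHO (from the text)
)
print(f"transit time: {hours:.1f} h, mean speed: {mean_speed:.0f} km/s")
```

The mean transit speed obtained this way is lower than the 1,689 km/s measured in the LASCO FOV, as expected for a fast CME that decelerates in the interplanetary medium.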

Fig. 4

PHTX plot for 2005 May 13: (top) GOES SEP plots in three energy channels, (middle) CME height–time plots, and (bottom) GOES soft X-ray light curves in two wavelength bands

Fig. 5

One frame of a Javascript movie combining the LASCO C2 images (left) with the dynamic spectrum (right) from the Wind/WAVES experiment. The vertical white line in the dynamic spectrum marks the time of the LASCO image. When the movie runs, the vertical line shifts to the right because the dynamic spectrum moves to the left showing later times. Such movies are useful in identifying the CME responsible for the type II radio bursts. The type III bursts mark the onset of the eruption at the Sun. Note that the C2 images are replaced by C3 images for c3rdif_waves movies. Similarly, the dynamic spectrum is replaced by GOES soft X-ray plots for c2rdif_gxray movies. See Table 1 for other combinations

4.2 Geomagnetic Storms

CMEs take anywhere from less than a day to a few days to reach Earth and cause geomagnetic storms. CMEs cause geomagnetic storms when they contain internal magnetic structures with a southward-pointing out-of-the-ecliptic component of the magnetic field. This can happen in the sheath and ejecta portions of the CMEs in the interplanetary medium (see, e.g., Gopalswamy et al. 2008a). In the example shown above, the shock took only ~33 h to arrive at 1 AU because it was driven by a fast CME. The maximum time taken by slow CMEs to reach Earth is about six days. Keeping this travel time in mind, the Dst plots are made for 6-day intervals including the day of the CME in question (see Fig. 6). The DST plots consist of the geomagnetic storm index (Dst), CME height–time history, and the GOES soft X-ray flux. From these plots, one can identify the CME and the solar source location corresponding to a geomagnetic storm. The CME height–time plot is shown extrapolated to a larger heliocentric distance (150 R) for easier identification of CME-storm pairs. The Dst index is obtained from the World Data Center (WDC) in Kyoto (http://swdcwww.kugi.kyoto-u.ac.jp/). The Dst plot shows two horizontal lines: the dashed line showing Dst = 0 and the dotted line showing Dst = −100 nT. When the Dst index reaches −100 nT or lower, the geomagnetic storm is said to be intense (or strong). When Dst reaches −50 nT, the storm is considered moderate. If the Dst data are not final, this is indicated by the phrase “Provisional Data”. Quick look and provisional data are likely to change, so the WDC page must be consulted for final values to be used in scientific analyses. The 2005 May 13 CME discussed above is responsible for a super-intense geomagnetic storm on May 15 at 9 UT, when the Dst index reached −263 nT. Before going negative, the Dst index becomes positive, which indicates the arrival of the shock at Earth and is known as the storm sudden commencement (SSC). 
This time also coincides with the ESP event and other manifestations of the shock mentioned above. Note that the halo CME on 2005 May 13 is isolated and there is no confusion in identifying the solar source of this storm. Some of the storms weaker than −100 nT may be caused by corotating interaction regions (CIRs), so caution must be exercised in the identification of weaker storms. The bottom panel identifies the GOES flares with their heliographic coordinates marked. By clicking on the peaks of the flares (the black open circles at the peaks of flares in Fig. 6), one can see EUV movies (direct and difference movies) showing the location of the flare and the extent of the EUV disturbances. LASCO C2 movies are also linked to GOES flare plots to identify the associated GOES flare and its intensity. Currently, links are provided for all X-class flares and some selected flares. However, all the movies listed in Table 1 are available under the “Javamovie” link in the monthly list.
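The Dst thresholds quoted above translate into a simple classification of the minimum Dst value within a plotted interval. The sketch below is illustrative only: the function names and the label used for values above −50 nT are assumptions, and the sample series is made up apart from the −263 nT minimum quoted in the text.

```python
def classify_storm(dst_min_nt):
    """Classify a storm by its minimum Dst (nT) using the thresholds in the text."""
    if dst_min_nt <= -100.0:
        return "intense"
    if dst_min_nt <= -50.0:
        return "moderate"
    return "weak or none"  # label for Dst > -50 nT is an assumption


def storm_from_series(dst_series_nt):
    """Return (minimum Dst, storm class) for a sequence of hourly Dst values."""
    dst_min = min(dst_series_nt)
    return dst_min, classify_storm(dst_min)


# Hypothetical hourly values ending in the -263 nT minimum of 2005 May 15.
print(storm_from_series([12.0, 35.0, -40.0, -180.0, -263.0, -210.0]))
```

The positive values at the start of the sample series mimic the brief positive excursion associated with the storm sudden commencement.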

Fig. 6

Plots of the Dst index (top), CME height–time (middle) and GOES soft X-ray light curves (bottom). The circles at the flare peaks are links to EIT movies (direct and difference) which show the location of the flare on the disk and the related EUV disturbance

Figure 7 shows one of the frames of the Javascript movie linked to the flare on 2005 May 13. The GOES plot on the Javascript movie shows the time of the EUV difference image (16:37 UT) marked. The EUV eruption is of large scale and covered the whole solar disk in the next frame (not shown, but it can be viewed online at http://cdaw.gsfc.nasa.gov/flare_lasco/2005/05/13/1613/00_rdeit.html). This plot is useful in inferring the reason for the slow rise in SEP intensity (an eastern event, so the connectivity improves with time); the origin close to the disk center enables the CME plasma to impact Earth’s magnetosphere directly, causing the intense magnetic storm. Additional Javascript movies linked to the flares are c2eit movies and c2rdif movies, which can be used to compare the relative location of the EUV disturbance and the early phase of the white light CME. Figure 8 displays one frame of a Javascript movie (c2rdif_gxray) showing another halo CME (2006 December 13) associated with the last ground level enhancement (GLE) event of solar cycle 23. The CME was associated with an X-class flare, and resulted in an intense geomagnetic storm.

Fig. 7

One frame of a Javascript movie linked to flare peaks in the Dst plots. (left) EIT difference image (16:37 minus 16:17) showing the huge eruption in the northwest quadrant of the Sun. The circle represents the optical Sun. (right) GOES soft X-ray light curves with the time of the EIT difference image (16:37 UT) shown as the vertical line

Fig. 8

One frame of a Javascript movie combining the LASCO C2 images (left) with the GOES soft X-ray light curve (right). The vertical white line in the GOES plot marks the time of the LASCO image. Such movies are useful in identifying the solar source of the eruption. The EUV difference image shows the huge disturbance in the southwest quadrant of the Sun, which becomes the huge halo CME

5 Some Statistical Properties

This section provides an overview of CME properties obtained from the CME Catalog using the current list of more than 11,000 CMEs identified up to the end of 2006. Earlier results can be found in Gopalswamy (2004), Yashiro et al. (2004), and Gopalswamy (2006). It must be pointed out that the total number of CMEs is not an absolute number. Since the CMEs are identified manually, there is always the possibility that some CMEs are missed. In addition, it is likely that some narrow CMEs are missed due to the visibility function. Yashiro et al. (2005) estimated that ~20% of CMEs may not have been detected by LASCO because they are either masked by the occulting disk or back-sided. The undetected CMEs are likely to be narrow on average because they are mostly associated with C-class GOES X-ray flares. The daily CME rate averaged over Carrington Rotation periods (27.3 days) increases from one every other day during solar minimum to more than six per day during solar maximum (see Fig. 9). More than 10 CMEs per day is not uncommon when certain super active regions are on the Sun, but the average daily rate is lower. The rate increase was somewhat sudden in 1998, and the rate remains high (~2 CMEs per day) even at the end of 2006. Automatic CME detection techniques such as CACTuS (Robbrecht and Berghmans 2004) detect many more CMEs. These are not likely to be independent CMEs, and some of them may be due to noise because difference images are used to identify CMEs. It is well known that CMEs are highly structured, with a three-part structure and, for fast CMEs, a leading shock. It is possible that many of the substructures of a CME are identified as different CMEs. This is one of the major problems to be solved before automatic detection schemes can yield realistic CME counts. There may also be faint CMEs easily missed by the human eye, but these are not likely to be important in terms of heliospheric impact. 
In view of the above discussion, we think that the CME Catalog contains most of the CMEs and the missing CMEs may not significantly influence the statistical properties. We are in the process of revisiting the LASCO data for the period before 2004 to identify any narrow CMEs that might have been missed during periods of high solar activity.
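The rotation-averaged daily rate described above amounts to counting CMEs in consecutive 27.3-day bins and dividing by the bin length. The sketch below is a simplified illustration: it uses fixed-length bins counted from an arbitrary start time rather than the standard Carrington rotation boundaries, it ignores LASCO data gaps, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

ROTATION_DAYS = 27.3  # Carrington rotation period quoted in the text


def daily_rate_per_rotation(cme_times, start):
    """Map rotation index (0, 1, ...) to the mean number of CMEs per day.

    cme_times: first-appearance datetimes; start: beginning of rotation 0.
    CMEs before `start` are ignored; rotations without CMEs are omitted.
    """
    bin_s = ROTATION_DAYS * 86400.0
    counts = Counter()
    for t in cme_times:
        offset_s = (t - start).total_seconds()
        if offset_s >= 0:
            counts[int(offset_s // bin_s)] += 1
    return {idx: n / ROTATION_DAYS for idx, n in sorted(counts.items())}
```

With one CME every other day, a bin contains about 14 events and the function returns a rate of roughly 0.5 per day, the solar-minimum value quoted above.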

Fig. 9

(left) CME rate (averaged over Carrington Rotation periods) as a function of time for the first 11 years of SOHO observations. The minimum rate is similar to the pre-SOHO values, but the maximum rate is higher by a factor of two. (right) Mean CME speed as a function of time showing that the speed is higher by a factor of 2 during solar maximum phase. The spikes on the CME speed plot are due to some of the super active regions as marked with the month and AR. The mean speed is obtained by averaging the speeds of all CMEs occurring in a given Carrington rotation period

The CME mean speed in Fig. 9 clearly shows a solar cycle variation. The mean speed corresponds to all CMEs occurring during a given Carrington rotation period. The mean speed during solar minimum is less than 300 km/s, while it is close to 600 km/s during maximum, thus varying by a factor of two. The mean-speed plot also shows a number of spikes, which correspond to periods of high-speed ejections. These CMEs are from active regions with prolific CME activity, sometimes referred to as “super active regions”. Several of these active regions are marked in the plot. The largest spike corresponds to the October–November 2003 period, when active region NOAA 0486 produced many fast and wide CMEs in quick succession (Gopalswamy et al. 2005b). A detailed account of this active region in comparison with a few other recent regions can be found in Gopalswamy et al. (2006).

The average speed of CMEs (475 km/s, see Fig. 10) is slightly above the slow solar wind speed. It was not possible to measure the speed of ~11% of CMEs for various reasons, including the appearance of a CME in just a single frame. The average apparent width is ~44° (counting CMEs with width <120° because the true width of halo CMEs is unknown). About 3% of all CMEs are full halos and ~11% of CMEs have widths ≥120°. The fast and wide CMEs are the most energetic CMEs, which can propagate far into the interplanetary medium. One can see a tendency for faster CMEs to decelerate and slower CMEs to accelerate, as given by the relation a = −0.015 (V − 466), with the acceleration a in m/s² and the speed V in km/s. CMEs with V ~ 466 km/s do not accelerate and move with the solar wind. CMEs with V > 466 km/s decelerate, while those with V < 466 km/s accelerate. Such a tendency is also known when the CME propagation is considered over the Sun-Earth distance (Gopalswamy et al. 2000).
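The regression relation between acceleration and speed can be evaluated directly. The sketch below simply encodes the quoted coefficients; the function name is hypothetical, and the relation describes an average tendency within the LASCO FOV rather than a prediction for any individual CME.

```python
def regression_acceleration(speed_km_s):
    """Average acceleration (m/s^2) from the relation a = -0.015 (V - 466)."""
    return -0.015 * (speed_km_s - 466.0)


# Slow, solar-wind-like, and fast sky-plane speeds (illustrative values).
for v in (300.0, 466.0, 1000.0):
    print(f"V = {v:6.0f} km/s -> a = {regression_acceleration(v):+.2f} m/s^2")
```

The sign change at 466 km/s reproduces the behavior described above: slower CMEs accelerate on average and faster CMEs decelerate.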

Fig. 10

Speed (left), width (middle) and acceleration (right) of all the CMEs identified up to the end of 2006. All measurements are with respect to the sky plane. No attempt was made to correct for projection effects. The number of CMEs used in each distribution is indicated on the plots. The numbers used for width and speed distributions differ because height–time measurements could not be made for some CMEs due to insufficient data. The acceleration (a) is negative at large speeds (v) as indicated by the regression line in the (right) panel. The number 466 roughly corresponds to the solar wind speed

Figure 11 shows the mass and kinetic energy distributions of CMEs. The mass was determined only for non-halo CMEs. The average mass (3.5 × 10¹⁴ g) is significantly smaller than the pre-SOHO values (~3 × 10¹⁵ g) because LASCO can detect a large number of narrower, low-mass CMEs (see also Vourlidas et al. 2002). Inclusion of halo CMEs is likely to increase the average value. The average kinetic energy is also smaller (2.9 × 10²⁹ erg for SOHO CMEs, compared to 3.4 × 10³⁰ erg for Solwind CMEs). It must be pointed out that many CMEs have kinetic energies exceeding 10³² erg, especially when they are fast and wide (Gopalswamy et al. 2005b). The largest kinetic energy measured so far is 1.2 × 10³³ erg for a CME on October 28, 2003, which provides a rough estimate of what is available as free energy on the Sun. The CME kinetic energy is expected to be a fraction of the available free energy in solar active regions.

Fig. 11

Distributions of CME mass and kinetic energy of all CMEs for which mass and speed measurements were possible. The average (Ave) and median (Med) values of the distributions are shown on the plots

As of the end of 2006, more than 10,000 CMEs have been detected since 1996, but only a small fraction (1–2%) of them are important in causing geomagnetic storms or producing large SEP events. For example, only ~80 large geomagnetic storms caused by CMEs have been observed during cycle 23 (1996–2006). The number of large SEP events is similar (~95). The number (99) of magnetic clouds (MCs) is not too different, either. A CME is considered to be geoeffective if it produces a significant effect in the vicinity of Earth. For example, large geomagnetic storms are caused by CMEs impinging upon Earth. These are typically front-side halo CMEs. Shock-driving CMEs resulting in a significant enhancement of solar energetic particle intensity in geospace are also considered geoeffective (or SEPeffective).

The extent and importance of CMEs as derived from the past decade of SOHO observations are summarized in a cumulative distribution of CME number in Fig. 12. The first thing to notice is that each of the special populations has an average speed well above that of the general population consisting of more than 11,000 CMEs. The special populations are: (1) CMEs associated with metric type II radio bursts (m), which drive shocks very close to the Sun (within 2 R). (2) CMEs associated with MCs. These CMEs originate close to the disk center of the Sun and arrive at Earth as magnetic clouds characterized by enhanced magnetic field, smooth rotation of the magnetic field direction, and low proton temperature (Burlaga et al. 1982). About two-thirds of MCs are known to be geoeffective (Gopalswamy et al. 2008a). (3) CMEs that cause major geomagnetic storms (GEO), some of which are not MCs probably because they originate slightly away from the disk center. CMEs associated with MCs are subject to projection effects, so their true speeds are likely to be higher than those measured in the sky plane. (4) Halo CMEs are those which appear to surround the occulting disk of the coronagraph (HALO). Halos can originate from the frontside or the backside of the Sun, but the frontside ones are important for geospace impact. About 70% of frontside halos are geoeffective (Gopalswamy et al. 2007), similar to MCs. In fact, there is a large overlap among MC, GEO, and HALO populations. (5) CMEs associated with type II bursts having emission components at all (from metric to kilometric) wavelengths (mkm). These CMEs drive shocks over the entire Sun–Earth distance because longer-wavelength type II bursts are produced when CMEs reach larger distances from the Sun (Gopalswamy et al. 2005c). (6) CMEs producing large solar energetic particle (SEP) events. The current paradigm is that such SEPs are produced by CME-driven shocks. 
The SEP and mkm speeds are close to each other because the same shock accelerates both electrons and protons. (7) CMEs associated with the ground level enhancement (GLE) in SEPs represent the fastest population of CMEs. These are rare events, and the average speed of the associated CMEs is ~2,000 km/s (Gopalswamy et al. 2005d). Note that the three fastest populations (mkm, SEP, GLE) all drive shocks and accelerate particles. The three populations with intermediate speed (MC, GEO, and HALO) are important for plasma impact on geospace. The m population also drives shocks, but very close to the Sun. The average speed of CMEs associated with purely metric type II bursts is only ~600 km/s, so they do not have enough energy to drive shocks far into the IP medium. The CMEs faster than 1,000 km/s include GEO, HALO, SEP-related, mkm type II related, and GLE-related CMEs. About 1,000 CMEs over the solar cycle have speeds exceeding ~1,000 km/s. In other words, the fastest ~10% of CMEs make a significant impact on the heliosphere.

Fig. 12

(left) Cumulative distribution of CME speeds with average speeds of various CME populations indicated: all CMEs (all), CMEs associated with metric type II radio bursts (m), CMEs associated with magnetic clouds (MCs), CMEs that cause major geomagnetic storms (GEO), Halo CMEs which appear to surround the occulting disk of the coronagraph (HALO), CMEs associated with type II bursts having emission components at all (from metric to kilometric) wavelengths (mkm), CMEs producing large solar energetic particle (SEP) events, and CMEs associated with the ground level enhancement (GLE) in SEP events. (right) The annual number of various special populations: all CMEs, halo CMEs, fast and wide CMEs, and fast and wide western CMEs

The annual numbers shown in Fig. 12 indicate that only a small fraction of all observed CMEs belong to the special populations: halo CMEs, fast and wide CMEs, and fast and wide western CMEs. CMEs are considered fast if they have speed ≥900 km/s. Wide CMEs have width ≥60°. The 900 km/s threshold is not a magic number: it is based on the average speed of CMEs that produce interplanetary type II radio bursts (Gopalswamy et al. 2001b). The halo CMEs are inherently more energetic and cause geomagnetic storms when frontsided (Gopalswamy et al. 2007) even though their true width is unknown. Fast and wide CMEs include halos and are slightly larger in number because some halos have speeds <900 km/s. The fast and wide western CMEs originate in the western hemisphere of the Sun and are closely associated with SEP events at Earth because of the magnetic connectivity. Another point to notice is the rapid decline in the number of CMEs with speeds exceeding 2,000 km/s. The Catalog contains only 37 CMEs with speeds ≥2,000 km/s, only nine with speed ≥2,500 km/s, and a single CME with speed >3,000 km/s. The lack of CMEs at very high speeds is probably related to the maximum amount of free energy that can be stored in active regions, which in turn may also be related to the size of the active regions that form on the solar surface.
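The selection criteria for the fast and wide populations can be expressed as a short sketch. The record layout, the field names, and the sign convention for source longitude (positive = west) are assumptions made here for illustration; they are not the Catalog's own format.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text.
FAST_SPEED_KM_S = 900.0   # CMEs with speed >= 900 km/s are considered fast
WIDE_WIDTH_DEG = 60.0     # CMEs with width >= 60 degrees are considered wide


@dataclass
class CME:
    """Hypothetical record; field names are illustrative only."""
    speed_km_s: float                       # sky-plane speed
    width_deg: float                        # sky-plane angular width
    source_lon_deg: Optional[float] = None  # assumed convention: positive = west


def is_fast_and_wide(cme: CME) -> bool:
    """True if the CME meets both the speed and the width threshold."""
    return cme.speed_km_s >= FAST_SPEED_KM_S and cme.width_deg >= WIDE_WIDTH_DEG


def is_fast_wide_western(cme: CME) -> bool:
    """True if the CME is fast and wide and has a known western source."""
    if cme.source_lon_deg is None:
        return False  # source location unknown, so it cannot be classified
    return is_fast_and_wide(cme) and cme.source_lon_deg > 0.0
```

Keeping the thresholds as named constants makes it easy to test how sensitive the annual counts are to the choice of 900 km/s and 60°.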

6 Some Unsolved Problems

Several problems on the origin, interplanetary propagation, and heliospheric consequences of CMEs remain unsolved. Here we point out a few, especially those that can be addressed using the data base described in this paper.

CME initiation follows the storage of energy in closed magnetic field regions on the Sun over a certain period of time, but we do not know what triggers the release of energy (Gopalswamy et al. 2006). The stored energy is known as the free energy available to be released as flares and CMEs. We do not know how this free energy is partitioned into flare energy and CME kinetic energy. We do know that flares occur without CMEs, but CMEs are always accompanied by flares. Flares accompanying CMEs (known as eruptive flares) are generally of lower temperature compared to flares not accompanied by CMEs. This may indicate that flares without CMEs contribute to coronal heating (Yashiro et al. 2006) while CMEs carry away the mass and magnetic field into the heliosphere. Based on the highest mass (10¹⁷ g) and speed (~3,500 km/s) observed, one can estimate a maximum kinetic energy of ~6 × 10³³ erg. Assuming that only a fraction of the stored energy is released in a single episode, we can set a limit of ~10³⁶ erg for the maximum free energy available in a solar active region. This is consistent with the size and magnetic field strengths in solar active regions. Large active regions with high magnetic field can store more energy.
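The maximum kinetic energy quoted above follows from the usual expression E = ½mv² in CGS units. The short sketch below reproduces the arithmetic; the function name is hypothetical and chosen only for this illustration.

```python
def kinetic_energy_erg(mass_g: float, speed_km_s: float) -> float:
    """Kinetic energy in erg from mass in grams and speed in km/s."""
    speed_cm_s = speed_km_s * 1.0e5  # 1 km = 1e5 cm
    return 0.5 * mass_g * speed_cm_s ** 2


# Highest mass and speed quoted in the text: 1e17 g and ~3,500 km/s.
ke_max = kinetic_energy_erg(1.0e17, 3500.0)
print(f"{ke_max:.2e} erg")  # about 6e33 erg
```

The result, roughly 6 × 10³³ erg, is an upper bound built from two separate extremes, since the most massive CME and the fastest CME need not be the same event.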

CMEs are subject to propelling and retarding forces in the corona and interplanetary medium (see e.g., Vršnak et al. 2004). The propelling force has not yet been properly identified. Solar gravity and the drag force due to momentum exchange between CMEs and the ambient medium constitute the main retarding forces. The net result is that CMEs tend to acquire the speed of the ambient solar wind at large distances from the Sun. This can be quantified as an effective interplanetary acceleration (Gopalswamy et al. 2000; 2001a). However, CMEs come in all sizes and shapes and the ambient solar wind is also highly variable. The propagation of CMEs is also affected by the presence of preceding CMEs, especially during solar maximum years when CMEs occur in quick succession (Gopalswamy et al. 2004). CMEs can also be deflected by other CMEs and by nearby coronal holes (Gopalswamy et al. 2005a). We need a proper quantification of these effects in order to accurately predict the arrival of a CME at a desired location in the heliosphere, once its launch has been observed and the initial speed measured. Another issue is the true speed with which CMEs propagate towards a location in the heliosphere. LASCO measures speeds in the sky plane, but the travel time prediction needs the space speed. For example, if we consider CMEs heading towards Earth, we need to deproject the sky-plane speed and reproject it along the Sun-Earth line. There have been several attempts to convert the sky plane speed into Earth-directed speed using cone models with reasonable success, but more work is needed (Xie et al. 2006; Michalek 2006).
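The deprojection and reprojection steps can be illustrated with a deliberately simplified geometric sketch. It treats the CME as a point-like feature moving radially outward from its source region and ignores the finite width of the CME and the tilt of the solar rotation axis. This is an assumption made here for illustration only and is not the cone model used in the cited studies; the function name and arguments are hypothetical.

```python
import math


def deproject_point_like(v_sky_km_s: float, lat_deg: float, lon_deg: float):
    """Return (space speed, Earth-directed speed) in km/s.

    Assumes radial motion of a point-like feature from a source at the
    given heliographic latitude and central-meridian longitude.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Angle theta between the radial direction and the Sun-Earth line.
    cos_theta = math.cos(lat) * math.cos(lon)
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))
    if sin_theta < 1.0e-6:
        # Motion exactly along the line of sight has no sky-plane component.
        raise ValueError("sky-plane speed carries no information at disk center")
    v_space = v_sky_km_s / sin_theta   # deproject from the sky plane
    v_earth = v_space * cos_theta      # reproject onto the Sun-Earth line
    return v_space, v_earth


# Example: source on the equator, 30 degrees from the central meridian.
v_space, v_earth = deproject_point_like(500.0, 0.0, 30.0)
```

The correction diverges as the source approaches the disk center, which is one reason a point-like treatment is inadequate for Earth-directed halo CMEs and models with finite angular width are needed.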

Even though the fastest CMEs produce energetic particles, we do not fully understand why some seemingly energetic events produce only low levels of SEPs. There are clear indications that particle acceleration is a complex issue with multiple sources (shocks and flares) and multiple factors deciding the acceleration efficiency (Kahler 2001; Gopalswamy et al. 2004; Kahler and Vourlidas 2005). We do not know what the flare and shock contributions are for a given SEP event. We also do not know how the ambient medium, consisting of previously ejected CMEs, shocks, and SEPs, determines the properties of a subsequent event.

We do not fully understand how the CMEs observed by remote sensing evolve into the CMEs observed in situ in the solar wind. Magnetic clouds are CMEs observed in the solar wind with specific magnetic properties, the prominent one being their flux rope structure. It is very difficult to observe flux rope structure near the Sun. Prominences are thought to be flux ropes near the Sun, but observations in the interplanetary medium are not compatible with that. Magnetic clouds are observed with high charge states implying high temperature (several million K) at the source, whereas prominences are cooler structures with a temperature of only ~8,000 K. Coronal cavities observed in eclipse pictures and inner coronal images in X-rays and EUV are thought to be flux ropes, but an alternative explanation is that these are highly sheared magnetic structures. A recent quantitative comparison between reconnected flux at the eruption site and the azimuthal flux in flux ropes in the solar wind suggests that the two fluxes are approximately the same (Qiu et al. 2007), implying that the flux ropes are formed during the eruption process rather than present in the pre-eruption state. White-light CME observations mainly provide information on the mass content of the CME, but very little on the magnetic structure. Many related observations (magnetic and other) need to be pooled to try to obtain the magnetic structure of CMEs. Another related issue is whether all CMEs in the interplanetary medium are magnetic clouds. If that is the case, it will be very useful for space weather purposes because it implies a definite magnetic structure, from which one can infer the onset time of geomagnetic storms. The magnetic cloud structure indicates a definite leading and trailing field orientation, which decides the day-side reconnection with Earth’s magnetic field that ultimately results in the magnetic storm. 
While the flux ropes in the interplanetary medium have a well defined strength and structure of the magnetic field, the same cannot be said about CMEs near the Sun. At present, we have to infer the nature of interplanetary CMEs based on the magnetic properties of solar active regions at the photospheric or chromospheric levels, but the eruption itself starts in the corona.

CMEs not only produce energetic particles, but also deflect energetic particles coming from outside the heliosphere: the galactic cosmic rays accelerated in supernova shocks. At the time of CME passage at Earth, the cosmic ray intensity decreases, an effect known as the Forbush decrease (e.g., Zhang and Burlaga 1988). Such decreases have also been observed at various locations in the heliosphere where observations could be made. Newkirk et al. (1981) proposed that CMEs in the heliosphere essentially represent the magnetic fluctuations needed to produce the observed solar cycle variation of cosmic ray intensity. Although the idea was not pursued further because the observed number of CMEs and the frequency of occurrence were not high enough, SOHO observations indicate that both these restrictions are no longer an issue. Preliminary investigation suggests that CMEs can indeed account for the 22-year modulation cycle (Gopalswamy 2004; Lara et al. 2005). This is an important issue to be resolved because cosmic rays pose a significant threat to space exploration, especially to missions to other planets.

7 Summary

Compared to other forms of solar eruptions such as flares, prominence eruptions and type II radio bursts, CMEs have a shorter history. Over the past 30 years, several questions relating to CMEs have been answered and their immense consequences throughout the heliospace have been recognized. CMEs start producing particle radiation soon after they lift off and continue to do so at Earth’s orbit and beyond. They also produce intense geomagnetic storms, which are known to have adverse consequences in geospace as well as on the ground. CMEs also present an important laboratory to study magnetized plasmas of size scales much larger than possible on Earth. However, several problems remain unsolved, especially regarding CME origin and interplanetary propagation. The CME data base described in this paper can be used to address several of the outstanding issues on CMEs. The SOHO mission has observed more than 11,000 CMEs over the past decade. These observations represent an extensive data set with a high degree of uniformity and quality over an unprecedented field of view. The simultaneous availability of solar wind data from Wind, ACE and Ulysses has greatly enhanced the utility of the SOHO data for understanding the far-reaching influence of CMEs in the heliospace. Similarly, readily available geomagnetic data from the World Data Center, and SEP and X-ray data from the GOES satellite have helped produce additional data products such as the composite plots containing SEP intensity and Dst index along with CME height–time plots and GOES X-ray flux with the heliographic coordinates of the flares. These data products enable extensive correlative studies on several aspects of CMEs. CME movies with Wind/WAVES dynamic spectra and GOES X-ray flares help identify CMEs associated with flares and type II radio bursts. These data products are very useful for space weather research because disparate data sets are pieced together in a format easily usable by non-specialists.