1 Introduction

Since the mid-1980s, entrepreneurship has increasingly been considered an important tool for economic growth and innovation across economies, regardless of their stage of economic development. Entrepreneurship is now at the center of many policy questions related to science and technology, sustainability, poverty, human capital, endogenous resources, employment, and regional and comparative advantages. The surge of policy interest in entrepreneurship has, not surprisingly, been accompanied by growing academic research into its dynamics and processes. With respect to policy, research priorities have focused first on understanding (measuring) entrepreneurship and second on creating environments supportive of it (Acs and Szerb 2007). One particularly important public policy issue for international development is the role played by institutional features of the investment climate, for instance, environmental conditions (Levie and Autio 2008) and indicators of the business environment (such as those measured by the Doing Business reports; see World Bank 2007). These indicators include measures of the regulatory burden for starting, operating and closing a business, such as the cost, number of days and number of procedures required to start a business.

In recent years, different sources of data on “entrepreneurship” have led to contradictory or inconclusive empirical findings for research into its dynamics.Footnote 1 For example, it is still unclear whether, and in what direction, causation exists between entrepreneurship and unemployment, poverty, taxation, regulatory burden, etc. Country-specific differences may certainly lead to contradictory findings, as may the variety in the types of data used as broad measures of “entrepreneurship.” This has contributed to a great deal of confusion in entrepreneurship research. For this reason, it is critically important to understand what the data indicate, and exactly what element of entrepreneurial dynamics is being measured. The World Bank Group Entrepreneurship Survey (WBGES) data, for example, measure the registration of limited liability corporations (LLCs), which is one kind of legal arrangement for a new firm. We discuss the implications of the various definitions of start-ups further in the comparative analysis section of our paper.

Separate studies using Global Entrepreneurship Monitor (GEM) and WBGES data have found contradictory results: While no relationship is found between GEM data and administrative barriers to starting a business, a significantly negative effect is found with WBGES data (van Stel et al. 2007 and Klapper et al. 2007, respectively).Footnote 2 It is possible that this result, and similar contradictory results in the empirical entrepreneurship research, can be attributed to some degree to differences in what the data capture. For this reason, we compare these two popular datasets designed to capture entrepreneurial dynamics.

In this paper, we compare the GEM dataset for early stage entrepreneurial activity and the WBGES dataset for formal business registration. We find two important trends when the data are compared descriptively. First, GEM data tend to report significantly higher levels of early stage entrepreneurship in developing economies than do the World Bank business entry data. Second, the World Bank business entry data tend to be higher than GEM data for developed countries.

There are at least two possible ways to interpret this discrepancy. First, the datasets simply measure different dynamics related to “entrepreneurship.” The WBGES data measure rates of entry in the formal economy and, even more specifically, entry in the form of LLC establishments. The GEM data are perhaps more reflective of entrepreneurial intent and what some might call “entrepreneurial spirit.” For this reason, GEM data also capture informal entrepreneurship, particularly in developing countries; firm formation does not necessarily mean firm registration. Second, this discrepancy can also be interpreted as the spread between individuals who could potentially operate businesses in the formal sector and those who actually do so. If this is the case, then GEM data may represent the potential supply of entrepreneurs, whereas WBGES data would represent the actual rate of entrepreneurship. This is especially interesting in the context of the allocation of talent (Murphy et al. 1991) and the allocation of entrepreneurship (Baumol 1990). In the allocation of talent model, the stock of talent is relatively constant, but its allocation towards a range of activities can change. Similarly, in the allocation of entrepreneurship model, the stock of entrepreneurs in the economy is relatively constant, but the nature of their activities changes.

The motivation for entrepreneurs to operate in the formal versus informal sector is examined further in our empirical analysis. We find that the magnitude of differences reported in the datasets across countries is related to the institutional and environmental conditions for entrepreneurs. In terms of institutional differences, we find that the conditions related to registration, operation and closure of business are important; in terms of environmental differences, we find significant effects of economic and political conditions. Overall, entrepreneurs in developed countries have greater ease and incentives to incorporate, both for the benefits of greater access to formal financing and labor contracts and for tax and other purposes not related to business activities. We elaborate on this further in the comparative analysis section of this paper.

2 Data description

2.1 GEM

The Global Entrepreneurship Monitor (GEM) project is unique in its cross-country comparability. While all countries collect official data on self-employment, the size distribution of firms, firm and plant entry, and census data on all or most plants and firms, almost none of these registry sources are comparable across countries, even among developed countries. Official data sources differ in how they define when an establishment enters a file and when it leaves, and in how they handle self-employment, which makes cross-national comparisons almost impossible.Footnote 3 Therefore, one of the major strengths of the project is the application of uniform definitions and data collection across countries for international comparisons.

The intent of GEM is to systematically assess two things: the level of start-up activity (the prevalence of nascent firms) and the prevalence of new or young firms that have survived the start-up phase. First, start-up activity (the “nascent” rate) is measured by the proportion of the adult population (18–64 years of age) in each country that is currently engaged in the process of creating a nascent business. Second, the presence of new firms (the “baby” rate) is measured by the proportion of adults in each country who are involved in operating a business that is less than 42 months old. The distinction between nascent and new firms is made in order to determine the relationship of each to national economic growth. For both measures, the research focus is on entrepreneurial activity in which the individuals involved have a direct, but not necessarily full, ownership interest in the business.

2.2 World Bank Group data

The goal of the 2007 World Bank Group Entrepreneurship Survey (WBGES) was to collect a benchmark of formal entrepreneurial activity for a large number of developed and developing countries. The intent is that these data will be used to compare private sector development across countries, as well as to monitor and evaluate the impact of regulatory reforms over time. In order to measure entrepreneurship and make data universally comparable, we developed a methodology that is applicable across heterogeneous legal regimes and economic systems. Previous efforts had been made in this regard, but the great majority focused solely on the developed world and did not take into account differences in legal systems, sectors and economic structures (see United Nations 2005).

The WBGES defines the unit of measurement of entrepreneurship as:

Any economic unit of the formal sector incorporated as a legal entity and registered in a public registry, which is capable, in its own right, of incurring liabilities and of engaging in economic activities and transactions with other entities.

Notably, this definition excludes informal sector initiatives. This exclusion is based on the difficulties of quantifying the number of firms in the informal sector, rather than on its relevance for developing economies (Nielsen and Ploving 1997). The only way to measure the informal sector is through economic censuses, which, because of their high cost, are conducted infrequently.

Furthermore, entrepreneurship is defined as:

The activities of an individual or a group aimed at initiating economic activities in the formal sector under a legal form of business.

However, few countries (e.g., Denmark) maintain “active” registries that annually confirm that registered firms are still operating. Therefore, official registration data include both businesses incorporated for economic activities and those incorporated for tax or other non-business purposes (e.g., shell companies). An additional limitation of the data is that they do not report the number of closed businesses. The reasons differ from country to country, but are mainly due to the fact that the registrars generally have no enforcement mechanisms to obligate businesses to report closures. Although the number of closed companies is essential to paint a clear picture of the economic and entrepreneurial activities of a country, it is not yet feasible to obtain comparable data (Nucci 1999).

The WBGES database includes data on formal business registrations in 84 countries. The information was collected from business registries and other government sources via a survey and follow-up phone calls.Footnote 4 These other sources include statistical agencies, tax and labor agencies, chambers of commerce and private vendors (such as D&B), which were used only when business registry data were unavailable or non-existent.Footnote 5 The survey collected data on the year-end stock of total registered firms and new firms registered in the calendar year from 2003 to 2005.Footnote 6 Importantly, the definition of entrepreneurship includes only businesses that operate in the formal sector, and to maximize comparability across countries of different legal and economic systems, the database includes only limited liability corporations (LLCs).

For the purpose of the analysis in this paper, the data are used to calculate the “corporate” entrepreneurship rate, which is defined as the number of newly registered companies as a percentage of the adult population.
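As a minimal sketch of this definition, the rate can be computed as shown below. The function name `corporate_rate` and the counts in the example are hypothetical and are not taken from the WBGES data. The text does not specify the age bounds of the adult population used in the denominator, so that figure is simply passed in.

```python
def corporate_rate(new_registrations: int, adult_population: int) -> float:
    """Newly registered companies as a percentage of the adult population."""
    if adult_population <= 0:
        raise ValueError("adult_population must be positive")
    if new_registrations < 0:
        raise ValueError("new_registrations cannot be negative")
    return 100.0 * new_registrations / adult_population


# Hypothetical country-year: 25,500 new LLC registrations, 1,000,000 adults.
example_rate = corporate_rate(25_500, 1_000_000)
print(f"{example_rate:.2f}%")  # prints 2.55%
```

Expressing the rate per 100 adults puts it on the same scale as the GEM “nascent” and “baby” rates, which is what allows the two sources to be differenced directly.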

2.3 Comparative analysis

To compare entrepreneurship rates between the two databases, we calculate the spread between the “nascent” and “baby” entrepreneurship rates in GEM and the “corporate” entrepreneurship rate in WBGES.Footnote 7 The first new indicator, SPR_N_C, measures the difference between the percentage of individuals who are in the process of starting a business (the GEM “nascent” rate) and the percentage who have actually started a formal corporation (the “corporate” rate). The second new indicator, SPR_B_C, measures the difference between the percentage of individuals operating a young business in either the formal or informal sector (“baby”) and the percentage of individuals who have chosen and/or succeeded in starting a formal corporation (“corporate”).
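Written out, the two spread indicators take the following form. The country index c and year index t are notational assumptions added here for clarity; the direction of the subtraction (GEM rate less corporate rate) follows the definition in the Fig. 1 caption.

```latex
\begin{align}
\mathrm{SPR\_N\_C}_{c,t} &= \mathrm{Nascent}^{\mathrm{GEM}}_{c,t} - \mathrm{Corporate}^{\mathrm{WB}}_{c,t}, \\
\mathrm{SPR\_B\_C}_{c,t} &= \mathrm{Baby}^{\mathrm{GEM}}_{c,t} - \mathrm{Corporate}^{\mathrm{WB}}_{c,t}.
\end{align}
```

Under this convention, a positive spread means the GEM rate exceeds the formal registration rate, and a negative spread means formal registrations exceed the GEM rate.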

We interpret these spreads to reflect, in some part, a loss of potential formal sector participation. In other words, a spread can represent those individuals who were unsuccessful in registering their business because of barriers to registration, which we introduce later, or who chose to operate in the informal sector. The tendency of GEM data to be higher than WBGES data for developing countries is likely partly indicative of lost formal sector participation due to barriers to participation, and partly indicative of the informal economy due to choice. These are not mutually exclusive. In either case, the individual may still have started a business, but as we mentioned in the introduction, firm formation does not mean registration. We expect a higher spread, indicating a larger loss of entrepreneurial potential, in countries with weaker business environments.Footnote 8 The quality of the business environment, as measured by the Doing Business and other indicators, is widely accepted as a critical determinant of entrepreneurial activity. These spreads, by country, are shown in Fig. 1.

Fig. 1 Nascent, young and formal entrepreneurship. Variables are defined in Table 1. Panel A: SPR_N_C [“Nascent” (GEM) less “Corporate” (WB) entrepreneurship rates]. Panel B: SPR_B_C [“Baby” (GEM) less “Corporate” (WB) entrepreneurship rates]

What would we expect the data to show from a theoretical perspective? If the nascent rate represents early stage activity, we expect it to be higher than the young entrepreneurship rate, because many people who take “some steps” towards starting a business do not actually succeed. We also expect the young entrepreneurship rate to be larger than the formal rate, since many firms are initially established as sole proprietorships and incorporated only at a later stage. Indeed, for the United States, these rates are 8.12%, 4.98% and 2.55%, respectively. This ordering does not, however, hold across developed and developing countries.
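The United States figures quoted above can be used as a small worked example of the expected ordering and of the two spreads. The helper names `spreads` and `follows_expected_ordering` are hypothetical; only the three rates come from the text.

```python
# Rates reported in the text for the United States (percent of adults).
us = {"nascent": 8.12, "baby": 4.98, "corporate": 2.55}


def spreads(nascent: float, baby: float, corporate: float) -> dict:
    """GEM rate less corporate rate, rounded to two decimals."""
    return {
        "SPR_N_C": round(nascent - corporate, 2),
        "SPR_B_C": round(baby - corporate, 2),
    }


def follows_expected_ordering(nascent: float, baby: float, corporate: float) -> bool:
    """True when nascent > baby > corporate, the theoretical expectation."""
    return nascent > baby > corporate


us_spreads = spreads(**us)
print(us_spreads)                       # {'SPR_N_C': 5.57, 'SPR_B_C': 2.43}
print(follows_expected_ordering(**us))  # True
```

Both spreads are positive for the United States, so this country fits the theoretical ordering; a country where the corporate rate exceeds a GEM rate would show a negative value for the corresponding spread.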

In many countries, both developed and developing, the young entrepreneurship rate and the nascent entrepreneurship rate are lower than the formal entrepreneurship rate. This is the case not only in Hong Kong, but also in Latvia, The Netherlands, Norway, Israel, Iceland, New Zealand, Denmark, Sweden, Belgium, etc. In many developed countries, therefore, the formal entrepreneurship rate actually exceeds the young entrepreneurship rate and even the nascent entrepreneurship rate. Even within developing countries, the structure and types of entrepreneurial activities can differ (see Acs and Amoros 2008).

There are several possible explanations. In developing countries, a lower corporate rate might actually represent a shift towards increased formalization of the economy. Newly registered companies may represent some aspect of formalization, where businesses that were not previously LLCs have newly converted their legal status. It is also important to note that the unit of analysis differs between the datasets: GEM measures the number of individual entrepreneurs, possibly overlooking individuals who are involved in multiple new businesses. The WBGES dataset instead measures the number of businesses and can capture this dynamic. However, a possible complication also results from the WBGES measure: Formal entrepreneurship includes both actual businesses and LLCs that are a legal vehicle for purposes other than starting a new business. For instance, entrepreneurs might use registrations to achieve other business ends such as reducing taxes (e.g., shell companies) and avoiding regulatory burdens (e.g., labor laws).Footnote 9 For example, in the United States, firms may register several LLCs as a way to limit liability for different lines of business. In Hong Kong, where the formal rate far surpasses the young business formation rate, all real estate sales are first converted to an LLC to avoid taxes. The incentive to register firms for redundant or non-business activities might be greater in developed countries with more complex (and enforced) tax and regulatory structures.

2.4 Data and summary statistics

The sample for the analysis is a pooled, unbalanced panel of 90 observations across 40 countries with non-missing explanatory variables in both the GEM and WBGES databases for 2003, 2004 and 2005.Footnote 10 Summary statistics are shown in Table 1. The mean spread with nascent entrepreneurs (SPR_N_C) is −0.36%, and the mean spread with young firms (SPR_B_C) is −1.55%, which suggests that on average the two measures are very similar. However, we find a standard deviation of over 4% for both indicators, with maximum values of over 9% and minimum values below −9%, as well as variation across economic and political environments.

Table 1 Variable definitions and summary statistics

We consider a variety of time-varying country characteristics as predictors of entrepreneurial activity. We include log GDP per capita (GDPPC) in all estimations to control for economic development because of the varied levels of development of the countries for which we have data. As an additional explanatory variable, we include the ratio of domestic credit to the private sector as a percentage of GDP as a measure of financial development (DomCredit).

We use four measures of the regulatory barriers: first, an indicator of the difficulty of hiring and firing employees (Labor_Rig); second, the log cost of business registration (Entry_Cost); third, the log number of procedures required to start a business (Entry_Proc); fourth, the ease of closing a business, proxied by the estimated recovery rate claimants can expect following foreclosure or bankruptcy (Rec_Rate). These measures indicate the difficulties in starting, operating and closing a business.

It is important to note that these indicators measure the barriers for a “typical” formal sector firm, which might in part explain the weak relationship with GEM data. For instance, the methodology for entry barriers assumes:

“The business is:

  • A limited liability company.

  • Has start-up capital of ten times income per capita at the end of 2005, paid in cash.

  • Has a turnover of at least 100 times income per capita.”Footnote 11

We expect that these barriers would have a stronger relationship with the formal entrepreneurship rates in the WB database. Furthermore, these indicators might be important predictors of a firm’s decision to operate in the formal versus informal sector.

Next, we include indicators of operational risk, which may proxy for the risks and benefits to individuals of operating a firm in the formal (rather than informal) sector. For instance, we would expect individuals to be less willing to operate illegally (and more likely to pay taxes) in countries where registration laws are enforced, corruption is lower, and the economy is healthy. First, we include an index of political risk (Pol_Risk), which measures corruption, government stability, etc. Second, we include an index of law and order (Law_Order), which measures the efficiency of the legal and judicial system. Third, we include an index of economic risk (Econ_Risk), which measures the economic growth of the country. Fourth, we include a composite risk index, which is an average of political, economic and governmental financial risk and stability.

3 Empirical results

Table 2 shows the correlation matrix of our variables. Univariate tests show significance with all variables except employment laws. An explanation might be that both formal and informal young firms are less likely to hire a large number of employees.Footnote 12 Because of the large and significant correlations among the explanatory variables, the estimations are run separately, while controlling for economic development through log GDP per capita.
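The strategy of entering correlated regressors one at a time, each alongside log GDP per capita, can be sketched as follows. This is an illustration under stated assumptions: the text does not name the estimator, so pooled ordinary least squares is assumed here, the helper `ols` is hypothetical, and all data are synthetic (only the variable names and the sample size of 90 come from the text).

```python
import random


def ols(y, columns):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    `columns` is a list of regressor lists; a constant is prepended.
    Returns the coefficients [intercept, b_1, ..., b_k].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Synthetic data only: these numbers are NOT from GEM, WBGES or Doing Business.
random.seed(0)
n_obs = 90
log_gdppc = [random.gauss(9.0, 1.0) for _ in range(n_obs)]
regressors = {
    "Entry_Proc": [random.gauss(2.0, 0.5) for _ in range(n_obs)],
    "Rec_Rate": [random.gauss(40.0, 20.0) for _ in range(n_obs)],
}
spread = [
    5.0 - 0.8 * g + 1.5 * p + random.gauss(0.0, 1.0)
    for g, p in zip(log_gdppc, regressors["Entry_Proc"])
]

# One estimation per explanatory variable, always controlling for log GDPPC.
for name, x in regressors.items():
    intercept, b_gdp, b_x = ols(spread, [log_gdppc, x])
    print(f"{name}: log GDPPC = {b_gdp:.2f}, {name} = {b_x:.2f}")
```

Running each regressor in its own equation avoids the inflated standard errors that arise when highly correlated variables enter together, at the cost of possible omitted-variable bias in each individual estimate.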

Table 2 Correlation matrix

Figure 2 shows scatter plots and univariate tests of our explanatory variables. We find significant relationships for both the SPR_N_C and SPR_B_C. As expected, the spread between the two measures is negatively related to per capita GDP, composite risk, recovery rate and law and order. It is positively related to the number of procedures needed to register a business and the share of the informal economy.

Fig. 2 Scatter plots of “potential” entrepreneurship

Table 3 shows our estimation results for the spread between nascent and formal entrepreneurship. We find no relationship between this spread and domestic credit, which might suggest that start-ups are less dependent on formal bank financing (and depend more on personal savings). The strongest relationship among our investment climate variables is with closure costs: since the default rate of new firms is very high, firms that expect to get the lowest return on their investment might be least likely to undertake the time and cost of joining the formal sector (and benefiting from formal legal bankruptcy proceedings). We find the interaction terms of entry costs, entry procedures and recovery rates with GDP per capita to be significant: barriers to starting (and closing) a business matter more in lower-income countries. In other words, individuals in developing countries are likely to have incentives to join the formal sector only if entry barriers are low. A possible explanation is that many developing countries host substantial informal sectors, so entrepreneurs are able to operate entirely within the informal economy. For example, the ILO estimates 60 percent of the workforce in Asia to be in the informal sector (ILO 2007). Individuals can start businesses that meet demand, and derive supply, within the informal sector. In such cases, they have little actual need to join the formal sector in order to operate.
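One way to read the interaction results is through a specification of the following general form. This is an assumed functional form for illustration, not the exact equation estimated in Table 3; X stands for one investment climate variable (Entry_Cost, Entry_Proc or Rec_Rate), and the indices c and t for country and year are notational assumptions.

```latex
\begin{align}
\mathrm{SPR}_{c,t} &= \alpha + \beta_1 \ln \mathrm{GDPPC}_{c,t} + \beta_2 X_{c,t}
  + \beta_3 \left( X_{c,t} \times \ln \mathrm{GDPPC}_{c,t} \right) + \varepsilon_{c,t}, \\
\frac{\partial\, \mathrm{SPR}_{c,t}}{\partial X_{c,t}} &= \beta_2 + \beta_3 \ln \mathrm{GDPPC}_{c,t}.
\end{align}
```

In such a form, a significant interaction coefficient means the effect of a barrier on the spread varies with income. The statement that barriers matter more in lower-income countries corresponds to the marginal effect in the second line being larger in magnitude at low values of log GDP per capita.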

Table 3 The effect of the investment climate on “potential” nascent entrepreneurship

Table 4 shows the relationship between the spread with nascent entrepreneurs and measures of country risk. We find a strong and significant relationship with the composite risk index—again, individuals are more likely to choose and succeed in joining the formal sector if the political, economic and financial risks are low. Furthermore, the interaction with law and order is significant.

Table 4 The effect of country risks on “potential” nascent entrepreneurship

Next, we use as our dependent variable the spread between young businesses (both formal and informal) and formal entrepreneurship. We expect this spread to be the largest in countries with weaker business environments (and larger informal sectors). Table 5 shows that in this case, in addition to recovery rates, entry procedures (and their interaction with GDP per capita) are significant, i.e., entry barriers matter. Table 6 shows that law and order (legal and judicial efficiency) is the most important determinant in the decision whether or not to operate in the formal sector and/or to register as a limited-liability company.

Table 5 The effect of the investment climate on “potential” young entrepreneurship
Table 6 The effect of country risks on “potential” young entrepreneurship

The results raise one interesting question. As entry barriers increase, the spread between the informal and the formal sector rises, as expected, and as entry procedures fall, the spread falls. The implication is that barriers to entry are greater for corporate entrepreneurship than for young businesses that have not incorporated or for nascent entrepreneurs who are in the process of starting a business. However, in developed countries, the spread between the informal and formal sectors not only decreases but often reverses sign; i.e., the number of limited-liability companies is greater than the sum of sole proprietors and informal firms. This implies that it is at least as easy to start a limited liability company as a sole proprietorship.

4 Conclusion

The purpose of this paper is to compare two datasets designed to capture entrepreneurial dynamics: the GEM data for early stage entrepreneurial activity and the World Bank Group Entrepreneurship Survey (WBGES) dataset for formal business registration. We find a number of important differences in the data. First, the GEM data tend to report significantly lower levels of early stage entrepreneurial activity in developed countries than the WBGES business entry data. In other words, it is more common to start a formal business in a developed country than a sole proprietorship. Second, the GEM data tend to be higher for developing countries than for developed countries. One possible explanation is that GEM data capture both entrepreneurial intent and informal entrepreneurial activity, particularly in developing countries. However, important exceptions are found, in particular for the United States and Germany. This suggests that firms in developed countries have greater ease and incentives to incorporate, both for the benefits of greater access to formal financing and labor contracts and for tax and other purposes not related to business activities.