Finance and Economics Discussion Series: 2013-20 Screen Reader version

Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis

Fatih Guvenen*
Burhanettin Kuruscu1
Serdar Ozkan2

Keywords: Wage inequality, human capital, skill-biased technical change, tax policies

Abstract:

Wage inequality has been significantly higher in the United States than in continental European countries (CEU) since the 1970s. Moreover, this inequality gap has further widened during this period as the US has experienced a large increase in wage inequality, whereas the CEU has seen only modest changes. This paper studies the role of labor income tax policies for understanding these facts, focusing on male workers. We construct a life cycle model in which individuals decide each period whether to go to school, work, or stay non-employed. Individuals can accumulate skills either in school or while working. Wage inequality arises from differences across individuals in their ability to learn new skills as well as from idiosyncratic shocks. Progressive taxation compresses the (after-tax) wage structure, thereby distorting the incentives to accumulate human capital, in turn reducing the cross-sectional dispersion of (before-tax) wages. Consistent with the model, we empirically document that countries with more progressive labor income tax schedules have (i) significantly lower before-tax wage inequality at different points in time and (ii) experienced a smaller rise in wage inequality since the early 1980s. We then study the calibrated model and find that these policies can account for half of the difference between the US and the CEU in overall wage inequality and 84% of the difference in inequality at the upper end (log 90-50 differential). In a two-country comparison between the US and Germany, the combination of skill-biased technical change and changing progressivity of tax schedules explains all the difference between the evolution of inequality in these two countries since the early 1980s.

JEL Classification: None

1 Introduction

Why is wage inequality significantly higher in the United States than in continental European countries (CEU)? And why has this inequality gap between the US and the CEU widened substantially since the 1970s (see Table 1)? More broadly, what are the determinants of wage dispersion in modern economies? How do these determinants interact with technological progress and government policies? The goal of this paper is to shed light on these questions by studying the impact of labor market (tax) policies on the determination of wage inequality, focusing on male workers and using cross-country data.

We begin by documenting two new empirical relationships between wage inequality and tax policy. First, we show that countries with more progressive labor income tax schedules have significantly lower wage inequality at different points in time.3 The measure of wages we use is "gross before-tax wages" and can therefore be thought of as a proxy for the marginal product of workers.4 From this perspective, progressivity is associated with a more compressed productivity distribution across workers. Second, we show that countries with more progressive income taxes have also experienced a smaller rise in wage inequality over time, and this relationship is especially strong above the median of the wage distribution. These findings reveal a close relationship between progressivity and wage inequality, which motivates the focus of this paper. However, on their own, these correlations fall short of providing a quantitative assessment of the importance of the tax structure--e.g., what fraction of cross-country differences in wage inequality can be attributed to tax policies? For this purpose, we build a model.


Table 1: Log Wage Differential Between the 90th and 10th Percentiles (Male Workers)

Country       1978-1982 average   2001-2005 average   Change
Denmark       -                   0.97                -
Finland       0.89                0.94                0.05
France        1.22                1.14                -0.08
Germany       0.93                1.06                0.07
Netherlands   0.84                1.05                0.11
Sweden        0.73                0.87                0.14
CEU           0.92                1.01                0.06
UK            0.99                1.28                0.29
US            1.28                1.60                0.32

Specifically, we construct a life cycle model that features some key determinants of wages--most notably, human capital accumulation and idiosyncratic shocks. Here is an overview of the framework. Individuals enter the economy with an initial stock of human capital and are able to accumulate more human capital over the life cycle using a Ben-Porath (1967) style technology (which essentially combines learning ability, time, and existing human capital for production). Individuals can choose to either invest in human capital on the job up to a certain fraction of their time or enroll in school where they can invest full time. We assume that skills are general and labor markets are competitive. As a result, the cost of on-the-job investment will be borne by the workers, and firms will adjust the wage rate downward by the fraction of time invested on the job. Therefore, the cost of human capital investment is the forgone earnings while individuals are learning new skills.

We introduce two main features into this framework. First, we assume that individuals differ in their learning ability. As a result, individuals differ systematically in the amount of investment they undertake and, consequently, in the growth rate of their wages over the life cycle. Thus, a key source of wage inequality in this model is the systematic fanning out of the wage profiles.5 Second, we allow for endogenous labor supply choice, which amplifies the effect of progressivity, a point that we return to shortly. Finally, for a comprehensive quantitative assessment, we also allow idiosyncratic shocks to workers' labor efficiency and model differences in consumption taxes and pension systems, which vary greatly across these countries.

The model described here provides a central role for policies that compress the wage structure--such as progressive income taxes--because such policies hamper the incentives for human capital investment. This is because a progressive system reduces after-tax wages at the higher end of the wage distribution compared with the lower end. As a result, it reduces the marginal benefit of investment (the higher wages in the future) relative to the marginal cost (the current forgone earnings), thereby depressing investment. A key observation is that this distortion varies systematically with the ability level--and, specifically, it worsens with higher ability--which then compresses the before-tax wage distribution. These effects of progressivity are compounded by endogenous labor supply and differences in average income tax rates: the higher taxes in the CEU reduce labor supply--and, consequently, the benefit of human capital investment--further compressing the wage distribution.

The main quantitative exercise we conduct is the following. We consider the eight countries listed in Table 1, for which we have complete data for all variables of interest. We assume that all countries have the same innate ability distribution but allow each country to differ in the observable dimensions of its labor market structure, such as in labor income (and consumption) tax schedules and retirement pension system. We then calibrate the model-specific parameters to the US data and keep these parameters fixed across countries. The policy differences we consider explain about half of the observed gap in the log 90-10 wage differential between the US and the CEU in the 2000s and 84% of the wage inequality above the median (log 90-50 differential). The model explains only about 24% of the difference in the lower tail inequality between the US and the CEU, which is consistent with the idea that the human capital mechanism is likely to be more important for higher ability individuals and, therefore, above the median of the distribution. We also provide a decomposition that isolates the roles of (i) the progressivity of income taxes, (ii) average income tax rates, (iii) consumption taxes, and (iv) the pension system. We find that progressivity is by far the most important component, accounting for about 2/3 of the model's explanatory power.

The second question we ask is whether the widening of the inequality gap between the US and the CEU since the late 1970s could also be explained by the same human capital channels discussed earlier. One challenge we face in trying to answer this question is that the country-specific tax schedules that we derive in this paper are only available for the years after 2001 (because the detailed information from OECD sources for taxes is only available after that date), whereas the tax structure has changed over time for several of the countries in our sample. Fortunately, for two countries in our sample--the US and Germany--we are also able to derive tax schedules for 1983, which reveal significantly more flattening of tax schedules in the US compared with Germany from 1983 to 2003. When these changes in progressivity and skill-biased technical change (SBTC) are jointly taken into account, the (recalibrated) model generates a much larger rise in inequality in the US than in Germany, in fact, slightly overestimating the actual widening of the inequality gap between these countries.

Finally, in section 6, we test some key implications of our model for lifecycle behavior using micro data. First, the model predicts that a country with a more progressive tax system should have a flatter age profile of average wages (by dampening human capital accumulation) compared with a less progressive one. Similarly, progressivity will imply a flatter profile of within-cohort wage inequality over the life cycle. We provide a comparison of the United States (using the Panel Study of Income Dynamics, PSID, data) and Germany (using the German Socio-Economic Panel, GSOEP) and find strong support for both predictions.


2 The Model

We begin by describing the features of the human capital investment problem. Using this environment, we discuss the various channels through which tax policy affects wage inequality. We then enrich this framework by introducing empirically relevant features (such as idiosyncratic shocks and labor market institutions) that are necessary for a sound quantitative analysis.

2.1 Human Capital Accumulation

Consider an individual who derives utility from consumption and leisure and has access to borrowing and saving at a constant interest rate,  r. Let  \beta be the subjective time discount factor and assume  \beta(1+r)=1. Each individual has one unit of time in each period, which he can allocate to three different uses: work, leisure, and human capital investment. If an individual chooses to work, he can allocate a fraction ( i) of his working hours ( n) to human capital investment. At age  s, new human capital,  Q_{s}, is produced according to a Ben-Porath technology:

\displaystyle Q_{s}=A^{j}\left(h_{s}i_{s}n_{s}\right)^{\alpha}, (1)

where  h_{s} denotes the individual's current human capital stock and  A^{j} is the learning ability of individual type  j. We assume that skills are general and labor markets are competitive. As a result, the cost of human capital investment is completely borne by workers, and firms adjust the hourly wage rate,  w_{s}, downward by the fraction of time invested on the job:  w_{s}=P_{H}h_{s}(1-i_{s}), where  P_{H} is the price of human capital; labor income is simply  y_{s}=w_{s}n_{s}. Finally, let  \bar{\tau}(y) and  \tau(y) denote, respectively, the average and marginal labor income tax functions. The problem of a type  j individual can be written as
\displaystyle \max_{c_{s},a_{s+1},i_{s}}\sum_{s=1}^{S}\beta^{s-1}u(c_{s},1-n_{s})
\displaystyle \textrm{s.t.}\qquad c_{s}+a_{s+1}=(1-\bar{\tau}(y_{s}))y_{s}+(1+r)a_{s}
\displaystyle h_{s+1}=h_{s}+A^{j}\left(h_{s}i_{s}n_{s}\right)^{\alpha} (2)
\displaystyle y_{s}=P_{H}h_{s}(1-i_{s})n_{s}. (3)

The opportunity "cost of investment" (in human capital units) is equal to  h_{s}i_{s}n_{s} and, using equation (1), it can be written as  C_{j}(Q_{s}^{j})=\left(Q_{s}^{j}/A^{j}\right)^{1/\alpha}, which will play a key role in the optimality conditions that follow.

A key parameter in the Ben-Porath technology is  A^{j}. Heterogeneity in  A^{j} implies that individuals will differ systematically in the amount of human capital they accumulate and, consequently, in the growth rate of their wages over the life cycle. This systematic fanning out of wage profiles is the major source of wage inequality in this model.
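
As a rough illustration of this fanning out, the Python sketch below iterates the law of motion (2) for two ability levels. It is only a sketch under stated assumptions: the value of alpha, the two ability levels, hours, and the investment path are hypothetical numbers chosen for illustration, not the paper's calibration, and the investment path is imposed rather than solved from the individual's problem.

```python
def new_human_capital(A, h, i, n, alpha):
    """Ben-Porath production, equation (1): Q = A * (h * i * n) ** alpha."""
    return A * (h * i * n) ** alpha


def investment_cost(Q, A, alpha):
    """Opportunity cost in human capital units: C(Q) = (Q / A) ** (1 / alpha)."""
    return (Q / A) ** (1.0 / alpha)


def simulate_wages(A, h0, invest_path, n, alpha, price=1.0):
    """Iterate h' = h + Q (equation (2)); record w = P_H * h * (1 - i) each period."""
    h, wages = h0, []
    for i in invest_path:
        wages.append(price * h * (1.0 - i))
        h += new_human_capital(A, h, i, n, alpha)
    return wages


# Hypothetical values, for illustration only (not the paper's calibration).
ALPHA = 0.8
PATH = [0.4, 0.3, 0.2, 0.1, 0.0]  # assumed declining share of hours invested
low = simulate_wages(A=0.05, h0=1.0, invest_path=PATH, n=0.4, alpha=ALPHA)
high = simulate_wages(A=0.30, h0=1.0, invest_path=PATH, n=0.4, alpha=ALPHA)

# Ratio of high-ability to low-ability wages, period by period.
gaps = [w_high / w_low for w_high, w_low in zip(high, low)]
print([round(g, 3) for g in gaps])
```

Both types start from the same wage, and the wage ratio widens in every period in which investment is positive, which is the mechanism the text describes.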


2.2 Inspecting the Mechanisms

We are now ready to discuss how taxation of human capital can affect wage inequality. To this end, it is useful to distinguish between two cases.

Inelastic Labor Supply.

First, suppose that labor supply is inelastic. Assuming an interior solution, the optimality condition for human capital investment is

\displaystyle \left(1-\tau(y_{s})\right)C_{j}^{\prime}(Q_{s}^{j})=\{\beta\left(1-\tau(y_{s+1})\right)+\beta^{2}\left(1-\tau(y_{s+2})\right)+\dots+\beta^{S-s}\left(1-\tau(y_{S})\right)\}, (4)

which equates the after-tax marginal cost of investment on the left-hand side to the after-tax marginal benefit on the right.6 To understand the effect of taxes, first consider the case where taxes are flat rate ( \tau'(y)=0,\:\forall y). In this case, all terms involving taxes cancel out:
\displaystyle C_{j}^{\prime}(Q_{s}^{j})=\{\beta+\beta^{2}+\dots+\beta^{S-s}\}.

Thus, flat-rate taxes have no effect on human capital investment. This is a well-understood insight that goes back to at least Heckman (1976) and Boskin (1977).7

Now consider progressive taxes, i.e.,  \tau'(y)>0. We rearrange equation (4) to get:

\displaystyle C_{j}^{\prime}(Q_{s}^{j})=\{\beta\frac{1-\tau(y_{s+1})}{1-\tau(y_{s})}+\beta^{2}\frac{1-\tau(y_{s+2})}{1-\tau(y_{s})}+\dots+\beta^{S-s}\frac{1-\tau(y_{S})}{1-\tau(y_{s})}\}. (5)

With progressivity, as long as the individual's earnings grow over the life cycle, the tax ratios in (5) will be strictly less than one, depressing the marginal benefit of investment, which in turn dampens human capital accumulation. Thus, these tax ratios capture the reduction in the value of future wage earnings compared with the forgone wage earnings today. This observation motivates our first measure of progressivity, what we refer to as the progressivity wedge, defined as:

\displaystyle PW(y_{s},y_{s+k})\equiv1-\frac{1-\tau(y_{s+k})}{1-\tau(y_{s})}, (6)

between any two ages  s and  s+k. A progressivity wedge of zero corresponds to flat taxes, and progressivity increases with the size of the wedge. In the next section, we empirically measure these wedges from the data.
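
A minimal Python sketch of definition (6) follows. The two marginal tax functions are hypothetical stand-ins introduced only for illustration; they are not the schedules estimated in the paper.

```python
def progressivity_wedge(marginal_tax, y_now, y_future):
    """PW(y_s, y_{s+k}) = 1 - (1 - tau(y_{s+k})) / (1 - tau(y_s)), equation (6)."""
    return 1.0 - (1.0 - marginal_tax(y_future)) / (1.0 - marginal_tax(y_now))


def flat_tax(y):
    """Hypothetical flat marginal rate of 25%."""
    return 0.25


def progressive_tax(y):
    """Hypothetical marginal rate rising with income, capped at 60%."""
    return min(0.10 + 0.10 * y, 0.60)


# Incomes are in multiples of average earnings (an assumption of this sketch).
pw_flat = progressivity_wedge(flat_tax, 0.5, 2.0)
pw_prog = progressivity_wedge(progressive_tax, 0.5, 2.0)
print(pw_flat)  # 0.0
print(round(pw_prog, 4))
```

The flat schedule gives a wedge of exactly zero regardless of its level, while any schedule whose marginal rate rises between the two income levels gives a positive wedge.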

To understand the effect of progressive taxes on wage inequality, note that the distortion created by progressivity differs systematically across ability levels. At the low end, individuals with very low ability whose optimal plan involves no human capital investment in the absence of taxes would experience no wage growth over the life cycle and, therefore, no distortion from progressive taxation. At the top end, individuals with high ability (whose optimal plan implies low wage earnings early in life and very high earnings later) face very large wedges, which depress their investment. Thus, progressivity reduces the cross-sectional dispersion of human capital and, consequently, wage inequality in an economy, even with inelastic labor supply.
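
The following Python sketch illustrates this compression using condition (5) and the cost function C_{j}(Q)=(Q/A^{j})^{1/\alpha}. It rests on strong simplifying assumptions that are not part of the model: the wedge is imposed exogenously and held constant over all future periods (in the model it depends on the endogenous earnings path), and all parameter values are hypothetical.

```python
def marginal_benefit(beta, periods_left, wedge=0.0):
    """Right-hand side of (5) when every future tax ratio equals 1 - wedge."""
    return (1.0 - wedge) * sum(beta ** k for k in range(1, periods_left + 1))


def marginal_cost(Q, A, alpha):
    """C'(Q) for C(Q) = (Q / A) ** (1 / alpha)."""
    return (1.0 / alpha) * A ** (-1.0 / alpha) * Q ** ((1.0 - alpha) / alpha)


def optimal_Q(A, alpha, benefit):
    """Closed-form solution of C'(Q) = benefit."""
    return (alpha * benefit * A ** (1.0 / alpha)) ** (alpha / (1.0 - alpha))


# Hypothetical parameters; each type maps to (ability A, assumed wedge).
BETA, ALPHA, PERIODS = 0.96, 0.8, 30
TYPES = {"low ability": (0.05, 0.02), "high ability": (0.30, 0.25)}

ratios = {}
for name, (A, wedge) in TYPES.items():
    q_flat = optimal_Q(A, ALPHA, marginal_benefit(BETA, PERIODS))
    q_prog = optimal_Q(A, ALPHA, marginal_benefit(BETA, PERIODS, wedge))
    ratios[name] = q_prog / q_flat  # investment relative to the flat-tax case
    print(name, round(ratios[name], 3))
```

Under these assumptions, investment relative to the flat-tax case is (1 - wedge) raised to the power \alpha/(1-\alpha), so the type facing the larger wedge cuts investment proportionally more.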

Endogenous Labor Supply.

Second, consider the case with elastic labor supply. The first-order condition can be shown to be (see Appendix A.1):

\displaystyle C_{j}^{\prime}(Q_{s}^{j})=\{\beta\frac{1-\tau(y_{s+1})}{1-\tau(y_{s})}n_{s+1}+\beta^{2}\frac{1-\tau(y_{s+2})}{1-\tau(y_{s})}n_{s+2}+\dots+\beta^{S-s}\frac{1-\tau(y_{S})}{1-\tau(y_{s})}n_{S}\}, (7)

where now the marginal benefit accounts for the utilization rate of human capital, which depends on the labor supply choice. Our second measure of progressivity is precisely motivated by this first order condition subject to a normalization:
\displaystyle PW_{i}^{*}(y_{s},y_{s+k})=1-\frac{1-\tau(y_{s+k})}{1-\tau(y_{s})}\left(\frac{n_{i}}{n_{\text{avg}}}\right), (8)

where  n_{i} is the hours per person in country  i and  n_{\text{avg}} is the average of  n_{i} across all countries in the sample.8
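
A short Python sketch of the hours-adjusted wedge in (8) follows. The tax rates and hours figures are hypothetical and serve only to show how the normalization by average hours behaves.

```python
def adjusted_wedge(tau_now, tau_future, hours_i, hours_avg):
    """PW*_i = 1 - (1 - tau(y_{s+k})) / (1 - tau(y_s)) * (n_i / n_avg), equation (8)."""
    return 1.0 - (1.0 - tau_future) / (1.0 - tau_now) * (hours_i / hours_avg)


# Hypothetical inputs: marginal rates at the two income levels, hours per person.
same_hours = adjusted_wedge(0.30, 0.30, hours_i=1.0, hours_avg=1.0)
high_hours = adjusted_wedge(0.30, 0.30, hours_i=1.1, hours_avg=1.0)
low_hours = adjusted_wedge(0.30, 0.45, hours_i=0.9, hours_avg=1.0)
print(same_hours, round(high_hours, 3), round(low_hours, 3))
```

With flat taxes and average hours the measure is zero; it turns negative for a country whose hours exceed the sample average and rises when either progressivity is higher or hours are lower.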

Now, once again, consider the effect of flat-rate taxes. The intra-temporal optimality condition for labor-leisure choice implies that labor supply depends negatively on the tax rate and positively on the level of human capital. A higher tax rate depresses labor supply choice (as long as the income effect is not too large), which then reduces the marginal benefit of human capital investment, which reduces the optimal level of human capital. But labor supply in turn depends on the level of human capital, which further depresses labor supply, the level of human capital, and so on. Therefore, with endogenous labor supply, even a flat-rate tax has an effect on human capital investment, which can also be large because of the amplification described here.

In summary, the baseline model studied here implies that countries with more progressive tax systems will have lower wage inequality. As will become clear later, these countries will also experience a smaller change in wage inequality in response to technological changes (such as SBTC). In Section 3, we examine these predictions empirically.

2.3 Enriching the Basic Framework

As stated earlier, the main goal of this paper is to provide a quantitative assessment of the importance of the tax structure--e.g., what fraction of cross-country differences in wage inequality can be attributed to tax policies? For this purpose, we introduce several empirically relevant features that are necessary for a sound quantitative analysis.

Upper Bound on On-the-Job Investment.

We impose an upper bound on the fraction of time that can be devoted to on-the-job investment:  i\in[0,\chi], where  \chi<1. Such an upper bound would arise, for example, when firms incur fixed costs for employing each worker (administrative burden, cost of office space, etc.) or as a result of minimum wage laws. Individuals can invest full-time by attending school ( i=1) and enjoy leisure for the rest of the time. Thus, the choice set is  i\in[0,\chi]\cup\{1\}, which is non-convex when  \chi<1. Finally, human capital depreciates every period at rate  \delta<1.

Idiosyncratic Shocks.

It is difficult to talk about wage inequality without any sort of idiosyncratic shock. In a human capital model, these shocks would interact with investment choice and can greatly affect the quantitative conclusions we draw from the analysis. Thus, we introduce idiosyncratic shocks. Specifically, when an individual devotes  (1-i_{s})n_{s} hours producing for his employer, his effective labor supply becomes  \epsilon n_{s}(1-i_{s}), where  \epsilon is an idiosyncratic Markov shock with a stationary transition matrix  \Pi(\epsilon'\mid\epsilon) that is identical across agents and over the life cycle. Note that these shocks are not to the stock of human capital (as, for example, in Huggett et al. (forthcoming)). Instead, these can be viewed as shocks to the rental rate or to the efficiency of labor supply.

Market Structure.

A full set of one-period Arrow securities is available for trade at every date and state, allowing markets to be dynamically complete. An Arrow security that promises to deliver one unit of consumption good in state  \epsilon' tomorrow costs  q(\epsilon'\vert\epsilon) in state  \epsilon today. Individuals completely insure themselves against consumption risk by trading these securities. Hence, all individuals of a given type  j will have the same (and constant) consumption over the life cycle. However, individuals will have different realized paths of investment, human capital, labor supply, and wages.


Pension Benefits.

It is easy to see from the discussion above of equations (5) and (7) that the existence of a redistributive pension system will have an effect similar to progressive taxation. In addition, the retirement pension system represents a major use of tax revenues collected by governments. Therefore, modeling pensions is important for capturing how funds are returned to households.

During retirement, individuals receive constant pension payments every period. Essentially, the pension of a worker with ability level  j depends on two variables: (i) the average lifetime earnings of workers with the same ability level (denoted by  \overline{y}^{j}), and (ii) the total number of years the worker had Social Security eligible earnings by the time he retired, denoted by  m^{S}. The pension function is denoted as  \Omega(\overline{y}^{j},m^{S}).9

The Tax System and the Government Budget.

The government imposes a flat-rate consumption tax,  \bar{\tau}_{c}, in addition to the (potentially) progressive labor income tax,  \bar{\tau}(y).10 The collected revenues are used for two main purposes: (i) to finance the benefits system, and (ii) to finance government expenditure, G, that does not yield any direct utility to consumers (because of either corruption or waste). The residual budget surplus or deficit,  Tr, is distributed in a lump-sum fashion to all households.

2.4 Individuals' Dynamic Program

Individuals solve the following problem (ability type  j is suppressed for clarity):

\displaystyle V(h,a,m;\epsilon,s)=\max_{c,n,i,a'(\epsilon')}\left[u(c,n)+\beta E\left(V(h',a'(\epsilon'),m';\epsilon',s+1)\mid\epsilon\right)\right] (9)
\displaystyle \textrm{s.t.}
\displaystyle (1+\bar{\tau}_{c})c+\sum q(\epsilon'\mid\epsilon)a'(\epsilon')=(1-\bar{\tau}(y))y+a+Tr, (10)
\displaystyle y=\epsilon h(1-i)n, (11)
\displaystyle h'=(1-\delta)h+A(hin)^{\alpha}, (12)
\displaystyle m'=m+\mathbf{1}\{i<1\;\&\; n\geq n_{\min}\}, (13)
\displaystyle i\in[0,\chi]\cup\{1\},

for s=1,2,\dots,S. Equation (13) shows how individuals accumulate years of service, m. Specifically, individuals get one more year of service credit if they are not in school ( i<1) and are employed for at least a threshold number of hours: n\geq n_{\min}.
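
The one-period transitions in (11)-(13) can be sketched in Python as below. This is only an illustration of the state updates for a given choice of i and n; it does not solve the dynamic program, and the function name `step` and all parameter values in the example are hypothetical.

```python
def step(h, m, i, n, eps, A, alpha, delta, chi, n_min):
    """Apply equations (11)-(13) for one period, given choices (i, n) and shock eps."""
    # Choice set: on-the-job investment in [0, chi], or full-time school (i = 1).
    if not (0.0 <= i <= chi or i == 1.0):
        raise ValueError("i must lie in [0, chi] or equal 1 (school)")
    y = eps * h * (1.0 - i) * n                             # labor income, (11)
    h_next = (1.0 - delta) * h + A * (h * i * n) ** alpha   # human capital, (12)
    m_next = m + (1 if (i < 1.0 and n >= n_min) else 0)     # years of service, (13)
    return y, h_next, m_next


# Hypothetical parameter values, for illustration only.
PARAMS = dict(eps=1.0, A=0.2, alpha=0.8, delta=0.02, chi=0.5, n_min=0.2)
in_school = step(h=1.0, m=3, i=1.0, n=0.5, **PARAMS)
at_work = step(h=1.0, m=3, i=0.2, n=0.4, **PARAMS)
print(in_school)
print(at_work)
```

A period in school yields no labor income and no service credit, whereas a period of work with hours at or above the threshold adds one year of service.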

After retirement, individuals receive a pension and there is no human capital investment. Since there is no uncertainty during retirement, a riskless bond is sufficient for smoothing consumption. Therefore, the problem at age s=S+1,\dots,T can be written as

\displaystyle W^{R}(a,\overline{y}^{j},m^{S};s)=\max_{c,a'}\left[u(c,0)+\beta W^{R}(a',\overline{y}^{j},m^{S};s+1)\right] (14)
\displaystyle \textrm{s.t.}\qquad(1+\bar{\tau}_{c})c+qa'=(1-\bar{\tau}(y_{s}))y_{s}+a+Tr
\displaystyle y_{s}=\Omega(\overline{y}^{j},m^{S}).

The definition of a stationary recursive competitive equilibrium in this environment is standard, so the formal statement is relegated to Appendix A.


3 Progressivity and Inequality: Two Empirical Facts

This section has two purposes. First, we discuss the derivation of country-specific tax schedules that are used in the rest of the paper. Using these tax schedules, we construct empirical measures of the two progressivity wedges defined in (6) and (8) above. Second, with these wedges on hand, we go on to document two new empirical relationships between wage inequality and the progressivity of (labor income) tax policy that are consistent with the presented model and further motivate the quantitative analysis that follows.11


3.1 Deriving Country-Specific Tax Schedules

Figure 1: Average Tax Rate Functions, Selected OECD Countries, 2003
The figure shows three line graphs, one for the US, one for Germany, and the third for Finland. The y-axis shows the average labor income tax rate from -0.1 to 0.6, while the x-axis shows multiples of average earnings in each country from 0 to 8. For the US (left graph), the average labor income tax rate increases steeply between 0 and 2 times average earnings, but levels out at around a 0.35 tax rate for multiples greater than 3. For Germany (middle graph), the average labor income tax rate increases even more steeply than in the US between 0 and 3 times average earnings, and levels out at around a 0.6 tax rate for multiples greater than 4. For Finland (right graph), the average labor income tax rate varies steeply between 0 and 5 times average earnings, and levels out at around a 0.5 tax rate for multiples greater than 6. This figure shows that the tax schedule is most progressive in Finland and least progressive in the US. Germany's tax schedule lies between those of the US and Finland.

For each country, we follow the procedure described here. First, the OECD tax database provides estimates of the total labor income tax for all income levels from half of average wage earnings (hereafter, AW) to two times AW. The calculation takes into account several types of taxes (central government, local and state, social security contributions made by the employee, and so on), as well as many types of deductions and cash benefits (dependent exemptions, deductions for taxes paid, social assistance, housing assistance, in-work benefits, etc.).12 Using these estimates, we calculate the average labor income tax rate, \bar{\tau}(y), for 50%, 75%, 100%, 125%, 150%, 175%, and 200% of AW. However, tax rates beyond 200% of AW are also relevant when individuals solve their dynamic program. Fortunately, another piece of information is available from the OECD: the top marginal tax rate and the top bracket corresponding to it for each country. As described in more detail in Appendix B.1, we use this information to generate average tax rates at income levels beyond two times AW. Then, we fit the following smooth function to the available data points:13

\displaystyle \bar{\tau}(y/AW)=a_{0}+a_{1}(y/AW)+a_{2}(y/AW)^{\phi}. (15)
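
To connect the average tax function in (15) to the marginal rates that enter the wedges, note that total tax is \bar{\tau}(x)x with x=y/AW, so the marginal rate is its derivative, a_{0}+2a_{1}x+a_{2}(1+\phi)x^{\phi}. The Python sketch below evaluates both. The coefficients are hypothetical placeholders chosen to give an increasing, concave average schedule; they are not the estimates reported in Appendix B.1.

```python
def average_rate(x, a0, a1, a2, phi):
    """Average tax rate, equation (15), with x = y / AW."""
    return a0 + a1 * x + a2 * x ** phi


def total_tax(x, a0, a1, a2, phi):
    """Total tax paid (in units of AW): average rate times income."""
    return average_rate(x, a0, a1, a2, phi) * x


def marginal_rate(x, a0, a1, a2, phi):
    """Derivative of total tax with respect to x."""
    return a0 + 2.0 * a1 * x + a2 * (1.0 + phi) * x ** phi


# Hypothetical coefficients, for illustration only.
COEF = dict(a0=0.10, a1=-0.02, a2=0.20, phi=0.5)

# Progressivity wedge PW(0.5, 2.5) implied by the hypothetical schedule.
wedge = 1.0 - (1.0 - marginal_rate(2.5, **COEF)) / (1.0 - marginal_rate(0.5, **COEF))
print(round(average_rate(1.0, **COEF), 3), round(marginal_rate(1.0, **COEF), 3))
print(round(wedge, 3))
```

Because the hypothetical average schedule is increasing, the marginal rate lies above the average rate at every income level, and the implied wedge is positive.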

Figure 2: Progressivity Wedges At Different Income Levels: 1-\frac{1-\tau(k\times0.5)}{1-\tau(0.5)}\;\textrm{for }k=2,3,\dots,6.
This figure shows a line graph with eight lines. In order, starting from highest endpoint: one green line representing Finland, one dark blue representing Denmark, one yellow representing Sweden, one pink representing the Netherlands, one turquoise representing Germany, one red representing France, one dark blue representing the United States, and one black representing the United Kingdom. The y-axis shows progressivity wedges, PW, defined by equation six which measures the progressivity of a tax schedule from 0 to 0.35.  The x-axis shows multiples of average earnings in each country from 0 to 3.5. 
Trends: All eight lines are increasing in a concave fashion, with the lines increasing at a steeper rate in the following order from lowest to highest: UK, US, FRA, GER, NET, SWE, DEN, and FIN. With all data points beginning at (0.5, 0), the UK finishes at roughly (3, 0.14), the US at (3, 0.15), FRA at (3, 0.17), GER at (3, 0.22), NET at (3, 0.24), SWE at (3, 0.27), DEN at (3, 0.28), and FIN at (3, 0.31). This figure shows that the US and the UK have the least progressive tax systems, Scandinavian countries have the most progressive tax systems, and the remaining countries fall between these two groups.

The parameters of the estimated  \bar{\tau}(y) functions for all countries are reported in Appendix B.1, along with the  R^{2} values. Although the assumed functional form allows for various possibilities, all fitted tax schedules turn out to be increasing and concave. The lowest  R^{2} is 0.984 and the mean is 0.991, indicating a very good fit. In Figure 1, we plot the estimated functions for three countries: one of the two least progressive (United States), the most progressive (Finland), and one with intermediate progressivity (Germany).

Figure 2 plots the progressivity wedges computed from the estimated tax schedules for all countries in our sample. Specifically, each line plots PW(0.5,0.5k) for k=1,2,\dots,6, which are essentially the wedges faced by an individual who starts life at half the average earnings in that country and looks toward an eventual wage level that is up to six times his initial wage. As seen in the figure, countries are ranked in terms of their progressivity. Consistent with what one could conjecture, the US and the UK have the least progressive tax systems, whereas Scandinavian countries have the most progressive ones, and larger continental European countries are scattered between these two extremes. The differences also appear quantitatively large (although a more precise evaluation needs to await the quantitative analysis in the next section): for example, the marginal benefit of investment for a young worker in the US who invests today when his wage is 0.5\times AW and expects to earn 2\times AW in the future is 13% lower than in a flat-tax system. The comparable loss is 27% in Denmark and Finland. These differences grow with the ambition level of the individual, dampening human capital investment, especially at the top of the distribution.


3.2 Taxes and Inequality: Cross-Country Empirical Facts

Figure 3: Progressivity Wedge (PW(0.5, 2.5)) and the L90-10 in 2003.
This figure shows a scatter plot with eight points representing countries and a fitted regression line. From top left to bottom right, the points are as follows: US, UK, FRA, GER, NET, SWE, DEN, and FIN. The y-axis shows the Log 90-10 Wage Differential, ranging from 0.9 to 1.6. The x-axis shows the Progressivity Wedge, ranging from 0.12 to 0.3. The regression line is downward sloping, with a correlation, shown above the scatter plot, of -0.82. This figure shows that countries with a smaller wedge have higher wage inequality.

The wage inequality data come from the OECD's Labour Force Survey database and are derived from the gross (before-tax) wages of full-time, full-year (or equivalent) workers.14 This is the appropriate measure for the purposes of this paper, as it more closely corresponds to the marginal product of each worker (and, hence, his wage) in the model. The fact that the inequality data pertain to before-tax wages is important to keep in mind; if the data were for after-tax wages, the correlation between progressivity and inequality would be mechanical and, thus, not surprising at all. Furthermore, we focus on male workers to avoid potential selection issues that may arise due to wide differences in female labor force participation rates across countries.

We normalize AW in each country to 1 and focus on  PW(0.5,2.5) as the measure of progressivity. Similarly, when we calculate  PW^{*} for a given country, we use the average hours per person in that country between 2001 and 2005 for  n_{i} in equation (8), and the average of the same variable across all countries for  n_{\text{avg}}.15 Finally, for brevity, in the rest of the paper we will refer to the "log 90-10 wage differential" simply as "L90-10," and similarly for the other wage differentials.

Figure 3 plots the relationship between L90-10 and the progressivity wedge in the 2000s. Countries with a smaller wedge--meaning a less progressive tax system and, therefore, a smaller distortion in human capital investment--have higher wage inequality. The relationship is also quite strong with a correlation of -0.82.16 (Repeating the same calculation using  PW^{*} yields the same correlation.) Both relationships are consistent with the human capital model with progressive taxes presented above.

Figure 4: Progressivity Wedge* (PW*(0.5, 2.5)) and Change in L90-50 (Left) and L50-10 (Right): 1980 to 2003
This figure shows two scatterplots side by side, each with a regression line. The left plot has seven scatter points representing countries. From top left to bottom right the points are as follows: US, UK, SWE, FIN, NET, FRA, and GER. The y-axis shows Change in Log 90-50 Wage Differential: 1980-2003 and ranges from 0 to 0.2. The x-axis shows Progressivity Wedge* and ranges from -0.05 to 0.3. The right plot has the same seven scatter points. From left to right they are as follows: US, UK, SWE, FIN, NET, FRA, and GER. The y-axis shows Change in Log 50-10 Wage Differential: 1980-2003 and ranges from -0.1 to 0.15. The x-axis shows Progressivity Wedge* and ranges from -0.05 to 0.3. In the left plot, the data are trending downward, with the slope of the regression line being negative, and the correlation, shown above the plot, being -0.91. In the right plot, the data are also trending downward, with the regression line being less negative in slope than in the left plot and the correlation, shown above the plot, being -0.27. This figure shows that countries with a more progressive tax system in the 2000s have experienced a smaller rise in wage inequality since the 1980s. This is especially true at the top of the wage distribution, where the correlation is much stronger.


We next turn to the change in inequality over time. Figure 4 plots  PW^{*} versus the change in L90-50 (left panel) and L50-10 (right panel). Countries with a more progressive tax system in the 2000s have experienced a smaller rise in wage inequality since the 1980s. The relationship is especially strong at the top of the wage distribution and weaker at the bottom: the correlation between progressivity and the change in L90-50 is very strong ( -0.91), whereas the correlation with the change in L50-10 is much weaker (only -0.27). This result is consistent with the idea that the distortion created by progressivity operates most strongly at the upper end, where human capital accumulation is an important source of wage inequality, and less so at the lower end, where other factors, such as unionization and minimum wage laws, could be more important.

Finally, Table 2 gives a more complete picture of the differences between the two definitions of wedges. The top panel reports the correlation of each wedge measure with log wage differentials, which reveals that the adjustment for utilization rates through labor hours makes little difference in the correlations in 2003. Turning to the change in inequality over time (bottom panel), the simple wedge measure has a somewhat lower correlation with log wage differentials. However, adjusting for average hours per person increases these correlations significantly to -0.66 for the L90-10, and to -0.91 for L90-50 (plotted in the left panel of Figure 4). We conclude that progressivity is strongly correlated with inequality both in the cross-section and over time, especially above the median of the distribution.

Overall, these findings reveal a close relationship between progressivity and wage inequality, which motivates the focus of this paper. However, on their own, these correlations fall short of providing a quantitative assessment of the importance of the tax structure. For this purpose, we now take the model to the data.


Table 2: Correlation Between Progressivity Measures and Wage Dispersion
Log wage differentials Measure of Wedge:  PW(0.5,2.5) Measure of Wedge:  PW^{*}(0.5,2.5)
2003: 90-10 -.82 -.82
2003: 90-50 -.84 -.67
2003: 50-10 -.70 -.91
Change from 1980 to 2003: 90-10 -.35 -.66
Change from 1980 to 2003: 90-50 -.58 -.91
Change from 1980 to 2003: 50-10 .13 -.27


4 Parameter Choices

We now discuss the parameter choices for the model. We focus on male workers so as to avoid potential selection issues across countries related to different labor market participation rates for female workers. Our basic calibration strategy is to take the United States as a benchmark and pin down a number of parameter values by matching certain targets in the US data.17 We then assume that other countries share the same parameter values with the US along unobservable dimensions (such as the distribution of learning ability), but differ in the dimensions of their labor market policies that are feasible to model and calibrate (specifically, consumption and labor income tax schedules and the retirement pension system). We then examine the differences in economic outcomes--specifically in wage dispersion and labor supply--that are generated by these policy differences alone.

A model period corresponds to one year of calendar time. Individuals enter the economy at age 20 and retire at 65 ( S=45). Retirement lasts for 20 years and everybody dies at age 85. The net interest rate,  r, is set equal to 2%, and the subjective time discount rate is set to  \beta=1/\left(1+r\right). The curvature of the human capital accumulation function,  \alpha, is set equal to 0.80, broadly consistent with the existing empirical evidence (see Browning et al. (1999, Table 2.3)). In Appendix D, we conduct sensitivity analyses with respect to  \alpha and consider cross-country variation in retirement age  S.

Utility Function.

Preferences over consumption,  c, and leisure time,  1-n, are given by this common separable form:

\displaystyle u(c,n)=\log(c)+\psi\frac{(1-n)^{1-\varphi}}{1-\varphi}. (16)

This specification yields two parameters to calibrate: the curvature of leisure,  \varphi, and the utility weight attached to leisure,  \psi. These parameters are jointly chosen to pin down the average hours worked in the economy, as well as the average Frisch labor supply elasticity. In 2003, the average annual hours worked by American males was 1,890 hours, or approximately 5.2 hours per day (Heathcote et al. (2010, figure 2)). Taking the discretionary time endowment of an individual to be 13 hours per day, we get  \overline{n}=5.2/13=0.4.18

With power utility, the theoretical Frisch elasticity of labor supply is given by  (1-n)/(n\varphi). Because in this model, labor supply,  n, varies across individuals, there is a distribution of Frisch elasticities. We simply target the Frisch elasticity implied by the average labor hours,  \overline{n}. The empirical target we choose is 0.3, which is consistent with the estimates for male workers surveyed by Browning et al. (1999), which range from zero to 0.5.19 As will become clear later, a higher Frisch elasticity improves the performance of our model, so in our baseline case we choose the relatively conservative value of 0.3.
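As a quick check on these numbers, the following minimal sketch evaluates the Frisch formula above at the average hours  \overline{n}. The function name `frisch_elasticity` is our own illustrative choice and is not taken from the paper's code; the two curvature values are the baseline  \varphi=5.0 from Table 3 and the higher-elasticity case  \varphi=3.0 used in the sensitivity analysis.

```python
def frisch_elasticity(n, phi):
    """Frisch elasticity (1 - n) / (n * phi) implied by the leisure term in (16)."""
    return (1.0 - n) / (n * phi)


# Average daily hours: 1,890 annual hours spread over 365 days (about 5.2).
hours_per_day = 1890 / 365
# Share of the 13-hour discretionary time endowment, rounded as in the text.
n_bar = round(hours_per_day / 13, 1)

baseline = frisch_elasticity(n_bar, 5.0)  # baseline curvature (Table 3)
high = frisch_elasticity(n_bar, 3.0)      # higher-elasticity sensitivity case

print(f"n_bar = {n_bar}")
print(f"Frisch at phi = 5.0: {baseline:.2f}")
print(f"Frisch at phi = 3.0: {high:.2f}")
```

Because the elasticity is decreasing in both  n and  \varphi, individuals who work fewer hours than  \overline{n} have a higher implied elasticity than the targeted value.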


Table 3: Baseline Parametrization
  Description Value
Parameter:  \varphi Curvature of utility of leisure 5.0 (Frisch = 0.3)
Parameter:  \psi Weight on utility of leisure 0.20
Parameter:  \alpha Curvature of human capital function 0.80
Parameter:  S Years spent in the labor market 45
Parameter:  T-S Retirement duration (years) 20
Parameter:  r Interest rate 0.02
Parameter:  \beta Time discount factor  1/(1+r)
Parameter:  \delta Depreciation rate of skills (annual)  1.5\%
Parameter:  E\left[h_{0}^{j}\right] Average initial human capital (scaling) 4.95
Parameters calibrated to match data targets:  E\left[A^{j}\right] Average ability 0.195
Parameters calibrated to match data targets:  \sigma\left(h_{0}^{j}\right)/E\left[h_{0}^{j}\right] Coeff. of variation of initial human capital 0.076
Parameters calibrated to match data targets:  \sigma\left[A^{j}\right]/E\left[A^{j}\right] Coeff. of variation of ability 0.396
Parameters calibrated to match data targets:  \gamma Dispersion of Markov shock 0.23
Parameters calibrated to match data targets:  p Transition probability for Markov shock 0.90
Parameters calibrated to match data targets:  \chi Maximum investment time on the job 0.50

Distributions: Learning Ability, Initial Human Capital, and Shocks.

Agents have two individual-specific attributes at the time they enter the economy: learning ability and initial human capital endowment. We assume that these two variables are jointly uniformly distributed in the population and are perfectly correlated with each other.20 Although the assumption of perfect correlation is made partly for simplicity, a strong positive correlation is plausible and can be motivated as follows. The present model is interpreted as applying to human capital accumulation after age 20 and, by that age, high-ability individuals will have invested more than those with low ability, leading to heterogeneity in human capital stocks at that age, which would then be very highly correlated with learning ability. Indeed, Huggett et al. (forthcoming) estimate the parameters of the standard Ben-Porath model from individual-level wage data and find learning ability and human capital at age 20 to be strongly positively correlated (corr: 0.792). Making the slightly stronger assumption of perfect correlation allows us to collapse the two-dimensional heterogeneity in  A^{j} and  h_{0}^{j} into one, speeding up computation significantly.

Therefore, this jointly uniform distribution of  (A^{j},h_{0}^{j}) yields four parameters to be calibrated.  E\left[h_{0}^{j}\right] is a scaling parameter and is simply set to a computationally convenient value, leaving three parameters: (i) the cross-sectional standard deviation of initial human capital,  \sigma\left(h_{0}^{j}\right), (ii) the mean learning ability,  E\left[A^{j}\right], and (iii) the dispersion of ability,  \sigma\left(A^{j}\right). The idiosyncratic shock process,  \epsilon, is assumed to follow a first-order Markov process, with two possible values,  \left\{ 1-\gamma,1+\gamma\right\} , and a symmetric transition matrix with  \Pr(\epsilon'=x\vert\epsilon=x)=p. This structure yields two more parameters,  \gamma and  p, to be calibrated--for a total of five parameters. The sixth and last parameter is  \chi (maximum investment allowed on the job). Finally, because there is measurement error in individual-level wage data, we add a zero mean i.i.d. disturbance to the wages generated by the model (which has no effect on individuals' optimal choices).
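To illustrate how perfect correlation collapses the heterogeneity into one dimension, the sketch below draws  (A^{j},h_{0}^{j}) pairs from a single uniform quantile. It is only an illustration under stated assumptions: the function names are hypothetical, the means and coefficients of variation are the Table 3 values, and the bounds use the standard fact that a uniform distribution with standard deviation  \sigma has half-width  \sqrt{3}\sigma. The paper's own implementation may differ.

```python
import math
import random


def uniform_bounds(mean, cv):
    """Support [lo, hi] of a uniform distribution with the given mean and CV."""
    half_width = math.sqrt(3.0) * cv * mean  # sd of U[lo, hi] is (hi - lo) / sqrt(12)
    return mean - half_width, mean + half_width


def draw_types(n, seed=0):
    """Draw n perfectly correlated (ability, initial human capital) pairs."""
    rng = random.Random(seed)
    a_lo, a_hi = uniform_bounds(0.195, 0.396)  # E[A], CV of A (Table 3)
    h_lo, h_hi = uniform_bounds(4.95, 0.076)   # E[h0], CV of h0 (Table 3)
    types = []
    for _ in range(n):
        u = rng.random()  # one quantile pins down both attributes
        types.append((a_lo + u * (a_hi - a_lo), h_lo + u * (h_hi - h_lo)))
    return types


sample = draw_types(10_000)
mean_a = sum(a for a, _ in sample) / len(sample)
mean_h = sum(h for _, h in sample) / len(sample)
print(f"sample mean ability: {mean_a:.3f}, sample mean h0: {mean_h:.2f}")
```

Each individual is indexed by the single draw  u, so the state space carries one type dimension rather than two.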

Data Targets.

Our calibration strategy is to require that the wages generated by the model be consistent with micro-econometric evidence on the dynamics of wages found in panel data on US households. Specifically, these empirical studies begin by writing a stochastic process for log wages (or earnings) of the following general form:

\displaystyle \log\widetilde{w}_{s}^{j}=\underbrace{\left[a^{j}+b^{j}s\right]}_{\textrm{systematic comp.}}+\underbrace{z_{s}^{j}+\varepsilon_{s}^{j}}_{\textrm{stochastic comp.}} (17)
\displaystyle z_{s}^{j}=\rho z_{s-1}^{j}+\eta_{s}^{j},

where  \widetilde{w}_{s}^{j} is the "wage residual" obtained by regressing raw wages on a polynomial in age; the terms in brackets,  \left[a^{j}+b^{j}s\right], capture the individual-specific systematic (or life cycle) component of wages that results from differential human capital investments undertaken by individuals with different ability levels; and  z_{s}^{j} is an AR(1) process with innovation  \eta_{s}^{j}. Finally,  \varepsilon_{s}^{j} is an iid shock that could capture classical measurement error, which is pervasive in micro data, and/or purely transitory movements in wages. For concreteness, in the discussion that follows, we refer to the two terms in brackets as the "systematic component" of wages and to the latter two terms,  z_{s}^{j}+\varepsilon_{s}^{j}, as the "stochastic component."
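For readers who want to see the process in (17) at work, here is a minimal simulation sketch for a single individual. It is not the estimation procedure used in the studies cited; the function name, the Gaussian shocks, the initial condition  z=0, and the values chosen for  \rho and the innovation standard deviation are all assumptions made purely for illustration. Only the measurement error variance (0.034) and the 2% slope are numbers that appear in this section.

```python
import math
import random


def simulate_log_wage_residual(a, b, rho, sd_eta, sd_eps, periods, seed=0):
    """Simulate log wage residuals a + b*s + z_s + eps_s for s = 0..periods-1.

    z_s follows an AR(1) with persistence rho and Gaussian innovations;
    eps_s is an iid Gaussian shock. z starts from zero (an assumption).
    """
    rng = random.Random(seed)
    z = 0.0
    path = []
    for s in range(periods):
        z = rho * z + rng.gauss(0.0, sd_eta)  # persistent part of the stochastic component
        eps = rng.gauss(0.0, sd_eps)          # transitory part / measurement error
        path.append(a + b * s + z + eps)
    return path


# Hypothetical parameter values, chosen only to make the sketch concrete.
path = simulate_log_wage_residual(
    a=0.0,
    b=0.02,                    # one cross-sectional s.d. of wage growth (2% target)
    rho=0.8,                   # assumed persistence
    sd_eta=0.15,               # assumed innovation s.d.
    sd_eps=math.sqrt(0.034),   # measurement error variance reported for the US
    periods=45,
)
print(f"first five residuals: {[round(x, 3) for x in path[:5]]}")
```

Setting both standard deviations to zero recovers the systematic component  a^{j}+b^{j}s alone, which is a convenient way to separate the two components in a simulated panel.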

We begin with  \varepsilon_{s} and assume that it corresponds to the measurement error in the wage data. This is consistent with the finding in Guvenen and Smith (2009) that the majority of transitory variation in wages is due to measurement error. Based on the results of the validation studies from the US wage data,21 we take the variance of the measurement error to be 10% of the true cross-sectional variance of wages in each country, which yields  \sigma_{\varepsilon}^{2}=0.034 for the United States. We then choose the following six moments from the US data to pin down the six parameters identified earlier:

  1. the mean log wage growth over the life cycle (informative about  E(A^{j})),
  2. the ratio of minimum to mean wage (informative about  \chi),
  3. the cross-sectional dispersion of wage growth rates,  \sigma(b^{j}) (informative about  \sigma(A^{j})),
  4. the cross-sectional variance of the stochastic component (informative about  \gamma),
  5. the average of the first three autocorrelation coefficients of the stochastic component of wages (informative about  p), and
  6. L90-10 in the population (which, together with the previous moments, is informative about  \sigma(h_{0}^{j})).
The target value for the mean log wage growth over the life cycle (i.e., the cumulative growth between ages 20 and 55) is 45%. This number is roughly the middle point of the figures found in studies that estimate lifecycle wage and income profiles from panel data sets, such as the Panel Study of Income Dynamics (PSID); see, for example, Gourinchas and Parker (2002) and Guvenen (2007). The second data moment is the legal minimum wage in the economy relative to the average wage of full-time workers, which, according to the OECD,22 was 0.29 for the US in the early 2000s. The third moment is the cross-sectional standard deviation of wage growth rates,  \sigma(b^{j}). The estimates of this parameter are quite consistent across different papers, regardless of whether one uses wages or earnings. We take our empirical target to be 2%, which represents an average of these available estimates (Baker (1997), Haider (2001), and Guvenen (2009)).

The next two moments capture key statistical properties of the stochastic component of wages in the data. These moments are (i) the unconditional variance of the stochastic component, ( z_{s}+\varepsilon_{s}), as well as (ii) the average of its first three autocorrelation coefficients. The empirical counterparts for these moments are taken from Haider (2001), the only study that estimates a process for hourly wages and allows for heterogeneous profiles. The figure for the unconditional variance can be calculated to be 0.109 and the average of autocorrelations is calculated to be 0.33, using the estimates in Table 1 of Haider's paper. Further details and justifications for these parameter choices are in Appendix C.23

Our sixth, and final, moment is L90-10 in 2003. Adding this moment ensures that the calibrated model is consistent with the overall wage inequality in the US in that year, which is the benchmark against which we measure all other countries. The empirical target value is 1.60 (from the OECD's Labour Force Survey). Table 4 displays the empirical values of the six moments, as well as their counterparts generated by the calibrated model. As can be seen here, all moments are matched fairly well.

One point to note is that the average of the first three autocorrelation coefficients is fairly low (0.33) in part because the stochastic component also includes measurement error, which is iid. The Markov shocks themselves are considerably more persistent, with a first-order annual autocorrelation of 0.80 (implied by  p=0.90, shown in Table 3).
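The link between  p=0.90 and an autocorrelation of 0.80 can be verified directly from the two-state chain. The sketch below is a minimal illustration with a hypothetical function name: it computes the stationary mean, variance, and lag-one covariance of a symmetric chain on  \left\{ 1-\gamma,1+\gamma\right\} , for which the autocorrelation reduces to  2p-1 regardless of  \gamma.

```python
def lag1_autocorrelation(p, gamma):
    """First-order autocorrelation of a symmetric two-state Markov chain.

    States are 1 - gamma and 1 + gamma; p is the probability of staying put.
    """
    states = [1.0 - gamma, 1.0 + gamma]
    transition = [[p, 1.0 - p], [1.0 - p, p]]
    stationary = [0.5, 0.5]  # symmetry implies equal long-run weights

    mean = sum(w * x for w, x in zip(stationary, states))
    variance = sum(w * (x - mean) ** 2 for w, x in zip(stationary, states))
    covariance = sum(
        stationary[i] * transition[i][j] * (states[i] - mean) * (states[j] - mean)
        for i in range(2)
        for j in range(2)
    )
    return covariance / variance


rho = lag1_autocorrelation(p=0.90, gamma=0.23)  # Table 3 values
print(f"lag-1 autocorrelation: {rho:.2f}")
```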


Table 4: Empirical Moments Used for Calibrating Model Parameters
Moment Data Model
Mean log wage growth from age 20 to 55 0.45 0.44
Ratio of minimum to mean wage rate 0.29 0.30
Cross-sectional standard deviation of wage growth rates 2.00% 2.03%
Cross-sectional variance of stochastic component 0.109 0.106
Average of first three autocorrelation coeff. of stochastic component 0.33 0.34
L90-10 in 2003 1.60 1.60
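One simple way to quantify "matched fairly well" is the relative gap between each model moment and its data target. The sketch below uses only the numbers in Table 4; the choice of the absolute relative deviation as the fit metric is ours and is not necessarily the criterion minimized in the calibration.

```python
# (data, model) pairs copied from Table 4.
moments = {
    "Mean log wage growth from age 20 to 55": (0.45, 0.44),
    "Ratio of minimum to mean wage rate": (0.29, 0.30),
    "Cross-sectional s.d. of wage growth rates (%)": (2.00, 2.03),
    "Cross-sectional variance of stochastic component": (0.109, 0.106),
    "Average of first three autocorrelations": (0.33, 0.34),
    "L90-10 in 2003": (1.60, 1.60),
}


def relative_gap(data, model):
    """Absolute deviation of the model moment as a fraction of the data moment."""
    return abs(model - data) / abs(data)


gaps = {name: relative_gap(d, m) for name, (d, m) in moments.items()}
worst = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: {100 * gap:.1f}%")
print(f"largest gap: {worst}")
```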

Benefits System and the Government Budget.

A great deal of variation can be found across countries in the parameters that control the generosity, the duration, and the insurance component of the benefits system.24 We provide the exact formulas for each country in Appendix B.4. Turning to the government budget, the calibration of  G (the surplus wasted by the government) is challenging because of the difficulty of obtaining reliable estimates of its magnitude. In the baseline case, we assume  G=0. So, the government returns all the surplus to households in a lump-sum fashion (Tr). Relaxing this assumption and allowing for  G>0 has very little effect on the results (Appendix D).25

Consumption Taxes.

The average tax rate on consumption is taken from McDaniel (2007), who provides estimates for 15 OECD countries for the period 1950 to 2003 by calculating the total tax revenue raised from different types of consumption expenditures and dividing this number by the total amount of corresponding expenditure. McDaniel (2007) does not provide an estimate for Denmark, so we set this country's consumption tax equal to that of Finland, which has a comparable value-added tax (VAT) rate.

5 Quantitative Results

In this section, we begin by presenting the implications of the calibrated model for wage inequality differences across countries at a point in time. We then provide decompositions that quantify the separate effects of progressivity, average income tax rates, consumption taxes, and the pension system on these results. We next turn to the change in inequality over time and provide a comparison between the United States and Germany from 1983 to 2003. The model statistics below are computed from 10,000 simulated lifecycle paths for individuals drawn from the joint probability distribution of  (A^{j},h_{0}^{j}).

5.1 Cross-Sectional Results: the 2000s

Figure 5 plots L90-10 for each country in the data against the value predicted by the calibrated model. The correlation between the simulated and actual data is 0.91 (and the countries line up nicely along the regression line), suggesting that the model is able to capture the relative ranking of these eight countries in terms of overall wage inequality observed in the data. To explore how the model fares at different parts of the wage distribution, the middle panel of Figure 5 repeats the same exercise for L90-50 and the bottom panel does the same for L50-10. In both cases, the model-data correlations are high: 0.85.

Figure 5: Wage Dispersion: Model versus Data
Panels: [L90-10] [L90-50] [L50-10]
This figure shows three scatter plots, each with a regression line. All three show the same eight countries. The y-axis on all three plots is labeled Data. The x-axis on all three plots is labeled Model. In the left plot, labeled L90-10, the countries, from bottom left to top right, are as follows: DEN, FIN, SWE, GER, NET, FRA, UK, and US. The y-axis ranges from 0.8 to 1.6. The x-axis ranges from 1.2 to 1.65. In the middle plot, labeled L90-50, the countries, from bottom left to top right, are as follows: DEN, FIN, SWE, GER, NET, FRA, UK, and US. The y-axis ranges from 0.55 to 0.85. The x-axis ranges from 0.7 to 0.95. In the right plot, labeled L50-10, the countries, from bottom left to top right, are as follows: DEN, FIN, SWE, GER, NET, FRA, UK, and US. The y-axis ranges from 0.35 to 0.8. The x-axis ranges from 0.5 to 0.68. In all three plots, the regression line is upward sloping, with the left plot having a steeper slope than the middle and right plots. The correlations between the model and the data, shown above each plot, are, from left to right, 0.91, 0.85, and 0.85. This figure shows that the model is able to capture the relative ranking of these eight countries in terms of overall wage inequality observed in the data.

In Table 5, we quantify the importance of taxes for cross-country differences in inequality. The first two columns report L90-10 in the data for all countries, first in levels (column a) and then expressed as a deviation from the US, which is our benchmark country (column b). For example, in Denmark L90-10 is 0.97, which is 0.63 (i.e., 63 log points) lower than that in the US. Columns (c) and (d) display the corresponding statistics implied by the calibrated model. Again, for Denmark, the model generates an L90-10 that is 0.38 below what is implied by the model for the US. Therefore, the model accounts for 60% ( =38/63) of the difference in L90-10 between the US and Denmark, reported in column (e). Similar comparisons show that the model does quite well in explaining the level of wage inequality in Germany but poorly in explaining the UK. Among the CEU countries, the fraction explained by the model ranges from 35% for France to 60% for Denmark. Overall, the model accounts for 48% of the actual gap in inequality between the US and the CEU in 2003.

To see which part of the wage distribution is better captured by the model, the next two columns display the same calculation performed in column (e), but now separately for L90-50 (f) and L50-10 (g). For all countries in the CEU, the model explains the upper tail inequality much better than the lower tail inequality. For example, for Denmark, the model explains 97% of L90-50 versus only 31% of L50-10. In fact, the model accounts for at least 65% of L90-50 for all countries in the CEU, averaging 84% across all countries, whereas it accounts for on average only 24% of L50-10.26 That our model does a better job at explaining inequality at the upper end (above the median) will be a recurring theme of this paper. This finding is consistent with the idea that progressive taxation affects the human capital investment of high-ability individuals more than others and, therefore, the mechanism is more effective above the median of the wage distribution. Finally, a notable exception to these generally strong findings is the UK, which is an important outlier: the model explains very little of the difference between the UK and US at the upper tail (6% to be exact) and only slightly more (13%) at the lower end.


Table 5: Measures of Wage Inequality: Benchmark Model versus Data
  L90-10 Data Level (a) L90-10 Data  \Delta from US (b) L90-10 Model Level (c) L90-10 Model  \Delta from US (d) L90-10 % explained (d)/(b): (e) L90-50 % explain. (f) L50-10 % explain. (g)
Denmark 0.97 0.63 1.22 0.38 0.60 0.97 0.31
Finland 0.94 0.66 1.27 0.33 0.49 0.78 0.25
France 1.14 0.46 1.44 0.16 0.35 1.23 0.12
Germany 1.06 0.54 1.29 0.30 0.56 0.90 0.28
Netherlands 1.05 0.55 1.36 0.24 0.43 0.65 0.23
Sweden 0.87 0.73 1.28 0.31 0.43 0.75 0.26
CEU 1.00 0.59 1.31 0.29 48% 84% 24%
UK 1.28 0.32 1.56 0.03 10% 6% 13%
US 1.60 0.00 1.60 0.00      
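The "% explained" entries in column (e) are simply the ratio of the model gap (d) to the data gap (b). The following sketch reproduces that arithmetic for the CEU countries using the rounded entries of Table 5; the dictionary and function names are illustrative. Because the inputs here are rounded to two decimals, a few ratios differ from the published column by about one percentage point, which presumably reflects unrounded inputs in the original calculation.

```python
# Country: (data gap from US, model gap from US), columns (b) and (d) of Table 5.
table5_gaps = {
    "Denmark": (0.63, 0.38),
    "Finland": (0.66, 0.33),
    "France": (0.46, 0.16),
    "Germany": (0.54, 0.30),
    "Netherlands": (0.55, 0.24),
    "Sweden": (0.73, 0.31),
}


def fraction_explained(data_gap, model_gap):
    """Share of the US-country gap in L90-10 that the model reproduces."""
    return model_gap / data_gap


explained = {
    country: fraction_explained(data_gap, model_gap)
    for country, (data_gap, model_gap) in table5_gaps.items()
}

for country, share in explained.items():
    print(f"{country}: {100 * share:.0f}%")
```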

Decomposing the Effects of Different Policies.

The baseline model incorporates several differences between the labor market policies of the US and those of the CEU countries. Here, we quantify the separate roles played by each of these components for the results presented in the previous section. We conduct three decompositions. First, we assume that countries in the CEU have the same retirement pension system as the US but differ in all other dimensions considered in the baseline model. This experiment separates the role of the tax system for wage inequality from that of the pension system. Second, we also set the consumption taxes of each country equal to that in the US, but each country retains its own income tax schedule as in the baseline model. This experiment quantifies the explanatory power of the model that is coming from the income tax system alone. Third, we go one step further and assume that each country keeps the same progressivity of its income tax schedule but is identical in all other ways to the US, including the average income tax rate. This experiment isolates the role of progressivity alone. In each case, we adjust the lump-sum transfers to balance the government's budget.

Table 6 reports the results. First, in column (2), we assume that all countries have the same pension system as the US. In panel A, the correlation between the data and model is little changed from the baseline case for all parts of the wage distribution. Turning to panel B, the fraction of the US-CEU difference explained by the model goes down, but only slightly, indicating that roughly 95% of the model's explanatory power is coming from taxes (both income and consumption taxes). Next, in column (3), we also eliminate the differences in consumption taxes across countries. The model-data correlations go further down but, again, somewhat modestly. In panel B, the explanatory power of the model that is attributable to income taxes alone ranges from 75% to 80% for the three measures of wage inequality. The difference between columns (2) and (3) provides a useful measure of the role of consumption taxes, which account for about 17% (  =96\%-79\%) of the model's explanatory power for L90-10.


Table 6: Decomposing the Effects of Different Policies
Diff. from Benchmark: Benchmark (1) All taxes (2) Lab. Inc. Tax (3) Progressivity (4)
Progressivity -- -- -- --
Average income taxes -- -- -- set to US
Consumption tax -- -- set to US set to US
Benefits institutions -- set to US set to US set to US
A. Correlation Between Data and Model: 90-10 0.91 0.90 0.85 0.88
A. Correlation Between Data and Model: 90-50 0.85 0.87 0.85 0.87
A. Correlation Between Data and Model: 50-10 0.85 0.84 0.78 0.81
B. Fraction of US-CEU Difference Explained by Model: 90-10 0.48 0.46 (96%)  ^{\textrm{a}} 0.38 (79%) 0.32 (67%)
B. Fraction of US-CEU Difference Explained by Model: 90-50 0.84 0.79 (94%) 0.67 (80%) 0.55 (66%)
B. Fraction of US-CEU Difference Explained by Model: 50-10 0.24 0.23 (96%) 0.18 (75%) 0.16 (67%)
 ^{\textrm{a}}The numbers in parentheses express the fraction explained by the model in each column as a percentage of the benchmark case reported in column (1).
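The percentages in parentheses in panel B, and the roughly 17% contribution attributed to consumption taxes, follow from simple ratios to the benchmark column. The sketch below reproduces them for L90-10 from the rounded Table 6 entries; variable and function names are illustrative only.

```python
# Panel B of Table 6, L90-10 row: fraction of the US-CEU gap explained in
# columns (1) benchmark, (2) all taxes, (3) labor income tax, (4) progressivity.
l90_10 = [0.48, 0.46, 0.38, 0.32]


def share_of_benchmark(row):
    """Express each column as a percentage of the benchmark column (1)."""
    return [100.0 * value / row[0] for value in row]


shares = share_of_benchmark(l90_10)

# Contribution of each policy block, as a share of the benchmark explanatory power.
pension_share = shares[0] - shares[1]          # removed between columns (1) and (2)
consumption_tax_share = shares[1] - shares[2]  # removed between columns (2) and (3)
average_tax_share = shares[2] - shares[3]      # removed between columns (3) and (4)
progressivity_share = shares[3]                # what remains in column (4)

print([round(s) for s in shares])
print(f"consumption taxes: {consumption_tax_share:.0f}%")
print(f"progressivity alone: {progressivity_share:.0f}%")
```

As the text notes, this decomposition depends on the order in which policy differences are removed, so the individual contributions should be read as order-specific.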

Next, we investigate whether the power of income taxes comes from differences in the average rates across countries or from differences in the progressivity structure. In other words, if continental Europe differed from the US only in the progressivity of its labor income tax system--but had the same average tax rate on labor income--how much of the differences in wage inequality found in the baseline model would still remain? To answer this question, we proceed as follows. First, adjusting the average tax rate to the US level--without affecting progressivity--requires some care. We show in Appendix B.2 how this can be accomplished. Then, using these hypothetical tax schedules, we solve each country's problem, assuming that all countries have identical labor market policies (set to the US benchmark) and their tax schedules generate the same average tax rate as in the US when using individuals' choices made using the US income tax schedule. In panel B of column 4, we see that progressivity alone is responsible for 2/3 of the explanatory power of the model for L90-10.

Notice that the decomposition we conducted here is not invariant to the order in which different features are eliminated. So, a valid question is whether this conclusion--that average tax rate differences do not matter much--is robust to changing this order. To investigate this, we repeated the last experiment reported in column 4, but instead of eliminating average tax rate differences and keeping progressivity intact, we flipped the order (same progressivity as the US, but match each country's average tax rate). In this case, the model only accounts for 14% of L90-10 differences, 20% of L90-50, and 10% of L50-10. This experiment confirms our previous conclusion that average tax rate differences are responsible for only a small fraction of the differences in wage inequality.

In summary, the pension system and consumption taxes together are responsible for about 20% of the model's explanatory power. The more important finding concerns the role of progressivity, which, for all practical purposes, is the key component of the income tax structure for understanding wage inequality differences. Differences in the average income tax rate do not appear to be very important for inequality differences.


The Role of Labor Supply Elasticity.

We now conduct two sensitivity analyses with respect to the value of labor supply elasticity: we consider (i) the case with a high Frisch elasticity of 0.5 and (ii) the case with only an extensive margin:  n\in\{0,0.40\}. In each case, the model is recalibrated to match the same six targets in Table 4. (Appendix D contains further sensitivity analyses with respect to the values of  \alpha,  \delta,  \chi,  G, as well as the treatment of capital income taxes.)


Table 7: Effect of Labor Supply Elasticity on Wage Inequality Differences
Country Frisch = 0.5: L90-10 (a) Frisch = 0.5: L90-50 (b) Frisch = 0.5: L50-10 (c) Discrete hours  n\in\{0,0.40\}: L90-10 (d) Discrete hours  n\in\{0,0.40\}: L90-50 (e) Discrete hours  n\in\{0,0.40\}: L50-10 (f)
Denmark 0.69 1.07 0.40 0.34 0.53 0.21
Finland 0.57 0.88 0.31 0.29 0.43 0.17
France 0.39 1.32 0.16 0.17 0.56 0.07
Germany 0.68 1.01 0.40 0.29 0.42 0.17
Netherlands 0.48 0.70 0.27 0.27 0.38 0.17
Sweden 0.52 0.87 0.33 0.22 0.38 0.15
CEU 57% 94% 31% 26% 44% 16%
UK 13 6 17 2 3 6

In the first experiment we set  \varphi=3.0, which implies a Frisch elasticity of 0.5. Table 7 reports the counterpart of the analysis we conducted for the benchmark model and reported in Table 5. Comparing the two tables makes it clear that a higher Frisch elasticity improves the model's explanatory power across the board. Now the model can explain 57% of the US-CEU difference in L90-10 (compared with 48% in the benchmark case) and 94% of the upper tail inequality (from 84% before). However, the improvement in L50-10 is modest, going from 24% in the benchmark case up to 31%.

To better understand the role of the intensive margin of labor supply, we now examine another case where workers can only choose between full-time employment at fixed hours ( n=0.40) and nonemployment. The parameters of the utility function are the same as in the baseline case. The results are reported in the last three columns of Table 7. Without the amplification provided by an intensive margin--and the resulting dispersion in hours across countries--the explanatory power of the model falls and, in some cases, it falls significantly. For example, the model accounts for 26% of the difference in L90-10. For the upper-end inequality, the difference is even larger: the model now explains 44%, half of the baseline value, and also much lower than the 94% in the high Frisch case. Finally, the already low explanatory power at the lower tail falls further from 24% in the baseline case to 16%.

These findings underscore the importance of the interaction of endogenous labor supply choice (with an intensive margin) with progressive taxation for understanding wage inequality differences across countries, especially above the median of the distribution.

How to Introduce SBTC?

As noted earlier, in the standard Ben-Porath model studied so far, the price of human capital  (P_{H}) was simply a scaling factor and had no effect on any implication of the model, which is why we normalized it to 1 above. This is an important shortcoming when the goal is to study the changes in human capital investment over time in response to changes in the value of human capital, due to, for example, SBTC. Guvenen and Kuruscu (2010) proposed a tractable way to extend the Ben-Porath model that overcomes this difficulty. This extension basically involves introducing a second factor of production--raw labor ( \ell)--in addition to human capital,  h. The key assumption is that, unlike human capital, raw labor cannot be accumulated over the life cycle (it is fixed). Individuals supply both factors of production for a total hourly wage of  \left(P_{H}h_{s}+P_{L}\ell\right)(1-i_{s}) at age  s, where  P_{L} is now the price (wage) of raw labor. With this two-factor structure, a rise in  P_{H} does increase human capital investment. So SBTC could be modeled as a rise in  P_{H} over time with  P_{L} fixed. The formal statement of this model along with the calibration of SBTC are presented in Appendix D.7. (All parameters other than  P_{H} remain essentially unchanged in calibration.)
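
The role of the second factor can be illustrated with a minimal sketch of the two-factor wage formula above. It holds human capital fixed, so it captures only the direct price effect of a rise in  P_{H} on relative wages, not the investment response studied in the paper. All numerical values and function names are hypothetical and chosen only for illustration.

```python
import math


def hourly_wage(p_h, h, p_l, ell, invest_share=0.0):
    """Two-factor hourly wage (P_H * h + P_L * ell) * (1 - i)."""
    return (p_h * h + p_l * ell) * (1.0 - invest_share)


def log_wage_gap(p_h, h_high, h_low, p_l, ell):
    """Log wage difference between a high- and a low-human-capital worker."""
    high = hourly_wage(p_h, h_high, p_l, ell)
    low = hourly_wage(p_h, h_low, p_l, ell)
    return math.log(high) - math.log(low)


# Hypothetical values, not taken from the paper's calibration.
H_HIGH, H_LOW, P_L = 2.0, 1.0, 1.0

# One-factor case (ell = 0): P_H is a pure scaling factor and cancels out.
gap_one_factor_before = log_wage_gap(1.0, H_HIGH, H_LOW, P_L, ell=0.0)
gap_one_factor_after = log_wage_gap(1.5, H_HIGH, H_LOW, P_L, ell=0.0)

# Two-factor case (ell > 0): a higher P_H widens the log wage gap.
gap_two_factor_before = log_wage_gap(1.0, H_HIGH, H_LOW, P_L, ell=1.0)
gap_two_factor_after = log_wage_gap(1.5, H_HIGH, H_LOW, P_L, ell=1.0)

print(gap_one_factor_before, gap_one_factor_after)
print(gap_two_factor_before, gap_two_factor_after)
```

With raw labor set to zero the log gap does not depend on  P_{H}, which is the scaling property noted above. With a fixed, positive amount of raw labor, the same rise in  P_{H} raises the relative wage of the worker with more human capital.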

Comparing the United States and Germany.

Figure 6: Progressivity Wedges at Different Income Levels: US vs. Germany, 1983 and 2003
This figure shows a line graph with four lines. Two separate red lines show data on Germany. One represents Germany in 1983, the other shows 2003. Two separate black dotted lines show data on the United States. One represents the United States in 1983, the other shows 2003. The y-axis shows progressivity wedges, PW, defined by equation 6 which measures the progressivity of a tax schedule, from 0 to 0.35.  The x-axis shows multiples of average earnings in each country from 0 to 3.5.
All four lines are concave down and increasing. All four lines have the same beginning point of (0.5, 0.05). Their end points are roughly as follows: US-1983 (3, 0.31), GER-1983 (3, 0.26), GER-2003 (3, 0.22), and US-2003 (3, 0.15). In 1983, the US and Germany had similar progressivity of the tax structure, with data beyond twice the average earning level showing the US actually had a more progressive system. This trend is reversed by 2003, with Germany having a more progressive system along the entire curve.
This figure shows that tax schedules became much less progressive in the U.S. compared with Germany between 1983 and 2003.

The procedure for constructing the 1983 tax schedules is described in Appendix B.3 and the resulting progressivity wedges are shown in Figure 6. As seen here, in 1983 the progressivity of the tax structure was similar in the US and Germany up to about twice the average earnings level. Above this point, the US actually had the more progressive system. Over time, the US became much less progressive, whereas the change in Germany was more gradual, making the US tax schedule much flatter than that of Germany by 2003.

Using these schedules, we conduct three experiments.27 In the first experiment, we assume that the tax schedules remained fixed throughout this period. We choose one parameter that controls the skill bias of technology,  P_{H}, to match the 32 log points rise in L90-10 in the US during the period. Note from column (1) of Table 8 that, in the data, L90-10 rose by only 13 log points in Germany during the same period. Turning to the model and assuming that Germany has been subject to the same SBTC as the US, the model generates a rise of 19 log points in L90-10 for Germany. Thus, whereas the inequality gap widens in the data by  32-13=19 log points, the model predicts  32-19=13 log points, explaining 68% (13/19) of the observed difference in the data.


Table 8: US vs Germany: Changing Tax Schedules and Changing Inequality
Taxes (SBTC) Data (1) Fixed (Calibrated to US) Model (2) Changing (Fixed) Model (3) Changing (Calibrated to US) Model (4)
Panel A: Change in L90-10 US 0.32 0.32 ^{a} 0.21 0.32 ^{a}
Panel A: Change in L90-10 GER 0.13 0.19 0.01 0.09
Panel A: Change in L90-10  \Delta(US-GER) 0.19 0.13 0.20 0.22
Panel B: Change in L90-50 US 0.22 0.23 0.15 0.23
Panel B: Change in L90-50 GER 0.05 0.14 0.01 0.06
Panel B: Change in L90-50  \Delta(US-GER) 0.17 0.09 0.14 0.17
Panel C: Change in L50-10 US 0.10 0.09 0.06 0.09
Panel C: Change in L50-10 GER 0.07 0.05 0.00 0.03
Panel C: Change in L50-10  \Delta(US-GER) 0.02 0.04 0.06 0.06
 ^{a}SBTC ( P_{H}) calibrated so that the model matches the rise in L90-10 for the US exactly.

Second, in column (3), we consider the case where the only change over time is in the tax schedules. We do not recalibrate any parameter to match targets in 1983. In the US, L90-10 rises substantially--by 21 log points--with no SBTC. Hence, the flattening of the tax schedule alone accounts for a significant fraction (about 2/3) of the rise in US wage inequality during this time. To our knowledge, this result is new in the literature. In contrast to the US, wage inequality barely changes (by 1 log point) in Germany. This experiment suggests that the dramatic fall in progressivity in the US and the small change in Germany alone could explain almost all of the widening inequality gap! Third, we now incorporate the change in tax schedules and re-calibrate SBTC such that we match the change in L90-10 for the US.28 Now, L90-10 rises by 9 log points in Germany. Thus, the model slightly over-explains--by 16% (  =0.22/0.19-1.0)--the widening gap in the data.
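
The share-explained arithmetic used for Panel A can be reproduced with the short sketch below. The gap values are copied from the  \Delta(US-GER) row of Table 8; the function and variable names are introduced here only for illustration.

```python
# Change in the US-Germany L90-10 gap, from the Delta(US-GER) row of
# Table 8, Panel A (in log points divided by 100).
DATA_GAP = 0.19
MODEL_GAPS = [
    ("(2) fixed taxes, SBTC calibrated to US", 0.13),
    ("(3) changing taxes, no SBTC", 0.20),
    ("(4) changing taxes, SBTC calibrated to US", 0.22),
]


def share_explained(model_gap, data_gap):
    """Fraction of the observed widening of the gap generated by the model."""
    return model_gap / data_gap


shares = [share_explained(gap, DATA_GAP) for _, gap in MODEL_GAPS]

for (label, _), share in zip(MODEL_GAPS, shares):
    print(f"{label}: {share:.0%} of the observed widening")
```

A share above 100% means that the model generates a larger widening than is observed in the data, as in the third experiment.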

Panels B and C of the table explore how much of the widening gap has occurred at the top and bottom of the distribution. In the data, the L90-50 gap between the US and Germany rose by 17 log points, whereas the L50-10 gap increased by only 2 log points. Therefore, a remarkable fact is that virtually all of the rise in the inequality gap occurred because top-end inequality increased much more in the US (by 0.22) than in Germany (by 0.05). This observation strongly indicates that to understand the widening inequality gap, one needs to understand the economic forces that operate above the median of the wage distribution--and the human capital channels studied here provide one important candidate. To quantify these human capital effects, we turn to column (4): the model generates the same 17 log points rise in the L90-50 gap as in the data, and overstates the L50-10 gap observed in the data by 4 log points.

While these results are encouraging, a caveat must be noted. Wage inequality in 1983 depends not only on the tax schedule in 1983, but also on the tax schedules that were in place several years prior, since the dispersion in human capital across individuals results from investments made in previous years. Clearly, the same comment applies to 2003. Although in our exercise we do not account for this fact, it is not clear which way this biases the results. This is because the US tax system was even more progressive before the Economic Recovery Tax Act of 1981, whereas the progressivity change in the years preceding 2003 (say, from 1990 to 2003) was more modest. Therefore, if we were to use a time average of tax schedules in our exercise (say, 1973 to 1983 and 1993 to 2003), we conjecture that the reduction in progressivity over time could be larger than we assumed in the experiment just described (which would attribute an even larger role to taxes). A more complete examination of this issue is an exciting topic for future research.

6 Microeconomic Evidence on the Mechanism

The model also makes predictions for how the lifecycle profile of wages and hours varies across countries. In particular, because progressivity dampens human capital investment, average wages should grow more slowly over the life cycle in the CEU. Similarly, because progressivity compresses the cross-sectional distribution of human capital investment, wage inequality should rise less over the life cycle in the CEU. Testing these two predictions requires panel data on wages (to disentangle the age profile from time or cohort effects), which is difficult to obtain on a comparable basis for the CEU countries in our sample.29 An exception is the German Socio-Economic Panel (GSOEP), which includes information on wages and hours of German individuals and is available to outside (non-European Union) researchers. In this section, we make use of this dataset and the PSID for the United States to provide a two-country comparison of lifecycle profiles.

Figure 7: Lifecycle Profile of Mean Log Wages: US vs Germany
This figure shows a line graph with two lines: a blue line representing the United States and a red line representing Germany. The y-axis shows mean log wage and ranges from 0 to 0.45; the x-axis shows age in intervals of 5, ranging from 25 to 55. Both lines curve upwards, though the blue line rises higher and is steeper than the red. The red line rises more flatly and levels out beginning at age 45 at a mean log wage of 0.2. This figure shows that Germany, which has a more progressive tax system, has a flatter average wage profile than the U.S., which has a less progressive tax system.


6.1 Wages and Hours over the Lifecycle: US vs Germany

We focus on male workers who are between 25 and 55 years of age to minimize the effects of early retirement behavior and the consequent fall in employment rates at later ages. The PSID data cover 1968-1992 and the GSOEP data cover 1984 to 2007.

Wages.

Figure 7 plots the lifecycle profile of mean log wages in the US and Germany. The profiles are extracted from panel data by cleaning cohort effects following the usual procedure in the literature; see Appendix E for details. As seen in the figure, from age 25 to 55 the average wage profile rises by 36 log points in the US, but by only 21 log points in Germany, consistent with the prediction of the model that a more progressive tax system generates a flatter average wage profile. Next, figure 8 plots the lifecycle profile of wage inequality (again controlled for cohort effects) for the two countries. In the US, the variance of log wages rises by 26 log points, compared to 15 log points for Germany. Again, inequality rises more over the lifecycle in the less progressive country, consistent with the mechanism in the model.

Although in figure 8 we normalized the intercept to zero (to help visual comparison), a relevant question is how much wage inequality there is at the time workers enter the labor market. To answer this question, we compute the variance of log wages for workers between ages 23 and 27 and find it to be very similar in both countries: 0.251 in the US and 0.260 in Germany.30 This implies that virtually all the difference in wage inequality between Germany and the United States documented in the previous section is generated by the faster rise of inequality over the lifecycle in the US compared to Germany and almost none is due to differences in initial inequality. (Incidentally, this finding is also reassuring, given that our model assumes identical inequality at age 20.)

Finally, instead of controlling for cohort effects as we did above, one can alternatively control for time effects. Using this approach, mean log wages rise by 0.37 in the US compared with 0.27 in Germany. Inequality rises by 0.12 in the US compared with only 0.02 in Germany. Thus, while the magnitudes change, the rankings of the two countries remain the same under this alternative approach.31

Figure 8: Within-Cohort Variance of Log Wages: US vs Germany
This figure shows a line graph with two lines, a blue line representing the US (PSID) and a red dashed line representing Germany (GSOEP). The y-axis shows the variance in log wages and ranges from 0-0.3. The x-axis shows age and ranges from 25-55. Both lines are trending up; however, the US line is trending up more rapidly than the German line. This figure shows inequality rises more rapidly over the lifecycle in the less progressive country.

A complementary piece of evidence is presented in Domeij and Floden (2010) from Sweden. These authors construct the analog of figure 8 for Sweden and find that the rise in wage inequality over the life cycle is much smaller than in both the US and Germany.32 Given the high progressivity of income taxes in Sweden compared with the US and Germany, this outcome is exactly what is predicted by the present model.

Labor Hours.

We begin with the dispersion in hours. In Germany (GSOEP), the standard deviation of log hours is 0.369 compared with 0.324 in the United States (PSID).33 It is a well-known fact that incomplete markets models without preference heterogeneity severely understate the level of hours inequality (cf. Erosa et al. (2009)) and our model is no exception. In the model,  \sigma(\log(n))=0.112 in the US and 0.128 in Germany.34 Despite missing on the levels, the model is consistent with the fact that hours inequality is somewhat higher in Germany than in the US.

At first blush, it may seem surprising that the model implies higher dispersion in the more progressive country. The reason has to do with lump-sum transfers, which happen to work in the opposite direction to progressivity in this two-country comparison. Specifically, the calibrated model implies that lump-sum transfers in Germany are more than twice as large as in the US. By their nature, these transfers create a larger wealth effect on low-income individuals (they are a larger fraction of their income) and, therefore, reduce their labor supply more than that of higher-income individuals. Thus, countries with higher lump-sum payments (or more redistributive government services), ceteris paribus, have higher hours inequality. To illustrate this point, we solve the model for Germany by fixing the lump-sum transfers to the same fraction as in the US and assume the rest of the budget surplus yields no utility. The implied standard deviation of log hours falls from 0.128 to 0.098, which is now lower than in the US. Therefore, the predictions of the model regarding hours inequality are in general ambiguous, being driven by both progressivity and the size of lump-sum transfers.

As for average hours, the prediction of the model is much clearer: countries with more progressive taxes should have lower average hours. Consistent with this prediction, it is well documented that Americans on average work much longer hours than Europeans (Prescott (2004), Ohanian et al. (2008)). Here we show that the same is true when we focus on male workers. For Germany, Wanger (2006, Table 3) reports that the average hours per (male) worker in 2003 was 1,557 hours. For the same year, Heathcote et al. (2010, figure 2) report that the average hours per (male) person was 1890 hours, or 21% higher than in Germany.35 Given that hours per worker must be higher than hours per person, this provides a lower bound on the gap between German and US males. This gap is even higher than what is predicted by the model (which is 12.3%).

Overall, the lifecycle evidence on wages and hours documented in this section is in line with--and therefore provides further support to--the human capital mechanism that operates in our model.

6.2 Survey Measures of Human Capital Inequality

So far we have focused on the model's implications for variables that are easily measured in the data, such as wages and hours. However, the model also makes very clear predictions about how human capital dispersion should vary by country (or with the progressivity of the country's tax system). We now test three such predictions in the data.

To conduct this analysis, we need an empirical measure of human capital at the individual-level for the countries in our sample. The data source we use is the International Adult Literacy Survey (IALS), which is a large-scale, international comparative assessment designed to measure a range of skills linked to the economic characteristics of the adult population (ages 16 to 65) within and across nations. The IALS has been extensively used as a measure of human capital of the working age population in the literature (see, among others, Leuven et al. (2004); Nickell and Bell (1995); Devroye and Freeman (2000) and the references therein). We use data from the 1998 survey--the latest available--which contains data from seven of the eight countries in our sample, the exception being France.

First, we investigate whether, in the data, higher wage dispersion in a given country is accompanied by larger human capital dispersion, as robustly predicted by our model. Column (1) of Table 9 reports the cross-country correlations between wage and human capital dispersions, the latter measured by the IALS quantitative literacy test score.36 Each correlation is computed using the same measure of dispersion for both variables (L90-10, L90-50, or L50-10). The correlations are strong regardless of the part of the distribution we focus on. Although not reported in the table, the test score dispersion also varies significantly across countries. For example, the country with--by far--the largest dispersion is the US, with a 90-10 percentile ratio of 2.26 (as measured by the quantitative score), followed by the UK with 1.83. At the other end lie the Scandinavian countries with a 90-10 percentile ratio of 1.45. (The prose and document literacy tests reveal even larger gaps.)
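
To relate the percentile ratios quoted above to the log differentials used in Table 9, the sketch below converts each 90-10 ratio into an L90-10 value. It assumes, as the notation suggests, that L90-10 is the natural log of the ratio of the 90th to the 10th percentile; the function name is hypothetical.

```python
import math


def log_differential(percentile_ratio):
    """Log percentile differential implied by a percentile ratio."""
    return math.log(percentile_ratio)


# 90-10 percentile ratios of the IALS quantitative score quoted in the text.
RATIOS = {"US": 2.26, "UK": 1.83, "Scandinavia": 1.45}

log_gaps = {country: log_differential(r) for country, r in RATIOS.items()}

for country, gap in log_gaps.items():
    print(f"{country}: L90-10 of test scores = {gap:.2f}")
```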


Table 9: Human Capital Dispersion
Dispersion measure  \downarrow Cross-Country Correlation of Test Score Dispersion (Data) with: Wage Dispersion (Data) (1) Human Capital Dispersion (Model) (2)
L90-10 0.88 0.88
L90-50 0.89 0.78
L50-10 0.77 0.88

Second, we compare the human capital dispersion implied by the model to that found in the data across countries. Column (2) of Table 9 reports the correlations between the human capital dispersion in the model and those measured by the IALS data. The correlation is robust, ranging from 0.78 to 0.88. Third, and as discussed earlier, our model predicts that countries with a more progressive tax system will have less dispersion in human capital across individuals. Using  P(0.5,2.5), the measure of wedge employed earlier, the correlation with the L90-10 measure of IALS human capital dispersion is -0.79. (Using other test results or alternative wedges (e.g.,  P(0.5,0.5k),k=2,3,..,6) yields equally strong results.)

When these three empirical findings from survey data are put together with the evidence on the lifecycle profiles of wages from the US and Germany, they provide strong support for the human capital mechanism that is operational in our model.

7 Conclusions

In this paper, we have studied the effects of progressive labor income taxation on wage inequality when a major source of wage dispersion is differential rates of human capital accumulation. To understand the main mechanisms and their quantitative importance, we have examined differences in wage inequality between the United States and seven European countries, which differ significantly in their income tax structures as well as in other dimensions of their labor market institutions. A common theme in our findings is that the model is significantly better at explaining inequality differences at the upper tail compared to the lower tail. Institutions, such as unionization, minimum wage laws (as in the case of France, discussed earlier), and centralized bargaining, are likely to be more important for the lower tail. However, since changes in the upper tail have been so important during this time (as we have documented), the mechanisms studied in this paper provide a promising direction for understanding US-CEU differences in wage inequality. We also found that the most important policy difference for wage inequality is the progressivity of the income tax system, which is responsible for about two-thirds of the model's explanatory power.37 Finally, we turned to the changes in wage inequality over time. In a two-country comparison, the model can account for all of the widening of the inequality gap between the US and Germany when the actual changes in the tax schedules are also incorporated.

We have also explored the micro implications of the model, which provided further supporting evidence for the model. For example, the lifecycle profile of mean wages is flatter in Germany than in the United States, as implied by the higher progressivity in the former country. A similar result is found for within-cohort wage inequality in Germany and the US. Similarly, average hours for males are much lower in Germany than in the US. These observations are consistent with the predictions of the model and provide further support to the empirical relevance of the human capital mechanisms explored in this paper.

An alternative mechanism that is also consistent with the US-Europe inequality gap was proposed by Becker (1985). In his framework, workers choose both hours of work in the market and effort per hour. High ability workers in the US put more effort per hour (and are therefore more productive) than comparable workers in Europe because the return is relatively higher. Thus, wage inequality will be higher in the US than in Europe. An important difference between this mechanism and ours is that our model implies a widening of wage inequality over the life cycle in the US relative to Europe (as documented in Section 6.1), whereas Becker's model implies that wage inequality would be constant over the lifecycle.

An alternative way of modeling skill acquisition would be through "learning by doing (LBD)," which differs from human capital models in some subtle ways. To understand this, notice that in an LBD model, human capital is acquired by working longer hours. The marginal cost of work is given by the marginal utility of leisure, which is independent of the current tax rate. The marginal benefit is the increase in utility due to higher after-tax earnings both in the current period (higher earnings from longer hours) and future periods (higher wages because of accumulated skills). So, for example, if current taxes are raised without affecting future taxes, this would increase human capital investment in Ben-Porath as we saw in Section 2.2 (because the cost of investment is the current after-tax wage, which is lower now). In contrast, in an LBD model, this will decrease current hours of work because part of the marginal benefit of work (current after-tax earnings) falls. But if there is less work, there is less skill acquisition in an LBD model. This is one example where a change in taxes can increase investment in Ben-Porath while reducing it with learning by doing. However, note that this is a carefully selected example. There are many other cases where both models would have qualitatively the same implication (for example, if future taxes are raised without affecting current taxes).

Finally, we have made several assumptions to make the quantitative exercise computationally feasible.38 An important direction to extend the current framework would be by carefully modeling the differences between the US and the CEU in the financing of the education system as well as in the types of skills taught in schools in both places. This is a difficult but interesting question that is at the top of our future research agenda.

NOT FOR PUBLICATION

SUPPLEMENTAL APPENDIX



A. Theoretical Appendix: Derivations and Definitions


A.1 Derivation of the Optimal Investment Condition (eq. (7))

Here, we derive the optimal investment condition in the most general framework studied in this paper, described in Section 5.2. The optimality conditions presented earlier in the paper ((4), (5), and (7)) can all be obtained as special cases of this formulation.

Under the assumptions stated in Section 5.2 (i.e., setting  \chi\equiv1, eliminating pension payments (  \Omega\equiv0), and setting idiosyncratic shocks to their mean value), the problem of the agent is given by

\displaystyle V(h_{s},a_{s},s)=\max_{c_{s},n_{s},Q_{s}}u\left((1+r)a_{s}+y_{s}(1-\bar{\tau}(y_{s}))-a_{s+1},1-n_{s}\right)+\beta V(h_{s+1},a_{s+1},s+1)

\displaystyle \textrm{s.t.}\qquad y_{s}=(\theta_{L}l+\theta_{H}h_{s})n_{s}-C(Q_{s}).

Note that total tax liability of the agent is given by  y\bar{\tau}(y). The derivative of tax liability with respect to  y gives the marginal tax rate. Thus,  \tau(y)=\bar{\tau}(y)+y\bar{\tau}'(y). Using this expression, we obtain the following FOCs for this problem

\begin{displaymath} \begin{array}{ccc} (n_{s}): & & \left(\theta_{L}l+\theta_{H}h_{s}\right)\left(1-\tau(y_{s})\right)u_{1}(c_{s},1-n_{s})=u_{2}(c_{s},1-n_{s})\\ (a_{s}): & & u_{1}(c_{s},1-n_{s})=\beta V_{2}(h_{s+1},a_{s+1},s+1)\\ \left(Q_{s}\right): & & C^{\prime}(Q_{s})\left(1-\tau(y_{s})\right)u_{1}(c_{s},1-n_{s})=\beta V_{1}(h_{s+1},a_{s+1},s+1) \end{array}\end{displaymath}
Envelope conditions are:
\begin{displaymath} \begin{array}{ccc} (a_{s}): & & V_{2}(h_{s},a_{s},s)=(1+r)u_{1}(c_{s},1-n_{s})\\ (h_{s}): & & V_{1}(h_{s},a_{s},s)=\theta_{H}n_{s}\left(1-\tau(y_{s})\right)u_{1}(c_{s},1-n_{s})+\beta V_{1}(h_{s+1},a_{s+1},s+1). \end{array}\end{displaymath}
Combining the envelope conditions with the FOCs yields
\displaystyle C^{\prime}(Q_{s})\left(1-\tau(y_{s})\right)=\theta_{H}n_{s+1}\left(1-\tau(y_{s+1})\right)\underbrace{\frac{\beta u_{1}(c_{s+1},1-n_{s+1})}{u_{1}(c_{s},1-n_{s})}}_{\frac{1}{1+r}}+\theta_{H}n_{s+2}\left(1-\tau(y_{s+2})\right)\underbrace{\frac{\beta^{2}u_{1}(c_{s+2},1-n_{s+2})}{u_{1}(c_{s},1-n_{s})}}_{\frac{1}{\left(1+r\right)^{2}}}+\ldots

Rearranging this expression delivers equation (7):

\displaystyle C_{j}^{\prime}(Q_{s}^{j})=\theta_{H}\left\{ \beta\frac{1-\tau(y_{s+1})}{1-\tau(y_{s})}n_{s+1}+\beta^{2}\frac{1-\tau(y_{s+2})}{1-\tau(y_{s})}n_{s+2}+\ldots+\beta^{S-s}\frac{1-\tau(y_{S})}{1-\tau(y_{s})}n_{S}\right\} .
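
To illustrate how the tax terms enter equation (7), the following sketch evaluates its right-hand side for given paths of marginal tax rates and hours. It is only a partial-equilibrium illustration: hours are held fixed across tax scenarios, and the parameter values and function name are hypothetical rather than taken from the calibration.

```python
import math


def marginal_benefit(theta_h, beta, tau_now, future_taus, future_hours):
    """Right-hand side of equation (7).

    future_taus[k-1] and future_hours[k-1] are the marginal tax rate and
    hours worked k periods ahead, for k = 1, ..., S - s.
    """
    total = 0.0
    pairs = zip(future_taus, future_hours)
    for k, (tau_k, n_k) in enumerate(pairs, start=1):
        total += beta ** k * (1.0 - tau_k) / (1.0 - tau_now) * n_k
    return theta_h * total


# Hypothetical values chosen only for illustration.
THETA_H, BETA = 1.0, 0.96
HOURS = [0.4, 0.4, 0.4]

# No taxes.
no_tax = marginal_benefit(THETA_H, BETA, 0.0, [0.0, 0.0, 0.0], HOURS)
# Flat tax: the same marginal rate today and in every future period.
flat = marginal_benefit(THETA_H, BETA, 0.3, [0.3, 0.3, 0.3], HOURS)
# Progressive tax: marginal rates rise as earnings grow over the life cycle.
progressive = marginal_benefit(THETA_H, BETA, 0.3, [0.35, 0.40, 0.45], HOURS)

print(no_tax, flat, progressive)
```

Under a flat tax every ratio  (1-\tau(y_{s+k}))/(1-\tau(y_{s})) equals one, so the tax drops out of the condition. When marginal rates rise with future earnings, each ratio falls below one and the marginal benefit of investment is lower.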

A.2 Equilibrium Definition

A stationary recursive competitive equilibrium for this economy is a set of equilibrium decision rules,  c(x),  n(x),  Q(x),  i(x), and  a'(\epsilon',x); value functions,  V(x) and  W^{R}(x), for working and retirement periods, respectively, where  x=(h,a,m;\epsilon,s,j) (notice the inclusion of  j into this vector); a pricing function for Arrow securities,  q(\epsilon'\vert\epsilon), and a measure  \Lambda(x) such that

  1. Given the labor income tax function,  \bar{\tau}(y), consumption tax,  \bar{\tau}_{c}, transfers,  Tr, and government's pension function  \Omega, individuals' decision rules and value functions solve problems in (9) to (13) and in (14).
  2. Asset markets clear:  \int_{x(:,\epsilon=\tilde{\epsilon})}a'(\epsilon',x)d\Lambda(x)=0 for all combinations of (  \tilde{\epsilon}, \epsilon').39
  3.  \Lambda(x) is generated by individuals' optimal choices.
  4. The government budget balances:
    \displaystyle \int_{x(:,s<S)}\bar{\tau}_{n}(y(x))y(x)d\Lambda(x)+\int_{x}\bar{\tau}_{c}c(x)d\Lambda(x)=G+Tr+\sum_{s=R}^{T}\int_{x(:,s=S-1)}\Omega(\overline{y}^{j},m^{S}(x))d\Lambda(x).

The first term in the government's budget is the total tax revenue from labor income collected from all agents who are working and younger than retirement age. Similarly, the second term is the total tax revenue from the consumption tax, but it is collected from all agents including the retirees. On the right-hand side, the pension payments only depend on a worker's ability through  \overline{y}^{j} and the number of years she worked until retirement ( m^{S}(x)), which in turn depends on the full state vector  x at age  S-1. Therefore, we integrate the pension payments over the full state vector  x conditioning on age  S-1 and then sum the same amount over all ages greater than  S-1 to find total pension payments.

B. Country-Specific Labor Market Policies


B.1 Estimating Country-Specific Average Tax Schedules

Here we provide more details on the estimation of tax schedules described in Section 2.2. Define normalized income as  \widetilde{y}\equiv y/AW. For each country, denote the top marginal tax rate by  \tau_{\text{TOP}} and the top bracket by  \widetilde{y}_{\text{TOP}}. The values for these variables are taken from the OECD tax database.40 As noted in the text, we already have average tax rates for all income levels below 2 (i.e., two times AW). For values above this number, we have to consider separately the cases where a country's top marginal tax rate bracket is lower and higher than 2. In the former case (  \widetilde{y}_{\text{TOP}}<2), since we know the average tax rate at  \widetilde{y}=2, each additional dollar above 2 is taxed at the rate of  \tau_{\text{TOP}}. Therefore, for  \widetilde{y}>2

\displaystyle \bar{\tau}(\widetilde{y})=(\bar{\tau}(2)\times2+\tau_{\text{TOP}}\times(\widetilde{y}-2))/(\widetilde{y})

If instead  \widetilde{y}_{\text{\text{TOP}}}>2 (which is only the case for the US and France), we do not know the marginal tax rate between  \widetilde{y}=2 and  \widetilde{y}_{\text{\text{TOP}}}. Thus, we first set  \tau(2)=(\bar{\tau}(2)\times2-\bar{\tau}(1.75)\times1.75)/0.25 and use linear interpolation between  \tau(2) and  \tau_{\text{TOP}}. We have

\displaystyle \tau(\widetilde{y})= \displaystyle \left\{ \begin{array}{cc} \tau(2)+\frac{\tau_{\text{TOP}}-\tau(2)}{\widetilde{y}_{\text{\text{TOP}}}-2}(\widetilde{y}-2) & \qquad\quad\textrm{if }2<\widetilde{y}<\widetilde{y}_{\text{\text{TOP}}}\\ \tau_{\text{TOP}} & \textrm{if }\quad\widetilde{y}>\widetilde{y}_{\text{\text{TOP}}}. \end{array}\right.    

Then the average tax rate function for  \widetilde{y}>2 is

\displaystyle \bar{\tau}(\widetilde{y})= \displaystyle \left\{ \begin{array}{cc} (\bar{\tau}(2)\times2+\tau(\widetilde{y})\times(\widetilde{y}-2))/\widetilde{y} & \textrm{if }\quad2<\widetilde{y}<\widetilde{y}_{\text{\text{TOP}}}\\ (\bar{\tau}(2)\times2+\frac{(\tau(2)+\tau_{\text{TOP}})}{2}(\widetilde{y}_{\text{\text{TOP}}}-2)+\tau_{\text{TOP}}\times(\widetilde{y}-\widetilde{y}_{\text{\text{TOP}}}))/\widetilde{y} & \textrm{if }\quad\widetilde{y}>\widetilde{y}_{\text{\text{TOP}}} \end{array}\right.    

We use this expression to compute  \overline{\tau} for  \widetilde{y}=3,4,\ldots,8 (in addition to the original average tax rates from the OECD website). We then fit the functional form given in equation (8) to these 13 data points as explained in the text. The resulting coefficients are reported in Table A.1.
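The extension of the average tax schedule beyond twice the average wage can be sketched in a few lines of Python. This is only a transcription of the formulas above under stated assumptions: the function and argument names are hypothetical, and the numbers in the usage note are made up for illustration rather than taken from the OECD data.

```python
def marginal_rate_at_2(avg_2, avg_175):
    """Marginal rate at y = 2 implied by the OECD average rates at 1.75 and 2."""
    return (avg_2 * 2.0 - avg_175 * 1.75) / 0.25


def extended_average_rate(y, avg_2, avg_175, tau_top, y_top):
    """Average tax rate at normalized income y > 2 (income in units of AW)."""
    if y <= 2.0:
        raise ValueError("for y <= 2 the OECD average rates are used directly")
    tax_at_2 = avg_2 * 2.0  # tax bill at y = 2
    if y_top <= 2.0:
        # Top bracket already reached: every unit above 2 pays tau_top.
        return (tax_at_2 + tau_top * (y - 2.0)) / y
    tau_2 = marginal_rate_at_2(avg_2, avg_175)
    if y >= y_top:
        # Interpolated segment from 2 to y_top, then tau_top above y_top.
        ramp = 0.5 * (tau_2 + tau_top) * (y_top - 2.0)
        return (tax_at_2 + ramp + tau_top * (y - y_top)) / y
    # Between 2 and the top bracket: linearly interpolated marginal rate,
    # combined exactly as in the first branch of the formula in the text.
    tau_y = tau_2 + (tau_top - tau_2) * (y - 2.0) / (y_top - 2.0)
    return (tax_at_2 + tau_y * (y - 2.0)) / y
```

With hypothetical inputs `avg_2=0.30`, `avg_175=0.29`, `tau_top=0.50`, and `y_top=1.5`, the call `extended_average_rate(4.0, 0.30, 0.29, 0.50, 1.5)` applies the first case and gives an average rate of about 0.40.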


Table A.1: Tax Function Parameter Estimates  \bar{\tau}(y/AW)=a_{0}+a_{1}(y/AW)+a_{2}(y/AW)^{\phi}
Country:  a_{0}  a_{1}  a_{2}  \phi  R^{2}
Denmark 1.4647 -.01747 -1.0107 -.15671 0.990
Finland 1.7837 -.01199 -1.4518 -.11063 0.999
France 0.5224 .00339 -.24249 -.41551 0.993
Germany 1.8018 -.01708 -1.3486 -.11833 0.992
Netherlands 3.1592 -.00790 -2.8274 -.03985 0.984
Sweden 9.1211 -.00762 -8.7763 -.01392 0.985
UK 0.5920 -.00390 -.32741 -.30907 0.989
US 1.2088 -.00942 -.94261 -.10259 0.993
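As a quick check on how the fitted schedules behave, the sketch below evaluates the tax function with the coefficients of Table A.1. The dictionary and function names are illustrative choices, not part of the original estimation code; the marginal rate is the derivative of  y\bar{\tau}(y), which works out to  a_{0}+2a_{1}y+a_{2}(\phi+1)y^{\phi}.

```python
# Coefficients (a0, a1, a2, phi) copied from Table A.1.
TAX_PARAMS = {
    "Denmark": (1.4647, -0.01747, -1.0107, -0.15671),
    "Finland": (1.7837, -0.01199, -1.4518, -0.11063),
    "France": (0.5224, 0.00339, -0.24249, -0.41551),
    "Germany": (1.8018, -0.01708, -1.3486, -0.11833),
    "Netherlands": (3.1592, -0.00790, -2.8274, -0.03985),
    "Sweden": (9.1211, -0.00762, -8.7763, -0.01392),
    "UK": (0.5920, -0.00390, -0.32741, -0.30907),
    "US": (1.2088, -0.00942, -0.94261, -0.10259),
}


def average_tax_rate(country, y_norm):
    """Average tax rate at income y_norm, measured in units of AW."""
    a0, a1, a2, phi = TAX_PARAMS[country]
    return a0 + a1 * y_norm + a2 * y_norm ** phi


def marginal_tax_rate(country, y_norm):
    """Marginal rate: derivative of y * average_tax_rate(y) with respect to y."""
    a0, a1, a2, phi = TAX_PARAMS[country]
    return a0 + 2.0 * a1 * y_norm + a2 * (phi + 1.0) * y_norm ** phi
```

For instance, `average_tax_rate("US", 1.0)` is 1.2088 - 0.00942 - 0.94261, or roughly 0.257, and the fitted US average rate at twice the average wage is higher than that.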


B.2 Deriving Tax Schedules with Different Progressivity but Same Average Tax Rate

To change the average tax rates in Europe without changing progressivity, we apply the following procedure. Let  \tau_{i}(y) be the marginal tax rate in country  i for income level  y. We would like to obtain a new tax schedule  \tau_{i}^{*}(y) with the same progressivity but with a different level. Thus, we need to have (for all  y and  y')

\displaystyle \frac{1-\tau_{i}^{*}(y')}{1-\tau_{i}^{*}(y)} \displaystyle =\frac{1-\tau_{i}(y')}{1-\tau_{i}(y)}  \displaystyle \Rightarrow\frac{1-\tau_{i}^{*}(y')}{1-\tau_{i}(y')}=\frac{1-\tau_{i}^{*}(y)}{1-\tau_{i}(y)}    

Letting this ratio be equal to a constant  k, the new tax schedule  \tau^{*} is obtained from the following expression:
\displaystyle 1-\tau_{i}^{*}(y)=k(1-\tau_{i}(y))\quad\mbox{for all }y. (18)

Let the average tax rate be
\displaystyle \bar{\tau}_{i}(y) \displaystyle =a_{0}+a_{1}y+a_{2}y^{\phi}\quad\Rightarrow\quad\tau_{i}(y)=a_{0}+2a_{1}y+a_{2}(\phi+1)y^{\phi}.    

Plugging this last expression into (18) and solving for  \tau_{i}^{*}(y), we get

\displaystyle \tau_{i}^{*}(y)=1-k+k\left[a_{0}+2a_{1}y+a_{2}(\phi+1)y^{\phi}\right].
Observing that  y\bar{\tau_{i}}(y)=\int_{0}^{y}\tau_{i}(x)dx, we can solve for the average tax rate  \bar{\tau}_{i}^{*}(y) as
\displaystyle \bar{\tau_{i}}^{*}(y)=1-k+k[a_{0}+a_{1}y+a_{2}y^{\phi}]=1-k+k\bar{\tau}_{i}(y). (19)

The new schedule  \bar{\tau}_{i}^{*}(y) has the same progressivity as  \bar{\tau}_{i}(y) but can have any desired average tax rate. We choose  k so that the average labor income tax rate in country  i is equal to the average labor income tax rate in the US.
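A minimal sketch of how  k could be chosen follows. It assumes, purely for illustration, that the "average labor income tax rate" is total tax divided by total income over a fixed list of incomes; in the model incomes respond to the schedule, so the actual procedure would presumably iterate on  k. All names below are hypothetical.

```python
def aggregate_average_rate(avg_rate, incomes):
    """Total tax over total income for a schedule avg_rate(y)."""
    total_tax = sum(avg_rate(y) * y for y in incomes)
    return total_tax / sum(incomes)


def scale_factor(avg_rate, incomes, target_rate):
    """k such that the rescaled schedule 1 - k + k * avg_rate hits target_rate.

    Because the rescaled schedule is affine in the original one, its
    aggregate rate is 1 - k + k * (original aggregate rate), so k has a
    closed form when incomes are held fixed (an assumption of this sketch).
    """
    current = aggregate_average_rate(avg_rate, incomes)
    return (1.0 - target_rate) / (1.0 - current)


def rescaled_schedule(avg_rate, k):
    """Equation (19): same progressivity, shifted level."""
    return lambda y: 1.0 - k + k * avg_rate(y)
```

By construction, the ratio  (1-\bar{\tau}^{*}(y'))/(1-\bar{\tau}^{*}(y)) of the rescaled schedule equals that of the original schedule for any pair of incomes, since both numerator and denominator are multiplied by the same  k.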


B.3 Constructing Tax Schedules for 1983

Here, we describe the formulas we use to calculate the average tax rate at different income levels for Germany and the United States in 1983. This information is obtained from OECD (1986) (see pages 104-105 and 244-248 for the US and pages 74-75 and 149-154 for Germany). In all calculations for Germany, the monetary figures are in Deutsche Mark (DM). Gross income is denoted by  \mathtt{GI}.

B.3.1 Germany

Social Security Contributions. In 1983, the social security system in Germany had two brackets with their respective tax rates. Specifically, social security contributions ( SSC) were given by:

\displaystyle SSC=0.1138\times\min(\texttt{GI},64800)+0.0588\times\min(\texttt{GI},48600).

Allowances. Each worker receives an allowance (tax exemption) of DM 1080 and an allowance of DM 564 for work-related expenses. The OECD considers other miscellaneous allowances in the amount of DM 1606. We treat this amount as fixed for all levels of income. Finally, workers are able to deduct part of their social security contributions determined by this formula:

\displaystyle \texttt{SSC Allowance} \displaystyle = \displaystyle \max\{6000-0.18\,\texttt{GI},0\}  
    \displaystyle +\min(2340,\max\{SSC-\max\{6000-0.18\,\texttt{GI},0\},0\})  
    \displaystyle +0.5\times\min(2340,\max\{SSC-\max\{6000-0.18\,\texttt{GI},0\}-2340,0\}).  

Total Tax. Putting together the taxes and allowances just described gives the taxable income of a worker:

 \texttt{Taxable Income}=\texttt{GI}-\texttt{SSC Allow.}-\texttt{Basic Allow.}-\texttt{Work-related and other Allow.}

Now, we can calculate the tax liability to the household. The first step is to round the taxable income.

 \texttt{Rounded Taxable Income (RTI)}=round(\texttt{Taxable Income}/54)\times54.

We calculate two variables  Y and  Z that will be used in the calculations that follow. They are defined as  Y=\frac{\texttt{RTI}-18000}{10000} and  Z=\frac{\texttt{RTI}-60000}{10000}. To obtain the income tax for a worker, we need to apply Germany's tax schedule in 1983:

\displaystyle \texttt{Income Tax}=\begin{cases} 0 & \textrm{if }\texttt{RTI}\leq4212\\ 0.22\times\texttt{RTI}-926 & \textrm{if }4213<\texttt{RTI}\leq18035\\ (((3.05Y-73.76)Y+695)Y+2200)\times Y+3034 & \textrm{if }18036<\texttt{RTI}\leq60047\\ (((0.09Z-5.45)Z+88.13)Z+5040)\times Z+20018 & \textrm{if }60048<\texttt{RTI}\leq130031\\ 0.56\times\texttt{RTI}-14837 & \textrm{if }\texttt{RTI}>130032 \end{cases}

\displaystyle \mathtt{Average\; Tax\; Rate}=\frac{\texttt{Income Tax}+SSC}{\texttt{Gross Income}}.
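The German calculation can be strung together as in the following sketch. It is an illustration under assumptions, not the code used for the paper: function names are hypothetical, the inner maximum of the second allowance term is floored at zero by analogy with the third term, the bracket conditions are implemented as contiguous ranges, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the unspecified rounding rule.

```python
def german_ssc(gi):
    """Social security contributions in DM for gross income gi (two capped brackets)."""
    return 0.1138 * min(gi, 64800) + 0.0588 * min(gi, 48600)


def german_ssc_allowance(gi, ssc):
    """Deductible part of social security contributions."""
    base = max(6000 - 0.18 * gi, 0)
    excess = max(ssc - base, 0)  # assumption: floored at zero
    return base + min(2340, excess) + 0.5 * min(2340, max(excess - 2340, 0))


def german_income_tax(rti):
    """1983 income tax schedule applied to rounded taxable income (RTI)."""
    y = (rti - 18000) / 10000
    z = (rti - 60000) / 10000
    if rti <= 4212:
        return 0.0
    if rti <= 18035:
        return 0.22 * rti - 926
    if rti <= 60047:
        return (((3.05 * y - 73.76) * y + 695) * y + 2200) * y + 3034
    if rti <= 130031:
        return (((0.09 * z - 5.45) * z + 88.13) * z + 5040) * z + 20018
    return 0.56 * rti - 14837


def german_average_tax_rate(gi):
    """(Income tax + SSC) / gross income."""
    ssc = german_ssc(gi)
    # Basic allowance 1080, work-related 564, other miscellaneous 1606.
    taxable = gi - german_ssc_allowance(gi, ssc) - 1080 - 564 - 1606
    rti = round(taxable / 54) * 54
    return (german_income_tax(rti) + ssc) / gi
```

One reassuring property of the schedule as transcribed is that adjacent branches line up at the bracket edges: at an RTI of 18000 the linear branch gives 3034, which is also the constant of the first polynomial, and at 130000 both the second polynomial and the top linear branch give roughly 57963.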

B.3.2 The United States

Social Security Contribution. In 1983, the employee social security contribution in the US was given by

 \texttt{SSC Employee}=0.067\times(\min(\texttt{Gross Income},35700))

The employer's social security contribution matches the employee's contribution of  6.7\% on earnings up to  \$35700. Additionally, employers are required to pay an unemployment tax of  6.2\% of earnings up to  \$7000 and, for the state-sponsored tax plan, a nationwide average of  2.8\% of earnings up to  \$7624:

\displaystyle \texttt{SSC Employer} \displaystyle = \displaystyle 0.067\times\min(\texttt{GI},35700)+0.062\times\min(\texttt{GI},7000)+0.028\times\min(\texttt{GI},7624)  

Allowances. The total combined allowances and exemptions amount to $2300 per worker.

 \texttt{Taxable Income}=\texttt{Gross Income}-\texttt{Basic Allowance}-\texttt{Tax Bracket Allowance}.

Federal Income Tax. Now, we can calculate the tax liability for the household by applying the US tax schedule in 1983. The first  \$2300 is not taxed, as discussed earlier. The tax rate is  11\% when taxable income is in the range  (2300,3400);  13\% in the range  (3400,4400);  15\% in the range  (4400,8500);  17\% in the range  (8500,10800);  19\% in the range  (10800,12900);  21\% in the range  (12900,15000);  24\% in the range  (15000,18200);  28\% in the range  (18200,23500);  32\% in the range  (23500,28800);  36\% in the range  (28800,34100);  40\% in the range  (34100,41500);  45\% in the range  (41500,55300); and  50\% above  \$55300.
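The bracket schedule can be applied mechanically, as in the sketch below. Two assumptions are made here and should be checked against the OECD source: the listed rates are treated as marginal rates on the slice of income falling in each range, and the thresholds are read on an income scale in which the first $2300 forms a zero-rate bracket. The names are hypothetical.

```python
# (lower threshold, marginal rate) pairs as listed in the text.
BRACKETS_1983 = [
    (2300, 0.11), (3400, 0.13), (4400, 0.15), (8500, 0.17), (10800, 0.19),
    (12900, 0.21), (15000, 0.24), (18200, 0.28), (23500, 0.32),
    (28800, 0.36), (34100, 0.40), (41500, 0.45), (55300, 0.50),
]


def federal_income_tax(income):
    """Sum of rate * (income falling inside each bracket)."""
    uppers = [lower for lower, _ in BRACKETS_1983[1:]] + [float("inf")]
    tax = 0.0
    for (lower, rate), upper in zip(BRACKETS_1983, uppers):
        if income > lower:
            tax += rate * (min(income, upper) - lower)
    return tax
```

Under these assumptions an income of 4400 owes 0.11 x 1100 + 0.13 x 1000 = 251, and each extra dollar above 55300 adds 50 cents of tax.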

State and Local Taxes. For the purposes of calculating local and state taxes, the OECD considers a worker that lives in Detroit, Michigan. Detroit allows an exemption of  \$600, then a flat  3\% tax is applied.  \texttt{Tax Detroit}=0.03(\texttt{GI}-600). The formula for Michigan's state income tax is given by

 \texttt{Tax Michigan}=0.0635(\texttt{GI}-1500)-0.05\max(\texttt{Tax Detroit}-200,0)+27.5

 \texttt{Total Local Tax}=\texttt{Tax Michigan}+\texttt{Tax Detroit}
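For concreteness, the state and local formulas can be written as the short sketch below. The function names are hypothetical, and the formulas are transcribed as printed, so no floor at zero is imposed for very low incomes.

```python
def detroit_tax(gi):
    """Flat 3% city tax after a $600 exemption."""
    return 0.03 * (gi - 600)


def michigan_tax(gi):
    """State income tax, net of a partial credit for the city tax."""
    return 0.0635 * (gi - 1500) - 0.05 * max(detroit_tax(gi) - 200, 0) + 27.5


def total_local_tax(gi):
    """State plus city tax for the Detroit worker considered by the OECD."""
    return michigan_tax(gi) + detroit_tax(gi)
```

At a gross income of $20,000, for example, the city tax is 582, the state tax is 1174.75 - 19.10 + 27.50 = 1183.15, and the total is 1765.15.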

Total Tax. The total tax liability is equal to the income tax plus the social security contribution and the local tax. Then, we have

\displaystyle \mathtt{Average\; Tax\; Rate}=\frac{\texttt{Total Tax Liability}}{\texttt{Gross Income}}


B.4 Pension Systems

The details of the pension benefits system for OECD countries used in this paper are taken from the OECD publication entitled "Pensions at a Glance: 2007." The specific numbers used in this section are from Table I.2 and the unnumbered table on page 35 of that document. Further details of these pension systems, including the number of years required to qualify for full benefits, and so on, are described more fully on pages 26-35 of the same document. Let  \overline{y}^{j} be the lifetime average of net (after-tax) labor earnings of all individuals with ability level  j; and let  \overline{y} be the same variable averaged across all ability levels. Finally, recall that  m^{R} is the total number of years a worker has been employed up to the retirement age, and let  \overline{m} be the maximum number of years of work over which an individual can accumulate retirement credits in a given country. The net retirement earnings of an individual with ability  j are given by

\displaystyle \Omega(\overline{y}^{j},m^{R})=\min\left(1,\frac{m^{R}}{\overline{m}}\right)\left[a\overline{y}+b\overline{y}^{j}\right]
The first term approximates the credit accumulation process whereby individuals qualify for full retirement benefits after working a certain number of years and only qualify for partial pensions if they retire before that. We set  \overline{m} equal to 40 years for all countries. Countries differ mainly in the values of the coefficients  a and  b. Broadly speaking,  a determines the "insurance" component of retirement income, because it is independent of the individual's own lifetime earnings, whereas  b captures the private returns to one's own lifetime earnings. In this sense, a retirement system with a high ratio of  a/b provides high insurance but low incentives for high earnings and vice versa for a low ratio of  a/b. Inspecting the coefficients in the table shows that there is a very wide range of variation across countries. Finally, some countries have a ceiling on pensionable income and entitlements, which is also reported in Table A.2.


Table A.2: Pension System Formulas
   a  b Ranges Ceiling for Pensionable Income (as % of AW)
DEN 0.371 0.528 all --
FIN 0.011 0.695 all --
FRA 0.141 0.484 all 300%
GER -0.004 0.621 if  \overline{y}^{j}\le1.5\bar{y}  
GER 0.927   if  \overline{y}^{j}>1.5\bar{y} 150%
NET 0.005 0.928 all --
SWE -0.021 0.735 all 367%
UK 0.257 0.154 if  \overline{y}^{j}\le\bar{y} 115%
UK 0.315 0.096 if  \bar{y}<\overline{y}^{j}\le1.5\bar{y}  
UK 0.396 0.042 if  \overline{y}^{j}>1.5\bar{y}  
US 0.168 0.355 all 290%


C. Further Details of Calibration

Dispersion of wage growth rates.

Using male hourly earnings data, Haider (2001) estimates a value of  \sigma(b^{j})=2.07\%, and using annual earnings data he estimates it to be 2.02%. Baker (1997, Table 4, rows 6 and 8) uses an annual earnings measure and estimates values of 1.76% and 1.97% in the two specifications most closely related to the present paper, whereas Guvenen (2009) finds a value of 1.94%, again using male annual earnings data. Finally, Guvenen and Smith (2009) estimate a process for household annual earnings and obtain a value of 1.87%.

Calibration of the stochastic component.

Over the sample period, Haider (2001) estimates the average innovation variance to be 0.074, an AR coefficient of 0.761, and an MA coefficient of -0.42. Using these parameters, the unconditional variance is 0.109. Haider (2001) estimates an ARMA(1,1) process, whereas in our model we employ a slightly more parsimonious structure (AR(1) + iid shock). This latter formulation is a common choice in calibrated macroeconomic models because it requires one fewer state variable while still capturing the dynamics of wages quite well. Because of this difference, it is not possible to match each autocorrelation coefficient of the ARMA(1,1) specification exactly, so we instead match the average of the first three. In the calibrated model, the first three autocorrelations are 0.48, 0.33, and 0.20 compared to 0.42, 0.32, and 0.24 in the data.
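To see why the two specifications cannot agree lag by lag, it helps to write out their autocorrelation functions. The sketch below uses the textbook formulas for an ARMA(1,1) process and for an AR(1) plus an iid shock; the sign convention for the MA term ( y_{t}=\rho y_{t-1}+e_{t}+\theta e_{t-1}) and all function names are assumptions, and no attempt is made to reproduce the reported figures exactly.

```python
def arma11_autocorr(rho, theta, lags):
    """Autocorrelations at lags 1..lags of y_t = rho*y_{t-1} + e_t + theta*e_{t-1}."""
    first = (1 + rho * theta) * (rho + theta) / (1 + 2 * rho * theta + theta ** 2)
    return [first * rho ** (k - 1) for k in range(1, lags + 1)]


def ar1_plus_iid_autocorr(rho, var_persistent, var_iid, lags):
    """Autocorrelations of x_t = z_t + u_t, with z_t an AR(1) and u_t iid.

    var_persistent is the unconditional variance of z_t.
    """
    share = var_persistent / (var_persistent + var_iid)
    return [share * rho ** k for k in range(1, lags + 1)]


def persistent_share_for_target(target_mean, rho):
    """Variance share of z_t that makes the mean of the first three
    autocorrelations of the AR(1)+iid process equal to target_mean."""
    return 3.0 * target_mean / (rho + rho ** 2 + rho ** 3)
```

In the ARMA(1,1) case the first autocorrelation is pinned down by both  \rho and  \theta, while in the AR(1)+iid case it is the variance share times  \rho; beyond lag one both decay at rate  \rho, so once the persistence differs the individual lags cannot all coincide and only a summary such as their average can be matched.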


D. Further Sensitivity Analysis

In all of the following robustness exercises, we recalibrate our model to the empirical targets described in Section 4.

D.1 Taxing Capital Income

In our baseline model, we abstracted from taxation of capital income for two reasons. First, some plausible formulations of capital income taxation substantially complicate the numerical solution of the model by invalidating a relatively fast algorithm we were able to use in its absence. Second, the actual treatment of capital income is quite complex, certainly much more so than that of labor income. For example, some countries (e.g., the United States) tax certain forms of capital income as ordinary income (i.e., they tax "total" income), whereas some other countries (e.g., France, Finland, and Sweden) allow individuals to pay a lower flat-rate tax on certain types of capital income (such as interest income). See, for example, the discussion in Carey and Rabesona (2002, Table 22 and pages 158-160). Modeling the complexities of this institutional detail is beyond the scope of this paper, so in the benchmark model studied in the main text we abstracted entirely from capital income taxes.

With these caveats in mind, here we attempt to quantify the effects of taxing capital income in a simple way. Basically, we assume that the government taxes total income--inclusive of capital income--subject to the tax schedules derived in this paper. To understand why taxing total income could matter for our results, first notice that there are essentially two types of assets in our economy: human capital and financial assets. When capital income is taxed at a flat rate, as in our benchmark analysis, progressivity reduces only the return on human capital, hindering investment in human capital relative to investment in financial assets. On the other hand, when the progressive tax is applied to total income, progressivity reduces the return on both human capital and financial assets. Thus, progressivity does not reduce investment in human capital relative to investment in financial assets as much as in the case where progressivity affects only labor income.

To conduct this exercise, we have to make some simplifying assumptions to our model and develop a new computational method. The reason is that our computational procedure for the benchmark model relies on the property that the return on savings is independent of the tax rate (which is no longer true in this experiment). This property allowed us to compute the human capital investment and consumption-savings decisions separately and iteratively. When the progressive tax is applied to total income, however, we can no longer use this procedure because we need to compute total income at each age to determine the tax rate the agent faces. Thus, we need to solve the human capital investment decision jointly with the consumption-savings decision. It then becomes very hard to solve this problem with value function methods, since an individual has to know his borrowing limit in a period to make his optimal choices, and this limit depends on his lifetime human capital and labor supply choices.

To circumvent these problems, we consider a benchmark without idiosyncratic shocks and set  \chi=1. Since there are no shocks in this version of the model, our target moments reduce to average wage growth, standard deviation of wage growth rates, and variance of wages due to profile heterogeneity only. The latter two are obtained from Guvenen (2007). Notice that because (i) there are no shocks and (ii) individuals want to invest significantly early on, they would have a very strong incentive to borrow when utility is separable and hence they want constant consumption. This implies that wealth is negative for many individuals with standard power utility preferences. To mitigate this effect and allow consumption to rise over the lifecycle we use preferences as in Greenwood et al. (1988) (often called GHH). With this structure, we are able to solve the model both when capital income is and is not taxed.

The main finding is the following. The new benchmark model with no capital income taxes can account for 69% of the L90-10 gap between the US and CEU in 2003. (This is up from 48% in the baseline model in the text with shocks and  \chi=0.5.) Adding capital income taxes to this structure reduces the explanatory power to 52.8%, a fall of 23 percent ( 1-52.8/69). Thus, if all capital income were taxed at the same rate as labor income, the model's explanatory power would be about one quarter lower than in the new benchmark without capital income taxes.

Having said that, it should also be stressed that this exercise is likely to overstate the real effects of capital income taxation. This is because, as mentioned above, in certain CEU countries some capital income is taxed at a flat rate, which is not the case in the United States. Consequently, in those countries, progressivity affects only labor income, making investment in physical assets more attractive than investment in human capital, in turn further compressing the wage distribution. Hence, incorporating such differences would further lower inequality in the CEU and increase the explanatory power of the model. While we do not pursue this approach here, this is an important point to keep in mind.

D.2 Accounting for Cross-Country Variation in Retirement Age

Our baseline model does not allow for variation in retirement age across countries. However, such variation could have important implications for human capital investment by affecting the effective horizon of individuals. Although modeling endogenous retirement is beyond the scope of this paper, here we explore the effects of allowing for exogenous retirement age differences across countries. We estimate the average retirement age by computing the fraction of people who receive social security pensions and disability benefits at each age.41 We then solve each country's problem using the computed retirement age as an exogenous value for  S. With this adjustment, the explanatory power for L90-10 increases to 70%, because countries with more progressivity also turn out to have a lower retirement age than less progressive ones. So the two effects reinforce each other.

D.3 Maximum investment on the job  \mathbf {\chi }

We experiment with two values of  \chi--0.4 and 0.6--one on each side of our baseline choice of 0.5. When  \chi=0.6, the model's explanatory power for L90-10 and L90-50 falls to 35% and 51%, respectively, whereas the explanatory power for L50-10 remains unchanged at 24%. It should be noted, however, that with this choice of  \chi, the model implies a minimum to mean wage ratio of 0.24, which is quite a bit lower than the 0.29 value in the data (and what was used to pin down the baseline choice of 0.50 for  \chi). When  \chi=0.4, the model explains 61% of the L90-10 difference between the US and CEU, 116% of L90-50, and 24% of L50-10. In this case, the minimum to mean wage ratio is a more reasonable 0.30.

D.4 Wasteful Government Expenditures versus Transfers

In the baseline model, the surplus was returned to households in a lump-sum fashion, essentially assuming that government expenditures are perfect substitutes for private consumption. To examine whether our results are sensitive to this assumption, we now assume that half of the government surplus is wasted:  G=Tr, and each component equals half of the budget surplus (i.e., tax revenues minus benefits payments). This assumption is probably extreme, but it is useful in illustrating whether the results are sensitive to this scenario. From Table A.3, we see that, qualitatively, the explanatory power of the model is lower for some countries for L90-10 and L90-50 but higher for L50-10. Quantitatively, however, the effect is minimal across the board. In fact, in some cases, no difference is visible (because of rounding) compared to the benchmark case in Table 5.


Table A.3: Effect of Wasteful Government Spending on Wage Inequality Results  G=Tr=0.5\times Gov't Surplus
  L90-10 (a) L90-50 (b) L50-10 (c)
Denmark 63 90 38
Finland 49 75 29
France 30 71 14
Germany 69 75 60
Netherlands 45 59 31
Sweden 42 67 23
CEU 49% 73% 29%
UK 21 0 49

D.5 Depreciation of human capital  \mathbf {\delta }

To check the sensitivity of our results to the choice of the human capital depreciation rate, we have experimented with depreciation rates of 1% and 2%. The model's explanatory power goes down to 44% when  \delta=0.01 and it increases slightly above 50% when  \delta=0.02. An important point to note is that it is not possible to jointly match two of our targets, mean wage growth and the variance of wage growth rates, for depreciation rates below 1 percent. For very low values of the depreciation rate, when we match the increase in wage inequality over the lifecycle, wage growth turns out to be very high relative to the data. The reason is the following. First note that learning ability cannot be negative, and as a result the lowest wage growth rate is bounded below by minus the depreciation rate. For a given minimum ability level, we match the variance of  \beta by adjusting the maximum ability level. However, when we increase the maximum ability to match the variance of  \beta with a very low depreciation rate, average wage growth turns out to be very high compared to the data.

D.6 Elasticity of human capital production function  \mathbf {\alpha }

When  \alpha is higher, there is less diminishing marginal productivity in human capital production. As a result, human capital investment responds more to changes in incentives due, for example, to changes in taxes. The model's explanatory power increases to 65% when we set  \alpha=0.9 and it decreases to 28% when we set it to 0.65. Most recent estimates in the literature are above 0.9 (see, e.g., Heckman et al. (1998); Kuruscu (2006)). Thus, our choice of 0.8 is on the conservative side.


D.7 Results: US versus CEU with Fixed Tax Schedules

Extended Model with SBTC.

Here is the formal statement of the model studied in Section 5.2:

\displaystyle V(h,a,m;\epsilon,s) \displaystyle = \displaystyle \max_{c,n,i,a'(\epsilon')}\left[u(c,n)+\beta E\left(V(h',a'(\epsilon'),m';\epsilon',s+1)\vert\epsilon\right)\right] (20)
\displaystyle \textrm{s.t}.      
\displaystyle (1+\bar{\tau}_{c})c+\sum q(\epsilon'\mid\epsilon)a'(\epsilon') \displaystyle = \displaystyle (1-\bar{\tau}(y))y+a+Tr, (21)
\displaystyle y \displaystyle = \displaystyle \epsilon\left[P_{L}l^{j}+P_{H}h_{s}^{j}\right]n_{s}^{j}(1-i_{s}^{j}). (22)
\displaystyle h' \displaystyle = \displaystyle (1-\delta)h+A^{j}\left[(\theta_{L}l^{j}+\theta_{H}h^{j})i^{j}n^{j}\right]^{\alpha}, (23)
\displaystyle m' \displaystyle = \displaystyle m+1\{i<1\;\&\; n\geq n_{\min}\}, (24)
\displaystyle i \displaystyle \in \displaystyle [0,\chi]\cup\{1\},  

Notice that the only changes are the introduction of raw labor into the labor earnings equation and human capital accumulation function. The weights  \theta_{H} and  \theta_{L} in the production function in (23) capture the relative efficiency of human capital and raw labor in producing new human capital. As in Guvenen and Kuruscu (2010) we focus on the case where  P_{H}=\theta_{H} and  P_{L}=\theta_{L}.
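The constraints (22)-(24) amount to a one-period transition of the state, which the following sketch spells out. All function and argument names are hypothetical, and the numbers used to exercise it would be illustrative rather than the calibrated values.

```python
def earnings(eps, l, h, n, i, p_l, p_h):
    """Equation (22): labor earnings with raw labor l and human capital h."""
    return eps * (p_l * l + p_h * h) * n * (1.0 - i)


def next_human_capital(h, l, i, n, ability, delta, alpha, theta_l, theta_h):
    """Equation (23): depreciated stock plus newly produced human capital."""
    produced = ability * ((theta_l * l + theta_h * h) * i * n) ** alpha
    return (1.0 - delta) * h + produced


def next_work_years(m, i, n, n_min):
    """Equation (24): a year counts as work if not in school and n >= n_min."""
    return m + (1 if (i < 1 and n >= n_min) else 0)


def feasible_investment(i, chi):
    """Investment is either on the job (i in [0, chi]) or full-time school (i = 1)."""
    return 0.0 <= i <= chi or i == 1
```

Setting `i=1` (full-time schooling) makes earnings zero and leaves the work-year counter unchanged, while `i=0` reduces the law of motion for human capital to pure depreciation.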


Table A.4: Rise in Wage Inequality: Model versus Data, 1980-2003 (Change in Log Wage Differentials). The model is calibrated to match the 23 log points rise in L90-10 for the US from 1980 to 2003.
  L90-10 L90-50 + L50-10
CEU Data Level 0.070 0.063 0.007
CEU Data %   91% 9%
CEU Model Level 0.168 0.129 0.039
CEU Model %   77% 23%
US Data Level 0.230 0.160 0.070
US Data %   70% 30%
US Model Level 0.232 0.184 0.048
US Model %   79% 21%
Difference Data: Level 0.160 0.097 0.063
Difference Data: %   61% 39%
Difference Model: Level 0.065 0.056 0.009
Difference Model: %   87% 13%
% Explained 41% 58% 14%

This extended model has some new parameters that need to be calibrated. Except those discussed here, all parameter values are kept at the values given in Table 3. An important point to note is that for the cross-sectional analysis of the previous section, the two-factor model would have precisely the same implications as the one-factor Ben-Porath model used earlier. This is because  \theta_{H} and  \theta_{L} are constant at a point in time and their values can be normalized to generate exactly the same results as in the previous section. Thus, with proper choices of  \theta_{H},  \theta_{L}, and the distribution of  l^{j}, we do not need to recalibrate any other parameter and can still obtain the same results for year 2003 as before. This is the route that we follow in this section.42

For examining the change in inequality over time, we choose  \Delta\log\left(\theta_{H}/\theta_{L}\right) to match the 23 log point rise in L90-10 in the US from 1980 to 2003. The required value of  \Delta\log\left(\theta_{H}/\theta_{L}\right) is 0.236. With this calibration, wage inequality rises by 0.168 in the CEU during the same period, compared to a rise of 0.070 in the data (fourth column of Table A.4). These results imply that differences in labor market policies, even when they are fixed over time, can generate about 41% (  =(0.232-0.168)/(0.230-0.070)) of the widening in the inequality gap between the US and the CEU during this time period.

Another dimension of the rise in wage inequality is seen in the last two columns of Table A.4. Most of the rise in wage inequality in the CEU has been at the top: L90-50 is responsible for 91% of the total rise in L90-10, whereas only 9% of the rise took place at the lower end. A similar, somewhat less extreme, outcome is observed in the US, where 70% of the rise in L90-10 is due to L90-50. The model generates a similar picture: about 77% of the rise in the CEU and 79% in the US is due to L90-50. An alternative way to express these figures is that the model accounts for 58% of the increase in the inequality gap above the median between the US and the CEU but only 14% of the rising gap below the median. As is clear by now, this is a recurring theme in this paper: the model accounts for cross-country inequality facts at the upper tail quite well, but accounts for a smaller fraction at the lower tail.
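The "% Explained" row of Table A.4 is just the ratio of the model's US-CEU difference to the difference in the data. A small sketch of that arithmetic, using the "Difference" rows of the table and hypothetical names, is:

```python
# US-minus-CEU differences in the 1980-2003 rise, from the
# "Difference" rows of Table A.4.
DIFF_DATA = {"L90-10": 0.160, "L90-50": 0.097, "L50-10": 0.063}
DIFF_MODEL = {"L90-10": 0.065, "L90-50": 0.056, "L50-10": 0.009}


def share_explained(measure):
    """Fraction of the widening gap in `measure` generated by the model."""
    return DIFF_MODEL[measure] / DIFF_DATA[measure]


for measure in DIFF_DATA:
    print(f"{measure}: {100 * share_explained(measure):.0f}% explained")
```

Rounded to the nearest percent, these ratios give 41%, 58%, and 14%, the figures in the last row of the table.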


E. Data Appendix: GSOEP and PSID

E.1 Sample Selection and Data Preparation

The sample period for the German SOEP is 1984-2008 and for the PSID is 1968-1992. We keep only males between 25 and 60 years old, regardless of whether they are heads of household. If an individual does not report hours, wages, or income, he is dropped from the sample. To further trim earnings outliers, we exclude observations in which earnings grow by more than 500% or by less than -80%, in which earnings are below 100 Euros (2005) or 2 Dollars (1983) per hour, or in which earnings are top-coded. To ensure consistency, we drop those who report zero hours but positive earnings or zero earnings but positive hours. We also drop individuals who report more than 80 hours per week for the entire year (4160 hours), and flag individuals who work less than one quarter at 40 hours per week (520 hours). In the PSID, we also drop the SEO oversample.
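These selection rules translate into a simple per-observation filter. The sketch below is illustrative only: the record layout and every field name (`male`, `age`, `hours`, `earnings`, `top_coded`, `earnings_growth`) are hypothetical and do not correspond to actual GSOEP or PSID variable names, and the hourly-earnings floor and the SEO oversample rule are left out.

```python
def keep_observation(obs):
    """Return True if a person-year record passes the selection rules.

    obs is a dict; earnings_growth is the growth rate of earnings relative
    to the previous year as a fraction (5.0 means +500%), or None if unknown.
    """
    if any(obs.get(key) is None for key in ("age", "hours", "earnings")):
        return False  # missing hours, earnings, or age
    if not obs.get("male") or not 25 <= obs["age"] <= 60:
        return False
    hours, earnings = obs["hours"], obs["earnings"]
    if (hours == 0) != (earnings == 0):
        return False  # zero hours with positive earnings, or the reverse
    if hours > 4160:
        return False  # more than 80 hours per week all year
    if obs.get("top_coded"):
        return False
    growth = obs.get("earnings_growth")
    if growth is not None and (growth > 5.0 or growth < -0.8):
        return False
    return True


def low_hours_flag(obs):
    """Flag (rather than drop) records below one quarter of full-time work."""
    return obs["hours"] < 520
```

Keeping the low-hours rule as a separate flag mirrors the text, which drops the extreme-hours cases but only marks the low-hours ones.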

In the PSID, we have to identify roles within households to pair the hours of the "wife" and the "head" of household with the right individual. To do so, we use the  \texttt{pnum} variable in 1967 (requiring that the "wife" is female) and the  \texttt{seqnum} and  \texttt{relatehd} variables in subsequent years. The household head gets  \texttt{seqnum} =1, and wives are  \texttt{seqnum} =2 and  \texttt{relatehd} =2 until 1982, when they become  \texttt{relatehd} =20. In a few cases each year, the hours reported at the household level and matched to the individual do not match individually reported hours, and we drop these. We also create a consistent age variable so that age increases by 1 with each observation even when an individual is surveyed at different times in the year.

E.2 Calculations

E.2.1 Residual variables

The lifecycle profiles are based on residual log wages. To obtain residuals, we regress log wages on marital status, race (in the US case), and education level (i.e., dropout, high school or college in the US; and dropout, vocational, high school or college in Germany). In all regressions, the intercept corresponds to an unmarried, white, high school graduate. The regression is repeated for every year of the sample, so the dummy coefficients vary freely over time.

E.2.2 Age Profiles

We construct profiles in much the same way as Deaton and Paxson (1994) and Storesletten et al. (2004b). For each variable, we compute mean and variance within an age-year bin, each defined by a calendar year and a 5 year window of ages. We label these bins by the year and age in the center of the range. We calculate life-cycle profiles with time effects by using coefficients from regressing these bins on both age and year dummies and weighting by the number of individuals in the year-age bin. That is, for mean or dispersion of wages within the age-year bin  (h,t), we estimate

\displaystyle x_{h,t}=d_{h}^{t}+g_{t}+\epsilon_{h,t}
The coefficients on age,  d_{h}^{t}, are stored as a profile relative to a base at the level or dispersion at age 25 in 1985, the group represented by the intercept term. To calculate profiles with cohort effects, we follow the same procedure, using age coefficients from a regression on age and cohort dummies. Again, we use the same shift strategy so the average of the profile is the same, whether controlling for time effects or cohort effects.
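The first step of this procedure, computing moments within age-year bins, can be sketched as follows. The sketch assumes overlapping 5-year windows centered on each age (the text does not say whether windows overlap), uses hypothetical names, and stops before the dummy-variable regression, which would take the resulting bin moments and counts as inputs and weights.

```python
from collections import defaultdict
from statistics import mean, pvariance


def bin_moments(records, half_window=2):
    """Mean, variance, and count of `value` within age-year bins.

    records: iterable of (age, year, value) tuples.
    Returns {(center_age, year): (mean, variance, n)}, where each bin pools
    ages center_age - half_window .. center_age + half_window.
    """
    by_year = defaultdict(list)
    for age, year, value in records:
        by_year[year].append((age, value))

    moments = {}
    for year, obs in by_year.items():
        ages = [age for age, _ in obs]
        first_center = min(ages) + half_window
        last_center = max(ages) - half_window
        for center in range(first_center, last_center + 1):
            values = [v for age, v in obs if abs(age - center) <= half_window]
            if len(values) >= 2:
                moments[(center, year)] = (mean(values), pvariance(values), len(values))
    return moments
```

The count stored in each bin is what the weighting by the number of individuals in the year-age bin would use in the subsequent regression on age and year (or cohort) dummies.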


Bibliography

Altig, D. and C. T. Carlstrom
"Marginal Tax Rates and Income Inequality in a Life-Cycle Model," American Economic Review, 1999, 89, 1197-1215.
Baker, Michael
"Growth-Rate Heterogeneity and the Covariance Structure of Life-Cycle Earnings," Journal of Labor Economics, 1997, 15 (2), 338-375.
Becker, Gary S
"Human Capital, Effort, and the Sexual Division of Labor," Journal of Labor Economics, January 1985, 3 (1), S33-58.
Ben-Porath, Yoram
"The Production of Human Capital and the Life Cycle of Earnings," Journal of Political Economy, 1967, 75 (4), 352-365.
Benabou, Roland
"Unequal Societies: Income Distribution and the Social Contract," American Economic Review, 2000, 90 (1), 96-129.
Bils, Mark, Yongsung Chang, and Sun-Bin Kim
"Comparative Advantage and Unemployment," RCER Working Paper 547, University of Rochester 2009.
Boskin, Michael J.
"Notes on the Tax Treatment of Human Capital," in Conference on Tax Research 1975, Washington: Dept. of Treasury, 1977.
Bound, John, Charles Brown, and Nancy Mathiowetz
"Measurement error in survey data," in J.J. Heckman and E.E. Leamer, eds., Handbook of Econometrics, Elsevier, 2001, chapter 59, pp. 3705-3843.
Browning, Martin, Lars Peter Hansen, and James J. Heckman
"Micro Data and General Equilibrium Models," in J. B. Taylor and M. Woodford, eds., Handbook of Macroeconomics, 1999.
Carey, David and Josette Rabesona
"Tax Ratios on Labor and Capital Income and on Consumption," in "OECD Economic Studies No 35," OECD, 2002.
Castañeda, Ana, Javier Díaz-Giménez, and José-Víctor Ríos-Rull
"Accounting for the U.S. Earnings and Wealth Inequality," The Journal of Political Economy, 2003, 111 (4), 818-857.
Caucutt, Elizabeth M., Selahattin Imrohoroglu, and Krishna B. Kumar
"Does the Progressivity of Income Taxes Matter for Human Capital and Growth?," Journal of Public Economic Theory, 2006, 8 (1), 95-118.
Coile, Courtney and Jonathan Gruber
"The Effect of Social Security on Retirement in the United States," in Jonathan Gruber and David A. Wise, eds., Social Security Programs and Retirement around the World: Micro-Estimation, The University of Chicago Press, 2004.
Conesa, Juan Carlos and Dirk Krueger
"On the optimal progressivity of the income tax code," Journal of Monetary Economics, October 2006, 53 (7), 1425-1450.
Deaton, Angus and Christina Paxson
"Intertemporal Choice and Inequality," Journal of Political Economy, June 1994, 102 (3), 437-67.
Devroye, Dan and Richard B. Freeman
"Does Inequality in Skills Explain Inequality of Earnings Across Countries?," Technical Report, Harvard University 2000.
Domeij, David and Martin Floden
"Inequality Trends in Sweden 1978-2004," Review of Economic Dynamics, 2010, 13 (1), 179-208.
Duncan, Denvil and Klara Sabirianova Peter
"Tax Progressivity and Income Inequality," Working Paper, Georgia State University 2008.
Erosa, A., L. Fuster, and G. Kambourov
"The Heterogeneity and Dynamics of Individual Labor Supply over the Life Cycle: Facts and Theory," Working Paper, University of Toronto 2009.
Erosa, Andres and Tatyana Koreshkova
"Progressive taxation in a dynastic model of human capital," Journal of Monetary Economics, 2007, 54, 667-685.
_, Luisa Fuster, and Gueorgui Kambourov
"A Theory of Labor Supply Late in the Life Cycle: Social Security and Disability Insurance," Technical Report, University of Toronto 2011.
Fuchs-Schündeln, Nicola, Dirk Krueger, and Mathias Sommer
"Inequality Trends for Germany in the Last Two Decades: A Tale of Two Countries," Review of Economic Dynamics, 2010, 13 (1), 103-132.
Gourinchas, Pierre-Olivier and Jonathan A. Parker
"Consumption over the Life Cycle," Econometrica, 2002, 70 (1), 47-89.
Greenwood, Jeremy, Zvi Hercowitz, and Gregory W Huffman
"Investment, Capacity Utilization, and the Real Business Cycle," American Economic Review, June 1988, 78 (3), 402-17.
Gouveia, Miguel and Robert P. Strauss
"Effective federal individual income tax functions: An exploratory empirical analysis," National Tax Journal, 1994, 47 (2), 317-39.
Guvenen, Fatih
"Learning Your Earning: Are Labor Income Shocks Really Very Persistent?," American Economic Review, June 2007, 97 (3), 687-712.
_
"An Empirical Investigation of Labor Income Processes," Review of Economic Dynamics, January 2009, 12 (1), 58-79.
_ and Anthony A Smith
"Inferring Labor Income Risk from Economic Choices: An Indirect Inference Approach," Working Paper, University of Minnesota 2009.
_ and Burhanettin Kuruscu
"A Quantitative Analysis of the Evolution of the U.S. Wage Distribution, 1970-2000," NBER Macroeconomics Annual, 2010, 24 (1), 227-276.
_, _, and Serdar Ozkan
"Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis," NBER Working Papers 15526 2009.
Haider, Steven J.
"Earnings Instability and Earnings Inequality of Males in the United States: 1967-1991," Journal of Labor Economics, 2001, 19 (4), 799-836.
Hassler, John, Jose Mora, Kjetil Storesletten, and Fabrizio Zilibotti
"The Survival of the Welfare State," American Economic Review, 2003, 93 (1), 87-112.
Heathcote, Jonathan, Fabrizio Perri, and Giovanni L. Violante
"Unequal We Stand: An Empirical Analysis of Economic Inequality in the United States, 1967-2006," Review of Economic Dynamics, 2010, 13 (1), 15-51.
_, Kjetil Storesletten, and Giovanni L Violante
"Consumption and Labour Supply with Partial Insurance: An Analytical Framework," C.E.P.R. Discussion Papers 6280 2007.
_, _, and Giovanni L. Violante
"The Macroeconomic Implications of Rising Wage Inequality in the United States," NBER Working Papers 14052 June 2008.
Heckman, James J
"A Life-Cycle Model of Earnings, Learning, and Consumption," Journal of Political Economy, August 1976, 84 (4), S11-44.
Heckman, James, Lance Lochner, and Christopher Taber
"Explaining Rising Wage Inequality: Explanations With A Dynamic General Equilibrium Model of Labor Earnings With Heterogeneous Agents," Review of Economic Dynamics, January 1998, 1 (1), 1-58.
Hornstein, A., P. Krusell, and G. Violante
"Technology-Policy Interaction in Frictional Labor-Markets," Review of Economic Studies, 2007, 74 (4), 1089-1124.
Huggett, Mark, Gustavo Ventura, and Amir Yaron
"Sources of Lifetime Inequality," American Economic Review, forthcoming.
Lucas, Robert E., Jr.
"Supply-Side Economics: An Analytical Review," Oxford Economic Papers, April 1990, 42 (2), 293-316.
Kaplan, Greg
"Inequality and the Life Cycle," Technical Report, University of Pennsylvania 2010.
King, Robert G and Sergio Rebelo
"Public Policy and Economic Growth: Developing Neoclassical Implications," Journal of Political Economy, October 1990, 98 (5), S126-50.
Kitao, S., L. Ljungqvist, and T. Sargent
"A Life Cycle Model of Trans-Atlantic Employment Experiences," Working Paper, USC and NYU 2008.
Krebs, T.
"Human Capital Risk and Economic Growth," Quarterly Journal of Economics, 2003, 118 (2), 709-744.
Kuruscu, Burhanettin
"Training and Lifetime Income," American Economic Review, 2006, 96 (3), 832-846.
Leuven, Edwin, Hessel Oosterbeek, and Hans van Ophem
"Explaining International Differences in Male Skill Wage Differentials by Differences in Demand and Supply of Skill," Economic Journal, 2004, 114, 466-486.
Ljungqvist, Lars and Thomas J. Sargent
"The European Unemployment Dilemma," Journal of Political Economy, June 1998, 106 (3), 514-550.
_ and _
"Two Questions about European Unemployment," Econometrica, January 2008, 76 (1), 1-29.
McDaniel, Cara
"Average tax rates on consumption, investment, labor and capital in the OECD 1950-2003," Working Paper, Arizona State University 2007.
Moene, Karl Ove and Michael Wallerstein
"Inequality, Social Insurance, and Redistribution," The American Political Science Review, 2001, 95 (4), 859-874.
Nickell, Stephen and Brian Bell
"The Collapse in Demand for the Unskilled and Unemployment across the OECD," Oxford Review of Economic Policy, 1995, 11 (1), 40-62.
OECD
The Tax/Benefit Position of Production Workers 1981-1985, Paris: Organisation for Economic Co-Operation and Development, 1986.
Ohanian, Lee, Andrea Raffo, and Richard Rogerson
"Long-Term Changes in Labor Supply and Taxes: Evidence from OECD Countries, 1956-2004," Journal of Monetary Economics, December 2008, pp. 1353-1362.
Prescott, Edward C.
"Why do Americans work so much more than Europeans?," Federal Reserve Bank of Minneapolis Quarterly Review, 2004, (Jul), 2-13.
Rebelo, Sergio
"Long-Run Policy Analysis and Long-Run Growth," Journal of Political Economy, June 1991, 99 (3), 500-521.
Rodriguez, Francisco
"Inequality, Redistribution, and Rent-Seeking." PhD dissertation, Harvard University 1998.
Rogerson, Richard
"Structural Transformation and the Deterioration of European Labor Market Outcomes," Journal of Political Economy, 2008, 116 (2), 235-259.
Storesletten, Kjetil, Chris I. Telmer, and Amir Yaron
"Cyclical Dynamics in Idiosyncratic Labor Market Risk," Journal of Political Economy, June 2004, 112 (3), 695-717.
_, Christopher I. Telmer, and Amir Yaron
"Consumption and risk sharing over the life cycle," Journal of Monetary Economics, April 2004, 51 (3), 609-633.
Wanger, Susanne
"Erwerbstätigkeit, Arbeitszeit und Arbeitsvolumen nach Geschlecht und Altersgruppen," Technical Report 2, IAB Forschungsbericht 2006.



Footnotes

* University of Minnesota and NBER; [email protected]; https://sites.google.com/site/fatihguvenen/
1. University of Toronto; [email protected]; http://sites.google.com/site/bkuruscu
2. Federal Reserve Board; [email protected]; www.serdarozkan.me
3. In contemporaneous work, Duncan and Peter (2008) also construct income tax schedules for a broad set of countries and empirically investigate the relation between progressivity and income inequality. Although their measure of progressivity and income is different from ours along important dimensions, they document a strong negative relationship between progressivity and income inequality, consistent with our findings here.
4. The precise definition of gross wages is given in footnote 12.
5. Recent evidence from panel data on individual wages provides support for individual-specific growth rates in wage earnings (cf. Baker (1997), Guvenen (2007, 2009), Huggett et al. (forthcoming)).
6. Notice that P_{H} (the price of human capital) does not appear in the optimality condition (4) and, thus, has no effect on the human capital decision. For clarity, we set P_{H}=1 from here on.
7. With pecuniary costs of investment, flat taxes can affect human capital investment, as shown by King and Rebelo (1990) and Rebelo (1991). Similarly, Lucas (1990) shows that flat taxes can have a negative impact on human capital investment when labor supply is elastic.
8. Notice that because of the rescaling by n_{\text{avg}}, if a country has sufficiently high labor hours and low progressivity, this wedge measure can become negative (e.g., the US). Therefore, this new measure is defined relative to a given sample of countries, but is still informative about the relative return to human capital within a group of countries, which is the focus of this paper.
9. In reality, pension payments depend on the workers' own earnings history, but modeling this explicitly also adds an extra state variable, which this structure avoids.
10. Notice that this baseline model does not have a capital income tax, which is challenging to introduce for at least two reasons. First, because capital is mobile across international borders (much more so than labor and consumption), it is not exactly clear how we should think about its taxation. Second, and more importantly, capital income taxation introduces significant complications into the numerical solution of the problem. For these two reasons, we abstract from it in the baseline model here. In Appendix D, we study a particular way of taxing capital income and find that, while it matters quantitatively, it does not alter the main conclusions of the paper.
11. In a different context, earlier papers by Rodriguez (1998) and Moene and Wallerstein (2001) empirically documented a negative relation between inequality and redistributive policies other than taxes. In a regression analysis of eighteen advanced industrialized countries, Moene and Wallerstein (2001) find that greater inequality is associated with lower spending on programs to insure against income loss as a share of both GDP and total government spending. Rodriguez (1998) reaches a similar conclusion: using data from 20 OECD countries and controlling for national income, population, and the age distribution, he finds that pretax inequality has a significantly negative effect on every major category of social transfers as a fraction of GDP.
12. Non-wage income taxes (e.g., dividend income, property income, capital gains, interest earnings) and non-cash benefits (free school meals or free health care) are not included in this calculation.
13. We have also experimented with several other functional forms, including a popular specification proposed by Gouveia and Strauss (1994), commonly used in the quantitative public finance literature (cf. Castañeda et al. (2003), Conesa and Krueger (2006), and the references therein). However, we found that the functional form used here provides the best fit across the board for this relatively diverse set of countries, as seen from the high R^{2} values in Table A.1.
14. More precisely, wages are measured before taxes and employees' social security contributions and also include bonuses and overtime pay when applicable. Therefore, they represent a fairly good measure of the total monetary compensation of a worker.
15. The data on average hours per person for each country have been kindly provided to us by Richard Rogerson and are the same as those used in Ohanian et al. (2008).
16. This strong relationship is robust to using wedges calculated from different parts of the income distribution: for example, the correlations between L90-10 and PW(k,k+m), as k and m are varied between 0.5 and 2.5, range from -0.74 to -0.87.
17. Taking the US as the benchmark is motivated by the fact that its economy is subject to far fewer of the labor market rigidities present in the CEU--such as unionization and other distorting institutions. Because these institutions are not modeled in this paper, the US provides a better laboratory for determining the unobservable parameters than other countries where these distortions could be more important for wage determination.
18. Most countries require a minimum number of days of work (or income earned) to qualify for pension benefits, which is captured with n_{\text{min}} in (13). We set n_{\text{min}}=0.10, which does not bind for any country.
19. Although it is common to use higher elasticity values in representative agent macro studies (e.g., Prescott (2004) among many others), values of 0.5 or lower are more common in quantitative models with heterogeneous agents (cf. Heathcote et al. (2008), and Erosa et al. (2009)).
20. We prefer the uniform distribution over a Gaussian distribution because it has a bounded support, so initial human capital and ability can be easily ensured to be non-negative. Another choice would be a log normal distribution, but most empirical measures of ability find it more closely approximated by a symmetric distribution, unlike a log normal one. It will turn out, however, that the wage distribution generated by the model will be closer to log normal with a longer right tail (more consistent with the data), as a result of the convexity arising from the human capital production function.
21. For an excellent survey of the available validation studies and other evidence on measurement error in wage and earnings data, see Bound et al. (2001).
22. http://stats.oecd.org/Index.aspx?DataSetCode=RHMW
23. Our calibration produces wage dynamics that are also consistent with what some authors have called a RIP process. Basically, if we fit an AR(1) process plus an i.i.d. shock to the wage process generated by the model, we find a persistence parameter of 0.937, an innovation standard deviation of 19%, and an i.i.d. shock standard deviation of 18%. These are in line with recent estimates in the literature (see, e.g., Storesletten et al. (2004a)).
24. For example, retirees in Denmark and the Netherlands receive the largest pension payments, with the present value of average retirement wealth exceeding half a million US dollars (as of 2007). The US and the UK, however, have the lowest pension entitlements--less than six times the average annual earnings in each respective country (and less than half the wealth in Denmark and the Netherlands).
25. In the working paper version (Guvenen et al. (2009)), we also modeled an unemployment insurance system that mimics each country's actual system in place. It turned out that this additional feature made little difference (which can be seen by comparing the results in that draft to those reported below), but it came at significant cost to the exposition of the model. Thus, we decided to omit it in this version.
26. The model does poorly in explaining the small L50-10 in France (12%). One reason could be the legal minimum wage (not modeled here), which is equal to 62% of average earnings in France--the highest among the CEU and much higher than the 36% of average earnings in the US. If these differences were modeled, it might be possible to reconcile the model better with the very small lower tail wage inequality in France.
27. Because of the computational burden, these experiments only provide steady state comparisons. Although solving for the full transition path is beyond the scope of this paper, it could be important for the quantitative results, so future work on this issue is certainly warranted.
28. The required change in \log(P_{H}/P_{L}) is 6.7 log points, which is about one-third of the value we used in the first experiment with fixed tax schedules.
29. Although most of the countries in our sample have panel datasets on individuals, many of these datasets are either restricted to researchers that are citizens of that country or have documentation that is not translated into English, making it infeasible for us to use those datasets in our study. The German Socioeconomic Panel (GSOEP) is available to outside researchers upon the submission and approval of a research proposal.
30. For this computation, we use data from 1984 to 1992, which is the period over which the two datasets overlap.
31. The model counterparts of these numbers are also of interest. In the model, the rise in the mean log wages (from age 25 to 55) in the US exceeds the same statistic in Germany by 0.16, compared with the 0.15 (=0.36-0.15) figure in the data when cohort effects are controlled for and 0.10 (=0.37-0.27) when time effects are controlled for. Similarly, in the model, the rise of wage inequality in the US exceeds that in Germany by 0.16, compared with 0.11 and 0.10 in the data without cohort and time effects, respectively. By and large, the model is consistent with the signs and rough magnitudes of the differences seen in the data.
32. In Sweden, from age 25 to 55, the variance of log wages rises by 0.08 when controlling for time effects and falls by 0.06 when controlling for cohort effects; see Domeij and Floden (2010, figs. 13 and 14).
33. These statistics are computed using data from 1984 to 1992, which is the period over which the datasets overlap.
34. The standard way to circumvent this problem is to introduce heterogeneity in work-leisure preferences, which is the route followed by, among others, Heathcote et al. (2007), Bils et al. (2009), and Kaplan (2010). Because hours inequality is not the main focus of this paper, we have not pursued this approach here.
35. For the average hours statistics, we do not use the GSOEP and PSID because these data sets seem to overstate average hours. For example, Fuchs-Schündeln et al. (2010) document that average hours in GSOEP are about 300 hours higher than the NIPA counterpart (called IAB) in 1980 and that this gap grows to more than 500 hours by 2000. This gap seems to be largely attributable to the insufficient treatment in GSOEP of vacation and sick days and other factors that affect the number of work days per year. For consistency, and because similar issues are also relevant for the PSID, we do not use it either. Instead, we rely on CPS data for the US and IAB data for Germany.
36. The IALS survey is composed of three tests: (i) quantitative literacy (measuring arithmetic and analytical skills used in typical work situations); (ii) prose literacy (the skills needed to understand and use information from texts, including editorials, news stories, poems, etc.); and (iii) document literacy (the skills required to locate and use information contained in various formats, including maps, tables, graphs, job applications, etc.). In Table 9, we report the results using the quantitative literacy measure. We omit the other two measures for brevity because they give very similar results across the board.
37. In the working paper version (Guvenen et al. (2009)), we conducted the same analysis using data on all workers rather than on male workers only, as we did here. The results of that analysis were remarkably similar to those found here. To us, this suggests that the mechanisms emphasized in this paper are likely to be as important for female workers as they are for males, despite large differences across countries in female labor force participation.
38. The numerical solution of the model requires care because the individuals' dynamic problem has several sources of non-convexities. As a result, solving for the equilibrium takes about 14 hours for the US and UK, and as much as 30 hours for some countries like Denmark. This makes calibration very time consuming, which prevented us from extending the model in other directions.
39. The notation x(:,\epsilon=\tilde{\epsilon}) indicates that the integral is taken over the entire domain of variables in state vector x, except for \epsilon, which is set equal to \tilde{\epsilon}. Others below are defined analogously.
40. From Table I.7, available for download at www.oecd.org/ctp/taxdatabase.
41. The data for the CEU countries are obtained from Erosa et al. (2011). We thank Gueorgui Kambourov for providing us with their detailed dataset. The data for the US are from Coile and Gruber (2004).
42. More specifically, the two-factor model eliminates initial heterogeneity in human capital but instead introduces raw labor. We make the same assumptions for l^{j} as we made earlier about h_{0}^{j}. That is, we assume that l^{j} is uniformly distributed and is perfectly correlated with A^{j}. We also assume that \theta_{H}=\theta_{L}=1 in 2003, which allows us to use the same mean value and coefficient of variation for l^{j} as for h_{0}^{j} in Table 1.

This version is optimized for use by screen readers. Descriptions for all mathematical expressions are provided in LaTeX format. A printable PDF version is available.