Finance and Economics Discussion Series: 2008-06

Challenges in macro-finance modeling

Don H. Kim*
February 11, 2008

Keywords: Yield curve, term structure, unspanned factors, macro-finance

Abstract:

This paper discusses various challenges in the specification and implementation of "macro-finance" models in which macroeconomic variables and term structure variables are modeled together in a no-arbitrage framework. I classify macro-finance models into pure latent-factor models ("internal basis models") and models which have observed macroeconomic variables as state variables ("external basis models"), and examine the underlying assumptions behind these models. Particular attention is paid to the issue of unspanned short-run fluctuations in macro variables and their potentially adverse effect on the specification of external basis models. I also discuss the challenge of addressing features like structural breaks and time-varying inflation uncertainty. Empirical difficulties in the estimation and evaluation of macro-finance models are also discussed in detail.


1 Introduction

In recent years there has been much interest in developing "macro-finance models", in which yields on nominal bonds are jointly modeled with one or more macroeconomic variables within a no-arbitrage framework. The need to go beyond "nominal yields only" no-arbitrage models (i.e., to include a description of the macroeconomy or other asset prices) has been felt for a long time by academic researchers and policy makers alike. Campbell, Lo, and MacKinlay (1996), for example, have emphasized that, "as the term structure literature moves forward, it will be important to integrate it with the rest of the asset pricing literature." Policy makers have often used traditional theories such as the expectations hypothesis and the Fisher hypothesis to extract an approximate measure of market expectations of interest rates and macroeconomic variables like inflation, but they are also aware that risk premiums and other factors might complicate the interpretation of the information in the yield curve,1 and would welcome any progress in term structure modeling that would facilitate greater understanding of the messages in the yield curve.

Despite a lot of exciting work in macro-finance modeling of late,2 as a central bank economist who monitors markets regularly, I have found it difficult to bring the current generation of models to bear on the practical analysis of bond market developments or to implement the models in real time to obtain a reliable measure of the market's expectation of key variables like inflation.3 In the academic literature there is not much evidence in this regard (either for or against macro-finance models). One exception is the recent paper of Ang, Bekaert, and Wei (2007a, henceforth ABW), which performed an extensive investigation of the out-of-sample inflation forecasting performance of various models and survey forecasts and found that the no-arbitrage models they used perform worse not only than survey forecasts but also than other types of models.4

It thus seems useful at this juncture to review and discuss various challenges in the specification and implementation of macro-finance models that might help shed light on the lack of documented practicality of macro-finance models in general and on the findings of ABW (2007a) in particular. To this end, in this paper I propose to take a closer look at what role the no-arbitrage principle is playing in macro-finance models and reconsider the assumptions that are often made in this literature. No-arbitrage itself is clearly a reasonable assumption, but the models also make additional assumptions whose validity may not have been discussed thoroughly in the existing literature. I also discuss "more advanced" issues (such as structural breaks and time-varying volatility) that require going beyond the standard affine-Gaussian framework of most macro-finance models and the challenges encountered in this regard. A big part of the challenge in macro-finance modeling is empirical, hence I shall also discuss at length the difficulties in the implementation stage (estimation and evaluation of models). Although the main focus of this paper is on the extraction of information from the yield curve (particularly inflation expectations), much of the discussion may be relevant for macro-finance models that were developed to address other issues, as they share some of the key assumptions discussed in this paper.

The state variables in the reduced-form no-arbitrage model framework (on which most macro-finance models are based) can be heuristically viewed as forming a basis onto which to project information in yields and other data. In this paper, I make a distinction between models that use (what I shall call) an "internal basis" versus models that use an "external basis". By an internal basis, I refer to a basis that is determined inside the estimation, hence unknown before the estimation. Latent factor models that describe inflation expectations and term structure jointly, such as Sangvinatsos and Wachter (2005) and D'Amico, Kim, and Wei (2007), are examples of internal basis models. By an external basis, I mean a basis that is a priori fixed completely or partially, as when a specific macroeconomic variable (such as inflation) is taken as a state variable. Note that no-arbitrage guarantees the existence of some pricing kernel, but it does not mean that the pricing kernel can be represented well by a priori selected variables. In this paper, I shall argue that external basis models involve strong assumptions, and discuss potential problems that they may give rise to in practice. All is not well with internal basis models either: the weaker assumptions of these models may come at the cost of the ability to give specific, intuitive interpretation of the yield curve movements. Most importantly, internal basis models face many empirical difficulties that are similar to those in the estimation of external basis models, in particular the overfitting and small-sample problems.

The remainder of this paper is organized as follows. Section 2 reviews the standard affine-Gaussian setup of macro-finance models, derives the affine bond pricing formula in a way that emphasizes the replicating portfolio intuition, and introduces a distinction between internal basis models and external basis models. Section 3 provides a critical examination of the assumptions in external basis models, both in the case of the "low-dimensional" and "high-dimensional" external basis models. Section 4 discusses the challenge of accommodating nonlinear/non-Gaussian effects such as structural breaks and time-varying uncertainties. Potential problems with empirical techniques commonly used in the estimation and evaluation of macro-finance models are discussed in Section 5. Section 6 comes back to ask why surveys perform better than models in inflation forecasting (as documented by ABW (2007a)), and Section 7 concludes.

2 Basic Model

2.1 Affine-Gaussian framework

Most macro-finance models in the literature are based on the "affine-Gaussian" model, given by

m_{t+1} \equiv \log M_{t+1} = -r(x_{t}) -\lambda(x_{t})^{\prime}\epsilon_{t+1}-\frac{1}{2}\lambda_{t}^{\prime}\lambda_{t}, (1)
x_{t+1} = \Phi x_{t} + (I-\Phi)\mu+ \Sigma\epsilon_{t+1},
r_{t} = \rho_{o} + \rho^{\prime}x_{t},
\lambda_{t} = \lambda_{a} + \Lambda_{b} x_{t},

where  M_{t} is the pricing kernel,  x_{t} is an  n-dimensional vector of state variables,  r_{t} is the nominal short rate (i.e., one-period yield), and  \lambda_{t} is the market price of risk of the  n-dimensional shocks  \epsilon_{t+1}. ( \Phi,  \Sigma, and  \Lambda_{b} are  n \times n constant matrices,  \rho and  \lambda_{a} are constant  n-dimensional vectors, and  \rho_{o} is a constant.) A well-known result in finance theory says that no-arbitrage implies the existence of a pricing kernel (stochastic discount factor) of the form (1).5

There is freedom in choosing the specific functional form of  r_{t} and  \lambda_{t} and the dynamics of  x_{t}. Having affine forms for  r_{t} and  \lambda_{t} and the Gaussian specification (VAR(1) specification) of  x_{t} constitutes the affine-Gaussian model. This form has certain limitations (discussed in Section 4), but it is still quite general and capable of encompassing many of the known models in finance and macroeconomics.

Using the recursion relation for the price of a  \tau-period zero-coupon bond at time  t

P_{\tau,t}=E_{t}(P_{\tau-1,t+1}M_{t+1}), (2)

it is straightforward to show that bond prices in this model are given by
P_{\tau,t}=\exp(A_{\tau} + B_{\tau}^{\prime}x_{t}), (3)

where  A_{\tau} and  B_{\tau} are the solution of the difference equations
0 = \rho_{o} + A_{\tau} - A_{\tau-1} -\frac{1}{2}B_{\tau-1}^{\prime}\Sigma\Sigma^{\prime}B_{\tau-1} -B_{\tau-1}^{\prime}((I-\Phi)\mu-\Sigma\lambda_{a}),
0_{n\times1} = \rho+ B_{\tau} -(\Phi-\Sigma\Lambda_{b})^{\prime}B_{\tau-1}, (4)

with boundary condition  A_{\tau=0}=0, B_{\tau=0}=0_{n\times1}; see, for example, Ang and Piazzesi (2003). Hence the framework (1) leads to a tractable affine formula for bond yields  y_{\tau,t}  (=-\log(P_{\tau,t})/\tau).
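As a minimal sketch of how the difference equations (4) can be iterated numerically, the Python code below starts from the boundary condition and builds up  A_{\tau} and  B_{\tau} one maturity at a time. The function names and all parameter values are hypothetical, chosen only for illustration; they are not estimates from any model discussed here.

```python
import numpy as np

def affine_loadings(rho0, rho, Phi, mu, Sigma, lam_a, Lam_b, max_tau):
    """Iterate eq. (4) forward from A_0 = 0, B_0 = 0."""
    n = len(rho)
    A = np.zeros(max_tau + 1)
    B = np.zeros((max_tau + 1, n))
    K = Phi - Sigma @ Lam_b                      # risk-adjusted feedback matrix
    c = (np.eye(n) - Phi) @ mu - Sigma @ lam_a   # risk-adjusted drift term
    SS = Sigma @ Sigma.T
    for tau in range(1, max_tau + 1):
        Bp = B[tau - 1]
        A[tau] = A[tau - 1] - rho0 + 0.5 * Bp @ SS @ Bp + Bp @ c
        B[tau] = K.T @ Bp - rho
    return A, B

def yields(A, B, x):
    """Yields y_tau = -(A_tau + B_tau' x) / tau for tau = 1, ..., max_tau."""
    taus = np.arange(1, len(A))
    return -(A[1:] + B[1:] @ x) / taus

# Hypothetical two-factor parameter values (illustration only).
rho0 = 0.01
rho = np.array([0.5, 0.2])
Phi = np.array([[0.9, 0.0], [0.1, 0.8]])
mu = np.zeros(2)
Sigma = np.array([[0.01, 0.0], [0.0, 0.005]])
lam_a = np.array([-0.2, 0.1])
Lam_b = np.array([[1.0, 0.0], [0.0, -2.0]])
x = np.array([0.02, -0.01])

A, B = affine_loadings(rho0, rho, Phi, mu, Sigma, lam_a, Lam_b, max_tau=40)
y = yields(A, B, x)
```

By construction the one-period yield `y[0]` equals the short rate  \rho_{o} + \rho^{\prime}x_{t}, which gives a quick consistency check on the recursion.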

The original "finance term structure models" such as Dai and Singleton (2000) and Duffee (2002) were written for nominal bond yields only. For example, the model (1) could be estimated with just nominal yields data, with suitable (normalization) restrictions on the parameters  \Phi, \mu, \rho,... to ensure that the model is econometrically identified. The state variables in this case are "latent factors" without an explicit economic meaning.

In a seminal paper, Ang and Piazzesi (2003, henceforth AP) proposed to combine this setup with a description of the macroeconomy. Their basic insight is that the well-known Taylor-rule specification of the short rate also has an affine form:

r_{t} = \rho_{\pi} \pi_{t}^{Y} + \rho_{g} gap_{t} + const, (5)

where  \pi_{t}^{Y} is the annual inflation and  gap_{t} is the GDP gap (log GDP minus log potential GDP).6 Therefore, taking variables like inflation and GDP gap to be part of the state vector in eq. (1), i.e.,
x_{t}=[\pi_{t}^{Y}, gap_{t}, \ldots]^{\prime}, (6)

one can have a system in which bond yields are linked to key macro variables. Some macroeconomic variables might not be well described by simple VAR(1) dynamics, but this is in principle not a problem, as a higher-order VAR process (VAR( q) model) can be written as a VAR(1) process with an expanded state vector that includes lags of these variables (e.g.,  [\pi_{t}^{Y},\pi_{t-1}^{Y},..., gap_{t},gap_{t-1},...]^{\prime}).
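The rewriting of a VAR( q) as a VAR(1) can be sketched as follows. This is an illustrative Python sketch with hypothetical coefficient values, written for demeaned variables (no intercept). It stacks the state as  [z_{t}^{\prime}, z_{t-1}^{\prime},...]^{\prime} with  z_{t}=[\pi_{t}^{Y}, gap_{t}]^{\prime}, which differs from the ordering in the text only by a permutation of the elements.

```python
import numpy as np

def companion(coef_mats):
    """Stack VAR(q) lag matrices [Phi_1, ..., Phi_q] into a VAR(1) companion matrix."""
    q = len(coef_mats)
    k = coef_mats[0].shape[0]
    F = np.zeros((k * q, k * q))
    F[:k, :] = np.hstack(coef_mats)        # top block: the VAR(q) equation itself
    F[k:, :-k] = np.eye(k * (q - 1))       # lower blocks: shift each lag down one slot
    return F

# Hypothetical VAR(2) coefficients for z_t = [inflation, gap] (illustration only).
Phi1 = np.array([[0.5, 0.1], [0.0, 0.4]])
Phi2 = np.array([[0.2, 0.0], [0.1, 0.1]])
F = companion([Phi1, Phi2])

z_t = np.array([1.0, 2.0])
z_tm1 = np.array([0.5, -1.0])
state = np.concatenate([z_t, z_tm1])       # expanded state vector
next_state = F @ state                     # noise-free one-step update
```

The first two elements of `next_state` reproduce the VAR(2) update  \Phi_{1}z_{t}+\Phi_{2}z_{t-1}, and the remaining elements simply carry  z_{t} forward as next period's lag.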

Various macro-finance models differ by the choice of the restrictions imposed on the matrices like  \Phi, \rho,..., etc. For example, AP (2003) adopt an atheoretical (statistical) approach, reminiscent of Sims (1980)'s original VAR proposal, while Hoerdahl, Tristani, and Vestin (2006, henceforth HTV) imposed more structure based on a New-Keynesian model as in Clarida, Gali, and Gertler (2000), though still remaining in the reduced-form framework.

These models are an innovation over the earlier approach to handling long-term bond yields in macroeconomic models. In fact, most macroeconomic models have not dealt with long-term bond yields at all, despite their importance for savings and investment decisions in the economy. Pre-macro-finance models like the Federal Reserve's FRB/US model do contain the 5-year and 10-year nominal yields, which are specified as the expectations-hypothesis-implied yield plus a term premium (the 5-year term premium and the 10-year term premium are modeled separately),7 but the framework (1) allows not just a few selected long-term yields but the entire yield curve information to be integrated with a description of the macroeconomy.

2.2 No-arbitrage and replicating portfolios

While the derivation of the affine bond pricing equation (3) using the recursion relation involving the pricing kernel is simple and elegant, it is useful to re-derive it using the hedging (spanning) argument,8 in order to get a better sense of the role that the no-arbitrage principle is playing in macro-finance models.

Suppose there are  n-dimensional shocks underlying the term structure movements, denoted by a standard normal random vector  \epsilon_{t}. The change in the value of a bond with maturity  \tau can be expressed generally as

\frac{\delta P_{\tau,t+1}}{P_{\tau,t}}= \mu_{\tau,t} + \gamma_{\tau,t}^{\prime}\epsilon_{t+1}, (7)

where I have used the notation  \delta P_{\tau,t+1} for  P_{\tau -1,t+1}-P_{\tau,t} (the change in the value of a bond which was of time-to-maturity  \tau at time  t) to avoid confusion with simple time-differencing  \Delta P_{\tau,t+1}=P_{\tau,t+1}-P_{\tau,t},  \mu_{\tau,t} is the one-period expected return on a bond that has time-to-maturity  \tau at time  t (i.e.,  \mu_{\tau,t}=E_{t}(\delta P_{\tau,t+1}/P_{\tau,t})), and the  n-dimensional vector  \gamma_{\tau,t} is the loading on the shocks that determine the unexpected return.

Consider a portfolio formed by taking positions in  n+1 bonds with maturities  \tau_{1},\tau_{2},...,\tau_{n+1}, with portfolio weights  w_{1t},...,w_{n+1,t}. Denoting the value of this portfolio by  V, the return on the portfolio is given by

\frac{\delta V}{V} = w_{1} \frac{\delta P_{\tau_{1}}}{P_{\tau_{1}}} + \cdots + w_{n+1} \frac{\delta P_{\tau_{n+1}}}{P_{\tau_{n+1}}}=\sum_{i=1}^{n+1} w_{i}\mu_{\tau_{i}} + \left(\sum_{i=1}^{n+1} w_{i} \gamma_{\tau_{i}}\right)^{\prime}\epsilon, (8)

where the time index  t has been suppressed for notational simplicity. If the portfolio is locally risk-free (  \sum_{i=1}^{n+1} w_{i} \gamma_{\tau_{i}}=0), then by no-arbitrage it should earn the risk-free rate (one-period yield), thus
\sum_{i=1}^{n+1} w_{it} \mu_{\tau_{i},t} = r_{t}. (9)

Together with  \sum_{i} w_{it}=1, this implies, in matrix form,
\left[\begin{matrix}\mu_{\tau_{1},t}-r_{t} & \cdots & \mu_{\tau_{n+1},t}-r_{t} \\ \gamma_{\tau_{1},t} & \cdots & \gamma_{\tau_{n+1},t}\end{matrix}\right] w_{t} =0_{(n+1)\times1}, (10)

where  w_{t}=[w_{1t},...,w_{n+1,t}]^{\prime}. In order for this matrix equation to have a nontrivial (i.e., nonzero) solution  w_{t} for an arbitrary choice of  \tau_{i}'s, the expected excess return  \mu_{\tau ,t}-r_{t} has to be a linear combination of  \gamma_{\tau,t}, i.e.,
\mu_{\tau,t} - r_{t} =\gamma_{\tau,t}^{\prime}\lambda_{t}, (11)

where the  n-dimensional vector  \lambda_{t} ("market price of risk") expresses the linear dependence between  \mu_{\tau ,t}-r_{t} and  \gamma_{\tau,t}.
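The hedging argument can be checked numerically with a small sketch. Assuming hypothetical shock loadings for  n+1=3 bonds and a hypothetical market price of risk (none of these numbers come from data), the Python code below solves for weights that sum to one and have zero net exposure to each shock, and then evaluates the portfolio's expected return when expected returns satisfy eq. (11).

```python
import numpy as np

def hedge_weights(gammas):
    """gammas: n x (n+1) array whose column i is the shock loading of bond i.

    Returns portfolio weights that sum to one and have zero exposure to every shock.
    """
    n, m = gammas.shape
    assert m == n + 1, "need n+1 bonds to hedge n shocks"
    M = np.vstack([gammas, np.ones(m)])    # n exposure conditions plus adding-up
    rhs = np.zeros(m)
    rhs[-1] = 1.0
    return np.linalg.solve(M, rhs)

# Hypothetical loadings for three bonds on two shocks (illustration only).
gammas = np.array([[-0.5, -1.5, -3.0],
                   [-0.2, -0.9, -1.2]])
lam = np.array([-0.3, 0.1])                # hypothetical market price of risk
r = 0.01                                   # hypothetical short rate

mu = r + gammas.T @ lam                    # expected returns implied by eq. (11)
w = hedge_weights(gammas)
portfolio_return = w @ mu
```

Because the weights remove all shock exposure, `portfolio_return` coincides with `r` whenever expected excess returns are linear in the loadings, which is the content of eqs. (9) and (11).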

It is often more convenient to deal with log prices and log returns on bonds,  \delta\log P_{\tau,t+1} (  =\log P_{\tau-1,t+1}-\log P_{\tau,t}). From the discrete-time version of Ito's lemma,9 one has

\delta\log P_{\tau,t+1}= \tilde{\mu}_{\tau,t} + \gamma_{\tau,t}^{\prime}\epsilon_{t+1}, (12)

where
\tilde{\mu}_{\tau,t} = E_{t}\left( \frac{\delta P}{P}\right) - \frac{1}{2} var_{t} \left(\frac{\delta P}{P}\right) =\mu_{\tau,t} - \frac{1}{2}\gamma_{\tau,t}^{\prime}\gamma_{\tau,t}. (13)

Thus, eq. (11) can also be written
\tilde{\mu}_{\tau,t}- r_{t} + \frac{1}{2}\gamma_{\tau,t}^{\prime}\gamma_{\tau,t}= \gamma_{\tau,t}^{\prime}\lambda_{t}. (14)

Note that the derivation thus far has been quite general. If the short rate and market price of risk are affine in the state variables and if the state variables follow a VAR(1) process (i.e., eq. (1)), one obtains a particularly simple result. Positing that the bond prices take the form  \log P_{\tau, t}= A_{\tau} + B_{\tau}^{\prime}x_{t}, one has (from eq. (12))

\tilde{\mu}_{\tau,t} = A_{\tau-1} - A_{\tau} +B_{\tau-1}^{\prime}(I-\Phi)\mu+ (B_{\tau-1}^{\prime}\Phi-B_{\tau}^{\prime})x_{t}, (15)
\gamma_{\tau,t}^{\prime} = B_{\tau-1}^{\prime}\Sigma. (16)

Substituting these (and the expressions for  r_{t} and  \lambda_{t}) into eq. (14) gives the same difference equation for bond prices as in eq. (4), hence the same bond prices, as promised earlier.
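For readers who want the intermediate step, the substitution can be written out as follows. This is a worked restatement using only eqs. (1) and (14)-(16): the first line is eq. (14) with every term replaced by its affine expression, and the next two lines collect the constant terms and the terms multiplying  x_{t}, which must each vanish for the equation to hold for all  x_{t}.

```latex
\begin{aligned}
&\underbrace{A_{\tau-1}-A_{\tau}+B_{\tau-1}^{\prime}(I-\Phi)\mu
  +(B_{\tau-1}^{\prime}\Phi-B_{\tau}^{\prime})x_{t}}_{\tilde{\mu}_{\tau,t}}
 -\underbrace{(\rho_{o}+\rho^{\prime}x_{t})}_{r_{t}}
 +\tfrac{1}{2}B_{\tau-1}^{\prime}\Sigma\Sigma^{\prime}B_{\tau-1}
 =B_{\tau-1}^{\prime}\Sigma(\lambda_{a}+\Lambda_{b}x_{t}), \\
&\text{constant terms:}\quad
 0=\rho_{o}+A_{\tau}-A_{\tau-1}
 -\tfrac{1}{2}B_{\tau-1}^{\prime}\Sigma\Sigma^{\prime}B_{\tau-1}
 -B_{\tau-1}^{\prime}((I-\Phi)\mu-\Sigma\lambda_{a}), \\
&\text{terms in } x_{t}:\quad
 0_{n\times1}=\rho+B_{\tau}-(\Phi-\Sigma\Lambda_{b})^{\prime}B_{\tau-1}.
\end{aligned}
```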

2.3 Internal basis models versus external basis models

The key formula in the above derivation of the bond pricing equation is eq. (11), or equivalently eq. (14). It states that the expected return on a bond of arbitrary maturity in excess of the short rate depends on the product of the bond-independent market price of risk,  \lambda_{t}, and the bond's sensitivity to risk,  \gamma_{\tau,t}. The basic intuition underlying eq. (11) is that the yield curve is "smooth", so the risks to a bond can be hedged well by a portfolio of (a relatively small number of) other bonds. This is well known from the factor analysis of Litterman and Scheinkman (1991) and other studies. One can also see this from the regression of the quarterly change in the 5-year yield on the changes in 6-month, 2-year, and 10-year yields, which gives very high  R^{2}s (e.g., 99%).

Note that eq. (11) itself is silent about the structure of the  \lambda_{t} vector, except for the condition that it does not depend on bond-specific information (like maturity). In fact, the early generation of affine-Gaussian models assumed a constant market price of risk vector  \lambda, which in effect implied a version of the expectations hypothesis. Later studies recognized that  \lambda_{t} can depend on the state of the economy, thus a variable influencing the market price of risk would also influence bond prices.10 However, this creates, in a sense, too large a set of possibilities - any variable, e.g., coffee production in Brazil, could in principle enter the expression for market price of risk and, in turn, the expression for bond yields.

Latent-factor models of the term structure, such as the affine-Gaussian model of Duffee (2002) ( EA_{0}(n) model in Duffee's terminology), partly get around this problem by implicitly defining the model in statistical terms. A "maximally flexible"  n-dimensional affine-Gaussian model (1) can be viewed as an answer to the question, "what is the most general  n-dimensional representation of the yield dynamics in which yields are Gaussian, linear in some basis, and consistent with no-arbitrage?"11 As the yield curve seems to be well described by a small number of risk sources, it stands to reason that there exists a suitable representation for a relatively small  n. Thus, the no-arbitrage principle in this setting can help describe the rich variation of the yield curve in a tractable and relatively parsimonious way, while allowing for a general pricing of risk (as opposed to the expectations hypothesis).

Duffee (2002)'s affine-Gaussian model describes only the nominal yield curve, but it is straightforward to write down a "joint model" of nominal yields and inflation in the same spirit by combining eq. (1) with the following specification of the inflation process,

\pi_{t+1} = \chi(x_{t}) + \tilde{\sigma}^{\prime}\tilde{\epsilon}_{t+1}, (17)
\chi(x_{t}) = \psi_{o} + \psi^{\prime}x_{t},

where the one-period inflation  \pi_{t+1} (  =\log(Q_{t+1}/Q_{t}),  Q_{t} being the price level) consists of the one-period expected inflation  \chi(x_{t}) and unforecastable inflation  \tilde{\sigma}^{\prime}\tilde{\epsilon}_{t+1}. As in the case of the nominal short rate  r_{t}, the one-period inflation expectation is specified as an affine function of the state vector  x_{t}. The disturbance vector  \tilde{\epsilon}_{t} includes the vector of shocks that move interest rates (  \epsilon_{t} in eq. (1)) and a shock (say  \epsilon_{t}^{\perp}) that is orthogonal to the interest rate shocks.12 As in the nominal-yields-only model, the state vector  x_{t} is a vector of statistical variables (latent variables), which is determined only up to normalization restrictions (on parameter matrices  \Phi,  \rho_{o},  \rho,  \psi_{o},  \psi,...) that ensure the (maximal) identification of the model. I shall refer to such a model as an "internal basis model," as the state vector  x_{t} is unknown before the estimation and is determined inside the estimation with yields, inflation, and possibly other data.13

Such a joint model makes only fairly weak assumptions: writing the one-period inflation as the sum of expected inflation and unexpected inflation in eq. (17) is quite general, and it makes intuitive sense to have the state vector  x_{t} describe inflation expectations and bond yields together, as a variable that moves inflation expectations would also be expected to move nominal interest rates. At the same time, this formulation relaxes the assumptions implicit in the two traditional theories of nominal yields: it goes beyond the expectations hypothesis, as it now allows for time-varying term premia, and the Fisher hypothesis, as it now implicitly allows for a general correlation between real rates and inflation.

Note that the state vector  x_{t} in the joint model has more economic meaning than the nominal-yields-only model in the sense that it is now (implicitly) related to objects like inflation expectations and inflation risk premia. However, the fact that  x_{it}'s are still latent factors is potentially an unattractive feature, and makes it difficult to discuss bond market developments in a simple manner. One would not win an "effective communication award" by telling market participants that "bond yields moved x basis points because latent factors did this and that."

Thus, many papers in the macro-finance literature take all or part of the state vector to be specific macroeconomic variables (or variables with clear macroeconomic interpretation) so as to make the connection between the yield curve and macroeconomy more explicit. These variables form an external basis, in the sense that they are a priori fixed, partially ("mixed" models) or completely (observables-only models). Simply speaking, internal basis models try to project information in yields  y_{\tau,t} and "observable" macro variables  f^{o}_{it} onto the state vector  x_{t} consisting of unobservable variables  f^{u}_{it}, while external basis models try to project information in yields onto "observable" macro variables  f^{o}_{it} and latent variables (if there are any). Schematically,

internal basis: \{ y_{\tau,t} \}, \{ f^{o}_{it} \} \Rightarrow x_{t} = [f^{u}_{1t},f^{u}_{2t},...]^{\prime}.
external basis: \{ y_{\tau,t} \} \Rightarrow x_{t} =[f^{o}_{1t},f^{o}_{2t},...,f^{u}_{1t},f^{u}_{2t},...]^{\prime}.

As one moves on to external basis models, one might also be moving away from the relative comfort of the original intuition behind no-arbitrage (the smoothness of the yield curve); hence a close scrutiny of the additional assumptions that they involve is warranted.

3 Examining the assumptions in the external basis models

3.1 Unspanned short-run inflation

One implication of having a macroeconomic variable like inflation as a state variable in the setup of eq. (1) is that short-run inflation risk can be hedged by taking positions in nominal bonds.14 Many practitioners, however, would be skeptical about this claim. Policy makers are well aware of large short-run variations in price indices such as PPI and CPI that do not require a policy response, and they are careful to "smooth through the noise" in interpreting data on inflation. Blinder (1997) puts this clearly and strongly: "[The noise issue] was my principal concern as Vice-Chairman of the Federal Reserve. I think it is a principal concern of central bankers everywhere."

Market participants are also (implicitly) cognizant of these issues. One striking piece of evidence is the bond market's reaction to the announcement of total CPI (also called "headline CPI", or simply "CPI") and core CPI. Core CPI is an inflation measure obtained by stripping out the volatile food and energy prices from total CPI. As can be seen in Figure 1a, monthly inflation based on total CPI is substantially more volatile than that of core CPI, and annual (year-on-year) inflation rates based on core CPI and total CPI can also differ significantly (Figure 1b). In the US, core CPI and total CPI for each month are announced in the following month (by the Bureau of Labor Statistics, typically in the second or third week). Before the release of the data, business economists take part in a survey about what the released numbers are going to be, from which "consensus expectations" are computed. The released number minus this consensus number can be viewed as a measure of the surprise component of the announcement.15

Figure 1: US monthly and annual inflation in core CPI and headline CPI (total CPI).
Figure 1 is a line chart with two panels showing the US monthly and annual inflation in core CPI and headline CPI dating back to 1990. The date is shown on each horizontal axis while inflation (in percent) is shown on each vertical axis. Panel A shows monthly (month-on-month) CPI inflation while panel B shows annual (year-on-year) CPI inflation. Panel A shows that monthly inflation based on total CPI is substantially more volatile than that of core CPI. Headline inflation spikes to +15 percent and -10 percent in some months, while core remains in a narrow range. Panel B shows that annual inflation based on core CPI and total CPI can also differ significantly, although less dramatically.

The regression of the change in the 2-year yield surrounding the data release (denoted  \Delta y_{2Y,t}) on the surprise component of core CPI or total CPI (denoted  \Delta CORE_{t} and  \Delta TOTAL_{t}, respectively) in the 1990-2006 period gives:16

\Delta y_{2Y,t} = 0.11_{(0.23)} + 18.41_{(2.39)}\Delta CORE_{t} + e_{t}, (18)
\Delta y_{2Y,t} = 0.10_{(0.26)} + 6.97_{(2.07)}\Delta TOTAL_{t} +e_{t}, (19)

where the standard errors are given in parentheses. The coefficients on the surprise component in both cases are positive, in line with intuition: a positive inflation surprise leads to an upward revision in yields.

The more interesting case, however, is when both surprise components are used as regressors:

\Delta y_{2Y,t} = 0.09_{(0.24)} + 19.49_{(2.88)}\Delta CORE_{t} -1.49_{(2.22)}\Delta TOTAL_{t} + e_{t}. (20)

Note that the coefficient on  \Delta TOTAL_{t} is now insignificant. In other words, once the information in the core CPI surprise is taken into account, the total CPI surprise has no explanatory power. There are also enough instances of the total CPI and core CPI surprises having opposite signs (about 40 in 1990-2006) to re-do the regression (19) for them only, which gives:
\Delta y_{2Y,t} = -0.03_{(0.35)} -8.91_{(3.22)}\Delta TOTAL_{t} + e_{t}. (21)

The coefficient on  \Delta TOTAL_{t} is now counterintuitive (negative) and significant.

These results do not necessarily mean that the "extra" components in the total CPI (food and energy prices) are completely irrelevant to the yield curve. They do, however, raise the question as to whether it is reasonable to treat the fluctuations in total CPI as risks that are spanned by the yield curve factors (an implicit assumption in most external-basis macro-finance models).

In a more prosaic approach, one can also examine the spanning of short-run inflation risk by regressing the change in quarterly inflation  \pi_{t} onto the changes in 6-month, 2-year, and 10-year yields.17 This regression gives an  R^{2} of at most 10% in the 1965-2006 period, in stark contrast to the aforementioned regression of the change in the 5-year yield ( R^{2} of 99%). Even when the lagged inflation terms are included, as in

\Delta\pi_{t}= a + b_{1}\Delta\pi_{t-1}+ b_{2}\Delta\pi_{t-2} + b_{3}\Delta\pi_{t-3} + b_{4}\Delta y_{6M,t} + b_{5} \Delta y_{2Y,t} +b_{6} \Delta y_{10Y,t} + e_{t}, (22)

the  R^{2}'s do not exceed 40%.18 This exercise is similar in spirit to Collin-Dufresne and Goldstein (2002), who argue that the relatively low  R^{2}'s in the regressions of the changes in interest rate derivative prices on the changes in interest rates indicate the presence of "unspanned stochastic volatility" in interest rates.
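The logic of this spanning comparison can be illustrated with a short Python sketch on synthetic data. Everything below is an assumption made for illustration: the three-factor structure, the loadings, the noise scales, and the sample length are hypothetical, and the resulting  R^{2} values say nothing about the actual data or the figures reported in the text. The sketch only shows how a series driven by the common yield factors yields a high  R^{2}, while a series dominated by shocks outside the yield curve does not.

```python
import numpy as np

def r_squared(y, X):
    """R^2 from an OLS regression of y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
T = 168  # hypothetical number of quarterly observations

# Synthetic yield changes: three common factors move all four maturities.
factors = rng.standard_normal((T, 3))
loadings = np.array([[1.0, 1.0, 1.0, 1.0],      # level-like factor
                     [1.0, 0.5, 0.0, -0.6],     # slope-like factor
                     [-0.3, 0.4, 0.5, -0.2]])   # curvature-like factor
d_yields = factors @ loadings + 0.02 * rng.standard_normal((T, 4))
d_6m, d_2y, d_5y, d_10y = d_yields.T

# Synthetic inflation changes: mostly noise unrelated to the yield factors.
d_pi = 0.2 * factors[:, 0] + rng.standard_normal(T)

X = np.column_stack([d_6m, d_2y, d_10y])
r2_yield = r_squared(d_5y, X)        # 5-year yield change on other yield changes
r2_inflation = r_squared(d_pi, X)    # inflation change on the same yield changes
print(f"R^2, 5-year yield change: {r2_yield:.3f}")
print(f"R^2, inflation change:    {r2_inflation:.3f}")
```

With actual data, `d_6m`, `d_2y`, `d_5y`, `d_10y`, and `d_pi` would be replaced by the observed quarterly changes, and lagged inflation changes could be appended to `X` to mirror eq. (22).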

The evidence for poor spanning of short-run inflation risk raises questions as to whether external basis models are compatible with the no-arbitrage principle. Let us now address a related question -- whether external basis models can properly describe inflation expectations, which, according to the Fisher hypothesis intuition, are an important determinant of the nominal term structure.

3.2 Do macro variables form a suitable basis for representing expectations?

To those who engage in inflation forecasting extensively, the poor inflation forecast performance of macro-finance models like those of ABW (2007a) might not be a surprise: a long line of research has explored the inflation forecasting performance of the yield curve information and generally obtained disappointing results. Stock and Watson (2003) summarize the situation thus: "With some notable exceptions, the papers in this literature generally find that there is little or no marginal information content in the nominal interest rate term structure for future inflation."

One paper that does find evidence for the predictive information in the yield curve for future inflation is Mishkin (1990), so it is worth updating his results. Mishkin's regression takes the form

\pi_{t+\tau,t}-\pi_{t+u,t}= \alpha_{\tau,u} + \beta_{\tau,u} (y_{\tau,t} -y_{u,t}) + e_{t+\tau}, (23)

where  \pi_{t+\tau,t} is the inflation between time  t and  t+\tau, i.e.,  \pi_{t+\tau,t}=(1/\tau)\log(Q_{t+\tau}/Q_{t}), and  \tau> u. This regression is motivated by the Fisher hypothesis, which can be stated as
E_{t}(\pi_{t+u,t})= y_{u,t} - \text{real yield}. (24)

If the real yield were constant, subtracting this from the same equation with maturity  \tau would give eq. (23) with  \beta=1. Mishkin argued that his finding of statistically significant  \beta's indicates the usefulness of the information in the yield curve for inflation forecasting.

With yield data from 1960 to 1983, I obtain a result similar to Mishkin's: for example, running the regression (23) for  \tau=2-year and  u=1-year gives a  \beta_{\tau,u} of 2.32 (with standard error 0.28), which is indeed large and significant, and in fact larger than 1 (which is also the case in Mishkin (1990) with both his "full" sample and "pre-October 1979" sample). However, the same regression with the more recent 1984-2006 sample gives a much smaller  \beta_{\tau,u} of 0.17 (with standard error 0.26). As discussed in Appendix B, the Mishkin regression coefficient probably has an upward bias, which may explain why the coefficient in the earlier-period sample is substantially larger than 1. But this bias also suggests that the coefficient in the 1984-2006 sample, already small, may have been overstated. In sum, even the Mishkin regression provides little support for the usefulness of the yield curve information in the more recent sample period (which is presumably a more relevant period for current applications).

Most of the regression-based inflation forecasting models in the literature include current and lagged inflation as regressors in order to take into account the persistence of inflation. The expected inflation over the next year in these models takes the form

E_{t}(\pi_{t+1Y,t})= a+b_{0}\pi_{t}^{*} + b_{1}\pi_{t-1}^{*}+ \cdots + c^{\prime}z_{t}, (25)

where  \pi_{t}^{*} is either the one-period inflation or annual inflation, and the vector  z_{t} denotes other regressors, which could include term structure variables.

Consider a macro-finance model (1) that has quarterly (one-period) inflation  \pi_{t} as a state variable. In other words,  x_{t}=[\pi_{t},\tilde{z}_{t}^{\prime}]^{\prime}, where  \tilde{z}_{t} (=[\tilde{z}_{1t},\tilde{z}_{2t},\ldots]^{\prime}) denotes other state variables. The expected inflation over the next year is

\displaystyle E_{t}(\pi_{t+1Y,t}) \displaystyle = [1,0,...,0] ((\Phi+\Phi^{2}+\Phi^{3}+\Phi^{4})(x_{t}-\mu) +\mu)  
  \displaystyle = \tilde{a} + \tilde{b}_{0} \pi_{t} + \tilde{b}_{1} \tilde{z}_{1t} +\tilde{b}_{2} \tilde{z}_{2t}+\cdots, (26)

which is in the same form as eq. (25).19 (The case is similar with models that use annual inflation as a state variable.) As such, the difference between the macro-finance models formulated this way and the regression models is simply in the coefficients, not in the basis. There is a possibility of an "efficiency gain" with no-arbitrage models (through the imposition of useful constraints on the coefficients), but even this is not assured, if the results in ABW (2007a) are any indication. More fundamentally though, the frequently poor inflation forecast performance of regression models and macro-finance models like ABW(2007a) raises questions about the basis itself.
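
To make the mapping in eq. (26) concrete, the following is a minimal sketch that computes the loadings  \tilde{b}_{0},\tilde{b}_{1},\ldots and the intercept  \tilde{a} from a transition matrix  \Phi and mean vector  \mu, following eq. (26) as written. The 3x3 matrix `PHI` and the vector `MU` are made-up values chosen only for illustration; they are not estimates from any model discussed in this paper.

```python
# Sketch of eq. (26): loadings of the one-year inflation forecast on the state vector.
# PHI and MU below are hypothetical illustrative values, not estimates.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def forecast_loadings(phi, horizon=4):
    """First row of Phi + Phi^2 + ... + Phi^horizon, i.e. [b0, b1, ...] in eq. (26)."""
    power = phi
    total = phi
    for _ in range(horizon - 1):
        power = matmul(power, phi)
        total = matadd(total, power)
    return total[0]

PHI = [[0.6, 0.1, 0.2],   # assumed dynamics; inflation is the first state variable
       [0.0, 0.9, 0.0],
       [0.0, 0.0, 0.8]]
MU = [3.0, 0.0, 0.0]      # assumed unconditional means

b = forecast_loadings(PHI)
# Intercept: [1,0,...,0] mu minus the loadings applied to mu.
a = MU[0] - sum(bi * mi for bi, mi in zip(b, MU))

def expected_inflation(x):
    """Right-hand side of eq. (26) evaluated at state vector x."""
    return a + sum(bi * xi for bi, xi in zip(b, x))
```

Whatever values are chosen for `PHI`, the forecast remains a linear function of the contemporaneous state vector only, which is the sense in which the basis coincides with that of the regression (25).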

3.3 Lessons from simple models

Some of the key conceptual issues in the representation of the yield curve and inflation expectations may be explained through a comparison of two simple models of inflation, namely AR(1) and ARMA(1,1) models:

\displaystyle \pi_{t} \displaystyle = (1-\phi)\mu+ \phi\pi_{t-1} + \varepsilon_{t}, \hspace{3cm}\mathrm{(AR)} (27)
\displaystyle \pi_{t} \displaystyle = (1-\phi)\mu+ \phi\pi_{t-1} + \varepsilon_{t} -\alpha\varepsilon_{t-1}. \hspace{1.5cm} \mathrm{(ARMA)} (28)

The  \tau-period-ahead inflation expectations in both models take the form
\displaystyle E_{t}(\pi_{t+\tau})= \phi^{\tau-1} (\chi_{t}-\mu) + \mu, (29)

where the expected one-period inflation  \chi_{t} \equiv E_{t}(\pi_{t+1}) for the AR(1) model is given by
\displaystyle \chi_{t} = \phi\pi_{t}+(1-\phi)\mu (30)

and  \chi_{t} for the ARMA(1,1) model is given by
\displaystyle \chi_{t} = \phi\pi_{t} -\alpha\varepsilon_{t} +(1-\phi)\mu. (31)

The estimate of  \phi in the AR(1) model, based on US quarterly CPI inflation data from 1960Q1 to 2005Q4, is 0.785  _{(0.045)}, while the estimates of  \phi and  \alpha in the ARMA(1,1) model are 0.935  _{(0.031)} and 0.341  _{(0.081)}, respectively, standard errors being in parentheses. These numbers imply fairly similar one-quarter-ahead inflation expectations, as can be seen in Figure 2a. (There is somewhat more jaggedness in the AR(1) forecast.) The same parameter estimates, however, imply very different longer-horizon inflation expectations: the 5-year-ahead (20-quarter-ahead) inflation expectation from the AR(1) model is almost constant, while the 5-year-ahead inflation expectation from the ARMA(1,1) model is more variable. (This reflects the difference between  0.785^{20-1}=0.01 versus  0.935^{20-1}=0.28 in eq. (29).)
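
The horizon effect can be checked directly from eq. (29). The short sketch below evaluates the loading  \phi^{\tau-1} at several horizons using the two point estimates of  \phi quoted above; the function names are introduced here only for illustration.

```python
# Horizon loadings phi**(tau - 1) from eq. (29), using the point estimates in the text.
PHI_AR = 0.785     # estimate of phi in the AR(1) model
PHI_ARMA = 0.935   # estimate of phi in the ARMA(1,1) model

def horizon_loading(phi, tau):
    """Weight on (chi_t - mu) in the tau-period-ahead expectation, eq. (29)."""
    return phi ** (tau - 1)

def expectation(chi, mu, phi, tau):
    """tau-period-ahead inflation expectation implied by eq. (29)."""
    return mu + horizon_loading(phi, tau) * (chi - mu)

# Compare the two models at 1, 4, and 20 quarters ahead.
for tau in (1, 4, 20):
    print(tau, horizon_loading(PHI_AR, tau), horizon_loading(PHI_ARMA, tau))
```

At 20 quarters the AR(1) loading is close to zero, so almost any value of  \chi_{t} maps into an expectation near  \mu, whereas the ARMA(1,1) loading still passes through more than a quarter of the deviation of  \chi_{t} from  \mu.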

Figure 2: US inflation expectations based on AR(1) and ARMA(1,1) models.
Figure 2 is a line chart with two panels showing US inflation expectations based on fitting AR(1) and ARMA(1,1) models dating back to 1960. The date is shown on each horizontal axis while inflation expectation (in percent) is shown on each vertical axis. Panel A shows 1-quarter-ahead inflation expectations while panel B shows 5-year-ahead inflation expectations. Panel A shows that the two models give fairly similar one-quarter-ahead inflation expectations (though there is somewhat more jaggedness in the AR(1) forecast). Panel B, however, shows that the two forecasting models produce very different longer-horizon expectations: the 5-year-ahead (20-quarter-ahead) inflation expectation from the AR(1) model is almost constant, while the 5-year-ahead inflation expectation from the ARMA(1,1) model is more variable.

An almost constant 5-year-ahead inflation expectation from the AR(1) model in the past 40 years is highly questionable. The main reason for the qualitative difference between the AR and ARMA models is that the ARMA(1,1) model tries to separate out the "unforecastable inflation" from the expected inflation, while the AR(1) model does not. This can be seen from the fact that the ARMA(1,1) model is a univariate representation of the following "two-component model":

\displaystyle \pi_{t} \displaystyle = \chi_{t-1} + \eta_{t} (32)
\displaystyle \chi_{t} \displaystyle = (1-\phi)\mu+ \phi\chi_{t-1} + e_{t},  
  \displaystyle \eta_{t}\sim N(0,\sigma_{\pi}^{2}), e_{t} \sim N(0,\sigma_{x}^{2}), corr(\eta_{t},e_{t})=\varrho,  

in which  \chi_{t} is an expected inflation process and  \eta_{t} is an unforecastable inflation.20 Though simple, this two-component model (of which the internal basis model (17) discussed in Section 2.3 can be viewed as an extension) is quite useful for illustrating some of the key points in this paper.21

The unforecastable inflation component  \eta_{t} in eq. (32) can help explain several puzzling empirical results in the literature. Among them is the negative one-lag autocorrelation of the changes in quarterly inflation  \Delta\pi_{t} (  = \pi_{t}-\pi_{t-1}), which, according to Rudd and Whelan (2006, Sec III.C), is evidence against the new-Keynesian Phillips curve models (which generate positive one-lag autocorrelation). In the case of the two-component model (32), one has

\displaystyle cov(\Delta\pi_{t}, \Delta\pi_{t-1}) = cov(\Delta\eta_{t},\Delta\eta_{t-1}) +cov(\Delta\chi_{t-1},\Delta\chi_{t-2})+ cov(\Delta\chi_{t-1},\Delta\eta_{t-1}). (33)

The obviously negative first term dominates the second and third terms at appropriate parameter values, resulting in a negative  cov(\Delta\pi_{t}, \Delta\pi_{t-1}). The unforecastable component  \eta_{t} also plays the role of putting an upper bound on the predictability of inflation.
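
The sign of this autocovariance can be illustrated by simulation. The sketch below simulates the two-component model (32) and estimates  cov(\Delta\pi_{t}, \Delta\pi_{t-1}). All parameter values are assumptions chosen for illustration, and the simulation sets  \varrho=0 for simplicity, in which case the third term in eq. (33) vanishes and the stationary value is  -\sigma_{\pi}^{2}-\sigma_{x}^{2}(1-\phi)/(1+\phi).

```python
import random

def simulate_two_component(n, phi, mu, sigma_eta, sigma_e, seed=0):
    """Simulate pi_t = chi_{t-1} + eta_t with chi_t an AR(1), as in eq. (32).

    Assumes corr(eta_t, e_t) = 0 and starts chi at its mean (illustrative choices).
    """
    rng = random.Random(seed)
    chi_prev = mu
    pi = []
    for _ in range(n):
        eta = rng.gauss(0.0, sigma_eta)   # unforecastable inflation
        e = rng.gauss(0.0, sigma_e)       # shock to expected inflation
        pi.append(chi_prev + eta)
        chi_prev = (1 - phi) * mu + phi * chi_prev + e
    return pi

def lag1_autocov_of_changes(pi):
    """Sample covariance between Delta pi_t and Delta pi_{t-1}."""
    d = [pi[t] - pi[t - 1] for t in range(1, len(pi))]
    m = sum(d) / len(d)
    return sum((d[t] - m) * (d[t - 1] - m) for t in range(1, len(d))) / (len(d) - 1)

# Hypothetical parameter values, not estimates.
PHI, MU, SIGMA_ETA, SIGMA_E = 0.935, 4.0, 1.0, 0.3

simulated = lag1_autocov_of_changes(
    simulate_two_component(50000, PHI, MU, SIGMA_ETA, SIGMA_E))
# Stationary value of eq. (33) when the two shocks are uncorrelated.
analytic = -SIGMA_ETA ** 2 - SIGMA_E ** 2 * (1 - PHI) / (1 + PHI)
print(simulated, analytic)
```

With these values the first term of eq. (33) accounts for nearly all of the (negative) autocovariance, since changes in the persistent component  \chi_{t} are small relative to changes in  \eta_{t}.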

Economically, the  \eta_{t} term represents the very-short-run effects in total CPI inflation, including part of the food and energy prices that create the wedge between total CPI and core CPI (as seen in Section 3.2), as well as the unforecastable components of the core CPI inflation and potential errors in the measurement of CPI. A large part of  \eta_{t} is beyond the control of monetary policy makers (or economic agents, for that matter); thus, in some sense, the presence of a substantial amount of unforecastable inflation is a "fact of life".

The importance of the  \eta_{t}-term in the two-component model (32) has a parallel implication for no-arbitrage macro-finance models: the failure to separate out the "unspanned macro shocks" in macro-finance models may produce problems that mirror those of the AR(1) inflation model. It is worth mentioning here that Stock and Watson (2007) have also recently emphasized that separating inflation into a trend component and a serially uncorrelated shock (like  \eta_{t} in eq. (32)) is useful for explaining key features of US inflation dynamics,22 though they do not discuss the ramifications for macro-finance (no-arbitrage) models.

It is instructive to ask about the basic variable underlying the term structure of inflation expectations in the ARMA(1,1) model. As is clear from eq. (29), the basic variable is  \chi_{t}, not the realized inflation  \pi_{t}. Note that in the case of the AR(1) model,  \chi_{t} is  \pi_{t} (up to a prefactor and an intercept), as can be seen from eq. (30). This is not the case for the ARMA(1,1) model: it is straightforward to show (by solving for  \varepsilon_{t} in eq. (28) and recursively substituting into eq. (31)) that  \chi_{t} in the ARMA(1,1) model can be expressed as

\displaystyle \chi_{t}=(\phi-\alpha)\sum_{j=0}^{\infty}\alpha^{j} (\pi_{t-j}-\mu)+\mu. (34)

This is in the exponential smoothing form, which has been familiar at least since the work of Muth (1960).
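
The equivalence between eqs. (31) and (34) can be verified numerically. The sketch below simulates an ARMA(1,1) process as in eq. (28), computes  \chi_{t} once from the shocks via eq. (31) and once from past inflation via the smoothing form (34), and compares the two. The values of  \phi and  \alpha are the point estimates quoted earlier; the mean `MU`, the shock volatility, and the initial conditions ( \pi=\mu and  \varepsilon=0 before the sample starts) are assumptions made for illustration.

```python
import random

PHI, ALPHA = 0.935, 0.341   # point estimates quoted in the text
MU = 4.0                    # assumed unconditional mean (illustrative)

def simulate_arma(n, seed=1, sigma=1.0):
    """Simulate eq. (28), starting from pi = MU and eps = 0 before the sample."""
    rng = random.Random(seed)
    pi, eps = [], []
    pi_prev, eps_prev = MU, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, sigma)
        p = (1 - PHI) * MU + PHI * pi_prev + e - ALPHA * eps_prev
        pi.append(p)
        eps.append(e)
        pi_prev, eps_prev = p, e
    return pi, eps

def chi_from_shocks(pi, eps):
    """Expected one-period inflation from eq. (31), using the true shocks."""
    return [PHI * p - ALPHA * e + (1 - PHI) * MU for p, e in zip(pi, eps)]

def chi_from_smoothing(pi):
    """Eq. (34), with the infinite sum accumulated recursively over the sample."""
    out, s = [], 0.0
    for p in pi:
        s = ALPHA * s + (p - MU)        # s = sum_j ALPHA**j * (pi_{t-j} - MU)
        out.append((PHI - ALPHA) * s + MU)
    return out

pi, eps = simulate_arma(200)
max_gap = max(abs(x - y) for x, y in zip(chi_from_shocks(pi, eps),
                                         chi_from_smoothing(pi)))
print(max_gap)
```

Under the stated initial conditions the two series agree up to floating-point error; with other initial conditions the gap would decay at rate  \alpha^{t}.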

The expression (34) suggests that the connection between realized macro variables and state variables in no-arbitrage term structure models could be complicated, and that the poor inflation forecasting performance of regression models and no-arbitrage models with macro variables may be a more complex issue than being just a matter of having "efficient" coefficients (with conventional basis). To be sure, the state variables in nominal term structure models are not simply those that underlie the variation of inflation expectations. Factors that affect the real term structure and inflation risk premia should also be included in the nominal term structure model. However, it is not clear that these additional aspects would be any better described by macro variables.

Note that the expression for  \chi_{t} in eq. (34), although useful for conceptual illustration, is still deficient as a description of inflation expectations for both subtle and fundamental reasons. The subtle reason concerns the conditioning information: if the two-component model (32) is the data generating process for inflation, the "true" inflation expectation cannot be expressed simply in terms of the past inflations (except when  \varrho=\pm1 in eq. (32)). Mathematically,

\displaystyle E_{t}(\pi_{t+1} \vert \eta_{t}, e_{t}, \eta_{t-1},e_{t-1},...) \neq E_{t}(\pi_{t+1}\vert \pi_{t},\pi_{t-1},....). (35)

In other words, the one-period inflation expectation based on the past history of inflation (as computed from the ARMA(1,1) model) is not the same as the one-period inflation expectation \chi_{t} in eq. (32) computed using more information than just the inflation data.

More fundamentally, the ARMA model and even the two-component model are deficient, as both models imply that the one-period inflation expectation is an AR(1) process, which means that inflation expectations for all horizons are given by a single factor  \chi_{t}, with the term structure of inflation expectations monotonically sloping up when  \chi_{t} is below its long-run mean and monotonically sloping down when  \chi_{t} is above its long-run mean. This stiffness (lack of flexibility) of the model makes it difficult to describe the inflation environment of the past decades, during which people's perception of the Federal Reserve's inflation target is believed to have changed appreciably. Thus, the model's results depend materially on how long the estimation sample is. Figure 2b, based on the estimation with a "long sample" that includes the 1970s, indicates a current (circa 2006) five-year-ahead CPI inflation expectation of about 4%, which is too high to be believed. More generally, one can view the level to which inflation mean-reverts as varying over time.23

3.4 Low-dimensional external basis models

Let us now consider some specific issues that arise in external basis models with a "low-dimensional" state vector.

Suppose that one has a three-factor macro-finance model in the setup of eq. (1), with the state vector  x_{t} consisting of all "observable" macro variables, say, the quarterly inflation  \pi_{t}, quarterly GDP growth  g_{t}, and the effective federal funds rate  f\!\!f_{t}. The inflation expectations in this model are then linear functions of contemporaneous variables  \pi_{t},g_{t},f\!\!f_{t}. (To see this, simply substitute  \tilde{z}_{1t}=g_{t}, \tilde{z}_{2t}=f\!\!f_{t} in eq. (26).) This type of forecast (VAR(1)) has more qualitative similarity to the AR(1) model than the ARMA(1,1) model; in particular, despite its multi-factor nature, it still mixes "signal" with "noise" and can therefore be expected to inherit many of the problems with the AR(1) model.

Some of the macro-finance models in the literature, including Ang, Dong, and Piazzesi (2005, henceforth ADP) and ABW (2007a), remain in a relatively low-dimensional framework but use a mix of latent factors and macroeconomic variables. These "mixed models" may still have difficulties. Consider, for example, ABW (2007a)'s affine model (their MDL1 model) with quarterly inflation and two latent factors, i.e.,  x_{t}=[\pi_{t},f_{1t},f_{2t}]^{\prime}. If the latent factors  f_{1t},f_{2t} are interpreted as  \pi_{t-1},\pi_{t-2}, eq. (26) takes a form similar to the smoothing form (34). However, besides the issue that two lags might not be enough, one may not have the freedom to interpret the  f_{t}'s this way, as that would deprive the model of the ability to describe other aspects of the nominal term structure (e.g., real interest rates, time-varying risk premium, time-varying perceived inflation target).

In the mixed models, having a macro variable like  \pi_{t} as a part of the state vector may cause a distortion in the inference, as the latent factors can end up absorbing the "unspanned" variation in  \pi_{t}. To illustrate this schematically, suppose that the true model of the short rate is

\displaystyle r_{t}=\rho\tilde{\pi}_{t} + f_{t}, (36)

where  \tilde{\pi}_{t} is the "spanned" part of the one-period inflation  \pi_{t}, i.e.,
\displaystyle \pi_{t} = \tilde{\pi}_{t} + e_{t}, (37)

with  e_{t} denoting the unspanned component. If one uses realized inflation  \pi_{t} in place of  \tilde{\pi}_{t}, then
\displaystyle r_{t}=\rho(\pi_{t} - e_{t})+ f_{t} = \rho\pi_{t} + (f_{t} -\rho e_{t}). (38)

Thus the latent factor  f_{t} would be distorted by an amount  \rho e_{t}. Although this was illustrated with an affine model, the same problem can occur in the non-affine models.
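
A small simulation can illustrate the size of this distortion. The sketch below generates data from eqs. (36) and (37) and backs out the latent factor that would be inferred if realized inflation were used in place of its spanned part, as in eq. (38). The loading `RHO`, the distributions, and the sample size are all assumptions chosen for illustration.

```python
import random

def implied_latent_factor(r, pi, rho):
    """Latent factor backed out as r_t - rho * pi_t, as in eq. (38)."""
    return [rt - rho * pt for rt, pt in zip(r, pi)]

def variance(x):
    """Sample variance."""
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)

rng = random.Random(0)
RHO = 1.5     # assumed loading of the short rate on spanned inflation
N = 1000      # assumed sample size

pi_spanned = [rng.gauss(3.0, 1.0) for _ in range(N)]    # spanned inflation
f_true = [rng.gauss(0.0, 0.5) for _ in range(N)]        # true latent factor
e_unspanned = [rng.gauss(0.0, 1.0) for _ in range(N)]   # unspanned component

r = [RHO * ps + f for ps, f in zip(pi_spanned, f_true)]            # eq. (36)
pi_realized = [ps + e for ps, e in zip(pi_spanned, e_unspanned)]   # eq. (37)
f_implied = implied_latent_factor(r, pi_realized, RHO)             # eq. (38)

print(variance(f_true), variance(f_implied))
```

In this setup the inferred factor equals  f_{t}-\rho e_{t} period by period, so its variance is inflated by roughly  \rho^{2} times the variance of the unspanned component.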

3.5 High-dimensional external basis models

Some of the external basis macro-finance models in the literature use a fairly large number of state variables that include lagged macro variables. Many such models (including those of AP (2003) and HTV (2006)) use annual inflation  \pi_{t}^{Y} (=  \pi_{t,t-1Y}) as a state variable instead of the one-period inflation. This may help alleviate concerns about the problem with the use of the one-period inflation, since the year-on-year inflation partly "smooths out" the noise in quarterly inflation:  \pi^{Y}_{t} can be written

\displaystyle \pi^{Y}_{t}= \sum_{i} w_{i} \pi_{t-i}, (39)

where the weights  w_{i} are 1/4 for  i=0,1,2,3, and 0 for  i > 3.

Note, however, that the construction (39) automatically implies a moving average structure in  \pi^{Y}_{t}, which suggests that a simple VAR(1) would not describe its dynamics well. Thus, macro-finance models that use annual inflation as a state variable typically include additional lags, e.g., AP (2003) use 12 monthly lags, in effect having a VAR(12) model. Bond yields in this case depend on a "large" set of state variables that include lagged macroeconomic variables.24
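
The mechanical moving-average effect of the construction (39) can be isolated with a stylized calculation. The sketch below computes the autocorrelations of  \pi^{Y}_{t} under the purely hypothetical assumption that quarterly inflation is white noise, so that any serial correlation in  \pi^{Y}_{t} comes from the overlapping weights  w_{i} alone.

```python
def ma_autocorrelation(weights, lag):
    """Autocorrelation at `lag` of sum_i w_i * u_{t-i} when u_t is white noise."""
    gamma0 = sum(w * w for w in weights)
    overlap = max(0, len(weights) - lag)
    gamma_lag = sum(weights[i] * weights[i + lag] for i in range(overlap))
    return gamma_lag / gamma0

W = [0.25, 0.25, 0.25, 0.25]   # weights w_i in eq. (39)

# Autocorrelations at lags 0 through 5 (in quarters).
acf = [ma_autocorrelation(W, k) for k in range(6)]
print(acf)
```

The autocorrelation declines linearly and is exactly zero from lag 4 onward, whereas a first-order autoregression implies geometric decay that never reaches zero; this is one way to see why a single lag is unlikely to suffice.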

A problem with this type of "high-dimensional" specification is that it inherits the well-known problems of the unrestricted VAR models. In fact, AP (2003)'s inflation dynamics is a conventional VAR. They separate the vector of relevant variables into an "observable" macro vector  f^{o}_{t} and an unobservable (latent) vector  f^{u}_{t}, i.e.,  \tilde{x}_{t}=[f^{o^{\prime}}_{t}, f^{u^{\prime}}_{t}]^{\prime},25 and impose the restriction that the latent factors do not affect the expectation of macroeconomic variables. Their macro vector dynamics is given by the VAR(q):

\displaystyle f_{t}^{o}=\Phi_{1}^{o} f_{t-1}^{o} + \Phi_{2}^{o} f_{t-2}^{o} + \cdots+\Phi_{q}^{o} f_{t-q}^{o} + c^{o} +\Sigma^{o} \epsilon_{t}^{o}, (40)

where  q=12. Although the parameters in the matrices  \Phi_{1}^{o}, ...,\Phi_{q}^{o} are in principle identified and can be estimated by OLS, this kind of unrestricted VAR is well known to suffer from overparametrization problems (which will be discussed further in Section 5.1).
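
To give a sense of the scale of the overparametrization problem, the sketch below counts the free parameters of an unrestricted VAR(q) of the form (40): the lag matrices, the constant vector, and a lower-triangular  \Sigma^{o}. The dimension `n` used in the example calls is an illustrative assumption, not a statement about the specification in any particular paper.

```python
def var_param_count(n, q):
    """Free parameters in an unrestricted n-variable VAR(q) as in eq. (40)."""
    lag_coefficients = n * n * q          # Phi_1, ..., Phi_q
    constants = n                         # c
    covariance = n * (n + 1) // 2         # lower-triangular Sigma
    return lag_coefficients + constants + covariance

# Illustrative dimensions (assumed): parameter count grows with n squared times q.
for n in (2, 3, 5):
    print(n, var_param_count(n, 12))
```

Because the count grows with the square of the number of variables times the lag length, adding even one variable at  q=12 adds dozens of coefficients to be estimated from the same sample.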

By having only the macro variables describe inflation dynamics, AP (2003) turned off the possibility of the yield curve saying something about future inflation. Unfortunately, it is difficult to lift that restriction. The overparametrization problem would get worse, as the full (maximally-identified) model would have an even larger number of parameters: in the specification of the state vector dynamics

\displaystyle \left[\begin{matrix}f^{o}_{t} \\ f^{u}_{t}\end{matrix}\right] = \left[\begin{matrix}\Phi^{o}_{1} & \Phi^{ou}_{1} \\ \Phi^{uo}_{1} & \Phi^{u}_{1}\end{matrix}\right] \left[\begin{matrix}f^{o}_{t-1} \\ f^{u}_{t-1}\end{matrix}\right] + \left[\begin{matrix}\Phi^{o}_{2} & \Phi^{ou}_{2} \\ \Phi^{uo}_{2} & \Phi^{u}_{2}\end{matrix}\right] \left[\begin{matrix}f^{o}_{t-2} \\ f^{u}_{t-2}\end{matrix}\right] + \cdots+ c+ \Sigma\epsilon_{t}, (41)

the matrices  \Phi^{ou}_{1},  \Phi^{ou}_{2},... are now nonzero and have to be estimated. Furthermore, the two-step estimation procedure that AP (2003) used is no longer applicable, hence the estimation now involves a "one-step" optimization of a very-high-dimensional likelihood function.

Models like HTV (2006) have more structure (in the form of the new-Keynesian Phillips curve and IS equations), which may help alleviate overparametrization concerns, but at a possibly greater misspecification risk: various aspects of the new-Keynesian specification are still under debate, e.g., the presence or absence of the interest rate smoothing term (e.g., English et al. (2003), Rudebusch (2006)) and the strength of the backward-looking inflation terms (e.g., Rudd and Whelan (2006)).

A common practice in the specification of external basis models that contain lags of macroeconomic variables in the state vector is to set the coefficients of the market price of risk (the  \Lambda_{b} matrix in eq. (1)) that load on lagged macro variables to zero (e.g., AP (2003) and HTV (2006)). Even with this restriction, the number of remaining market price of risk parameters is large, and modelers often make additional ad hoc restrictions on the  \Lambda_{b} matrices to reduce the number of parameters further.26 Unfortunately, the practice of setting the  \Lambda_{b} coefficients on lagged macro variables to zero is not as innocuous as it might appear. It implies that the expected excess return on a bond,  \mu_{\tau,t}-r_{t}, is completely spanned by contemporaneous macroeconomic variables (and latent factors, if there are any). Recall, from eqs. (11) and (16), that

\displaystyle \mu_{\tau,t} - r_{t} = B_{\tau}^{\prime}\Sigma\lambda_{t}. (42)

Therefore, if  \lambda_{t} does not depend on lagged macro variables, neither does the bond return premium. This means that while one has
\displaystyle y_{\tau,t} = a_{\tau} + b_{\tau,1}\pi_{t} + b_{\tau,2}\pi_{t-1} + b_{\tau,3}\pi_{t-2} + \cdots, (43)

one cannot have
\displaystyle \mu_{\tau,t} - r_{t}= \alpha_{\tau} + \beta_{\tau,1}\pi_{t} + \beta_{\tau,2}\pi_{t-1} + \beta_{\tau,3}\pi_{t-2} + \cdots. (44)

This asymmetry in the way yields and bond risk premia depend on lagged macro variables has nothing to do with no-arbitrage, and has little empirical basis. In other words, in order to cast the model in a "no-arbitrage" framework, many external basis macro-finance models are introducing arbitrary and nontrivial assumptions about the market price of risk. This raises the question of whether the no-arbitrage principle can play its intended role.27

3.6 Would composite factors help?

Several recent studies have explored the use of composite variables created from a large array of macroeconomic variables in modeling the term premia (e.g., Ludvigson and Ng, 2006) or the yield curve (e.g., Moench, 2006).

The composite factors may be appealing because they utilize a much bigger information set and also because they may be cross-sectionally smoothing out some of the idiosyncratic noise in quantities like CPI, hence one can expect them to reflect more of the systematic variation than the individual macro variables.

However, since these models do not address expectations concerning specific macroeconomic variables of potential interest, one cannot tackle issues such as the expectation of the CPI inflation implicit in the yield curve; thus macro-finance models based on purely composite factors would not have much to say about TIPS pricing, as TIPS are specifically indexed to the CPI.

More fundamentally, it is unclear whether composite factors can be valid state variables in no-arbitrage term structure models, as they may still face many of the aforementioned problems with external basis models. In particular, the way the composite factors in Moench (2006) and Ludvigson and Ng (2006) are constructed is such that they are not very persistent variables.28 For example, Ludvigson and Ng (2006) report that their most persistent factor has a monthly AR(1) coefficient of 0.77 (the half-life is less than a quarter). In order for such a variable to describe yields in the setup of (1) even just qualitatively (e.g., producing the kind of persistence that yields have), one needs long lags, which again raise overparametrization concerns. In addition, even if principal components analysis indicates that a small number, say  n, of factors describe the bulk of yield curve movements, it is not clear whether the proper truncation number for cross-sectional composite factors should be also small or is related to  n.
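
The half-life figure quoted above follows from the standard formula  \log(0.5)/\log(\phi) for an AR(1) process with coefficient  \phi. A minimal sketch of the calculation, with the function name introduced here for illustration:

```python
import math

def ar1_half_life(phi):
    """Number of periods for an AR(1) shock to decay to half its initial size."""
    return math.log(0.5) / math.log(phi)

# Monthly AR(1) coefficient reported for the most persistent composite factor.
half_life_months = ar1_half_life(0.77)
print(half_life_months)
```

The result is between two and three months, i.e., less than a quarter, which is far shorter than the persistence typically seen in yields.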

4 Affine Gaussian models versus non-affine/non-Gaussian models

4.1 Structural stability

One potential limitation of the general framework (1) is structural stability. To be sure, the debate about the structural stability of macroeconomic relationships is not new (see, e.g., Rudebusch (1998) and Sims (1998)). However, it may have different ramifications for internal basis models and external basis models, and hence merits a discussion here.

Several well-known structural instabilities are of direct relevance to macro-finance models. Many have noted that in the 1990s a large part of the term structure variation seemed to be due to the variation of real rates, while in the 1970s the variation in inflation expectations seemed to be a more dominant factor. The stark difference between the Mishkin regression coefficients in the 1960-83 and 1984-2006 periods discussed in Section 3.2 lends support to the claim of a change in the relative importance of inflation for explaining yield curve movements. Another instability is that of the Taylor rule coefficients, as argued by Clarida, Gali, and Gertler (2000) and others. Since the Taylor rule underpins the short-rate specification of many macro-finance models, this instability is a serious concern for the macro-finance models that are estimated with a "long" sample that includes the pre-Volcker disinflation period. Note also that the dynamics of many macro-finance models is similar to conventional VARs, but low-dimensional macro-VARs were often found to be unstable (e.g., Stock and Watson (1996)).

The traditional specification may also face difficulty in accommodating relatively new developments. For example, in recent years there has been an increased discussion of the effects of global forces on domestic bond markets. Increased "global liquidity" has often been cited as a potential factor pressing down inflation expectations or bond risk premia, and interest rate movements in various countries, including the United States, the Euro area, and Japan, have lately become more highly correlated.29 Whether structurally stable Taylor-rule type specifications are consistent with these developments is an open question.

One may hope that concerns about structural instability would be alleviated if latent factors are also included in external basis models. For example, a macro-finance model with a Taylor-rule-like mixed specification of the short rate (similar to ADP (2005))

\displaystyle r_{t} = const + \rho_{\pi} \pi_{t}^{Y} + \rho_{g} gap_{t} + f_{t}, (45)

where  f_{t} is a latent factor, can be written as
\displaystyle r_{t} = const + \pi_{t}^{Y}+(\rho_{\pi}-1)(\pi_{t}^{Y} - \pi_{t}^{*}) +\rho_{g} gap_{t}, (46)

where  \pi_{t}^{*} (=  -f_{t}/(\rho_{\pi}-1)) is the time-varying inflation target (expanding eq. (46) and matching terms with eq. (45) gives  f_{t}=-(\rho_{\pi}-1)\pi_{t}^{*}). However, the factor  f_{t} may have to play a number of other roles in the model, for instance, the interest rate smoothing term, time-varying risk premium, and so on (analogously to an earlier discussion in Section 3.4 regarding ABW (2007a)'s affine model). Thus a model written with  f_{t} as a time-varying inflation target in mind might have some difficulty capturing the intended effect.

Furthermore, there may be instabilities other than a time-varying intercept, for instance, changes in the conditional correlation of various macroeconomic variables, changes in the persistence of the macroeconomic variables, and so on. Imagine, heuristically, a situation in which the "true" model is

\displaystyle r_{t}= c+ \rho_{\pi,t} \pi_{t}^{Y} + \rho_{g,t} gap_{t}, (47)

i.e., a Taylor-rule-like short rate with time-varying loadings on the macroeconomic variables. In this case, the 2-factor affine model in which the state variables are  [\pi_{t}^{Y},gap_{t}]^{\prime} is obviously misspecified. For another example, consider a "time-varying inflation-persistence model"
\displaystyle \pi_{t}^{Y}= \phi_{t-1} \pi_{t-1}^{Y} + c + \varepsilon_{t}. (48)

Again, identifying  \pi_{t}^{Y} as a state variable in an affine setting would be a misspecification.

One way to address this problem is to model these effects explicitly in non-affine/non-Gaussian models. However, these models, being richer than affine-Gaussian models, may be even more susceptible to overfitting concerns and may incur a greater risk of misspecification. The disappointing inflation forecasting performance of the vector regime-switching model and the no-arbitrage "regime-switching" model in ABW (2007a) (referred to by them as RGMVAR and MDL2, respectively) may serve as a reminder in this regard.

Alternatively, the use of an internal basis (while still remaining in the affine-Gaussian setup) may allay structural instability concerns to some extent: internal basis models are agnostic as regards the definition of the factors; thus a model that is obviously unstable from the point of view of an external basis may not necessarily be so from the point of view of an internal basis. For example, going back to eq. (47), choosing the state variable as  x_{t}=[\rho_{\pi,t} \pi_{t}^{Y}, \rho_{g,t} gap_{t}]^{\prime} may be more effective than having  x_{t}=[\pi_{t}^{Y}, gap_{t}]^{\prime}, although there may be an even better internal basis for the problem (depending on how the rest of the model is defined).30

Of course, no-arbitrage models with an internal basis should not be expected to answer all structural stability concerns. A strong structural instability may be difficult to capture even with an internal basis model, in which case it might be better to use a shorter, structurally more homogeneous sample.

4.2 Time-varying uncertainty

Another limitation of the affine-Gaussian models (both internal and external basis models) is that they imply homoskedastic yields, while there is copious evidence for time-varying volatility of interest rates, e.g., from interest rate derivatives as well as the stochastic-volatility models and GARCH-type models. However, it is not clear whether a no-arbitrage model that allows for time-varying volatility would produce better results. Again, the concern is that such a model may incur greater specification errors and implementation errors.

Theoretically and intuitively, one should expect a relation between term structure variables and time-varying uncertainty about interest rates: to the extent that bond market term premia arise from risk, the changing amount of interest rate risk should translate to a changing term premium. It also stands to reason that at least a part of the variation in interest rate volatility is linked to the variation in the uncertainties about key macro variables. Various studies have noted that macroeconomic uncertainties (inflation, GDP, monetary policy) have declined since the Volcker disinflation, a phenomenon often dubbed the "Great Moderation".31 One can expect this effect to be accompanied by a corresponding reduction in term premia in the bond market. Kim and Orphanides (2007) indeed report positive relationships between the term premium in the 10-year forward rate and proxies for uncertainties about monetary policy and inflation based on the dispersion of long-horizon survey forecasts.