
Nonparametric HAC Estimation for Time Series Data with Missing Observations **

Deepa Dhume Datta and Wenxin Du

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at http://www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from the Social Science Research Network electronic library at http://www.ssrn.com/.


Abstract:

The Newey and West (1987) estimator has become the standard way to estimate a heteroskedasticity and autocorrelation consistent (HAC) covariance matrix, but it does not immediately apply to time series with missing observations. We demonstrate that the intuitive approach of estimating the true spectrum of the underlying process using only the observed data leads to incorrect inference. Instead, we propose two simple consistent HAC estimators for time series with missing data. First, we develop the Amplitude Modulated estimator by applying the Newey-West estimator and treating the missing observations as non-serially correlated. Second, we develop the Equal Spacing estimator by applying the Newey-West estimator to the series formed by treating the data as equally spaced. We show asymptotic consistency of both estimators for inference purposes and discuss the finite sample tradeoff between bias and variance. In Monte Carlo simulations, we demonstrate that the Equal Spacing estimator is preferred in most cases due to its lower bias, while the Amplitude Modulated estimator is preferred for small sample sizes and low autocorrelation due to its lower variance.

Keywords: Heteroskedasticity, serial correlation, robust inference, missing data

JEL classification: C22, C12, C14



1  Introduction

While use of the Newey and West (1987) estimator and its Andrews (1991) implementation have become standard practice for heteroskedasticity and autocorrelation (HAC) robust inference, analogous methods for series with missing observations are far from standardized. When data are missing, the Newey-West formulas do not immediately apply, and the formula for calculating the lagged autocorrelations that are required terms in conventional HAC formulas must be adjusted.

Current practices for working with missing data include treating the missing observations as non-serially correlated, or imputing or ignoring the missing observations. To our knowledge, there has not been formal justification of HAC estimation for robust inference in these contexts, and the effect of employing these work-around methods on the resulting inferences is generally unexplored in applied work. In this paper, we provide formal justification for two methods of HAC estimation, and we compare these two methods to other existing methods. We demonstrate that treating the data as equally spaced is preferred under very general conditions, and that treating the missing observations as non-serially correlated may be preferred in instances with small sample sizes or low autocorrelation. In general, we find that our two newly formalized methods are preferred to imputation.

Especially when our aim is to adjust inferences for serial correlation, it seems counterintuitive that we can either treat the data as equally spaced or treat the missing observations as non-serially correlated, since these procedures require us to depart from the original time structure or autocorrelation structure of the data. However, we show that these procedures both provide consistent estimators of the long-run variance of the observed series with missing data. Though many have suggested that we infer the spectrum of the underlying data from the observed data using the Parzen estimator, we show that the Parzen estimator is not the correct object for inference testing. Rather, we show that our Amplitude Modulated estimator (which treats the missing observations as non-serially correlated) and Equal Spacing estimator (which treats the observed data as equally spaced) are extremely simple to implement and can be used to generate asymptotically valid inferences.

These insights are particularly valuable given the ad hoc approaches widely found in the applied literature. For example, researchers often use imputation procedures to fill in the missing data. Imputation seems to expand the set of observations used for analysis, or at least prevents us from dropping data when some covariates are unobserved. However, because imputed data series are often smoothed versions of the underlying unobserved series, they will often lead to inference that is asymptotically valid but performs extremely poorly in finite samples. Furthermore, researchers using these methods rarely adjust their inferences for this induced serial correlation.

Additional evidence of the confusion in the literature stems from the implementation of the Newey-West HAC estimator in the popular statistical package, Stata. The newey2 command implements Newey-West to obtain the standard error of the coefficient in a regression using time series or panel data. When observations are missing, the option "force" can be applied. This option will apply our Equal Spacing estimator to time series data and apply our Amplitude Modulated estimator to panel data. However, it should be possible to implement either the Equal Spacing estimator or the Amplitude Modulated estimator for time series data. Furthermore, the program will not apply the Amplitude Modulated estimator at all when some lags are unobserved. This condition is likely an artifact of the literature on estimating the long-run variance of the underlying series, which develops estimators that generally require all lags to be observed (Clinger and Van Ness, 1976; Parzen, 1963). Yet for robust inference, we show that the Amplitude Modulated and Equal Spacing estimators do not require all the lags to be observed.

A primary goal of this paper is to help researchers select the correct estimation procedure in applied work. To that end, we follow the style of Petersen (2009), which provides guidance for selecting standard errors in finance panel data sets. We formally present the Amplitude Modulated and Equal Spacing estimators of the long-run variance of the observed series, and we review their asymptotic properties. We contrast these estimators with the Parzen estimator of the long-run variance of the underlying series. After presenting these theoretical contributions, we offer intuition for why the estimators work and how they are related to each other. To generate guidance for choosing the correct estimator, we conduct Monte Carlo simulation results for various sample sizes, correlation structures, and fractions of observations that are missing. In addition to testing our estimators using randomly missing data, we demonstrate the applicability of these estimators to a deterministic cyclical missing structure, as with daily financial data, which usually cover 5 of 7 days of the week. Finally, we discuss an empirical application using recursive regressions for commodities futures returns to demonstrate how the choice of estimator can affect the conclusion of empirical tests.

As a preview, our results demonstrate that the Amplitude Modulated and Equal Spacing estimators are both consistent under random and deterministic missing structures. In finite samples, we find that for the same fixed bandwidth, the Equal Spacing estimator is generally less biased than the Amplitude Modulated estimator, but has a larger variance. Consequently, the Equal Spacing estimator is preferred when autocorrelation is high, as the bias will dominate the mean squared error in these cases. Conversely, when autocorrelation is low, variance dominates the mean squared error, and the Amplitude Modulated estimator is preferred. The precise cutoff between these cases depends on the sample size, and whether automatic bandwidth selection procedures are implemented.1

The remainder of the paper is structured as follows. Section 1.1 discusses some examples of missing data problems for which imputation and equal spacing methods have been applied, and provides a brief review of some of the related econometrics literature. Section 2 provides an overview of our estimators by applying them in a simple setting with missing observations. Section 3 formally defines the estimators and discusses their asymptotic and finite sample properties. Section 4 presents the application of these estimators to inference in a regression setting with missing observations. Section 5 describes the Monte Carlo simulations based on these estimators and the results. Section 6 presents an empirical application of the estimators using data on commodities returns. Finally, Section 7 concludes.

1.1  Relation to the Literature

This paper extends the HAC covariance literature to applications with missing observations. In general, the most commonly used HAC covariance matrix is the one proposed by Newey
and West (1987) and further developed by Andrews (1991). The Newey-West estimator equals a weighted sum of lagged autocovariance matrices, in which the weights are calculated using the Bartlett kernel. Newey and West (1987) show that this estimator is positive semi-definite and heteroskedasticity and autocorrelation consistent. Andrews (1991) and Newey
and West (1994) investigate the finite sample properties of these estimators and propose data-dependent bandwidths. Though some papers have proposed variants of the estimators discussed in these seminal papers, the estimators applied in most of the current literature remain largely unchanged from their original form.

There have been earlier attempts to estimate HAC covariance matrices when some observations are missing. The Parzen (1963) paper on spectral analysis for data series with missing observations focuses on estimating the autocovariances of the underlying process in the presence of missing observations, based on a specific cyclical structure of missing data. We contribute to this literature by pointing out that for robust inference, we generally require an estimate of the long-run variance of the observed series rather than of the underlying series. We differentiate between the Amplitude Modulated estimator and the Parzen estimator. Following Parzen (1963), both of these estimators involve recasting the observed series as an amplitude modulated series in which the value of the underlying series is set to zero when observations are missing. The observed time structure of the data is respected, and the lagged autocovariances are estimated using only the lagged pairs which are fully observed. We show that while the Amplitude Modulated estimator is a consistent estimator of the long-run variance of the observed series, the Parzen estimator is a consistent estimator of the long-run variance of the underlying series. Along with these theoretical results, we provide simulation results that demonstrate the consistency of the t-statistic constructed using the Amplitude Modulated estimator.

We also argue for extending the set of missing data structures to which these estimators can be applied. Other researchers have attempted to apply Parzen's work to a variety of missing data structures, including the Bernoulli structure of randomly missing variables and more general cyclical patterns of missing observations (Scheinok, 1965; Bloomfield, 1970; Clinger and Van Ness, 1976; Dunsmuir and Robinson, 1981a,b). While the literature has moved beyond Parzen's original application, it is still focused on applications with randomly missing observations. Yet, many common applications of missing data techniques are for data that have a deterministic missing structure. Our theoretical results and simulation exercise demonstrate that as long as the missing data structure satisfies our independence assumption, we can apply the Amplitude Modulated and Equal Spacing estimators to settings in which the pattern of missing observations is deterministic instead of random. This extends the set of possible applications to include business daily data, for example, in which weekends could be considered missing data.

More recently, Kim and Sun (2011) construct a HAC estimator for the two-dimensional case robust to spatial heteroskedasticity and autocorrelation. The focus of the paper is not on missing data, and they do not distinguish between the spatial spectrum of the underlying process and that of the observed process. However, they discuss the applicability of their method to irregularly observed spatial data. Reducing their spatial HAC estimator on an irregular lattice to the one-dimensional time series case produces an estimator very similar to our Amplitude Modulated estimator. We clarify the subtlety between the underlying and observed spectrum and develop the Amplitude Modulated estimator in the context of time series with missing data.

Setting the theory aside, many researchers use imputation techniques when faced with missing observations in practice. For example, one common use of imputation is to temporally disaggregate data to generate quarterly data from annual series, or monthly data from quarterly series. The Denton method of imputation smooths data when generating these series by minimizing first-differences or second-differences (Denton, 1971). The related Chow-Lin method uses an indicator series to interpolate, distribute, or extrapolate data (Chow and Lin, 1971, 1976). When this method is used, some properties of the indicator series, including serial correlation, will be transferred to the imputed series. Even the simplest method of imputing data by naive linear interpolation will induce autocorrelation in the imputed series. Studies based on Monte Carlo simulations suggest that even for reasonably large sample sizes, inference methods based on Newey-West HAC covariance estimators result in significant overrejection when the serial correlation is high (den Haan and Levin, 1997). In an imputed series, the induced high autocorrelation exacerbates this distortion. Yet, researchers using these methods rarely adjust their inferences for this induced serial correlation (for two such examples, see Eaton, Kortum, Neiman, and Romalis (2011) and Forni, Monteforte, and Sessa (2009)). We show in our Monte Carlo simulations and our empirical application that our estimators are simple alternatives that avoid the problems associated with imputation.

To avoid the complication of adjusting HAC estimators for the method and extent of imputation, some researchers simply ignore the missing observations. Formally, this method amounts to relabeling the time index and treating observations as though they are equally spaced in time. While this method has no previous formal justification to our knowledge, it has been widely applied. For example, observations of daily financial data are generally treated as equally spaced consecutive observations, irrespective of their actual spacing in time (examples include Acharya and Johnson (2007), Beber, Brandt, and Kavajecz (2009), Jorion and Zhang (2007), Pan and Singleton (2008), and Zhang, Zhou, and Zhu (2009)). Yet, for prices that are affected by developments in global markets with different business weeks or national holidays, the lack of price data on weekends and holidays could be treated as missing observations. In this paper, we formalize this Equal Spacing estimator and demonstrate its asymptotic consistency and finite sample performance.

In light of the existing confusion in the literature on HAC estimation with missing data, we provide our Equal Spacing and Amplitude Modulated estimators as alternatives. We discuss the finite sample properties of these estimators and provide simulation results that offer insight into the bias and variance tradeoffs between the two estimators, so that practitioners can make an informed choice before applying either one.


2  A Simple Example

To fix ideas, in this section we introduce each of our estimators in the context of a simple example using three weeks of daily gasoline prices. We reserve for the next section a more detailed discussion of the asymptotic and finite sample properties of the estimators and the practical considerations involved in choosing among them. Suppose we have gasoline price data $ \{z_{t}\}$ for the first three weeks of the month:

Mon Tues Weds Thurs Fri Sat Sun
z1 z2 z3 z4 z5 z6 z7
z8 z9 z10 z11 z12 z13 z14
z15 z16 z17 z18 z19 z20 z21

For clarity of exposition, suppose these data have already been demeaned, so that we have $ E(z_{t})=0$. To estimate the long-run variance of the series $ \{z_{t}\}$, we can apply the standard Newey-West estimator:

$\displaystyle \hat{\Omega}^{NW}=\hat{\gamma}(0)+2\sum_{j=1}^{m}w(j,m)\hat{\gamma}(j), $

where the Bartlett kernel, $ w(j,m)=1-[j/(m+1)]$ if $ j\leq m$ and $ w(j,m)=0$ if $ j>m$, is used to weight the sample autocovariances at each lag $ j$:

$\displaystyle \hat{\gamma}(j)=\frac{1}{T}\sum_{t=j+1}^{T}z_{t-j}z_{t}. $

In our example, we can estimate the first lagged autocovariance for the gasoline price series as:

$\displaystyle \hat{\gamma}(1)=\frac{1}{21}[z_{1}z_{2}+z_{2}z_{3}+...+z_{20}z_{21}]. $

Similarly, we estimate the third lagged autocovariance as:

$\displaystyle \hat{\gamma}(3)=\frac{1}{21}[z_{1}z_{4}+z_{2}z_{5}+...+z_{18}z_{21}]. $

Note that the denominator in both cases is the total number of observations, $ T$, rather than the number of observed lags, $ T-j$.
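As an illustration, the Newey-West formula above can be sketched in a few lines of Python (the function name is ours, and the series is assumed to be demeaned, as in the example):

```python
import numpy as np

def newey_west(z, m):
    """Newey-West long-run variance with the Bartlett kernel.
    z: demeaned series; m: bandwidth (maximum lag)."""
    z = np.asarray(z, dtype=float)
    T = len(z)
    omega = np.dot(z, z) / T  # gamma_hat(0)
    for j in range(1, m + 1):
        # Denominator is T, not T - j, matching the formula in the text.
        gamma_j = np.dot(z[j:], z[:-j]) / T
        omega += 2.0 * (1.0 - j / (m + 1)) * gamma_j
    return omega
```

For the alternating series $ (1,-1,1,-1)$ with $ m=1$, this gives $ \hat{\gamma}(0)=1$, $ \hat{\gamma}(1)=-3/4$, and hence $ \hat{\Omega}^{NW}=1+2\cdot\tfrac{1}{2}\cdot(-\tfrac{3}{4})=0.25$.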

Now suppose we have only business daily data, with missing data on weekends:

Mon Tues Weds Thurs Fri Sat Sun
z1 z2 z3 z4 z5 - -
z8 z9 z10 z11 z12 - -
z15 z16 z17 z18 z19 - -

When some data points are missing, we have a few choices for how we estimate the lagged autocovariances, $ \hat{\gamma}(j)$, that are components of the long-run variance, $ \hat{\Omega}$. Especially in the context of business daily data, one very common procedure is to ignore the missing data. In this case, we would treat the observed prices as equally spaced in time. When estimating the first lagged autocovariance for our Equal Spacing estimator, we would treat the jump from Friday to Monday (e.g. day 5 to day 8) as a one day lag:

$\displaystyle \hat{\gamma}^{ES}(1)=\frac{1}{15}[z_{1}z_{2}+z_{2}z_{3}+z_{3}z_{4}+z_{4}z_{5}+\underline{z_{5}z_{8}}+z_{8}z_{9}+z_{9}z_{10}+...+z_{18}z_{19}]. $

Similarly, the third autocovariance would be estimated as:

$\displaystyle \hat{\gamma}^{ES}(3)=\frac{1}{15}[z_{1}z_{4}+z_{2}z_{5}+\underline{z_{3}z_{8}+z_{4}z_{9}+z_{5}z_{10}}+z_{8}z_{11}+z_{9}z_{12}+z_{10}z_{15}+...+z_{16}z_{19}]. $

Since we have effectively reduced the sample size by ignoring the missing days, the denominator for each estimated lagged autocovariance is the total number of observed data points, or 15 in our example.
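A minimal sketch of the Equal Spacing autocovariance in Python (function name ours; the missing entries of `z` are simply dropped, and `g` is the 0/1 observed indicator):

```python
import numpy as np

def gamma_es(z, g, j):
    """Equal Spacing autocovariance at lag j >= 1: drop the missing
    points, treat the survivors as consecutive, divide by the number
    of observed data points."""
    z = np.asarray(z, dtype=float)
    g = np.asarray(g, dtype=bool)
    zs = z[g]          # compressed, "equally spaced" series
    S = len(zs)
    return np.dot(zs[j:], zs[:-j]) / S
```

For example, with `z = [1, 2, 3, 4]` and the second point missing (`g = [1, 0, 1, 1]`), the compressed series is $ (1,3,4)$ and the lag-1 estimate is $ (1\cdot3+3\cdot4)/3=5$.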

Alternatively, we could estimate the lagged autocovariances using only the instances in which we observe data with the right spacing. We call this the Amplitude Modulated estimator, because we are effectively modulating, or setting to 0, the value (or amplitude) of the series on the missing days. Using this estimator, to estimate the first lagged autocovariance in our example, we would use the lag from day 4 to day 5 and then skip to the lag from day 8 to day 9:

$\displaystyle \hat{\gamma}^{AM}(1)=\frac{1}{15}[z_{1}z_{2}+z_{2}z_{3}+z_{3}z_{4}+z_{4}z_{5}+z_{8}z_{9}+z_{9}z_{10}+...+z_{18}z_{19}]. $

The third autocovariance would be estimated using all observed three day lags, including the three day lags between Friday and Monday:

$\displaystyle \hat{\gamma}^{AM}(3)=\frac{1}{15}[z_{1}z_{4}+z_{2}z_{5}+\underline{z_{5}z_{8}}+z_{8}z_{11}+z_{9}z_{12}+\underline{z_{12}z_{15}}+z_{15}z_{18}+z_{16}z_{19}]. $

In this method, the denominator for each lag is again the total number of observed data points, as in the Equal Spacing estimator.
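The Amplitude Modulated autocovariance can be sketched analogously (function name ours): missing values are zeroed out, the original spacing is kept, and the denominator is again the number of observed data points.

```python
import numpy as np

def gamma_am(z, g, j):
    """Amplitude Modulated autocovariance at lag j >= 1: set missing
    values to zero, keep the original time spacing, divide by the
    number of observed data points."""
    z = np.asarray(z, dtype=float)
    g = np.asarray(g, dtype=float)
    zstar = z * g      # amplitude modulated series
    S = g.sum()
    return np.dot(zstar[j:], zstar[:-j]) / S
```

With the same toy data (`z = [1, 2, 3, 4]`, `g = [1, 0, 1, 1]`), the only surviving lag-1 product is $ 3\cdot4$, plus the zeroed terms, giving $ 12/3=4$.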

While we focus on these two estimators throughout most of the paper, for comparison, our simulations will also implement two alternatives that are not preferred. The first alternative is the Parzen estimator, after Parzen (1963). It is constructed like the Amplitude Modulated estimator, except that we adjust the denominator to equal the number of times we observe data with the right spacing:

$\displaystyle \hat{\gamma}^{PZ}(1)=\frac{1}{12}[z_{1}z_{2}+z_{2}z_{3}+z_{3}z_{4}+z_{4}z_{5}+z_{8}z_{9}+z_{9}z_{10}+...+z_{18}z_{19}]. $

The third autocovariance would be estimated as:

$\displaystyle \hat{\gamma}^{PZ}(3)=\frac{1}{8}[z_{1}z_{4}+z_{2}z_{5}+z_{5}z_{8}+z_{8}z_{11}+z_{9}z_{12}+z_{12}z_{15}+z_{15}z_{18}+z_{16}z_{19}]. $
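The Parzen variant changes only the denominator relative to the Amplitude Modulated estimator; a sketch (function name ours):

```python
import numpy as np

def gamma_pz(z, g, j):
    """Parzen autocovariance at lag j >= 1: same numerator as the
    Amplitude Modulated estimator, but divided by the number of
    fully observed lag-j pairs."""
    z = np.asarray(z, dtype=float)
    g = np.asarray(g, dtype=float)
    zstar = z * g
    pairs = np.dot(g[j:], g[:-j])   # count of observed lag-j pairs
    return np.dot(zstar[j:], zstar[:-j]) / pairs
```

With `z = [1, 2, 3, 4]` and `g = [1, 0, 1, 1]`, only one lag-1 pair is fully observed, so the estimate is $ 12/1=12$ rather than the Amplitude Modulated $ 12/3$.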

Finally, we will implement our Imputation estimator, which is the Newey-West estimator applied to the filled series, $ \{z_{t}^{I}\}$, constructed by linearly imputing the missing data. In our example with business daily gasoline prices, the first and third lagged autocovariances can be estimated as:

$\displaystyle \hat{\gamma}^{IM}(1)=\frac{1}{21}[z_{1}z_{2}+...+z_{4}z_{5}+z_{5}z_{6}^{I}+z_{6}^{I}z_{7}^{I}+z_{7}^{I}z_{8}+z_{8}z_{9}+...+z_{18}z_{19}+z_{19}z_{20}^{I}+z_{20}^{I}z_{21}^{I}] $

$\displaystyle \hat{\gamma}^{IM}(3)=\frac{1}{21}[z_{1}z_{4}+z_{2}z_{5}+z_{3}z_{6}^{I}+z_{4}z_{7}^{I}+z_{5}z_{8}+z_{6}^{I}z_{9}+z_{7}^{I}z_{10}+z_{8}z_{11}+...+z_{16}z_{19}+z_{17}z_{20}^{I}+z_{18}z_{21}^{I}]. $
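The Imputation autocovariance can be sketched as follows (function name ours; `np.interp` provides the naive linear interpolation, and the denominator is the full length $ T$, as if no data were missing):

```python
import numpy as np

def gamma_im(z, g, j):
    """Imputation autocovariance at lag j >= 1: linearly interpolate
    the missing values, then treat the filled series as fully
    observed (denominator T)."""
    z = np.asarray(z, dtype=float)
    g = np.asarray(g, dtype=bool)
    t = np.arange(len(z))
    zi = np.interp(t, t[g], z[g])   # fill gaps by linear interpolation
    T = len(zi)
    return np.dot(zi[j:], zi[:-j]) / T
```

Note that whatever value sits in a missing slot of `z` is ignored: with `z = [1, 99, 3, 4]` and `g = [1, 0, 1, 1]`, the filled series is $ (1,2,3,4)$ and the lag-1 estimate is $ (1\cdot2+2\cdot3+3\cdot4)/4=5$.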

3  Long-Run Variance of Time Series with Missing Observations

In this section, we formalize our main estimators and describe their asymptotic and finite sample properties.

3.1  Missing Data Structure

Consider a second-order stationary time series $ \{z_{t}\}_{t=1}^{\infty}$ with $ \sum_{j=0}^{\infty}\vert\gamma_{z}(j)\vert<\infty$ and $ E(z_{t})=\mu$ and an indicator series $ \{g_{t}\}_{t=1}^{\infty}$ such that $ g_{t}=1$ if $ z_{t}$ is observed and $ g_{t}=0$ if $ z_{t}$ is missing. Throughout the paper, we maintain the assumptions on the missing data structure:

Assumption 1.  Independence: We assume that the underlying series $ \{z_{t}\}$ is independent of the series $ \{g_{t}\}$. In other words, for any positive integer $ n<\infty$ and any sequence $ t_{1},...,t_{n}$, the random variable $ z\equiv(z_{t_{1}},...,z_{t_{n}})$ and $ g\equiv(g_{t_{1}},...,g_{t_{n}})$ satisfy the condition that $ \mathrm{Pr}(z^{-1}(A)\cap g^{-1}(B))=\mathbf{\mathrm{Pr}}(z^{-1}(A))\mathbf{\mathrm{Pr}}(g^{-1}(B))$ for any two $ n$-dimensional Borel sets $ A$, $ B\subseteqq\mathbb{R}^{n}$.
Assumption 2.   Existence: $ \lim_{T\rightarrow\infty}(S/T)=\alpha$ and $ \lim_{T\rightarrow\infty}[1/S]\sum_{t=j+1}^{T}g_{t}g_{t-j}=\kappa(j)$ both exist, where $ S=\sum_{t=1}^{T}g_{t}$ is the number of observed data points.

Assumption 1 requires that the missing process is independent of the underlying data, so that missing data do not induce bias in the parameter estimates. Assumption 2 requires that the fraction of observed data converges, and that the asymptotic ratio of the number of observed lag-$ j$ pairs to the total number of observed data points exists. Under these assumptions, we allow very general stochastic or deterministic missing data processes. Two commonly observed missing data structures are as follows:

Example 1.   Bernoulli missing: The series $ \{g_{t}\}_{t=1}^{\infty}$ has an i.i.d. Bernoulli distribution, in which each $ g_{t}$ takes value 0 with probability $ p$ and value 1 with probability $ 1-p$.
Example 2.   Cyclically missing: Given a series $ \{z_{t}\}_{t=1}^{\infty}$ , we can divide the series into cycles which are each of length $ h>2$. In the first cycle of length $ h$, we have $ k$ missing observations for some integer $ k<h$. Define the set of time indexes of these missing observations, $ S=\{s_{1},...,s_{k}\}$, where the integers $ s_{k}\in[1,h]$ for all $ k$. For $ t\leq h$, $ g_{t}=0$ if and only if $ t\in S$. In a cyclically missing structure, for $ t>h$, we have $ g_{s_{k}+hl}=0$ for all integers $ l=1,2,...,\infty$, and $ g_{t}=1$ otherwise.

The indicator series $ \{g_{t}\}$ is stochastic for Bernoulli missing and deterministic for cyclical missing once the missing pattern is known for any $ h$ consecutive elements.
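The two example structures can be generated as follows (an illustrative sketch; function names are ours):

```python
import numpy as np

def bernoulli_indicator(T, p, seed=0):
    """Example 1: i.i.d. Bernoulli indicator; g_t = 0 (missing) with
    probability p, g_t = 1 (observed) with probability 1 - p."""
    rng = np.random.default_rng(seed)
    return (rng.random(T) >= p).astype(int)

def cyclical_indicator(T, h, missing_in_cycle):
    """Example 2: deterministic cycle of length h; the positions in
    `missing_in_cycle` (0-indexed within the cycle) are missing in
    every cycle."""
    g = np.ones(T, dtype=int)
    for s in missing_in_cycle:
        g[s::h] = 0
    return g
```

For business daily data, `cyclical_indicator(T, 7, [5, 6])` marks every Saturday and Sunday of a week starting on Monday as missing.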

3.2  Newey-West Estimator

First, we review the standard Newey-West estimator that applies to time series without missing observations. Suppose that $ z_{t}$ is continuously observed at $ t=1,...,T$ with $ E(z_{t})=\mu$. We let $ \gamma_{z}(j)=E[(z_{t}-\mu)(z_{t-j}-\mu)]$ denote the $ j$-th lagged autocovariance. Under the standard assumption that $ z_{t}$ is second-order stationary with $ \sum_{j=0}^{\infty}\vert\gamma_{z}(j)\vert<\infty$, we have the standard results that $ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(z_{t}-\mu)\overset{d}{\rightarrow}N(0,\Omega)$ , where the long-run variance of the underlying process $ z_{t}$ is equal to

$\displaystyle \Omega=\sum_{j=-\infty}^{\infty}\gamma_{z}(j).$ (1)

The Newey-West HAC estimator for $ \Omega$ is given by

$\displaystyle \hat{\Omega}^{NW}=\hat{\gamma}_{z}(0)+2\sum_{j=1}^{m}w(j,m)\hat{\gamma}_{z}(j), $

where $ \hat{\gamma}_{z}(j)=\frac{1}{T}\sum_{t=j+1}^{T}(z_{t}-\bar{z}_{T})(z_{t-j}-\bar{z}_{T})$ and $ \bar{z}_{T}=(1/T)\sum_{t=1}^{T}z_{t}$. In the Newey-West formula, the lagged autocovariances, $ \hat{\gamma}_{z}(j)$, are weighted by the Bartlett kernel, $ w(j,m)=1-[j/(m+1)]$ for $ j\leq m$ and $ w(j,m)=0$ otherwise, to ensure a positive semi-definite covariance matrix. Under fairly general technical assumptions, as long as $ \lim_{T\rightarrow\infty}m(T)=\infty$ and $ \lim_{T\rightarrow\infty}\left[m(T)/T^{1/4}\right]=0$, we have $ \hat{\Omega}^{NW}\overset{p}{\rightarrow}\Omega$ (Newey and West, 1987). Data-dependent choices of the bandwidth $ m$ are given by Andrews (1991) and Newey and West (1994), who further explore the properties of alternative choices for the bandwidth and kernel and the finite sample properties of these estimators.
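The asymptotic conditions on $ m(T)$ leave the finite-sample bandwidth open. One commonly cited rule of thumb for the Bartlett kernel, $ m=\lfloor 4(T/100)^{2/9}\rfloor$, can serve as a simple default (this rule is an illustration, not the data-dependent procedures of Andrews (1991) or Newey and West (1994)):

```python
import math

def bartlett_bandwidth(T):
    """Rule-of-thumb Bartlett-kernel bandwidth: floor(4 * (T/100)^(2/9)).
    Satisfies m(T) -> infinity with m(T)/T^(1/4) -> 0."""
    return math.floor(4 * (T / 100) ** (2 / 9))
```

For example, $ T=100$ gives $ m=4$ and $ T=50$ gives $ m=3$.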

3.3  Long-Run Variance of the Underlying Process - Parzen Estimator

In the presence of missing observations, we follow Parzen (1963) and recast the series as an amplitude modulated version of some underlying full series. We define the amplitude modulated series, $ \{z_{t}^{*}\}$, as $ z_{t}^{*}=g_{t}z_{t}$. Using the amplitude modulated series, $ \{z_{t}^{*}\}$, Parzen (1963) suggests the following estimator for the autocovariance of the underlying series $ \{z_{t}\}$:

$\displaystyle \hat{\gamma}_{z}^{PZ}(j)=\frac{\sum_{t=j+1}^{T}(z_{t}^{*}-g_{t}\bar{z}_{T}^{*})(z_{t-j}^{*}-g_{t-j}\bar{z}_{T}^{*})}{\sum_{t=j+1}^{T}g_{t}g_{t-j}}, $

if $ \Sigma_{t=j+1}^{T}g_{t}g_{t-j}>0$. Parzen (1963) establishes $ \hat{\gamma}_{z}^{PZ}(j)\overset{p}{\rightarrow}\gamma_{z}(j)$ provided that $ z_{t}^{*}$ is asymptotically stationary.

Under the special case that $ \lim_{T\rightarrow\infty}\sum_{t=j+1}^{T}g_{t}g_{t-j}>0$ for all $ j$, we can use the observed data to construct our Parzen estimator, which is a Newey-West type consistent estimator of the long-run variance of the underlying process $ z_{t}$:

$\displaystyle \hat{\Omega}^{PZ}=\hat{\gamma}_{z}^{PZ}(0)+2\sum_{j=1}^{m}w(j,m)\hat{\gamma}_{z}^{PZ}(j)\overset{p}{\rightarrow}\Omega. $

While this object may be useful in some instances, it is incorrect for inference testing. First, Dunsmuir and Robinson (1981b) study the case in which $ w(j,m)=1$, and point out that $ \hat{\Omega}^{PZ}$ may not be positive semi-definite.2 Second, as we further demonstrate, the long-run variance of the underlying process differs from the long-run variance of the observed process. Though the Parzen estimator is formed using observed data only, it is a consistent estimator of the variance of the underlying process. Consequently, inference on the observed data will be invalid if we use the Parzen estimate of the variance.

3.4  Long-Run Variance of the Observed Process

Let $ S=\sum_{t=1}^{T}g_{t}$ be the total number of observed data points. The sample mean is given by $ \bar{z}_{T}^{*}=\frac{1}{S}\sum_{t=1}^{T}z_{t}^{*}$. The asymptotic mean and variance of $ \bar{z}_{T}^{*}$ are given by the following proposition.

Proposition 1.   $ \bar{z}_{T}^{*}\overset{p}{\rightarrow}\mu$ and $ \Omega^{*}\equiv\lim_{T\rightarrow\infty}S\cdot E(\bar{z}_{T}^{*}-\mu)^{2}=\sum_{j=-\infty}^{\infty}\kappa(j)\gamma_{z}(j)$ .

Proof. Given $ E(g_{t})=\lim_{T\rightarrow\infty}(S/T)=\alpha$ and $ g_{t}$ is independent of $ z_{t}$, we have $ E(z_{t}^{*})=E(g_{t})E(z_{t})=\alpha\mu.$ We can rewrite $ \lim_{T\rightarrow\infty}\bar{z}_{T}^{*}=\lim_{T\rightarrow\infty}\frac{1}{S}\sum_{t=1}^{T}z_{t}^{*}=\lim_{T\rightarrow\infty}\frac{T}{S}\frac{1}{T}\sum_{t=1}^{T}z_{t}^{*}.$ We know that $ \lim_{T\rightarrow\infty}(S/T)=\alpha$. By the law of large numbers, we have $ \lim_{T\rightarrow\infty}\frac{1}{T}\sum_{t=1}^{T}z_{t}^{*}=\alpha\mu.$

Therefore, $ \lim_{T\rightarrow\infty}\bar{z}_{T}^{*}=\mu.$ We also have

$\displaystyle S\cdot E(\bar{z}_{T}^{*}-\mu)^{2}$ $\displaystyle =$ $\displaystyle [1/S]E\left[\sum_{t=1}^{T}g_{t}(z_{t}-\mu)\right]^{2}$  
  $\displaystyle =$ $\displaystyle [1/S]E\left[\sum_{t=1}^{T}(z_{t}-\mu)^{2}g_{t}^{2}+2\sum_{j=1}^{T-1}\sum_{t=j+1}^{T}(z_{t}-\mu)(z_{t-j}-\mu)g_{t}g_{t-j}\right]$  
  $\displaystyle =$ $\displaystyle [T/S]\left[\gamma_{z}(0)E\left(\frac{1}{T}\sum_{t=1}^{T}g_{t}^{2}\right)+2\sum_{j=1}^{T-1}\gamma_{z}(j)E\left(\frac{1}{T}\sum_{t=j+1}^{T}g_{t}g_{t-j}\right)\right].$  

Define $ \kappa(j)\equiv\lim_{T\rightarrow\infty}[1/S]\sum_{t=j+1}^{T}E(g_{t}g_{t-j})$ , the share of times lag $ j$ is observed. Given $ \lim_{T\rightarrow\infty}(T/S)=1/\alpha$ and $ \lim_{T\rightarrow\infty}E\left[(1/T)\sum_{t=j+1}^{T}g_{t}g_{t-j}\right]=\alpha\kappa(j)$ , we have

$\displaystyle \Omega^{*}=\lim_{T\rightarrow\infty}S\cdot E(\bar{z}_{T}^{*}-\mu)^{2}=\sum_{j=-\infty}^{\infty}\kappa(j)\gamma_{z}(j).$ (2)

Therefore, the long-run variance of the observed amplitude modulated process, i.e., $ \Omega^{*}$, is a weighted sum of the original autocovariances, with the weights being the asymptotic ratio of the number of observed lags to the total number of observed data points, $ S$. Comparing equations (1) and (2), when $ z_{t}$ is observed at all $ t$, i.e., $ g_{t}=1$ for all $ t$, then $ \Omega^{*}=\Omega$. In the presence of missing observations, we have $ \kappa(j)\leq1$; if all autocovariances are positive, the long-run variance of the amplitude modulated process is then weakly smaller than the long-run variance of the underlying process, $ \Omega^{*}\leq\Omega$.
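The weights $ \kappa(j)$ in equation (2) can be estimated directly from the indicator series; a minimal sketch (function name ours):

```python
import numpy as np

def kappa_hat(g, j):
    """Empirical kappa(j): number of observed lag-j pairs divided by
    the number of observed data points, S."""
    g = np.asarray(g, dtype=float)
    return np.dot(g[j:], g[:-j]) / g.sum()
```

For the business-week pattern (5 of 7 days observed), each week contributes 4 observed lag-1 pairs and 5 observations, so $ \kappa(1)=4/5$; the function recovers this exactly on a long tiled indicator.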

3.4.1  Amplitude Modulated Estimator

To estimate $ \Omega^{*}$ in finite samples with $ S$ observed data points, a natural candidate is given by the following proposition.

Proposition 2.   A consistent estimator of $ \Omega^{*}$ is given by
$\displaystyle \hat{\Omega}^{*}=\hat{\gamma}_{z^{*}}(0)+2\sum_{j=1}^{T-1}\hat{\gamma}_{z^{*}}(j) $

where $ \hat{\gamma}_{z^{*}}(j)=[1/S]\sum_{t=j+1}^{T}(z_{t}^{*}-g_{t}\bar{z}_{T}^{*})(z_{t-j}^{*}-g_{t-j}\bar{z}_{T}^{*}).$

Proof. We note that

$\displaystyle \hat{\gamma}_{z^{*}}(j)$ $\displaystyle =$ $\displaystyle [1/S]\sum_{t=j+1}^{T}(z_{t}^{*}-g_{t}\bar{z}_{T}^{*})(z_{t-j}^{*}-g_{t-j}\bar{z}_{T}^{*})$  
  $\displaystyle =$ $\displaystyle \frac{T}{S}\frac{1}{T}\sum_{t=j+1}^{T}(z_{t}-\bar{z}_{T}^{*})(z_{t-j}-\bar{z}_{T}^{*})g_{t}g_{t-j}$  

Since $ \lim_{T\rightarrow\infty}T/S=1/\alpha$, $ \lim_{T\rightarrow\infty}E(z_{t}-\bar{z}_{T}^{*})(z_{t-j}-\bar{z}_{T}^{*})=\gamma_{z}(j)$ and $ \lim_{T\rightarrow\infty}E(g_{t}g_{t-j})=\alpha\kappa(j)$, we have

$\displaystyle \hat{\gamma}_{z^{*}}(j)$ $\displaystyle \overset{p}{\rightarrow}$ $\displaystyle \kappa(j)\gamma_{z}(j).$  

Therefore, $ \hat{\Omega}^{*}\overset{p}{\rightarrow}\Omega^{*}$.

However, $ \hat{\Omega}^{*}$ is not guaranteed to be positive semi-definite, which is not desirable for inference. We can use a kernel-based method to ensure the covariance estimator is positive semi-definite:

$\displaystyle \hat{\Omega}^{AM}=\hat{\gamma}_{z^{*}}(0)+2\sum_{j=1}^{m}w(j,m)\hat{\gamma}_{z^{*}}(j). $

We follow Newey and West (1987) and illustrate with the most commonly used kernel, the Bartlett kernel.

Proposition 3.   Using the Bartlett kernel, $ w(j,m)=1-[j/(m+1)]$ if $ j\leq m$ and $ w(j,m)=0$ if $ j>m$, suppose the bandwidth $ m$ satisfies $ \lim_{T\rightarrow\infty}m(T)=+\infty$ and $ \lim_{T\rightarrow\infty}[m(T)/T^{1/4}]=0$. Then $ \hat{\Omega}^{AM}$ is positive semi-definite and $ \hat{\Omega}^{AM}\overset{p}{\rightarrow}\Omega^{*}$.

Proof. We follow the proofs of Theorems 1 and 2 in Newey and West (1987), defining $ h_{t}\equiv z_{t}^{*}-g_{t}\mu$ and $ \hat{h}_{t}\equiv z_{t}^{*}-g_{t}\bar{z}_{T}^{*}$, and replacing every $ T$ in the denominators of Newey and West (1987) with $ S$.

Our estimator $ \hat{\Omega}^{AM}$ is almost equivalent to applying the Newey-West estimator to the amplitude modulated series. However, we make two minor modifications to the components $ \hat{\gamma}_{z^{*}}(j)=[1/S]\sum_{t=j+1}^{T}(z_{t}^{*}-g_{t}\bar{z}_{T}^{*})(z_{t-j}^{*}-g_{t-j}\bar{z}_{T}^{*}).$ First, we subtract $ g_{t}\bar{z}_{T}^{*}$ rather than $ E(g_{t})\bar{z}_{T}^{*}$ from $ z_{t}^{*}$, so that the difference $ (z_{t}^{*}-g_{t}\bar{z}_{T}^{*})$ equals zero for unobserved data. In the case of a mean-zero series, this modification would not be required. Secondly, since we want to use $ \hat{\Omega}^{AM}$ to make inferences about the mean of the observed process $ \bar{z}_{T}^{*}$, we divide the sum by $ S$ instead of $ T$ so that our inference procedure remains consistent.
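For concreteness, the Amplitude Modulated estimator with a Bartlett kernel can be sketched in a few lines of Python. This is illustrative code, not the paper's replication code; the function name and array conventions are ours. The series is stored with zeros at missing dates ($ z_{t}^{*}=g_{t}z_{t}$), and both modifications above (demeaning with $ g_{t}\bar{z}_{T}^{*}$ and dividing by $ S$) appear in the body:

```python
import numpy as np

def am_long_run_variance(z_star, g, m):
    """Amplitude Modulated (AM) long-run variance with a Bartlett kernel.

    z_star : array with zeros at missing dates (z*_t = g_t * z_t)
    g      : 0/1 indicator of observed dates
    m      : bandwidth
    """
    S = g.sum()                      # number of observed data points
    zbar = z_star.sum() / S          # sample mean of the observed values
    h = z_star - g * zbar            # zero at unobserved dates by construction
    omega = np.dot(h, h) / S         # gamma_hat(0), divided by S rather than T
    for j in range(1, m + 1):
        gamma_j = np.dot(h[j:], h[:-j]) / S   # unobserved pairs contribute zero
        w = 1.0 - j / (m + 1.0)               # Bartlett weight
        omega += 2.0 * w * gamma_j
    return omega
```

With no missing observations ($ g_{t}=1$ for all $ t$) this reduces to the standard Newey-West estimator.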

3.4.2  Equal Spacing Estimator

Instead of casting the time series with missing observations as an amplitude modulated process, an alternative method is to ignore the missing observations and treat the data as equally spaced over time. We define the function $ \iota(t)$ as the mapping from time index $ t$ to the new equal spacing time domain $ s$: $ \iota(t)=\sum_{\ell=1}^{t}g_{\ell}$. We use this mapping to relabel the time indices of the observed values from the series $ \{z_{t}\}$ to create the series $ \{z_{s}^{ES}\}$ for $ s=1,...,\iota(T)$, in the equal spacing time domain. The sample mean of the observed series is given by $ \bar{z}_{T}^{ES}=\frac{1}{S}\sum_{s=1}^{S}z_{s}^{ES}.$

Proposition 4.   $ \bar{z}_{T}^{ES}\overset{p}{\rightarrow}\mu$ and $ \Omega^{ES}\equiv\lim_{T\rightarrow\infty}S\cdot E(\bar{z}_{T}^{ES}-\mu)^{2}=\Omega^{*}$ .

Proof. We let $ \Delta_{s}^{j^{ES}}\equiv\iota^{-1}(s)-\iota^{-1}(s-j^{ES})$ be a function that maps the time gap $ (j^{ES})$ between $ z_{s}$ and $ z_{s-j^{ES}}$ in the equal spacing domain to the time gap $ (j)$ in the time domain of the underlying process $ \{z_{t}\}$. Using the indicator function $ \mathbf{I}(\cdot)$, we define $ \lambda^{j^{ES}}(j)=\lim_{T\rightarrow\infty}\frac{1}{S}\sum_{s=1}^{S}\mathbf{I}(\Delta_{s}^{j^{ES}}=j)$ , which equals the frequency that the observed lag $ j^{ES}$ maps to lag $ j$ in the original time domain. Then we can rewrite the Equal Spacing autocovariance in terms of the autocovariance of the underlying process:

$\displaystyle \gamma_{z^{ES}}(j^{ES})$ $\displaystyle =$ $\displaystyle \lim_{T\rightarrow\infty}E(z_{s}-\bar{z}_{T}^{ES})(z_{s-j^{ES}}-\bar{z}_{T}^{ES})$  
  $\displaystyle =$ $\displaystyle \sum_{j=-\infty}^{\infty}\lambda^{j^{ES}}(j)\gamma_{z}(j)$  

Applying the same standard results as in Equation 1 to the equal spacing series, we have:

$\displaystyle \Omega^{ES}$ $\displaystyle =$ $\displaystyle \sum_{j^{ES}=-\infty}^{\infty}\gamma_{z^{ES}}(j^{ES})$  
  $\displaystyle =$ $\displaystyle \sum_{j^{ES}=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\lambda^{j^{ES}}(j)\gamma_{z}(j)$  
  $\displaystyle =$ $\displaystyle \sum_{j=-\infty}^{\infty}\left(\sum_{j^{ES}=-\infty}^{\infty}\lambda^{j^{ES}}(j)\right)\gamma_{z}(j)$  
  $\displaystyle =$ $\displaystyle \sum_{j=-\infty}^{\infty}\kappa(j)\gamma_{z}(j)=\Omega^{*}.$  

The second-to-last equality holds because

$\displaystyle \sum_{j^{ES}=-\infty}^{\infty}\lim_{T\rightarrow\infty}\frac{1}{S}\sum_{s=1}^{S}\mathbf{I}(\Delta_{s}^{j^{ES}}=j)=\lim_{T\rightarrow\infty}\frac{1}{S}\sum_{t=1}^{T}g_{t}g_{t-j}. $

To estimate $ \Omega^{*}$ using the equally spaced series in finite samples, we can use

$\displaystyle \hat{\Omega}^{ES}=\hat{\gamma}_{z^{ES}}(0)+2\sum_{j=1}^{m}w(j^{ES},m)\hat{\gamma}_{z^{ES}}(j^{ES}) $

where

$\displaystyle \hat{\gamma}_{z^{ES}}(j^{ES})=\frac{1}{S}\sum_{s=j^{ES}+1}^{S}(z_{s}^{ES}-\bar{z}_{T}^{ES})(z_{s-j^{ES}}^{ES}-\bar{z}_{T}^{ES}). $

Proposition 5.   $ \hat{\Omega}^{ES}$ is positive semi-definite and $ \hat{\Omega}^{ES}\overset{p}{\rightarrow}\Omega^{*}$.

Proof. Positive semi-definiteness of $ \hat{\Omega}^{ES}$ can be established using the same argument as in the proof of Theorem 1 in Newey and West (1987). Using their notation, we let $ \hat{h}_{s}=z_{s}^{ES}-\bar{z}_{T}^{ES}$. To prove consistency, since

$\displaystyle \hat{\gamma}_{z^{ES}}(0)+2\sum_{j^{ES}=1}^{S-1}\hat{\gamma}_{z^{ES}}(j^{ES})=\hat{\gamma}_{z^{*}}(0)+2\sum_{j=1}^{T-1}\hat{\gamma}_{z^{*}}(j) $

and $ w(j,m)\overset{p}{\rightarrow}1$ for all $ j$, the $ \hat{\Omega}^{ES}$ and $ \hat{\Omega}^{AM}$ estimators are asymptotically equivalent. We know $ \hat{\Omega}^{AM}\overset{p}{\rightarrow}\Omega^{*}$ by Proposition 3, and hence $ \hat{\Omega}^{ES}\overset{p}{\rightarrow}\Omega^{*}$.

The Equal Spacing estimator is particularly simple to implement, because it only requires relabeling the time index of a series with missing observations to ignore the gaps and treat the data as equally spaced over time. Once this is done, the Equal Spacing estimator amounts to applying the standard Newey-West estimator to the equally spaced series.
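The relabel-and-apply-Newey-West recipe can be sketched as follows (our own illustrative code, with assumed function and variable names): dropping the missing dates is exactly the relabeling $ \iota(t)$, after which the standard Bartlett-kernel estimator is applied to the compressed series.

```python
import numpy as np

def es_long_run_variance(z, g, m):
    """Equal Spacing (ES) estimator: drop missing dates, relabel, apply Newey-West.

    z : full series (values at missing dates are ignored)
    g : 0/1 indicator of observed dates
    m : bandwidth
    """
    z_es = np.asarray(z, dtype=float)[np.asarray(g) == 1]   # relabeled series
    S = len(z_es)
    h = z_es - z_es.mean()
    omega = np.dot(h, h) / S
    for j in range(1, min(m, S - 1) + 1):
        w = 1.0 - j / (m + 1.0)                             # Bartlett weight
        omega += 2.0 * w * np.dot(h[j:], h[:-j]) / S
    return omega
```

Because the gaps are discarded before the kernel is applied, any off-the-shelf Newey-West routine can be reused unchanged on the relabeled series.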

3.5  Finite Samples

Although $ \hat{\Omega}^{AM}$ and $ \hat{\Omega}^{ES}$ are both asymptotically consistent, their finite sample performance may differ due to the different weights they place on the autocovariance estimators. We use the standard mean squared error (MSE) criterion to evaluate the performance of the estimator $ \hat{\Omega}^{i}$, where $ i\in\{AM,ES\}$.

$\displaystyle MSE(\hat{\Omega}^{i})$ $\displaystyle =$ $\displaystyle Bias^{2}(\hat{\Omega}^{i})+Var(\hat{\Omega}^{i})$  
  $\displaystyle =$ $\displaystyle \left[E(\hat{\Omega}^{i}-\Omega^{*})\right]^{2}+E[(\hat{\Omega}^{i}-\bar{\Omega}^{i})^{2}]$  

where $ \bar{\Omega}^{i}=E(\hat{\Omega}^{i})$. Consider the case in which $ m^{AM}=m^{ES}\equiv m$. The lag length in the original time domain is weakly greater than that in the equal spacing domain: $ j=\iota^{-1}(s)-\iota^{-1}(s-j^{ES})\geq j^{ES}$. Under the same fixed bandwidth $ m$, there are two main differences between $ \hat{\Omega}^{AM}$ and $ \hat{\Omega}^{ES}$. First, since the kernel weight is decreasing in the lag length for the Newey-West estimator, $ \hat{\Omega}^{ES}$ assigns weakly higher weights to all autocovariance estimators than $ \hat{\Omega}^{AM}$. To see this more explicitly,

$\displaystyle \hat{\Omega}^{ES}=\sum_{j^{ES}=1}^{m}\frac{w(j^{ES},m)}{S}\sum_{s=j^{ES}+1}^{S}(z_{s}-\bar{z}_{T}^{*})(z_{s-j^{ES}}-\bar{z}_{T}^{*}) $

To write it in the original time domain, we have

$\displaystyle \hat{\Omega}^{ES}=\sum_{j^{ES}=1}^{m}\frac{w(j^{ES},m)}{S}\sum_{t=j+1}^{T}(z_{t}-\bar{z}_{T}^{*})(z_{t-j}-\bar{z}_{T}^{*}) $

where $ t=\iota^{-1}(s)$ and $ j=\iota^{-1}(s)-\iota^{-1}(s-j^{ES})\geq j^{ES}$. We compare $ \hat{\Omega}^{ES}$ with $ \hat{\Omega}^{AM}$,

$\displaystyle \hat{\Omega}^{AM}=\sum_{j=1}^{m}\frac{w(j,m)}{S}\sum_{t=j+1}^{T}(z_{t}-\bar{z}_{T}^{*})(z_{t-j}-\bar{z}_{T}^{*})g_{t}g_{t-j}. $

When $ g_{t}g_{t-j}=1$, we have $ w(j^{ES},m)\geq w(j,m)$ since the weighting function decreases in lag length and $ j\geq j^{ES}$. Therefore, given the same bandwidth, $ \hat{\Omega}^{ES}$ puts weakly more weight than $ \hat{\Omega}^{AM}$ on each observed pairwise product $ (z_{t}-\bar{z}_{T}^{*})(z_{t-j}-\bar{z}_{T}^{*})$. Second, for the same fixed bandwidth $ m$, $ \hat{\Omega}^{AM}$ only estimates autocovariances with lag length up to $ m$ in the original time domain, while $ \hat{\Omega}^{ES}$ also includes observations for lags greater than $ m$ in the original time domain. These two differences have different implications for the relative variance and bias of the two estimators.
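A tiny numeric illustration of this weight comparison (the gap pattern here is hypothetical): consider a pair of observations four periods apart in calendar time, with two missing dates in between, so that the pair sits at lag $ j=4$ for the AM estimator but at lag $ j^{ES}=2$ for the ES estimator.

```python
def bartlett(j, m):
    """Bartlett kernel weight: w(j, m) = max(0, 1 - j/(m+1))."""
    return max(0.0, 1.0 - j / (m + 1.0))

m = 3
w_am = bartlett(4, m)   # AM estimator: original-domain lag j = 4
w_es = bartlett(2, m)   # ES estimator: equal-spacing lag j_es = 2
```

Here the AM estimator assigns the pair zero weight (the lag exceeds the bandwidth), while the ES estimator assigns it weight 0.5, in line with $ w(j^{ES},m)\geq w(j,m)$.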

As discussed in den Haan and Levin (1997), Newey-West type kernel-based estimators suffer from three sources of finite sample bias. First, the summation in the autocovariance estimator is divided by the sample size, instead of the actual number of observed lags. We expect this source of bias to be more severe for $ \hat{\Omega}^{ES}$ because $ \hat{\Omega}^{ES}$ includes higher-order lags that are not included in $ \hat{\Omega}^{AM}$ and puts more weight on these high-order biased lagged autocovariance estimators. However, this bias decreases rapidly as the sample size increases. Second, the kernel-based method assigns zero weights to lags with orders greater than $ T$. This source of bias is the same for $ \hat{\Omega}^{ES}$ and $ \hat{\Omega}^{AM}.$

The third and most significant source of bias is driven by the fact that kernel-based estimators under-weight the autocovariance estimators. They assign weights to autocovariance estimators that are less than unity and are declining toward zero with increasing lag order $ j$. Compared with the long-run variance of the amplitude modulated series, $ \Omega^{*}$, the bias of $ \hat{\Omega}^{AM}$ arising from this source is given by

$\displaystyle Bias(\hat{\Omega}^{AM})=-\sum_{j=-(T-1)}^{T-1}\left[1-w(j,m)\right]\gamma_{z^{*}}(j). $

For a fixed bandwidth, the higher the serial correlation, the more severe the bias. The estimator $ \hat{\Omega}^{ES}$ can reduce this kernel-based bias because $ \hat{\Omega}^{ES}$ always assigns weakly higher (or closer to unitary) weight to all autocovariance estimators as compared to $ \hat{\Omega}^{AM}$.

For the variance of the estimators, we always have $ Var(\hat{\Omega}^{ES})>Var(\hat{\Omega}^{AM})$ because $ \hat{\Omega}^{ES}$ includes more high-order lags that are relatively poorly estimated. Therefore, the tradeoff between variance and bias determines the relative finite sample performance of $ \hat{\Omega}^{AM}$ and $ \hat{\Omega}^{ES}$. The previous discussion uses a fixed bandwidth. Andrews (1991) and Newey and West (1994) propose data-dependent choices of the bandwidth that aim to optimize the bias-variance tradeoff. We apply the automatic bandwidth selection procedure proposed by Newey and West (1994) to the AM and ES processes. As we will demonstrate using Monte Carlo simulations, under both fixed and automatic bandwidth selection, for small sample sizes and low autocorrelation, $ MSE(\hat{\Omega}^{ES})>MSE(\hat{\Omega}^{AM})$. For moderate sample sizes or high autocorrelation, we always have $ MSE(\hat{\Omega}^{AM})>MSE(\hat{\Omega}^{ES})$.


4  Regression Model with Missing Observations

We can apply the asymptotic theory developed in the previous section to a regression model with missing observations. Suppose we have the time series regression $ y_{t}=x_{t}'\beta+u_{t}$, $ t=1,\ldots,T$, where $ y_{t}$ and $ u_{t}$ are scalars, $ x_{t}$ is a $ k\times1$ vector of regressors, and $ \beta$ is a $ k\times1$ vector of unknown parameters. Suppose further that $ \left(\frac{1}{T}\sum_{t=1}^{T}x_{t}x_{t}^{'}\right)^{-1}\overset{p}{\rightarrow}\Sigma_{xx}^{-1}$ and $ \mathbb{E}(u_{t}\vert x_{t})=0$, but the $ u_{t}$'s have conditional heteroskedasticity and are possibly serially correlated. In the presence of missing observations, we let $ g_{t}=1$ if $ y_{t}$ and all components of $ x_{t}$ are observed and $ g_{t}=0$ if $ y_{t}$ or any component of $ x_{t}$ is missing. Then we can re-express the regression in terms of amplitude modulated processes,

$\displaystyle y_{t}^{*}=x_{t}^{*}\beta+u_{t}^{*},\; t=1,\ldots,T, $

where $ y_{t}^{*}=g_{t}y_{t}$, $ x_{t}^{*}=g_{t}x_{t}$ and $ u_{t}^{*}=g_{t}u_{t}$. We require the orthogonality condition, $ \mathbb{E}(u_{t}^{*}\vert x_{t}^{*})=0$. The standard result for the OLS estimator is given by

$\displaystyle \hat{\beta}^{AM}-\beta=\left(\sum_{t=1}^{T}x_{t}^{*}x_{t}^{*'}\right)^{-1}\left(\sum_{t=1}^{T}x_{t}^{*}u_{t}^{*}\right).$ (3)

Alternatively, without recasting the series as an amplitude modulated process, we ignore all observations for which $ g_{t}=0$ and assume all observed values are equally spaced in time. Therefore, the estimated regression becomes

$\displaystyle y_{s}^{ES}=x_{s}^{ES}\beta+u_{s}^{ES},\; s=1,\ldots,S $

and

$\displaystyle \hat{\beta}^{ES}-\beta=\left(\sum_{s=1}^{S}x_{s}^{ES}x_{s}^{ES'}\right)^{-1}\left(\sum_{s=1}^{S}x_{s}^{ES}u_{s}^{ES}\right).$ (4)

Comparing equations (3) and (4), we can easily see that AM and ES give the same coefficient estimates:

$\displaystyle \hat{\beta}^{AM}=\hat{\beta}^{ES}\equiv\hat{\beta}. $

We normalize $ \hat{\beta}$ using the number of observed data points, $ S$, and then we have

$\displaystyle \sqrt{S}(\hat{\beta}-\beta)=\left(\frac{1}{S}\sum_{t=1}^{T}x_{t}^{*}x_{t}^{*'}\right)^{-1}\left(\frac{1}{\sqrt{S}}\sum_{t=1}^{T}x_{t}^{*}u_{t}^{*}\right). $

Given that $ \left(\frac{1}{T}\sum_{t=1}^{T}x_{t}x_{t}^{'}\right)^{-1}\overset{p}{\rightarrow}\Sigma_{xx}^{-1}$ in the absence of missing observations and $ x_{t}$ and $ g_{t}$ are independent, $ \left(\frac{1}{S}\sum_{t=1}^{T}x_{t}^{*}x_{t}^{*'}\right)^{-1}$ also converges in probability. We let $ \left(\frac{1}{S}\sum_{t=1}^{T}x_{t}^{*}x_{t}^{*'}\right)^{-1}\overset{p}{\rightarrow}\Sigma_{x^{*}x^{*}}^{-1}.$ Using the notation from the previous section, we define $ z_{t}\equiv x_{t}u_{t}$ and let $ z_{t}^{*}\equiv g_{t}z_{t}$ denote the amplitude modulated series and $ z_{s}^{ES}$ denote the ES series. Then we have

$\displaystyle \bar{z}_{T}^{*}\equiv\frac{1}{S}\sum_{t=1}^{T}z_{t}^{*}=\frac{1}{S}\sum_{t=1}^{S}z_{s}^{ES}. $

We know $ E(z_{t})=E(\bar{z}_{T}^{*})=0$ using the orthogonality condition.

Proposition 6.   The asymptotic distribution of the OLS estimator is given by

$\displaystyle \sqrt{S}(\hat{\beta}-\beta)\overset{d}{\rightarrow}N(0,\Sigma_{x^{*}x^{*}}^{-1}\Omega^{*}\Sigma_{x^{*}x^{*}}^{-1}), $

where $ \Omega^{*}=\sum_{j=-\infty}^{\infty}\kappa(j)\gamma_{z}(j)$ and $ \kappa(j)=\lim_{T\rightarrow\infty}\frac{1}{S}\sum_{t=j+1}^{T}E(g_{t}g_{t-j})$ .

To estimate $ \Sigma_{x^{*}x^{*}}^{-1}$, we can use

$\displaystyle \hat{\Sigma}_{x^{*}x^{*}}^{-1}=\left(\frac{1}{S}\sum_{t=1}^{T}x_{t}^{*}x_{t}^{*'}\right)^{-1}=\left(\frac{1}{S}\sum_{s=1}^{S}x_{s}^{ES}x_{s}^{ES'}\right)^{-1}\overset{p}{\rightarrow}\Sigma_{x^{*}x^{*}}^{-1}. $

Proposition 7.   We define

$\displaystyle \hat{\Omega}^{AM}=\hat{\Gamma}_{0}^{AM}+\sum_{j=1}^{m}w(j,m)[\hat{\Gamma}_{j}^{AM}+\hat{\Gamma}_{j}^{AM\prime}], $

where $ \hat{\Gamma}_{j}^{AM}=(1/S)\sum_{t=j+1}^{T}z_{t}^{*}z_{t-j}^{*'}$ ,

$\displaystyle \hat{\Omega}^{ES}=\hat{\Gamma}_{0}^{ES}+\sum_{j=1}^{m}w(j,m)[\hat{\Gamma}_{j}^{ES}+\hat{\Gamma}_{j}^{ES\prime}], $

where $ \hat{\Gamma}_{j}^{ES}=(1/S)\sum_{s=j+1}^{S}z_{s}^{ES}z_{s-j}^{ES'}$ . Then we have $ \hat{\Omega}^{AM}\overset{p}{\rightarrow}\Omega^{*}$ and $ \hat{\Omega}^{ES}\overset{p}{\rightarrow}\Omega^{*}$. For inference, the t-statistic based on $ \hat{\Omega}^{AM}$ is given by

$\displaystyle t_{k}^{AM}=\frac{\hat{\beta}_{k}-\beta_{0}}{\sqrt{\hat{V}_{kk}^{AM}/(S-k)}}\overset{d}{\rightarrow}N(0,1), $
where $ \hat{V}_{kk}^{AM}$ is the $ (k,k)$-th element of $ \hat{V}^{AM}=\hat{\Sigma}_{x^{*}x^{*}}^{-1}\hat{\Omega}^{AM}\hat{\Sigma}_{x^{*}x^{*}}^{-1}$ . Alternatively, the t-statistic based on $ \hat{\Omega}^{ES}$ is given by
$\displaystyle t_{k}^{ES}=\frac{\hat{\beta}_{k}-\beta_{0}}{\sqrt{\hat{V}_{kk}^{ES}/(S-k)}}\overset{d}{\rightarrow}N(0,1), $

where $ \hat{V}_{kk}^{ES}$ is the $ (k,k)$-th element of $ \hat{V}^{ES}=\hat{\Sigma}_{x^{*}x^{*}}^{-1}\hat{\Omega}^{ES}\hat{\Sigma}_{x^{*}x^{*}}^{-1}$ .
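The full sandwich procedure for the ES case can be sketched as follows (our own illustrative code, not the paper's; the function name and conventions are assumptions): drop the rows with missing data, run OLS, form $ z_{s}=x_{s}u_{s}$, build the Bartlett-weighted $ \hat{\Omega}^{ES}$, and combine it with $ \hat{\Sigma}_{x^{*}x^{*}}^{-1}$.

```python
import numpy as np

def hac_sandwich_es(y, X, g, m):
    """OLS on the observed rows with an ES HAC sandwich covariance (illustrative)."""
    obs = np.asarray(g) == 1
    y_es = np.asarray(y, dtype=float)[obs]
    X_es = np.asarray(X, dtype=float)[obs]
    S, k = X_es.shape
    beta_hat = np.linalg.solve(X_es.T @ X_es, X_es.T @ y_es)
    u = y_es - X_es @ beta_hat
    z = X_es * u[:, None]                       # z_s = x_s * u_s
    Sigma_inv = np.linalg.inv(X_es.T @ X_es / S)
    Omega = z.T @ z / S                         # Gamma_0
    for j in range(1, m + 1):
        Gamma = z[j:].T @ z[:-j] / S
        w = 1.0 - j / (m + 1.0)                 # Bartlett weight
        Omega += w * (Gamma + Gamma.T)
    V = Sigma_inv @ Omega @ Sigma_inv
    se = np.sqrt(np.diag(V) / (S - k))          # matches the t-statistic scaling
    return beta_hat, se
```

The returned standard errors use the $ (S-k)$ degrees-of-freedom scaling from the t-statistics above, so $ t_{k}^{ES}=(\hat{\beta}_{k}-\beta_{0})/se_{k}$.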


5  Simulation

In the Monte Carlo simulations that follow, we study the properties of our Amplitude Modulated and Equal Spacing estimators using a simple location model. To evaluate these estimators under a variety of circumstances, we generate data with various levels of autocorrelation and for a range of sample sizes. We test our estimators under the two missing structures described in Section 3.1, the Bernoulli missing structure and the deterministic cyclically missing structure. We implement the estimators using a standard fixed bandwidth for our benchmark results, and also provide results that implement the automatic bandwidth selection procedure proposed by Newey and West (1994). Our primary evaluation criteria are the empirical rejection probability of the test, and the power of the test against an appropriate alternative.

5.1  Data Structure

The inference procedures are tested on a simulated data series $ \{y_{t}\}$ that is generated using a simple location model:

$\displaystyle y_{t}=\beta+\epsilon_{t}$    
$\displaystyle \epsilon_{t}=\phi\epsilon_{t-1}+\eta_{t}$    
where $\displaystyle \eta_{t}$ i.i.d.$\displaystyle \sim\mathcal{N}(0,1)$ and $\displaystyle \epsilon_{0}=0$    

For each of $ N=100,000$ iterations, we use $ \beta=0$ and generate a data series $ \{y_{1},...,y_{T^{max}}\}$ with a sample size of $ T^{max}=24,000$. Since we run tests over a range of sample sizes for each estimator, we use the first $ T$ observations in each iteration for $ T\in$ {120, 360, 1200, 4800, 12000, 24000}. To test these methods for a range of autocorrelation parameters, $ \phi$, we generate data separately for $ \phi\in${0, 0.3, 0.5, 0.7, 0.9}. The regressor in this model is a constant 1, and we conduct simulation exercises under different missing structures for the dependent variable $ \{y_{t}\}$, which are described below. For each iteration, we generate a series $ \{g_{t}\}$, which indicates for each $ t$ whether $ y_{t}$ is observed. Finally, we generate the series $ \{y_{t}^{*}\}$, where $ y_{t}^{*}=g_{t}y_{t}$.
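The data-generating process above can be sketched as follows (illustrative code; the function names and seeding are ours, not the paper's):

```python
import numpy as np

def simulate_location_model(T, phi, beta=0.0, seed=0):
    """y_t = beta + eps_t, eps_t = phi*eps_{t-1} + eta_t, eta_t iid N(0,1), eps_0 = 0."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(T)
    eps = np.empty(T)
    prev = 0.0                       # eps_0 = 0
    for t in range(T):
        prev = phi * prev + eta[t]
        eps[t] = prev
    return beta + eps

def bernoulli_missing(T, p_miss, seed=1):
    """g_t = 1 if y_t is observed; dates are missing i.i.d. with probability p_miss."""
    rng = np.random.default_rng(seed)
    return (rng.random(T) >= p_miss).astype(int)
```

The observed series is then $ y_{t}^{*}=g_{t}y_{t}$, i.e., `y_star = g * y` elementwise.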

Given this data, we estimate the parameter of interest, $ \beta^{i}$, and the estimator for the covariance matrix, $ \hat{\Omega}^{i}$, for each estimator $ i\in\{NW,ES,AM\}$. We also perform simulations for two additional methods. First, we implement the imputation method, in which the missing $ y_{t}$ are linearly imputed before the standard Newey-West estimator is applied to the filled series $ \{y_{t}^{I}\}$. Secondly, we implement the Parzen estimator from Section 3.3. Since the estimator $ \hat{\Omega}^{PZ}$ is not positive semi-definite, we calculate the rejection rate using the number of rejections divided by the number of simulations in which $ \hat{\Omega}^{PZ}>0$.

We use these estimators to calculate the t-statistic, $ t_{\beta}^{i}$, used for a test of the null hypothesis $ H_{0}:\beta=0$ against the two-sided alternative, $ H_{a}:\beta\neq0$. We choose a 5% level of significance and reject the null hypothesis when $ \vert t_{\beta}^{i}\vert>1.96$. For the standard estimations, we use a fixed bandwidth of $ m=4(T/100)^{(2/9)}$. We also apply the automatic bandwidth selection procedure of Newey and West (1994) to the Newey-West, Equal Spacing, Amplitude Modulated, and Imputation methods.
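One cell of this design can be sketched compactly (our own code, not the paper's: $ \phi=0$, $ \beta=0$, Bernoulli missing with probability 1/2, the ES estimator, and the fixed bandwidth above; with far fewer iterations than the paper's 100,000, so the exact rejection rate depends on the draws):

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, p_miss = 2000, 200, 0.5
m = int(np.floor(4 * (T / 100) ** (2 / 9)))   # fixed bandwidth from the text
rejections = 0
for _ in range(N):
    y = rng.standard_normal(T)                # beta = 0, phi = 0
    y_es = y[rng.random(T) >= p_miss]         # drop missing dates and relabel (ES)
    S = len(y_es)
    h = y_es - y_es.mean()
    omega = h @ h / S                         # Bartlett-weighted long-run variance
    for j in range(1, m + 1):
        omega += 2.0 * (1 - j / (m + 1)) * (h[j:] @ h[:-j]) / S
    t = y_es.mean() / np.sqrt(omega / (S - 1))
    rejections += abs(t) > 1.96
rate = rejections / N                         # empirical rejection probability
```

With no autocorrelation, the resulting rate should sit near the nominal 0.05, with the mild small-sample overrejection typical of HAC tests.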

Results are reported for simulation exercises under a variety of sampling schemes. Our benchmark sampling scheme is one with a Bernoulli missing structure as described in Example 1 of Section 3.1. For these simulations, the series $ \{g_{t}\}$ has an i.i.d. Bernoulli distribution with fixed probability of missing, $ p=6/12$. For comparison, we also provide two variants in which the probability of missing is set to 4/12 and 8/12.

We also simulate data under four data structures with cyclically missing observations, as described in Example 2 of Section 3.1. For these, we choose a cycle length of 12, with 6 or 8 observations missing each cycle. In these simulations, the missing structure is cyclical in that we generate a single pattern of missing observations for the first cycle, and apply this same pattern to every cycle of 12 observations. Additionally, the pattern is deterministic in the sense that we apply the same pattern of missing observations for all iterations in the simulation. This sampling structure reflects the potential application of these methods to monthly data with missing observations. For example, because commodities futures contracts expire only some months of the year, the data on monthly commodities returns will have the same missing pattern each year. Another common application is for daily financial data, in which the same 2 weekend days are missing in each cycle of 7 days.

Under a deterministic cyclical missing structure, it is possible to have cases in which certain lagged autocovariances in the original time series domain are never observed. As we noted in the introduction, the popular statistical software Stata forbids implementation of the Amplitude Modulated estimator in this case, even when using the "force" option. Yet, our theoretical results do not require all the lags to be observed to generate asymptotically correct inference using the ES and AM estimators. Consequently, we perform simulations for deterministic cyclical missing structures under both cases: all lags are observed at least once, or some lags are never observed. We show that the finite sample performance does not differ much between these two cases, and neither case differs much from the results under the Bernoulli missing structure.

In our first two cyclical missing structures, we set the cyclical pattern of observed data such that each lagged autocovariance can be estimated from the observed data. In our cyclical structure with 6 of 12 missing, we observe $ \{z_{3},\, z_{6},\, z_{8},\, z_{9},\, z_{10},\, z_{11}\}$, and then observe the same pattern of observations for each cycle of length 12. In our next cyclical structure, we have 8 of 12 missing. We observe $ \{z_{3},\, z_{4},\, z_{7},\, z_{9}\}$ in the first cycle, and the same pattern for each cycle after that. For our final two cyclical missing structures, we require the observed data to be spaced within the cycle such that at least one lag is never observed. In our structure with 6 of 12 missing, we observe $ \{z_{1},\, z_{3},\, z_{5},\, z_{8},\, z_{10},\, z_{12}\}$ in the first cycle. Under this structure, the sixth lag is never observed. For our cyclical structure with 8 of 12 missing, we observe $ \{z_{2},\, z_{3},\, z_{6},\, z_{12}\}$, so that the fifth lag is never observed.

5.2  Evaluation Criteria

The primary evaluation criterion for these estimators is the empirical rejection probability of the tests. The empirical rejection probability measures the likelihood that the null hypothesis is rejected when it is in fact true (Type I error). Each iteration of the Monte Carlo simulation represents one hypothesis test, and the reported rejection probability reflects the fraction of iterations for which the t-statistic was large enough in magnitude to reject the null hypothesis.

We also provide measures of the power of the test, as well as measures of the bias and variance of the estimators. The power of the test measures the probability that the null hypothesis is rejected when the alternative hypothesis is true. Since we find empirical rejection probabilities that can be much higher than the 0.05 benchmark, we calculate the size-adjusted power for ease of comparability. Following Ibragimov and Müller (2010), we set the alternative hypothesis to $ H_{a}:\beta_{a}=4/\sqrt{T(1-\phi^{2})}$. To calculate the power, we first calculate $ t_{\beta_{a}}^{i}$, which is analogous to $ t_{\beta}^{i}$ for each $ i$, except that we subtract $ \beta_{a}$ instead of $ \beta_{0}$ in the numerator. For example,

$\displaystyle t_{\beta_{a}}^{NW}=\frac{\hat{\beta}^{NW}-\beta_{a}}{\sqrt{\hat{V}_{kk}^{NW}/T}}. $

Next, we calculate an adjusted critical value, $ t_{0.05}^{crit}$, which is the 95th percentile of the absolute values of the simulated t-statistics under the null. This value is equal to the critical value for which the empirical rejection probability would have been exactly 0.05 under our simulation procedure. To calculate the size-adjusted power, we calculate $ t_{\beta_{a}}^{i}$ under the alternative hypothesis above and reject when $ \vert t_{\beta_{a}}^{i}\vert>t_{0.05}^{crit}$.

Finally, in order to understand the finite sample performance, we study the empirical mean squared error of our estimators by decomposing it into the empirical variance and bias. Under the benchmark Bernoulli case, we first calculate the value of $ \Omega^{*}$ under our data generating process as:

$\displaystyle \Omega^{*}$ $\displaystyle =$ $\displaystyle \lim_{T\rightarrow\infty}(T/S)\sum_{j=-\infty}^{\infty}\gamma_{j}E(g_{t}g_{t-j})$  
  $\displaystyle =$ $\displaystyle (1/p)\left(p+2\sum_{j=1}^{\infty}p^{2}\phi^{j}\right)Var(\epsilon_{t})$  
  $\displaystyle =$ $\displaystyle \left(1+\frac{2p\phi}{1-\phi}\right)\left(\frac{1}{1-\phi^{2}}\right)$  

where $ p$ is the probability of being observed. The second equality follows because (1) $ \lim_{T\rightarrow\infty}T/S=1/p$; and (2) $ E(g_{t}g_{t-j})=p$ if $ j=0$ and $ E(g_{t}g_{t-j})=p^{2}$ if $ j\geq1$. The third equality holds because $ Var(\epsilon_{t})=1/(1-\phi^{2}).$ Returning to the MSE decomposition, we have:

$\displaystyle \widehat{MSE}$ $\displaystyle =$ $\displaystyle \widehat{Bias}^{2}+\widehat{Variance}$  
  $\displaystyle =$ $\displaystyle \left(\hat{\bar{\Omega}}^{i}-\Omega^{*}\right)^{2}+\frac{1}{N}\sum_{c=1}^{N}\left(\hat{\Omega}_{c}^{i}-\hat{\bar{\Omega}}^{i}\right)^{2},$  

where $ \hat{\bar{\Omega}}^{i}=(1/N)\sum_{c=1}^{N}\hat{\Omega}_{c}^{i}$ is the sample mean of all the covariance estimators and $ c$ indexes the $ N=100,000$ iterations. Note that for $ i=NW$, we have $ p=1$. We use these measures to study the finite sample properties of our estimators, especially to compare the AM and ES estimators.
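The closed form for $ \Omega^{*}$ derived above can be sanity-checked numerically against a truncated version of the infinite sum (our own quick check, not part of the paper):

```python
def omega_star_closed_form(p, phi):
    """Omega* = (1 + 2*p*phi/(1-phi)) / (1 - phi**2), p = prob. of being observed."""
    return (1.0 + 2.0 * p * phi / (1.0 - phi)) / (1.0 - phi ** 2)

def omega_star_truncated(p, phi, J=10000):
    """(1/p) * (p*gamma_0 + 2*sum_{j>=1} p^2*gamma_j), with gamma_j = phi^j/(1-phi^2)."""
    s = p + 2.0 * sum(p ** 2 * phi ** j for j in range(1, J + 1))
    return s / (p * (1.0 - phi ** 2))
```

With $ p=1$ (no missing observations) the expression collapses to the familiar AR(1) long-run variance $ 1/(1-\phi)^{2}$.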

The primary findings are reported in Tables 2 through 7.

5.3  Results

Table 2 provides the rejection probabilities for our two main estimators, the Equal Spacing estimator and the Amplitude Modulated estimator. We provide results under a fixed bandwidth and automatic bandwidth selection for each estimator, and for comparison purposes, present results for the Newey-West estimator applied to the full series without missing observations. Our benchmark missing data structure is the Bernoulli missing structure in which observations are missing with probability $ p=1/2$. We focus on the results for the simulation with $ T=360$, and also provide results for a large and very large sample size ($ T=1200$ or $ 24000$).

Our simulation results provide evidence that the ES and AM estimators perform well in finite samples. As is well known in the HAC literature, we find that the empirical rejection probability can be somewhat higher than the nominal 5 percent level for small samples, even when there is no autocorrelation. In addition, when the autocorrelation parameter is high, there can be substantial overrejection even for very large sample sizes ($ T=24000$). However, we do find that the rejection probability falls toward the nominal 5 percent level as the sample size increases.

We also find evidence that our ES and AM estimators are well-behaved under deterministic cyclically missing structures. In Table 3, we see little difference in the rejection rates under each of our three data structures: randomly missing under a Bernoulli structure, deterministic cyclical missing when all lags are observed, and deterministic cyclical missing when some lags are unobserved.

Table 2 also provides the empirical rejection probabilities for the Parzen and Imputation estimators for $ T=360$ and $ T=24000$. As expected, the higher serial correlation induced by the imputation procedure results in extremely high rejection rates as compared to the ES and AM estimators. We can also see in Table 2 that the results for the Parzen estimator substantiate our argument that this estimator cannot be used for robust inference for series with missing observations. In our simulation with $ \phi=0$ and $ T=360$, we found 20 instances (out of 100,000) in which the non-PSD Parzen estimator returned a negative estimate of the variance. Additionally, we find that the rejection probability is generally decreasing in the sample size but is U-shaped with respect to the autocorrelation, and often underrejects for low levels of autocorrelation.

Next we turn to a comparison of the finite sample properties of the ES and AM estimators. In the results for the fixed bandwidth estimators in Table 4, we find that for $ T=360$, the AM estimator is preferred for autocorrelation parameters $ \phi\leq0.3$, while the ES estimator is preferred for series with higher autocorrelation. For larger samples, the ES estimator is preferred for a larger range of $ \phi$, so that for $ T=24000$, we have that the ES estimator is preferred for all the simulations with nonzero autocorrelations.

To better understand these results, we turn to Table 5, which reports the empirical bias and variance for these two estimators under our benchmark simulations. As expected, when using the same fixed bandwidth, the ES estimator has a higher variance than the AM estimator for each sample size and autocorrelation. As discussed in Section 3.5, this is because compared to $ \hat{\Omega}^{AM}$, the $ \hat{\Omega}^{ES}$ estimator includes more high-order lags that are relatively poorly estimated.

With regard to the bias, the ES estimator has better performance for higher autocorrelation parameters, though this effect is mitigated and sometimes reversed in small samples. The poor small sample performance is driven by the first source of bias discussed in Section 3.5, that the summation in the autocovariance estimator is divided by the sample size rather than the actual number of observed lags. This bias declines rapidly as the sample size increases. In contrast, the bias behind the overrejection for high autocorrelations is driven by the underweighting of high-order lagged autocovariances. Since the ES estimator places a higher weight on high-order autocovariances, it has lower bias than the AM estimator when the autocorrelation is high. As the sample size grows and the first source of bias becomes less important, the ES estimator is preferred for a larger range of autocovariance parameters.

This bias and variance tradeoff changes when we implement automatic bandwidth selection. The results in Table 4 indicate that under this procedure, the AM estimator has a lower rejection probability and is thus preferred at all but the highest level of autocorrelation, for every sample size. To provide further context for this result, Table 6 reports the average selected bandwidth for each simulation. We know from den Haan and Levin (1997) that using a higher bandwidth will increase the variance of the estimator while decreasing the bias. Given that the AM estimator has a lower variance than the ES estimator when using a fixed bandwidth, it is not surprising that the automatic bandwidth selection typically chooses a higher bandwidth for the AM estimator than for the ES estimator. The incremental improvement in the bias between the fixed and automatic bandwidth selection is larger for the AM estimator than for the ES estimator. Consequently, under automatic bandwidth selection, the ES estimator is only preferred for extremely high autocorrelation, when the bias of the AM estimator is much higher than that of the ES estimator.

Turning to the size-adjusted power, we find in Table 4 that the power of the two estimators is roughly equivalent, though the power of the AM estimator is generally slightly stronger than that of the ES estimator. Just as for the Newey-West results, we have that the power falls as the autocorrelation increases, and that the size-adjusted power does not vary as the sample size increases. Unsurprisingly, we can see that the power of the test under the missing structure is weaker than for the full series, due to the smaller observed sample size. This effect is mitigated at high autocorrelation, however, and we can see that under very high autocorrelation, the power is roughly equal for the Newey-West application to the full series and for the application of our two estimators to the series with missing observations.

Finally, Table 7 presents results for varying fractions of missing observations. At low autocorrelation, the rejection rate increases as the fraction of missing observations increases. This is likely driven by the first source of finite sample bias discussed in Section 3.5, which gets worse as the fraction of missing observations increases. In contrast, at high autocorrelation, the rejection rate falls as the fraction of missing observations increases. This effect is likely due to the fact that when a higher fraction of observations is missing, the observed process is less persistent, and the estimators are better able to overcome the underweighting of the higher order autocovariances. Putting these two effects together, we have that the AM estimator is preferred for a larger range of autocorrelation parameters when a higher fraction of data is missing. This is consistent with our previous finding, that the AM estimator is preferred when the serial correlation is low.

Overall, these simulation results are consistent with our theoretical findings. The ES and AM estimators are both well-behaved for random and deterministic missing structures. In general, they are preferred to imputation methods, which may induce serial correlation in the imputed series, resulting in more severe bias and overrejection. In finite samples, we find that for the same fixed bandwidth, the ES estimator is generally less biased than the AM estimator, but has larger variance. Consequently, the ES estimator is preferred when autocorrelation is high, as the bias dominates the mean squared error in these cases. Conversely, when autocorrelation is low, the variance dominates the mean squared error, and the AM estimator is preferred.
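For the sample-mean case, the two estimators can be rendered in a few lines. This is a minimal sketch under our own naming and normalization conventions: the ES estimator applies the Bartlett-kernel (Newey-West) formula to the compacted series of observed values, while the AM estimator zeroes out the missing demeaned observations and applies the same formula over the full calendar length, normalizing by the number of observed points.

```python
import numpy as np

def bartlett_lrv(u, m, denom):
    """Bartlett-kernel (Newey-West) long-run variance of a demeaned series u,
    with bandwidth m and a caller-supplied denominator for the autocovariances."""
    lrv = np.dot(u, u) / denom
    for j in range(1, m + 1):
        w = 1.0 - j / (m + 1.0)  # Bartlett weight
        lrv += 2.0 * w * np.dot(u[j:], u[:-j]) / denom
    return lrv

def es_variance(x, m):
    """Equal Spacing: drop missing values and apply Newey-West to the compacted series."""
    z = x[~np.isnan(x)]
    u = z - z.mean()
    return bartlett_lrv(u, m, denom=u.size)

def am_variance(x, m):
    """Amplitude Modulated: zero out missing (demeaned) values and apply Newey-West
    over the full calendar length, normalizing by the number of observed points."""
    obs = ~np.isnan(x)
    u = np.where(obs, x - x[obs].mean(), 0.0)
    return bartlett_lrv(u, m, denom=obs.sum())
```

Because the zeroed-out entries contribute nothing to the cross products, the AM estimator effectively treats the missing observations as non-serially correlated, while the ES estimator treats the observed data as equally spaced.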


6  Empirical Application: Recursive Tests for a Positive Sample Mean

In this section, we present an application of our estimators to test for positive returns to investing in commodities futures contracts. While commodities tend to have positive returns on average, they also have extremely high volatility. We apply our methods to construct the sample mean and standard error of the returns series, and test whether the returns are statistically distinguishable from zero. Due to the structure of commodities futures contracts, the time series of returns have missing observations, and are therefore a natural application for our estimators.

Commodities futures contracts specify the quantity and price of a commodity to be traded on a predetermined expiration date at some point in the future. While some commodities have contracts expiring every month, many have contract months that are irregularly spaced throughout the year. For example, copper futures contracts were available for only March, May, July, September, and December until full monthly coverage began in 1989. Consequently, if we want to calculate the monthly return to investing in commodities futures over a long sample period, our monthly returns data will either reflect a fluctuating time-to-maturity, or will be an incomplete series with irregularly spaced missing observations.

For each commodity, we calculate the return as the percentage change in the price of the contract over the three months prior to the contract's expiration. For instance, we calculate the return to the December contract as the change in price from the last trading day in August to the last trading day in November. Since we want our returns series to reflect the change in the price over the same time-to-maturity, we are only able to calculate this return in the months immediately preceding contract expiration. For copper, this means we will have only five observations each year.5 The existence of irregularly spaced commodities futures contracts results in a deterministic cyclical pattern of missing observations in the constant-maturity returns series. Contract availability and spacing differs across commodities, but tends to remain constant year to year for each commodity.
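The return construction can be sketched as follows. The price series and function are hypothetical stand-ins: in practice each return uses the price of the specific expiring contract, whereas here a single monthly price series per commodity is used for illustration.

```python
def constant_maturity_returns(prices, expiry_months):
    """prices: dict mapping (year, month) -> month-end price (hypothetical single
    series per commodity, standing in for the expiring contract's price).
    Records, in the month before each expiration, the percent change over the
    preceding three months; all other months are left missing."""
    def shift(ym, k):
        # move a (year, month) pair forward or backward by k months
        y, m = ym
        m += k
        y += (m - 1) // 12
        m = (m - 1) % 12 + 1
        return (y, m)

    out = {}
    for ym, p in prices.items():
        prev = shift(ym, -3)
        nxt = shift(ym, 1)
        if nxt[1] in expiry_months and prev in prices:
            out[ym] = 100.0 * (p / prices[prev] - 1.0)
    return out
```

With the pre-1989 copper expiry months {3, 5, 7, 9, 12}, for example, the December contract's return is recorded in November as the percent change from the end-August price, and the output has exactly five observations per full year.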

In this application, we calculate the sample mean for three representative commodities: copper, soybean oil, and lean hogs.6 We apply our Equal Spacing, Amplitude Modulated, and Imputation estimators to calculate the HAC standard error of the sample mean, and test the hypothesis that the sample mean is significantly different from zero at the five percent level of significance. Under the Imputation method, the missing observations are linearly imputed, while the Equal Spacing and Amplitude Modulated estimators use only the observed data to calculate the mean and standard error. Because it does not deliver robust results, we omit inference results for the Parzen estimator.

In addition to performing the t-test for the full sample, we use a recursive method to compare our three methods across various sample sizes. For each commodity, we first calculate the sample mean and standard error over the first twelve months of the sample, and perform a t-test of the mean for just this sample window. We then recursively perform the same test for an expanding sample window, adding one month at a time until the full sample is covered. Lastly, we perform the same type of recursive tests starting with the last twelve months of the full sample. In this backwards recursive test, we use an earlier starting month in each iteration until the sample window again covers the full sample. Having the forward and backwards recursive results allows us to note any structural shifts that may have occurred over time. Figures 2 and 3 depict the results, and Table 8 provides an overview of rejection rates over the full set of recursive results.
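The forward recursion can be sketched as below. Here `se_func` is a placeholder for whichever HAC standard error of the sample mean is in use (ES, AM, or Imputation); the backwards version simply runs the same loop over windows anchored at the end of the sample.

```python
import numpy as np

def recursive_rejections(x, se_func, min_window=12, crit=1.96):
    """Forward recursive t-tests of a zero mean: starting from the first
    `min_window` months, expand the window one month at a time and record
    whether |mean / se| exceeds the critical value. Missing months are NaN
    and are handled inside the standard error estimator itself."""
    flags = []
    for end in range(min_window, len(x) + 1):
        window = x[:end]
        obs = window[~np.isnan(window)]
        flags.append(abs(obs.mean() / se_func(window)) > crit)
    return flags
```

The fraction of `True` entries across all windows corresponds to the rejection rates summarized in Table 8.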

Figure 2 shows the sample mean and 95% confidence intervals constructed using the Imputation and Equal Spacing methods. (The Amplitude Modulated estimator is omitted from the figure for clarity.) The means are very similar across the two methods for most sample windows. The primary difference is that, as expected, the Imputation estimate of the standard error is generally smaller than the Equal Spacing and Amplitude Modulated estimates, resulting in a higher rejection rate for Imputation. In the figure, we have shaded the samples for which the hypothesis is rejected under the Imputation method but not rejected under Equal Spacing. The fraction of shaded iterations ranges from 3.2% for lean hogs to 18.2% for copper in the forward recursive results. In the backwards recursive results, the fraction of shaded iterations ranges from 0% for soybean oil to 18.4% for copper.

It is unsurprising that the Imputation method results in a higher rejection rate relative to the Equal Spacing and Amplitude Modulated methods. While in many cases naive imputation is likely to bias the parameter of interest, we have tried to construct an example with little to no bias. However, since the imputed observations are constructed from the observed data rather than drawn from the underlying distribution of the data, the standard error of the imputed series is likely lower than that of the observed series. Additionally, the high serial correlation induced in the imputed series makes it likely that the Newey-West estimator will underestimate its standard error. For these reasons, hypothesis tests based on the imputed series are likely to overreject. Without knowing the true mean of the series, in this application we cannot know which method gives us the "right" conclusion for a larger fraction of the tests. Nevertheless, the examples illustrate that the Imputation and Equal Spacing methods can lead to different conclusions in a number of cases, depending on the available data.
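The mechanism behind the induced serial correlation is easy to demonstrate in a stylized simulation (ours, not the paper's): linearly interpolating every other observation of a white noise series creates strong positive lag-one autocorrelation where none existed, so a kernel estimator applied to the imputed series sees persistence that is an artifact of the imputation.

```python
import numpy as np

def lag1_autocorr(u):
    """Sample lag-1 autocorrelation of a series."""
    u = u - u.mean()
    return np.dot(u[1:], u[:-1]) / np.dot(u, u)

rng = np.random.default_rng(0)
n = 10001
z = rng.standard_normal(n)                     # white noise: essentially uncorrelated
x = z.copy()
miss = np.arange(1, n - 1, 2)                  # every other interior month "missing"
x[miss] = 0.5 * (z[miss - 1] + z[miss + 1])    # linear imputation from the neighbors

# imputation induces strong positive serial correlation (about 2/3 in this design)
print(round(lag1_autocorr(z), 2), round(lag1_autocorr(x), 2))
```

Each imputed point is an average of its observed neighbors, so adjacent points share a common component by construction, which is exactly the channel through which the Newey-West standard error of the imputed series is understated.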


7  Conclusion

This paper provides two simple solutions to the common problem of conducting heteroskedasticity and autocorrelation consistent (HAC) inference when some observations are missing. Our definitions of the Amplitude Modulated and Equal Spacing estimators are simply formal descriptions of ad hoc practices that are already in use. Yet, by formalizing these procedures, we are able to provide theoretical results that clear up some of the existing confusion in the literature. By studying the estimators and their properties, we provide justification for their past application to data observed only on business days and for their implementation in common statistical software packages such as Stata. We also justify their application under a wide variety of missing data structures, including deterministic and cyclical missing structures.

Our theoretical discussion of the estimators highlights a few main conclusions. After establishing the difference between the long-run variance of the underlying and observed series, we demonstrate that the Amplitude Modulated and Equal Spacing estimators are both consistent for the long-run variance of the observed series. This distinction is important, as we also show that it is the long-run variance of the observed series that is required to construct t-statistics for inference, such as in a regression setting. In addition to discussing the asymptotic properties of the estimators, we provide some discussion of their finite sample properties, based on our previous understanding of the finite sample properties of HAC estimators more generally.

We also provide simulation results and apply our estimators to a real world problem involving missing data in commodities futures returns. These results provide further evidence supporting our description of the asymptotic and finite sample behavior of the estimators. In addition, these exercises yield conclusions that can guide practitioners choosing between the estimators in applied work. Though this paper focuses on applying the estimators in a time series setting, they can also be naturally extended for application in a panel setting. We leave this extension for future work.


References

Acharya, V. V., and T. C. Johnson (2007): "Insider trading in credit derivatives," Journal of Financial Economics, 84(1), 110–141.

Andrews, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59(3), 817–858.

Beber, A., M. W. Brandt, and K. A. Kavajecz (2009): "Flight-to-Quality or Flight-to-Liquidity? Evidence from the Euro-Area Bond Market," Review of Financial Studies, 22(3), 925–957.

Bloomfield, P. (1970): "Spectral Analysis with Randomly Missing Observations," Journal of the Royal Statistical Society, Series B (Methodological), 32(3), 369–380.

Brillinger, D. R. (1973): "Estimation of the Mean of a Stationary Time Series by Sampling," Journal of Applied Probability, 10(2), 419–431.

_____ (1984): "Statistical Inference for Irregularly Observed Processes," in Time Series Analysis of Irregularly Observed Data, ed. by E. Parzen, pp. 36–57. Springer-Verlag, New York.

Chow, G. C., and A.-l. Lin (1971): "Best Linear Unbiased Interpolation, Distribution, and Extrapolation of Time Series by Related Series," The Review of Economics and Statistics, 53(4), 372–375.

Chow, G. C., and A.-l. Lin (1976): "Best Linear Unbiased Estimation of Missing Observations in an Economic Time Series," Journal of the American Statistical Association, 71(355), 719–721.

Clinger, W., and J. W. Van Ness (1976): "On Unequally Spaced Time Points in Time Series," The Annals of Statistics, 4(4), 736–745.

den Haan, W. J., and A. T. Levin (1997): "A Practitioner's Guide to Robust Covariance Matrix Estimation," in Robust Inference, ed. by G. Maddala, and C. Rao, vol. 15 of Handbook of Statistics, pp. 299–342. Elsevier Science B.V.

Denton, F. T. (1971): "Adjustment of Monthly or Quarterly Series to Annual Totals: An Approach Based on Quadratic Minimization," Journal of the American Statistical Association, 66(333), 99–102.

Dunsmuir, W., and P. M. Robinson (1981a): "Asymptotic Theory for Time Series Containing Missing and Amplitude Modulated Observations," Sankhyā: The Indian Journal of Statistics, Series A, 43(3), 260–281.

_____ (1981b): "Estimation of Time Series Models in the Presence of Missing Data," Journal of the American Statistical Association, 76(375), 560–568.

Eaton, J., S. Kortum, B. Neiman, and J. Romalis (2011): "Trade and the Global Recession," NBER Working Paper No. 16666.

Forni, L., L. Monteforte, and L. Sessa (2009): "The General Equilibrium Effects of Fiscal Policy: Estimates for the Euro Area," Journal of Public Economics, 93(3-4), 559–585.

Harvey, A. C., and R. G. Pierse (1984): "Missing Observations in Economic Time Series," Journal of the American Statistical Association, 79(385), 125–131.

Ibragimov, R., and U. K. Müller (2010): "t-Statistic Based Correlation and Heterogeneity Robust Inference," Journal of Business and Economic Statistics, 28(4), 453–468.

Jorion, P., and G. Zhang (2007): "Good and Bad Credit Contagion: Evidence from Credit Default Swaps," Journal of Financial Economics, 84(3), 860–883.

Kiefer, N. M., and T. J. Vogelsang (2002a): "Heteroskedasticity-Autocorrelation Robust Standard Errors Using the Bartlett Kernel Without Truncation," Econometrica, 70(5), 2093–2095.

_____ (2002b): "Heteroskedasticity-Autocorrelation Robust Testing Using Bandwidth Equal to Sample Size," Econometric Theory, 18(6), 1350–1366.

_____ (2005): "A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests," Econometric Theory, 21(6), 1130–1164.

Kiefer, N. M., T. J. Vogelsang, and H. Bunzel (2000): "Simple Robust Testing of Regression Hypotheses," Econometrica, 68(3), 695–714.

Kim, M. S., and Y. Sun (2011): "Spatial Heteroskedasticity and Autocorrelation Consistent Estimation of Covariance Matrix," Journal of Econometrics, 160(2), 349–371.

McKenzie, C. R., and C. A. Kapuscinski (1997): "Estimation in a linear model with serially correlated errors when observations are missing," Mathematics and Computers in Simulation, 44(1), 1–9.

Newey, W. K., and K. D. West (1987): "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55(3), 703–708.

_____ (1994): "Automatic Lag Selection in Covariance Matrix Estimation," Review of Economic Studies, 61(4), 631–653.

Pan, J., and K. J. Singleton (2008): "Default and Recovery Implicit in the Term Structure of Sovereign CDS Spreads," Journal of Finance, 63(5), 2345–2384.

Parzen, E. (1963): "On Spectral Analysis with Missing Observations and Amplitude Modulation," Sankhyā: The Indian Journal of Statistics, Series A, 25(4), 383–392.

Petersen, M. A. (2009): "Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches," The Review of Financial Studies, 22(1), 435–480.

Phillips, P. C. B., and S. N. Durlauf (1986): "Multiple Time Series Regression with Integrated Processes," The Review of Economic Studies, 53(4), 473–495.

Phillips, P. C. B., and V. Solo (1992): "Asymptotics for Linear Processes," The Annals of Statistics, 20(2), 971–1001.

Scheinok, P. A. (1965): "Spectral Analysis with Randomly Missed Observations: The Binomial Case," The Annals of Mathematical Statistics, 36(3), 971–977.

White, H. (1984): Asymptotic Theory for Econometricians. Academic Press.

Zhang, B. Y., H. Zhou, and H. Zhu (2009): "Explaining Credit Default Swap Spreads with the Equity Volatility and Jump Risks of Individual Firms," Review of Financial Studies, 22(12), 5099–5131.

Figure 1: Time Series of Returns

Figure 1 plots the monthly constant-maturity futures returns series for soybean oil, lean hogs, and copper, with regularly recurring gaps at the months in which no return is observed. Soybean oil returns are available from 1960 through 2011, lean hogs from 1966 through 2011, and copper from 1960 through 1989.

Table 1: Summary of Theoretical Results

Population parameters:
  Underlying series: $ \Omega = {\displaystyle\sum_{j=-\infty}^{\infty}} \gamma_{z}(j)$
  Amplitude Modulated process: $ \Omega^{\ast} = {\displaystyle\sum_{j=-\infty}^{\infty}} \kappa(j)\gamma_{z}(j)$ (Proposition 1)
  Equal Spacing process: $ \Omega^{ES} = \Omega^{\ast}$ (Proposition 4)

Estimators:
  Underlying series: $ \widehat{\Omega}^{PZ} \xrightarrow{p} \Omega$
  Amplitude Modulated process: $ \widehat{\Omega}^{AM} \xrightarrow{p} \Omega^{\ast}$ (Proposition 3)
  Equal Spacing process: $ \widehat{\Omega}^{ES} \xrightarrow{p} \Omega^{\ast}$ (Proposition 5)

Table 2: Benchmark Results

Empirical rejection rate (percent):

Autocorrelation $ (\phi)$:      0.0     0.3     0.5     0.7     0.9

Sample Size T=360
  ES, fixed bw                  6.0     7.0     8.0    10.7    23.1
  AM, fixed bw                  5.6     6.7     8.3    12.6    30.9
  Parzen, fixed bw              6.6     4.6     4.5     6.1    19.2
  Imputation, fixed bw          8.9     9.7    11.2    15.5    34.4
  NW, fixed bw                  5.6     7.2     9.1    14.2    33.7

Sample Size T=1200
  ES, fixed bw                  5.3     6.0     6.8     8.7    18.8
  AM, fixed bw                  5.2     6.0     7.2    10.6    26.7
  Parzen, fixed bw              5.5     3.6     3.3     4.4    14.9
  Imputation, fixed bw          8.1     8.5     9.6    13.1    29.9
  NW, fixed bw                  5.2     6.4     7.9    11.9    29.4

Sample Size T=24000
  ES, fixed bw                  5.1     5.4     5.8     6.6    11.0
  AM, fixed bw                  5.1     5.4     6.1     7.5    16.5
  Parzen, fixed bw              5.1     3.0     2.3     2.2     6.3
  Imputation, fixed bw          6.3     6.5     7.0     8.4    17.9
  NW, fixed bw                  5.1     5.6     6.3     7.9    17.5

Notes: Table reports for a range of sample sizes and autocorrelation parameters the empirical rejection rate as defined in Section 5.2. Data follow our benchmark missing structure, the Bernoulli structure with probability of missing 6/12. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.

Table 3: Varying Missing Structure

Empirical rejection rate (percent):

Autocorrelation $ (\phi)$:                          0.0     0.3     0.5     0.7     0.9

Randomly missing, Bernoulli structure
  ES, fixed bw                                      6.0     7.0     8.0    10.7    23.1
  AM, fixed bw                                      5.6     6.7     8.3    12.6    30.9

Deterministic cyclically missing, all lags observed
  ES, fixed bw                                      5.9     6.9     7.9    10.1    22.2
  AM, fixed bw                                      5.4     6.5     8.0    11.9    30.1

Deterministic cyclically missing, some lags unobserved
  ES, fixed bw                                      6.0     6.5     7.5     9.8    22.1
  AM, fixed bw                                      5.5     6.3     8.1    12.8    32.0

Full sample benchmark
  NW, fixed bw                                      5.6     7.2     9.1    14.2    33.7

Notes: Table reports for a range of autocorrelation parameters the empirical rejection rate for a sample size of T = 360 under varying missing structures as described in Section 5.1: the Bernoulli missing structure, the deterministic cyclical missing structure in which all lags are observed, and the deterministic cyclical missing structure in which some lags are never observed. The probability of missing is 6/12 for the Bernoulli structure, while the cyclical structures have exactly 6 of 12 observations missing in each cycle. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.

Table 4: Finite Samples: Fixed and Automatic Bandwidth Selection

                            Empirical rejection rate         Size-adjusted power
Autocorrelation ($\phi$)   0.0   0.3   0.5   0.7   0.9     0.0   0.3   0.5   0.7   0.9

Sample size T=360
  ES, fixed bw             6.0   7.0   8.0  10.7  23.1    78.9  63.5  49.2  32.7  14.0
  AM, fixed bw             5.6   6.7   8.3  12.6  30.9    79.4  64.1  50.0  33.0  14.2
  ES, auto bw              6.8   7.7   8.7  10.0  16.8    77.8  62.4  48.2  31.7  13.6
  AM, auto bw              6.0   7.0   8.0   9.9  19.7    79.0  63.7  49.1  32.4  13.8
  NW, fixed bw             5.6   7.2   9.1  14.2  33.7    97.6  82.3  62.2  38.1  14.7
  NW, auto bw              5.9   7.4   8.1  10.0  19.8    97.5  81.5  61.2  37.2  14.4
Sample size T=1200
  ES, fixed bw             5.3   6.0   6.8   8.7  18.8    80.2  65.2  51.4  34.1  14.5
  AM, fixed bw             5.2   6.0   7.2  10.6  26.7    80.4  65.5  51.5  34.3  14.6
  ES, auto bw              5.5   6.3   6.9   7.4  10.8    79.9  64.9  50.5  33.5  14.2
  AM, auto bw              5.3   6.1   6.4   7.4  12.6    80.2  65.5  51.0  33.9  14.4
  NW, fixed bw             5.2   6.4   7.9  11.9  29.4    97.8  83.0  63.0  38.6  15.0
  NW, auto bw              5.3   6.3   6.6   7.5  12.6    97.8  82.6  62.7  38.1  14.8
Sample size T=24000
  ES, fixed bw             5.1   5.4   5.8   6.6  11.0    80.2  65.4  51.0  33.8  14.4
  AM, fixed bw             5.1   5.4   6.1   7.5  16.5    80.3  65.4  51.1  33.7  14.4
  ES, auto bw              5.1   5.4   5.7   5.7   6.2    80.2  65.3  50.9  33.6  14.4
  AM, auto bw              5.1   5.4   5.5   5.7   6.4    80.3  65.4  51.0  33.7  14.3
  NW, fixed bw             5.1   5.6   6.3   7.9  17.5    97.8  83.1  63.0  38.6  14.9
  NW, auto bw              5.1   5.4   5.5   5.6   6.4    97.8  83.1  63.1  38.7  14.9

Notes: Table reports for a range of sample sizes and autocorrelation parameters the empirical rejection rate and size-adjusted power as defined in Section 5.2. Data follow our benchmark missing structure, the Bernoulli structure with probability of missing 6/12. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.

Table 5: Empirical Bias and Variance

                                      Bias                                    Variance
Autocorrelation ($\phi$)   0.0      0.3     0.5     0.7     0.9      0.0     0.3     0.5     0.7     0.9

Sample size T=360
  ES, fixed bw           0.00076  0.0201  0.174    3.38   1,051   0.0428  0.103   0.283   1.44    42.5
  AM, fixed bw           0.00018  0.0192  0.239    5.69   1,461   0.0273  0.070   0.189   0.86    18.6
  ES, auto bw            0.00293  0.0327  0.194    2.22     590   0.0594  0.142   0.449   2.81   132.0
  AM, auto bw            0.00072  0.0222  0.154    2.43     819   0.0356  0.098   0.334   2.03    78.9
  NW, fixed bw           0.00018  0.0653  0.889   22.0   15,769   0.0220  0.084   0.280   1.55    43.5
  NW, auto bw            0.00075  0.0590  0.410    7.70   2,999   0.0309  0.156   0.628   4.93   250.6
Sample size T=1200
  ES, fixed bw           0.00010  0.0097  0.100    2.19     830   0.0153  0.037   0.105   0.56    18.8
  AM, fixed bw           0.00002  0.0111  0.157    4.20   1,277   0.0094  0.024   0.068   0.33     8.1
  ES, auto bw            0.00041  0.0127  0.080    0.81     243   0.0215  0.060   0.218   1.34    89.6
  AM, auto bw            0.00009  0.0101  0.059    0.92     394   0.0124  0.042   0.144   1.00    55.4
  NW, fixed bw           0.00002  0.0414  0.608   16.58   5,085   0.0078  0.0305  0.105   0.61    19.4
  NW, auto bw            0.00010  0.0255  0.161    2.93   1,422   0.0109  0.0733  0.282   2.62   179.2
Sample size T=24000
  ES, fixed bw           0.00000  0.0018  0.021    0.51     308   0.0038  0.011   0.065   3.04     0.0
  AM, fixed bw           0.00000  0.0024  0.037    1.18     683   0.0022  0.007   0.037   1.29     0.0
  ES, auto bw            0.00000  0.0020  0.011    0.09      18   0.0099  0.039   0.196  20.66     0.0
  AM, auto bw            0.00000  0.0017  0.007    0.08      27   0.0069  0.018   0.153  14.94     0.0
  NW, fixed bw           0.00000  0.0094  0.146    4.71   2,730   0.0032  0.0119  0.082   3.60     0.0
  NW, auto bw            0.00000  0.0034  0.018    0.25      98   0.0117  0.0407  0.451  53.36     0.0

Notes: Table reports empirical bias squared and variance as defined in Section 5.2. Data follow our benchmark missing structure, the Bernoulli structure with probability of missing 6/12. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.
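The bias-squared and variance entries can be computed from a set of Monte Carlo draws of the long-run variance estimator. A minimal sketch, assuming the Section 5.2 definitions match the standard decomposition, and using the AR(1) long-run variance $\sigma^2/(1-\phi)^2$ with unit innovation variance as the target:

```python
import numpy as np

def bias_sq_and_variance(estimates, phi):
    """Empirical bias squared and variance of long-run variance estimates
    for an AR(1) with coefficient phi and unit innovation variance."""
    omega_true = 1.0 / (1.0 - phi) ** 2     # spectral density at frequency zero
    est = np.asarray(estimates, dtype=float)
    bias_sq = (est.mean() - omega_true) ** 2
    variance = est.var()                    # mean squared deviation from the MC mean
    return bias_sq, variance
```

For $\phi = 0.5$ the true long-run variance is $1/(1-0.5)^2 = 4$, so a set of estimates all equal to 4 has zero bias and zero variance.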

Table 6: Automatically Selected Bandwidths

Mean selected bandwidth, 6 of 12 observations missing
Autocorrelation ($\phi$)               0.0    0.3    0.5    0.7    0.9

T=360 (fixed bandwidth: m(T)=5)
  ES, auto bw                          5.9    5.1    5.0    6.4   10.0
  AM, auto bw                          5.5    5.0    6.4    9.8   13.0
  NW, auto bw                          5.4    5.4    8.0   11.4   14.1
T=1200 (fixed bandwidth: m(T)=6)
  ES, auto bw                          6.4    6.0    7.0   10.8   17.2
  AM, auto bw                          6.2    6.7   10.5   16.5   22.2
  NW, auto bw                          6.2    8.1   13.2   18.8   23.9
T=24000 (fixed bandwidth: m(T)=12)
  ES, auto bw                         13.2   14.6   20.2   33.3   65.2
  AM, auto bw                         13.2   18.0   31.8   55.7   94.0
  NW, auto bw                         13.2   23.5   39.0   62.3   98.3

Notes: Table reports for a range of autocorrelation parameters and sample sizes the mean of the bandwidth selected by the Newey and West (1994) procedure described in Section 5.1. Data follow our benchmark missing structure as described in Section 5.1, the Bernoulli structure with probability of missing 6/12. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.
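For reference, one common implementation of the Newey and West (1994) plug-in bandwidth for the Bartlett kernel is sketched below. This is our reading of the standard rule, not the paper's posted code; the pilot truncation $n = 4(T/100)^{2/9}$ and the constant 1.1447 are the usual Bartlett-kernel choices.

```python
import numpy as np

def nw1994_bandwidth(v):
    """Newey-West (1994) automatic bandwidth for the Bartlett kernel,
    applied to a demeaned scalar series v."""
    T = len(v)
    n = int(np.floor(4.0 * (T / 100.0) ** (2.0 / 9.0)))   # pilot truncation lag
    sigma = [np.dot(v, v) / T if j == 0 else np.dot(v[j:], v[:-j]) / T
             for j in range(n + 1)]
    s0 = sigma[0] + 2.0 * sum(sigma[1:])                  # pilot long-run variance
    s1 = 2.0 * sum(j * sigma[j] for j in range(1, n + 1)) # first-moment analogue
    gamma = 1.1447 * ((s1 / s0) ** 2) ** (1.0 / 3.0)
    return int(min(np.floor(gamma * T ** (1.0 / 3.0)), T))
```

As in the table above, the selected bandwidth grows with both the sample size and the persistence of the series.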

Table 7: Varying Fraction of Missings

                            Empirical rejection rate         Size-adjusted power
Autocorrelation ($\phi$)   0.0   0.3   0.5   0.7   0.9     0.0   0.3   0.5   0.7   0.9

4 of 12 missing
  ES, fixed bw             5.9   7.0   8.2  11.3  26.6    89.2  72.7  56.2  35.8  14.5
  AM, fixed bw             5.6   6.9   8.6  13.1  32.4    89.4  73.2  56.6  36.2  14.6
  ES, auto bw              6.4   7.5   8.5   9.7  17.4    88.7  72.0  55.0  34.9  14.0
  AM, auto bw              5.9   7.2   8.0   9.8  19.8    89.2  72.5  55.6  35.5  14.2
6 of 12 missing
  ES, fixed bw             6.0   7.0   8.0  10.7  23.1    78.9  63.5  49.2  32.7  14.0
  AM, fixed bw             5.6   6.7   8.3  12.6  30.9    79.4  64.1  50.0  33.0  14.2
  ES, auto bw              6.8   7.7   8.7  10.1  16.8    77.8  62.4  48.2  31.7  13.6
  AM, auto bw              6.0   7.0   8.0   9.9  19.7    79.0  63.7  49.1  32.4  13.8
8 of 12 missing
  ES, fixed bw             6.4   7.1   7.8   9.6  18.7    61.0  50.5  41.2  29.1  13.6
  AM, fixed bw             5.6   6.4   7.6  11.2  28.3    62.5  51.9  42.5  29.9  13.9
  ES, auto bw              7.8   8.4   9.1  10.5  15.7    57.6  48.2  39.6  27.7  13.1
  AM, auto bw              5.9   6.6   7.6   9.5  19.4    61.9  51.4  41.8  29.2  13.6
Full sample benchmark
  NW, fixed bw             5.6   7.2   9.1  14.2  33.7    97.6  82.3  62.2  38.1  14.7
  NW, auto bw              5.9   7.4   8.1  10.0  19.8    97.5  81.5  61.2  37.2  14.4

Notes: Table reports for a range of autocorrelation parameters the empirical rejection rate and size-adjusted power for a sample size of T = 360 under varying probability of missing observation. Data follow our benchmark missing structure as described in Section 5.1, the Bernoulli structure with probability of missing 4/12, 6/12, or 8/12. The Newey-West estimation provided for comparison uses the full series of simulated data without any missing observations. See Sections 3 and 4 for details on the estimators and the regression context, and Section 5.1 for details on the simulation parameters.

Table 8: Rejection Rates for Recursive Tests

Panel A: 5% Significance Level

                              Copper            Soybean Oil        Lean Hogs
                          Forward  Backward  Forward  Backward  Forward  Backward
Imputation                  64.8     28.2      25.3      0.0      93.7     18.3
Amplitude Modulated         59.7     26.5      18.1      0.0      91.4      8.8
Equal Spacing               46.7      9.8      12.4      0.0      90.7      8.4
AM rejects, IM does not      0.0      1.2       0.0      0.0       0.2      0.0
ES rejects, IM does not      0.0      0.0       0.0      0.0       0.2      0.0
IM rejects, AM does not      5.2      2.9       7.2      0.0       2.4      9.5
IM rejects, ES does not     18.2     18.4      12.9      0.0       3.2      9.9

Table 8: Rejection Rates for Recursive Tests

Panel B: 10% Significance Level

                              Copper            Soybean Oil        Lean Hogs
                          Forward  Backward  Forward  Backward  Forward  Backward
Imputation                  77.2     34.6      63.5     22.2      96.1     26.1
Amplitude Modulated         72.9     34.0      29.7      0.0      94.6     15.1
Equal Spacing               65.4     31.4      24.8      0.0      93.5     15.1
AM rejects, IM does not      0.0      0.0       0.0      0.0       0.0      0.0
ES rejects, IM does not      0.0      0.0       0.0      0.0       0.0      0.0
IM rejects, AM does not      4.3      0.6      33.8     22.2       1.5     11.0
IM rejects, ES does not     11.8      3.2      38.7     22.2       2.6     11.0

Notes: Table reports the rejection rate for the test of the hypothesis that the recursive sample mean is equal to zero. The final row of the first panel corresponds to the shaded areas in Figures 2 and 3.

Figure 2: Recursive Test Results - Sample Mean and Error Bands

Figure 2: This figure plots the recursive sample mean and 95% confidence intervals for the imputation and equal spacing methods. The recursive sample mean is calculated as the mean of returns over the period from the start of the sample to sample end date (plotted on the x-axis). Shaded areas indicate recursive samples for which the imputation method finds statistical significance while the equal spacing method finds none.

Figure 3: Backwards Recursive Test Results - Sample Mean and Error Bands

Figure 3: This figure plots the backwards recursive sample mean and 95% confidence intervals for the imputation and equal spacing methods. The backwards recursive sample mean is calculated as the mean of returns over the period from the end of the sample to sample start date (plotted on the x-axis). Note that the sample size is increasing with earlier start dates, moving left to right on the plot. Shaded areas indicate recursive samples for which the imputation method finds statistical significance while the equal spacing method finds none.


Footnotes

**  We would like to thank James Stock and Rustam Ibragimov for their invaluable guidance. We would also like to thank three anonymous referees, as well as John Campbell, Gary Chamberlain, Dobrislav Dobrev, Neil Ericsson, Jose Luis Montiel, Ulrich Müller, John Rogers, Clara Vega, Jonathan Wright, and seminar participants at the Federal Reserve Board and Harvard University for helpful comments and discussions. Carissa Tudor provided excellent research assistance. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. Return to text

  Division of International Finance, Board of Governors of the Federal Reserve System, Washington, DC, USA. Email: [email protected]  Return to text

  Department of Economics, Harvard University, Cambridge, MA, USA. Email: [email protected]  Return to text

1.  To make it easier for researchers to apply these estimators, we have posted Matlab code for both estimators on our websites. We also have posted a basic simulation code that reports empirical rejection rates, size-adjusted power, bias, and variance for the Equal Spacing, Amplitude Modulated, Imputation, and (full sample) Newey-West estimators. Researchers can use the simulation code to evaluate the performance of the estimators under customized sample size, autocorrelation, and missing structure before choosing which estimator to implement.  Return to text

2.  In our simulations, we implement the estimator using Bartlett kernel weights to maintain comparability with results for ES and AM. However, this does not guarantee that the estimator will be positive semi-definite.  Return to text

3.  Full simulation results are available upon request.  Return to text

4.  Interestingly, the test using the PZ estimator is well-behaved when the autocorrelation is 0. This is consistent with our theoretical results: when there is no autocorrelation, the long-run variances of the underlying and observed series are asymptotically equivalent, and consequently $ \hat{\Omega}^{PZ}$ and $ \hat{\Omega}^{AM}$ are asymptotically equivalent as well.  Return to text

5.  We restrict the full sample for copper to the period with missing contract months, 1960-1989.  Return to text

6.  We selected one representative commodity from each of the major commodity types (metals, animal, and agricultural). We omit the energy commodities, as these commodity contracts do not have missing contract months.  Return to text


Last update: July 19, 2013