For a while my first lecture of the graduate econometric theory course here has included several cautionary tales designed to make students wary of moment-based statistical procedures. Yes, this blog was bound to get mired in matters statistical eventually; it is surprising that it took this long.
One of these cautionary tales was based on an American Statistician paper by Bruce Lindsay
and Prasanta Basak called "Moments Determine the Tail of a Distribution (But Not Much Else)."
Rarely has there been a more accurately titled paper in the statistical literature. The main
result of their paper was to provide bounds on how different two distribution functions, say F and G, can be, given that they have the same first n moments. They provide a very nice extension of an old result of Akhiezer for the general case, and give an example showing that for the Gaussian distribution G there is an F with the same first 60 moments for which |F(0) - G(0)| > 0.1116. This seems rather astonishing: even after matching the first 60 (!) moments, we cannot expect to be within 1/10 at the median of the distribution.
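For reference, the moments being matched are those of the standard Gaussian, which vanish at odd orders and are double factorials at even orders:

mu_j = E Z^j = (j-1)!! = (j-1)(j-3)...3.1 for even j, and mu_j = 0 for odd j.

So matching the first 60 moments means matching mu_2 = 1, mu_4 = 3, mu_6 = 15, and so on up to mu_60 = 59!!, which is roughly 3 x 10^40; the sheer size of these numbers already hints at the numerical trouble described below.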
Whenever I thought about this, I wondered what these F would look like. Could we actually calculate a particular example to illustrate this sort of phenomenon? Such problems are usually called method of moment spaces problems, and it occurred to me a little while ago that it would be easy to compute such animals by standard linear programming methods. My first
attempt at this employed Mosek, my favorite general linear programming engine, and when
I tried to find an F to match the first 20 Gaussian moments, I discovered that the problem
was numerically pretty nasty: the moments grow quite large and the resulting linear programming problem becomes quite ill-conditioned. This led me to wonder whether the solution that Mosek produced, claiming an F that differed by 0.1846 at zero, was really reliable. At this point I decided
that it might be fun to try to solve the same problem in Mathematica, since it was capable of
doing so in extended precision. So with some help and encouragement from my colleague,
Steve Portnoy, I wrote the following simple program:
GaussianMomentMatching[n_, m_] := Module[{fm, gm, x, A, c, b, B, p},
  (* fm[u, v] = v^u: one entry of the moment constraint matrix *)
  fm[u_, v_] := v^u;
  (* Gaussian moments: E Z^j = (j - 1)!! for even j, 0 for odd j *)
  gm[j_] := If[OddQ[j - 1], (j - 1)!!, 0];
  (* grid of 2m + 1 candidate support points on [-8, 8] *)
  x = 8 Table[j, {j, -m, m}]/m;
  (* (n + 1) x (2m + 1) constraint matrix; 0^0 -> 1 for the j = 0 row *)
  A = Outer[fm, Range[0, n], x] /. Indeterminate -> 1;
  b = Table[gm[j], {j, n}];
  (* objective: total probability mass strictly below zero *)
  c = (1 - Sign[x])/2;
  c[[m + 1]] = 0;
  (* right-hand sides as pairs {b_i, 0}; the 0 flags an equality constraint *)
  B = Table[0, {2}, {n + 1}];
  B[[1]] = Prepend[b, 1]; (* first constraint: the masses sum to one *)
  B = Transpose[B];
  p = LinearProgramming[c, A, B, Method -> "Simplex"];
  (* gap at zero: G(0) - F(0-) = 1/2 minus the minimized mass below zero *)
  0.5 - N[c.p]
]
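Spelled out, the linear program that the code assembles puts probability masses p_i >= 0 on the grid points x_i and solves: minimize the sum of p_i over {i : x_i < 0}, subject to sum_i p_i = 1 and sum_i p_i x_i^j = mu_j for j = 1, ..., n, with mu_j the Gaussian moments above. The reported gap at zero is then 1/2 minus the minimized objective, so the computation described next amounts to evaluating

GaussianMomentMatching[20, 250]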
I am restricting attention to F's that have point masses on a grid from -8 to 8 with 2m + 1 possible values, and matching the first n moments of the Gaussian. For matching n = 20 moments with m = 250, this produces essentially the same result as Mosek: a gap of 0.1846 at zero. And now the rational solution in the p's matches the moments precisely. For n = 30, I get
a gap of 0.155. The Mathematica code is a bit slow, so I haven't tried to go beyond
30, but I'm still astonished by the fact that the gaps are so large. Clearly Lindsay is
right when he says that moment matching for distributions with unbounded support
doesn't tell us much about the center of the distribution. Of the 501 possible points of support for these solutions, only n + 1 points get positive mass: this is a feature of the linear programming formulation, since a basic solution of an LP with n + 1 equality constraints has at most n + 1 nonzero coordinates. An entertaining aspect of the Mathematica solution is that these p's with positive mass are (wildly) rational, e.g.
4016749983193866933980157909278598671875/1398521361440385903587657725093918246728368128. So what do these matching solutions
look like? I've plotted the n = 20 solution in the figure below. Note that I'm really plotting the cube
root of the probability masses in an effort to make the small tail masses visible. Not surprisingly,
the solution looks quite symmetric since the odd moments must all match zero, but a little mass
out in the tails can make the moments mimic the Gaussian moments exactly.
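In case anyone wants to reproduce the figure, here is a minimal sketch of the plotting step. It assumes a hypothetical variant of the function above, GaussianMomentMatchingXP, identical except that its last line returns {x, p} rather than the gap:

(* hypothetical variant returning the grid and the solution vector *)
{x, p} = GaussianMomentMatchingXP[20, 250];
(* plot cube roots of the masses so that the tiny tail masses remain visible *)
ListPlot[Transpose[{x, CubeRoot[N[p]]}], Filling -> Axis,
  PlotRange -> All, AxesLabel -> {"x", "mass^(1/3)"}]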
Another question one might ask about all of this: how different can the quantiles of F and G be? Obviously, the medians agree exactly, but as is apparent from the figure above, there are other quantiles that can differ by as much as 0.5.
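A quick way to check this numerically, again assuming the hypothetical {x, p} variant above: build the discrete distribution from the points that actually carry mass and compare its quantiles with the Gaussian ones.

(* keep only the support points with positive mass *)
pos = Flatten[Position[p, _?Positive]];
fhat = EmpiricalDistribution[N[p[[pos]]] -> x[[pos]]];
(* largest quantile discrepancy over a grid of probabilities *)
Max[Table[Abs[Quantile[fhat, t] - InverseCDF[NormalDistribution[0, 1], t]],
  {t, 0.01, 0.99, 0.01}]]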
Several months after this was originally written, Steve Portnoy and I realized that what this LPing was really doing was just solving for the zeros of the Hermite polynomials, and therefore that we had simply reinvented Gaussian quadrature! Steve wrote a note for The American Statistician about this: http://www.tandfonline.com/doi/abs/10.1080/00031305.2014.992960