Saturday, April 18, 2020

Computational Breakthroughs for Quantile Regression

Xuming He sent me a new paper with Xiaoou Pan, Kean Ming Tan, and Wen-Xin Zhou on the computation of quantile regression.  They suggest a new way to smooth the objective function that preserves convexity and use first-order coordinate descent.  At first reading I thought that their comparison might be focused exclusively on the simplex method in my quantreg package, which is based on the 1974 Barrodale and Roberts algorithm, so I decided to do some comparisons myself with the interior point method that I call the Frisch-Newton algorithm.  My initial comparisons confirm their conclusions: the new approach has massive advantages in computational speed and a slight advantage in accuracy.  Based on past experience I'm always skeptical about accuracy comparisons, since apparent gains have often been due to some sort of shrinkage effect that, while perhaps unintended, negates the affine equivariance of the original definition.  This is not the case for Xuming's new proposal.  My interpretation of the efficiency gain is that smoothing the objective function in effect produces an estimator that locally averages several adjacent quantiles.  One might think of it as local Huberization, and it is known that, provided the bandwidth isn't too large, this can yield efficiency gains.
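For the curious, here is a minimal sketch of the two fits in R, assuming the CRAN packages quantreg and conquer; the conquer(X, y, tau) call and its coeff component reflect my reading of the package documentation rather than the authors' own benchmark code.

library(quantreg)  # rq() with method = "fn" for the Frisch-Newton fit
library(conquer)   # conquer() for the smoothed (convolution-type) estimator

set.seed(1)
n <- 1000; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% rnorm(p)) + rnorm(n)

f.fn <- rq(y ~ X, tau = 0.5, method = "fn")   # interior point (Frisch-Newton)
f.cq <- conquer(X, y, tau = 0.5)              # smoothed QR of He et al.

cbind(FN = coef(f.fn), conquer = f.cq$coeff)  # compare coefficient estimates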

My experiments are admittedly limited: dense n-by-p designs with both X and y Gaussian.  The He et al. implementation is available in R from CRAN at https://cran.r-project.org/web/packages/conquer/index.html.  Sample sizes range from 1,000 to 100,000, and p ranges from 10 to 200.  Timings (in seconds), averaged over 10 replications, look like this (a sketch of the timing loop appears after the tables):

             Frisch-Newton                   He et al
        n = 1K n = 5K n = 10K n = 100K  n = 1K n = 5K n = 10K n = 100K
p = 10   0.006  0.032   0.070    0.531  0.005  0.008   0.017    0.063
p = 50   0.051  0.292   0.499    2.056  0.010  0.016   0.030    0.228
p = 100  0.126  0.591   0.843    6.130  0.011  0.034   0.061    0.485
p = 200  0.459  1.292   2.225   21.391  0.047  0.078   0.124    0.967

The corresponding mean absolute error of the coefficient estimates, averaged over 10 replications, is:

             Frisch-Newton                   He et al
        n = 1K n = 5K n = 10K n = 100K  n = 1K n = 5K n = 10K n = 100K
p = 10   0.438  0.149   0.103    0.034  0.414  0.146   0.096    0.033
p = 50   1.627  0.730   0.524    0.157  1.447  0.682   0.498    0.152
p = 100  3.244  1.447   1.027    0.321  2.849  1.346   0.950    0.313
p = 200  7.141  2.856   2.019    0.616  6.093  2.554   1.823    0.594
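For those who want to replicate, a stripped-down version of the timing loop might look like the following; rq.fit.fnb is quantreg's bare-bones Frisch-Newton fitter, and the helper timing() is purely illustrative, not the script behind the tables above.

library(quantreg)
library(conquer)

# Average elapsed time over R replications for both solvers on a
# Gaussian n-by-p design; a sketch of the exercise, not the exact code.
timing <- function(n, p, R = 10, tau = 0.5) {
  tfn <- tcq <- numeric(R)
  for (r in 1:R) {
    X <- matrix(rnorm(n * p), n, p)
    y <- drop(X %*% rnorm(p)) + rnorm(n)
    tfn[r] <- system.time(rq.fit.fnb(cbind(1, X), y, tau = tau))["elapsed"]
    tcq[r] <- system.time(conquer(X, y, tau = tau))["elapsed"]
  }
  c(FrischNewton = mean(tfn), HeEtAl = mean(tcq))
}

timing(n = 1000, p = 10)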

Independently, Victor Chernozhukov, Ivan Fernandez-Val, and Blaise Melly have proposed new
computational methods for estimating the QR process, exploiting the "globbing" ideas that Steve Portnoy and I used in our Hare and Tortoise paper some time ago.  This also yields dramatic improvements in computational speed.  For further details, see https://arxiv.org/abs/1909.05782

Taken together, and combined with various new ideas for bootstrapping, these new developments substantially improve the scalability of QR methods.  Congratulations, and many thanks to all
contributors to this research.

Friday, April 17, 2020

Dorfman's Group Testing

The estimable Xi'an's Og https://xianblog.wordpress.com/2020/04/09/dorfmans-group-testing/ has recently pointed out that the group testing idea lately put forward for Covid-19 appears as Problem IX.9.26 in Feller, attributed to the late Robert Dorfman.  This problem is notable for being one of the easier problems in Feller, suitable for a serious undergraduate course and particularly apt in the present circumstances.  Dorfman's application was testing GIs for syphilis before sending them abroad during WWII.  Since I have already besmirched the reputation of Dorfman in a previous post, https://davoidofmeaning.blogspot.com/search?q=dorfman, it seems only fair that I recant a bit and acknowledge that this group testing idea was really quite clever and practically quite important.  I won't spoil the problem by revealing the answer, but I'll provide a snapshot of it here for those unlucky enough not to have a copy of Feller handy.  (Mine has been inherited by a terrific ex-grad student who was kind enough to send me a scan.)

[Snapshot of Feller, Problem IX.9.26]

Wednesday, April 15, 2020

Empirical Bayes Confidence Intervals

The second installment of the Chamberlain (Virtual) Econometrics Seminar
featured Tim Armstrong presenting new work on Empirical Bayes Confidence
Intervals coauthored with Michal Kolesár and Mikkel Plagborg-Møller,
available from http://arxiv.org/abs/2004.03448.  Their approach seemed to
be motivated by recent work of Chetty using James-Stein shrinkage.  The
objective was to produce "robust" confidence intervals for such linear
shrinkage estimators, and the technique employed was a clever application
of the method of moment spaces.  This seemed to perform well in the
classical Gaussian sequence model when the mixing distribution was itself
Gaussian.  But as shown by Bruce Hansen in his discussion following Tim's
presentation, the procedure exhibits severe undercoverage in a typical
Gaussian outlier model.  Thus, in the 1953 terminology of Box, it
apparently possesses neither "robustness of validity", i.e. coverage,
nor "robustness of efficiency", i.e. length.

This led me to wonder how nonparametric empirical Bayes methods would
perform under such circumstances.  Bruce was kind enough to provide the
code for his simulations, and I spent the Easter weekend exploring this
question.  The result is yet another R Vinaigrette.  I compare two
flavors of nonparametric empirical Bayes confidence intervals, both
based on what Efron calls G-modeling: a nonparametric estimate of the
mixing distribution, G, serves as a "plug-in prior", and intervals
are then based on posterior quantiles.  I consider two estimators of G:
the classical NPMLE of Kiefer and Wolfowitz, and the log-spline estimator
of Efron.  Spoiler alert: in keeping with the samokritika (self-criticism)
spirit of the Vinaigrette -- Efron wins.  Links to the text and the code
for the simulations reported therein can be found at:

~
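In the meantime, here is a minimal sketch of the posterior-quantile construction with the Kiefer-Wolfowitz plug-in prior, using GLmix from my REBayes package (which requires the Rmosek solver); the helper EBinterval and the simulation settings are illustrative assumptions, not the Vinaigrette code itself.

library(REBayes)  # for GLmix, the Kiefer-Wolfowitz NPMLE

set.seed(1)
n <- 500
theta <- c(rnorm(450), rnorm(50, mean = 5))  # Gaussian with an outlier cluster
y <- theta + rnorm(n)                        # y_i ~ N(theta_i, 1)

G <- GLmix(y)            # NPMLE of the mixing distribution G
u <- G$x                 # grid of support points
g <- G$y / sum(G$y)      # normalized mixing masses

# The posterior of theta_i given y_i is proportional to g(u) * dnorm(y_i - u),
# so intervals can be read off the posterior quantiles on the grid.
EBinterval <- function(yi, level = 0.95) {
  post <- g * dnorm(yi - u)
  post <- post / sum(post)
  Fpost <- cumsum(post)
  a <- (1 - level) / 2
  c(lower = u[which(Fpost >= a)[1]], upper = u[which(Fpost >= 1 - a)[1]])
}

t(sapply(y[1:5], EBinterval))  # 95% intervals for the first five observations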