Posts tagged ‘bias’

my first AAS. V. measurement error and EM

While we were discussing different viewpoints on the term "clustering", one of the participants led me to his colleague’s poster. This poster (I don’t remember its title or abstract) was my favorite of all the posters at the meeting. Continue reading ‘my first AAS. V. measurement error and EM’ »

Eddington versus Malmquist

During the run-up to his recent talk on logN-logS, Andreas mentioned how people are sometimes confused about the variety of statistical biases that afflict surveys. They usually know what the biases are, but often mislabel them, especially the Eddington and Malmquist types. Sort of like using “your” and “you’re” interchangeably, which to me is like nails on a blackboard. So here’s a brief summary: Continue reading ‘Eddington versus Malmquist’ »

[ArXiv] Post Model Selection, Nov. 7, 2007

Today’s arxiv-stat email included papers by Poetscher and Leeb, who have been working on post model selection inference. Model selection is sometimes mistaken for a part of statistical inference. Put simply, model selection can be considered a step prior to inference. How do you know whether your data are from a chi-square distribution or a gamma distribution? (This is a model selection problem with nested models.) Should I estimate the degrees of freedom, k, from the chi-square, or α and β from the gamma, to get the mean and its error? Will the errors of the mean be the same under both distributions? Continue reading ‘[ArXiv] Post Model Selection, Nov. 7, 2007’ »
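To make the last question concrete, here is a minimal sketch of my own, not taken from the papers. It uses simple moment estimators instead of maximum likelihood, treats β as a rate parameter (an assumption), and the function names are hypothetical. Both models give the same estimate of the mean, but the chi-square model ties the variance to the mean (variance = 2k), so the quoted error differs whenever the data are not actually chi-square.

```python
import math
import random
import statistics


def mean_and_error_chisq(data):
    """Chi-square(k): mean = k, variance = 2k. Moment estimate: k_hat = sample mean."""
    n = len(data)
    k_hat = statistics.fmean(data)
    # The model fixes the variance at 2k, so the error of the mean is sqrt(2k/n).
    return k_hat, math.sqrt(2.0 * k_hat / n)


def mean_and_error_gamma(data):
    """Gamma(alpha, beta) with beta as a rate: mean = alpha/beta, variance = alpha/beta^2."""
    n = len(data)
    m = statistics.fmean(data)
    v = statistics.variance(data)
    alpha_hat = m * m / v  # moment estimators, not MLE
    beta_hat = m / v
    mean_hat = alpha_hat / beta_hat
    return mean_hat, math.sqrt(alpha_hat / beta_hat ** 2 / n)


rng = random.Random(0)
# Simulated data: gamma with shape 2 and scale 3 (mean 6, variance 18).
# A chi-square with the same mean would have variance 12.
data = [rng.gammavariate(2.0, 3.0) for _ in range(500)]

chisq_mean, chisq_err = mean_and_error_chisq(data)
gamma_mean, gamma_err = mean_and_error_gamma(data)
print("chi-square model: mean = %.3f +/- %.3f" % (chisq_mean, chisq_err))
print("gamma model:      mean = %.3f +/- %.3f" % (gamma_mean, gamma_err))
```

With these simulated data the chi-square model reports a smaller error than the gamma model, even though the point estimates agree. Choosing the model is not a neutral step before inference.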

Coverage issues in exponential families

I’ve heard so much about coverage problems from astrostat/phystat groups, without knowing the fundamental reasons (most likely physics). This paper might be of interest to them: Interval Estimation in Exponential Families by Brown, Cai, and DasGupta; Statistica Sinica (2003), 13, pp. 19-49

Abstract summary:
The authors investigated issues in interval estimation of the mean in the exponential family, such as binomial, Poisson, negative binomial, normal, gamma, and a sixth distribution. The poor performance of the Wald interval is known not only in discrete cases but also in nonnormal continuous cases, with significant negative bias in coverage. Their computations suggested that the equal-tailed Jeffreys interval and the likelihood ratio interval are the best alternatives to the Wald interval. Continue reading ‘Coverage issues in exponential families’ »
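As a rough illustration of the coverage problem (my own sketch, not code from the paper), the exact coverage of the nominal 95% Wald interval for a binomial proportion can be computed by summing the binomial probabilities of every outcome whose interval contains the true p. The function name and the choice of n = 20 are assumptions made for the example.

```python
import math


def wald_coverage(n, p, z=1.959964):
    """Exact coverage of the Wald interval p_hat +/- z*sqrt(p_hat*(1-p_hat)/n)."""
    coverage = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            # Add the probability of observing x successes when the truth is p.
            coverage += math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
    return coverage


for p in (0.05, 0.1, 0.3, 0.5):
    print("n=20, p=%.2f: coverage = %.3f" % (p, wald_coverage(20, p)))
```

For n = 20 and p = 0.1 the coverage comes out at about 0.88 rather than the nominal 0.95, mostly because x = 0 yields a zero-width interval.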

Astrostatistics: Goodness-of-Fit and All That!

During the International X-ray Summer School, as a project presentation, I tried to explain the inappropriate use of χ^2 statistics in astronomy. If your best fit is biased (any misspecification of a model easily causes such bias), do not use χ^2 statistics to get a 1σ error meant to give a 68% chance of capturing the true parameter.

Later, I decided to investigate the subject further and came across this paper: Astrostatistics: Goodness-of-Fit and All That! by Babu and Feigelson.
Continue reading ‘Astrostatistics: Goodness-of-Fit and All That!’ »

All your bias are belong to us

Leccardi & Molendi (2007) have a paper in A&A (astro-ph/0705.4199) discussing the biases in parameter estimation when spectral fitting is confronted with low-count data. Not surprisingly, they find that the bias is higher for lower counts, for standard chisq compared to C-stat, and for grouped data compared to ungrouped. Peter Freeman talked about something like this at the 2003 X-ray Astronomy School at Wallops Island (pdf1, pdf2), and no doubt part of the problem also has to do with the (un)reliability of the fitting process when the chisq surface gets complicated.
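A toy version of the chisq-versus-C-stat effect, which is my own sketch and not the setup used in the paper: fit a constant rate to Poisson counts. Assuming chisq uses the observed counts as variances (with a floor of 1 so empty bins do not divide by zero), the minimum has a closed form that is a weighted mean with weights 1/n_i, while the C-stat minimum for a constant model is the plain sample mean. All names and the values mu = 5, 50 bins, and 2000 simulations are arbitrary choices.

```python
import math
import random


def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for small mu."""
    limit = math.exp(-mu)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def fit_constant_chisq(counts):
    """Minimizes sum (n_i - mu)^2 / max(n_i, 1): a weighted mean with weights 1/max(n_i, 1)."""
    weights = [1.0 / max(n, 1) for n in counts]
    return sum(w * n for w, n in zip(weights, counts)) / sum(weights)


def fit_constant_cstat(counts):
    """C = 2 * sum (mu - n_i * ln mu) + const is minimized at the sample mean."""
    return sum(counts) / len(counts)


def average_estimates(mu_true=5.0, n_bins=50, n_sims=2000, seed=1):
    rng = random.Random(seed)
    chisq_total = 0.0
    cstat_total = 0.0
    for _ in range(n_sims):
        counts = [poisson_draw(rng, mu_true) for _ in range(n_bins)]
        chisq_total += fit_constant_chisq(counts)
        cstat_total += fit_constant_cstat(counts)
    return chisq_total / n_sims, cstat_total / n_sims


chisq_avg, cstat_avg = average_estimates()
print("true rate 5.0; chisq average %.2f; C-stat average %.2f" % (chisq_avg, cstat_avg))
```

In this toy case the chisq estimate is pulled low because bins that fluctuate downward get larger weights, while the C-stat estimate stays close to the true rate.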

Anyway, they propose an empirical method to reduce the bias by computing the probability distribution functions (pdfs) for various simulations, and then averaging the pdfs in groups of 3. Seems to work, for reasons that escape me completely.

[Update: links to Peter's slides corrected]