The iPhone App Store has a couple of apps that make life significantly easier for those of us inundated and overwhelmed by the stream of daily arXiv preprints. These are ArXivReader.app and ArXiv.app, both providing a means to browse and search the arXiv preprint database, with the first selling for 99c and the second free. Check them out! The former even lets you save papers for off-line reading.
For me at least, the hardest part of going through the arXiv emails every day was to pick out the interesting papers in the deluge of text. These apps do the right thing and segregate the categories and highlight the titles. Fitts’ Law in action — suddenly the daily ritual is orders of magnitude more pleasant!
Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.
And it is a statistical hitch: two otherwise highly reliable and oft-used methods give contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the error of our ways. Continue reading ‘Did they, or didn’t they?’ »
There is a new report from Bernabei et al. (arXiv:0804.2741) of the direct detection of the effects of Dark Matter that is causing a lot of buzz. (The Bad Astronomer has a good summary.) They find yearly modulation in their detected scintillation rate that matches what you would expect if the Earth were rushing through Galactic Dark Matter as it goes around the Sun. They have worked out the significance of the modulation to be 8.2 sigma. Significant! But significant of what? Continue reading ‘Is 8-sigma significant enough for you?’ »
Avalanches are a common process, occurring anywhere that a system can store stress temporarily without “snapping”. They can happen on sand dunes and in solar flares as easily as in the snowbound Alps.
Melatos, Peralta, & Wyithe (arXiv:0710.1021) have a nice summary of avalanche processes in the context of pulsar glitches. Their primary purpose is to show that the glitches are indeed consistent with an avalanche, and along the way they give a highly readable description of what an avalanche is and what it entails. Briefly, avalanches result in event parameters that are distributed in scale invariant fashion (read: power laws) with exponential waiting time distributions (i.e., Poisson).
Hence the title of this post: the “Avalanche distribution” (indulge me! I’m using stats notation to bury complications!) can be thought to have two parameters, the indices of the power-law distributions that control the event sizes, a, and the event durations, b, with the event separations distributed as an exponential decay. Is there a canned statistical distribution that describes all this already? (In our work modeling stellar flares, we assumed that b=0 and found that a<-2, which has all sorts of nice consequences for coronal heating processes.)
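Lacking a canned distribution, here is a minimal sketch of how one might draw events from such a process. It covers only the sizes and the separations. I am assuming that sizes follow p(E) ∝ E^a above some lower cutoff `x_min`, with a < -1 so that the tail is normalizable, and that separations are exponential with a constant `rate`. The function names, the cutoff, and the parameter values are mine, for illustration only, and durations are left out entirely.

```python
import random


def sample_power_law(index, x_min, rng):
    """Draw one value from p(x) ∝ x**index for x >= x_min (needs index < -1)."""
    if index >= -1:
        raise ValueError("index must be < -1 for a normalizable tail")
    u = rng.random()  # uniform on [0, 1)
    # Invert the CDF F(x) = 1 - (x / x_min)**(index + 1).
    return x_min * (1.0 - u) ** (1.0 / (index + 1.0))


def simulate_avalanches(n_events, a, rate, x_min=1.0, seed=0):
    """Return a list of (time, size) pairs.

    Waiting times between events are exponential (a Poisson process with
    the given rate); sizes are power-law distributed with index a.
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    t = 0.0
    events = []
    for _ in range(n_events):
        t += rng.expovariate(rate)  # exponential separation, mean 1/rate
        events.append((t, sample_power_law(a, x_min, rng)))
    return events
```

For example, `simulate_avalanches(1000, a=-2.5, rate=2.0)` gives a thousand events with a mean separation of about 0.5 time units and no size below `x_min`.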
Hyunsook drew attention to this paper (arXiv:0709.4531v1) by Brad Schaefer on the underdispersed measurements of the distances to the LMC. He makes a compelling case that since 2002 published numbers in the literature have been hewing to an “acceptable number”, possibly in an unconscious effort to pass muster with their referees. Essentially, the best-fit distances are much more closely clustered than you would expect from the quoted sizes of the error bars. Continue reading ‘“you are biased, I have an informative prior”’ »
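One simple way to see what “underdispersed” means in practice is to compare the scatter of the measurements about their weighted mean with the quoted error bars. The sketch below computes a reduced chi-square for that purpose; a value far below 1 says the points cluster more tightly than their errors allow. This is only an illustration of the idea, not necessarily the test used in the paper, and the function name is my own.

```python
def reduced_chi_square(values, errors):
    """Return (reduced chi-square, weighted mean) for a set of measurements.

    values: best-fit measurements of the same quantity
    errors: their quoted 1-sigma uncertainties
    Assumes independent Gaussian errors, which is itself an idealization.
    """
    weights = [1.0 / e ** 2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    dof = len(values) - 1  # one degree of freedom goes to the fitted mean
    return chi2 / dof, mean
```

If the errors are honest and independent, the first returned value should hover around 1; values much smaller than that are the signature described above.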
[arXiv:0709.3093v1] Short Timescale Coronal Variability in Capella (Kashyap & Posson-Brown)
We recently submitted that paper to AJ, and rather ironically, I did the analysis during the same time frame as this discussion was going on, about how astronomers cannot rely on repeating observations. Ironic because the result reported there hinges on the existence of a small but persistent signal that is found in repeated observations of the same source. Doubly ironic in fact, in that just as we were backing and forthing about cultural differences I seemed to have gone and done something completely contrary to my heritage! Continue reading ‘Betraying your heritage’ »
[arXiv:0709.2358] Cleaning the USNO-B Catalog through automatic detection of optical artifacts, by Barron et al.
Statistically speaking, “false sources” are generally in the domain of Type I errors, defined by the probability of detecting a signal where there is none. But what if there is a clear signal, but it is not real? Continue reading ‘Spurious Sources’ »
arXiv:0709.1067v1 : Wrong Priors (Carlos C. Rodriguez)
This came through today on astro-ph, suggesting that we could be choosing priors better than we do, and in fact that we generally do a very bad job of it. I have been brought up to believe that, like points in Whose Line Is It Anyway, priors don’t matter (unless you have very little data), so I am somewhat confused. What is going on here?
I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. Thus, a major savings from brute force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bar on the parameter for free. What could be better? Continue reading ‘An alternative to MCMC?’ »
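To make the “directed staggering” concrete, here is a minimal Metropolis sketch, the simplest flavor of MCMC. It assumes a one-parameter toy problem of my own choosing: estimating the mean of Gaussian data with known sigma and a flat prior. The function names, step size, and chain length are illustrative, not a recommendation.

```python
import math
import random


def log_posterior(mu, data, sigma):
    """Log posterior (up to a constant) for a Gaussian mean with a flat prior."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_post, start, step, n_steps, seed=0):
    """Random-walk Metropolis: propose a Gaussian step, accept or stay put."""
    rng = random.Random(seed)
    current = start
    current_lp = log_post(current)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); 1 - random() lies
        # in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)  # a rejected move repeats the current value
    return chain
```

After discarding an initial burn-in stretch, the mean of the chain estimates the parameter and its standard deviation is the “free” error bar mentioned above.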
I don’t know why astro-ph thought this article on the statistics of football dynamics (Mendes, Malacarne, Anteneodo 2007; physics/0706.1758) was relevant to me and emailed the abstract, but I’m glad they did, because they deal with a question I have wrestled with for a long time: how to figure out the underlying distribution that controls a stochastic process. In 2002ApJ…580.1118K, we dealt with modeling the photon arrival time differences as due to flares occurring at random times but with a power-law intensity distribution with index alpha. physics/0706.1758 deals with time-between-touches and tries to characterize that distribution itself in terms of a number of “phases” beta. From a quick reading, it appears that their beta are our flares, and they restrict all flares to have the same intensity. Despite the restriction, this is interesting because it is an analytical estimation that points a way towards speeding up our flare distribution fitting process, which currently is based on a Monte-Carlo grid search method, not the fastest way to do things.
Clauset, Shalizi, & Newman (2007, arXiv/0706.1062) have a very detailed description of what power-law distributions are, how to recognize them, how to fit them, etc. They are also making available the Matlab and R codes that they use to do the fitting and such.
Looks like a very handy reference text, though I am a bit uncertain about their use of the K-S test to check whether a dataset can be described with a power-law or not. It is probably fine; perhaps some statisticians would care to comment?
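For the curious, here is a rough sketch of the two ingredients involved, written from scratch rather than taken from their codes. It assumes a continuous power law p(x) ∝ x^(-alpha) above a known lower cutoff `x_min` (note the sign convention: alpha is positive here), uses the standard maximum-likelihood estimate alpha = 1 + n / Σ ln(x_i / x_min), and then measures the K-S distance between the data and the fitted model. The function names and the synthetic-data helper are my own.

```python
import math
import random


def synthetic_power_law(n, alpha, x_min, seed=0):
    """Draw n values from p(x) ∝ x**(-alpha), x >= x_min (alpha > 1)."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


def fit_power_law(data, x_min):
    """Maximum-likelihood index for the tail x >= x_min; returns (alpha, n)."""
    tail = [x for x in data if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + n / log_sum, n


def ks_distance(data, x_min, alpha):
    """Largest gap between the empirical and fitted model CDFs of the tail."""
    tail = sorted(x for x in data if x >= x_min)
    n = len(tail)
    d = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / x_min) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(model - i / n), abs(model - (i + 1) / n))
    return d
```

One caveat that bears on my unease above: the tabulated K-S p-values assume the model was not fitted to the same data, so the distance returned here should not simply be looked up in a standard table. Calibrating it against synthetic datasets drawn from the fitted model is one way around that.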