Archive for the ‘arXiv’ Category.
May 20th, 2008| 12:10 am | Posted by vlk
Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.
And it is a statistical hitch: two otherwise highly reliable and oft-used methods give contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the error of our ways. Continue reading ‘Did they, or didn’t they?’ »
Tags:
arXiv,
Chandra,
CXC,
Optical,
Peter Edmonds,
positional coincidence,
positional error,
Power,
progenitor,
question for statisticians,
significance,
Supernova,
Type Ia,
White Dwarf,
White Dwarf binary,
X-ray
Category:
Astro,
Data Processing,
News,
Objects,
Optical,
Stat,
Uncertainty,
arXiv |
5 Comments
May 19th, 2008| 10:42 am | Posted by hlee
There is no particular opening remark this week, only that I am deeply curious about the jackknife tests in [astro-ph:0805.1994]. This paper and a few others deserve separate discussions from a statistical point of view, which shall be posted. Continue reading ‘[ArXiv] 2nd week, May 2008’ »
Tags:
bimodality,
bootstrap,
calibration uncertainty,
CF,
Classification,
CMB,
dip,
exoplanet,
Fisher matrix,
flare,
GL,
jackknife,
KS test,
marked point,
maximum likelihood,
MLE,
poisson point process,
spatial data,
XLF
Category:
Frequentist,
Uncertainty,
X-ray,
arXiv |
Comment
May 11th, 2008| 10:42 pm | Posted by hlee
I think I have to review spatial statistics in astronomy, focusing on tessellation (void structure), point processes (expanding on the 2 (3) point correlation function), and marked point processes (the spatial distribution of hardness ratios of distant X-ray sources, or of different types of galaxies, considering not only morphological differences but also other marks such as absolute magnitudes and the existence of particular features). When? Someday…
In addition to Bayesian methodologies, studies characterizing the empirical spatial distributions of voids and galaxies appear frequently, as in this week’s astro-ph, and I believe they can be enriched further with ideas from stochastic geometry and spatial statistics. Click for what appeared on arXiv this week. Continue reading ‘[ArXiv] 1st week, May 2008’ »
Tags:
Classification,
covariance,
FARIMA,
Fisher information,
GL,
GRB,
Levy,
light curve,
limb darkening,
ML,
Pareto distribution,
quasars,
solar flare,
standard candle,
tessellation,
time series,
VO,
void
Category:
MCMC,
Uncertainty,
arXiv |
1 Comment
May 5th, 2008| 03:08 am | Posted by hlee
Ever since I first learned about Hubble’s tuning fork, I have wanted to classify galaxies (semi-supervised learning seems most suitable) based on their features (colors and spectra), instead of relying on labor-intensive classification by human eye. Ironically, at that time I didn’t know that there were fields, machine learning in computer science and statistics, that do such studies. Upon switching to statistics with the hope of understanding the statistical packages implemented in IRAF and IDL, and of better learning the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »
Tags:
ANN,
automation,
Classification,
correlation function,
denoising,
FFT,
gravitational wave,
lensing,
LISA,
machine learning,
missing data,
mock data,
morphology,
PCA,
power spectrum,
robust,
SDSS,
spectrum,
sunspots,
wavelet,
zoo
Category:
Galaxies,
Imaging,
MCMC,
Physics,
Spectral,
arXiv |
Comment
Apr 27th, 2008| 11:29 am | Posted by hlee
The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from this week. Continue reading ‘[ArXiv] 4th week, Apr. 2008’ »
Tags:
clusters,
CMB,
GALEX,
gravitational waves,
lensing,
LF,
LMC,
machine learning,
maximum likelihood,
priors,
probability,
SDSS,
stellar populations,
sunspot,
time series
Category:
MCMC,
arXiv |
Comment
Apr 25th, 2008| 01:48 am | Posted by hlee
One of the speakers in the Google talk series illustrated model-based clustering and mentioned the likelihood ratio test (LRT) for determining the number of clusters. Having seen examples of improperly applied LRTs in astronomical journals, such as testing two clusters vs. three or a higher number of components, I could not resist pointing out that the LRT appeared to be improperly used in his illustration. He replied that the work cited for the LRT was different from the one behind his plot, and that the test was carried out for one component vs. two, which closely observes the regularity conditions. I was relieved not to find another example of the ill-used LRT. Continue reading ‘The LRT is worthless for …’ »
Apr 24th, 2008| 02:56 pm | Posted by vlk
There is a new report from Bernabei et al. (arXiv:0804.2741) of the direct detection of the effects of Dark Matter that is causing a lot of buzz. (The Bad Astronomer has a good summary.) They find yearly modulation in their detected scintillation rate that matches what you would expect if the Earth were rushing through Galactic Dark Matter as it goes around the Sun. They have worked out the significance of the modulation to be 8.2 sigma. Significant! But significant of what? Continue reading ‘Is 8-sigma significant enough for you?’ »
Apr 21st, 2008| 11:56 pm | Posted by hlee
Because of the extensive work by Prof. Peebles and many (observational) cosmologists (I almost always find Prof. Peebles’ book in the cosmology literature), the 2 (or 3) point correlation function dominates all other mathematical and statistical methods for understanding the structure of the universe. Unusually, this week brings an astro-ph paper written by a statistics professor that uses the K-function to explore the mystery of the universe.
[astro-ph:0804.3044] J.M. Loh
Estimating Third-Order Moments for an Absorber Catalog
Continue reading ‘[ArXiv] Ripley’s K-function’ »
Apr 20th, 2008| 09:05 pm | Posted by hlee
The dichotomy of outliers: detecting outliers to be discarded or to be investigated; statistics robust enough not to be influenced by outliers or sensitive enough to flag an anomaly in the data distribution. Although it is not related, one paper about outliers made me dwell on what outliers are. This week’s topics are diverse. Continue reading ‘[ArXiv] 3rd week, Apr. 2008’ »
Tags:
background,
bootstrap,
calibration errors,
Cash statistics,
clusters,
CMB,
corona,
edge detection,
FFT,
gravitational lens,
maximum likelihood,
multiscale,
neural network,
outlier,
SDSS,
sunspot,
systematic errors,
topology,
WMAP,
XMM-Newton
Category:
High-Energy,
MCMC,
arXiv |
Comment
Apr 17th, 2008| 08:39 pm | Posted by hlee
A statistical method developed by insightful and brilliant astronomers is used in bioinformatics:
Detecting periodic patterns in unevenly spaced gene expression time series using Lomb–Scargle periodograms
by Glynn, Chen, & Mushegian [Click for R code and relevant information] [Paper archive at Bioinformatics]
The conclusion clearly states the strengths of the Lomb–Scargle periodogram:
The Lomb-Scargle periodogram algorithm is an effective tool for finding periodic gene expression profiles in microarray data, especially when data may be collected at arbitrary time points or when a significant proportion of data is missing.
My personal wish is that data-driven statistical methods developed by hands-on scientists (and their statistical collaborators) will be used in other disciplines, because I believe data sets are likely to share the unknown truth of our one universe.
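For readers who have not met the method, here is a minimal sketch of the classical Lomb–Scargle periodogram in plain Python. It is not the R code of Glynn, Chen, & Mushegian; the function name `lomb_scargle`, the simulated series, and the frequency grid are all assumptions made for illustration. The point it shows is the one quoted above: the periodogram works directly on unevenly spaced time points.

```python
import math
import random

def lomb_scargle(times, values, ang_freqs):
    """Classical (unnormalized) Lomb-Scargle power at each angular frequency."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]  # the periodogram is computed on mean-subtracted data
    powers = []
    for w in ang_freqs:
        # The offset tau makes the sine and cosine terms orthogonal
        # on the actual (uneven) sampling times.
        s2 = sum(math.sin(2 * w * t) for t in times)
        c2 = sum(math.cos(2 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(yi * ci for yi, ci in zip(y, c))
        ys = sum(yi * si for yi, si in zip(y, s))
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        powers.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return powers

# Hypothetical example: a noisy sinusoid with period 5, observed at 40 random times.
rng = random.Random(0)
true_w = 2 * math.pi / 5.0
times = sorted(rng.uniform(0, 50) for _ in range(40))
values = [math.sin(true_w * t) + rng.gauss(0, 0.3) for t in times]

grid = [0.1 + 0.01 * k for k in range(291)]  # angular frequencies from 0.1 to 3.0
power = lomb_scargle(times, values, grid)
best_w = grid[max(range(len(grid)), key=power.__getitem__)]
print("peak angular frequency:", best_w, "true value:", true_w)
```

No interpolation onto a regular grid is needed, which is why missing or arbitrarily timed measurements pose no special difficulty. For real work, an established implementation would be preferable to this sketch.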
Apr 11th, 2008| 02:21 am | Posted by hlee
Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions. Continue reading ‘[ArXiv] 2nd week, Apr. 2008’ »
Tags:
Classification,
GRB,
Hubble constant,
K-S test,
kurtosis,
mask,
maximum likelihood,
SDSS,
skewness,
Solar Oscillation,
Vicent Martinez
Category:
Bayesian,
MCMC,
Methods,
Stat,
arXiv |
3 Comments
Apr 8th, 2008| 07:49 pm | Posted by hlee
The breakdown point of the mean is asymptotically zero whereas the breakdown point of the median is 1/2. The breakdown point is a measure of the robustness of an estimator, and its value is at most 1/2. In the presence of outliers, the mean cannot be a good measure of the central location of the data distribution whereas the median is likely to locate the center. Common plug-in estimators like the mean and the root mean square error may not provide the best fits and uncertainties because of this zero breakdown point of the mean. The efficiency of the mean estimator does not guarantee its unbiasedness; therefore, a bit of care is needed before plugging the data into these estimators to get the best fit and uncertainty. There was a preprint on [arXiv] last week about the use of the median. Continue reading ‘[ArXiv] use of the median’ »
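The contrast in breakdown points can be seen in a small sketch. The data values and the helper `contaminate` below are made up for illustration: a growing number of observations is replaced by a gross outlier, and the mean and median are recomputed each time.

```python
import statistics

# Nine made-up measurements scattered around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1]

def contaminate(data, k, bad=1e6):
    """Return a copy of data with its first k values replaced by a gross outlier."""
    return [bad] * k + list(data[k:])

for k in range(6):
    d = contaminate(clean, k)
    print(k, "outliers: mean =", statistics.mean(d), " median =", statistics.median(d))
```

A single outlier is enough to drag the mean arbitrarily far away, while the median stays within the range of the clean data until more than half of the nine points (five or more) are contaminated.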
Apr 6th, 2008| 11:10 am | Posted by hlee
I’m very curious how astronomers began to use Monte Carlo Markov Chain instead of Markov chain Monte Carlo. The more popular the method becomes, the more frequently Monte Carlo Markov Chain appears. Anyway, this week I added non-astrostatistical papers to the list: a tutorial, big bang, and biblical theology. Continue reading ‘[ArXiv] 1st week, Apr. 2008’ »
Tags:
Bible,
big bang,
FFT,
IMF,
microlensing,
misnomer,
model,
NGC 602,
power law,
Stellar association,
wavelet
Category:
Jargon,
MCMC,
Misc,
arXiv |
Comment
Apr 3rd, 2008| 04:55 pm | Posted by hlee
Astronomy is ruled by the Gaussian distribution, with a Poisson distribution duchy. From time to time, ranks are awarded to other distributions, without territories of their own to govern independently. Among these distributions, the Pareto deserves a high rank. There is a preprint this week on the Pareto distribution: Continue reading ‘[ArXiv] Pareto Distribution’ »
Tags:
asteroid,
citation,
IMF,
nebula,
Pareto distribution,
survival function,
truncated
Category:
Cross-Cultural,
Fitting,
Stars,
Stat,
arXiv |
4 Comments
Mar 30th, 2008| 11:16 pm | Posted by hlee
I began to study statistics with the notion that statistics is the study of information (retrieval), and that a part of information is uncertainty, which is taken for granted in our random world. Probably it is the other way around: information is a part of uncertainty. Could this be the difference between Bayesian and frequentist?
The statistician’s task is to articulate the scientist’s uncertainties in the language of probability, and then to compute with the numbers found: cited from Continue reading ‘Statistics is the study of uncertainty’ »