Variable Selection and Updating In Model-Based Discriminant Analysis for High Dimensional Data with Food Authenticity Applications
by Murphy, Dean, and Raftery
Classifying or clustering (or semi supervised learning) spectra is a very challenging problem from collecting statistical-analysis-ready data to reducing the dimensionality without sacrificing complex information in each spectrum. Not only how to estimate spiky (not differentiable) curves via statistically well defined procedures of estimating equations but also how to transform data that match the regularity conditions in statistics is challenging.
Continue reading ‘[ArXiv] classifying spectra’ »
Speaking of XAtlas from my previous post I tried another visualization tool called Parallel Coordinates on these Capella observations and two stars with multiple observations (AL Lac and IM Peg). As discussed in [MADS] Chernoff face, full description of the catalog is found from XAtlas website. The reason for choosing these stars is that among low mass stars, next to Capella (I showed 16), IM PEG (HD 21648, 8 times), and AR Lac (although different phases, 6 times) are most frequently observed. I was curious about which variation, within (statistical variation) and between (Capella, IM Peg, AL Lac), is dominant. How would they look like from the parametric space of High Resolution Grating Spectroscopy from Chandra? Continue reading ‘[MADS] Parallel Coordinates’ »
Another deduced conclusion from reading preprints listed in arxiv/astro-ph is that astronomers tend to confuse classification and clustering and to mix up methodologies. They tend to think any algorithms from classification or clustering analysis serve their purpose since both analysis algorithms, no matter what, look like a black box. I mean a black box as in neural network, which is one of classification algorithms. Continue reading ‘Classification and Clustering’ »
I was questioned by two attendees, acquainted before the AAS, if I can suggest them clustering methods relevant to their projects. After all, we spent quite a time to clarify the term clustering. Continue reading ‘my first AAS. IV. clustering’ »
There’s no particular opening remark this week. Only I have profound curiosity about jackknife tests in [astro-ph:0805.1994]. Including this paper, a few deserve separate discussions from a statistical point of view that shall be posted. Continue reading ‘[ArXiv] 2nd week, May 2008’ »
I think I have to review spatial statistics in astronomy, focusing on tessellation (void structure), point process (expanding 2 (3) point correlation function), and marked point process (spatial distribution of hardness ratios of X-ray distant sources, different types of galaxies -not only morphological differences but other marks such as absolute magnitudes and existence of particular features). When? Someday…
In addition to Bayesian methodologies, like this week’s astro-ph, studies on characterizing empirical spatial distributions of voids and galaxies frequently appear, which I believe can be enriched further with the ideas from stochastic geometry and spatial statistics. Click for what was appeared in arXiv this week. Continue reading ‘[ArXiv] 1st week, May 2008’ »
Since I learned Hubble’s tuning fork for the first time, I wanted to do classification (semi-supervised learning seems more suitable) galaxies based on their features (colors and spectra), instead of labor intensive human eye classification. Ironically, at that time I didn’t know there is a field of computer science called machine learning nor statistics which do such studies. Upon switching to statistics with a hope of understanding statistical packages implemented in IRAF and IDL, and learning better the contents of Numerical Recipes and Bevington’s book, the ignorance was not the enemy, but the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »
Markov chain Monte Carlo became the most frequent and stable statistical application in astronomy. It will be useful collecting tutorials from both professions. Continue reading ‘[ArXiv] 2nd week, Apr. 2008’ »
Warning! The list is long this week but diverse. Some are of CHASC’s obvious interest. Continue reading ‘[ArXiv] 2nd week, Mar. 2008’ »
It seems like I omit papers deserving attentions from time to time. If you find one, please leave a message. Even better if a summary can be left for a separate posting. Continue reading ‘[ArXiv] 3rd week, Feb. 2008’ »
It is notable that there’s an astronomy paper contains AIC, BIC, and Bayesian evidence in the title. The topic of the paper, unexceptionally, is cosmology like other astronomy papers discussed these (statistical) information criteria (I only found a couple of papers on model selection applied to astronomical data analysis without articulating CMB stuffs. Note that I exclude Bayes factor for the model selection purpose).
To find the paper or other interesting ones, click Continue reading ‘[ArXiv] 2nd week, Jan. 2007’ »
The paper about the Banff challenge [0712.2708] and the statistics tutorial for cosmologists [0712.3028] are the personal recommendations from this week’s [arXiv] list. Especially, I’d like to quote from Licia Verde’s [astro-ph:0712.3028],
In general, Cosmologists are Bayesians and High Energy Physicists are Frequentists.
I thought it was opposite. By the way, if you crave for more papers, click Continue reading ‘[ArXiv] 3rd week, Dec. 2007’ »
This week, instead of only filtering AstroStatistics related papers from arxiv, I chose additional arxiv/astro-ph papers related to CHASC folks’ astrophysical projects. Some of papers you see from this week do not have sophisticated statistical analysis but contain data from specific satellites and possibly relevant information related to CHASC projects. Due to the CHACS’ long history (we are celebrating the 10th birthday this year) and my being a newbie to CHASC, I may not pick up all papers related to the projects of current, former, and future CHASC members and dedicated slog readers. For creating a satisfying posting every week, your inputs are welcome to improve my adaptive filter. For the list of this week, click the following.
Continue reading ‘[ArXiv] 1st week, Oct. 2007’ »
Not knowing much about java and java applets in a software development and its web/internet publicizing, I cannot comment what is more efficient. Nevertheless, I thought that PHP would do the similar job in a simpler fashion and the followings may provide some ideas and solutions for publicizing statistical methods through websites based on Bayesian Inference.
Continue reading ‘Implement Bayesian inference using PHP’ »