Archive for the ‘Fitting’ Category.

Everybody needs crampons

Sherpa is a fitting environment in which Chandra data (and really, X-ray data from any observatory) can be analyzed. It has just undergone a major update and now runs on Python. Or allows Python to run. Something like that. It is a very powerful tool, but I can never remember how to use it, and I have an amazing knack for not finding what I need in the documentation. So here is a little cheat sheet (which I will keep updating as and when I learn more): Continue reading ‘Everybody needs crampons’ »

From Terence’s stuff: You want proof?

Please check p.11 of the IMS Bulletin, v.38 (10), in this pdf file, for the whole article. Continue reading ‘From Terence’s stuff: You want proof?’ »

From Quantile Probability and Statistical Data Modeling

by Emanuel Parzen in Statistical Science 2004, Vol 19(4), pp.652-662 JSTOR

I teach that statistics (done the quantile way) can be simultaneously frequentist and Bayesian, confidence intervals and credible intervals, parametric and nonparametric, continuous and discrete data. My first step in data modeling is identification of parametric models; if they do not fit, we provide nonparametric models for fitting and simulating the data. The practice of statistics, and the modeling (mining) of data, can be elegant and provide intellectual and sensual pleasure. Fitting distributions to data is an important industry in which statisticians are not yet vendors. We believe that unifications of statistical methods can enable us to advertise, “What is your question? Statisticians have answers!”

I couldn’t help liking this paragraph because of its bitter-sweetness. I hope you appreciate it as much as I did.

The chance that A has nukes is p%

I watched a movie in which one of the characters said, “country A has nukes with 80% chance” (perhaps not 80%, but it was a high percentage). One of the statements in that episode was that people will stop eating lettuce if even a 1% chance of E. coli, or lower, is reported. Therefore, with such a high chance of A having nukes, it is right to send troops to A. This episode immediately made me think of astronomers’ null hypothesis probability and the ways they draw conclusions from chi-square goodness of fit tests, likelihood ratio tests, or F-tests.

First of all, I’d like to ask how you would estimate the chance that a country has nukes. What does this 80% imply here? But before getting to that question, I’d like to discuss computing the chance of E. coli infection first. Continue reading ‘The chance that A has nukes is p%’ »

Scatter plots and ANCOVA

Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists[1]. One often finds pages of scatter plots with regression lines, in which the slope of the regression line and the error bars serve as indicators of the degree of correlation. Sometimes, too many such scatter plots make me think that, overall, the resources for drawing nice scatter plots, and the pages on which those plots are printed, are wasted. Why not just compute correlation coefficients and their errors, and publish the processed data used for computing the correlations, not the full data, so that others can verify the results for the sake of validation? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »

  1. This is not an absolute statement but a personal impression after reading articles from various fields in addition to astronomy. My reading in other fields tells me that many rely on correlation statistics, but fewer rely on scatter plots with straight lines drawn through the data sets for the purpose of imposing relationships on variable pairs.
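The suggestion above, reporting a correlation coefficient together with its error instead of one more scatter plot, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pearson_with_interval` and the sample data are hypothetical, and the interval uses the Fisher z-transform, which assumes roughly bivariate normal data and ignores measurement errors on the individual points.

```python
import math

def pearson_with_interval(x, y, z_crit=1.96):
    """Pearson r with an approximate interval from the Fisher z-transform.

    Hypothetical helper for illustration. z_crit=1.96 gives a roughly 95%
    interval. Assumes bivariate normal data, non-constant samples, and |r| < 1.
    """
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r)              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error of z
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return r, lo, hi

# Made-up data, only to show the call.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 5.7, 7.4, 7.9]
r, lo, hi = pearson_with_interval(x, y)
print(f"r = {r:.3f}, approximate 95% interval = [{lo:.3f}, {hi:.3f}]")
```

A single line of output like this, plus the processed data, lets a reader check the claimed correlation without a page of plots.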

[MADS] logistic regression

Although a bit of time has elapsed since my post on space weather, which said that logistic regression is used for prediction, it still seems true that logistic regression is rarely used in astronomy. Alternatively, it may have been used for a similar purpose, not under the same statistical jargon but under Bayesian modeling procedures. Continue reading ‘[MADS] logistic regression’ »

[Books] Bayesian Computations

A number of practical Bayesian data analysis books are available these days. Here, I’d like to introduce two that were published relatively recently. I like the fact that they are technical rather than theoretical. They have practical examples closely related to astronomical data. They include R code, so that one can try the algorithms on the fly instead of cramming probability theory. Continue reading ‘[Books] Bayesian Computations’ »

Wavelet-regularized image deconvolution

A Fast Thresholded Landweber Algorithm for Wavelet-Regularized Multidimensional Deconvolution
Vonesch and Unser (2008)
IEEE Trans. Image Proc. vol. 17(4), pp. 539-549

Quoting the authors, I would also like to say that the recovery of the original image from the observed one is an ill-posed problem. They traced the efforts at wavelet regularization in deconvolution back to a few relatively recent publications by astronomers. Therefore, I guess the topic and algorithm of this paper could draw some attention from astronomers. Continue reading ‘Wavelet-regularized image deconvolution’ »

Curious Cases of the Null Hypothesis Probability

Even though I traced astronomers’ casual usage of the null hypothesis probability to their habit of reporting outputs from the data analysis packages of their choice, there were still some curious cases of the null hypothesis probability that I couldn’t solve. They are quite mysterious to me. Sometimes too much creativity harms the original intention. Here are some examples. Continue reading ‘Curious Cases of the Null Hypothesis Probability’ »

4754 d.f.

I couldn’t believe my eyes when I saw 4754 degrees of freedom (d.f.) and a chi-square test statistic of 4859. I’ve often enough seen large degrees of freedom in astronomy journals, from several hundred to a few thousand, but I never felt comfortable with these big numbers. Then, with a great shock, 4754 d.f. appeared. I must find out why I feel so bothered by these huge degrees of freedom. Continue reading ‘4754 d.f.’ »
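To put the two numbers above in context, here is a minimal sketch of the upper-tail (null hypothesis) probability for a chi-square statistic of 4859 with 4754 d.f. It uses the Wilson-Hilferty cube-root normal approximation, which is my choice for a standard-library-only example and not something taken from the paper in question; the function name is hypothetical. A statistics package with an exact chi-square survival function would be the usual tool.

```python
import math

def chi2_tail_wilson_hilferty(stat, dof):
    """Approximate P(chi-square_dof >= stat) via the Wilson-Hilferty transform.

    Hypothetical helper: (X/dof)^(1/3) is treated as normal with mean
    1 - 2/(9*dof) and variance 2/(9*dof). Reasonable for large dof.
    """
    if stat < 0 or dof <= 0:
        raise ValueError("stat must be non-negative and dof positive")
    var = 2.0 / (9.0 * dof)
    z = ((stat / dof) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

stat, dof = 4859.0, 4754
reduced = stat / dof
p = chi2_tail_wilson_hilferty(stat, dof)
print(f"reduced chi-square = {reduced:.4f}, approximate tail probability = {p:.3f}")
```

Under this approximation the tail probability comes out at roughly 0.14, even though the reduced chi-square is only about 2% above one. With thousands of degrees of freedom, small fractional departures from one translate into noticeably different probabilities.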

Guinness, Gosset, Fisher, and Small Samples

Student’s t-distribution is somewhat underrepresented in the astronomical community. An article with nice stories looks to me like the best way to introduce the t distribution. This article describes historic anecdotes about monumental statistical developments that occurred about 100 years ago.

Guinness, Gosset, Fisher, and Small Samples by Joan Fisher Box
Source: Statist. Sci. Volume 2, Number 1 (1987), 45-52.

No time for reading the whole article? I hope you have a few minutes to read the following quotes, which are quite enchanting to me. Continue reading ‘Guinness, Gosset, Fisher, and Small Samples’ »

Likelihood Ratio Technique

I wonder what Fisher, Neyman, and Pearson would say if they saw “Technique” after “Likelihood Ratio” instead of “Test.” When a presenter said “Likelihood Ratio Technique” for source identification, I couldn’t resist checking it out, so as not to offend the founding fathers of the likelihood principle in statistics, since “Technique” sounded derogatory to my ears when attached to “Likelihood.” I thank, above all, the speaker who kindly gave me the reference about this likelihood ratio technique. Continue reading ‘Likelihood Ratio Technique’ »

It bothers me.

A full description of “bayes” is given under sherpa/ciao[1]. Some sentences kept bothering me, and my account of the reasons is given outside the quotes. Continue reading ‘It bothers me.’ »

  1. Note that the current sherpa is in beta under ciao 4.0, not under ciao 3.4, and a description of “bayes” from the most recent sherpa is not available yet, which means this post will need updating once a new release is available.

“Thanks to Henrietta Leavitt”


The CfA is celebrating the 100th anniversary of the discovery of the Cepheid period-luminosity relation on Nov 6, 2008. See for details.

[Update 10/03] For a nice introduction to the story of Henrietta Swan Leavitt, listen to this Perimeter Institute talk by George Johnson:

[Update 11/06] The full program is now available. The symposium begins at Noon today.

GSL – GNU Scientific Library

In my pyIMSL post I talked about IMSL, which is a commercial scientific library. There is a GNU counterpart to IMSL: GSL. I found GSL courtesy of Jiangang, the author of the poster that I liked most at the 212th AAS (see My first AAS. V. measurement error and EM and his comment). Continue reading ‘GSL – GNU Scientific Library’ »