Author Archive

[SPS] Testing Completeness

There will be a special session at the 213th AAS meeting on meaning from surveys and population studies (SPS). Until then, it might be useful to pull out some interesting and relevant papers and questions/challenges as a preliminary to the meeting. I will not simply list astronomical catalogs and surveys, which are literally countless these days, but I will bring out some, with a description of the catalog, if they have changed the way science is performed (the best example, to my knowledge, would be SDSS, the Sloan Digital Sky Survey). Continue reading ‘[SPS] Testing Completeness’ »

It bothers me.

The full description of “bayes” under sherpa/ciao[1] is given at http://cxc.harvard.edu/ciao3.4/ahelp/bayes.html. Some sentences kept bothering me, and here’s my account of the reasons, given outside of the quotes. Continue reading ‘It bothers me.’ »

  1. Note that the current sherpa is in beta under ciao 4.0, not under ciao 3.4, and a description of “bayes” from the most recent sherpa is not available yet, which means this post needs updating once a new release is available.

after “Thanks to Henrietta Leavitt”

Personally, I had highly anticipated this symposium at CfA because I am fascinated by the contributions that the female computers (or astronomers) made here about a century ago, even though at that time women were not considered scientists but mere assistants for tedious jobs. Continue reading ‘after “Thanks to Henrietta Leavitt”’ »

read.table()

The first step of data analysis or applications is reading the data sets into a tool of choice. In recent years, I’ve been using R (see also Learning R) for that purpose, but I’ve enjoyed the freedom to do the same with these languages and tools: BASIC, fortran77/90/95, C/C++, IDL, IRAF, AIPS, mongo/supermongo, MATLAB, Maple, Mathematica, SAS, SPSS, Gauss, ARC, Minitab, and recently Python and ciao, which I have just begun to learn. I have lost fluency in many of them. Quick learning tends to be flash memory. Some will need brain defragmentation and recovery time before extensive scientific work. A few I don’t like to use at all. No matter what, I’m not a computer geek. I’m not good with new gadgets or new software, nor do I welcome new and allegedly versatile computing systems. But one must be if he/she wants to handle data. Until recently I believed R had such versatility when it comes to reading in data. Yet, there is nothing without exceptions. Continue reading ‘read.table()’ »

missing data

The notions of missing data are overall different between the two communities. I tend to think missing data carry as much information as observed data. Astronomers…I’m not sure how they think, but my impression so far is that when a value is missing in one attribute/variable of an object/observation/informant, all other attributes related to that object become useless because that object is not considered in the scientific data analysis or model evaluation process. For example, it is hard to find any discussion of imputation in astronomical publications, or any statistical justification of the treatment of missing data with respect to inference strategies. Instead, they talk about incompleteness within different variables. To make this vague argument concrete, consider a catalog of multiple magnitudes. To draw a color magnitude diagram, one needs both color and magnitude. If one attribute is missing, that star will not appear in the color magnitude diagram, and any inference method based on that diagram will not include that star. Nonetheless, one will try to understand how different proportions of stars are observed according to different colors and magnitudes. Continue reading ‘missing data’ »
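
The color magnitude diagram example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the B and V magnitudes are made-up values, NaN is assumed to mark a non-detection in a band, and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical catalog (made-up values): B and V magnitudes per star.
# NaN is assumed to mean "not detected in this band".
b_mag = np.array([15.2, 16.8, np.nan, 18.1, 19.5, np.nan])
v_mag = np.array([14.6, 16.1, 17.0, np.nan, 18.7, 19.9])

# The color needs both bands, so a NaN in either band propagates.
color = b_mag - v_mag

# Complete-case selection: only stars with a defined color can be
# placed on the color magnitude diagram.
in_cmd = ~np.isnan(color)
n_plotted = int(in_cmd.sum())
n_dropped = int((~in_cmd).sum())

# Per-band incompleteness, the quantity usually discussed instead
# of imputation.
missing_fraction = {
    "B": float(np.isnan(b_mag).mean()),
    "V": float(np.isnan(v_mag).mean()),
}

print(f"plotted: {n_plotted}, dropped: {n_dropped}")
print(missing_fraction)
```

Every dropped star still has one observed magnitude, and that is the information a complete-case analysis discards.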

GSL - GNU Scientific Library

I’ve talked about IMSL, which is a commercial scientific library, in my pyIMSL post. There is a free GNU counterpart to IMSL: GSL. I found GSL courtesy of Jiangang, who was the author of the poster that I liked most at the 212th AAS (see My first AAS. V. measurement error and EM and his comment). Continue reading ‘GSL - GNU Scientific Library’ »

Off the line

I do not like to be serious. Papers…papers…papers. Stepping away from papers as a way of bridging the two fields, allow me to talk about something relevant to the cultural difference between astronomers and statisticians. I hope this could generate a series of comments. :) Continue reading ‘Off the line’ »

[tutorial] multispectral imaging, a case study

Even without signal processing courses, the following equation should be awfully familiar to astronomers who do photometry and handle data:
c_k=\int_\Lambda l(\lambda) r(\lambda) f_k(\lambda) \alpha(\lambda) d\lambda +n_k
The terms are, in order: camera response (c_k), light source (l), spectral radiance by l (r), filter (f_k), sensitivity (α), and noise (n_k), where Λ indicates the range of the spectrum in which the camera is sensitive.
Or simplified to c_k=\int_\Lambda \phi_k (\lambda) r(\lambda) d\lambda +n_k
where φ_k denotes the combination of the illuminant and the spectral sensitivity of the k-th channel, which goes by the name of augmented spectral sensitivity. Well, we can skip spectral radiance r, though. Unfortunately, in astronomical photometry the sensitivity α has multiple layers and is not a simple closed-form function of λ.
Or c_k=\Theta r +n
Inverting Θ and finding a reconstruction operator such that r=inv(Θ)c_k leads to spectral reconstruction, although Θ is, in general, not a square matrix. Otherwise, one approaches the problem through indirect reconstruction. Continue reading ‘[tutorial] multispectral imaging, a case study’ »
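
The matrix form above can be sketched numerically. Because Θ is generally not square, this sketch swaps the Moore-Penrose pseudo-inverse (numpy.linalg.pinv) in for inv(Θ). It is one possible reconstruction operator, not necessarily the one used in the tutorial. The wavelength grid, the Gaussian channel sensitivities, the test spectrum, and the noise level are all assumptions made up for illustration.

```python
import numpy as np

def gaussian_sensitivities(wavelengths, centers, width):
    """Hypothetical augmented sensitivities phi_k(lambda), one row per channel."""
    return np.exp(-0.5 * ((wavelengths[None, :] - centers[:, None]) / width) ** 2)

rng = np.random.default_rng(0)

# Assumed discretization of the spectral range Lambda (nm).
wavelengths = np.linspace(400.0, 700.0, 31)
dlam = wavelengths[1] - wavelengths[0]

# Six assumed channels; Theta has shape (6, 31), so it is not square.
centers = np.linspace(420.0, 680.0, 6)
theta = gaussian_sensitivities(wavelengths, centers, width=30.0) * dlam

# Made-up smooth spectral radiance r and small additive noise n.
r_true = 1.0 + 0.5 * np.sin(wavelengths / 50.0)
noise = rng.normal(scale=1e-3, size=theta.shape[0])

# Discretized forward model: c = Theta r + n (all channels stacked).
c = theta @ r_true + noise

# Reconstruction operator: pseudo-inverse in place of inv(Theta).
reconstruct = np.linalg.pinv(theta)   # shape (31, 6)
r_hat = reconstruct @ c

# The estimate reproduces the measured camera responses.
max_residual = float(np.max(np.abs(theta @ r_hat - c)))
print(reconstruct.shape, max_residual)
```

With fewer channels than wavelength samples, many spectra produce the same camera responses. The pseudo-inverse returns the minimum-norm one, so r_hat matches the measurements but need not equal r_true.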

When you register

I bet there are various scams. One of them is automatic user registration. This blog requires registration for posting comments without approval, as long as one does not include many web links. Recently, there have been frequent anonymous user registrations. What I mean by anonymous is that I don’t see their names or any part of their identities (for example, someone using the initials of their name in an email account, or using an email account from their affiliation). This slog is open to anyone who is interested in AstroStatistics, although not many are currently active. Upon your request, this can be changed very simply and you can immediately start writing your ideas about AstroStatistics. However, I must restrict those scams from now on. Please provide me with a little information about yourself if you do not want to be removed after your registration. As I mentioned, the information does not need to include your full name or an email account at an academic institution. When you register, use the email account that you use on a daily basis, not one that looks like the result of phishing.

[Book] The Grammar of Graphics

All of a sudden, partially owing to a thought-provoking talk about visualization by Felice Frankel at IIC, I recollected a book, The Grammar of Graphics by Leland Wilkinson (2nd ed.; I partially read the 1st ed. several years ago and found it of little use because there seemed to be no link to the visualization of data from astronomy). Continue reading ‘[Book] The Grammar of Graphics’ »

A Quote on Model

In order to understand a learning procedure statistically it is necessary to identify two important aspects: its structural model and its error model. The former is most important since it determines the function space of the approximator, thereby characterizing the class of functions or hypothesis that can be accurately approximated with it. The error model specifies the distribution of random departures of sampled data from the structural model.

Continue reading ‘A Quote on Model’ »

survey and design of experiments

People with experience would speak very differently, and more wisely, about what I’m going to discuss now. This post only combines two small cross sections, one from a branch of each of two trees, astronomy and statistics. Continue reading ‘survey and design of experiments’ »

Make3D

The conventional belief is that reconstructing a 3D scene requires at least two images. Yet we know that our eyes reconstruct 3D scenes from single snapshots, with just one picture. Based on our perception and learning ability, or our internal pattern recognition ability, a few groups of people have been trying to reconstruct a 3D image from one still picture. Luckily, you can test such progress in reconstructing a 3D scene from a single still image at Make3D (a click brings you to Make3D at Stanford). Continue reading ‘Make3D’ »

Quintessential Contributions

In my personal opinion, the history of astronomy is more interesting than the history of statistics. This may change tomorrow. The Harvard statistics department (chair Xiao-Li Meng) is organizing a symposium titled

Quintessential Contributions:
Celebrating Major Birthdays of Statistical Ideas and Their Inventors

When: Saturday, September 27, 2008, 9:45 AM - 5:00 PM
Where: Radcliffe Gymnasium, 18 Mason Street, Cambridge, MA

Continue reading ‘Quintessential Contributions’ »

Classification and Clustering

Another conclusion I have drawn from reading preprints listed in arxiv/astro-ph is that astronomers tend to confuse classification and clustering and to mix up their methodologies. They tend to think any algorithm from classification or clustering analysis serves their purpose, since both kinds of algorithms, no matter what, look like a black box. I mean a black box as in a neural network, which is one of the classification algorithms. Continue reading ‘Classification and Clustering’ »