Archive for the ‘Algorithms’ Category.
Apr 18th, 2008| 01:38 pm | Posted by hlee
Prof. Speed writes columns for IMS Bulletin and the April 2008 issue has Terence’s Stuff: PCA (p.9). Here are quotes with minor paraphrasing:
Although a quintessentially statistical notion, my impression is that PCA has always been more popular with non-statisticians. Of course we love to prove its optimality properties in our courses, and at one time the distribution theory of sample covariance matrices was heavily studied.
…but who could not feel suspicious when observing the explosive growth in the use of PCA in the biological and physical sciences and engineering, not to mention economics?…it became the analysis tool of choice of the hordes of former physicists, chemists and mathematicians who unwittingly found themselves having to be statisticians in the computer age.
My initial theory for its popularity was simply that they were in love with the prefix eigen-, and felt that anything involving it acquired the cachet of quantum mechanics, where, you will recall, everything important has that prefix.
He gave the following eigen-’s: eigengenes, eigenarrays, eigenexpression, eigenproteins, eigenprofiles, eigenpathways, eigenSNPs, eigenimages, eigenfaces, eigenpatterns, eigenresult, and even eigenGoogle.
How many miracles must one witness before becoming a convert?…Well, I’ve seen my three miracles of exploratory data analysis, examples where I found I had a problem, and could do something about it using PCA, so now I’m a believer.
No need to mention that astronomers explore data with PCA and use eigenvalues and eigenvectors to transform raw data into more interpretable forms.
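As a rough illustration of that transformation, the sketch below finds the leading principal component of a small two-column data set by power iteration on the sample covariance matrix. The data values, the function names, and the choice of power iteration (instead of a full eigendecomposition from a linear algebra library) are assumptions made for this example; none of them come from Prof. Speed's column.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix.

    Uses power iteration; the start vector must not be orthogonal to the
    leading eigenvector, which holds for the example data below.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))
    return eigenvalue, v

# Hypothetical, strongly correlated two-band measurements (made-up numbers).
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]

cov = covariance_matrix(data)
eigenvalue, eigenvector = leading_eigenpair(cov)
# The trace of the covariance matrix is the total variance.
explained = eigenvalue / (cov[0][0] + cov[1][1])
print(f"first principal axis: {eigenvector}, variance explained: {explained:.2f}")
```

Projecting each centered row onto `eigenvector` would give the first principal component scores, which is the "more interpretable" coordinate in this toy case.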
Mar 12th, 2008| 03:32 pm | Posted by hlee
Astrometry.net, a cool website I heard about in Harvard Astronomy Professor Doug Finkbeiner’s class (Principles of Astronomical Measurements), does the complex job of matching your images of unknown locations or coordinates to sources in catalogs. You provide images in various formats, and the service returns astrometric calibration metadata and lists of known objects falling inside the field of view. Continue reading ‘Astrometry.net’ »
Mar 5th, 2008| 04:46 pm | Posted by hlee
This is a rather long paper that I separated from [ArXiv] 4th week, Feb. 2008:
[astro-ph:0802.3916] P. Carvalho, G. Rocha, & M.P. Hobson
A fast Bayesian approach to discrete object detection in astronomical datasets - PowellSnakes I
As the title suggests, it describes Bayesian source detection and gives me a chance to learn the foundations of source detection in astronomy. Continue reading ‘[ArXiv] A fast Bayesian object detection’ »
Tags:
Bayesian evidence,
coloured background,
CRLB,
decision theory,
filter,
Fisher information,
likelihood,
PowellSnake,
prior,
simulated annealing,
SNR,
source detection,
state space,
Sunyaev-Zel'dovich effect,
symmetric loss,
templates
Category:
Algorithms,
Bayesian,
Cross-Cultural,
Data Processing,
Fitting,
Frequentist,
MCMC,
Methods,
Objects,
arXiv |
Comment
Feb 28th, 2008| 10:46 pm | Posted by vlk
Grand statistical challenges seem to be all the rage nowadays. Following on the heels of the Banff Challenge (which dealt with figuring out how to set the bounds for the signal intensity that would result from the Higgs boson) comes the GREAT08 Challenge (arxiv/0802.1214) to deal with one of the major issues in observational Cosmology, the effect of dark matter. As Douglas Applegate puts it: Continue reading ‘The GREAT08 Challenge’ »
Tags:
Banff,
Challenge,
dark matter,
Douglas Applegate,
gravitational lensing,
GREAT08,
image analysis,
inference,
lensing,
LSST,
shear,
STEP
Category:
Algorithms,
Astro,
Data Processing,
Galaxies,
Imaging,
News,
Optical |
7 Comments
Feb 20th, 2008| 01:26 pm | Posted by vlk
Sherpa is a fitting environment in which Chandra data (and really, X-ray data from any observatory) can be analyzed. It has just undergone a major update and now runs on Python. Or allows Python to run. Something like that. It is a very powerful tool, but I can never remember how to use it, and I have an amazing knack for not finding what I need in the documentation. So here is a little cheat sheet (which I will keep updating as and when I learn more): Continue reading ‘Everybody needs crampons’ »
Tags:
Chandra,
cheat sheet,
ciao,
how to,
Python,
Sherpa,
Sherpa4
Category:
Algorithms,
Astro,
Fitting,
Jargon,
Languages |
Comment
Jan 30th, 2008| 02:33 am | Posted by hlee
Astronomers have developed their own ways of processing signals, almost independently of engineers though sometimes in collaboration with them, even though the fundamental goal of signal processing is the same: extracting information. Doubtless, these two parallel roads of astronomers and engineers have pointed in opposite directions: one toward the sky and the other toward the earth. Nevertheless, without much argument, we could say that statistics has served as the medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light for astronomers interested in signal processing and statistics outside the astronomical community.
IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing
This link shows the table of contents and provides links to the articles; however, access to the papers requires an IEEE Xplore subscription (via a library or an individual IEEE membership). Here, I’d like to introduce some of the articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »
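For readers new to the bootstrap, here is a minimal sketch of the basic idea: resample the data with replacement many times, recompute the statistic each time, and read a confidence interval off the percentiles of the results. The measurements, the function name `bootstrap_percentile_ci`, and its default parameters are hypothetical choices for this illustration and are not taken from the magazine articles.

```python
import random
import statistics

def bootstrap_percentile_ci(sample, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = sorted(
        statistic([rng.choice(sample) for _ in range(n)])  # resample with replacement
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical noisy measurements (made-up numbers).
measurements = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1, 4.7, 5.8, 3.9, 5.0]
low, high = bootstrap_percentile_ci(measurements, statistics.mean)
print(f"mean = {statistics.mean(measurements):.2f}, 95% interval ~ [{low:.2f}, {high:.2f}]")
```

The same function accepts any statistic, for example `statistics.median`, which is where the bootstrap earns its keep: no closed-form error formula is needed.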
Tags:
bootstrap,
compressive sensing,
confidence interval,
GLM,
IEEE,
jackknife,
machine learning,
multitaper estimate,
particle filter,
signal processing,
statistical inference,
Tutorial,
wavelet
Category:
Algorithms,
Bayesian,
Cross-Cultural,
Fitting,
Frequentist,
MC,
MCMC,
Methods,
Misc,
Spectral,
Stat,
Uncertainty,
arXiv |
Comment
Jan 21st, 2008| 03:33 pm | Posted by vlk
One of the big problems that has come up in recent years is how to represent the uncertainty in certain estimates. Astronomers usually present errors as ±stddev on the quantities of interest, but that presupposes that the errors are uncorrelated. But suppose you are estimating a multi-dimensional set of parameters that may have large correlations amongst themselves? One such case is that of Differential Emission Measures (DEM), where the “quantity of emission” from a plasma (loosely, how much stuff there is available to emit; it is the product of the volume and the densities of electrons and H) is estimated for different temperatures. See the plots at the PoA DEM tutorial for examples of how we are currently trying to visualize the error bars. Another example is the correlated systematic uncertainties in effective areas (Drake et al., 2005, Chandra Cal Workshop). This is not dissimilar to the problem of determining the significance of a “feature” in an image (Connors, A. & van Dyk, D.A., 2007, SCMA IV). Continue reading ‘Dance of the Errors’ »
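To see why plain error bars can mislead when parameters are correlated, consider a small sketch that propagates the uncertainty on a sum of two parameters, once with the full covariance matrix and once with only its diagonal. The numbers, the anti-correlation of -0.9, and the function name are hypothetical; this is not the method used in the PoA DEM tutorial.

```python
import math

def propagated_stddev(weights, covariance):
    """Standard deviation of sum_i weights[i] * theta[i], given cov(theta)."""
    size = len(weights)
    variance = sum(weights[i] * covariance[i][j] * weights[j]
                   for i in range(size) for j in range(size))
    return math.sqrt(variance)

# Hypothetical example: two parameters with equal error bars that are
# strongly anti-correlated (all numbers made up for illustration).
sigma = [1.0, 1.0]
rho = -0.9
full_cov = [[sigma[0] ** 2, rho * sigma[0] * sigma[1]],
            [rho * sigma[0] * sigma[1], sigma[1] ** 2]]
diag_cov = [[sigma[0] ** 2, 0.0],
            [0.0, sigma[1] ** 2]]  # what separate error bars alone imply

weights = [1.0, 1.0]  # we want the error on the summed quantity
with_corr = propagated_stddev(weights, full_cov)
without_corr = propagated_stddev(weights, diag_cov)
print(f"stddev of the sum: {with_corr:.3f} with correlations, "
      f"{without_corr:.3f} if they are ignored")
```

Both cases would be drawn with identical ±1 error bars on each parameter, yet the uncertainty on the combined quantity differs by roughly a factor of three, which is the information that independent error bars cannot carry.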
Tags:
animated,
David Garcia-Alvarez,
DEM,
error bands,
error bars,
flux,
MCMC,
O VII,
O VIII,
PINTofALE,
question for statisticians
Category:
Algorithms,
Astro,
Data Processing,
Jargon,
MCMC,
Spectral,
Stars,
Uncertainty |
2 Comments
Oct 24th, 2007| 09:15 pm | Posted by hlee
My friend’s blog led me to Terence Tao’s blog, where a mathematician writes on applied mathematics and other topics. A glance tells me that all the postings are well written. The posts on compressed sensing and single-pixel cameras drew my attention in particular, because the topic should stimulate the thoughts of astronomers working on virtual observatories and image processing (it is no exaggeration to say that observational astronomy starts with taking pictures, in a broad sense) and of statisticians working on multidimensional applications, not to mention engineers in signal and image processing. Continue reading ‘compressed sensing and a blog’ »
Oct 21st, 2007| 03:59 pm | Posted by vlk
wavdetect is a wavelet-based source detection algorithm that is in wide use in X-ray data analysis, in particular to find sources in Chandra images. It came out of the Chicago “Beta Site” of the AXAF Science Center (what CXC used to be called before launch). Despite the fancy name, and the complicated mathematics and the devilish details, it is really not much more than a generalization of earlier local cell detect, where a local background is estimated around a putative source and the question is asked, is whatever signal that is being seen in this pixel significantly higher than expected? However, unlike previous methods that used a flux measurement as the criterion for detection (e.g., using signal-to-noise ratios as proxy for significance threshold), it tests the hypothesis that the observed signal can be obtained as a fluctuation from the background. Continue reading ‘The power of wavdetect’ »
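The hypothesis-testing idea in that last sentence can be sketched in a few lines: given an expected background, ask how probable it is that a Poisson fluctuation alone produces at least the observed number of counts. This is only an illustration of the principle for a single cell; it is not wavdetect's implementation, which works with wavelet-correlated images, and the function names and the threshold value used here are assumptions.

```python
import math

def poisson_tail_probability(observed_counts, expected_background):
    """P(N >= observed_counts) for N ~ Poisson(expected_background)."""
    if observed_counts <= 0:
        return 1.0
    term = math.exp(-expected_background)  # P(N = 0)
    cumulative = term
    for k in range(1, observed_counts):
        term *= expected_background / k  # P(N = k) from P(N = k - 1)
        cumulative += term
    return max(0.0, 1.0 - cumulative)

def is_detection(observed_counts, expected_background, threshold=1e-6):
    """Flag a source if a background fluctuation this large is rarer than threshold."""
    return poisson_tail_probability(observed_counts, expected_background) < threshold

# Hypothetical cells with an expected background of 2 counts each.
for counts in (3, 8, 15):
    p = poisson_tail_probability(counts, 2.0)
    print(f"{counts} counts: p = {p:.3g}, detection = {is_detection(counts, 2.0)}")
```

Unlike a signal-to-noise cut, the threshold here is a false-positive probability per cell, so it translates directly into an expected number of spurious sources per image.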
Tags:
AXAF,
ChaMP,
Chandra,
ciao,
Power,
source detection,
Type II error,
wavdetect,
wavelet
Category:
Algorithms,
Imaging,
X-ray |
1 Comment
Oct 5th, 2007| 04:47 pm | Posted by hlee
Not knowing much about Java and Java applets in software development and web publishing, I cannot comment on which is more efficient. Nevertheless, I thought that PHP could do a similar job in a simpler fashion, and the following may provide some ideas and solutions for publicizing statistical methods based on Bayesian inference through websites.
Continue reading ‘Implement Bayesian inference using PHP’ »
Tags:
Bayesian Inference,
Classification,
Conditional Probability,
Estimation,
IBM,
JAVA,
Open Source,
PHP
Category:
Algorithms,
Bayesian,
Cross-Cultural,
Data Processing,
Languages |
Comment
Oct 3rd, 2007| 06:41 pm | Posted by aneta
I am visiting the Copernicus Astronomical Center in Warsaw this week, and this is the reason for the Polish connection! I learned about two papers that might interest our group. They are authored by Alex Schwarzenberg-Czerny:
1. Accuracy of period determination (1991, MNRAS, 253, 198)
Periods of oscillation are frequently found using one of two methods: least-squares (LSQ) fit or power spectrum. Their errors are estimated using the LSQ correlation matrix or the Rayleigh resolution criterion, respectively. In this paper, it is demonstrated that both estimates are statistically incorrect. On the one hand, the LSQ covariance matrix does not account for correlation of residuals from the fit. Neglect of the correlations may cause large underestimation of the variance. On the other hand, the Rayleigh resolution criterion is insensitive to signal-to-noise ratio and thus does not reflect quality of observations. The correct variance estimates are derived for the two methods.
Continue reading ‘Polish AstroStatistics’ »
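To make the power-spectrum side of that abstract concrete, the sketch below locates a period with a classical periodogram and computes the Rayleigh resolution, taken here as 1/T for a time span T. The simulated data, noise level, frequency grid, and function names are all assumptions made for illustration, and the corrected variance estimates derived in the paper are not reproduced.

```python
import math
import random

def periodogram_power(times, values, frequency):
    """Classical periodogram power of mean-subtracted data at one frequency."""
    mean = sum(values) / len(values)
    phase = [2.0 * math.pi * frequency * t for t in times]
    c = sum((v - mean) * math.cos(p) for v, p in zip(values, phase))
    s = sum((v - mean) * math.sin(p) for v, p in zip(values, phase))
    return (c * c + s * s) / len(values)

def peak_frequency(times, values, frequencies):
    """Trial frequency with the highest periodogram power."""
    return max(frequencies, key=lambda f: periodogram_power(times, values, f))

# Hypothetical evenly sampled sinusoid with Gaussian noise (made-up numbers).
rng = random.Random(1)
true_frequency = 0.2  # cycles per unit time
noise_sigma = 0.3
times = [0.5 * i for i in range(100)]
values = [math.sin(2.0 * math.pi * true_frequency * t) + rng.gauss(0.0, noise_sigma)
          for t in times]

frequencies = [0.01 * k for k in range(1, 91)]  # stays below the Nyquist frequency of 1.0
best = peak_frequency(times, values, frequencies)

# The Rayleigh resolution depends only on the time span, not on the noise level.
rayleigh_resolution = 1.0 / (times[-1] - times[0])
print(f"peak at f = {best:.2f}, Rayleigh resolution = {rayleigh_resolution:.4f}")
```

Changing `noise_sigma` leaves `rayleigh_resolution` untouched, which is the abstract's complaint: the criterion cannot reflect the quality of the observations.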
Sep 17th, 2007| 03:36 pm | Posted by hlee
VOConvert, or ConVOT, is a small Java tool that converts file formats from FITS to ASCII or the other way around. It might be useful for statisticians who want to convert the astronomers’ data format, called FITS, into ASCII quickly for statistical analysis. Additionally, VOConvert creates an interim output for VOStat, which is designed for statistical analysis of data from the Virtual Observatory. The software and the list of Virtual Observatories around the world can be found at Virtual Observatory India. Please check the link to VOStat (http://groundtruth.info/AstroStat/slog/2007/vostat) for more information about VOStat.
Sep 14th, 2007| 08:46 pm | Posted by hlee
Sep 12th, 2007| 04:31 pm | Posted by hlee
From arxiv/astro-ph:0709.1359,
A robust morphological classification of high-redshift galaxies using support vector machines on seeing limited images. I Method description by M. Huertas-Company et al.
Machine learning and statistical learning are becoming more and more popular in astronomy. Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are rarely absent when the objective is classification of massive survey data. The authors provide a gentle tutorial on SVM for galaxy morphological classification. Their source code, GALSVM, is linked for interested readers.
Continue reading ‘[ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007’ »
Sep 4th, 2007| 10:55 pm | Posted by hlee
From arxiv/astro-ph:0708.4274v1
Comparison of decision tree methods for finding active objects by Y. Zhao and Y. Zhang
The authors (astronomers) introduce and summarize various decision tree methods (REPTree, Random Tree, Decision Stump, Random Forest, J48, NBTree, and AdTree) for the astronomical community.
Continue reading ‘[ArXiv] Decision Tree, Aug. 31, 2007’ »