This was written more than a year ago, and I forgot to post it.
Continue reading ‘[Book] The Elements of Statistical Learning, 2nd Ed.’ »
Posts tagged ‘machine learning’
Thanks to a Korean solar physicist, I was able to gather the following websites and some relevant information on space weather forecasting in action, not limited to the literature or toy data.
- Space Weather Research Lab at NJIT
- SEEDS — Solar Eruptive Event Detection System at George Mason University.
- CACTUS: a software package for ‘Computer Aided CME Tracking’
- SRON in the Netherlands
I must acknowledge him for his kindness and patience. He was my Wikipedia for questions while I was studying the Sun.
Statistical resampling methods are rather unfamiliar to astronomers. Bootstrapping may be an exception, but I feel it is still underrepresented. Having seen a recent review paper on cross validation on [arXiv] that describes basic notions in theoretical statistics, I couldn’t resist mentioning it here. Cross validation has been used in various statistical fields such as classification, density estimation, model selection, and regression, to name a few. Continue reading ‘[ArXiv] Cross Validation’ »
Among the billions of objects in our Galaxy, outside the Earth, our Sun draws the most attention from astronomers. These astronomers go by the name of solar physicists, and they enjoy the most abundant data, including 400-year-long sunspot counts. Their joy comes not only from the fascinating, active, and unpredictable character of the Sun but also from its influence on our daily lives. In connection with the latter, studying the conditions on the Sun is sometimes called space weather forecasting. Continue reading ‘space weather’ »
I’ve been asking how one can do machine learning on solar images without a training set (see my comment at the big picture). On the other hand, I’m also aware of a challenge in astronomy: data (images) cannot be transformed freely and fed into standard machine learning algorithms. Tailoring data pipelining, cleaning, and processing to existing vision algorithms may not be achievable. The hope of automating the detection and identification of interesting features (e.g. flares and loops) and of forecasting events on the surface of the Sun is only a dream. Even though the image data stream is at the level of a tsunami, we might have to depend on human eyes to comb out interesting features on the Sun until a new paradigm emerges: automated feature identification algorithms based on a single image, i.e. without a training set. The good news is that human eyes have done a superb job! Continue reading ‘An excerpt from …’ »
A nice book by Christopher Bishop.
While reading abstracts and papers from astro-ph, I saw many applications of algorithms from pattern recognition and machine learning (PRML). Their frequency will increase as large-scale survey projects multiply, so recommending a good textbook or reference in the field seems timely. Continue reading ‘[Book] pattern recognition and machine learning’ »
Ever since I first learned about Hubble’s tuning fork, I have wanted to classify galaxies based on their features (colors and spectra) instead of relying on labor-intensive classification by human eye (semi-supervised learning seems more suitable). Ironically, at that time I didn’t know that there was a field of computer science called machine learning, or that statistics does such studies. Upon switching to statistics, with the hope of understanding the statistical packages implemented in IRAF and IDL and of better learning the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »
The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from the week. Continue reading ‘[ArXiv] 4th week, Apr. 2008’ »
Astronomers have developed their ways of processing signals almost independently of, but sometimes in collaboration with, engineers, although the fundamental goal of signal processing is the same: extracting information. Without doubt, these two parallel roads of astronomers and engineers have been pointing in opposite directions: one toward the sky and the other toward the earth. Nevertheless, without much argument, we could say that statistics has to some extent served as the medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light for astronomers interested in signal processing and statistics outside the astronomical community.
This link shows the table of contents and provides links to the articles; however, access to the papers requires an IEEE Xplore subscription (via libraries or individual IEEE membership). Here, I’d like to introduce some of the articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »
I found this website a while ago but hadn’t checked it until now. Its contents are quite useful (even the pages of the lecture notes are flipped for you in step with the lecture as it is given). As machine learning grows in popularity among astronomers, such lectures will find more use. If you have time to learn machine learning and other related subjects, please visit http://videolectures.net/. Links classified by subject are a click away. Continue reading ‘On-line Machine Learning Lectures and Notes’ »
Since I began subscribing to arxiv/astro-ph abstracts, one of the most frequent topics, from an astrostatistical point of view, has been photometric redshifts. Photometric redshifts have been a popular topic as catalogs of photometric observations of remote objects multiply in volume and multi-band sky survey projects lead to virtual observatories (VO; I will discuss these in a later posting). Just searching for photometric redshifts in Google Scholar and arxiv.org returns more than 2000 articles since 2000.
Continue reading ‘Photometric Redshifts’ »
Spectroscopic Surveys: Present by Yip, C. gives an overview of recent spectroscopic sky surveys and spectral analysis techniques on the way toward Virtual Observatories (VO). Besides the number of spectroscopic redshift measurements growing as in Moore’s law, the surveys tend to go deeper and aim for completeness. Elliptical galaxy formation has been the main subject of study because ellipticals are more abundant than spirals, and the galactic bimodality in color-color or color-magnitude diagrams is described as the result of gas-rich (blue) mergers forming the red sequence. Principal component analysis has incorporated ratios of emission line strengths for classifying Type-II AGN and star-forming galaxies. Lyα identifies high-z quasars, and other spectral patterns over z reveal the history of the early universe and the characteristics of quasars. The recent discovery of 10 satellites of the Milky Way is also mentioned.
Continue reading ‘[ArXiv] Spectroscopic Survey, June 29, 2007’ »
Leo Breiman (1928-2005) was one of the most dominant statisticians of the 20th century. He was well known for his textbook on probability theory as well as for his contributions to machine learning, such as CART (Classification and Regression Trees), bagging (bootstrap aggregation), and Random Forest. He was the founding father of statistical machine learning. His works can be found at http://www.stat.berkeley.edu/~breiman/
An excerpt from “A Conversation with Leo Breiman” by Richard Olshen (2001), Statistical Science, 16(2), pp. 184–198, prompts second thoughts about the direction of statistical research:
Continue reading ‘An excerpt from “A Conversation with Leo Breiman”’ »