<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The AstroStat Slog &#187; Poisson</title>
	<atom:link href="http://groundtruth.info/AstroStat/slog/tag/poisson/feed/" rel="self" type="application/rss+xml" />
	<link>http://groundtruth.info/AstroStat/slog</link>
	<description>Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders</description>
	<lastBuildDate>Fri, 09 Sep 2011 17:05:33 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>[ArXiv] Sparse Poisson Intensity Reconstruction Algorithms</title>
		<link>http://groundtruth.info/AstroStat/slog/2009/arxiv-sparse-poisson-intensity-reconstruction-algorithms/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2009/arxiv-sparse-poisson-intensity-reconstruction-algorithms/#comments</comments>
		<pubDate>Thu, 07 May 2009 16:14:39 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Astro]]></category>
		<category><![CDATA[Cross-Cultural]]></category>
		<category><![CDATA[Data Processing]]></category>
		<category><![CDATA[High-Energy]]></category>
		<category><![CDATA[Imaging]]></category>
		<category><![CDATA[Jargon]]></category>
		<category><![CDATA[arXiv]]></category>
		<category><![CDATA[compressed sensing]]></category>
		<category><![CDATA[decomposition]]></category>
		<category><![CDATA[EM algorithm]]></category>
		<category><![CDATA[intensity]]></category>
		<category><![CDATA[MPLE]]></category>
		<category><![CDATA[multiscale]]></category>
		<category><![CDATA[penalty]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Poisson Intensity]]></category>
		<category><![CDATA[Sparcity]]></category>
		<category><![CDATA[wavelet]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=2498</guid>
		<description><![CDATA[One of yesterday&#8217;s [ArXiv] papers has a title that might draw a lot of attention from astronomers. Furthermore, it&#8217;s a short paper.
[arxiv:math.CO:0905.0483] by Harmany, Marcia, and Willett.

Estimating f under a &#8220;Sparse Poisson Intensity&#8221; condition is a frequently appearing topic in high energy astrophysics data analysis. Some might like to check the references in the paper, which offer solutions to [...]]]></description>
			<content:encoded><![CDATA[<p>One of yesterday&#8217;s [ArXiv] papers has a title that might draw a lot of attention from astronomers. Furthermore, it&#8217;s a short paper.<br />
<a href="http://arxiv.org/abs/0905.0483">[arxiv:math.CO:0905.0483]</a> by Harmany, Marcia, and Willett.<br />
<span id="more-2498"></span><br />
Estimating f under a &#8220;Sparse Poisson Intensity&#8221; condition is a frequently appearing topic in high energy astrophysics data analysis. Some might like to check the references in the paper, which offer solutions to compressed sensing problems with different kinds of sparsity, minimization approaches, and constraints on f.</p>
<p>Apart from the technical details, the first two sentences from the conclusion,</p>
<blockquote><p>
We have developed computational approaches for signal reconstruction from photon-limited measurements &#8211; a situation prevalent in many practical settings. Our method optimizes a regularized Poisson likelihood under nonnegativity constraints</p></blockquote>
<p>tempt me to study and try their algorithm.</p>
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2009/arxiv-sparse-poisson-intensity-reconstruction-algorithms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Poisson vs Gaussian, Part 2</title>
		<link>http://groundtruth.info/AstroStat/slog/2009/poigauss-pdfs/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2009/poigauss-pdfs/#comments</comments>
		<pubDate>Fri, 10 Apr 2009 19:16:31 +0000</pubDate>
		<dc:creator>vlk</dc:creator>
				<category><![CDATA[Jargon]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[background]]></category>
		<category><![CDATA[gaussian]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Poisson]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=2181</guid>
		<description><![CDATA[Probability density functions are another way of summarizing the consequences of assuming a Gaussian error distribution when the true distribution is Poisson.  We can compute the posterior probability of the intensity of a source, when some number of counts are observed in a source region, and the background is estimated using counts observed in [...]]]></description>
			<content:encoded><![CDATA[<p>Probability density functions are another way of summarizing the <a href="http://groundtruth.info/AstroStat/slog/2009/poigauss/">consequences of assuming a Gaussian</a> error distribution when the true distribution is Poisson.  We can compute the <a href="http://groundtruth.info/AstroStat/slog/2008/eotw-bkgsubtract-poisson/">posterior probability of the intensity of a source</a>, when some number of counts are observed in a source region, and the background is estimated using counts observed in a different region.  We can then compare it to the <a href="http://groundtruth.info/AstroStat/slog/2008/eotw-background-subtraction/">equivalent Gaussian</a>.</p>
<p>The figure below (<a href="http://cxc.harvard.edu/csc/conferences/AAS2009/Jan2009/files/cscxap.pdf">AAS 472.09</a>) compares the pdfs for the Poisson intensity (red curves) and the Gaussian equivalent (black curves) for two cases: when the number of counts in the source region is 50 (top) and 8 (bottom) respectively.  In both cases a background of 200 counts collected in an area 40x the source area is used.  The hatched region represents the 68% equal-tailed interval for the Poisson case, and the solid horizontal line is the &#177;1&#963; width of the equivalent Gaussian.</p>
<p>Clearly, for small counts, the support of the Poisson distribution is bounded below at zero, but that of the Gaussian is not.  This introduces a visibly large bias in the interval coverage as well as in the normalization properties.  Even at high counts, the Poisson is skewed such that larger values are slightly more likely to occur by chance than in the Gaussian case.  This skew can be quite critical for marginal results.<span id="more-2181"></span></p>
<div id="attachment_2184" class="wp-caption alignnone" style="width: 310px"><img src="http://groundtruth.info/AstroStat/slog/wp-content/uploads/2009/04/poigauss2-300x150.jpg" alt="Poisson and Gaussian probability densities" width="300" height="150" class="size-medium wp-image-2184" /><p class="wp-caption-text">Poisson and Gaussian probability densities</p></div>
<p>No <a href="http://groundtruth.info/AstroStat/slog/2009/poigauss/comment-page-1/#comment-872">simple IDL code this time</a>, but for reference, the Poisson posterior probability density curves were generated with the <a href="http://hea-www.harvard.edu/PINTofALE/">PINTofALE</a> routine <tt><a href="http://hea-www.harvard.edu/PINTofALE/pro/stat/ppd_src.pro">ppd_src()</a></tt>.</p>
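<p>For readers who want to experiment, the comparison can be sketched in a few lines of Python. This is not a reimplementation of <tt>ppd_src()</tt>: it is a minimal grid calculation that assumes flat priors on the source and background intensities, and it takes the Gaussian equivalent to be plain background subtraction with propagated errors. The function names, the grid sizes, and the 8-count example are illustrative choices.</p>
<pre>
```python
import math

def log_poisson(k, mu):
    """Log of the Poisson probability of k counts given mean mu (mu must be positive)."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def source_posterior(n_src, n_bkg, area_ratio, s_max=30.0, n_s=300, n_b=300):
    """Posterior density of the source intensity s on a midpoint grid.

    Assumes flat priors on s and on the background b (in source-area units)
    and marginalizes b numerically.  Returns (s_grid, density).
    """
    # Background grid: a generous range around the estimate n_bkg / area_ratio.
    b_hat = (n_bkg + 1.0) / area_ratio
    b_max = b_hat + 10.0 * math.sqrt(n_bkg + 1.0) / area_ratio
    db = b_max / n_b
    b_grid = [(j + 0.5) * db for j in range(n_b)]
    # Weight of each b from the background region, which sees area_ratio * b.
    log_wb = [log_poisson(n_bkg, area_ratio * b) for b in b_grid]

    ds = s_max / n_s
    s_grid = [(i + 0.5) * ds for i in range(n_s)]
    dens = []
    for s in s_grid:
        total = 0.0
        for b, lw in zip(b_grid, log_wb):
            total += math.exp(log_poisson(n_src, s + b) + lw)
        dens.append(total)
    norm = sum(dens) * ds
    return s_grid, [d / norm for d in dens]

def gaussian_equivalent(n_src, n_bkg, area_ratio):
    """Mean and sigma from background subtraction with propagated errors."""
    mean = n_src - n_bkg / area_ratio
    sigma = math.sqrt(n_src + n_bkg / area_ratio ** 2)
    return mean, sigma

# The low-count case from the text: 8 source counts, 200 background counts in 40x the area.
s_grid, dens = source_posterior(n_src=8, n_bkg=200, area_ratio=40.0)
mode = s_grid[dens.index(max(dens))]
mean_g, sigma_g = gaussian_equivalent(8, 200, 40.0)
print("Poisson posterior mode:", round(mode, 2))
print("Gaussian mean, sigma:", round(mean_g, 2), round(sigma_g, 2))
```
</pre>
<p>For the 50-count case, <tt>s_max</tt> would need to be raised so that the grid covers the whole posterior.</p>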
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2009/poigauss-pdfs/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Poisson vs Gaussian</title>
		<link>http://groundtruth.info/AstroStat/slog/2009/poigauss/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2009/poigauss/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 23:01:58 +0000</pubDate>
		<dc:creator>vlk</dc:creator>
				<category><![CDATA[Jargon]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[bias]]></category>
		<category><![CDATA[gaussian]]></category>
		<category><![CDATA[IDL]]></category>
		<category><![CDATA[Poisson]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=2166</guid>
		<description><![CDATA[We astronomers are rather fond of approximating our counting statistics with Gaussian error distributions, and a lot of ink has been spilled justifying and/or denigrating this habit.  But just how bad is the approximation anyway?
I ran a simple Monte Carlo based test to compute the expected bias between a Poisson sample and the &#8220;equivalent&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>We astronomers are rather fond of approximating our counting statistics with Gaussian error distributions, and a lot of ink has been spilled justifying and/or denigrating this habit.  But just how bad is the approximation anyway?</p>
<p>I ran a simple Monte Carlo based test to compute the expected bias between a Poisson sample and the &#8220;equivalent&#8221; Gaussian sample.  The result is shown in the plot below. </p>
<p>The jagged red line is the fractional expected bias relative to the true intensity.  The typical recommendation in high-energy astronomy is to bin up events until there are about 25 or so counts per bin.  This leads to an average bias of about 2% in the estimate of the true intensity.  The bias drops below 1% for counts &gt;50.  <span id="more-2166"></span> The smooth blue line is the reciprocal of the square-root of the intensity, reflecting the width of the Poisson distribution relative to the true intensity, and is given here only for illustrative purposes. </p>
<div id="attachment_2168" class="wp-caption alignnone" style="width: 535px"><img src="http://groundtruth.info/AstroStat/slog/wp-content/uploads/2009/04/poigauss.jpg" alt="Poisson-Gaussian bias" width="525" height="375" class="size-full wp-image-2168" /><p class="wp-caption-text">Poisson-Gaussian bias</p></div>
<p>Exemplar <a href="http://www.ittvis.com/ProductServices/IDL.aspx">IDL</a> code that can be used to generate this kind of plot is appended below:<br />
<code><br />
nlam=100L &amp; nsim=20000L<br />
lam=indgen(nlam)+1 &amp; sct=intarr(nlam,nsim) &amp; scg=sct &amp; dct=fltarr(nlam)<br />
for i=0L,nlam-1L do sct[i,*]=randomu(seed,nsim,poisson=lam[i])<br />
for i=0L,nlam-1L do scg[i,*]=randomn(seed,nsim)*sqrt(lam[i])+lam[i]<br />
for i=0,nlam-1L do dct[i]=mean(sct[i,*]-scg[i,*])/(lam[i])<br />
plot,lam,dct,/yl,yticklen=1,ygrid=1<br />
oplot,lam,1./sqrt(lam)<br />
</code></p>
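<p>A rough Python sketch of the same experiment, using only the standard library, is given here for readers without IDL. The function names and the Knuth-style Poisson sampler are illustrative choices, not part of the original code. One assumption should be flagged loudly: in the IDL above, <tt>scg</tt> is created as a copy of the integer array <tt>sct</tt>, so the sketch assumes the Gaussian draws are truncated to integers when stored and mimics that with <tt>int()</tt>. If that reading is right, the truncation by itself shifts the Gaussian sample by about half a count.</p>
<pre>
```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def fractional_bias(lam, nsim, rng):
    """Mean of (Poisson draw - integer-stored Gaussian draw), divided by lam."""
    total = 0
    for _ in range(nsim):
        poi = poisson_draw(rng, lam)
        # ASSUMPTION: int() truncates toward zero, mimicking storage of a
        # float in an IDL integer array.
        gau = int(rng.gauss(lam, math.sqrt(lam)))
        total += poi - gau
    return total / nsim / lam

rng = random.Random(12345)
for lam in (5, 25, 50, 100):
    # Columns: intensity, fractional bias, 1/sqrt(intensity) for reference.
    print(lam, round(fractional_bias(lam, 20000, rng), 4), round(1.0 / math.sqrt(lam), 4))
```
</pre>
<p>Replacing <tt>int(...)</tt> with the untruncated Gaussian draw is an easy way to check how much of the offset comes from the integer storage.</p>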
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2009/poigauss/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Poisson Likelihood [Equation of the Week]</title>
		<link>http://groundtruth.info/AstroStat/slog/2008/eotw-poisson-likelihood/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2008/eotw-poisson-likelihood/#comments</comments>
		<pubDate>Wed, 02 Jul 2008 17:00:32 +0000</pubDate>
		<dc:creator>vlk</dc:creator>
				<category><![CDATA[Jargon]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[binomial]]></category>
		<category><![CDATA[counting]]></category>
		<category><![CDATA[EotW]]></category>
		<category><![CDATA[Equation]]></category>
		<category><![CDATA[Equation of the Week]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Poisson Likelihood]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=333</guid>
		<description><![CDATA[Astrophysics, especially high-energy astrophysics, is all about counting photons.  And this, it is said, naturally leads to all our data being generated by a Poisson process.  True enough, but most astronomers don&#8217;t know exactly how it works out, so this derivation is for them.
Suppose N counts are randomly placed in an interval of [...]]]></description>
			<content:encoded><![CDATA[<p>Astrophysics, especially high-energy astrophysics, is all about counting photons.  And this, it is said, naturally leads to all our data being generated by a Poisson process.  True enough, but most astronomers don&#8217;t know exactly how it works out, so this derivation is for them.<span id="more-333"></span></p>
<p>Suppose <em>N</em> counts are randomly placed in an interval of duration <em>&#964;</em> without any preference for appearing in any particular portion of <em>&#964;</em>, i.e., the distribution is uniform.  The counting rate is <em>R = N/&#964;</em>.  We can now ask, what is the probability of finding <em>k</em> counts in an infinitesimal interval <em>&#948;t</em> within <em>&#964;</em>?</p>
<p>First, consider the probability that one count, placed randomly, will fall inside <em>&#948;t</em>,<br />
<strong><br />
<blockquote>
&#961; = &#948;t/&#964; &#8801; R&#948;t/N &#8801; &#957;/N
</p></blockquote>
<p></strong><br />
where <em>&#957; = R &#948;t</em> represents the expected number of counts in the interval <em>&#948;t</em>.  When <em>N</em> counts are scattered over <em>&#964;</em>, the probability that <em>k</em> of them will fall inside <em>&#948;t</em> is described with a binomial distribution,<br />
<strong><br />
<blockquote>
p(k|&#961;,N) = <sup>N</sup>C<sub>k</sub> &#961;<sup>k</sup> (1-&#961;)<sup>N-k</sup>
</p></blockquote>
<p></strong><br />
as the product of the probability of finding <em>k</em> events inside <em>&#948;t</em> and the probability of finding the remaining events outside, summed over all the possible distinct ways that <em>k</em> events can be chosen out of <em>N</em>.  Expanding the expression and rearranging,<br />
<strong><br />
<blockquote>
= N!/{(N-k)!k!} (R &#948;t/N)<sup>k</sup> (1-(R &#948;t/N))<sup>N-k</sup><br />
<br />
= N!/{(N-k)!k!} (&#957;<sup>k</sup>/N<sup>k</sup>) (1-(&#957;/N))<sup>N-k</sup><br />
<br />
= N!/{(N-k)!N<sup>k</sup>} (&#957;<sup>k</sup>/k!) (1-(&#957;/N))<sup>N</sup> (1-(&#957;/N))<sup>-k</sup>
</p></blockquote>
<p></strong><br />
Note that as <em>N,&#964; &#8594; &#8734;</em> (while keeping <em>R</em> fixed),<br />
<strong><br />
<blockquote>
N!/{(N-k)!N<sup>k</sup>} , (1-(&#957;/N))<sup>-k</sup> &#8594; 1<br />
(1-(&#957;/N))<sup>N</sup> &#8594; e<sup>-&#957;</sup>
</p></blockquote>
<p></strong><br />
and the expression reduces to<br />
<strong><br />
<blockquote>
p(k|&#957;) = (&#957;<sup>k</sup>/k!) e<sup>-&#957;</sup>
</p></blockquote>
<p></strong><br />
which is the familiar (in a manner of speaking) expression for the Poisson likelihood.</p>
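<p>The limit can also be checked numerically. The short Python sketch below evaluates the binomial expression with &#961; = &#957;/N for increasing N and compares it with the Poisson expression; the values &#957; = 2 and k = 3 are arbitrary illustrative choices, and the function names are made up for this example.</p>
<pre>
```python
import math

def binomial_pmf(k, n, rho):
    """p(k | rho, N): probability that k of n counts fall inside the interval."""
    return math.comb(n, k) * rho ** k * (1.0 - rho) ** (n - k)

def poisson_pmf(k, nu):
    """p(k | nu) = nu^k e^(-nu) / k!"""
    return nu ** k * math.exp(-nu) / math.factorial(k)

nu, k = 2.0, 3  # illustrative values: expected counts and observed counts
for n in (10, 100, 10000, 1000000):
    # The gap between the two expressions shrinks as N grows with nu fixed.
    gap = abs(binomial_pmf(k, n, nu / n) - poisson_pmf(k, nu))
    print(n, gap)
```
</pre>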
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2008/eotw-poisson-likelihood/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>[ArXiv] 3rd week, May 2008</title>
		<link>http://groundtruth.info/AstroStat/slog/2008/arxiv-3rd-week-may-2008/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2008/arxiv-3rd-week-may-2008/#comments</comments>
		<pubDate>Mon, 26 May 2008 18:59:38 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[Bayesian]]></category>
		<category><![CDATA[Fitting]]></category>
		<category><![CDATA[MCMC]]></category>
		<category><![CDATA[Methods]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[arXiv]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[high dimension]]></category>
		<category><![CDATA[LF]]></category>
		<category><![CDATA[maximum likelihood]]></category>
		<category><![CDATA[multivariate]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Schechter]]></category>
		<category><![CDATA[zero count]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=316</guid>
		<description><![CDATA[Not many this week, but there&#8217;s a great read.

[stat.ME:0805.2756] Fionn Murtagh
The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering
[astro-ph:0805.2945] Martin, de Jong, &#038; Rix
A comprehensive Maximum Likelihood analysis of the structural properties of faint Milky Way satellites
[astro-ph:0805.2946] Kelly, Fan, &#038; Vestergaard
A Flexible Method of Estimating Luminosity Functions [my subjective comment is added [...]]]></description>
			<content:encoded><![CDATA[<p>Not many this week, but there&#8217;s a great read.<span id="more-316"></span></p>
<ul>
<li><a href="http://arxiv.org/abs/0805.2756">[stat.ME:0805.2756]</a> Fionn Murtagh<br />
<strong>The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering</strong></li>
<li><a href="http://arxiv.org/abs/0805.2945">[astro-ph:0805.2945]</a> Martin, de Jong, &#038; Rix<br />
<strong>A comprehensive Maximum Likelihood analysis of the structural properties of faint Milky Way satellites</strong></li>
<li><a href="http://arxiv.org/abs/0805.2946">[astro-ph:0805.2946]</a> Kelly, Fan, &#038; Vestergaard<br />
<strong>A Flexible Method of Estimating Luminosity Functions</strong> [my subjective comment is added at the bottom]</li>
<li><a href="http://arxiv.org/abs/0805.3220">[stat.ME:0805.3220]</a> Bayarri, Berger, Datta<br />
<strong>Objective Bayes testing of Poisson versus inflated Poisson models</strong> (Could it be of use when one is dealing with many zero background counts, sparsely populated nonzero background counts, and sparsely populated source counts?)</li>
</ul>
<p>[<strong>Comment</strong>] You must read the Kelly, Fan, &#038; Vestergaard paper. It can serve as a very good Bayesian tutorial for astronomers. I think there&#8217;s a typo, though nothing major: a plus/minus sign in the likelihood. Tom Loredo has kindly explained the Schechter function through his extensive slog comments, and this paper made me appreciate the gamma distribution more. The Schechter function and the gamma density function share the same equation, although the purposes for which they are used have little in common. (Forgive my ignorance of the extensive Bayesian usage of the gamma distribution, beyond the fact that it&#8217;s a conjugate prior for the Poisson or exponential distribution.) </p>
<p>FYI, there was another recent arxiv paper on zero-inflation <a href="http://arxiv.org/abs/0805.2258">[stat.ME:0805.2258]</a> by Bhattacharya, Clarke, &#038; Datta<br />
<strong>A Bayesian test for excess zeros in a zero-inflated power series distribution</strong> </p>
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2008/arxiv-3rd-week-may-2008/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>gamma function (Equation of the Week)</title>
		<link>http://groundtruth.info/AstroStat/slog/2008/eotw-gamma-function/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2008/eotw-gamma-function/#comments</comments>
		<pubDate>Tue, 06 May 2008 22:12:45 +0000</pubDate>
		<dc:creator>vlk</dc:creator>
				<category><![CDATA[Misc]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[conjugate]]></category>
		<category><![CDATA[EotW]]></category>
		<category><![CDATA[Equation]]></category>
		<category><![CDATA[gamma]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[prior]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=291</guid>
		<description><![CDATA[The gamma function [not the Gamma -- note upper-case G -- which is related to the factorial] is one of those insanely useful functions that after one finds out about it, one wonders &#8220;why haven&#8217;t we been using this all the time?&#8221;  It is defined only on the non-negative real line, is a [...]]]></description>
			<content:encoded><![CDATA[<p>The <em>gamma</em> function [not the Gamma -- note upper-case G -- which is related to the factorial] is one of those insanely useful functions that after one finds out about it, one wonders &#8220;why haven&#8217;t we been using this all the time?&#8221;  It is defined only on the <strike>positive</strike> non-negative real line, is a highly flexible function that can emulate almost any kind of skewness in a distribution, and is a perfect complement to the Poisson likelihood.  In fact, it is the <a href="http://en.wikipedia.org/wiki/Conjugate_prior">conjugate prior</a> to the Poisson likelihood, and is therefore a natural choice for a prior in all cases that start off with counts.<span id="more-291"></span></p>
<p><a href='http://groundtruth.info/AstroStat/slog/wp-content/uploads/eotw_03.jpg'><img src="http://groundtruth.info/AstroStat/slog/wp-content/uploads/eotw_03-300x104.jpg" alt="" width="300" height="104" class="alignnone size-medium wp-image-294" /></a></p>
<p>The gamma function is defined with two parameters, <em>alpha</em> and <em>beta</em>, over the <strike>+ve</strike> non-negative real line.  <em>alpha</em> can be any real number greater than 0, unlike the Poisson likelihood where the equivalent quantity is an integer (values of 0 or less are possible, but the function then ceases to be integrable), and <em>beta</em> is any number greater than 0.</p>
<p>The mean is <em>alpha/beta</em> and the variance is <em>alpha/beta<sup>2</sup></em>.  Conversely, given a sample whose mean and variance are known, one can estimate <em>alpha</em> and <em>beta</em> to describe that sample with this function.</p>
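<p>Inverting the two relations gives <em>beta = mean/variance</em> and <em>alpha = mean<sup>2</sup>/variance</em>. A minimal Python sketch of this moment matching follows; the function name and the sample values are made up for illustration.</p>
<pre>
```python
def gamma_from_moments(mean, variance):
    """Solve mean = alpha/beta and variance = alpha/beta^2 for (alpha, beta)."""
    if not (mean > 0 and variance > 0):
        raise ValueError("mean and variance must both be positive")
    beta = mean / variance
    alpha = mean * beta
    return alpha, beta

sample = [3.1, 4.7, 2.2, 5.9, 4.1]  # made-up numbers, for illustration only
m = sum(sample) / len(sample)
v = sum((x - m) ** 2 for x in sample) / (len(sample) - 1)
alpha, beta = gamma_from_moments(m, v)
print(alpha, beta)
```
</pre>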
<p>This is reminiscent of the Poisson distribution where <em>alpha</em> ~ number of counts and <em>beta</em> is akin to the collecting area or the exposure time.  For this reason, a popular non-informative prior to use with the Poisson likelihood is <em>gamma(alpha=1,beta=0)</em>, which is like saying &#8220;we expect to detect 0 counts in 0 time&#8221;.  (Which, btw, is not the same as saying we <a href="http://groundtruth.info/AstroStat/slog/2007/zero-counts/">detect 0 counts in an observation</a>.)  <strong>[Edit:</strong> see Tom Loredo's <a href="#comment-218">comments</a> <a href="#comment-221">below</a> for more on this.<strong>]</strong>  Surprisingly, you can get less informative than even that, but that&#8217;s a discussion for another time.</p>
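<p>The counts-and-exposure analogy can be made concrete with the conjugate update. The Python sketch below assumes the gamma density has the form x<sup>alpha-1</sup> e<sup>-beta x</sup> (the form consistent with the mean <em>alpha/beta</em> quoted above) and a Poisson likelihood for <em>counts</em> observed in a given <em>exposure</em>; the function name and the numbers are illustrative only.</p>
<pre>
```python
def update_gamma_prior(alpha, beta, counts, exposure):
    """Posterior (alpha, beta) after observing `counts` in `exposure`.

    Poisson likelihood for intensity x: (x * exposure)^counts * exp(-x * exposure).
    Gamma prior in x (assumed form): x^(alpha - 1) * exp(-beta * x).
    Their product is again a gamma, with the parameters returned here.
    """
    return alpha + counts, beta + exposure

# Start from the gamma(alpha=1, beta=0) prior mentioned in the text.
a, b = update_gamma_prior(1.0, 0.0, counts=5, exposure=2.0)
print("posterior alpha, beta:", a, b)
print("posterior mean:", a / b, "variance:", a / b ** 2)
```
</pre>
<p>In this picture the prior simply adds <em>alpha</em>-1 pseudo-counts and <em>beta</em> units of pseudo-exposure to the data.</p>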
<p>Because it is the conjugate prior to the Poisson, it is also a useful choice to use as an <em>informative</em> prior.  It makes derivations of formulae that much easier, though one has to be careful about using it blindly in real world applications, as the presence of background can muck up the pristine Poissonness of the prior (as we discovered while applying <a href="http://groundtruth.info/AstroStat/slog/2007/ab-posteriori-ad-priori/">BEHR to Chandra Level3 products</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2008/eotw-gamma-function/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>tests of fit for the Poisson distribution</title>
		<link>http://groundtruth.info/AstroStat/slog/2008/tests-of-fit-for-the-poisson-distribution/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2008/tests-of-fit-for-the-poisson-distribution/#comments</comments>
		<pubDate>Tue, 29 Apr 2008 06:24:09 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[Methods]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Cramer-von Mises test]]></category>
		<category><![CDATA[Goodness of fit]]></category>
		<category><![CDATA[most powerful test]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Power]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/?p=280</guid>
		<description><![CDATA[<strong>Abstract:</strong> goodness-of-fit tests based on the Cramer-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A<sup>2</sup>, give good overall tests of fit. The statistics A<sup>2</sup> will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.]]></description>
			<content:encoded><![CDATA[<p>In almost a year of skimming arXiv:astro-ph abstracts, I have never come across a case where the fit of the Poisson distribution is tested in different ways; instead, it is taken for granted by plugging the data and the (source) model into a (modified) &#967;<sup>2</sup> function. If any doubts about the Poisson distribution arise, the following paper might be useful:<span id="more-280"></span></p>
<blockquote><p>J.J.Spinelli and M.A.Stephens (1997)<br />
<a href="http://www.jstor.org/pss/3315735">Cramer-von Mises tests of fit for the Poisson distribution</a><br />
Canadian J. Stat. Vol. 25(2), pp. 257-267<br />
<strong>Abstract:</strong> goodness-of-fit tests based on the Cramer-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A<sup>2</sup>, give good overall tests of fit. The statistics A<sup>2</sup> will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.
</p></blockquote>
<p>In addition to the Cramer-von Mises statistics (A<sup>2</sup> and W<sup>2</sup>), the paper introduces the following statistics and compares the powers of the tests:</p>
<ul>
<li>the dispersion test D (the so-called &#967;<sup>2</sup> statistic for testing goodness of fit in astronomy; D is treated as a two-sided test and is approximately distributed as a &#967;<sup>2</sup><sub>n-1</sub> variable),</li>
<li>the Neyman-Barton k-component smooth test S<sub>k</sub>,</li>
<li>P and T (statistics based on the probability generating function), and</li>
<li>the Pearson X<sup>2</sup> statistic (the number of cells K is chosen to avoid small expected values, and the statistic is compared to a &#967;<sup>2</sup><sub>K-1</sub> variable; I think astronomers call it the modified &#967;<sup>2</sup> test).</li>
</ul>
<p>To evaluate the powers of the Cramer-von Mises statistics, the authors use the parameter &#947; of the negative binomial distribution, which is zero under the null hypothesis (Poisson distribution). They set &#947;=&#948;/sqrt(n), with the value of &#948; chosen so that, for a two-sided 0.05 level test, the best test has a power of 0.5<sup>[1]</sup>. In this simulation study, the statistic A<sup>2</sup> was empirically as powerful as the best test, compared to the other Cramer-von Mises tests. </p>
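<p>As a concrete illustration of the simplest of these, the dispersion statistic is commonly written as D = &#931;(x<sub>i</sub> - mean)<sup>2</sup>/mean. The Python sketch below assumes that textbook form, which may differ in detail from the paper&#8217;s notation; the function name and the counts are made up for illustration.</p>
<pre>
```python
def dispersion_statistic(counts):
    """Index-of-dispersion statistic D = sum((x - mean)^2) / mean."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; D is undefined")
    return sum((x - mean) ** 2 for x in counts) / mean

counts = [3, 7, 4, 6, 5, 2, 8, 5]  # made-up counts, for illustration only
d = dispersion_statistic(counts)
dof = len(counts) - 1
# Under the Poisson null, D is compared with a chi-square variable with dof degrees of freedom.
print("D =", d, "with", dof, "degrees of freedom")
```
</pre>
<p>Values of D well above the degrees of freedom point toward overdispersion, and values well below toward underdispersion.</p>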
<p>Under the Poisson null hypothesis, the alternatives are overdispersed, underdispersed, and equally dispersed distributions. For the equally dispersed alternative, the Cramer-von Mises statistics have the best power compared to the other statistics.  Overall, the Cramer-von Mises statistics have good power against all classes of alternative distributions, and the Pearson X<sup>2</sup> statistic performed very poorly for the overdispersed alternative. </p>
<p>Instead of binning for the modified &#967;<sup>2</sup> tests<sup>[2]</sup>, we could adopt A<sup>2</sup> or W<sup>2</sup> for the goodness-of-fit tests. Probably, they are already implemented in software but have not been recognized.  </p>
<ol class="footnotes"><li id="footnote_0_280" class="footnote">The locally most powerful unbiased test is the statistic D (Potthoff and Whittinghill, 1966) </li><li id="footnote_1_280" class="footnote">The authors&#8217; examples indicate high significance levels compared to other tests; in other words, the &#967;<sup>2</sup> statistics &#8211; the dispersion test statistic D and the Pearson X<sup>2</sup> &#8211; are insensitive to evidence that the source model is not a good fit for Poisson photon count data</li></ol>]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2008/tests-of-fit-for-the-poisson-distribution/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>[ArXiv] 1st week, Nov. 2007</title>
		<link>http://groundtruth.info/AstroStat/slog/2007/arxiv-1st-week-nov-2007/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2007/arxiv-1st-week-nov-2007/#comments</comments>
		<pubDate>Fri, 02 Nov 2007 21:59:08 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[arXiv]]></category>
		<category><![CDATA[bootstrap]]></category>
		<category><![CDATA[EGRET]]></category>
		<category><![CDATA[Fisher information]]></category>
		<category><![CDATA[Laplace transform]]></category>
		<category><![CDATA[maximum likelihood]]></category>
		<category><![CDATA[PCA]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Ratio]]></category>
		<category><![CDATA[Uncertainty]]></category>
		<category><![CDATA[variance]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/2007/arxiv-1st-week-nov-2007/</guid>
		<description><![CDATA[To be exact, the title of this posting should contain 5th week, Oct, which seems to be the week of EGRET. In addition to the astro-ph papers, I include a few statistics papers which, although not directly related to astrostatistics, may be useful for astronomical data analysis.    

[astro-ph:0710.4966]
Uncertainties of the antiproton [...]]]></description>
			<content:encoded><![CDATA[<p>To be exact, the title of this posting should contain <em>5th week, Oct</em>, which seems to be the week of EGRET. In addition to the astro-ph papers, I include a few statistics papers which, although not directly related to astrostatistics, may be useful for astronomical data analysis.    <span id="more-185"></span></p>
<ul>
<li><a href="http://arxiv.org/abs/0710.4966">[astro-ph:0710.4966]</a><br />
<strong>Uncertainties of the antiproton flux from Dark Matter annihilation in comparison to the EGRET excess of diffuse gamma rays</strong> by Iris Gebauer</li>
<li><a href="http://arxiv.org/abs/0710.5106">[astro-ph:0710.5106]</a><br />
<strong>The dark connection between the Canis Major dwarf, the Monoceros ring, the gas flaring, the rotation curve and the EGRET excess of diffuse Galactic Gamma Rays</strong> by W. de Boer et al.</li>
<li><a href="http://arxiv.org/abs/0710.5119">[astro-ph:0710.5119]</a><br />
<strong>Determination of the Dark Matter profile from the EGRET excess of diffuse Galactic gamma radiation</strong> by Markus Weber</li>
<li><a href="http://arxiv.org/abs/0710.5171">[astro-ph:0710.5171]</a><br />
<strong>Systematic Bias in Cosmic Shear: Beyond the Fisher Matrix</strong>  by A.Amara and A. Refregier</li>
<li><a href="http://arxiv.org/abs/0710.5560">[astro-ph:0710.5560]</a><br />
<strong>Principal Component Analysis of the Time- and Position-Dependent Point Spread Function of the Advanced Camera for Surveys</strong> by M.J. Jee et al.</li>
<li><a href="http://arxiv.org/abs/0710.5637">[astro-ph:0710.5637]</a><br />
<strong>A method of open cluster membership determination </strong> by G. Javakhishvili et al.</li>
<li><a href="http://arxiv.org/abs/0710.5670">[stat.CO:0710.5670]</a><br />
<strong>An Elegant Method for Generating Multivariate Poisson Data </strong> by I. Yahav and G.Shmueli</li>
<li><a href="http://arxiv.org/abs/0710.5788">[astro-ph:0710.5788]</a><br />
<strong>Variations in Stellar Clustering with Environment: Dispersed Star  Formation and the Origin of Faint Fuzzies</strong> by B. G. Elmegreen</li>
<li><a href="http://arxiv.org/abs/0710.5749">[math.ST:0710.5749]</a><br />
<strong>On the Laplace transform of some quadratic forms and the exact distribution of the sample variance from a gamma or uniform parent distribution</strong> by T.Royen</li>
<li><a href="http://arxiv.org/abs/0710.5797">[math.ST:0710.5797]</a><br />
<strong>The Distribution of Maxima of Approximately Gaussian Random Fields </strong> by Y. Nardi, D.Siegmund and B.Yakir</li>
<li><a href="http://arxiv.org/abs/0711.0177">[astro-ph:0711.0177]</a><br />
<strong>Maximum Likelihood Method for Cross Correlations with Astrophysical Sources</strong> by R.Jansson and G. R. Farrar</li>
<li><a href="http://arxiv.org/abs/0711.0198">[stat.ME:0711.0198]</a><br />
<strong>A Geometric Approach to Confidence Sets for Ratios: Fieller&#8217;s Theorem, Generalizations, and Bootstrap</strong> by U. von Luxburg and V. H. Franz</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2007/arxiv-1st-week-nov-2007/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>[ArXiv] Poisson Mixture, Aug. 16, 2007</title>
		<link>http://groundtruth.info/AstroStat/slog/2007/arxiv-poisson-mixture/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2007/arxiv-poisson-mixture/#comments</comments>
		<pubDate>Fri, 17 Aug 2007 22:15:57 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[Frequentist]]></category>
		<category><![CDATA[Stat]]></category>
		<category><![CDATA[arXiv]]></category>
		<category><![CDATA[confidence interval]]></category>
		<category><![CDATA[I.J.Good]]></category>
		<category><![CDATA[maximum likelihood]]></category>
		<category><![CDATA[mixture models]]></category>
		<category><![CDATA[Poisson]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/2007/arxiv-poisson-mixture-aug-16-2007/</guid>
		<description><![CDATA[From arxiv/math.st:0708.2153v1
Estimating the number of classes by Mao and Lindsay
This study could be linked to identifying the number of lines in Poisson-distributed X-ray count data, one of the key interests of astronomers. However, as the authors point out, estimating the number of classes is a difficult statistical problem. I.J.Good[1] said that
 I don&#8217;t believe [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://arxiv.org/abs/0708.2153">arxiv/math.st:0708.2153v1</a><br />
<strong>Estimating the number of classes</strong> by Mao and Lindsay</p>
<p>This study could be linked to identifying the number of lines in Poisson-distributed X-ray count data, one of the key interests of astronomers. However, as the authors point out, estimating the number of classes is a difficult statistical problem. I.J.Good<sup>[1]</sup> said that</p>
<blockquote><p> I don&#8217;t believe it is usually possible to estimate the number of species, but only an appropriate lower bound to that number. This is because there is nearly always a good chance that there are a very large number of extremely rare species.</p></blockquote>
<p><span id="more-115"></span><br />
The authors have been working on Poisson mixture models for genetic data. I wonder if anything could be extracted for astronomical applications. Beyond line identification, Poisson mixture models also shed light on coverage problems. Summarizing the body of the paper without mathematical equations seems impossible, so only the abstract is reproduced here.</p>
<p><em>Abstract:</em><br />
Estimating the unknown number of classes in a population has numerous important applications. In a Poisson mixture model, the problem is reduced to estimating the odds that a class is undetected in a sample. The discontinuity of the odds prevents the existence of locally unbiased and informative estimators and restricts confidence intervals to be one-sided. Confidence intervals for the number of classes are also necessarily one-sided. A sequence of lower bounds to the odds is developed and used to define pseudo maximum likelihood estimators for the number of classes.</p>
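<p>To make the phrase &#8220;estimating the odds that a class is undetected&#8221; concrete, here is a minimal sketch of the simplest special case: every class is assumed to be detected with the same Poisson rate, so the mixture has a single component. This is not the pseudo maximum likelihood estimator of Mao and Lindsay; the function names and the bisection solver are illustrative assumptions. The rate is fitted to the zero-truncated counts, the odds of being undetected follow from it, and the number of classes is the observed number inflated by those odds.</p>

```python
import math


def truncated_poisson_rate(mean_observed, tol=1e-12):
    """Solve lam / (1 - exp(-lam)) = mean_observed for lam by bisection.

    This is the maximum likelihood rate of a zero-truncated Poisson sample
    (hypothetical helper; assumes one common rate for all classes).
    """
    if mean_observed <= 1.0:
        # Every detected class was seen exactly once: the fitted rate tends
        # to zero and the estimated number of classes is unbounded.
        raise ValueError("mean of observed counts must exceed 1")
    lo, hi = 1e-9, mean_observed  # the root lies strictly between 0 and the mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # -expm1(-mid) equals 1 - exp(-mid), computed accurately for small mid
        if mid / -math.expm1(-mid) < mean_observed:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_classes(counts):
    """Return (estimated number of classes, odds that a class is undetected).

    `counts` holds the number of detections for each observed class,
    so every entry is at least 1.
    """
    n_observed = len(counts)
    lam = truncated_poisson_rate(sum(counts) / n_observed)
    p_zero = math.exp(-lam)          # probability that a class is never detected
    odds = p_zero / (1.0 - p_zero)   # odds of being undetected
    return n_observed * (1.0 + odds), odds


if __name__ == "__main__":
    counts = [1, 1, 2, 2, 3, 3, 4]   # made-up detections per observed class
    n_hat, odds = estimate_classes(counts)
    print(f"observed classes: {len(counts)}, estimated total: {n_hat:.2f}, odds: {odds:.3f}")
```

<p>When every observed class is detected exactly once, the sketch refuses to return an estimate because the fitted rate collapses to zero, which echoes Good&#8217;s warning about a very large number of extremely rare species.</p>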
<ol class="footnotes"><li id="footnote_0_115" class="footnote">courtesy of the paper: Estimating the number of species: A review by Bunge and Fitzpatrick (1993), JASA, 88, 364-373.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2007/arxiv-poisson-mixture/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Coverage issues in exponential families</title>
		<link>http://groundtruth.info/AstroStat/slog/2007/interval-estimation-in-exponential-families/</link>
		<comments>http://groundtruth.info/AstroStat/slog/2007/interval-estimation-in-exponential-families/#comments</comments>
		<pubDate>Thu, 16 Aug 2007 20:36:51 +0000</pubDate>
		<dc:creator>hlee</dc:creator>
				<category><![CDATA[Stat]]></category>
		<category><![CDATA[Uncertainty]]></category>
		<category><![CDATA[arXiv]]></category>
		<category><![CDATA[bias]]></category>
		<category><![CDATA[binomial]]></category>
		<category><![CDATA[coverage]]></category>
		<category><![CDATA[Edgeworth expansion]]></category>
		<category><![CDATA[exponential family]]></category>
		<category><![CDATA[gamma]]></category>
		<category><![CDATA[Gehrels]]></category>
		<category><![CDATA[interval]]></category>
		<category><![CDATA[Jeffreys]]></category>
		<category><![CDATA[likelihood ratio]]></category>
		<category><![CDATA[negative binomial]]></category>
		<category><![CDATA[normal]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[Rao score]]></category>
		<category><![CDATA[Wald]]></category>

		<guid isPermaLink="false">http://groundtruth.info/AstroStat/slog/2007/interval-estimation-in-exponential-families/</guid>
		<description><![CDATA[I&#8217;ve heard so much about coverage problems from astrostat/phystat groups, without knowing the fundamental reasons (most likely physics). This paper might be of interest to them: Interval Estimation in Exponential Families by Brown, Cai, and DasGupta; Statistica Sinica (2003), 13, pp. 19-49
Abstract summary:
The authors investigated issues in interval estimation of the mean in the exponential [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve heard so much about coverage problems from astrostat/phystat groups, without knowing the fundamental reasons (most likely physics). This paper might be of interest to them: <a href="http://www3.stat.sinica.edu.tw/statistica/oldpdf/A13n12.pdf">Interval Estimation in Exponential Families</a> by Brown, Cai, and DasGupta; Statistica Sinica (2003), <strong>13</strong>, pp. 19-49</p>
<p><em>Abstract summary:</em><br />
The authors investigated issues in interval estimation of the mean in exponential families: the binomial, Poisson, negative binomial, normal, gamma, and a sixth distribution (NEF-GHS). The Wald interval is known to perform poorly, with significant negative bias in coverage, not only in discrete cases but also in nonnormal continuous cases. Their computations suggest that the equal-tailed Jeffreys interval and the likelihood ratio interval are the best alternatives to the Wald interval. <span id="more-110"></span></p>
<p><em>Brief summary of the paper without equations:</em><br />
The objective of this paper is interval estimation of the mean in the natural exponential family (NEF) with quadratic variance functions (QVF), with particular focus on the discrete NEF-QVF families: the binomial, negative binomial, and Poisson distributions. It is well known that the Wald interval for a binomial proportion suffers from a systematic negative bias and oscillation in its coverage probability, even for large n and p near 0.5, which seems to arise from the lattice nature and the skewness of the binomial distribution. The authors demonstrate the same systematic bias and oscillation in Poisson cases to illustrate the poor and erratic behavior of the Wald interval in lattice problems. They prove bias expressions for the three discrete NEF-QVF distributions and add a disconcerting graphical illustration of this negative bias.</p>
<p>Interested readers should check Figure 4, which compares the performance of the Wald, score, likelihood ratio (LR), and Jeffreys intervals. Figure 5 illustrates the limits of those four intervals: the LR and Jeffreys intervals are indistinguishable. The authors derive the coverage probabilities of the four intervals via Edgeworth expansions and study the nonoscillating O(n<sup>-1</sup>) terms of these expansions to compare coverage properties. Figure 6 shows that the Wald interval has a serious negative bias, whereas the nonoscillating term of the score interval is positive for all three discrete distributions (binomial, negative binomial, and Poisson). The negative bias of the Wald interval also appears in continuous distributions such as the normal, gamma, and NEF-GHS distributions (Figure 7).</p>
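<p>The coverage gap between the Wald and score intervals can be checked numerically for the binomial case. The sketch below is not taken from the paper: it simply sums the binomial probabilities of all outcomes whose interval contains the true p, which gives the exact coverage probability. The function names, the choice of n = 20 and p = 0.1, and the use of the Wilson form of the score interval are illustrative assumptions.</p>

```python
import math

Z_95 = 1.959964  # standard normal quantile for a nominal 95% two-sided interval


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def wald_interval(k, n, z=Z_95):
    """Wald interval: estimate plus or minus z times the plug-in standard error."""
    p_hat = k / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(k, n, z=Z_95):
    """Rao score interval for a binomial proportion (the Wilson interval)."""
    p_hat = k / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


def coverage(interval, n, p, z=Z_95):
    """Exact coverage: total probability of outcomes whose interval contains p."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = interval(k, n, z)
        if lower <= p <= upper:
            total += binom_pmf(k, n, p)
    return total


if __name__ == "__main__":
    n, p = 20, 0.1  # illustrative values, not from the paper
    print(f"Wald  coverage: {coverage(wald_interval, n, p):.3f}")
    print(f"Score coverage: {coverage(score_interval, n, p):.3f}")
```

<p>For these illustrative values the Wald interval covers the true p in roughly 88% of samples against a nominal 95%, while the score interval covers it in roughly 96%. Scanning p over a fine grid with the same <code>coverage</code> function reproduces the oscillation described above.</p>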
<p>In conclusion, they reconfirm that the LR and Jeffreys intervals are the best alternatives to the Wald interval in terms of both negative coverage bias and length. The Rao score interval has the merit of easy presentation, but its performance is inferior to that of the LR and Jeffreys intervals, although it is better than the Wald interval. Still, the authors leave room for users: choosing among these intervals is a personal choice.</p>
<p><em>[Addendum] I wonder if the statistical properties of <a href="http://adsabs.harvard.edu/cgi-bin/bib_query?1986ApJ...303..336G">Gehrels&#8217; confidence limits</a> have been studied since their publication. I&#8217;ll try to post findings about the statistics of Gehrels&#8217; confidence limits shortly (hopefully).</em></p>
]]></content:encoded>
			<wfw:commentRss>http://groundtruth.info/AstroStat/slog/2007/interval-estimation-in-exponential-families/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

