Normalizing constant




The concept of a normalizing constant arises in probability theory and a variety of other areas of mathematics. In probability, the normalizing constant is used to rescale a non-negative function into a probability density function with total probability one.




Contents






  • 1 Definition and examples


  • 2 Bayes' theorem


  • 3 Non-probabilistic uses


  • 4 Notes


  • 5 References





Definition and examples


In probability theory, a normalizing constant is a constant by which an everywhere non-negative function must be multiplied so the area under its graph is 1, e.g., to make it a probability density function or a probability mass function.[1][2] For example, if we define


$$p(x) = e^{-x^2/2}, \quad x \in (-\infty, \infty)$$

we have


$$\int_{-\infty}^{\infty} p(x)\,dx = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.$$

If we now define a function $\varphi(x)$ as


$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\,p(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

so that


$$\int_{-\infty}^{\infty} \varphi(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = 1$$

then the function $\varphi(x)$ is a probability density function.[3] This is the density of the standard normal distribution. (Standard, in this case, means the expected value is 0 and the variance is 1.)


The constant $\frac{1}{\sqrt{2\pi}}$ is the normalizing constant of the function $p(x)$.
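The Gaussian example can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper `integrate` (a composite trapezoidal rule) and the truncation of the infinite range to [-10, 10] are assumptions made for illustration, chosen because the tails of the function beyond that interval are negligible.

```python
import math


def p(x: float) -> float:
    """Unnormalized function p(x) = exp(-x^2 / 2)."""
    return math.exp(-x * x / 2.0)


def integrate(f, a: float, b: float, n: int = 20_000) -> float:
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Assumption: the infinite range is truncated to [-10, 10].
area = integrate(p, -10.0, 10.0)

# The normalizing constant is the reciprocal of the area under p.
normalizing_constant = 1.0 / area


def phi(x: float) -> float:
    """Normalized density: the normalizing constant times p(x)."""
    return normalizing_constant * p(x)


print("area under p:       ", area)
print("sqrt(2*pi):         ", math.sqrt(2.0 * math.pi))
print("area under phi:     ", integrate(phi, -10.0, 10.0))
```

The computed area should agree closely with the square root of 2π, and the area under the rescaled function should be 1 up to rounding error.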


Similarly,


$$\sum_{n=0}^{\infty} \frac{\lambda^n}{n!} = e^{\lambda},$$

and consequently


$$f(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$

is a probability mass function on the set of all nonnegative integers.[4] This is the probability mass function of the Poisson distribution with expected value λ.
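The same check works for the Poisson case. This sketch assumes an illustrative value λ = 3.5 and truncates the infinite sum at 100 terms, which is more than enough for the remaining terms to be negligible at this value of λ; both choices are assumptions for the example only.

```python
import math


def poisson_pmf(n: int, lam: float) -> float:
    """f(n) = lam^n * exp(-lam) / n!"""
    return lam**n * math.exp(-lam) / math.factorial(n)


lam = 3.5      # illustrative value, not taken from the text
terms = 100    # truncation point for the infinite sums

# Sum of the unnormalized terms lam^n / n!, which should approach exp(lam).
unnormalized_sum = sum(lam**n / math.factorial(n) for n in range(terms))

# After multiplying by the normalizing constant exp(-lam), the total is 1.
total_probability = sum(poisson_pmf(n, lam) for n in range(terms))

# The expected value of the distribution should equal lam.
mean = sum(n * poisson_pmf(n, lam) for n in range(terms))

print("sum of lam^n / n!:", unnormalized_sum, "vs exp(lam):", math.exp(lam))
print("total probability:", total_probability)
print("mean:             ", mean)
```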


Note that if the probability density function is a function of various parameters, so too will be its normalizing constant. The parametrised normalizing constant for the Boltzmann distribution plays a central role in statistical mechanics. In that context, the normalizing constant is called the partition function.
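To illustrate how a normalizing constant depends on parameters, the sketch below computes the partition function of a Boltzmann distribution over a small discrete set of states, using the standard weight exp(-E/kT). The energy levels and the temperatures are hypothetical values chosen only for illustration, and the function names are not taken from any library.

```python
import math


def partition_function(energies, kT):
    """Z(kT) = sum over states of exp(-E / kT)."""
    return sum(math.exp(-e / kT) for e in energies)


def boltzmann_probabilities(energies, kT):
    """Probabilities exp(-E / kT) / Z for each state."""
    z = partition_function(energies, kT)
    return [math.exp(-e / kT) / z for e in energies]


# Hypothetical energy levels, expressed in the same units as kT.
energies = [0.0, 1.0, 2.0]

for kT in (0.5, 1.0, 5.0):
    z = partition_function(energies, kT)
    probs = boltzmann_probabilities(energies, kT)
    # Z changes with the parameter kT, but the probabilities always sum to 1.
    print(f"kT={kT}: Z={z:.4f}, probabilities={probs}, sum={sum(probs):.6f}")
```

The normalizing constant 1/Z differs for each value of kT, while the normalized probabilities sum to one in every case.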



Bayes' theorem


Bayes' theorem says that the posterior probability measure is proportional to the product of the prior probability measure and the likelihood function. "Proportional to" implies that one must multiply or divide by a normalizing constant to assign measure 1 to the whole space, i.e., to get a probability measure. In a simple discrete case we have


$$P(H_0 \mid D) = \frac{P(D \mid H_0)\,P(H_0)}{P(D)}$$

where $P(H_0)$ is the prior probability that the hypothesis is true; $P(D \mid H_0)$ is the conditional probability of the data given that the hypothesis is true, but once the data are known it is read as the likelihood of the hypothesis (or its parameters) given the data; and $P(H_0 \mid D)$ is the posterior probability that the hypothesis is true given the data. $P(D)$ is the probability of producing the data, but on its own it is difficult to calculate, so an alternative way to describe this relationship is as one of proportionality:


$$P(H_0 \mid D) \propto P(D \mid H_0)\,P(H_0).$$

Since P(H|D) is a probability, the sum over all possible (mutually exclusive) hypotheses should be 1, leading to the conclusion that


$$P(H_0 \mid D) = \frac{P(D \mid H_0)\,P(H_0)}{\sum_i P(D \mid H_i)\,P(H_i)}.$$

In this case, the reciprocal of the value


$$P(D) = \sum_i P(D \mid H_i)\,P(H_i)$$

is the normalizing constant.[5] It can be extended from countably many hypotheses to uncountably many by replacing the sum by an integral.
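The discrete case can be sketched in a few lines. The priors and likelihoods below are made-up numbers for three mutually exclusive hypotheses, and the function name `posterior` is a hypothetical helper, not part of any library.

```python
def posterior(priors, likelihoods):
    """Return (posterior probabilities, P(D)) for mutually exclusive hypotheses.

    priors[i]      is P(H_i)
    likelihoods[i] is P(D | H_i)
    """
    # Unnormalized posterior: likelihood times prior for each hypothesis.
    unnormalized = [l * p for l, p in zip(likelihoods, priors)]
    # P(D) is the sum of these products; its reciprocal is the normalizing constant.
    evidence = sum(unnormalized)
    if evidence == 0:
        raise ValueError("the data have zero probability under every hypothesis")
    return [u / evidence for u in unnormalized], evidence


# Made-up example values (assumptions for illustration only).
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.7]

post, evidence = posterior(priors, likelihoods)
print("P(D):", evidence)
print("posterior:", post)
print("sum of posterior:", sum(post))
```

Dividing by the sum guarantees that the posterior probabilities add up to one without computing P(D) by any separate route.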



Non-probabilistic uses


The Legendre polynomials are characterized by orthogonality with respect to the uniform measure on the interval [−1, 1] and the fact that they are normalized so that their value at 1 is 1. The constant by which one multiplies a polynomial so its value at 1 is 1 is a normalizing constant.
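As an illustration of this normalization, the sketch below builds the first few Legendre polynomials by applying Gram-Schmidt orthogonalization to the monomials 1, x, x², ... and then multiplying each result by the constant that makes its value at 1 equal to 1. Gram-Schmidt is one possible construction chosen for the example, not the only one; polynomials are stored as coefficient lists (lowest degree first), and all function names are hypothetical.

```python
from fractions import Fraction


def inner(p, q):
    """Integral over [-1, 1] of p(x) * q(x) dx, computed exactly."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total


def value_at_one(p):
    """p(1) is the sum of the coefficients."""
    return sum(p)


def legendre(max_degree):
    """Legendre polynomials of degree 0..max_degree, normalized so P_n(1) = 1."""
    basis = []
    for n in range(max_degree + 1):
        poly = [Fraction(0)] * n + [Fraction(1)]  # the monomial x^n
        # Gram-Schmidt: remove the components along the earlier polynomials.
        for q in basis:
            coeff = inner(poly, q) / inner(q, q)
            padded = q + [Fraction(0)] * (len(poly) - len(q))
            poly = [a - coeff * b for a, b in zip(poly, padded)]
        # Normalizing constant: the reciprocal of the value at x = 1.
        scale = 1 / value_at_one(poly)
        basis.append([scale * a for a in poly])
    return basis


polys = legendre(3)
for n, poly in enumerate(polys):
    print(f"P_{n} coefficients (lowest degree first):", [str(c) for c in poly])
```

With this convention the degree-2 polynomial comes out as (3x² − 1)/2 and the degree-3 polynomial as (5x³ − 3x)/2.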


Orthonormal functions are normalized such that


$$\langle f_i, f_j \rangle = \delta_{i,j}$$

with respect to some inner product $\langle f, g \rangle$.
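A small numerical sketch of this condition follows. It assumes, purely for illustration, the inner product given by the integral of f(x)g(x) over [0, π] and the functions sin(kx) for k = 1, 2, 3, which are orthogonal under that inner product; each is multiplied by a normalizing constant so that its inner product with itself is 1. The helper names are hypothetical.

```python
import math


def inner(f, g, a=0.0, b=math.pi, n=20_000):
    """Approximate the integral of f(x) * g(x) over [a, b] (trapezoidal rule)."""
    h = (b - a) / n
    total = 0.5 * (f(a) * g(a) + f(b) * g(b))
    for i in range(1, n):
        x = a + i * h
        total += f(x) * g(x)
    return total * h


def normalized_sine(k):
    """Return (constant, function) where function = constant * sin(k x) has norm 1."""
    def raw(x):
        return math.sin(k * x)

    # Normalizing constant: reciprocal of the norm sqrt(<raw, raw>).
    constant = 1.0 / math.sqrt(inner(raw, raw))
    return constant, (lambda x: constant * raw(x))


pairs = [normalized_sine(k) for k in (1, 2, 3)]
constants = [c for c, _ in pairs]
basis = [f for _, f in pairs]

# Gram matrix of inner products; it should be close to the identity matrix.
gram = [[inner(f, g) for g in basis] for f in basis]

print("normalizing constants:", constants)
for row in gram:
    print([round(v, 6) for v in row])
```

For these functions every normalizing constant should come out close to the square root of 2/π, since the integral of sin²(kx) over [0, π] is π/2.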


The constant 1/2 is used to establish the hyperbolic functions cosh and sinh from the lengths of the adjacent and opposite sides of a hyperbolic triangle.



Notes





  1. ^ Continuous Distributions at University of Alabama.


  2. ^ Feller, 1968, p. 22.


  3. ^ Feller, 1968, p. 174.


  4. ^ Feller, 1968, p. 156.


  5. ^ Feller, 1968, p. 124.




References




  • Continuous Distributions at Department of Mathematical Sciences: University of Alabama in Huntsville


  • Feller, William (1968). An Introduction to Probability Theory and its Applications (volume I). John Wiley & Sons. ISBN 0-471-25708-7.



