14 Jan, 2016
Decision Theory
Notation:
- a∈A: a is an “action” and A is the set of all possible actions
- L(θ,a): loss function for θ∈Θ and a∈A, L(θ,a)≥−K>−∞
- Quadratic loss: (θ−a)²
- Absolute loss: |θ−a|
- 0-1 loss: I(θ≠a)
- δ(x): decision rule, function from X to A, X is the sample space
Example:
- Drug company needs to market a new pain killer:
- θ: proportion of the market that the drug will capture
- θ∈Θ=[0,1], a∈A=[0,1]
- L(θ,a)=θ−a if θ≥a
- L(θ,a)=2(a−θ) if θ<a
- Survey: x people out of n responded “yes”
- possible model x|n∼Bin(n,θ)
- possible decision rule: δ(x)=x/n
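The loss and the decision rule from this example can be written down directly. The following is a minimal sketch; the function names and the survey numbers (12 "yes" answers out of 100) are illustrative assumptions, not part of the notes.

```python
def drug_loss(theta, a):
    """Loss from the drug example: overestimating the share costs twice as much."""
    if theta >= a:
        return theta - a          # underestimate: lost sales
    return 2.0 * (a - theta)      # overestimate: penalized at twice the rate


def sample_proportion(x, n):
    """Decision rule delta(x) = x / n."""
    return x / n


# Hypothetical survey result (assumed numbers, for illustration only).
a = sample_proportion(12, 100)

# Loss incurred by this action under two possible true market shares.
loss_if_under = drug_loss(0.15, a)   # true share above the estimate
loss_if_over = drug_loss(0.10, a)    # true share below the estimate
print(a, loss_if_under, loss_if_over)
```

The same absolute error of 0.02 or 0.03 is charged differently depending on its direction, which is what makes the loss asymmetric.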
- Classical Decision Theory
- Def: The risk function of a decision rule δ(x) is
- $R(\theta,\delta)=\int_X L(\theta,\delta(x))\,p(x\mid\theta)\,dx$
- We say that $\delta_1(x)$ is “better” than $\delta_2(x)$ if $R(\theta,\delta_1)\le R(\theta,\delta_2)$ for all θ, with strict inequality for at least one θ.
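- How do we pick an estimator $\hat\theta$ for θ?
- Choose the rule $\hat\theta=\delta(x)$ that minimizes R.
- Typically, we need to restrict the class of decision rules to get an optimum, for example to unbiased or linear rules.
- Otherwise, different rules minimize the risk at different values of θ, so there is no single minimizer.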
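To see why an unrestricted search rarely produces a single best rule, the risk can be evaluated numerically for the binomial model of the drug example. This is a sketch under assumptions: the sample size n = 20 and the second rule (x+1)/(n+2) are hypothetical choices made only for comparison.

```python
from math import comb


def drug_loss(theta, a):
    """Asymmetric loss from the drug example."""
    return theta - a if theta >= a else 2.0 * (a - theta)


def risk(theta, rule, n):
    """R(theta, delta) = sum over x of L(theta, delta(x)) * P(x | theta), x ~ Bin(n, theta)."""
    total = 0.0
    for x in range(n + 1):
        pmf = comb(n, x) * theta**x * (1.0 - theta) ** (n - x)
        total += drug_loss(theta, rule(x, n)) * pmf
    return total


def sample_proportion(x, n):
    return x / n


def shrunk_proportion(x, n):
    # Hypothetical alternative rule that pulls the estimate toward 1/2.
    return (x + 1) / (n + 2)


n = 20  # assumed sample size
for theta in (0.05, 0.15, 0.5, 0.9):
    r1 = risk(theta, sample_proportion, n)
    r2 = risk(theta, shrunk_proportion, n)
    print(f"theta={theta:.2f}  R(x/n)={r1:.4f}  R((x+1)/(n+2))={r2:.4f}")
```

In this setup x/n has the smaller risk at θ = 0.05 while (x+1)/(n+2) has the smaller risk at θ = 0.5, so neither rule is “better” than the other in the sense defined above.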
- Bayesian Decision Theory
- θ is an unknown random variable
- x is the observed data
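- Def: Let $\pi^*(\theta)$ be the pdf of θ at the time of decision making. The Bayesian expected loss for an action a is $\rho(\pi^*,a)=\int_\Theta L(\theta,a)\,\pi^*(\theta)\,d\theta=E_{\pi^*}[L(\theta,a)]$
- $E_{x\mid\theta}[L(\theta,\delta(x))]=R(\theta,\delta)$
- $E_{\theta}[L(\theta,a)]=\rho(\pi,a)$, where π is the prior
- $E_{\theta\mid x}[L(\theta,a)]=\rho(p,a)$, where $p=p(\theta\mid x)$ is the posterior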
- Bayesian decision principle: choose a∈A that minimizes ρ(π∗,a). This action is called a Bayes action.
- Example (Drug Company):
- Assume:
- π(θ)=10 if .1<θ<.2, 0 otherwise (the uniform density on (.1,.2))
- if no data: $\rho(\pi,a)=\int_0^1 L(\theta,a)\,\pi(\theta)\,d\theta$
- $=\int_0^a 2(a-\theta)\,\pi(\theta)\,d\theta+\int_a^1 (\theta-a)\,\pi(\theta)\,d\theta$
- case 1: a≤.1⇒ρ(π,a)=.15−a, minimum at a=.1
- case 2: .1<a<.2⇒a=2/15 is optimal and ρ(π,2/15)=.03
- case 3: a≥.2⇒ρ(π,a)=2a−.3, optimal at a=.2. So ρ(π,.2)=.1
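The three cases come from splitting the integral at a. A short worked version follows, assuming the prior is the uniform density on (0.1, 0.2), that is π(θ)=10 on that interval, which is the value the stated results imply:

$$
\begin{aligned}
a\le 0.1:&\quad \rho(\pi,a)=\int_{0.1}^{0.2}(\theta-a)\,10\,d\theta = 0.15-a,\\
0.1<a<0.2:&\quad \rho(\pi,a)=\int_{0.1}^{a}2(a-\theta)\,10\,d\theta+\int_{a}^{0.2}(\theta-a)\,10\,d\theta = 10(a-0.1)^2+5(0.2-a)^2,\\
&\quad \frac{d\rho}{da}=20(a-0.1)-10(0.2-a)=30a-4=0\ \Rightarrow\ a=\tfrac{2}{15},\\
&\quad \rho\!\left(\pi,\tfrac{2}{15}\right)=10\left(\tfrac{1}{30}\right)^2+5\left(\tfrac{1}{15}\right)^2=\tfrac{1}{90}+\tfrac{2}{90}=\tfrac{1}{30}\approx 0.033,\\
a\ge 0.2:&\quad \rho(\pi,a)=\int_{0.1}^{0.2}2(a-\theta)\,10\,d\theta = 2a-0.3.
\end{aligned}
$$

Comparing the three minima (0.05 at a=0.1, about 0.033 at a=2/15, and 0.1 at a=0.2), the Bayes action with no data is a=2/15.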
- The posterior expected loss of an action a∈A is $\rho(p(\theta\mid x),a)=\int_\Theta L(\theta,a)\,p(\theta\mid x)\,d\theta$
We could have a situation like this: [figure: picture1]
Minimizing $\rho(p(\theta\mid x),a)=\int_\Theta L(\theta,a)\,p(\theta\mid x)\,d\theta$
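- Suppose you obtain a Bayes action by minimizing $\rho(p(\theta\mid x),a)$ for each x; denote it $\delta^{p(\theta\mid x)}$, written $\hat\delta$ below.
- We can compute the “Bayes risk” $=E_x[\rho(p(\theta\mid x),\hat\delta)]=\int_X p(x)\int_\Theta L(\theta,\hat\delta(x))\,p(\theta\mid x)\,d\theta\,dx=\int_X\int_\Theta L(\theta,\hat\delta(x))\,p(x\mid\theta)\,\pi(\theta)\,d\theta\,dx$
- $=\int_\Theta R(\theta,\hat\delta)\,\pi(\theta)\,d\theta=E_\theta[R(\theta,\hat\delta)]$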
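Minimizing the posterior expected loss usually has to be done numerically. The sketch below does it on a grid for the drug example, assuming a uniform prior on (0.1, 0.2) and hypothetical survey data (9 "yes" out of 50); the grid sizes and the helper name `bayes_action` are illustrative choices, not part of the notes.

```python
import numpy as np


def drug_loss(theta, a):
    """Asymmetric loss from the drug example, vectorized over theta."""
    return np.where(theta >= a, theta - a, 2.0 * (a - theta))


def bayes_action(weights, theta_grid, actions):
    """Return (action, expected loss) minimizing the expected loss under the given weights."""
    w = weights / weights.sum()                       # normalize to a discrete distribution
    expected = np.array([np.sum(drug_loss(theta_grid, a) * w) for a in actions])
    i = int(np.argmin(expected))
    return actions[i], expected[i]


theta_grid = np.linspace(0.1, 0.2, 2001)   # support of the assumed uniform prior
actions = np.linspace(0.0, 1.0, 3001)      # candidate actions in A = [0, 1]
prior = np.ones_like(theta_grid)

# Hypothetical data: x "yes" answers out of n respondents.
n, x = 50, 9
likelihood = theta_grid**x * (1.0 - theta_grid) ** (n - x)
posterior = prior * likelihood             # unnormalized posterior on the grid

a_prior, rho_prior = bayes_action(prior, theta_grid, actions)
a_post, rho_post = bayes_action(posterior, theta_grid, actions)
print("no data:  ", a_prior, rho_prior)
print("with data:", a_post, rho_post)
```

With no data the grid search recovers an action close to 2/15. With these assumed data the likelihood puts more weight on the upper part of (0.1, 0.2), so the Bayes action moves upward.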
Recap:
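| | Frequentist | Bayesian |
|---|---|---|
| Estimator | $\delta(x)$ | $\delta^\pi(x)$ |
| Likelihood | $p(x\mid\theta)$ | $p(x\mid\theta)$ |
| Prior | NA | $\pi(\theta)$ |
| Risk | $R(\theta,\delta)=E_{x\mid\theta}[L(\theta,\delta)]$ | $\rho(p(\theta\mid x),\delta^\pi(x))=E_{\theta\mid x}[L(\theta,\delta^\pi(x))]$ |
| Bayes Risk | $E_\theta[R(\theta,\delta)]$ | $E_x[\rho(p(\theta\mid x),\delta^\pi(x))]$ |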
The two Bayes risks are equivalent, and both equal $r(\pi,\delta)=E_{\theta,x}[L(\theta,\delta)]=\int_X\int_\Theta L(\theta,\delta(x))\,f(x\mid\theta)\,\pi(\theta)\,d\theta\,dx$.
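The equivalence of the two routes to the Bayes risk can be checked numerically. This is a sketch under assumptions: the prior is discretized to 101 equally weighted points on [0.1, 0.2], the data are Bin(n, θ) with n = 20, and the rule is δ(x) = x/n; none of these specific values come from the notes.

```python
from math import comb

import numpy as np


def drug_loss(theta, a):
    """Asymmetric loss from the drug example, vectorized."""
    return np.where(theta >= a, theta - a, 2.0 * (a - theta))


n = 20                                            # assumed sample size
theta = np.linspace(0.1, 0.2, 101)                # discretized prior support
prior = np.full(theta.size, 1.0 / theta.size)     # equal prior weights
xs = np.arange(n + 1)

# lik[i, j] = P(x = xs[j] | theta[i]) under the binomial model
binom_coef = np.array([comb(n, int(x)) for x in xs], dtype=float)
lik = binom_coef * theta[:, None] ** xs * (1.0 - theta[:, None]) ** (n - xs)

# loss[i, j] = L(theta[i], delta(xs[j])) with delta(x) = x / n
loss = drug_loss(theta[:, None], xs[None, :] / n)

# Route 1: risk R(theta, delta) for each theta, then average over the prior.
risk = (loss * lik).sum(axis=1)
bayes_risk_from_risk = (risk * prior).sum()

# Route 2: posterior expected loss for each x, then average over the marginal of x.
joint = lik * prior[:, None]
marginal = joint.sum(axis=0)
posterior = joint / marginal
post_loss = (loss * posterior).sum(axis=0)
bayes_risk_from_posterior = (post_loss * marginal).sum()

print(bayes_risk_from_risk, bayes_risk_from_posterior)
```

Both routes sum the same terms L(θ,δ(x)) f(x|θ) π(θ) in a different order, so the two printed values agree up to floating-point error.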