On some assumptions of Null Hypothesis Statistical Testing (NHST)

Alexandre Galvão Patriota

Departamento de Estatística, Instituto de Matemática e Estatística

Universidade de São Paulo


Outline

1 Main Goals

2 The classical statistical model

3 Hypothesis testing

4 P-value definition and its limitations

5 An alternative measure of evidence and some of its properties

6 Numerical illustration

7 Final remarks

Main Goals

The main goals of this presentation are:

to discuss the classical statistical model and statistical hypotheses,

to present some limitations of the classical p-value with numerical examples,

to introduce an alternative measure of evidence, called the s-value, that overcomes some limitations of the p-value.

The classical statistical model

The classical statistical model is a triple (Ω, F, P), where:

Ω is the space of possible experiment outcomes,

F is a σ-field of subsets of Ω,

P is a family of non-random probability measures that possibly explain the experiment outcomes:

P = {P1, P2, . . . , Pj, . . .}, where each Pj : F → [0,1].

The quantity of interest is g(P). For instance:

g(P) = E_P(Z), g(P) = P(Z1 ∈ B | Z2 ∈ A), etc.

Remark: a random vector Z is a measurable function from (Ω, F) to (Z, B).

A particular model

Take Z = (X, γ), where X is the observable random vector and γ is the unobservable one.

Conditional, marginal and joint distributions can be used to make inferences about γ.

Take P = {P0} and build your joint probability P0 from:

γ ∼ f0(·) (with no unknown constants),    X | γ ∼ f1(· | γ).

Now, you are ready to be a hard-core Bayesian!

Hypothesis testing

Classical null hypotheses

Can we reduce the family P to a subfamily P0, where P0 ⊂ P?

The positive claim can be written by means of a null hypothesis:

H0: “at least one measure in P0 could generate the observed data”

(or simply H0: “P ∈ P0”)

Under a parametric model, there exists a finite-dimensional set Θ such that:

P ≡ {Pθ : θ ∈ Θ}, where Θ ⊆ R^p, p < ∞,

H0: θ ∈ Θ0, where Θ0 ⊂ Θ and P0 ≡ {Pθ : θ ∈ Θ0}.


Alternative hypotheses

According to Fisher, the negation of H0 cannot be expressed in terms of probability measures.

The alternative hypothesis H1 makes sense if we are certain about the family P: H1: “P ∈ (P − P0)”.

In this context, we can choose between H0 and H1, since they would be mutually exclusive and exhaustive — the Neyman–Pearson approach.


Bayesian hypotheses

Recall the hard-core Bayesian approach, where P = {P0} and Z = (X, γ).

The general null and alternative hypotheses are:

H0: “γ ∈ Γ0” and H1: “γ ∈ (Γ − Γ0)”.

The focus is not on the family of probability measures P, since P0 is given.

A classical statistician may also test Bayesian hypotheses. Rather than p-values, they would use estimated conditional probabilities.

P-value definition and its limitations

P-value definition

The p-value for testing the classical null hypothesis H0 is defined as follows:

p(P0, x) = sup_{P∈P0} P( T_H0(X) > T_H0(x) ),

where T_H0 is a statistic such that the more discrepant H0 is from x, the larger its observed value (e.g., T_H0 could be −2 log of the likelihood-ratio statistic).

p(P0, x) ≈ 0 indicates that even the best case in H0 assigns a small probability to events more “extreme” than the observed one.
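As a minimal computational sketch (an added illustration, not from the original slides): for the point null H0: µ = 0 in the bivariate normal example used later, P0 is a singleton, so the sup is trivial, and T_H0(X) ∼ χ²₂ under H0. The sample size n = 10 is an assumption that happens to reproduce the tables below.

```python
# Added sketch: p-value for the point null H0: mu = 0 in the bivariate normal
# model with identity covariance. P0 is a singleton, so the sup is trivial.
import numpy as np
from scipy.stats import chi2

def p_value_point_null(xbar, n):
    """p(P0, x) = P(T_H0(X) > t) with t = n * xbar' xbar and T_H0 ~ chi2(2)."""
    t = n * float(np.dot(xbar, xbar))  # observed -2 log likelihood-ratio
    return chi2.sf(t, df=2)            # survival function, i.e. 1 - F_2(t)

# n = 10 is an assumption; it reproduces the tables shown later.
print(round(p_value_point_null(np.array([0.05, -0.05]), n=10), 4))  # 0.9753
```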

P-value limitations

Consider two null hypotheses H0: “P ∈ P0” and H0′: “P ∈ P0′” such that H0 ⟹ H0′ (i.e., P0 ⊆ P0′). Then, we would expect that:

p(P0, x) ≤ p(P0′, x).

But this is not always the case!

The p-value defined above is not monotone over the set of null hypotheses.

Example: Bivariate Normal distribution

Let X = (X1, . . . , Xn) be a sample from a bivariate normal distribution with mean vector µ = (µ1, µ2)ᵀ and identity covariance matrix.

Notice that (−2 log of) the likelihood-ratio statistic under H0: µ = 0 is

T_H0(X) = n X̄ᵀX̄ ∼ χ²₂,

and under H0′: µ1 = µ2 it is

T_H0′(X) = (n/2)(X̄1 − X̄2)² ∼ χ²₁,

where X̄ = (X̄1, X̄2)ᵀ is the maximum likelihood estimator for µ.

P-values do not respect monotonicity

Observed sample (x̄1, x̄2)   x̄1 − x̄2   p-value (H0: µ = 0)   p-value (H0′: µ1 = µ2)
(0.05, −0.05)               0.1        0.9753                0.8231
(0.09, −0.11)               0.2        0.9039                0.6547
(0.14, −0.16)               0.3        0.7977                0.5023
(0.19, −0.21)               0.4        0.6697                0.3711
(0.23, −0.27)               0.5        0.5331                0.2636
(0.28, −0.32)               0.6        0.4049                0.1797
(0.33, −0.37)               0.7        0.2926                0.1175
(0.37, −0.43)               0.8        0.2001                0.0736
(0.42, −0.48)               0.9        0.1308                0.0442
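Although H0 ⟹ H0′, in every row the p-value for H0 is larger than the p-value for H0′ — the opposite of the expected ordering. A short script reproducing both columns (an added sketch; n = 10 is an assumption that matches the reported values):

```python
# Added sketch reproducing the table above; n = 10 is assumed.
from scipy.stats import chi2

n = 10
samples = [(0.05, -0.05), (0.09, -0.11), (0.14, -0.16), (0.19, -0.21),
           (0.23, -0.27), (0.28, -0.32), (0.33, -0.37), (0.37, -0.43),
           (0.42, -0.48)]

for x1, x2 in samples:
    t0 = n * (x1**2 + x2**2)      # T_H0  ~ chi2(2) under H0: mu = 0
    t1 = (n / 2) * (x1 - x2)**2   # T_H0' ~ chi2(1) under H0': mu1 = mu2
    p0, p1 = chi2.sf(t0, 2), chi2.sf(t1, 1)
    # H0 implies H0', yet p0 > p1 in every row: monotonicity is violated.
    print(f"({x1:5.2f},{x2:6.2f})  p(H0)={p0:.4f}  p(H0')={p1:.4f}")
```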

Level curves (contour curves)

[Figure: level curves of the two rejection regions at significance level 10% in the (µ1, µ2) plane, for H0: µ = 0 and for H0′: µ1 = µ2, together with the observed samples. There are samples for which H0′ is rejected but H0 is not, even though H0 ⟹ H0′.]
An alternative measure of evidence and some of its properties

An alternative measure of evidence (parametric case)

In what follows, we present an alternative measure called the s-value to overcome the previous issue (Patriota, 2013, Fuzzy Sets and Systems, 233).

The s-value is a function s : 2^Θ × X → [0,1] such that

s(Θ0, x) = sup{α ∈ (0,1) : Λα(x) ∩ Θ0 ≠ ∅},  if Θ0 ≠ ∅,
s(Θ0, x) = 0,                                 if Θ0 = ∅,

where Λα is a confidence set for θ with confidence level 1 − α and some “nice” properties.
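A minimal sketch of how this definition can be evaluated (an added illustration): when Λα(x) = {θ : T_θ(x) ≤ F⁻¹(1 − α)} for a continuous reference distribution F, the definition reduces to s(Θ0, x) = 1 − F(inf_{θ∈Θ0} T_θ(x)), so a numerical minimizer suffices. The bivariate normal setting and n = 10 below are assumptions matching the later tables.

```python
# Added sketch: s(Theta0, x) = 1 - F(inf over Theta0 of T_theta(x)), assuming
# Lambda_alpha(x) = {theta : T_theta(x) <= F^{-1}(1 - alpha)} with F = chi2(2).
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

n, xbar = 10, np.array([0.05, -0.05])  # assumed n and first observed sample

def T(mu):
    """-2 log likelihood-ratio at mu for the bivariate normal example."""
    return n * float(np.sum((xbar - mu) ** 2))

# Theta0 = {mu : mu1 = mu2}, parametrized by one scalar m -> (m, m).
res = minimize_scalar(lambda m: T(np.array([m, m])))
print(round(chi2.sf(res.fun, df=2), 4))  # 0.9753, as in the tables below
```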

Interpretation

Interpretation: s = s(Θ0, x) is the largest significance level α (equivalently, 1 − s is the smallest confidence level 1 − α) for which the confidence set and the set Θ0 have at least one element in common.

Large values of s indicate that there exists at least one element in Θ0 close to the center of Λα (e.g., close to the ML estimate).

Small values of s indicate that ALL elements of Θ0 are far away from the center of Λα.

Graphical illustration: s1 = s(Θ1, x) and s2 = s(Θ2, x)

[Figure: confidence sets Λs1 and Λs2 in the (θ1, θ2) plane, centered at θ̂, shown together with the sets Θ1, Θ2 ⊆ Θ; si is the largest α at which Λα still intersects Θi.]
Properties: the s-value is a possibility measure

1 s(∅, x) = 0 and s(Θ, x) = 1,

2 if Θ1 ⊆ Θ2, then s(Θ1, x) ≤ s(Θ2, x),

3 for any Θ1, Θ2 ⊆ Θ, s(Θ1 ∪ Θ2, x) = max{s(Θ1, x), s(Θ2, x)},

4 if Θ1 ⊆ Θ, then s(Θ1, x) = sup_{θ∈Θ1} s({θ}, x),

5 s(Θ1, x) = 1 or s(Θ1ᶜ, x) = 1: if θ̂ ∈ cl(Θ1) (the closure of Θ1), then s(Θ1, x) = 1; if θ̂ ∈ Θ1ᶜ, then s(Θ1ᶜ, x) = 1,

where θ̂ is an element of the center of Λα, i.e., θ̂ ∈ ⋂α Λα(x).
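A quick numerical sanity check of properties 2 and 3 (an added illustration; it assumes the closed form s(Θ0, x) = 1 − F₂(inf_{θ∈Θ0} T_θ(x)) from the bivariate normal example): enlarging Θ0 can only decrease the infimum of T_θ(x), which raises the s-value.

```python
# Added sketch: check monotonicity (property 2) and the max rule (property 3)
# on a grid, using s(Theta0, x) = 1 - F_2(min over Theta0 of T_theta(x)).
import numpy as np
from scipy.stats import chi2

n, xbar = 10, np.array([0.23, -0.27])  # assumed n; fifth sample of the tables

def s_value(theta_set):
    t_min = min(n * np.sum((xbar - th) ** 2) for th in theta_set)
    return chi2.sf(t_min, df=2)

grid = [np.array([a, b]) for a in np.linspace(-1, 1, 41)
                         for b in np.linspace(-1, 1, 41)]
theta1 = [th for th in grid if th[0] >= 0.5]   # a set Theta1
theta2 = [th for th in grid if th[1] >= 0.0]   # another set Theta2
union = theta1 + theta2

assert s_value(theta1) <= s_value(union)                       # property 2
assert np.isclose(s_value(union),
                  max(s_value(theta1), s_value(theta2)))       # property 3
```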

Decisions about H0

Let Φ be a function such that:

Φ(Θ0) = ⟨s(Θ0), s(Θ0ᶜ)⟩.

Then,

Φ(Θ0) = ⟨a, 1⟩ ⟹ rejection of H0 if a is “small” enough,

Φ(Θ0) = ⟨1, b⟩ ⟹ acceptance of H0 if b is “small” enough,

Φ(Θ0) = ⟨1, 1⟩ ⟹ total ignorance about H0.

(By property 5, at least one coordinate of Φ(Θ0) equals 1, so these three cases cover all possibilities.)

How to find the thresholds for a and b to decide about H0?

This is still an open problem.

We could try to find those thresholds via loss functions,

or via frequentist criteria by employing the following asymptotic property:

Property: if the statistical model is regular and the confidence region is built from a statistic Tθ(X) that converges in distribution to χ²k, then:

s_a = 1 − F_k( F_{H0}⁻¹(1 − p_a) ),

where p_a = 1 − F_{H0}(t) is the asymptotic p-value for testing H0 (F_k denotes the χ²k distribution function and F_{H0} the asymptotic distribution function of the test statistic under H0).
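A sketch of this conversion (an added illustration; the pairing of a χ²₁ test statistic inside a model of dimension k = 2 matches the bivariate normal example below):

```python
# Added sketch: s_a = 1 - F_k(F_H0^{-1}(1 - p_a)) for a chi2(1) test statistic
# inside a model with dim(Theta) = k = 2, as in the bivariate normal example.
from scipy.stats import chi2

def s_from_p(p_a, df_test=1, k=2):
    t = chi2.ppf(1 - p_a, df=df_test)  # recover the observed statistic
    return chi2.sf(t, df=k)            # evaluate it against chi2(k)

print(round(s_from_p(0.8231), 4))      # ~0.9753, as in the tables
```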

Numerical illustration

Example: Bivariate Normal distribution

Let X = (X1, . . . , Xn) be a sample from a bivariate normal distribution with mean vector µ = (µ1, µ2)ᵀ and identity covariance matrix.

Notice that (−2 log of) the likelihood-ratio statistic is

Tµ(x) = n(X̄ − µ)ᵀ(X̄ − µ) ∼ χ²₂.

The confidence set Λα is given by

Λα(x) = {µ ∈ R² : Tµ(x) ≤ F₂⁻¹(1 − α)},

where F₂ is the cumulative chi-squared distribution function with two degrees of freedom.
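With this Λα, the s-value for H0′: µ1 = µ2 has a closed form: Tµ(x) is minimized over the line µ1 = µ2 at the midpoint µ1 = µ2 = (x̄1 + x̄2)/2, giving inf Tµ(x) = (n/2)(x̄1 − x̄2)², so s = 1 − F₂((n/2)(x̄1 − x̄2)²). A sketch (added here; n = 10 is an assumption matching the tables):

```python
# Added sketch: closed-form s-value for H0': mu1 = mu2 under the confidence
# set above; the minimizer of T_mu on the line mu1 = mu2 is the midpoint.
from scipy.stats import chi2

def s_value_h0prime(x1bar, x2bar, n=10):     # n = 10 is an assumption
    t_inf = (n / 2) * (x1bar - x2bar) ** 2   # inf of T_mu over mu1 = mu2
    return chi2.sf(t_inf, df=2)              # 1 - F_2(t_inf)

print(round(s_value_h0prime(0.09, -0.11), 4))  # 0.9048, as in the table below
```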

Numerical illustration

Observed sample (x̄1, x̄2)   x̄1 − x̄2   H0: µ = 0 (p/s-value)   H0′: µ1 = µ2 (p-value)   H0′: µ1 = µ2 (s-value)
(0.05, −0.05)               0.1        0.9753                  0.8231                   0.9753
(0.09, −0.11)               0.2        0.9039                  0.6547                   0.9048
(0.14, −0.16)               0.3        0.7977                  0.5023                   0.7985
(0.19, −0.21)               0.4        0.6697                  0.3711                   0.6703
(0.23, −0.27)               0.5        0.5331                  0.2636                   0.5353
(0.28, −0.32)               0.6        0.4049                  0.1797                   0.4066
(0.33, −0.37)               0.7        0.2926                  0.1175                   0.2938
(0.37, −0.43)               0.8        0.2001                  0.0736                   0.2019
(0.42, −0.48)               0.9        0.1308                  0.0442                   0.1320
(0.47, −0.53)               1.0        0.0813                  0.0253                   0.0821

Unlike the p-value, the s-value respects monotonicity: since H0 ⟹ H0′, the s-value for H0′ is always at least the s-value for H0.
Graphical illustration: s({µ1 = µ2}, xi) for each observed sample

[Figures: for each observed sample x1, . . . , x10, the confidence set in the (µ1, µ2) plane at the level where it touches the line µ1 = µ2, with s-values 0.9753, 0.9048, 0.7985, 0.6703, 0.5353, 0.4066, 0.2938, 0.2019, 0.1320 and 0.0821, matching the table above.]
Final remarks

The s-value:

can be applied directly whenever the log-likelihood function is concave, via the formula s = 1 − F_k(F_{H0}⁻¹(1 − p)),

is a possibility measure and can be studied by means of the Abstract Belief Calculus, ABC (Darwiche, Ginsberg, 1992),

can be justified by desiderata (more basic axioms),

avoids the p-value's non-monotonicity problem.

References

Darwiche, A.Y., Ginsberg, M.L. (1992). A symbolic generalization of probability theory. AAAI-92, Tenth National Conference on Artificial Intelligence.

Patriota, A.G. (2017). On some assumptions of Null Hypothesis Statistical Testing. Educational and Psychological Measurement, 77, 507–524.

Patriota, A.G. (2013). A classical measure of evidence for general null hypotheses. Fuzzy Sets and Systems, 233, 74–88.

Pereira, C.A.B., Stern, J.M. (1999). Evidence and credibility: Full Bayesian significance test for precise hypotheses. Entropy, 1, 99–110.
