통계 > 분할표 > 다원표...

Statistics > Contingency tables >  Multi-way tables...

Linux 사례 (MX 21)

3개 이상의 요인형 변수를 포함하는 데이터셋이 활성화되면, '통계 > 분할표 > 다원표...' 메뉴 기능을 이용할 수 있다. 아래 메뉴 창에서 '통제 변수 (하나 이상 선택)'을 점검할 필요가 있다. 이원표를 만드는 기준 범주가 된다.

Linux 사례 (MX 21)
Linux 사례 (MX 21)


?xtabs  # stats 패키지의 xtabs 도움말 보기

## 'esoph' has the frequencies of cases and controls for all levels of
## the variables 'agegp', 'alcgp', and 'tobgp'.
xtabs(cbind(ncases, ncontrols) ~ ., data = esoph)
## Output is not really helpful ... flat tables are better:
ftable(xtabs(cbind(ncases, ncontrols) ~ ., data = esoph))
## In particular if we have fewer factors ...
ftable(xtabs(cbind(ncases, ncontrols) ~ agegp, data = esoph))

## This is already a contingency table in array form.
DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
DF
## Nice for taking margins ...
xtabs(Freq ~ Gender + Admit, DF)
## And for testing independence ...
summary(xtabs(Freq ~ ., DF))

## with NA's
DN <- DF; DN[cbind(6:9, c(1:2,4,1))] <- NA
DN # 'Freq' is missing only for (Rejected, Female, B)
tools::assertError(# 'na.fail' should fail :
     xtabs(Freq ~ Gender + Admit, DN, na.action=na.fail), verbose=TRUE)
op <- options(na.action = "na.omit") # the "factory" default
(xtabs(Freq ~ Gender + Admit, DN) -> xtD)
noC <- function(O) `attr<-`(O, "call", NULL)
ident_noC <- function(x,y) identical(noC(x), noC(y))
stopifnot(exprs = {
  ident_noC(xtD, xtabs(Freq ~ Gender + Admit, DN, na.action = na.omit))
  ident_noC(xtD, xtabs(Freq ~ Gender + Admit, DN, na.action = NULL))
})

xtabs(Freq ~ Gender + Admit, DN, na.action = na.pass)
## The Female:Rejected combination has NA 'Freq' (and NA prints 'invisibly' as "")
(xtNA <- xtabs(Freq ~ Gender + Admit, DN, addNA = TRUE)) # ==> count NAs
## show NA's better via  na.print = ".." :
print(xtNA, na.print= "NA")


## Create a nice display for the warp break data.
warpbreaks$replicate <- rep_len(1:9, 54)
ftable(xtabs(breaks ~ wool + tension + replicate, data = warpbreaks))

### ---- Sparse Examples ----

if(require("Matrix")) withAutoprint({
 ## similar to "nlme"s  'ergoStool' :
 d.ergo <- data.frame(Type = paste0("T", rep(1:4, 9*4)),
                      Subj = gl(9, 4, 36*4))
 xtabs(~ Type + Subj, data = d.ergo) # 4 replicates each
 set.seed(15) # a subset of cases:
 xtabs(~ Type + Subj, data = d.ergo[sample(36, 10), ], sparse = TRUE)

 ## Hypothetical two-level setup:
 inner <- factor(sample(letters[1:25], 100, replace = TRUE))
 inout <- factor(sample(LETTERS[1:5], 25, replace = TRUE))
 fr <- data.frame(inner = inner, outer = inout[as.integer(inner)])
 xtabs(~ inner + outer, fr, sparse = TRUE)
})

'Statistics > Contigency tables' 카테고리의 다른 글

3. Enter and analyze two-way table...  (0) 2022.06.28
Contingency tables  (0) 2022.02.14
1. Two-way table...  (0) 2022.02.14

통계 > 분할표 > 이원표 입력 및 분석하기...

Statistics > Contingency tables > Enter and analyze two-way table...

Linux 사례 (MX 21)

'통계 > 분할표 > 이원표 입력 및 분석하기...' 메뉴 기능을 선택하면 하위 창이 등장한다. '변수 이름', '행과 열의 수', '사례 수' 등을 입력할 수 있다. 아래의 내용은 'chisq.test' 함수 도움말 문서에 나오는 사례를 입력한다. 아래의 입력 스크립트를 참조 할 수 있다.

Linux 사례 (MX 21)
Linux 사례(MX 21)

.Table <- matrix(c(762,327,468,484,239,477), 2, 3, byrow=TRUE)
dimnames(.Table) <- list("Gender"=c("Female", "Male"), "Party"=c("Democrats", "Independent", 
  "Republican"))
.Table  # Counts
.Test <- chisq.test(.Table, correct=FALSE)
.Test
.Test$expected # Expected Counts

Linux 사례 (MX 21)


?chisq.test  # stats 패키지의  chisq.test 도움말 보기

## From Agresti(2007) p.39
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("F", "M"),
                    party = c("Democrat","Independent", "Republican"))
(Xsq <- chisq.test(M))  # Prints test summary
Xsq$observed   # observed counts (same as M)
Xsq$expected   # expected counts under the null
Xsq$residuals  # Pearson residuals
Xsq$stdres     # standardized residuals


## Effect of simulating p-values
x <- matrix(c(12, 5, 7, 7), ncol = 2)
chisq.test(x)$p.value           # 0.4233
chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value
                                # around 0.29!

## Testing for population probabilities
## Case A. Tabulated data
x <- c(A = 20, B = 15, C = 25)
chisq.test(x)
chisq.test(as.table(x))             # the same
x <- c(89,37,30,28,2)
p <- c(40,20,20,15,5)
try(
chisq.test(x, p = p)                # gives an error
)
chisq.test(x, p = p, rescale.p = TRUE)
                                # works
p <- c(0.40,0.20,0.20,0.19,0.01)
                                # Expected count in category 5
                                # is 1.86 < 5 ==> chi square approx.
chisq.test(x, p = p)            #               maybe doubtful, but is ok!
chisq.test(x, p = p, simulate.p.value = TRUE)

## Case B. Raw data
x <- trunc(5 * runif(100))
chisq.test(table(x))            # NOT 'chisq.test(x)'!

'Statistics > Contigency tables' 카테고리의 다른 글

2. Multi-way tables...  (0) 2022.06.28
Contingency tables  (0) 2022.02.14
1. Two-way table...  (0) 2022.02.14

도움말 > R Markdown 사용하기

Help > Using R Markdown

Linux 사례 (MX 21)

'도움말 > R Markdown 사용하기' 메뉴 기능을 선택하면, RStudio사의 R Markdown 소개 사이트로 이동한다.

https://rmarkdown.rstudio.com/lesson-1.html

 

Introduction

© Copyright 2016 - 2020 RStudio, PBC

rmarkdown.rstudio.com

 

+ Recent posts