Posts in the 'All categories' category (page 9)

Summarize model

modernity4Rcmdr 2022. 3. 12. 20:17

Korean menu: 모델 > 모델 요약하기

English menu: Models > Summarize model

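After fitting a model, you normally inspect its summary with the summary() function.

Suppose you have fitted a linear regression and linear models to the Prestige data set from the carData package. Fitting a model through the menus prints a summary such as the following: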

summary(LinearModel.2)

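This menu item is meant for the case where several models exist and you want to look at the summary of a particular model again.

If <Use sandwich estimator of coefficient standard errors> is checked, the summarySandwich() function is used instead of summary().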

summarySandwich(LinearModel.2, type="hc3")
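To make the type="hc3" argument less opaque, here is a minimal base-R sketch that computes HC3 heteroscedasticity-consistent standard errors by hand. It uses the built-in cars data as a stand-in model (an assumption for illustration, not the Prestige model above), and it is not meant to show how summarySandwich() is implemented internally.

```r
# Stand-in model on the built-in cars data (illustrative assumption)
fit <- lm(dist ~ speed, data = cars)

X <- model.matrix(fit)   # design matrix
e <- residuals(fit)      # ordinary residuals
h <- hatvalues(fit)      # leverages (diagonal of the hat matrix)

bread <- solve(crossprod(X))            # (X'X)^-1
meat  <- crossprod(X * (e / (1 - h)))   # X' diag(e^2 / (1 - h)^2) X
vcov_hc3 <- bread %*% meat %*% bread    # HC3 sandwich covariance matrix

se_classical <- sqrt(diag(vcov(fit)))   # usual OLS standard errors
se_hc3 <- sqrt(diag(vcov_hc3))          # HC3 standard errors
print(cbind(se_classical, se_hc3))
```

The coefficient estimates are the same under both approaches; only the standard errors, and therefore the t statistics and p-values, change.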

Select active model...

modernity4Rcmdr 2022. 3. 12. 19:51

Korean menu: 모델 > 활성 모델 선택하기...

English menu: Models > Select active model...

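The top of the R Commander window holds the menus. When <Model: model name> at the far right shows a model name, a model has been fitted to the active data set. In practice, however, it is common to fit several models and examine the data from different angles. R Commander keeps every model created during the analysis in memory so that any of them can be reused when needed. The commands below create three models. Suppose we use the Prestige data set from the carData package, with linear regression and linear models, to analyze how education (years of education) and income (annual income) affect prestige (the social standing of an occupation), and whether the effect differs by occupation type.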

data(Prestige, package="carData")  # load the Prestige data set
RegModel.1 <- lm(prestige ~ education + income, data=Prestige)  # linear regression model
summary(RegModel.1)
LinearModel.2 <- lm(prestige ~ education + log(income), data=Prestige)  # linear model 1
summary(LinearModel.2)
LinearModel.4 <- lm(prestige ~ (education + log(income))*type, data=Prestige)  # linear model 2
summary(LinearModel.4)

Choosing <Select active model...> opens a dialog that lists the available models. Select one from the list.
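Outside the dialog, the same idea (several fitted models held in memory, one of them picked for further work) can be sketched in base R. The models here are fitted to the built-in cars data as stand-ins, and the variable names env, model_names, and active are hypothetical; this is not how R Commander itself tracks the active model.

```r
# Stand-in models on the built-in cars data (illustrative assumption)
RegModel.1    <- lm(dist ~ speed, data = cars)
LinearModel.2 <- lm(dist ~ log(speed), data = cars)
not_a_model   <- 1:10   # an ordinary object that should be ignored

# Find every object in the current environment that inherits from "lm"
env  <- environment()
objs <- ls(envir = env)
is_lm <- vapply(objs, function(nm) inherits(get(nm, envir = env), "lm"),
                logical(1))
model_names <- objs[is_lm]
print(model_names)

# Pick one model by name and work with it
active <- get("LinearModel.2", envir = env)
print(formula(active))
```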

1.4. Sample from normal distribution...

modernity4Rcmdr 2022. 3. 12. 14:15

Korean menu: 분포도 > 연속 분포 > 정규 분포 > 정규 분포의 표본...

English menu: Distributions > Continuous distributions > Normal distribution > Sample from normal distribution...

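The <Sample from normal distribution> dialog offers several options. Enter the sizes you want in <Number of samples (rows)> and <Number of observations (columns)>. Under <Enter name for data set> you can type any name you like. I sometimes append the number passed to set.seed() so that the data set name recalls the seed.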

set.seed(9723)
NormalSamples_9723 <- as.data.frame(matrix(rnorm(10*5, mean=0, sd=1), ncol=5))
rownames(NormalSamples_9723) <- paste("sample", 1:10, sep="")
colnames(NormalSamples_9723) <- paste("obs", 1:5, sep="")
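As a quick check on the layout, the sketch below rebuilds the same data frame and confirms that rows are samples and columns are observations. The per-sample means in sample_means are an illustrative follow-up step (the variable name is hypothetical), not part of the commands above.

```r
set.seed(9723)
NormalSamples_9723 <- as.data.frame(matrix(rnorm(10*5, mean=0, sd=1), ncol=5))
rownames(NormalSamples_9723) <- paste("sample", 1:10, sep="")
colnames(NormalSamples_9723) <- paste("obs", 1:5, sep="")

# 10 samples (rows), each with 5 observations (columns)
print(dim(NormalSamples_9723))

# Mean of each sample, one value per row
sample_means <- rowMeans(NormalSamples_9723)
print(round(sample_means, 3))
```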

https://rcmdr.tistory.com/158

1.3. Plot normal distribution...

modernity4Rcmdr 2022. 3. 12. 12:25

Korean menu: 분포도 > 연속 분포 > 정규 분포 > 정규 분포 그리기...

English menu: Distributions > Continuous distributions > Normal distribution > Plot normal distribution...

With <Plot density function> selected and the regions specified by <x-values>, here are a few examples.

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"), regions=list(c(-1.644854, Inf)), 
  col=c('#BEBEBE', '#FFA500'), legend.pos='topright')
})

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"))
})

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"), regions=list(c(1.96, Inf), c(-Inf, 
  -1.96)), col=c('#BEBEBE', '#FFA500'), legend.pos='topright')
})

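With <Plot density function> selected and the regions specified by <quantiles>, here are a few examples.

Under <quantiles>, the values you enter are probabilities between 0 and 1. Each probability in that range is converted into the corresponding quantile, as the regions argument in the commands below shows.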

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"), regions=list(c(-1.64485362695147, 
  1.64485362695147)), col=c('#BEBEBE', '#FFA500'), legend.pos='topright')
})

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"), regions=list(c(-Inf, 
  1.64485362695147)), col=c('#BEBEBE', '#FFA500'), legend.pos='topright')
})

local({
  .x <- seq(-3.291, 3.291, length.out=1000)  
  plotDistr(.x, dnorm(.x, mean=0, sd=1), cdf=FALSE, xlab="x", ylab="Density", 
  main=paste("Normal Distribution:  Mean=0, Standard deviation=1"), regions=list(c(-Inf, 
  1.64485362695147), c(2.32634787404084, Inf)), col=c('#BEBEBE', '#FFA500'), legend.pos='topright')
})
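The quantile values that appear in regions above can be reproduced with qnorm(). The following sketch, which is only an illustration and not the code R Commander generates, converts the probabilities 0.05, 0.95, and 0.99 into quantiles and then adds up the shaded probability for the last example.

```r
# Probabilities entered in the dialog become quantiles of N(0, 1)
probs <- c(0.05, 0.95, 0.99)
quants <- qnorm(probs, mean = 0, sd = 1)
print(quants)

# Regions of the last example: (-Inf, q(0.95)) and (q(0.99), Inf)
regions <- list(c(-Inf, qnorm(0.95)), c(qnorm(0.99), Inf))

# Probability covered by each region, then the total shaded area
region_prob <- vapply(regions, function(r) pnorm(r[2]) - pnorm(r[1]),
                      numeric(1))
shaded_total <- sum(region_prob)
print(shaded_total)
```

The two regions cover 0.95 and 0.01 of the distribution, so the total shaded area is 0.96.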

1.2. Normal probabilities...

modernity4Rcmdr 2022. 3. 12. 10:14

Korean menu: 분포도 > 연속 분포 > 정규 분포 > 정규 확률...
English menu: Distributions > Continuous distributions > Normal distribution > Normal probabilities...

Enter one or more values and choose the tail direction of the distribution; the probability is then computed.

pnorm(c(1.644854), mean=0, sd=1, lower.tail=TRUE)
pnorm(c(1.644854), mean=0, sd=1, lower.tail=FALSE)
pnorm(c(-1.644854), mean=0, sd=1, lower.tail=TRUE)
pnorm(c(-1.644854), mean=0, sd=1, lower.tail=FALSE)
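The four calls above are related through complementarity and symmetry. A minimal sketch of those two relations, using the same value 1.644854 (the variable names are only for illustration):

```r
z <- 1.644854

lower <- pnorm(z, mean = 0, sd = 1, lower.tail = TRUE)    # P(X <= z), about 0.95
upper <- pnorm(z, mean = 0, sd = 1, lower.tail = FALSE)   # P(X > z),  about 0.05

# Complement: the two tails at the same point sum to 1
print(lower + upper)

# Symmetry: the lower tail at -z equals the upper tail at z
lower_neg <- pnorm(-z, mean = 0, sd = 1, lower.tail = TRUE)
print(c(upper = upper, lower_at_minus_z = lower_neg))
```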

1.1. Normal quantiles...

modernity4Rcmdr 2022. 3. 12. 09:57

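Korean menu: 분포도 > 연속 분포 > 정규 분포 > 정규 분위수...

English menu: Distributions > Continuous distributions > Normal distribution > Normal quantiles...

Enter a probability and choose the tail direction of the distribution; the quantile is then computed. Try a <Probabilities> value of 95% (.95) and see how the result changes between <Lower tail> and <Upper tail>.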

qnorm(c(.95), mean=0, sd=1, lower.tail=TRUE)
qnorm(c(.95), mean=0, sd=1, lower.tail=FALSE)

The output shows the quantile for a 95% probability in each tail direction: about 1.645 for the lower tail and about -1.645 for the upper tail.
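qnorm() is the inverse of pnorm(), which a short round trip makes concrete. This is an illustrative sketch; the variable names are hypothetical.

```r
q_lower <- qnorm(0.95, mean = 0, sd = 1, lower.tail = TRUE)    # about  1.644854
q_upper <- qnorm(0.95, mean = 0, sd = 1, lower.tail = FALSE)   # about -1.644854
print(c(q_lower = q_lower, q_upper = q_upper))

# Feeding each quantile back into pnorm() with the same tail returns 0.95
back_lower <- pnorm(q_lower, lower.tail = TRUE)
back_upper <- pnorm(q_upper, lower.tail = FALSE)
print(c(back_lower, back_upper))
```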

Set random number generator seed...

modernity4Rcmdr 2022. 3. 12. 09:39

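Korean menu: 분포도 > 난수생성기 시드(seed) 생성기...

English menu: Distributions > Set random number generator seed...

Choose a number. Fixing the seed makes the random numbers generated afterwards reproducible: the same seed always yields the same sequence.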

set.seed(9723)
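A small sketch of what reproducibility means here: resetting the same seed replays the same draws, while continuing without a reset moves on to new values. The variable names are only for illustration.

```r
set.seed(9723)
first_draw <- rnorm(3)

set.seed(9723)             # reset to the same seed
second_draw <- rnorm(3)    # replays exactly the same three values

third_draw <- rnorm(3)     # no reset: the stream continues with new values

print(identical(first_draw, second_draw))
print(identical(second_draw, third_draw))
```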

1. Single-sample proportion test...

modernity4Rcmdr 2022. 3. 12. 09:33

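Korean menu: 통계 > 비율 > 일-표본 비율 검정...

English menu: Statistics > Proportions > Single-sample proportion test...

If the active data set contains a factor with exactly two levels, the 'Statistics > Proportions > Single-sample proportion test...' menu item can be used. Let us practice with the Chile data set from the carData package. First, activate the Chile data set through 'Data > Data in packages > Read data set from an attached package...'. Check that 'Chile' is shown as the active data set at the top of the R Commander window.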

https://rcmdr.tistory.com/239

The factor vote has more than two levels, so recode it into a new two-level variable, vote.f, and use that variable for the test.

data(Chile, package="carData")
Chile <- within(Chile, {
  vote.f <- Recode(vote, '"Y" = "yes"; "N" = "no"; else = NA', as.factor=TRUE)
})

Keep the default settings shown in the <Options> tab.

local({
  .Table <- xtabs(~ vote.f , data= Chile )
  cat("\nFrequency counts (test is for first level):\n")
  print(.Table)
  prop.test(rbind(.Table), alternative='two.sided', p=.5, conf.level=.95, correct=FALSE)
})

The result is printed in the Output window.

?prop.test  # open the help page for prop.test in the stats package

heads <- rbinom(1, size = 100, prob = .5)
prop.test(heads, 100)          # continuity correction TRUE by default
prop.test(heads, 100, correct = FALSE)

## Data from Fleiss (1981), p. 139.
## H0: The null hypothesis is that the four populations from which
##     the patients were drawn have the same true proportion of smokers.
## A:  The alternative is that this proportion is different in at
##     least one of the populations.

smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.test(smokers, patients)
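As a cross-check on what prop.test() reports for a single sample with correct=FALSE, the sketch below compares it with a hand-computed z test. The counts are hypothetical (they are not the Chile results), and the proportion tested is that of the first level, matching the "test is for first level" message in the generated command.

```r
# Hypothetical counts for a two-level factor (not the Chile results)
counts <- c(no = 60, yes = 40)
n  <- sum(counts)
p0 <- 0.5

p_hat <- counts[["no"]] / n                   # proportion in the first level
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # one-sample z statistic
p_value <- 2 * pnorm(-abs(z))                 # two-sided p-value

res <- prop.test(rbind(counts), alternative = "two.sided", p = p0,
                 conf.level = 0.95, correct = FALSE)

# Without continuity correction, X-squared equals z squared
print(c(z_squared = z^2, X_squared = unname(res$statistic)))
print(c(manual_p = p_value, prop_test_p = res$p.value))
```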

https://rcmdr.tistory.com/53

7. Effect plots...

modernity4Rcmdr 2022. 3. 9. 19:03

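Korean menu: 모델 > 그래프 > 효과 그림...

English menu: Models > Graphs > Effect plots...

'Models > Graphs > Effect plots...' can be used only after a model has been fitted. The fitted model is shown at the top of the R Commander window. Here we use GLM.1, a model fitted to the Cowles data set from the carData package.

In the middle of the <Model Effect Plots> dialog, under <Predictors (select one or more)>, select all three variables: sex, neuroticism, and extraversion.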

plot(allEffects(GLM.1))

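Let us also practice with the Prestige data set from the carData package. Suppose we have fitted the model below, which examines the effects of education (years of education), income (annual income), and women (percentage of women in the occupation) on prestige (the social standing of an occupation), together with type (occupation type).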

data(Prestige, package="carData")
LinearModel.1 <- lm(prestige ~ education + income + women + type, data=Prestige)
summary(LinearModel.1)

The summary of LinearModel.1 is printed.

The effects in LinearModel.1 can then be visualized. In the <Model Effect Plots> dialog, under <Predictors (select one or more)>, select all four variables and press OK.

plot(allEffects(LinearModel.1))

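The graphics device window shows effect plots for the four selected variables.

Next, try the <Plot partial residuals> option.

The residuals are drawn as points in the graphics device window. Looking at how they are distributed gives additional insight into the model.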

Cowles data set

modernity4Rcmdr 2022. 3. 9. 18:20

carData::Cowles

data(Cowles, package="carData")

help("Cowles")

Cowles {carData}

R Documentation

Cowles and Davis's Data on Volunteering

Description

The Cowles data frame has 1421 rows and 4 columns. These data come from a study of the personality determinants of volunteering for psychological research.

Usage

Cowles

Format

This data frame contains the following columns:

neuroticism

scale from Eysenck personality inventory

extraversion

scale from Eysenck personality inventory

sex

a factor with levels: female; male

volunteer

volunteering, a factor with levels: no; yes

Source

Cowles, M. and C. Davis (1987) The subject matter of psychology: Volunteers. British Journal of Social Psychology 26, 97–102.

[Package carData version 3.0-5 Index]

Rcmdr.kr: An R Commander User in Korea