As a former SPSS user, I was wondering if anybody knows of an R equivalent to the SPSS EXAMINE command. E.g.:
EXAMINE VARIABLES=income by sex
/CINTERVAL 95.
Cheers
The closest I'm aware of is the describe function found in the psych package.
Here's some code you can run to see how it works:
install.packages('psych')
library(psych)
data('mtcars')
# Standard describe function
describe(mtcars$mpg)
# Show interquartile ranges
describe(mtcars$mpg, IQR = TRUE)
The output for describe(mtcars$mpg, IQR = TRUE):
   vars  n  mean   sd median trimmed  mad  min  max range skew kurtosis   se  IQR
X1    1 32 20.09 6.03   19.2    19.7 5.41 10.4 33.9  23.5 0.61    -0.37 1.07 7.38
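Since the original EXAMINE call asks for /CINTERVAL 95, note that the describe() output above has no confidence interval column. Here is a minimal base-R sketch, assuming a t-based interval for the mean is what's wanted (as far as I know that is what EXAMINE reports, but treat that as an assumption). It uses the same mean and standard error that describe() shows:

```r
# 95% confidence interval for the mean of mtcars$mpg, built from the
# mean and standard error (sd / sqrt(n)). Assumes a t-based interval.
x     <- mtcars$mpg
n     <- length(x)
m     <- mean(x)
se    <- sd(x) / sqrt(n)          # standard error of the mean
tcrit <- qt(0.975, df = n - 1)    # two-sided 95% critical value
ci    <- c(lower = m - tcrit * se, upper = m + tcrit * se)
round(ci, 2)
```

t.test(mtcars$mpg)$conf.int should give the same interval if you'd rather not compute it by hand.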
By-group processing can be handled by adding split() and lapply() to Matt's answer. For example, to obtain descriptive statistics for mtcars$mpg by number of cylinders, we do the following:
library(psych)
splitvar <- as.factor(mtcars$cyl)
data <- split(mtcars,splitvar)
lapply(data,function(x){describe(x$mpg,IQR=TRUE)})
...and the output:
> lapply(data,function(x){describe(x$mpg,IQR=TRUE)})
$`4`
   vars  n  mean   sd median trimmed  mad  min  max range skew kurtosis   se IQR
X1    1 11 26.66 4.51     26   26.44 6.52 21.4 33.9  12.5 0.26    -1.65 1.36 7.6
$`6`
   vars n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se  IQR
X1    1 7 19.74 1.45   19.7   19.74 1.93 17.8 21.4   3.6 -0.16    -1.91 0.55 2.35
$`8`
   vars  n mean   sd median trimmed  mad  min  max range  skew kurtosis   se  IQR
X1    1 14 15.1 2.56   15.2   15.15 1.56 10.4 19.2   8.8 -0.36    -0.57 0.68 1.85
>
We can also add quantiles via the quant= argument. Here we'll generate the 5th and 95th percentiles.
lapply(data,function(x){describe(x$mpg,quant=c(.05,.95),IQR=TRUE)})
...and the output:
> lapply(data,function(x){describe(x$mpg,quant=c(.05,.95),IQR=TRUE)})
$`4`
  vars  n  mean   sd median trimmed  mad  min  max range skew kurtosis   se IQR Q0.05 Q0.95
1    1 11 26.66 4.51     26   26.44 6.52 21.4 33.9  12.5 0.26    -1.65 1.36 7.6 21.45 33.15
$`6`
  vars n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se  IQR Q0.05 Q0.95
1    1 7 19.74 1.45   19.7   19.74 1.93 17.8 21.4   3.6 -0.16    -1.91 0.55 2.35 17.89 21.28
$`8`
  vars  n mean   sd median trimmed  mad  min  max range  skew kurtosis   se  IQR Q0.05 Q0.95
1    1 14 15.1 2.56   15.2   15.15 1.56 10.4 19.2   8.8 -0.36    -0.57 0.68 1.85  10.4 18.88
>
...posting as Community wiki to avoid taking credit for Matt's answer.
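The same split-and-apply idea extends to the confidence interval requested by /CINTERVAL 95 in the question. This is a rough base-R sketch only: ci_mean() is a hypothetical helper written for this example (it is not part of psych), and it assumes a t-based interval for the mean.

```r
# ci_mean() is a hypothetical helper for this example (not from psych).
# It returns n, the mean, and a t-based confidence interval for the mean.
ci_mean <- function(x, level = 0.95) {
  x     <- x[!is.na(x)]                          # drop missing values
  n     <- length(x)
  se    <- sd(x) / sqrt(n)                       # standard error of the mean
  tcrit <- qt(1 - (1 - level) / 2, df = n - 1)   # two-sided critical value
  c(n = n, mean = mean(x),
    lower = mean(x) - tcrit * se,
    upper = mean(x) + tcrit * se)
}

# One row per cylinder group, analogous to "income by sex" in the question
by_cyl <- t(sapply(split(mtcars$mpg, mtcars$cyl), ci_mean))
round(by_cyl, 2)
```

The n and mean columns should line up with the describe() output by cylinder shown above, which makes for a quick sanity check.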
It turns out that the psych package has a separate function, describeBy(), that accepts multiple by-group variables and so more closely emulates the behavior of the SPSS EXAMINE procedure.
We'll demonstrate with the mtcars data frame, using the cyl and am columns as the grouping variables.
library(psych)
describeBy(mtcars,group = c("cyl","am"),quant=c(.05,.95))
...and the output for the first two by-group combinations:
Descriptive statistics by group
cyl: 4
am: 0
     vars n   mean    sd median trimmed  mad    min    max range  skew kurtosis    se  Q0.05
mpg     1 3  22.90  1.45  22.80   22.90 1.93  21.50  24.40  2.90  0.07    -2.33  0.84  21.63
cyl     2 3   4.00  0.00   4.00    4.00 0.00   4.00   4.00  0.00   NaN      NaN  0.00   4.00
disp    3 3 135.87 13.97 140.80  135.87 8.75 120.10 146.70 26.60 -0.31    -2.33  8.07 122.17
hp      4 3  84.67 19.66  95.00   84.67 2.97  62.00  97.00 35.00 -0.38    -2.33 11.35  65.30
drat    5 3   3.77  0.13   3.70    3.77 0.01   3.69   3.92  0.23  0.38    -2.33  0.08   3.69
wt      6 3   2.94  0.41   3.15    2.94 0.06   2.46   3.19  0.73 -0.38    -2.33  0.24   2.53
qsec    7 3  20.97  1.67  20.01   20.97 0.01  20.00  22.90  2.90  0.38    -2.33  0.97  20.00
vs      8 3   1.00  0.00   1.00    1.00 0.00   1.00   1.00  0.00   NaN      NaN  0.00   1.00
am      9 3   0.00  0.00   0.00    0.00 0.00   0.00   0.00  0.00   NaN      NaN  0.00   0.00
gear   10 3   3.67  0.58   4.00    3.67 0.00   3.00   4.00  1.00 -0.38    -2.33  0.33   3.10
carb   11 3   1.67  0.58   2.00    1.67 0.00   1.00   2.00  1.00 -0.38    -2.33  0.33   1.10
      Q0.95
mpg   24.24
cyl    4.00
disp 146.11
hp    96.80
drat   3.90
wt     3.19
qsec  22.61
vs     1.00
am     0.00
gear   4.00
carb   2.00
----------------------------------------------------------------------
cyl: 6
am: 0
     vars n   mean    sd median trimmed   mad    min    max range  skew kurtosis    se  Q0.05
mpg     1 4  19.12  1.63  18.65   19.12  1.04  17.80  21.40  3.60  0.48    -1.91  0.82  17.85
cyl     2 4   6.00  0.00   6.00    6.00  0.00   6.00   6.00  0.00   NaN      NaN  0.00   6.00
disp    3 4 204.55 44.74 196.30  204.55 42.55 167.60 258.00 90.40  0.17    -2.25 22.37 167.60
hp      4 4 115.25  9.18 116.50  115.25  9.64 105.00 123.00 18.00 -0.09    -2.33  4.59 105.75
drat    5 4   3.42  0.59   3.50    3.42  0.62   2.76   3.92  1.16 -0.09    -2.33  0.30   2.81
wt      6 4   3.39  0.12   3.44    3.39  0.01   3.21   3.46  0.25 -0.73    -1.70  0.06   3.25
qsec    7 4  19.21  0.82  19.17   19.21  0.85  18.30  20.22  1.92  0.11    -2.02  0.41  18.39
vs      8 4   1.00  0.00   1.00    1.00  0.00   1.00   1.00  0.00   NaN      NaN  0.00   1.00
am      9 4   0.00  0.00   0.00    0.00  0.00   0.00   0.00  0.00   NaN      NaN  0.00   0.00
gear   10 4   3.50  0.58   3.50    3.50  0.74   3.00   4.00  1.00  0.00    -2.44  0.29   3.00
carb   11 4   2.50  1.73   2.50    2.50  2.22   1.00   4.00  3.00  0.00    -2.44  0.87   1.00
      Q0.95
mpg   21.07
cyl    6.00
disp 253.05
hp   123.00
drat   3.92
wt     3.46
qsec  20.10
vs     1.00
am     0.00
gear   4.00
carb   4.00
----------------------------------------------------------------------
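If only one variable is of interest, the full describeBy() printout can be a lot to scan. As a base-R alternative (this swaps in aggregate() rather than using anything from psych), here is a sketch that summarises just mpg for each cyl/am combination. The choice of n, mean and sd as the statistics is only for illustration.

```r
# Summarise mpg for every combination of cyl and am using base R only.
# aggregate() stores the vector returned by FUN as a matrix column, so
# do.call(data.frame, ...) is used to flatten it into ordinary columns
# named mpg.n, mpg.mean and mpg.sd.
res <- aggregate(mpg ~ cyl + am, data = mtcars,
                 FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x)))
res <- do.call(data.frame, res)
res
```

The row for cyl 4, am 0 should agree with the n and mean reported for mpg in the describeBy() output above.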