var()
Function that computes the variance of the values in x.
var(x, y = NULL, na.rm = FALSE)
The arguments are:
x
: numeric vector.
y
: NULL (default) or a vector, matrix or data frame with compatible dimensions to x.
na.rm
: logical value indicating whether NA values should be removed before the computation proceeds.
Formula to calculate the variance:
$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
The variance of a data set measures the dispersion of the data around the mean. However, this value is hard to interpret in real-world terms because the deviations used to calculate it are squared, so the result is expressed in squared units.
The standard deviation, as the square root of the variance, gives a value in the same units as the original values, which makes it much easier to work with and to interpret (https://rfunctionaday.blogspot.com.es/2017/10/sd-base.html).
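The relationship between the two measures can be checked directly in R. The sketch below uses a small vector of heights (hypothetical values chosen for illustration, not taken from the post) and compares sd() with the square root of var().

```r
# Hypothetical sample: heights in centimetres (illustrative values only)
heights <- c(150, 160, 170, 180, 190)

v <- var(heights)   # variance, in squared units (cm^2)
s <- sd(heights)    # standard deviation, in the original units (cm)

v                       # 250
s                       # square root of 250
all.equal(s, sqrt(v))   # TRUE: sd() is the square root of var()
```

The variance of 250 is in square centimetres, while the standard deviation of roughly 15.8 can be read directly as a spread in centimetres.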
x:
x = c(1, 6, 10, 23, 4, 5, 56)
var1 = sum((x - mean(x))^2)/(length(x)-1)
var2 = var(x)
var1 ; var2 #same result
## [1] 378
## [1] 378
y:
var(1:10); var(1:10, 1:10) #same result
## [1] 9.166667
## [1] 9.166667
y = c(10,45,10,3,24,54,5)
var1 = var(x); var3 = var(y)
var4 = var(x, y) #with y supplied, var() returns the covariance of x and y
var1 ; var3 ; var4
## [1] 378
## [1] 415.619
## [1] -195
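When y is supplied, the value returned is the sample covariance of x and y, which is why it can be negative. As a minimal sketch, the code below recomputes it by hand with the same n - 1 denominator and compares it with var(x, y) and cov(x, y); the name cov_manual is introduced here only for illustration.

```r
# Same vectors as in the example above
x <- c(1, 6, 10, 23, 4, 5, 56)
y <- c(10, 45, 10, 3, 24, 54, 5)

# Sample covariance computed by hand, with the n - 1 denominator
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)

cov_manual   # -195
var(x, y)    # same value
cov(x, y)    # same value

# The covariance of a vector with itself is its variance
all.equal(var(x, x), var(x))
```

This also explains why var(1:10) and var(1:10, 1:10) agree: the second call is the covariance of a vector with itself.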
na.rm:
z = c(1, 6, 10, 23, NA, NA, 4, 5, 56, 56, NA)
var3 = sum((z - mean(z))^2)/(length(z)-1)
var4 = var(z)
var5 = var(z, na.rm = TRUE) #remove NA values to compute var
var3 ; var4 ; var5
## [1] NA
## [1] NA
## [1] 534.125
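The manual formula returns NA because mean(z) is itself NA when z contains missing values, and length(z) still counts them. A minimal sketch of the equivalent hand calculation, assuming the missing values are simply dropped first (z_obs and var_manual are names made up for this example):

```r
z <- c(1, 6, 10, 23, NA, NA, 4, 5, 56, 56, NA)

# Keep only the observed (non-missing) values
z_obs <- z[!is.na(z)]

# Same formula as before, applied to the 8 observed values
var_manual <- sum((z_obs - mean(z_obs))^2) / (length(z_obs) - 1)

var_manual             # 534.125
var(z, na.rm = TRUE)   # same value
```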
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
##        Species
##  setosa    :50
##  versicolor:50
##  virginica :50
##
##
##
meanset = mean(iris$Sepal.Length[iris$Species=='setosa'])
meanversi = mean(iris$Sepal.Length[iris$Species=='versicolor'])
meanvir = mean(iris$Sepal.Length[iris$Species=='virginica'])
sdset = sd(iris$Sepal.Length[iris$Species=='setosa'])
sdversi = sd(iris$Sepal.Length[iris$Species=='versicolor'])
sdvir = sd(iris$Sepal.Length[iris$Species=='virginica'])
varset = var(iris$Sepal.Length[iris$Species=='setosa'])
varversi = var(iris$Sepal.Length[iris$Species=='versicolor'])
varvir = var(iris$Sepal.Length[iris$Species=='virginica'])
#boxplot of Sepal.Length for each species
plot(iris$Species, iris$Sepal.Length, col = 'thistle1', main = 'SD/Var')
#SD segments: mean +/- one standard deviation
segments(1, meanset + sdset, 1, meanset - sdset, col = 'deeppink', lwd = 5)
segments(2, meanversi + sdversi, 2, meanversi - sdversi, col = 'deeppink', lwd = 5)
segments(3, meanvir + sdvir, 3, meanvir - sdvir, col = 'deeppink', lwd = 5)
#var segments: mean +/- the variance
segments(1, meanset + varset, 1, meanset - varset, col = 'darkviolet', lwd = 5)
segments(2, meanversi + varversi, 2, meanversi - varversi, col = 'darkviolet', lwd = 5)
segments(3, meanvir + varvir, 3, meanvir - varvir, col = 'darkviolet', lwd = 5)
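The per-species values used for the segments can also be obtained more compactly. The sketch below is one possible alternative using tapply(), not the approach taken in the post; the names means, sds and vars are introduced here for illustration.

```r
# Mean, SD and variance of Sepal.Length for each species
means <- tapply(iris$Sepal.Length, iris$Species, mean)
sds   <- tapply(iris$Sepal.Length, iris$Species, sd)
vars  <- tapply(iris$Sepal.Length, iris$Species, var)

# One row per species, rounded for readability
round(cbind(mean = means, sd = sds, var = vars), 3)
```

Because every standard deviation here is below 1, each variance (the SD squared) is smaller than its SD, so the variance segments in the plot are shorter than the SD segments.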