Friday, August 11, 2017

abs() and sqrt() {base}


The abs() function computes the absolute value of x, while the sqrt() function computes the square root of x.

Both functions take a single parameter:
 x: numeric or complex vector or array
abs(10)
## [1] 10
abs(-10)
## [1] 10
sqrt(10)
## [1] 3.162278
sqrt(-10)
## Warning in sqrt(-10): NaNs produced
## [1] NaN
sqrt(abs(-10))
## [1] 3.162278
plot(15:-15, sqrt(abs(15:-15)),  col = "orange")
lines(spline(15:-15, sqrt(abs(15:-15))), col = "gold")
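Both functions are vectorized, so they can be applied to a whole vector at once. The sketch below uses a made-up vector x and also shows one possible way to avoid the NaN for a negative input, by converting the value to complex first:

```r
# Made-up input vector, used only for illustration
x <- c(-9, -4, 0, 4, 9)

# abs() and sqrt() work element by element
abs_x  <- abs(x)
root_x <- sqrt(abs_x)

# For a complex argument, sqrt() returns a complex root instead of NaN
complex_root <- sqrt(as.complex(-4))

print(abs_x)
print(root_x)
print(complex_root)
```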

Thursday, August 10, 2017

unique() {base}


The unique() function is a generic function that extracts unique values from a vector, array or data frame.
The parameters of the function are:
 x: vector, array or data frame to remove duplicated values from
 fromLast: logical indicating if duplication should be considered from the last element, so that the last (rightmost) of identical elements is kept
#`unique()` function using vectors:
x <- c(10 + 0:5, 1:5, 8:1)
x
##  [1] 10 11 12 13 14 15  1  2  3  4  5  8  7  6  5  4  3  2  1
u1 <- unique(x)
u1
##  [1] 10 11 12 13 14 15  1  2  3  4  5  8  7  6
u2 <- unique(x,  fromLast = TRUE) # different order
u2
##  [1] 10 11 12 13 14 15  8  7  6  5  4  3  2  1
y <- c(5:1,8:1, 10, 1:3)
y
##  [1]  5  4  3  2  1  8  7  6  5  4  3  2  1 10  1  2  3
u3 <- unique(y)
u3
## [1]  5  4  3  2  1  8  7  6 10
u4 <- unique(y,  fromLast = TRUE) # different order
u4
## [1]  8  7  6  5  4 10  1  2  3
#`unique()` function with data frames:
dim(ChickWeight)
## [1] 578   4
head(ChickWeight)
##   weight Time Chick Diet
## 1     42    0     1    1
## 2     51    2     1    1
## 3     59    4     1    1
## 4     64    6     1    1
## 5     76    8     1    1
## 6     93   10     1    1
nrow(unique(ChickWeight))
## [1] 578
unique(ChickWeight$Diet)
## [1] 1 2 3 4
## Levels: 1 2 3 4
length(unique(ChickWeight$weight))
## [1] 212
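ChickWeight has no repeated rows, so unique() leaves it unchanged. A small sketch with a made-up data frame (df, id and grp are names invented for this example) shows what happens when a row really is repeated:

```r
# Made-up data frame in which row 3 repeats row 2
df <- data.frame(
  id  = c(1, 2, 2, 3),
  grp = c("a", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# unique() on a data frame compares whole rows
unique_rows <- unique(df)
print(unique_rows)
print(nrow(unique_rows))

# unique() on a single column compares only that column's values
unique_grp <- unique(df$grp)
print(unique_grp)
```

The original row names are kept, so the gap in the row names of unique_rows shows which row was dropped.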

Wednesday, August 9, 2017

duplicated() {base}


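The duplicated() function is a generic function that determines which elements are duplicates of elements at earlier positions. It returns a logical vector.
The parameters of the function are:
 x: vector, array or data frame
 fromLast: logical indicating if duplication should be considered from the last element, so that the last (rightmost) of identical elements is the one not marked as duplicated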
x <- c(5:1,5:1,5)
x
##  [1] 5 4 3 2 1 5 4 3 2 1 5
duplicated(x)
##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
duplicated(x, fromLast = TRUE)
##  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
## extract duplicated elements, those for which duplicated(x) == TRUE; the result may itself contain repeated elements:
x[duplicated(x)]
## [1] 5 4 3 2 1 5
## extract unique elements
x[!duplicated(x)]
## [1] 5 4 3 2 1
## extract unique elements starting from the rightmost value (different order):
x[!duplicated(x, fromLast = TRUE)]
## [1] 4 3 2 1 5
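duplicated() never marks the first occurrence of a value (or the last one, with fromLast = TRUE). If every copy of a repeated value should be marked, one possible approach is to combine both directions. This is only a sketch, using a made-up vector y:

```r
# Made-up vector: "a" and "b" appear twice, "c" and "d" once
y <- c("a", "b", "a", "c", "b", "d")

# TRUE for every element whose value occurs more than once
repeated <- duplicated(y) | duplicated(y, fromLast = TRUE)
print(repeated)

# All copies of the repeated values
print(y[repeated])

# Values that occur exactly once
print(y[!repeated])
```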
# duplicated() using a data frame
duplicated(iris)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
duplicated(iris$Sepal.Length)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
##  [12] FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [23]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [34] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
##  [45]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE
##  [56]  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
##  [67]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
##  [78]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [89]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [100]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
## [111]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
## [122]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
## [133]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [144]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
The anyDuplicated() function returns the position of the first duplicated element, or 0 if there are no duplicates:
x <- c(9:1, 20, 10:6,21,10)
x
##  [1]  9  8  7  6  5  4  3  2  1 20 10  9  8  7  6 21 10
duplicated(x)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [12]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
anyDuplicated(x)  #first element found to be duplicated
## [1] 12
anyDuplicated(x, fromLast = TRUE) # first duplicate found when scanning from the end
## [1] 11
duplicated(iris)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
anyDuplicated(iris) ## 143
## [1] 143
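Because anyDuplicated() returns 0 when nothing is repeated, it can serve as a quick yes/no check. A minimal sketch, with made-up vectors ids_ok and ids_bad:

```r
# Made-up identifiers, used only for illustration
ids_ok  <- c(101, 102, 103)
ids_bad <- c(101, 102, 101, 103)

# 0 means that no element is duplicated
first_dup_ok  <- anyDuplicated(ids_ok)
# otherwise the index of the first duplicated element is returned
first_dup_bad <- anyDuplicated(ids_bad)

print(first_dup_ok)
print(first_dup_bad)

# Typical use as a condition
if (anyDuplicated(ids_bad) > 0) {
  message("ids_bad contains repeated values")
}
```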

Tuesday, August 8, 2017

which() {base}


The which() function returns the positions of the TRUE elements of a logical object.
The parameters of the function are:
 x: logical vector or array
 arr.ind: logical, to decide if array indices should be returned when x is an array
 ind: integer-valued index vector, as returned by which(x); this is an argument of the related arrayInd() function
x <- c(1,1,1,1,2)
x
## [1] 1 1 1 1 2
which(x == 2)
## [1] 5
summary(chickwts)
##      weight             feed   
##  Min.   :108.0   casein   :12  
##  1st Qu.:204.5   horsebean:10  
##  Median :258.0   linseed  :12  
##  Mean   :261.3   meatmeal :11  
##  3rd Qu.:323.5   soybean  :14  
##  Max.   :423.0   sunflower:12
which(chickwts$feed=='horsebean')
##  [1]  1  2  3  4  5  6  7  8  9 10
which(chickwts$feed=='soybean')
##  [1] 23 24 25 26 27 28 29 30 31 32 33 34 35 36
x <- matrix(c(2,3,2,3,1,1,1,1,2,3),ncol = 5)
x
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    2    2    1    1    2
## [2,]    3    3    1    1    3
# Which positions hold multiples of 2?
# Positions are counted column by column:
# 1 3 5 7 9
# 2 4 6 8 10
which(x %% 2 == 0)   
## [1] 1 3 9
# The same question, but returning the array indices (row and column):
which(x %% 2 == 0, arr.ind = TRUE) 
##      row col
## [1,]   1   1
## [2,]   1   2
## [3,]   1   5
which.min(x) #returns the first position of the minimum value that appears in the object. 
## [1] 5
which.max(x) #returns the first position of the maximum value that appears in the object. 
## [1] 2
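Two details are easy to miss: which() treats NA as not TRUE, and it keeps the names of a named vector. The following sketch uses a made-up named vector v to show both, and uses the result to subset the vector:

```r
# Made-up named vector with a missing value
v <- c(a = 3, b = NA, c = 8, d = 1)

# v > 2 is NA for element "b"; which() skips it
hits <- which(v > 2)
print(hits)

# The positions can be used directly for subsetting
print(v[hits])

# which.min() also ignores the missing value
lowest <- which.min(v)
print(lowest)
```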

Monday, August 7, 2017

summary() {base}


summary() is a generic function that gives summary values of an object. When working with a vector or a data frame the summary values are Min., 1st Qu., Median, Mean, 3rd Qu. and Max.:
summary(1:100)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   25.75   50.50   50.50   75.25  100.00
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
##                 
##                 

The function has the following parameters:
 object: object to summarize
 maxsum: maximum number of levels to be shown for factors
 digits: number of significant digits to be shown
summary(1:100)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00   25.75   50.50   50.50   75.25  100.00
summary(1:100, digits = 1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       1      30      50      50      80     100
summary(iris, digits = 1)
##   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width        Species  
##  Min.   :4     Min.   :2    Min.   :1     Min.   :0.1   setosa    :50  
##  1st Qu.:5     1st Qu.:3    1st Qu.:2     1st Qu.:0.3   versicolor:50  
##  Median :6     Median :3    Median :4     Median :1.3   virginica :50  
##  Mean   :6     Mean   :3    Mean   :4     Mean   :1.2                  
##  3rd Qu.:6     3rd Qu.:3    3rd Qu.:5     3rd Qu.:1.8                  
##  Max.   :8     Max.   :4    Max.   :7     Max.   :2.5
summary(iris$Sepal.Width, digits = 2) #digits = 2
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     2.0     2.8     3.0     3.1     3.3     4.4
summary(iris$Sepal.Width, digits = 4) #digits = 4
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   2.800   3.000   3.057   3.300   4.400
summary(iris, maxsum = 2, digits = 4) #maxsum = 2
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##     Species   
##  setosa : 50  
##  (Other):100  
##               
##               
##               
## 
summary(iris, maxsum = 1, digits = 4) #maxsum = 1
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##     Species   
##  (Other):150  
##               
##               
##               
##               
## 
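The result of summary() can also be stored and used in later computations, not only printed. As a sketch (the input values are made up), individual values of a numeric summary can be picked out by name, and the summary of a factor is a count per level:

```r
# Summary of a made-up numeric vector
s <- summary(c(1, 2, 3, 4, 100))
print(s)

# Individual values can be extracted by name
median_value <- s[["Median"]]
mean_value   <- s[["Mean"]]
print(median_value)
print(mean_value)

# For a factor, summary() counts the observations in each level
f <- summary(factor(c("a", "b", "a")))
print(f)
```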

The summary() function can also be used with the result of other functions such as lm() (linear model), where it gives other summary values such as Residuals, Coefficients, Residual standard error, Multiple R-squared, Adjusted R-squared, F-statistic and p-value:
lmod <- lm(Temp~Ozone+Solar.R+Wind, data = airquality)
lmod
## 
## Call:
## lm(formula = Temp ~ Ozone + Solar.R + Wind, data = airquality)
## 
## Coefficients:
## (Intercept)        Ozone      Solar.R         Wind  
##   72.418579     0.171966     0.007276    -0.322945
summary(lmod)
## 
## Call:
## lm(formula = Temp ~ Ozone + Solar.R + Wind, data = airquality)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.942  -4.996   1.283   4.434  13.168 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 72.418579   3.215525  22.522  < 2e-16 ***
## Ozone        0.171966   0.026390   6.516 2.42e-09 ***
## Solar.R      0.007276   0.007678   0.948    0.345    
## Wind        -0.322945   0.233264  -1.384    0.169    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.834 on 107 degrees of freedom
##   (42 observations deleted due to missingness)
## Multiple R-squared:  0.4999, Adjusted R-squared:  0.4858 
## F-statistic: 35.65 on 3 and 107 DF,  p-value: 4.729e-16
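The printed values can also be extracted from the summary object for further use. The sketch below refits the same model and reads a few components; the variable names s, r2 and coef_table are chosen only for this example:

```r
# Same model as above, fitted on the built-in airquality data
lmod <- lm(Temp ~ Ozone + Solar.R + Wind, data = airquality)
s <- summary(lmod)

# Multiple and adjusted R-squared
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# Residual standard error
sigma_hat <- s$sigma

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(s)

print(r2)
print(adj_r2)
print(sigma_hat)
print(coef_table["Ozone", ])
```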
