Wednesday, September 27, 2017

jitter() {base}


jitter() is a function that adds a small amount of noise to a numeric vector.

jitter(x, factor = 1, amount = NULL)

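The parameters are:
  • x: numeric vector to which the noise is added
  • factor: numeric; scales the amount of noise when amount is NULL or 0
  • amount: numeric; if positive, the half-width of the noise interval; if NULL (default) or 0, it is computed from x and factor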


factor and amount:
The result is x + runif(n, -a, a), where n <- length(x) and a is chosen as follows:
If amount is a positive number, a is that amount and factor is ignored.
If amount == 0, we set a <- factor * z/50, where z = max(x) - min(x).
If amount is NULL (default), we set a <- factor * d/5, where d is the smallest difference between adjacent unique x values.
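
To make these rules concrete, here is a minimal sketch that computes a by hand. The helper jitter_amount() is hypothetical (it is not part of base R); it assumes x has at least two distinct values and skips the rounding that jitter() applies internally before looking for the smallest difference.

```r
# Hypothetical helper, not part of base R: reproduces the rules for `a`
# described above. Assumes x has at least two distinct values.
jitter_amount <- function(x, factor = 1, amount = NULL) {
  z <- max(x) - min(x)                 # range of the data
  if (is.null(amount)) {
    d <- min(diff(sort(unique(x))))    # smallest gap between adjacent unique values
    factor * d / 5
  } else if (amount == 0) {
    factor * z / 50
  } else {
    amount                             # a positive amount is used as-is
  }
}

x <- 1:10
jitter_amount(x)                            # 0.2
jitter_amount(x, factor = 100)              # 20
jitter_amount(x, amount = 0)                # 0.18
jitter_amount(x, factor = 10, amount = 10)  # 10
```

With x = 1:10 the default gives a = 0.2, so every jittered value stays within 0.2 of the original one.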

(x = c(1:10))
##  [1]  1  2  3  4  5  6  7  8  9 10
jitter(x)
##  [1] 1.130417 1.986822 2.988324 3.859536 5.000287 5.857188 6.918340
##  [8] 8.169720 8.874839 9.905134
jitter(x, factor = 1)
##  [1]  0.834980  1.835347  3.013703  3.871540  5.081525  6.090221  6.962873
##  [8]  7.851089  8.928137 10.162149
jitter(x, factor = 100)
##  [1]  -2.893356 -17.441460   1.516910   9.771455   7.970764  -8.361613
##  [7]   5.423622  17.443090  22.896337  -4.537215
jitter(x, factor = 1000)
##  [1] -183.377948  134.473433 -184.563341  102.799489   94.493774
##  [6]  -51.109763   -8.312566  104.303829  -70.242552  151.695206
jitter(x, factor = 1, amount = 1)
##  [1] 1.569603 2.476006 2.667142 3.165431 4.174408 5.520519 7.001732
##  [8] 7.417415 8.922810 9.026437
jitter(x, factor = 1, amount = 10)
##  [1]  1.636452 -3.943719 -2.927030  4.313734 -4.221980  9.040693 14.964047
##  [8] 14.971467 16.330573 18.446910
jitter(x, factor = 10, amount = 10)
##  [1] 10.3411754  1.4124063  1.5017123 -1.0730307  7.1035618  7.6606934
##  [7] -2.9521261  1.3903118  0.6792946  8.9670520
jitter(x, factor = 10, amount = 100)
##  [1]  54.23251  64.86327 -14.59972  75.91645  84.13828  51.97269 -12.05809
##  [8] 106.60717  73.01726 -79.91987
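
Because jitter() draws random numbers, the values above change on every run. The following sketch uses set.seed() (the seed 42 is an arbitrary choice, not from the original post) to make the output reproducible and to check two things: that factor has no effect when a positive amount is given, and that the noise stays within the expected bounds.

```r
x <- 1:10

# Same seed, same positive amount, different factor
set.seed(42)
j1 <- jitter(x, factor = 1, amount = 10)
set.seed(42)
j2 <- jitter(x, factor = 10, amount = 10)

identical(j1, j2)                   # TRUE: factor is ignored when amount > 0
all(abs(j1 - x) <= 10)              # TRUE: noise lies within [-amount, amount]

# Default call: a = 1 * 1/5 = 0.2 for x = 1:10
# (small tolerance added for floating-point rounding)
j3 <- jitter(x)
all(abs(j3 - x) <= 0.2 + 1e-9)      # TRUE
```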

jitter() is also useful for data visualization. In a scatter plot where one variable takes only a few distinct values, the points can overlap, which makes the data hard to read.
# Data: 5 groups on the x axis, 50 points each
X <- rep(1:5, each = 50)
a <- runif(50, min = 0, max = 10)
Y <- c(a - 2, a - 3, a + 2, a + 4, a + 3)

par(mfrow = c(1, 2))
# plot without jitter (overlapping points)
plot(X, Y, pch = 22, main = 'Without `jitter()`', cex.main = 0.75)
# plot with jitter applied to X
plot(jitter(X), Y, pch = 22, col = 'darkviolet', xlab = "X", ylab = "Y", main = 'Using `jitter()`', cex.main = 0.75)

With jitter() applied to X, the individual points are easier to tell apart.
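
The same effect can be checked numerically. This is a small sketch that rebuilds an X vector like the one in the plot and counts the distinct horizontal positions before and after jittering; the name Xj is only illustrative.

```r
X <- rep(1:5, each = 50)

length(unique(X))         # 5: all 250 points share only 5 x positions

Xj <- jitter(X)           # default: a = 1 * 1/5 = 0.2
length(unique(Xj))        # far more than 5 distinct positions

# Each point stays close to its original group
# (small tolerance added for floating-point rounding)
all(abs(Xj - X) <= 0.2 + 1e-9)
```

Since the default noise is at most 0.2 here, the five groups remain clearly separated while the points inside each group spread out.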
