The merge() function merges two data frames horizontally by key variables.
The parameters are:
- x: first data frame to be merged.
- y: second data frame to be merged.
- by: the variable or variables used to do the merging.
- incomparables: values that cannot be matched; intended to be used when merging on one column.

#dataframe 1:
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
iris$ID <- row.names(iris)
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species ID
## 1 5.1 3.5 1.4 0.2 setosa 1
## 2 4.9 3.0 1.4 0.2 setosa 2
## 3 4.7 3.2 1.3 0.2 setosa 3
## 4 4.6 3.1 1.5 0.2 setosa 4
## 5 5.0 3.6 1.4 0.2 setosa 5
## 6 5.4 3.9 1.7 0.4 setosa 6
#dataframe 2:
ID <- c(1,3,5,7,9,11,13)
y <- rep('test', 7)
Sepal.Length <- c(5.1,5.1,4.9,5.1,5.1,5.1,4.8)
z <- data.frame(ID,y, Sepal.Length)
z
## ID y Sepal.Length
## 1 1 test 5.1
## 2 3 test 5.1
## 3 5 test 4.9
## 4 7 test 5.1
## 5 9 test 5.1
## 6 11 test 5.1
## 7 13 test 4.8
To get only the rows that match on the ID variable:
#merge dataframes by 'ID' variable:
merge(iris, z, by = 'ID')
## ID Sepal.Length.x Sepal.Width Petal.Length Petal.Width Species y
## 1 1 5.1 3.5 1.4 0.2 setosa test
## 2 11 5.4 3.7 1.5 0.2 setosa test
## 3 13 4.8 3.0 1.4 0.1 setosa test
## 4 3 4.7 3.2 1.3 0.2 setosa test
## 5 5 5.0 3.6 1.4 0.2 setosa test
## 6 7 4.6 3.4 1.4 0.3 setosa test
## 7 9 4.4 2.9 1.4 0.2 setosa test
## Sepal.Length.y
## 1 5.1
## 2 5.1
## 3 4.8
## 4 5.1
## 5 4.9
## 6 5.1
## 7 5.1
We get 7 rows.
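Two details of this output are worth a closer look. The Sepal.Length column exists in both data frames but is not a key, so it comes back twice, as Sepal.Length.x and Sepal.Length.y. The rows are ordered 1, 11, 13, 3, ... because ID was built with row.names() and is therefore a character column, which sorts as text. The following is a minimal sketch under assumptions, not part of the original example: it works on copies of the data (the names iris_chr, iris_num, merged_chr and merged_num and the suffix labels are made up for illustration), uses the suffixes argument of merge() to rename the duplicated column, and converts ID to integer to get numeric ordering.

```r
# Work on copies so the built-in iris data set is left untouched.
iris_chr <- datasets::iris
iris_chr$ID <- row.names(iris_chr)               # character key, as in the example above

z <- data.frame(ID = c(1, 3, 5, 7, 9, 11, 13),
                y = rep("test", 7),
                Sepal.Length = c(5.1, 5.1, 4.9, 5.1, 5.1, 5.1, 4.8))

# suffixes replaces the default ".x"/".y" added to non-key columns
# that appear in both inputs (the labels here are illustrative).
merged_chr <- merge(iris_chr, z, by = "ID", suffixes = c(".iris", ".z"))
names(merged_chr)
merged_chr$ID    # character key: sorted as text

# Assumption for illustration: a numeric ordering of the key is wanted.
iris_num <- datasets::iris
iris_num$ID <- as.integer(row.names(iris_num))
merged_num <- merge(iris_num, z, by = "ID", suffixes = c(".iris", ".z"))
merged_num$ID    # integer key: sorted as numbers
```

Both merges return the same 7 rows; only the column names and the row order differ.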
To get only the rows that match on the ID variable, except for the values that we put in the incomparables parameter:
#merge dataframes by 'ID' variable, except the values in the `incomparables` parameter:
merge(iris, z, by = 'ID', incomparables = c(1,3,4,5,6,7))
## ID Sepal.Length.x Sepal.Width Petal.Length Petal.Width Species y
## 1 11 5.4 3.7 1.5 0.2 setosa test
## 2 13 4.8 3.0 1.4 0.1 setosa test
## 3 3 4.7 3.2 1.3 0.2 setosa test
## 4 9 4.4 2.9 1.4 0.2 setosa test
## Sepal.Length.y
## 1 5.1
## 2 4.8
## 3 5.1
## 4 5.1
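Here, we get 4 rows. In this output ID 3 is still present even though it is listed in incomparables, so the result is worth checking against the values you intended to exclude.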
To get the rows that match on both the ID and Sepal.Length variables:
#merge dataframes by 'ID' and 'Sepal.Length' variables:
merge(iris, z, by = c('ID', 'Sepal.Length'))
## ID Sepal.Length Sepal.Width Petal.Length Petal.Width Species y
## 1 1 5.1 3.5 1.4 0.2 setosa test
## 2 13 4.8 3.0 1.4 0.1 setosa test
And here we get only 2 rows, because a row is kept only when both ID and Sepal.Length agree in the two data frames.
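As a cross-check on these row counts, the sketch below rebuilds both data frames on a copy and compares the merge on ID alone with the merge on ID and Sepal.Length. The names iris_id, one_key, two_key and idx are hypothetical, chosen only for this illustration. The manual check at the end assumes that exact equality of the Sepal.Length values is the intended matching criterion.

```r
# Work on a copy so the built-in iris data set is left untouched.
iris_id <- datasets::iris
iris_id$ID <- row.names(iris_id)

z <- data.frame(ID = c(1, 3, 5, 7, 9, 11, 13),
                y = rep("test", 7),
                Sepal.Length = c(5.1, 5.1, 4.9, 5.1, 5.1, 5.1, 4.8))

one_key <- merge(iris_id, z, by = "ID")
two_key <- merge(iris_id, z, by = c("ID", "Sepal.Length"))

nrow(one_key)    # 7: every ID in z exists in iris
nrow(two_key)    # 2: only two of those rows also agree on Sepal.Length

# Manual check: find each ID of z in iris_id, then keep the IDs
# whose Sepal.Length is the same in both data frames.
idx <- match(z$ID, iris_id$ID)
z$ID[iris_id$Sepal.Length[idx] == z$Sepal.Length]    # 1 and 13
```

With two keys there is no duplicated Sepal.Length column in the result, since a key column appears only once in the merged output.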