merge_plus.Rd
merge_plus
is a wrapper for a standard merge, a fuzzy string match,
and a "multivar" match based on several columns of the data. Parameters allow
fine-tuning of the match. This is primarily used as the
workhorse for the tier_match
function.
merge_plus(
data1,
data2,
by = NULL,
by.x = NULL,
by.y = NULL,
suffixes = c("_1", "_2"),
check_merge = TRUE,
unique_key_1,
unique_key_2,
match_type = "exact",
fuzzy_settings = build_fuzzy_settings(),
score_settings = NULL,
filter = NULL,
filter.args = list(),
evaluate = match_evaluate,
evaluate.args = list(),
allow.cartesian = FALSE,
multivar_settings = build_multivar_settings()
)
data.frame. First dataset to merge (ordering matters; see the Fuzzy Matching vignette).
data.frame. Second dataset to merge.
character string. Variables to merge on (common to data1 and
data2). See merge
length-1 character vector. Variable to merge on in data1. See merge
length-1 character vector. Variable to merge on in data2. See merge
character vector of length 2. Suffixes to add to like-named
variables after the merge. See merge
logical. Checks that your unique_keys are indeed unique.
character vector. Primary key of data1 that uniquely identifies each row (can be multiple fields)
character vector. Primary key of data2 that uniquely identifies each row (can be multiple fields)
string. If 'exact', match is exact; if 'fuzzy', match is
fuzzy; if 'multivar', match is multivar-based. See multivar_match.
Additional arguments for amatch, to be used if match_type
= 'fuzzy'. Suggested defaults provided. See build_fuzzy_settings.
list. Score settings for post-hoc matchscores. See build_score_settings
function or numeric. Filters the merged data1-data2 dataset. If a function, it should take a data.frame (data1 and data2 merged by name1 and name2) as its first argument and return a trimmed version of that data.frame (fewer rows). Think of this function as applying conditions to matches beyond the match by name. If numeric, drops all observations with a matchscore lower than or equal to filter.
list. Arguments passed to filter, if a function
Function to evaluate merge_plus output.
list. Arguments passed to evaluate
logical. Whether or not to allow many-to-many matches. See data.table::merge()
list of settings to go to the multivar match if match_type
== 'multivar'. See multivar_match
and build_multivar_settings.
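The filter argument described above accepts a function whose first argument is the merged data.frame and which returns a subset of its rows. A minimal sketch of such a function follows. The function name same_country_filter, its ignore_case argument, and the columns country_1 and country_2 are all hypothetical: they assume both inputs carry a country column that picks up the default suffixes after the merge.

```r
# Hypothetical filter: keep only candidate matches whose countries agree.
# country_1 / country_2 are assumed column names, not part of merge_plus.
same_country_filter <- function(merged, ignore_case = TRUE) {
  left <- merged$country_1
  right <- merged$country_2
  if (ignore_case) {
    left <- tolower(left)
    right <- tolower(right)
  }
  # Rows with a missing country on either side are dropped.
  keep <- !is.na(left) & !is.na(right) & left == right
  merged[which(keep), ]
}

# Toy stand-in for a merged data1-data2 table, used only to exercise the filter.
merged_toy <- data.frame(
  name_1 = c("Acme", "Acme", "Globex"),
  name_2 = c("Acme", "Acme", "Globex"),
  country_1 = c("USA", "USA", "Canada"),
  country_2 = c("usa", "Mexico", "Canada"),
  stringsAsFactors = FALSE
)

filtered_toy <- same_country_filter(merged_toy)
print(filtered_toy)
```

Under these assumptions, the function could be supplied as filter = same_country_filter, with any extra arguments given through filter.args, for example filter.args = list(ignore_case = TRUE).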
list with matches, filtered matches (if applicable), data1 and data2 minus matches, and match evaluation
match_evaluate
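A minimal sketch of an exact-match call, using only arguments from the usage above. It assumes merge_plus is exported by the fedmatch package (the package is not named on this page). The toy tables and their column names (company, name, id_a, id_b) are invented for illustration.

```r
# Assumption: merge_plus lives in the fedmatch package; skip quietly if absent.
if (requireNamespace("fedmatch", quietly = TRUE)) {
  # Toy inputs; a data.table is also a data.frame.
  firms_a <- data.table::data.table(
    id_a = 1:3,
    company = c("Acme", "Globex", "Initech")
  )
  firms_b <- data.table::data.table(
    id_b = 101:103,
    name = c("Acme", "Initech", "Umbrella")
  )

  result <- fedmatch::merge_plus(
    data1 = firms_a,
    data2 = firms_b,
    by.x = "company",        # merge variable in data1
    by.y = "name",           # merge variable in data2
    unique_key_1 = "id_a",   # uniquely identifies rows of data1
    unique_key_2 = "id_b",   # uniquely identifies rows of data2
    match_type = "exact"
  )

  # Inspect the components of the returned list.
  print(names(result))
}
```

The unique keys are given distinct names in the two tables so that they remain distinguishable in the merged output.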