build_tier.Rd
build_tier is a convenient way to make the proper list for the tier_list argument of tier_match.
Each vector in build_score_settings should be the same length, and each position (first, second, third, etc.) corresponds to one variable to score on.
build_tier(
by.x = NULL,
by.y = NULL,
check_merge = NULL,
match_type = NULL,
fuzzy_settings = build_fuzzy_settings(),
score_settings = NULL,
filter = NULL,
filter.args = NULL,
evaluate = NULL,
evaluate.args = NULL,
clean_settings = build_clean_settings(),
clean = NULL,
sequential_words = NULL,
allow.cartesian = FALSE,
multivar_settings = build_multivar_settings()
)
by.x: character string. Variable to merge on in data1. See merge.
by.y: character string. Variable to merge on in data2. See merge.
check_merge: logical. Checks that your unique_keys are indeed unique.
match_type: string. If 'exact', the match is exact; if 'fuzzy', the match is fuzzy; if 'multivar', the match is multivar-based. See multivar_match.
fuzzy_settings: additional arguments for amatch, used if match_type = 'fuzzy'. Suggested defaults are provided (see amatch, method = 'jw').
score_settings: list. Score settings for post-hoc matchscores.
filter: function or numeric. Filters a merged data1-data2 dataset. If a function, it should take a data.frame (data1 and data2 merged by name1 and name2) and return a trimmed version of that data.frame (fewer rows). Think of this function as applying conditions to matches beyond the match by name. The first argument of filter should be the data.frame. If numeric, all observations with a matchscore lower than or equal to filter are dropped.
filter.args: list. Arguments passed to filter, if it is a function.
evaluate: function. Evaluates merge_plus output.
evaluate.args: list. Arguments passed to evaluate.
clean_settings: list. Settings for string cleaning. See clean_strings and build_clean_settings.
clean: logical (TRUE/FALSE). Whether to clean strings prior to the match.
sequential_words: data.table of words in the same format as the common_words argument of clean_strings. Each of these will be replaced in the by columns.
allow.cartesian: logical. Whether to allow many-to-many matches; see data.table::merge().
multivar_settings: list of settings passed to the multivar match if match_type == 'multivar'. See multivar_match.
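The filter argument is described above as a function that takes the merged data.frame as its first argument and returns fewer rows. A minimal base R sketch of such a function follows. The function name same_state_filter, the columns state_1 and state_2, and the toy data are hypothetical and only illustrate the expected shape; they are not part of the package.

```r
# Hypothetical filter: keep only candidate matches whose state columns agree.
# The first argument is the merged data.frame, as the filter argument requires.
same_state_filter <- function(merged, col_x = "state_1", col_y = "state_2") {
  keep <- !is.na(merged[[col_x]]) & !is.na(merged[[col_y]]) &
    merged[[col_x]] == merged[[col_y]]
  # Return the same columns with a subset of the rows.
  merged[keep, , drop = FALSE]
}

# Toy stand-in for a data1-data2 merge (made-up values).
merged <- data.frame(
  name1   = c("Acme Corp", "Acme Corp", "Beta LLC"),
  name2   = c("Acme Corporation", "Acme Co", "Beta"),
  state_1 = c("NY", "NY", "CA"),
  state_2 = c("NY", "TX", NA),
  stringsAsFactors = FALSE
)

trimmed <- same_state_filter(merged)
print(trimmed)
```

Under these assumptions, the function would be supplied as filter = same_state_filter, with any non-default column names given through filter.args, for example filter.args = list(col_x = "state_1", col_y = "state_2").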
a list containing 1 tier for the 'tier_list' argument of tier_match.
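Since each call returns a single tier, several calls can be collected into a named list. The sketch below uses only arguments shown in the usage above and assumes the package that provides build_tier is fedmatch and is installed. The tier names exact and fuzzy_cleaned are arbitrary labels chosen for illustration.

```r
# Assumption: build_tier and build_fuzzy_settings come from the fedmatch package.
library(fedmatch)

# One tier per entry: first an exact match, then a fuzzy match on cleaned strings.
tier_list <- list(
  exact = build_tier(match_type = "exact"),
  fuzzy_cleaned = build_tier(
    match_type = "fuzzy",
    clean = TRUE,
    fuzzy_settings = build_fuzzy_settings()  # suggested defaults
  )
)

# Inspect the structure of the resulting settings.
str(tier_list, max.level = 1)
```

The resulting list is what the description above refers to as the tier_list argument of tier_match; check the tier_match help page for the exact argument name and the other arguments it requires.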