build_tier_settings is a convenient way to make the proper list for the tier_list argument of tier_match Each vector in build_score_settings should be the same length, and each position (first, second, third, etc.) corresponds to one variable to score on.

build_tier(
  by.x = NULL,
  by.y = NULL,
  check_merge = NULL,
  match_type = NULL,
  fuzzy_settings = build_fuzzy_settings(),
  score_settings = NULL,
  filter = NULL,
  filter.args = NULL,
  evaluate = NULL,
  evaluate.args = NULL,
  clean_settings = build_clean_settings(),
  clean = NULL,
  sequential_words = NULL,
  allow.cartesian = FALSE,
  multivar_settings = build_multivar_settings()
)

Arguments

by.x

character string. Variable to merge on in data1. See merge

by.y

character string. Variable to merge on in data2. See merge

check_merge

logical. Checks that your unique_keys are indeed unique.

match_type

string. If 'exact', match is exact, if 'fuzzy', match is fuzzy. If 'multivar,' match is multivar-based. See multivar_match,

fuzzy_settings

additional arguments for amatch, to be used if match_type = 'fuzzy'. Suggested defaults provided. (see amatch, method='jw')

score_settings

list. Score settings for post-hoc matchscores.

filter

function or numeric. Filters a merged data1-data2 dataset. If a function, should take in a data.frame (data1 and data2 merged by name1 and name2) and spit out a trimmed verion of the data.frame (fewer rows). Think of this function as applying other conditions to matches, other than a match by name. The first argument of filter should be the data.frame. If numeric, will drop all observations with a matchscore lower than or equal to filter.

filter.args

list. Arguments passed to filter, if a function

evaluate

Function to evalute merge_plus output.

evaluate.args

list. Arguments passed to evaluate

clean_settings

list. Settings for string cleaning. See clean_strings and build_clean_settings.

clean

Boolean, T/F, whether or not to clean strings prior to the match.

sequential_words

data.table of words in the same format of the common_words argument in clean_strings. Each of these will be replaced from the by columns.

allow.cartesian

whether or not to allow many-many matches, see data.table::merge()

multivar_settings

list of settings to go to the multivar match if match_type == 'multivar'. See multivar-match.

Value

a list containing 1 tier for the 'tier_list' argument of tier_match.