build_multivar_settings is a convenient way to build the list passed to the multivar_settings argument of merge_plus.

build_multivar_settings(
  logit = NULL,
  missing = FALSE,
  wgts = NULL,
  compare_type = "diff",
  blocks = NULL,
  blocks.x = NULL,
  blocks.y = NULL,
  top = 1,
  threshold = NULL,
  nthread = 1
)
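
As a rough illustration of the usage above, the sketch below collects a few of the documented arguments and passes them on only if build_multivar_settings is available in the session. The two-variable "by" setup and the weight values are assumptions made for the example, not values taken from the package.

```r
# Illustrative arguments only: assumes merge_plus() will be called with two
# `by` variables, so compare_type and wgts each have length 2.
args <- list(
  compare_type = c("stringdist", "indicator"),  # options listed under compare_type
  wgts         = c(0.7, 0.3),                   # hypothetical weights
  top          = 1,
  nthread      = 1
)

# Only call the function if the package that provides it is attached.
if (exists("build_multivar_settings")) {
  settings <- do.call(build_multivar_settings, args)
  str(settings)
}
```

The resulting list is what would then be handed to merge_plus through its multivar_settings argument.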

Arguments

logit

a glm or lm model resulting from a logit regression on a verified dataset. See details.
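
A minimal sketch of how such a model might be fitted with base R. The verified dataset, its column names, and the 0/1 match label are all invented for illustration; the variable naming that merge_plus expects in the model is not specified here.

```r
# Hypothetical verified dataset: one row per candidate pair, a 0/1 `match`
# label, and one comparison score per matching variable.
verified <- data.frame(
  match        = c(1, 1, 1, 0, 0, 0, 1, 0, 1, 0),
  name_compare = c(0.95, 0.90, 0.60, 0.20, 0.55, 0.30, 0.70, 0.65, 0.85, 0.40),
  zip_compare  = c(1, 1, 0, 0, 1, 0, 1, 0, 0, 1)
)

# Logit regression of the match label on the comparison scores.
logit_model <- glm(
  match ~ name_compare + zip_compare,
  family = binomial(link = "logit"),
  data   = verified
)
```

An object like logit_model is the kind of value the logit argument describes.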

missing

logical (TRUE/FALSE). Whether to treat missing (NA) observations as their own binary column for each column in by. See details.

wgts

rather than a model, you can supply weights used to calculate the matchscore. These can be the weights returned by calculate_weights.

compare_type

a vector with the same length as "by" that describes how to compare the variables. Options are "in", "indicator", "substr", "difference", "ratio", "stringdist", and "wgt_jaccard_dist". See the Multivar Matching Vignette for details.

blocks

variable present in both data sets to "block" on before computing scores. Matchscores will only be computed for observations that share a block. See details.

blocks.x

name of the blocking variables in x. Cannot be supplied together with blocks.

blocks.y

name of the blocking variables in y. Cannot be supplied together with blocks.

top

integer. Number of matches to return for each observation.

threshold

numeric. Minimum score for a match to be included in the result.

nthread

integer. Number of cores to use when computing all combinations. See parallel::makeCluster().
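
One possible way to choose a value for nthread, sketched with the base parallel package. Leaving one core free is a convention assumed here, not a requirement of the function.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
available <- detectCores()
if (is.na(available)) available <- 1L

# Leave one core free for the rest of the system, but never go below 1.
n_threads <- max(1L, available - 1L)
```

The value of n_threads could then be passed as nthread.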

Value

a list containing options for the 'multivar_settings' argument of merge_plus.