build_clean_settings is a convenience function that builds the list expected by the clean_settings argument of tier_match.

build_clean_settings(
  sp_char_words = fedmatch::sp_char_words,
  common_words = NULL,
  remove_char = NULL,
  remove_words = FALSE,
  stem = FALSE
)

Arguments

sp_char_words

data.frame. Data.frame where the first column is special characters and the second column is full words. The default is fedmatch::sp_char_words.

common_words

data.frame. Data.frame where the first column is abbreviations and the second column is full words.

remove_char

character vector. String of specific characters (for example, "letters") to be removed.

remove_words

logical. If TRUE, removes all abbreviations and replacement words in common_words.

stem

logical. If TRUE, words are stemmed.

Value

list. Settings to pass to clean_strings.
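
The following is a minimal sketch of how the arguments above might be combined, assuming the fedmatch package is installed. The abbreviation table my_common_words and its column names are hypothetical and exist only for illustration. Only the column order (abbreviations first, full words second) follows the argument description.

```r
# Sketch only: assumes the fedmatch package is installed.
library(fedmatch)

# Hypothetical abbreviation table for illustration:
# first column holds abbreviations, second column holds the full words.
my_common_words <- data.frame(
  abbr = c("corp", "inc"),
  long_names = c("corporation", "incorporated"),
  stringsAsFactors = FALSE
)

# Build the settings list; sp_char_words is left at its default
# (fedmatch::sp_char_words), and no specific characters are removed.
settings <- build_clean_settings(
  common_words = my_common_words,
  remove_words = FALSE,
  stem = FALSE
)

# Inspect the resulting list before passing it on.
str(settings)
```

The resulting list would then be supplied as the clean_settings argument of tier_match.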