Calculate the word corpus for weighted Jaccard matching

build_corpus(namelist1, namelist2)

Arguments

namelist1

Character vector of names from dataset 1.

namelist2

Character vector of names from dataset 2.

Value

A data.table giving, for each word that appears in the two name vectors, its frequency, inverse frequency, and log inverse frequency.
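
Examples

The sketch below illustrates the kind of table described under Value. It is a base-R approximation written for illustration only, not the package's implementation. The helper name sketch_corpus, the column names (word, freq, inv_freq, log_inv_freq), whitespace tokenization, and the use of relative frequency (word count divided by total word count) are all assumptions; the actual build_corpus() may tokenize, name, or normalize differently, and it returns a data.table rather than a data.frame.

```r
# Hypothetical helper, not part of the package: approximates the table
# that build_corpus() is described as returning, using base R only.
sketch_corpus <- function(namelist1, namelist2) {
  # Pool both name vectors and split each name into words on whitespace
  # (this tokenization rule is an assumption).
  all_names <- c(namelist1, namelist2)
  words <- unlist(strsplit(trimws(all_names), "\\s+"))
  words <- words[nzchar(words)]

  # Count how often each word occurs across the pooled names.
  counts <- table(words)

  corpus <- data.frame(
    word = names(counts),
    # Relative frequency: share of all word occurrences (assumed definition).
    freq = as.numeric(counts) / sum(counts),
    stringsAsFactors = FALSE
  )
  # Rare words get a large inverse frequency; the log dampens the scale.
  corpus$inv_freq <- 1 / corpus$freq
  corpus$log_inv_freq <- log(corpus$inv_freq)
  corpus
}

names1 <- c("first national bank", "bank of the west")
names2 <- c("first bank", "west coast credit union")

corpus <- sketch_corpus(names1, names2)
# Show the most common words first.
print(corpus[order(-corpus$freq), ])
```

Under these assumptions, a common word such as "bank" receives a small log inverse frequency and a rare word such as "union" a large one, so a weighted Jaccard match built on this table would count shared rare words more heavily than shared common words.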