similR: Similarity and Distance metrics for binary matrices

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("USCCANA/similR")

Example

An example from the manual:

library(similR)

data(powerset03)
 
# We can compute it over the entire set
head(similarity(powerset03, statistic="s14"))
#>      i j        s14
#> [1,] 1 2  0.6324555
#> [2,] 1 3 -0.2000000
#> [3,] 1 4  0.6324555
#> [4,] 1 5  0.4472136
#> [5,] 1 6 -0.3162278
#> [6,] 1 7 -0.2000000

# Or over a few matrices passed individually (here three, giving three pairs)
head(similarity(powerset03[[1]], powerset03[[2]], powerset03[[3]], statistic="s14"))
#>      i j        s14
#> [1,] 1 2  0.6324555
#> [2,] 1 3 -0.2000000
#> [3,] 2 3  0.6324555

# We can compute multiple statistics at the same time
ans <- similarity(powerset03, statistic=c("hamming", "dennis", "jaccard"))
head(ans)
#>      i j   hamming     dennis   jaccard
#> [1,] 1 2 0.1666667  1.6329932 0.5000000
#> [2,] 1 3 0.3333333 -0.5773503 0.0000000
#> [3,] 1 4 0.1666667  1.6329932 0.5000000
#> [4,] 1 5 0.3333333  1.0000000 0.3333333
#> [5,] 1 6 0.5000000 -0.8164966 0.0000000
#> [6,] 1 7 0.3333333 -0.5773503 0.0000000
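
The output above is in long form, one row per pair `(i, j)`. If a square matrix is more convenient, a minimal base-R sketch along these lines may help. `pairs_to_square` is a hypothetical helper, not part of similR, and mirroring the values across the diagonal assumes the chosen statistic is symmetric, which the example does not confirm for every statistic.

```r
# Pairwise output copied from the three-matrix example above
pairs <- cbind(
  i   = c(1, 1, 2),
  j   = c(2, 3, 3),
  s14 = c(0.6324555, -0.2000000, 0.6324555)
)

# Hypothetical helper (not part of similR): turn long (i, j, value) rows
# into a square matrix. Filling both triangles assumes a symmetric statistic.
pairs_to_square <- function(x, stat) {
  n   <- max(x[, c("i", "j")])
  out <- matrix(NA_real_, n, n)          # diagonal stays NA (no self-pairs)
  out[x[, c("i", "j"), drop = FALSE]] <- x[, stat]
  out[x[, c("j", "i"), drop = FALSE]] <- x[, stat]
  out
}

sq <- pairs_to_square(pairs, "s14")
sq
```

The same helper could be applied to any column of a multi-statistic result by changing the `stat` argument.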

Currently, the full list of available statistics is:

data("statistics")
statistics
#> $similarity
#>  [1] "sanderberg" "sdisp"      "sfaith"     "sgk"        "sgl"       
#>  [6] "shamann"    "sjaccard"   "smichael"   "speirce"    "sph1"      
#> [11] "starwid"    "syuleq"     "syuleqw"   
#> 
#> $distance
#> [1] "dennis"   "dhamming" "dmh"      "dsd"      "dsphd"    "dyuleq"
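
As a rough illustration of what two of the more familiar statistics measure, here is a base-R sketch of a Hamming distance and a Jaccard index for a pair of binary adjacency matrices. The helpers `offdiag`, `hamming_sketch`, and `jaccard_sketch` are hypothetical and are not similR's implementation; dropping the diagonal and normalizing by the number of off-diagonal cells are assumptions, so the package's own definitions may differ.

```r
# Hypothetical helpers (not part of similR): base-R versions of two common
# binary-matrix statistics. Ignoring the diagonal is an assumption.
offdiag <- function(m) m[row(m) != col(m)]

hamming_sketch <- function(a, b) {
  # Share of off-diagonal cells in which the two matrices disagree
  mean(offdiag(a) != offdiag(b))
}

jaccard_sketch <- function(a, b) {
  x <- offdiag(a) == 1
  y <- offdiag(b) == 1
  both   <- sum(x & y)   # cells equal to 1 in both matrices
  either <- sum(x | y)   # cells equal to 1 in at least one matrix
  if (either == 0) return(NaN)  # undefined when neither matrix has a tie
  both / either
}

# Two toy 3x3 adjacency matrices that differ in a single tie
a <- matrix(0, 3, 3); a[1, 2] <- 1
b <- a;               b[2, 1] <- 1

hamming_sketch(a, b)  # 1 of the 6 off-diagonal cells differs
jaccard_sketch(a, b)  # 1 shared tie out of 2 ties present overall
```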
