univariate discretization — st

Classifies a univariate numeric vector into intervals. This is a wrapper around classInt::classify_intervals().

Usage

st_unidisc(x, k, method = "quantile", factor = FALSE, seed = 123456789, ...)

Arguments

x: A continuous numeric vector.
k: (optional) Number of classes required. If missing, grDevices::nclass.Sturges() is used; see also the "dpih" and "headtails" styles for automatic choice of the number of classes. k must be greater than 3.
method: Classification style, one of "fixed", "sd", "equal", "pretty", "quantile", "kmeans", "hclust", "bclust", "fisher", "jenks", "dpih", "headtails", "maximum", or "box". Default is "quantile".
factor: (optional) Default is FALSE. If TRUE, the result is returned as a factor with intervals as labels rather than as integers.
seed: (optional) Random seed number, default is 123456789. Setting the random seed is useful when the sample size is greater than 3000 (the default value of largeN) and the data are discretized by sampling 10% of the observations (the default value of samp_prop).
...: (optional) Other arguments passed to classInt::classify_intervals(), see ?classInt::classify_intervals().
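
As a rough illustration of how these arguments fit together, the sketch below shows one way such a wrapper could be written. The function name unidisc_sketch and the toy vector are hypothetical, and this is not the package's actual source; it only assumes that classInt::classify_intervals() accepts var, n, style, and factor as described in its help page.

```r
# Hypothetical sketch of a wrapper with the same arguments as st_unidisc();
# NOT the package's real implementation.
unidisc_sketch <- function(x, k, method = "quantile", factor = FALSE,
                           seed = 123456789, ...) {
  # Fall back to Sturges' rule when the number of classes is not supplied
  if (missing(k)) k <- grDevices::nclass.Sturges(x)
  # Fix the RNG state so that sampling-based or randomized styles are repeatable
  set.seed(seed)
  # Delegate the classification itself to classInt
  classInt::classify_intervals(x, n = k, style = method, ..., factor = factor)
}

# Toy data (made up for illustration only)
x <- c(2.1, 3.4, 5.0, 6.7, 8.2, 9.9, 11.3, 12.8, 14.1, 15.6, 17.2, 18.9)
unidisc_sketch(x, k = 4)                 # integer class indices
unidisc_sketch(x, k = 4, factor = TRUE)  # factor with interval labels
```

Passing factor straight through keeps the integer codes and the interval labels consistent with whatever classInt produces.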

Value

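A vector of the same length as x giving the class of each element: integer class indices by default, or a factor with intervals as labels if factor = TRUE.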

Author

Wenbo Lv lyu.geosocial@gmail.com

Examples

xvar = c(22361, 9573, 4836, 5309, 10384, 4359, 11016, 4414, 3327, 3408,
         17816, 6909, 6936, 7990, 3758, 3569, 21965, 3605, 2181, 1892,
         2459, 2934, 6399, 8578, 8537, 4840, 12132, 3734, 4372, 9073,
         7508, 5203)
st_unidisc(xvar, k = 6, method = 'sd')
#>  [1] 5 3 2 2 3 2 3 2 2 2 5 2 2 3 2 2 5 2 2 1 2 2 2 3 3 2 3 2 2 3 3 2
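
The role of the seed argument can be illustrated directly with classInt. The sketch below assumes, as the seed description states, that vectors longer than largeN (3000 by default) are classified from a random subsample; the simulated data, the seed value 42, and the choice of the "fisher" style are illustrative assumptions. classInt may emit a warning about the large sample size.

```r
# Simulated data with more than 3000 observations (illustrative only)
set.seed(1)
big_x <- rnorm(5000)

# Same seed before each call, so any internal subsampling draws the same sample
set.seed(42)
a <- classInt::classify_intervals(big_x, n = 5, style = "fisher", factor = FALSE)
set.seed(42)
b <- classInt::classify_intervals(big_x, n = 5, style = "fisher", factor = FALSE)

# Compare the two classifications
identical(a, b)
```

Without resetting the seed, the two calls may draw different subsamples and so may place the class breaks differently.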