numeric.encoder()
returns an encoder for a quantitative variable.
Usage
numeric.encoder(
x,
k,
type = 1L,
encoding.digits = NULL,
tag = "x",
frame = NULL,
weights = NULL
)
numeric.frame(
reps = NULL,
breaks = NULL,
type = NULL,
encoding.digits = NULL,
tag = "x"
)
# S3 method for class 'encoder'
print(x, digits = NULL, ...)
Arguments
- x
a numeric vector to be encoded.
- k
an integer specifying the coarseness of the encoding. If not positive, all unique values of x are used as sample points.
- type
an integer specifying the encoding method. If 1, values are encoded on a [0, 1] scale based on linear interpolation of the knots. If 0, values are encoded to 0 or 1 using one-hot encoding on the intervals.
- encoding.digits
an integer specifying the rounding digits for the encoding when type is 1.
- tag
character string. The name of the variable.
- frame
a "numeric.frame" object or a numeric vector that defines the sample points of the binning.
- weights
optional. A numeric vector of sample weights for each value of x.
- reps
a numeric vector to be used as the representative values (knots).
- breaks
a numeric vector to be used as the binning breaks.
- digits
the minimum number of significant digits to be used.
- ...
not used.
Value
numeric.encoder() returns a list containing the following components:
- frame
an object of class "numeric.frame".
- encode
a function to encode x into a dummy matrix.
- n
the number of encoding levels.
- type
the type of encoding, "linear" or "constant".
numeric.frame() returns a "numeric.frame" object containing the encoding information.
Details
numeric.encoder() selects sample points from the variable x and returns a list containing the encode() function, which converts a vector into a dummy matrix.
If type is 1, k is considered the maximum number of knots, and each value lying between two knots is encoded as two decimal weights that reflect its relative position between those knots.
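The linear encoding described above can be illustrated with a minimal base R sketch. The helper linear_encode() is hypothetical and is not part of the documented API; it assumes that values outside the knot range are assigned entirely to the nearest boundary knot and that missing values produce an all-zero row, which is the behavior suggested by the output in the Examples section.

```r
# Hypothetical helper (not the package's implementation): encode each value as
# weights on its two neighbouring knots, so every non-missing row sums to 1.
linear_encode <- function(x, knots) {
  knots <- sort(unique(knots))
  n <- length(knots)
  mat <- matrix(0, nrow = length(x), ncol = n,
                dimnames = list(NULL, as.character(knots)))
  for (i in seq_along(x)) {
    xi <- x[i]
    if (is.na(xi)) next                       # missing values stay all-zero
    if (xi <= knots[1L]) { mat[i, 1L] <- 1; next }
    if (xi >= knots[n])  { mat[i, n]  <- 1; next }
    j <- findInterval(xi, knots)              # knots[j] <= xi < knots[j + 1]
    w <- (xi - knots[j]) / (knots[j + 1L] - knots[j])
    mat[i, j] <- 1 - w                        # weight on the left knot
    mat[i, j + 1L] <- w                       # weight on the right knot
  }
  mat
}

lin <- linear_encode(c(4:8, NA), knots = c(3, 5, 7, 9))
lin
```

With knots at 3, 5, 7 and 9, the value 4 sits halfway between the first two knots, so its row carries a weight of 0.5 on each of them.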
If type is 0, k is considered the maximum number of intervals, and the values are converted using one-hot encoding on the intervals.
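The interval-based encoding can be sketched in base R in the same spirit. The helper onehot_encode() is hypothetical and is not part of the documented API; it assumes left-closed intervals whose outermost breaks are extended to -Inf and Inf, mirroring the column labels shown in the Examples output, and it leaves missing values as all-zero rows.

```r
# Hypothetical helper (not the package's implementation): one-hot encode values
# over left-closed intervals, with open-ended first and last intervals.
onehot_encode <- function(x, breaks) {
  breaks <- sort(unique(breaks))
  inner <- breaks[-c(1L, length(breaks))]     # outer breaks become -Inf / Inf
  edges <- c(-Inf, inner, Inf)
  n <- length(edges) - 1L
  labels <- paste0("[", edges[-length(edges)], ", ", edges[-1L], ")")
  mat <- matrix(0L, nrow = length(x), ncol = n,
                dimnames = list(NULL, labels))
  idx <- findInterval(x, inner) + 1L          # interval index; NA if x is NA
  ok <- !is.na(idx)
  mat[cbind(which(ok), idx[ok])] <- 1L        # mark one interval per row
  mat
}

oh <- onehot_encode(c(4:8, NA), breaks = seq(3, 9, 2))
oh
```

Each non-missing value falls into exactly one interval, so each of those rows contains a single 1.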
Examples
data(iris, package = "datasets")
enc <- numeric.encoder(x = iris$Sepal.Length, k = 5L, tag = "Sepal.Length")
enc$frame
#> Sepal.Length Sepal.Length_min Sepal.Length_max
#> 1 4.3 4.30 4.70
#> 2 5.1 4.70 5.45
#> 3 5.8 5.45 6.10
#> 4 6.4 6.10 7.15
#> 5 7.9 7.15 7.90
enc$encode(x = c(4:8, NA))
#> 4.3 5.1 5.8 6.4 7.9
#> [1,] 1.000 0.000 0.0000000 0.0000000 0.0
#> [2,] 0.125 0.875 0.0000000 0.0000000 0.0
#> [3,] 0.000 0.000 0.6666667 0.3333333 0.0
#> [4,] 0.000 0.000 0.0000000 0.6000000 0.4
#> [5,] 0.000 0.000 0.0000000 0.0000000 1.0
#> [6,] 0.000 0.000 0.0000000 0.0000000 0.0
frm <- numeric.frame(breaks = seq(3, 9, 2), type = 0L)
enc <- numeric.encoder(x = iris$Sepal.Length, frame = frm)
enc$encode(x = c(4:8, NA))
#> [-Inf, 5) [5, 7) [7, Inf)
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 0 1 0
#> [4,] 0 0 1
#> [5,] 0 0 1
#> [6,] 0 0 0
enc <- numeric.encoder(x = iris$Sepal.Length, frame = seq(3, 9, 2))
enc$encode(x = c(4:8, NA))
#> 3 5 7 9
#> [1,] 0.5 0.5 0.0 0.0
#> [2,] 0.0 1.0 0.0 0.0
#> [3,] 0.0 0.5 0.5 0.0
#> [4,] 0.0 0.0 1.0 0.0
#> [5,] 0.0 0.0 0.5 0.5
#> [6,] 0.0 0.0 0.0 0.0