API Index

CategoricalArrays.CategoricalArray - Type
CategoricalArray{T}(undef, dims::Dims; levels=nothing, ordered=false)
CategoricalArray{T}(undef, dims::Int...; levels=nothing, ordered=false)

Construct an uninitialized CategoricalArray with levels of type T <: Union{AbstractChar, AbstractString, Number} and dimensions dims.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

CategoricalArray{T, N, R}(undef, dims::Dims; levels=nothing, ordered=false)
CategoricalArray{T, N, R}(undef, dims::Int...; levels=nothing, ordered=false)

Similar to the definition above, but uses reference type R instead of the default type (UInt32).

CategoricalArray(A::AbstractArray; levels=nothing, ordered=false)

Construct a new CategoricalArray with the values from A and the same element type.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). If levels is omitted and the element type supports it, levels are sorted in ascending order; else, they are kept in their order of appearance in A. The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

If A is already a CategoricalArray, its levels, orderedness and reference type are preserved unless explicitly overridden.
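
As an illustration of the constructors described above, the following minimal sketch builds an ordered array with explicit levels and an uninitialized array. The data values and level names are made up for the example.

```julia
using CategoricalArrays

# Explicit levels fix the order low < medium < high; without the `levels`
# keyword, string levels would be sorted alphabetically instead.
x = CategoricalArray(["low", "high", "medium", "low"];
                     levels=["low", "medium", "high"], ordered=true)

levels(x)      # "low", "medium", "high"
isordered(x)   # true
x[1] < x[2]    # true: "low" precedes "high" in the levels

# Uninitialized 2×3 array whose entries can later be set to "a" or "b".
y = CategoricalArray{String}(undef, 2, 3; levels=["a", "b"])
size(y)        # (2, 3)
```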

CategoricalArrays.CategoricalMatrix - Type
CategoricalMatrix{T}(undef, m::Int, n::Int; levels=nothing, ordered=false)

Construct an uninitialized CategoricalMatrix with levels of type T <: Union{AbstractChar, AbstractString, Number} and dimensions (m, n). The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

CategoricalMatrix{T, R}(undef, m::Int, n::Int; levels=nothing, ordered=false)

Similar to the definition above, but uses reference type R instead of the default type (UInt32).

CategoricalMatrix(A::AbstractMatrix; levels=nothing, ordered=false)

Construct a CategoricalMatrix with the values from A and the same element type.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). If levels is omitted and the element type supports it, levels are sorted in ascending order; else, they are kept in their order of appearance in A. The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

If A is already a CategoricalMatrix, its levels, orderedness and reference type are preserved unless explicitly overridden.

CategoricalArrays.CategoricalValue - Type
CategoricalValue{T <: Union{AbstractChar, AbstractString, Number}, R <: Integer}

A wrapper around a value of type T corresponding to a level in a CategoricalPool.

CategoricalValue objects are considered as equal to the value of type T they wrap by == and isequal. However, order comparisons like < and isless are only possible if isordered is true for the value's pool, and in that case the order of the pool's levels is used rather than the standard ordering of values of type T.
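
The comparison rules above can be sketched as follows. The levels are deliberately listed in reverse alphabetical order (an arbitrary choice for this example) to show that order comparisons follow the levels rather than the standard String ordering.

```julia
using CategoricalArrays

# Levels deliberately in reverse alphabetical order.
x = categorical(["b", "a", "c"]; levels=["c", "b", "a"], ordered=true)

v = x[1]           # a CategoricalValue wrapping "b"
v == "b"           # true: equality compares the wrapped value
isequal(v, "b")    # true
x[3] < x[1]        # true: "c" comes before "b" in the levels
"c" < "b"          # false under the standard String ordering
```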

CategoricalArrays.CategoricalValue - Method
CategoricalValue(value, source::Union{CategoricalValue, CategoricalArray})

Return a CategoricalValue object wrapping value and attached to the CategoricalPool of source.
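
A minimal sketch of this method, assuming the wrapped value must already be one of the levels of source (here "a" is a level of x that does not appear in the data):

```julia
using CategoricalArrays

x = categorical(["b", "c"]; levels=["c", "b", "a"], ordered=true)

# "a" is a level of x even though no entry of x currently holds it.
cv = CategoricalValue("a", x)

cv == "a"     # true
x[1] < cv     # true: "b" comes before "a" in the levels of x
```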

CategoricalArrays.CategoricalVector - Type
CategoricalVector{T}(undef, m::Int; levels=nothing, ordered=false)

Construct an uninitialized CategoricalVector with levels of type T <: Union{AbstractChar, AbstractString, Number} and length m.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

CategoricalVector{T, R}(undef, m::Int; levels=nothing, ordered=false)

Similar to the definition above, but uses reference type R instead of the default type (UInt32).

CategoricalVector(A::AbstractVector; levels=nothing, ordered=false)

Construct a CategoricalVector with the values from A and the same element type.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). If levels is omitted and the element type supports it, levels are sorted in ascending order; else, they are kept in their order of appearance in A. The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

If A is already a CategoricalVector, its levels, orderedness and reference type are preserved unless explicitly overridden.

CategoricalArrays.categorical - Method
categorical(A::AbstractArray; levels=nothing, ordered=false, compress=false)

Construct a categorical array with the values from A.

The levels keyword argument can be a vector specifying possible values for the data (this is equivalent to but more efficient than calling levels! on the resulting array). If levels is omitted and the element type supports it, levels are sorted in ascending order; else, they are kept in their order of appearance in A. The ordered keyword argument determines whether the array values can be compared according to the ordering of levels or not (see isordered).

If compress is true, the smallest reference type able to hold the number of unique values in A will be used. While this will reduce memory use, passing this parameter will also introduce a type instability which can affect performance inside the function where the call is made. Therefore, use this option with caution (the one-argument version does not suffer from this problem).

categorical(A::CategoricalArray; compress=false, levels=nothing, ordered=false)

If A is already a CategoricalArray, its levels, orderedness and reference type are preserved unless explicitly overridden.
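
The effect of compress and the preservation of properties can be sketched as below. The input values are arbitrary; the type checks rely on the reference type being the third type parameter, as in the signatures above.

```julia
using CategoricalArrays

a = categorical(["x", "y", "x"])                  # default reference type
b = categorical(["x", "y", "x"]; compress=true)   # smallest reference type

a isa CategoricalArray{String, 1, UInt32}   # true
b isa CategoricalArray{String, 1, UInt8}    # true
a == b                                      # true: same values

# An existing categorical array keeps its levels and orderedness...
c = categorical(["x", "y"]; levels=["y", "x"], ordered=true)
d = categorical(c)
levels(d)       # "y", "x"
isordered(d)    # true

# ...unless a keyword argument overrides them.
e = categorical(c; ordered=false)
isordered(e)    # false
```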

CategoricalArrays.compress - Method
compress(A::CategoricalArray)

Return a copy of categorical array A using the smallest reference type able to hold the number of levels of A.

While this will reduce memory use, this function is type-unstable, which can affect performance inside the function where the call is made. Therefore, use it with caution.

CategoricalArrays.cut - Method
cut(x::AbstractArray, breaks::AbstractVector;
    labels::Union{AbstractVector,Function},
    extend::Union{Bool,Missing}=false, allowempty::Bool=false)

Cut a numeric array into intervals at values breaks and return an ordered CategoricalArray indicating the interval into which each entry falls. Intervals are of the form [lower, upper), i.e. the lower bound is included and the upper bound is excluded. The one exception is the last interval when extend=true, which is closed on both ends, i.e. [lower, upper].

If x accepts missing values (i.e. eltype(x) >: Missing) the returned array will also accept them.

Keyword arguments

  • extend::Union{Bool, Missing}=false: when false, an error is raised if some values in x fall outside of the breaks; when true, breaks are automatically added to include all values in x, and the upper bound is included in the last interval; when missing, values outside of the breaks generate missing entries.
  • labels::Union{AbstractVector, Function}: a vector of strings, characters or numbers giving the names to use for the intervals; or a function f(from, to, i; leftclosed, rightclosed) that generates the labels from the left and right interval boundaries and the group index. Defaults to "[from, to)" (or "[from, to]" for the rightmost interval if extend == true).
  • allowempty::Bool=false: when false, an error is raised if some breaks appear multiple times, generating empty intervals; when true, duplicate breaks are allowed and the intervals they generate are kept as unused levels (but duplicate labels are not allowed).
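
The labels and extend keywords listed above can be sketched as follows. The inputs are made-up values chosen so that each one falls clearly inside or outside the breaks.

```julia
using CategoricalArrays

# One custom label per interval: [0, 4), [4, 8), [8, 12)
c = cut([1, 5, 9], [0, 4, 8, 12]; labels=["low", "mid", "high"])
c == ["low", "mid", "high"]   # true
isordered(c)                  # true

# With extend=missing, values outside the breaks become missing
# instead of raising an error.
m = cut([-1.0, 0.5, 2.0], [0, 1]; extend=missing)
ismissing(m[1])   # true: -1.0 is below the first break
ismissing(m[3])   # true: 2.0 is above the last break
```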

Examples

julia> using CategoricalArrays

julia> cut(-1:0.5:1, [0, 1], extend=true)
5-element CategoricalArray{String,1,UInt32}:
 "[-1.0, 0.0)"
 "[-1.0, 0.0)"
 "[0.0, 1.0]"
 "[0.0, 1.0]"
 "[0.0, 1.0]" 

julia> cut(-1:0.5:1, 2)
5-element CategoricalArray{String,1,UInt32}:
 "Q1: [-1.0, 0.0)"
 "Q1: [-1.0, 0.0)"
 "Q2: [0.0, 1.0]"
 "Q2: [0.0, 1.0]"
 "Q2: [0.0, 1.0]" 

julia> cut(-1:0.5:1, 2, labels=["A", "B"])
5-element CategoricalArray{String,1,UInt32}:
 "A"
 "A"
 "B"
 "B"
 "B" 

julia> cut(-1:0.5:1, 2, labels=[-0.5, +0.5])
5-element CategoricalArray{Float64,1,UInt32}:
 -0.5
 -0.5
 0.5
 0.5
 0.5

julia> fmt(from, to, i; leftclosed, rightclosed) = "grp $i ($from//$to)"
fmt (generic function with 1 method)

julia> cut(-1:0.5:1, 3, labels=fmt)
5-element CategoricalArray{String,1,UInt32}:
 "grp 1 (-1.0//-0.3333333333333335)"
 "grp 1 (-1.0//-0.3333333333333335)"
 "grp 2 (-0.3333333333333335//0.33333333333333326)"
 "grp 3 (0.33333333333333326//1.0)"
 "grp 3 (0.33333333333333326//1.0)"

CategoricalArrays.cut - Method
cut(x::AbstractArray, ngroups::Integer;
    labels::Union{AbstractVector{<:AbstractString},Function},
    allowempty::Bool=false)

Cut a numeric array into ngroups quantiles, determined using quantile.

If x contains missing values, they are automatically skipped when computing quantiles.

Keyword arguments

  • labels::Union{AbstractVector, Function}: a vector of strings, characters or numbers giving the names to use for the intervals; or a function f(from, to, i; leftclosed, rightclosed) that generates the labels from the left and right interval boundaries and the group index. Defaults to "Qi: [from, to)" (or "Qi: [from, to]" for the rightmost interval).
  • allowempty::Bool=false: when false, an error is raised if some quantile breakpoints are equal, generating empty intervals; when true, duplicate breaks are allowed and the intervals they generate are kept as unused levels (but duplicate labels are not allowed).

CategoricalArrays.decompress - Method
decompress(A::CategoricalArray)

Return a copy of categorical array A using the default reference type (UInt32). If A is using a small reference type (such as UInt8 or UInt16) the decompressed array will have room for more levels.

To avoid the need to call decompress, ensure compress is not called when creating the categorical array.
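
A minimal round trip through compress and decompress might look like this. The type checks assume the reference type is the third type parameter of CategoricalArray, as in the constructor signatures.

```julia
using CategoricalArrays

a = categorical(["x", "y", "x"])   # uses the default reference type

c = compress(a)
c isa CategoricalArray{String, 1, UInt8}    # true: two levels fit in UInt8

d = decompress(c)
d isa CategoricalArray{String, 1, UInt32}   # true: back to the default

# Values and levels are unchanged by either operation.
c == a && d == a
levels(d) == levels(a)
```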

CategoricalArrays.isordered - Method
isordered(A::CategoricalArray)

Test whether entries in A can be compared using <, > and similar operators, using the ordering of levels.

CategoricalArrays.levels! - Method
levels!(A::CategoricalArray, newlevels::Vector; allowmissing::Bool=false)

Set the levels of categorical array A. The order of appearance of levels will be respected by levels, which may affect display of results in some operations; if A is ordered (see isordered), it will also be used for order comparisons using <, > and similar operators. Reordering levels will never affect the values of entries in the array.

If A accepts missing values (i.e. eltype(A) >: Missing) and allowmissing=true, entries corresponding to omitted levels will be set to missing. Else, newlevels must include all levels which appear in the data.
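
Both behaviours can be sketched with small made-up arrays: reordering levels leaves the data untouched, while dropping a level with allowmissing=true turns the matching entries into missing.

```julia
using CategoricalArrays

x = categorical(["a", "b", "c", "a"])
levels!(x, ["c", "b", "a"])
levels(x)                     # "c", "b", "a"
x == ["a", "b", "c", "a"]     # true: entries are unchanged

# The array must accept missing values for levels to be dropped.
y = categorical(Union{String, Missing}["a", "b", "c"])
levels!(y, ["a", "b"]; allowmissing=true)
ismissing(y[3])               # true: "c" is no longer a level
```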

CategoricalArrays.ordered! - Method
ordered!(A::CategoricalArray, ordered::Bool)

Set whether entries in A can be compared using <, > and similar operators, using the ordering of levels. Return the modified A.
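
For instance, an array created without ordered=true can be switched to ordered afterwards (a minimal sketch with arbitrary values):

```julia
using CategoricalArrays

x = categorical(["a", "b", "a"])
isordered(x)         # false: unordered by default

ordered!(x, true)    # returns the modified x
isordered(x)         # true
x[1] < x[2]          # true: "a" precedes "b" in the levels
```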

CategoricalArrays.recode - Function
recode(a::AbstractArray[, default::Any], pairs::Pair...)

Return a copy of a, replacing elements matching a key of pairs with the corresponding value. The type of the array is chosen so that it can hold all recoded elements (but not necessarily original elements from a).

For each Pair in pairs, if the element is equal to the key (first item of the pair) according to isequal, or is contained in the key when it is a collection, then the corresponding value (second item) is used. If the element matches no key and default is not provided or nothing, it is copied as-is; if default is specified, it is used in place of the original element. If an element matches more than one key, the first match is used.

recode(a::CategoricalArray[, default::Any], pairs::Pair...)

If a is a CategoricalArray then the ordering of resulting levels is determined by the order of passed pairs and default will be the last level if provided.
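
The level ordering described above can be sketched as follows, with made-up level names; "other" is the default and therefore ends up as the last level.

```julia
using CategoricalArrays

x = categorical(["a", "b", "c", "a"])

# "c" matches no pair, so it is replaced with the default "other".
y = recode(x, "other", "a" => "first", "b" => "second")

y == ["first", "second", "other", "first"]   # true
levels(y)   # "first", "second", "other": pairs in order, then the default
```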

Examples

julia> using CategoricalArrays

julia> recode(1:10, 1=>100, 2:4=>0, [5; 9:10]=>-1)
10-element Vector{Int64}:
 100
   0
   0
   0
  -1
   6
   7
   8
  -1
  -1

recode(a::AbstractArray{>:Missing}[, default::Any], pairs::Pair...)

If a contains missing values, they are never replaced with default: use missing in a pair to recode them. If no pair has missing as a key, the returned array will accept missing values.

Examples

julia> using CategoricalArrays

julia> recode(1:10, 1=>100, 2:4=>0, [5; 9:10]=>-1, 6=>missing)
10-element Vector{Union{Missing, Int64}}:
 100
   0
   0
   0
  -1
    missing
   7
   8
  -1
  -1    

CategoricalArrays.recode! - Function
recode!(dest::AbstractArray, src::AbstractArray[, default::Any], pairs::Pair...)

Fill dest with elements from src, replacing those matching a key of pairs with the corresponding value.

For each Pair in pairs, if the element is equal to (according to isequal) the key (first item of the pair) or to one of its entries if it is a collection, then the corresponding value (second item) is copied to dest. If the element matches no key and default is not provided or nothing, it is copied as-is; if default is specified, it is used in place of the original element. dest and src must be of the same length, but not necessarily of the same type. Elements of src as well as values from pairs will be converted when possible on assignment. If an element matches more than one key, the first match is used.

recode!(dest::CategoricalArray, src::AbstractArray[, default::Any], pairs::Pair...)

If dest is a CategoricalArray then the ordering of resulting levels is determined by the order of passed pairs and default will be the last level if provided.

recode!(dest::AbstractArray, src::AbstractArray{>:Missing}[, default::Any], pairs::Pair...)

If src contains missing values, they are never replaced with default: use missing in a pair to recode them.
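
The two-array form and the handling of missing values can be sketched as below. The arrays and labels are invented for the example; dest is preallocated with an element type able to hold the recoded values.

```julia
using CategoricalArrays

src = [1, 2, 3, 4]
dest = Vector{String}(undef, 4)

# 4 matches no key, so the default "other" is used.
recode!(dest, src, "other", 1 => "one", 2:3 => "few")
dest == ["one", "few", "few", "other"]   # true

# missing entries are not replaced by the default...
src2 = [1, missing, 3]
dest2 = Vector{Union{String, Missing}}(undef, 3)
recode!(dest2, src2, "other", 1 => "one")
ismissing(dest2[2])                      # true

# ...unless a pair recodes them explicitly.
recode!(dest2, src2, "other", 1 => "one", missing => "unknown")
dest2 == ["one", "unknown", "other"]     # true
```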

CategoricalArrays.recode! - Method
recode!(a::AbstractArray[, default::Any], pairs::Pair...)

Convenience function for in-place recoding, equivalent to recode!(a, a, ...).

Examples

julia> using CategoricalArrays

julia> x = collect(1:10);

julia> recode!(x, 1=>100, 2:4=>0, [5; 9:10]=>-1);

julia> x
10-element Vector{Int64}:
 100
   0
   0
   0
  -1
   6
   7
   8
  -1
  -1

DataAPI.levels - Method
levels(x::CategoricalArray; skipmissing=true)
levels(x::CategoricalValue)

Return the levels of categorical array or value x. This may include levels which do not actually appear in the data (see droplevels!). missing will be included only if it appears in the data and skipmissing=false is passed.

The returned vector is an internal field of x which must not be mutated as doing so would corrupt it.
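
A short sketch of unused levels and of the skipmissing keyword, using made-up data:

```julia
using CategoricalArrays

# "c" is declared as a level but never appears in the data.
x = categorical(["a", "b", "a"]; levels=["a", "b", "c"])
levels(x)          # "a", "b", "c"

droplevels!(x)     # remove levels that are not used
levels(x)          # "a", "b"

# missing is only reported when skipmissing=false is passed.
y = categorical(["a", missing])
levels(y)                              # "a"
any(ismissing, levels(y; skipmissing=false))   # true
```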

DataAPI.unwrap - Method
unwrap(x::CategoricalValue)
unwrap(x::Missing)

Get the value wrapped by categorical value x. If x is missing, return missing.
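
For example, unwrap can be applied to a single value or broadcast over an array to recover plain values (a minimal sketch with arbitrary data):

```julia
using CategoricalArrays

x = categorical(["a", "b"])
v = x[1]                 # a CategoricalValue

unwrap(v)                # "a", as a plain String
unwrap(v) isa String     # true
unwrap(missing)          # missing

# Broadcasting unwraps every entry of the array.
unwrap.(x) == ["a", "b"] # true
```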
