OnlineLogBinning API Reference

Documentation for OnlineLogBinning.

OnlineLogBinning.MINIMUM_VAR_64 - Constant
const MINIMUM_VAR_64 = eps(Float64)
const MINIMUM_VAR_32 = eps(Float32)
const MINIMUM_VAR_16 = eps(Float16)

Minimum allowable variance values based on the least-squares fit type. Any data stream variances smaller than these are suspiciously small, and one should not trust an automated binning analysis in these instances.

OnlineLogBinning.OLB_tested_numbers - Constant
OLB_tested_numbers

Defines the list of tested numerical types for OnlineLogBinning.jl.

Note

These types are specifically given as:

  • Float16, Float32, Float64 for Real numbers.
  • ComplexF16, ComplexF32, ComplexF64 for Complex numbers.

OnlineLogBinning.BinningAccumulator - Type
BinningAccumulator{T}() where {T <: Number} (default T = Float64)
BinningAccumulator{T}(::Vector{LevelAccumulator{T}})
BinningAccumulator{T}(::Int) where {T <: Number} (default T = Float64)

Main data structure for the binning analysis. T == Float64 by default in the empty constructor. There are three constructors: an empty one, one that copy-constructs from a Vector of LevelAccumulators, and one that pre-allocates that Vector based on an anticipated data stream size.

Contents

  • LvlAccums::Vector{LevelAccumulator{T}}
Compat

The pre-allocated constructor requires at least version 0.2.2.

Example

julia> # Create a BinningAccumulator with the default type T == Float64

julia> bacc = BinningAccumulator()  
BinningAccumulator{Float64} with 0 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

julia> # Add a data stream using the push! function

julia> # (The data stream does not have to have a length == power of 2.)

julia> push!(bacc, [1, 2, 3, 4])
BinningAccumulator{Float64} with 2 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 4
    Taccum   = 10.0
    Saccum   = 5.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = 2.5
    Current Variance         = 1.6666666666666667
    Current Std. Deviation   = 1.2909944487358056
    Current Var. of the Mean = 0.4166666666666667
    Current Std. Error       = 0.6454972243679028

1th Binning Level:
LevelAccumulator{Float64} with online fields:
    level    = 1
    num_bins = 2
    Taccum   = 5.0
    Saccum   = 2.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = 2.5
    Current Variance         = 2.0
    Current Std. Deviation   = 1.4142135623730951
    Current Var. of the Mean = 1.0
    Current Std. Error       = 1.0

2th Binning Level:
LevelAccumulator{Float64} with online fields:
    level    = 2
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(false, [0.0, 2.5])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

OnlineLogBinning.BinningAnalysisResult - Type
BinningAnalysisResult{T <: AbstractFloat}

Small struct recording whether a plateau was found (see _plateau_found) for a BinningAccumulator, and what its value is.

Contents

  • plateau_found::Bool: whether fit_RxValues found a plateau in the binned data.
  • RxAmplitude::T: the value of the plateau as calculated by fit_RxValues.
    • If plateau_found == false, then RxAmplitude = length(X) for a data stream X, so as to maximize the error estimate.
  • effective_length::Int: the effective number of uncorrelated data points in the datastream X as calculated by

\[m_{\rm eff} = \mathtt{floor} \left( \frac{\mathtt{length}(X)}{R_X} \right).\]

  • binning_mean::T: the value of the mean as calculated by

\[\mathtt{mean}(X) = \frac{ T^{(0)} }{ m^{(0)} }.\]

  • binning_error::T: the value of the error as calculated by

\[\begin{aligned} \mathtt{error}(X) &= \sqrt{ \frac{ S^{(0)} }{ m_{\rm eff} \left( m^{(0)} - 1 \right) } } \\ &= \sqrt{ \left[ \mathtt{floor}\left( \frac{m^{(0)}}{R_X} \right) \right]^{-1} \, \frac{ S^{(0)} }{ m^{(0)} - 1 } }. \end{aligned}\]
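
As a rough illustration of how the three formulas above fit together, the sketch below evaluates them from level-0 accumulator values. The function name binning_summary and its arguments are hypothetical and are not part of OnlineLogBinning.jl; only the arithmetic follows the formulas above, and $R_X = 1$ is chosen purely for illustration.

```julia
# Hypothetical helper (not part of OnlineLogBinning.jl): evaluates the
# formulas above from the level-0 accumulators T0 and S0, the number of
# elements m0, and a plateau value Rx.
function binning_summary(T0, S0, m0::Integer, Rx)
    m_eff = floor(Int, m0 / Rx)               # effective number of uncorrelated points
    mean_X = T0 / m0                          # mean(X) = T0 / m0
    error_X = sqrt(S0 / (m_eff * (m0 - 1)))   # error(X)
    return (effective_length = m_eff, binning_mean = mean_X, binning_error = error_X)
end

# Level-0 values for the stream [1, 2, 3, 4] (T0 = 10, S0 = 5, m0 = 4),
# with Rx = 1, i.e. the data are treated as uncorrelated.
res = binning_summary(10.0, 5.0, 4, 1.0)
```

With $R_X = 1$ the error reduces to the ordinary standard error of the unbinned level, which is the value printed as Current Std. Error for [1, 2, 3, 4] in the BinningAccumulator example.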

OnlineLogBinning.LevelAccumulator - Type
LevelAccumulator{T <: Number}

Accumulator structure for a given binning level.

Contents

  • level::Int
    • Registers the binning level this accumulator is assigned to
  • num_bins::Int
    • How many elements (i.e. bins) have been added to this accumulator
  • Taccum::T
    • Stands for Total Accumulator.
    • This represents the T accumulator for the mean: mean ≡ T / num_bins.
  • Saccum::T
    • Stands for Square Accumulator.
    • This represents the S accumulator for the variance: var ≡ S/(num_bins - 1).
  • Paccum::PairAccumulator{T}
    • An outward facing PairAccumulator to meet incoming data streams.
    • This accumulator processes the incoming data and then exports the Tvalue and Svalue into updates for Taccum and Saccum, respectively.

OnlineLogBinning.PairAccumulator - Type
PairAccumulator{T <: Number}

Accumulator that directly faces an incoming data stream. Two values from that stream enter and are processed into the exported values of Tvalue and Svalue.

Contents

  • fullpair::Bool
    • A Boolean to keep track of which element of the pair is being accessed. Additionally, when fullpair == true then the contents are exported.
  • values::MVector{2, T}
    • The individual values taken from the data stream to be processed. Both Tvalue and Svalue rely on them being accessible.

Base.getindex - Method
getindex(bacc::BinningAccumulator; level)

Overload the [] notation to access the BinningAccumulator's LvlAccums at the binning level given by the level keyword.

Example

julia> bacc = BinningAccumulator();

julia> bacc[level = 0]
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

Base.length - Method
length(bacc::BinningAccumulator)

Return the number of LevelAccumulators in the BinningAccumulator.

Example

julia> bacc = BinningAccumulator();

julia> push!(bacc, [1, 2, 3, 4, 3, 2, 1]); # Data stream with 7 elements

julia> length(bacc) # 2 binned levels plus 1 for the unbinned data
3

Base.push! - Method
push!(bacc::BinningAccumulator, itr)

push! each value of the data stream itr through the BinningAccumulator.

Example

julia> bacc = BinningAccumulator()
BinningAccumulator{Float64} with 0 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

julia> push!(bacc, [42, -26])
BinningAccumulator{Float64} with 1 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 2
    Taccum   = 16.0
    Saccum   = 2312.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = 8.0
    Current Variance         = 2312.0
    Current Std. Deviation   = 48.08326112068523
    Current Var. of the Mean = 1156.0
    Current Std. Error       = 34.0

1th Binning Level:
LevelAccumulator{Float64} with online fields:
    level    = 1
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(false, [0.0, 8.0])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

Base.push! - Method
push!(bacc::BinningAccumulator, value::Number)

Add a single value from the data stream into the online binning analysis. The value enters at the lowest (0th) binning level.

Example

julia> bacc = BinningAccumulator()
BinningAccumulator{Float64} with 0 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(true, [0.0, 0.0])

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN

julia> push!(bacc, 42)
BinningAccumulator{Float64} with 0 binning levels.
0th Binning Level (unbinned data):
LevelAccumulator{Float64} with online fields:
    level    = 0
    num_bins = 0
    Taccum   = 0.0
    Saccum   = 0.0
    Paccum   = PairAccumulator{Float64}(false, [0.0, 42.0])        

    Calculated Level Statistics:
    Current Mean             = NaN
    Current Variance         = -0.0
    Current Std. Deviation   = -0.0
    Current Var. of the Mean = NaN
    Current Std. Error       = NaN
Note

Notice that Taccum and Saccum remain zero while num_bins == 0. They are only accumulated once per input pair, that is, once Paccum.fullpair == true.

Base.push! - Method
push!(pacc::PairAccumulator, value::Number)

Overload Base.push! for a PairAccumulator. One can only push! a single value <: Number at a time into this type of accumulator.

Base.show - Method
show([io::IO = stdout], bacc::BinningAccumulator)

Overload the Base.show function for human-readable displays.

Base.show - Method
show([io = stdout], lacc::LevelAccumulator)

Overload Base.show for human-readable displays.

OnlineLogBinning.RxValue - Function
RxValue(bacc::BinningAccumulator, [trustworthy_only = true]; [trusting_cutoff])

Calculate the RxValues from the statistically trustworthy binning levels by default, or from all of them if trustworthy_only == false.

OnlineLogBinning.RxValue - Method
RxValue(bacc::BinningAccumulator, level)

Compute the $R_X$ quantity from the binning analysis. This quantity starts at $1$ for low binning levels, then gradually rises, until the bins become statistically uncorrelated at which point $R_X$ should saturate. Once saturated, the effective number of uncorrelated elements in a correlated data stream of size $M$ is given in terms of $R_X$ by $M / R_X$.

OnlineLogBinning.Svalue - Method
Svalue(lacc::LevelAccumulator)

Function to calculate the online $S_{1,m+2}$ summation as:

\[S_{1,m+2} = S_{1,m} + S_{m+1,m+2} + \frac{m}{2(m+2)}\left( \frac{2}{m} T_{1,m} - T_{m+1,m+2} \right)^2,\]

where $T_{m+1,m+2}$ is the pairwise Tvalue for the PairAccumulator.
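
The update rule above can be checked on a small stream. The following is a minimal stand-alone sketch, not the package's internal code: the names pair_T, pair_S, and accumulate_pair are hypothetical, and the running total is assumed to update as $T_{1,m+2} = T_{1,m} + T_{m+1,m+2}$.

```julia
# Hypothetical stand-alone versions of the pair formulas and the update rule
# (illustrative names, not the package's internals).
pair_T(x1, x2) = x1 + x2             # T_{m+1,m+2}
pair_S(x1, x2) = (x2 - x1)^2 / 2     # S_{m+1,m+2}

# Fold one new pair (x1, x2) into running totals (T, S) built from m values.
function accumulate_pair(T, S, m, x1, x2)
    Tp, Sp = pair_T(x1, x2), pair_S(x1, x2)
    # With m == 0 there is no previous data, so the cross term is taken as
    # zero (this also avoids dividing by m = 0).
    cross = m == 0 ? zero(Sp) : m / (2 * (m + 2)) * (2 / m * T - Tp)^2
    return T + Tp, S + Sp + cross, m + 2
end

# Stream [1, 2, 3, 4] processed as two pairs.
T, S, m = accumulate_pair(0.0, 0.0, 0, 1.0, 2.0)
T, S, m = accumulate_pair(T, S, m, 3.0, 4.0)
```

After both pairs, T and S agree with the Taccum = 10.0 and Saccum = 5.0 shown for the 0th level in the push!(bacc, [1, 2, 3, 4]) example.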

OnlineLogBinning.Svalue - Method
Svalue(pacc::PairAccumulator)

The $S$ function for a single pair, following the accumulation of $m$ data points, is

\[\begin{aligned} S_{m+1, m+2} &\equiv \sum_{k = m+1}^{m+2} \left( x_k - \frac{1}{2} T_{m+1,m+2} \right)^2 \\ &= \frac{1}{2}\left( x_{m+2} - x_{m+1} \right)^2. \end{aligned}\]

Thus, $S_{m+1,m+2}$ does not need to take $T_{m+1,m+2}$ as an argument.
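
The intermediate step behind this simplification uses only the definition of the pairwise total. Substituting $T_{m+1,m+2} = x_{m+1} + x_{m+2}$ makes the two deviations equal and opposite:

\[\begin{aligned} x_{m+1} - \frac{1}{2} T_{m+1,m+2} &= -\frac{1}{2}\left( x_{m+2} - x_{m+1} \right), \\ x_{m+2} - \frac{1}{2} T_{m+1,m+2} &= +\frac{1}{2}\left( x_{m+2} - x_{m+1} \right), \end{aligned}\]

so the sum of their squares is $2 \cdot \frac{1}{4}\left( x_{m+2} - x_{m+1} \right)^2 = \frac{1}{2}\left( x_{m+2} - x_{m+1} \right)^2$.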

OnlineLogBinning.Tvalue - Method
Tvalue(pacc::PairAccumulator)

The $T$ function for a single pair, following the accumulation of $m$ data points, is

\[T_{m+1, m+2} \equiv \sum_{k = m+1}^{m+2} x_k = x_{m+1} + x_{m+2},\]

as expected.

OnlineLogBinning._plateau_found - Method
_plateau_found(bacc, fit) → Bool

Test whether a plateau has been found from the fit using the LsqFit.jl package. This includes finding reasonable values for the sigmoid parameters.

Note

What counts as a plateau?

A plateau in the RxValues is defined to be present if the following three conditions on the sigmoid fit are all true:

  1. None of the computed level variances are too small.
  2. The amplitude is positive.
  3. The inflection point, given by θ₁ / θ₂, is less than max_trustworthy_level(levels).

If any of these conditions is violated, then we do not trust that the RxValues have actually converged to a single value, meaning that the data stream is not long enough for the binned data to become uncorrelated.
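
The three conditions can be written as a single predicate. This is only a sketch under stated assumptions: the function plateau_conditions_met, its argument list, and the default threshold min_var = eps(Float64) are hypothetical, whereas the package's own _plateau_found operates on a fit object from LsqFit.jl.

```julia
# Hypothetical predicate mirroring the three conditions listed above.
# All names and the default `min_var` are assumptions for illustration.
function plateau_conditions_met(level_vars, amp, θ₁, θ₂, max_level; min_var = eps(Float64))
    vars_ok = all(v -> v >= min_var, level_vars)   # 1. no level variance is too small
    amp_ok = amp > 0                               # 2. the amplitude is positive
    inflection_ok = θ₁ / θ₂ < max_level            # 3. inflection point below the max trustworthy level
    return vars_ok && amp_ok && inflection_ok
end
```

For example, plateau_conditions_met([1.0, 0.9], 2.0, 3.0, 1.0, 5) passes all three checks, while setting any variance to zero, making the amplitude negative, or moving the inflection point past the maximum level makes it return false.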

OnlineLogBinning.bin_depth - Method
bin_depth(bacc::BinningAccumulator)

Number of binned levels present, i.e. the length of the BinningAccumulator minus 1.

Example

julia> bacc = BinningAccumulator();

julia> push!(bacc, [1, 2, 3, 4, 3, 2, 1]); # Data stream with 7 elements

julia> bin_depth(bacc) # 2 binned levels (the unbinned level 0 is not counted)
2

OnlineLogBinning.effective_uncorrelated_values - Method
effective_uncorrelated_values(mvals, RxVal)

Calculation of the effective number of uncorrelated values in a correlated datastream:

\[m_{\rm eff} = \mathtt{floor} \left( \frac{ m^{(0)} }{R_X} \right).\]

OnlineLogBinning.levels_RxValues - Function
levels_RxValues(bacc::BinningAccumulator, [trustworthy_only = true]; [trusting_cutoff = TRUSTING_CUTOFF])

Return a Tuple of identically-sized Vectors. The first element of the Tuple holds the binning levels and the second holds the corresponding RxValues. If trustworthy_only == true, then only the trustworthy levels and values are returned. If trustworthy_only == false, then all levels and values are returned (except for the last level, which is typically not full).

This function is meant to make visualization more convenient and does not offer any different functionality than what was available before.

Compat

Requires OnlineLogBinning.jl v0.3.0 or higher.

OnlineLogBinning.max_trustworthy_level - Method
max_trustworthy_level(nelements; [trusting_cutoff])

Calculates the highest binning level that remains statistically trustworthy according to the TRUSTING_CUTOFF, $t_c$.

Given a number of elements in a data stream, $N$, this quantity is

\[\ell_{\rm max} = {\rm floor} \left[ \log_2 \left( \frac{N}{t_c} \right) \right].\]

OnlineLogBinning.sigmoid - Function
sigmoid(x, [amp = 1], [θ₁ = 0], [θ₂ = 1])

Calculate a Sigmoid at a given argument x. The Sigmoid function $S(x; A, \theta_1, \theta_2)$ is of the form

\[S(x; A, \theta_1, \theta_2) = \frac{A}{1 + \exp\left( \theta_1 - \theta_2 x \right)}.\]

OnlineLogBinning.sigmoid_jacobian - Method
sigmoid_jacobian(x, pvals)

Calculate the "Jacobian" of first derivatives for a sigmoid to speed up the LsqFit fitting. The derivatives are given by

\[\begin{aligned} \frac{\partial S}{\partial A} &= \frac{1}{1 + \exp\left( \theta_1 - \theta_2 x \right)}, \\ & \\ \frac{\partial S}{\partial \theta_1} &= -\frac{A \, \exp\left( \theta_1 - \theta_2 x \right) }{\left[ 1 + \exp\left( \theta_1 - \theta_2 x \right) \right]^2}, \\ & \\ \frac{\partial S}{\partial \theta_2} &= \frac{A \, x \, \exp\left( \theta_1 - \theta_2 x \right) }{\left[ 1 + \exp\left( \theta_1 - \theta_2 x \right) \right]^2}. \end{aligned}\]
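
The derivatives above can be sanity-checked numerically. The sketch below is a scalar, stand-alone version; sigmoid_sketch and sigmoid_grad_sketch are hypothetical names chosen so as not to imply they are the package's implementations (the package's sigmoid_jacobian takes x and a parameter collection pvals instead).

```julia
# Stand-alone sketch of the sigmoid and its first derivatives.
sigmoid_sketch(x, amp = 1, θ₁ = 0, θ₂ = 1) = amp / (1 + exp(θ₁ - θ₂ * x))

function sigmoid_grad_sketch(x, amp, θ₁, θ₂)
    e = exp(θ₁ - θ₂ * x)
    dA = 1 / (1 + e)                  # ∂S/∂A
    dθ₁ = -amp * e / (1 + e)^2        # ∂S/∂θ₁
    dθ₂ = amp * x * e / (1 + e)^2     # ∂S/∂θ₂
    return (dA, dθ₁, dθ₂)
end

# Compare against central finite differences at an arbitrary point.
x, A, t1, t2, h = 0.7, 2.0, 0.5, 1.5, 1e-6
fd = (
    (sigmoid_sketch(x, A + h, t1, t2) - sigmoid_sketch(x, A - h, t1, t2)) / (2h),
    (sigmoid_sketch(x, A, t1 + h, t2) - sigmoid_sketch(x, A, t1 - h, t2)) / (2h),
    (sigmoid_sketch(x, A, t1, t2 + h) - sigmoid_sketch(x, A, t1, t2 - h)) / (2h),
)
grad = sigmoid_grad_sketch(x, A, t1, t2)
```

The analytic tuple grad and the finite-difference tuple fd should agree to within the finite-difference error, which is a quick way to catch a sign mistake in any of the three derivatives.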

OnlineLogBinning.std_error - Method
std_error( bacc::BinningAccumulator )

Online measurement of the BinningAccumulator standard error.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

OnlineLogBinning.std_error - Method
std_error( lacc::LevelAccumulator ) = sqrt(var_of_mean(lacc))

Online measurement of the LevelAccumulator standard error.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

OnlineLogBinning.trustworthy_level - Method
trustworthy_level(level; [trustworthy_cutoff = 64])

A binning level is said to be a trustworthy_level if the number of bins it contains is greater than or equal to the trustworthy_cutoff.

The number of bins $N_{\rm bin}$ in any binning level is related to the number of elements $N$ and its binning level $\ell \in \{0, 1, \dots \}$ by

\[N_{\rm bin} = \frac{N}{2^{\ell}}.\]

This means that, for a given trustworthy_cutoff of $t_c$, the maximum number of trustworthy_levels present is

\[{\rm Total}(\ell) = 1 + {\rm floor} \left[ \log_2 \left( \frac{N}{t_c} \right) \right],\]

where the extra 1 comes from assuming the original data stream has more than $t_c$ elements in it, making the $\ell = 0$ level a trustworthy_level.

Note

Basically, this just means that the statistics we're showing are not susceptible to low-number effects. The $\log_2$ term is calculated using max_trustworthy_level.
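
To make the counting concrete, the sketch below lists which levels of a 1000-element stream pass the cutoff. The helper names bins_in_level and max_level_sketch are hypothetical; the cutoff of 64 is taken from the default shown in the signature above, and the bin count follows the $N / 2^{\ell}$ formula as written, without rounding.

```julia
# Hypothetical helpers illustrating the formulas above.
bins_in_level(N, level) = N / 2^level
max_level_sketch(N; cutoff = 64) = floor(Int, log2(N / cutoff))

N = 1000
lmax = max_level_sketch(N)
# Levels whose bin count is at least the cutoff.
trustworthy = [level for level in 0:10 if bins_in_level(N, level) >= 64]
```

Here the levels 0 through 3 hold 1000, 500, 250, and 125 bins, while level 4 would hold only 62.5, so the number of trustworthy levels is 1 + lmax.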

OnlineLogBinning.var_of_mean - Method
var_of_mean( bacc::BinningAccumulator; [level = 0] )

Online measurement of the BinningAccumulator variance of the mean.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

OnlineLogBinning.var_of_mean - Method
var_of_mean( lacc::LevelAccumulator ) = var(lacc) / lacc.num_bins

Online measurement of the LevelAccumulator variance of the mean.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.mean - Method
mean( bacc::BinningAccumulator; [level = 0] )

Online measurement of the BinningAccumulator mean.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.mean - Method
mean( lacc::LevelAccumulator )

Online measurement of the LevelAccumulator mean.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.std - Method
std( bacc::BinningAccumulator )

Online measurement of the BinningAccumulator standard deviation.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.std - Method
std( lacc::LevelAccumulator ) = sqrt(var(lacc))

Online measurement of the LevelAccumulator standard deviation.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.var - Method
var( bacc::BinningAccumulator; [level = 0] )

Online measurement of the BinningAccumulator variance.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.

Statistics.var - Method
var( lacc::LevelAccumulator )

Online measurement of the LevelAccumulator variance.

Additional information

  • This quantity is considered online even though it is not regularly updated as data is push!ed from the stream.