org.checkita.dqf.core.metrics.df.regular

MultiColumnDFMetrics

object MultiColumnDFMetrics

Linear Supertypes
AnyRef, Any

Type Members

  1. case class CoMomentDFMetricCalculator(metricId: String, columns: Seq[String]) extends DFMetricCalculator with Product with Serializable

    Calculates co-moment between values of two columns

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    Note

    Differs from the RDD calculator in how it handles non-numeric values: the RDD calculator yields NaN if at least one value cannot be cast to Double, whereas the DF calculator skips rows in which any of the values cannot be cast to Double.
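
The difference described in the note can be illustrated with a minimal plain-Scala sketch. It does not use the library's API: the object `CoMomentSketch`, its helpers, and the choice to return NaN for empty input are all assumptions made for illustration, and the co-moment is taken here to be the sum of products of deviations from the column means.

```scala
import scala.util.Try

object CoMomentSketch {
  // Hypothetical helper: None when a raw value cannot be cast to Double.
  def toDouble(raw: String): Option[Double] = Try(raw.trim.toDouble).toOption

  // Co-moment of paired values: sum of (x - meanX) * (y - meanY).
  // Returning NaN for empty input is an assumption of this sketch.
  def coMoment(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) Double.NaN
    else {
      val n = pairs.size.toDouble
      val meanX = pairs.map(_._1).sum / n
      val meanY = pairs.map(_._2).sum / n
      pairs.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    }

  // Parse one row; None if either value is not numeric.
  private def parseRow(row: (String, String)): Option[(Double, Double)] =
    for (x <- toDouble(row._1); y <- toDouble(row._2)) yield (x, y)

  // DF-style behaviour from the note: rows with a non-numeric value are skipped.
  def skipInvalid(rows: Seq[(String, String)]): Double =
    coMoment(rows.flatMap(row => parseRow(row)))

  // RDD-style behaviour from the note: any non-numeric value yields NaN.
  def nanOnInvalid(rows: Seq[(String, String)]): Double = {
    val parsed = rows.map(parseRow)
    if (parsed.forall(_.isDefined)) coMoment(parsed.flatten) else Double.NaN
  }
}
```

For rows such as `("1","2"), ("2","4"), ("3","6"), ("x","8")`, `skipInvalid` ignores the last row while `nanOnInvalid` gives NaN for the whole column pair.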

  2. case class ColumnEqDFMetricCalculator(metricId: String, columns: Seq[String], reversed: Boolean) extends MultiColumnConditionalDFCalculator with Product with Serializable

    Calculates the number of rows where elements in the given columns are equal

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    reversed

    Boolean flag indicating whether error collection logic should be reversed for this metric

  3. case class CovarianceBesselDFMetricCalculator(metricId: String, columns: Seq[String]) extends DFMetricCalculator with Product with Serializable

    Calculates sample covariance (covariance with Bessel's correction) between values of two columns

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    Note

    Differs from the RDD calculator in how it handles non-numeric values: the RDD calculator yields NaN if at least one value cannot be cast to Double, whereas the DF calculator skips rows in which any of the values cannot be cast to Double.

  4. case class CovarianceDFMetricCalculator(metricId: String, columns: Seq[String]) extends DFMetricCalculator with Product with Serializable

    Calculates population covariance between values of two columns

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    Note

    Differs from the RDD calculator in how it handles non-numeric values: the RDD calculator yields NaN if at least one value cannot be cast to Double, whereas the DF calculator skips rows in which any of the values cannot be cast to Double.
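
The population and sample covariance described in the two entries above differ only in the divisor. The following plain-Scala sketch shows that relationship; `CovarianceSketch` and its method names are hypothetical and are not part of the library, and the inputs are assumed to be already parsed to Double.

```scala
object CovarianceSketch {
  // Sum of products of deviations from the column means (the co-moment).
  def coMoment(xs: Seq[Double], ys: Seq[Double]): Double = {
    require(xs.size == ys.size && xs.nonEmpty, "need equally sized, non-empty columns")
    val meanX = xs.sum / xs.size
    val meanY = ys.sum / ys.size
    xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
  }

  // Population covariance: co-moment divided by n.
  def population(xs: Seq[Double], ys: Seq[Double]): Double =
    coMoment(xs, ys) / xs.size

  // Sample covariance (Bessel's correction): co-moment divided by n - 1.
  def sample(xs: Seq[Double], ys: Seq[Double]): Double = {
    require(xs.size > 1, "sample covariance needs at least two rows")
    coMoment(xs, ys) / (xs.size - 1)
  }
}
```

With few rows the two values differ noticeably; as the row count grows, the effect of Bessel's correction shrinks.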

  5. case class DayDistanceDFMetricCalculator(metricId: String, columns: Seq[String], dateFormat: String, threshold: Int, reversed: Boolean) extends MultiColumnConditionalDFCalculator with Product with Serializable

    Calculates the number of rows for which the day difference between the two input columns is less than the threshold (number of days)

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    dateFormat

    Date format for values in columns

    threshold

    Maximum allowed day distance between dates in columns

    reversed

    Boolean flag indicating whether error collection logic should be reversed for this metric
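
As an illustration of the day-distance condition, here is a minimal sketch built on `java.time` rather than on the library itself. `DayDistanceSketch` is a hypothetical name, and treating rows with unparseable dates as not meeting the condition is an assumption of this sketch, not documented behaviour.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.temporal.ChronoUnit
import scala.util.Try

object DayDistanceSketch {
  // Absolute number of whole days between two date strings,
  // or None if either string does not match the format.
  def dayDistance(a: String, b: String, dateFormat: String): Option[Long] = {
    val fmt = DateTimeFormatter.ofPattern(dateFormat)
    for {
      d1 <- Try(LocalDate.parse(a, fmt)).toOption
      d2 <- Try(LocalDate.parse(b, fmt)).toOption
    } yield math.abs(ChronoUnit.DAYS.between(d1, d2))
  }

  // Count rows whose day distance is strictly less than the threshold.
  // Assumption: rows with unparseable dates do not meet the condition.
  def countWithin(rows: Seq[(String, String)], dateFormat: String, threshold: Int): Int =
    rows.count { case (a, b) => dayDistance(a, b, dateFormat).exists(_ < threshold) }
}
```

For example, `DayDistanceSketch.countWithin(rows, "yyyy-MM-dd", 3)` counts only the rows whose two dates are fewer than three days apart.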

  6. case class LevenshteinDistanceDFMetricCalculator(metricId: String, columns: Seq[String], threshold: Double, normalize: Boolean, reversed: Boolean) extends MultiColumnConditionalDFCalculator with Product with Serializable

    Calculates the number of rows where the Levenshtein distance between two columns is less than the threshold.

    metricId

    Id of the metric.

    columns

    Sequence of columns which are used for metric calculation

    threshold

    Threshold (should be within [0, 1] range for normalized results)

    normalize

    Flag to define whether distance should be normalized over maximum length of two input strings

    reversed

    Boolean flag indicating whether error collection logic should be reversed for this metric
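
The interplay of `threshold` and `normalize` can be sketched in plain Scala as follows. This is not the library's implementation: `LevenshteinSketch` is a hypothetical object, the distance is the textbook dynamic-programming edit distance, and treating two empty strings as having normalized distance 0 is an assumption.

```scala
object LevenshteinSketch {
  // Classic dynamic-programming edit distance using two rolling rows.
  def distance(a: String, b: String): Int = {
    var prev = Array.range(0, b.length + 1)
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(b.length)
  }

  // Distance divided by the length of the longer string, giving a value in [0, 1].
  // Assumption: two empty strings have normalized distance 0.
  def normalized(a: String, b: String): Double = {
    val maxLen = math.max(a.length, b.length)
    if (maxLen == 0) 0.0 else distance(a, b).toDouble / maxLen
  }

  // The row condition: (optionally normalized) distance strictly below the threshold.
  def meetsCondition(a: String, b: String, threshold: Double, normalize: Boolean): Boolean = {
    val d = if (normalize) normalized(a, b) else distance(a, b).toDouble
    d < threshold
  }
}
```

Normalizing makes one threshold comparable across short and long strings, which is why the threshold is expected to lie within [0, 1] in that mode.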

  7. abstract class MultiColumnConditionalDFCalculator extends DFMetricCalculator with ReversibleDFCalculator

    Abstract class for conditional multi-column DF metric calculators. The distinguishing feature of multi-column conditional metrics is that the metric condition is a function of all input columns and needs to be applied to all of them at once. This is unlike conditional metrics that can work with an arbitrary number of columns, where the metric condition is applied to each input column separately.

    All conditional metrics are reversible: direct error collection logic implies metric increment fails when condition is not met. Correspondingly, for reversed error collection logic, metric increment fails when condition IS met.
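
The direct and reversed error collection logic can be sketched as below. This is an illustrative model only: `ReversedLogicSketch`, `Outcome`, and `evaluate` are hypothetical names, and the assumption that the metric value itself does not change with the `reversed` flag is not confirmed by the text above.

```scala
object ReversedLogicSketch {
  // metricValue: number of rows where the multi-column condition holds.
  // failedRows: rows reported as failures under the chosen logic.
  final case class Outcome(metricValue: Int, failedRows: Seq[Seq[String]])

  // The condition receives all column values of a row at once.
  def evaluate(rows: Seq[Seq[String]], reversed: Boolean)(
      condition: Seq[String] => Boolean): Outcome = {
    val (met, notMet) = rows.partition(condition)
    // Direct logic: failure when the condition is not met.
    // Reversed logic: failure when the condition IS met.
    Outcome(met.size, if (reversed) met else notMet)
  }
}
```

With a column-equality condition such as `_.distinct.size == 1`, direct logic reports the rows whose values differ, while reversed logic reports the rows whose values are equal.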

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  16. def toString(): String
    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
