org.checkita.dqf.core.metrics.df

DFMetricCalculator

abstract class DFMetricCalculator extends AnyRef

Basic DF metric calculator

Note

DF calculators are intended to work with batch applications only. Their functionality may be revised in the future to support streaming applications as well.

Linear Supertypes
AnyRef, Any
Known Subclasses
ConditionalDFCalculator, GroupingDFMetricCalculator, ApproximateDistinctValuesDFMetricCalculator, ApproximateSequenceCompletenessDFMetricCalculator, TopNDFMetricCalculator, AvgNumberDFMetricCalculator, CastedNumberDFMetricCalculator, FirstQuantileDFMetricCalculator, FormattedNumberDFMetricCalculator, GetPercentileDFMetricCalculator, GetQuantileDFMetricCalculator, MaxNumberDFMetricCalculator, MedianValueDFMetricCalculator, MinNumberDFMetricCalculator, NumberBetweenDFMetricCalculator, NumberGreaterThanDFMetricCalculator, NumberInDomainDFMetricCalculator, NumberLessThanDFMetricCalculator, NumberNotBetweenDFMetricCalculator, NumberOutDomainDFMetricCalculator, NumberValuesDFMetricCalculator, PercentileDFCalculator, StdNumberDFMetricCalculator, SumNumberDFMetricCalculator, ThirdQuantileDFMetricCalculator, AvgStringDFMetricCalculator, CompletenessDFMetricCalculator, EmptinessDFMetricCalculator, EmptyValuesDFMetricCalculator, FormattedDateDFMetricCalculator, MaxStringDFMetricCalculator, MinStringDFMetricCalculator, NullValuesDFMetricCalculator, RegexMatchDFMetricCalculator, RegexMismatchDFMetricCalculator, StringInDomainDFMetricCalculator, StringLengthDFMetricCalculator, StringOutDomainDFMetricCalculator, StringValuesDFMetricCalculator, RowCountDFMetricCalculator, DistinctValuesDFMetricCalculator, DuplicateValuesDFMetricCalculator, SequenceCompletenessDFMetricCalculator, CoMomentDFMetricCalculator, ColumnEqDFMetricCalculator, CovarianceBesselDFMetricCalculator, CovarianceDFMetricCalculator, DayDistanceDFMetricCalculator, LevenshteinDistanceDFMetricCalculator, MultiColumnConditionalDFCalculator

Instance Constructors

  1. new DFMetricCalculator()

Abstract Value Members

  1. abstract val columns: Seq[String]
  2. abstract val emptyValue: Column

    Value which is returned when metric result is null.

    Attributes
    protected
  3. abstract def errorConditionExpr(implicit colTypes: Map[String, DataType]): Column

    Spark expression yielding boolean result for processed row. Indicates whether metric increment failed or not. Usually checks the outcome of resultExpr.

    colTypes

    Map of column names to their datatype.

    returns

    Spark row-level expression yielding boolean result.

    Attributes
    protected
  4. abstract def errorMessage: String

    Error message that will be returned when metric increment fails.

    returns

    Metric increment failure message.

  5. abstract val metricId: String

    Unlike RDD calculators, DF calculators are not grouped by type. A separate DF calculator instance is created for each metric defined in a DQ job. Thus, DF metric calculators can be linked to metric definitions by metricId.

  6. abstract val metricName: MetricName
  7. abstract val resultAggregateFunction: (Column) ⇒ Column

    Function that aggregates metric increments into final metric value. Accepts spark expression resultExpr as input and returns another spark expression that will yield aggregated double metric result.

    Attributes
    protected
  8. abstract def resultExpr(implicit colTypes: Map[String, DataType]): Column

    Spark expression yielding numeric result for processed row. Metric will be incremented with this result using associated aggregation function.

    colTypes

    Map of column names to their datatype.

    returns

    Spark row-level expression yielding numeric result.

    Attributes
    protected
    Note

    The Spark expression MUST process a single row and must not aggregate over multiple rows.
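
The split between the row-level members (resultExpr, errorConditionExpr) and the aggregating member (resultAggregateFunction) can be illustrated with plain Spark. The following is a minimal sketch and not Checkita code: the object NullCountSketch, its run helper, the sample data, and the choice of a null-counting metric are all assumptions made for illustration, and the implicit colTypes parameter is omitted for brevity.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions._

// Hypothetical stand-in that mirrors the abstract members; not Checkita code.
object NullCountSketch {
  val columns: Seq[String] = Seq("name", "city")

  // Fallback used when the aggregate yields null (for example, on empty input).
  val emptyValue: Column = lit(0.0)

  // Row-level increment: how many of the metric columns are null in this row.
  val resultExpr: Column =
    columns.map(c => when(col(c).isNull, lit(1)).otherwise(lit(0))).reduce(_ + _)

  // Row-level failure flag, derived from the outcome of resultExpr.
  val errorConditionExpr: Column = resultExpr > 0

  // Aggregates the row-level increments into a single double value.
  val resultAggregateFunction: Column => Column =
    rowValue => sum(rowValue).cast("double")

  // Returns (metric value, number of rows flagged as errors) for a tiny dataset.
  def run(spark: SparkSession): (Double, Long) = {
    import spark.implicits._
    val rows: Seq[(Option[String], Option[String])] =
      Seq((Some("a"), None), (None, None), (Some("c"), Some("x")))
    val df = rows.toDF("name", "city")
    val metric = coalesce(resultAggregateFunction(resultExpr), emptyValue)
    (df.agg(metric).head.getDouble(0), df.filter(errorConditionExpr).count())
  }
}
```

Because resultExpr never aggregates, the same column expression can feed both the aggregate function and the row-level error check.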

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def errorExpr(rowData: Column)(implicit colTypes: Map[String, DataType]): Column

    Error collection expression: collects row data in case of metric error.

    rowData

    Array of row data from columns related to this metric calculator (source keyFields + metric columns + window start time column for streaming applications)

    colTypes

    Map of column names to their datatype.

    returns

    Spark expression that will yield row data in case of metric error.

    Attributes
    protected
  9. def errors(implicit errorDumpSize: Int, keyFields: Seq[String], colTypes: Map[String, DataType]): Column

    Final metric errors aggregation expression. Collects all metric errors into an array column. The size of array is limited by maximum allowed error dump size parameter.

    errorDumpSize

    Maximum allowed number of errors to be collected per single metric.

    keyFields

    Sequence of source/stream key fields.

    colTypes

    Map of column names to their datatype.

    returns

    Spark expression that will yield array of metric errors.

  10. val errorsCol: String

    Name of the column that will store metric errors

  11. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  14. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  15. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  18. def result(implicit colTypes: Map[String, DataType]): Column

    Final metric aggregation expression that MUST yield double value.

    colTypes

    Map of column names to their datatype.

    returns

    Spark expression that will yield double metric calculator result

  19. val resultCol: String

    Name of the column that will store metric result

  20. def rowDataExpr(keyFields: Seq[String]): Column

    Row data collection expression: collects values of selected columns to array for row where metric error occurred.

    keyFields

    Sequence of source/stream key fields.

    returns

    Spark expression that will yield array of row data for columns related to this metric calculator.

    Attributes
    protected
  21. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  22. def toString(): String
    Definition Classes
    AnyRef → Any
  23. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
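
One plausible way the concrete members rowDataExpr, errorExpr, errors and result fit together is sketched below in plain Spark. This is an assumption about the general pattern, not the library's actual implementation: the object ErrorCollectionSketch, its simplified signatures (explicit parameters instead of implicits), the sample data, and the use of collect_list with slice to cap the error array are all hypothetical.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions._

// Hypothetical sketch of the error-collection pattern; not Checkita code.
object ErrorCollectionSketch {
  val metricColumns: Seq[String] = Seq("amount")

  // Row-level increment: 1 for a non-negative amount, 0 otherwise.
  val resultExpr: Column = when(col("amount") >= 0, lit(1)).otherwise(lit(0))

  // Row-level failure flag: the row contributed nothing to the metric.
  val errorConditionExpr: Column = resultExpr === 0

  // Row data: key fields plus metric columns, cast to string, packed into an array.
  def rowDataExpr(keyFields: Seq[String]): Column =
    array((keyFields ++ metricColumns).map(c => col(c).cast("string")): _*)

  // Row-level: row data when the error condition holds, null otherwise.
  def errorExpr(rowData: Column): Column = when(errorConditionExpr, rowData)

  // Aggregate: collect_list skips nulls; slice caps the array at errorDumpSize.
  def errors(errorDumpSize: Int, keyFields: Seq[String]): Column =
    slice(collect_list(errorExpr(rowDataExpr(keyFields))), 1, errorDumpSize)

  // Aggregate: final metric value as a double, with a fallback for null results.
  val result: Column = coalesce(sum(resultExpr).cast("double"), lit(0.0))

  // Returns (metric value, number of collected errors) for a tiny dataset.
  def run(spark: SparkSession, errorDumpSize: Int): (Double, Int) = {
    import spark.implicits._
    val df = Seq((1, 10.0), (2, -5.0), (3, -1.0), (4, 7.5)).toDF("id", "amount")
    val row = df.agg(result.as("res"), errors(errorDumpSize, Seq("id")).as("errs")).head
    (row.getDouble(0), row.getSeq[Seq[String]](1).size)
  }
}
```

Both aggregates are evaluated in a single pass over the data. Note that this sketch gathers every error row before truncating, so a production implementation may prefer an aggregation that bounds memory during collection.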
