case class EmptinessDFMetricCalculator(metricId: String, columns: Seq[String], includeEmptyStrings: Boolean, reversed: Boolean) extends ConditionalDFCalculator with Product with Serializable
Calculates the emptiness of values in the specified columns, i.e. the percentage of values that are null or, if configured to account for them, empty strings.
- metricId
Id of the metric.
- columns
Sequence of columns which are used for metric calculation
- includeEmptyStrings
Flag which sets whether empty strings are considered in addition to null values.
- reversed
Boolean flag indicating whether error collection logic should be reversed for this metric
Linear Supertypes
- Serializable
- Serializable
- Product
- Equals
- ConditionalDFCalculator
- ReversibleDFCalculator
- DFMetricCalculator
- AnyRef
- Any
Instance Constructors
-
new
EmptinessDFMetricCalculator(metricId: String, columns: Seq[String], includeEmptyStrings: Boolean, reversed: Boolean)
- metricId
Id of the metric.
- columns
Sequence of columns which are used for metric calculation
- includeEmptyStrings
Flag which sets whether empty strings are considered in addition to null values.
- reversed
Boolean flag indicating whether error collection logic should be reversed for this metric
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
val
columns: Seq[String]
- Definition Classes
- EmptinessDFMetricCalculator → DFMetricCalculator
-
val
emptyValue: Column
All conditional metrics should return zero when DF is empty.
- Attributes
- protected
- Definition Classes
- ConditionalDFCalculator → DFMetricCalculator
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
errorConditionExpr(implicit colTypes: Map[String, DataType]): Column
With direct error collection logic, a metric increment is considered failed when the condition is not met for one or more of the metric columns. With reversed error collection logic, a metric increment is considered failed when the condition IS met for one or more of the metric columns.
- colTypes
Map of column names to their datatype.
- returns
Spark row-level expression yielding boolean result.
- Attributes
- protected
- Definition Classes
- ConditionalDFCalculator → DFMetricCalculator
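The direct and reversed failure rules can be modeled in plain Scala. This is a minimal sketch, not the library's implementation (which returns a Spark Column): the object name, the function name, and the representation of a row as one boolean per metric column are all assumptions made for illustration.

```scala
object ErrorConditionSketch {
  // conditionMet holds one boolean per metric column for a single row:
  // true when the metric condition is satisfied for that column.
  // Hypothetical helper; the real method builds a Spark row-level expression.
  def isRowError(conditionMet: Seq[Boolean], reversed: Boolean): Boolean =
    if (reversed) conditionMet.exists(identity) // reversed: fails if the condition IS met for any column
    else conditionMet.exists(met => !met)       // direct: fails if the condition is NOT met for any column
}
```

For example, a row whose columns yield `Seq(true, false)` is an error under both modes, while `Seq(true, true)` is an error only when `reversed` is true.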
-
def
errorExpr(rowData: Column)(implicit colTypes: Map[String, DataType]): Column
Error collection expression: collects row data in case of metric error.
- rowData
Array of row data from columns related to this metric calculator (source keyFields + metric columns + window start time column for streaming applications)
- colTypes
Map of column names to their datatype.
- returns
Spark expression that will yield row data in case of metric error.
- Attributes
- protected
- Definition Classes
- DFMetricCalculator
-
def
errorMessage: String
With direct error collection logic, any non-null (or non-empty, if includeEmptyStrings is true) values are considered a metric failure. With reversed error collection logic, null (or empty, if includeEmptyStrings is true) values are considered a metric failure.
- returns
Metric increment failure message.
- Definition Classes
- EmptinessDFMetricCalculator → DFMetricCalculator
-
def
errors(implicit errorDumpSize: Int, keyFields: Seq[String], colTypes: Map[String, DataType]): Column
Final metric errors aggregation expression. Collects all metric errors into an array column. The size of the array is limited by the maximum allowed error dump size parameter.
- errorDumpSize
Maximum allowed number of errors to be collected per single metric.
- keyFields
Sequence of source/stream key fields.
- colTypes
Map of column names to their datatype.
- returns
Spark expression that will yield array of metric errors.
- Definition Classes
- DFMetricCalculator
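The size limit described above can be pictured with an in-memory collection. This is a rough sketch only: the object and function names are hypothetical, and it assumes the first errors encountered are the ones kept, which the documentation does not state.

```scala
object ErrorDumpSketch {
  // Keeps at most errorDumpSize collected errors.
  // Assumption: the first ones encountered are retained; the source does not say which.
  def limitErrors[A](errors: Seq[A], errorDumpSize: Int): Seq[A] =
    errors.take(errorDumpSize)
}
```

With `errorDumpSize = 2`, three collected errors would be cut down to two, and a shorter list would pass through unchanged.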
-
val
errorsCol: String
Name of the column that will store metric errors
- Definition Classes
- DFMetricCalculator
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
includeEmptyStrings: Boolean
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
metricCondExpr(colName: String)(implicit colTypes: Map[String, DataType]): Column
Creates a Spark expression that applies the metric condition to the provided column and yields a boolean result.
- colName
Column to which the metric condition is applied
- colTypes
Map of column names to their datatype.
- Attributes
- protected
- Definition Classes
- EmptinessDFMetricCalculator → ConditionalDFCalculator
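For the emptiness metric, the condition checks whether a value is null or, optionally, an empty string. The sketch below models that check in plain Scala under stated assumptions: a cell is represented as `Option[String]` with `None` standing in for null, and the object and function names are hypothetical. The actual method returns a Spark Column rather than a Boolean.

```scala
object MetricConditionSketch {
  // Hypothetical model of the emptiness condition for a single cell.
  // None stands in for a SQL null value.
  def isEmptyCell(cell: Option[String], includeEmptyStrings: Boolean): Boolean =
    cell match {
      case None        => true                                 // null always counts as empty
      case Some(value) => includeEmptyStrings && value.isEmpty // "" counts only when configured
    }
}
```

Under this model an empty string satisfies the condition only when `includeEmptyStrings` is true, whereas a null satisfies it in either configuration.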
-
val
metricId: String
Unlike RDD calculators, DF calculators are not grouped by type. A separate DF calculator instance is created for each metric defined in a DQ job. Thus, DF metric calculators can be linked to metric definitions by metricId.
- Definition Classes
- EmptinessDFMetricCalculator → DFMetricCalculator
-
val
metricName: MetricName
- Definition Classes
- EmptinessDFMetricCalculator → DFMetricCalculator
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
result(implicit colTypes: Map[String, DataType]): Column
Final metric aggregation expression that MUST yield double value.
- colTypes
Map of column names to their datatype.
- returns
Spark expression that will yield double metric calculator result
- Definition Classes
- DFMetricCalculator
-
val
resultAggregateFunction: (Column) ⇒ Column
Emptiness metric is aggregated as ratio of total number of null (or empty) cells to total number of cells that were processed.
- Attributes
- protected
- Definition Classes
- EmptinessDFMetricCalculator → ConditionalDFCalculator → DFMetricCalculator
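The ratio described above amounts to a single division over two counts. The following is a plain-Scala sketch rather than the Spark aggregate the library uses; the names are hypothetical, and the zero result for an empty input is an assumption carried over from the note that conditional metrics return zero when the DataFrame is empty.

```scala
object EmptinessRatioSketch {
  // emptyCells: number of null (or empty) cells found;
  // totalCells: number of cells processed (rows times metric columns).
  def emptinessRatio(emptyCells: Long, totalCells: Long): Double =
    if (totalCells == 0L) 0.0             // assumption: empty input yields zero
    else emptyCells.toDouble / totalCells // Double division, so the result is fractional
}
```

Two empty cells out of eight processed cells would give `emptinessRatio(2L, 8L)`, a quarter of the cells.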
-
val
resultCol: String
Name of the column that will store metric result
- Definition Classes
- DFMetricCalculator
-
def
resultExpr(implicit colTypes: Map[String, DataType]): Column
Spark expression yielding a numeric result for the processed row. For conditional metrics, the increment is 1 when the condition is met; otherwise the increment is 0 (the metric is not incremented).
- colTypes
Map of column names to their datatype.
- returns
Spark row-level expression yielding numeric result.
- Attributes
- protected
- Definition Classes
- ConditionalDFCalculator → DFMetricCalculator
-
val
reversed: Boolean
- Attributes
- protected
- Definition Classes
- EmptinessDFMetricCalculator → ReversibleDFCalculator
-
def
rowDataExpr(keyFields: Seq[String]): Column
Row data collection expression: collects the values of the selected columns into an array for a row where a metric error occurred.
- keyFields
Sequence of source/stream key fields.
- returns
Spark expression that will yield array of row data for column related to this metric calculator.
- Attributes
- protected
- Definition Classes
- DFMetricCalculator
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()