trait DQJob extends Logging
Base trait defining basic functionality of Data Quality Job
- Alphabetic
- By Inheritance
- DQJob
- Logging
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
-
trait
RegularMetricsProcessor extends AnyRef
Metrics processed differently for batch and streaming job.
Metrics processed differently for batch and streaming job. Therefore, to generalize metric processing API, we introduce this trait that common method to process all regular metrics.
- Attributes
- protected
Abstract Value Members
- abstract val checks: Seq[CheckConfig]
- abstract val composedMetrics: Seq[ComposedMetricConfig]
- abstract val connections: Map[String, DQConnection]
- implicit abstract val fs: FileSystem
- abstract val jobConfig: JobConfig
- implicit abstract val jobId: String
- abstract val loadChecks: Seq[LoadCheckConfig]
- abstract val metrics: Seq[RegularMetricConfig]
- abstract val schemas: Map[String, SourceSchema]
- abstract val sources: Seq[Source]
- implicit abstract val spark: SparkSession
- abstract val storageManager: Option[DqStorageManager]
- abstract val targets: Seq[TargetConfig]
- abstract val trendMetrics: Seq[TrendMetricConfig]
Concrete Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
calculateComposedMetrics(stage: String, regularMetricResults: Result[MetricResults]): Result[MetricResults]
Calculates all composed metrics provided with regular metric results.
Calculates all composed metrics provided with regular metric results.
- stage
Stage indication used for logging.
- regularMetricResults
Map of regular metric results
- returns
Either a map of composed metric results or a list of calculation errors.
- Attributes
- protected
-
def
calculateTrendMetrics(stage: String)(implicit jobId: String, manager: Option[DqStorageManager], settings: AppSettings): Result[MetricResults]
Calculates all trend metrics provided with regular and composed metric results.
Calculates all trend metrics provided with regular and composed metric results.
- stage
Stage indication used for logging.
- jobId
Implicit current job ID.
- manager
Implicit storage manager used to load historical results.
- settings
Implicit application settings object.
- returns
Either a map of trend metric results or a list of calculation errors.
- Attributes
- protected
-
val
checksStage: String
- Attributes
- protected
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
combineResults(stage: String, loadCheckResults: Seq[ResultCheckLoad], checkResults: Result[Seq[ResultCheck]], jobState: Result[JobState], regularMetricResults: Result[Seq[ResultMetricRegular]], composedMetricResults: Result[Seq[ResultMetricComposed]], trendMetricResults: Result[Seq[ResultMetricTrend]], metricErrors: Result[Seq[ResultMetricError]])(implicit settings: AppSettings): Result[ResultSet]
Combines all results into a final result set.
Combines all results into a final result set.
- stage
Stage indication used for logging.
- loadCheckResults
Sequence of load check results
- checkResults
Sequence of check results (wrapped into Either)
- regularMetricResults
Sequence of regular metric results (wrapped into Either)
- composedMetricResults
Sequence of composed metric results (wrapped into Either)
- trendMetricResults
Sequence of trend metrics results (wrapped into Either)
- metricErrors
Sequence of metric errors (wrapped into Either)
- settings
Implicit application settings object
- returns
Combined results in form of ResultSet
- Attributes
- protected
-
val
composedMetricsMap: Map[String, ComposedMetricConfig]
- Attributes
- protected
-
def
encryptMetricErrors(errors: Seq[ResultMetricError])(implicit settings: AppSettings): Result[Seq[ResultMetricError]]
Encrypts rowData field in metric errors if requested per application configuration.
Encrypts rowData field in metric errors if requested per application configuration. Row data field in metric errors contains excerpt from data source and, therefore, these data can contain some sensitive information. In order to protect it, users can configure encryption chapter in application configuration and store encrypted rowData in DQ storage.
- errors
Sequence of metric errors to encrypt
- settings
Implicit application settings object
- returns
Sequence of metric errors with encrypted rowData field (if requested per configuration) or a list of encryption errors.
- Attributes
- protected
- Note
When metric errors are send via targets rowData field is never encrypted.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
finalizeComposedMetrics(stage: String, metricResults: Result[MetricResults])(implicit settings: AppSettings): Result[Seq[ResultMetricComposed]]
Finalizes composed metric results: selects only composed metrics results and converts them to final composed metric results representation ready for writing into storage DB or sending via targets.
Finalizes composed metric results: selects only composed metrics results and converts them to final composed metric results representation ready for writing into storage DB or sending via targets.
- stage
Stage indication used for logging.
- metricResults
Map with all metric results
- settings
Implicit application settings object
- returns
Either a finalized sequence of composed metric results or a list of conversion errors.
- Attributes
- protected
-
def
finalizeJobState(jobConfig: JobConfig)(implicit settings: AppSettings, jobId: String): Result[JobState]
Finalizes job state: select whether to encrypt or not and converts to the final representation ready for writing into storage DB or sending via targets.
Finalizes job state: select whether to encrypt or not and converts to the final representation ready for writing into storage DB or sending via targets.
- jobConfig
Parsed Data Quality job configuration
- settings
Implicit application settings object
- jobId
Current Job ID
- returns
Either a finalized of job state.
- Attributes
- protected
-
def
finalizeMetricErrors(stage: String, metricResults: Result[MetricResults])(implicit settings: AppSettings): Result[Seq[ResultMetricError]]
Finalizes metric errors: retrieves metrics errors from results and converts them to final metric errors representation ready for writing into storage DB or sending via targets.
Finalizes metric errors: retrieves metrics errors from results and converts them to final metric errors representation ready for writing into storage DB or sending via targets.
- stage
Stage indication used for logging.
- metricResults
Map with all metric results
- settings
Implicit application settings object
- returns
Either a finalized sequence of metric errors or a list of conversion errors.
- Attributes
- protected
- Note
There could be a situations when metric errors hold the same error data. We are not interested in sending repeating data neither to storage database nor to Targets. Therefore, sequence of metric errors is deduplicated by unique constraint.
-
def
finalizeRegularMetrics(stage: String, metricResults: Result[MetricResults])(implicit settings: AppSettings): Result[Seq[ResultMetricRegular]]
Finalizes regular metric results: selects only regular metrics results and converts them to final regular metric results representation ready for writing into storage DB or sending via targets.
Finalizes regular metric results: selects only regular metrics results and converts them to final regular metric results representation ready for writing into storage DB or sending via targets.
- stage
Stage indication used for logging.
- metricResults
Map with all metric results
- settings
Implicit application settings object
- returns
Either a finalized sequence of regular metric results or a list of conversion errors.
- Attributes
- protected
-
def
finalizeTrendMetrics(stage: String, metricResults: Result[MetricResults])(implicit settings: AppSettings): Result[Seq[ResultMetricTrend]]
Finalizes trend metric results: selects only trend metrics results and converts them to final trend metric results representation ready for writing into storage DB or sending via targets.
Finalizes trend metric results: selects only trend metrics results and converts them to final trend metric results representation ready for writing into storage DB or sending via targets.
- stage
Stage indication used for logging.
- metricResults
Map with all metric results
- settings
Implicit application settings object
- returns
Either a finalized sequence of trend metric results or a list of conversion errors.
- Attributes
- protected
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initLogger(lvl: Level): Unit
Initialises logger:
Initialises logger:
- gets log4j properties with following priority: resources directory -> working directory -> default settings
- updates root logger level with verbosity level defined at application start
- reconfigures Logger
- lvl
Root logger level defined at application start
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
loadCheckStage: String
- Attributes
- protected
-
val
loadChecksBySources: Map[String, Seq[LoadCheckConfig]]
- Attributes
- protected
-
lazy val
log: Logger
- Definition Classes
- Logging
- Annotations
- @transient()
-
def
logMetricResults(stage: String, metType: String, mr: MetricResults): Unit
Logs metric calculation results.
Logs metric calculation results. Used during metric processing, to immediately log their calculation status.
- stage
Stage indication used for logging.
- metType
Type of the metrics being calculated (either regular or composed)
- mr
Map with metric results
- Attributes
- protected
- implicit val manager: Option[DqStorageManager]
-
val
metricStage: String
- Attributes
- protected
-
val
metricsBySources: Map[String, Seq[RegularMetricConfig]]
- Attributes
- protected
-
val
metricsMap: Map[String, RegularMetricConfig]
- Attributes
- protected
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
performChecks(stage: String, metricResults: Result[MetricResults])(implicit settings: AppSettings): Result[Seq[ResultCheck]]
Performs all checks
Performs all checks
- stage
Stage indication used for logging.
- metricResults
Map with metric results (both regular and composed)
- settings
Implicit application settings object
- returns
Either sequence of check results or a list of check evaluation errors.
- Attributes
- protected
-
def
performLoadChecks(stage: String)(implicit settings: AppSettings): Seq[ResultCheckLoad]
Performs all load checks
Performs all load checks
- stage
Stage indication used for logging.
- settings
Implicit application settings object
- returns
Sequence of load check results.
- Attributes
- protected
-
def
processAll(regularMetricsProcessor: RegularMetricsProcessor, stagePrefix: Option[String] = None)(implicit settings: AppSettings): Result[ResultSet]
Top-level processing function: aggregates and runs all processing stages in required order.
Top-level processing function: aggregates and runs all processing stages in required order.
- regularMetricsProcessor
Regular metric processor used to calculate regular metric results.
- stagePrefix
Prefix to stage names. Used for logging in streaming applications to indicate window for which results are processed.
- settings
Implicit application settings object
- returns
Either final results set or a list of processing errors
- Attributes
- protected
-
def
processTargets(stage: String, resultSet: Result[ResultSet])(implicit settings: AppSettings): Result[Unit]
Processes all targets
Processes all targets
- stage
Stage indication used for logging.
- resultSet
Final results set
- settings
Implicit application settings object
- returns
Either unit or a list of target processing errors.
- Attributes
- protected
-
def
runStorageMigration(stage: String)(implicit settings: AppSettings): Result[String]
Runs database migration provided with storage manager.
Runs database migration provided with storage manager.
- stage
Stage indication used for logging.
- settings
Implicit application settings object
- returns
Nothing in case of successful migration or a list of migration errors.
- Attributes
- protected
-
def
saveResults(stage: String, resultSet: Result[ResultSet])(implicit settings: AppSettings): Result[String]
Saves results into Data Quality storage
Saves results into Data Quality storage
- stage
Stage indication used for logging.
- resultSet
Final results set
- returns
Either a status string or a list of saving errors.
- Attributes
- protected
-
val
storageStage: String
- Attributes
- protected
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
val
targetsStage: String
- Attributes
- protected
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
val
trendMetricsMap: Map[String, TrendMetricConfig]
- Attributes
- protected
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()