final case class DQStreamJob(jobConfig: JobConfig, sources: Seq[Source], metrics: Seq[RegularMetricConfig], composedMetrics: Seq[ComposedMetricConfig] = Seq.empty, trendMetrics: Seq[TrendMetricConfig] = Seq.empty, loadChecks: Seq[LoadCheckConfig] = Seq.empty, checks: Seq[CheckConfig] = Seq.empty, targets: Seq[TargetConfig] = Seq.empty, schemas: Map[String, SourceSchema] = Map.empty, connections: Map[String, DQConnection] = Map.empty, storageManager: Option[DqStorageManager] = None, bufferCheckpoint: Option[ProcessorBuffer] = None)(implicit jobId: String, settings: AppSettings, spark: SparkSession, fs: FileSystem) extends Logging with Product with Serializable
Data Quality Streaming Job: provides all functionality required to calculate quality metrics, perform checks, save results, and send targets for streaming data sources. A usage sketch follows the parameter list below.
- sources: Sequence of sources to process
- metrics: Sequence of metrics to calculate
- composedMetrics: Sequence of composed metrics to calculate
- trendMetrics: Sequence of trend metrics to calculate
- loadChecks: Sequence of load checks to perform
- checks: Sequence of checks to perform
- targets: Sequence of targets to send
- schemas: Map of user-defined schemas (used for load checks evaluation)
- connections: Map of connections to external systems (used to send targets)
- storageManager: Data Quality Storage manager (used to save results)
- bufferCheckpoint: Buffer state from the last checkpoint for this job
- jobId: Implicit job ID
- settings: Implicit application settings object
- spark: Implicit Spark session object
- fs: Implicit Hadoop file system object
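A minimal construction-and-run sketch, assuming the required Checkita imports are in scope; jobConfig, streamingSources, regularMetrics and appSettings are hypothetical values built elsewhere (e.g. parsed from the job configuration):

  import org.apache.spark.sql.SparkSession
  import org.apache.hadoop.fs.FileSystem

  // Implicit context required by the constructor.
  implicit val spark: SparkSession = SparkSession.builder()
    .appName("dq-stream-job")
    .getOrCreate()
  implicit val fs: FileSystem =
    FileSystem.get(spark.sparkContext.hadoopConfiguration)
  implicit val jobId: String = "orders_stream_dq"  // hypothetical job ID
  implicit val settings: AppSettings = appSettings // assumed to be built elsewhere

  // Only the mandatory arguments are passed: composedMetrics, trendMetrics,
  // loadChecks, checks, targets, schemas and connections keep their empty
  // defaults, and storageManager/bufferCheckpoint stay None.
  val job = DQStreamJob(
    jobConfig = jobConfig,        // hypothetical JobConfig
    sources   = streamingSources, // hypothetical Seq[Source]
    metrics   = regularMetrics    // hypothetical Seq[RegularMetricConfig]
  )

  // run() yields Result[Unit]: success, or an accumulated list of errors.
  val outcome = job.run()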
Linear Supertypes
- Serializable (scala and java.io), Product, Equals, Logging, AnyRef, Any
Instance Constructors
- new DQStreamJob(jobConfig: JobConfig, sources: Seq[Source], metrics: Seq[RegularMetricConfig], composedMetrics: Seq[ComposedMetricConfig] = Seq.empty, trendMetrics: Seq[TrendMetricConfig] = Seq.empty, loadChecks: Seq[LoadCheckConfig] = Seq.empty, checks: Seq[CheckConfig] = Seq.empty, targets: Seq[TargetConfig] = Seq.empty, schemas: Map[String, SourceSchema] = Map.empty, connections: Map[String, DQConnection] = Map.empty, storageManager: Option[DqStorageManager] = None, bufferCheckpoint: Option[ProcessorBuffer] = None)(implicit jobId: String, settings: AppSettings, spark: SparkSession, fs: FileSystem)
  Parameters are as documented for the class above.
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- val bufferCheckpoint: Option[ProcessorBuffer]
- implicit val caseSensitive: Boolean
- val checks: Seq[CheckConfig]
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- val composedMetrics: Seq[ComposedMetricConfig]
- val connections: Map[String, DQConnection]
- implicit val dumpSize: Int
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def initLogger(lvl: Level): Unit
  Initialises the logger:
  - gets log4j properties with the following priority: resources directory -> working directory -> default settings
  - updates the root logger level with the verbosity level defined at application start
  - reconfigures the logger
  - lvl: Root logger level defined at application start
  - Definition Classes: Logging
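A minimal call sketch, assuming Level here is org.apache.log4j.Level (consistent with the log4j properties lookup described above) and that job is an existing DQStreamJob instance:

  import org.apache.log4j.Level

  // Reconfigure the root logger before the job emits its first messages;
  // DEBUG stands in for the verbosity level chosen at application start.
  job.initLogger(Level.DEBUG)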
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- val jobConfig: JobConfig
- val loadChecks: Seq[LoadCheckConfig]
- lazy val log: Logger
  - Definition Classes: Logging
  - Annotations: @transient()
- def logPreMsg(): Unit
  - Attributes: protected
- val metrics: Seq[RegularMetricConfig]
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- def run(): Result[Unit]
  Runs the Data Quality job (see the call sketch below). The job includes the following stages, some of which may be omitted depending on the configuration:
  - processing load checks
  - calculating regular metrics
  - calculating composed metrics
  - processing checks
  - saving results
  - sending/saving targets
  - returns: Either a set of job results or a list of errors that occurred during the job run.
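A sketch of driving the job. The concrete combinators available on Result[Unit] are not shown on this page, so only the call and its intended interpretation appear here:

  // Executes the configured stages in the order listed above.
  val outcome: Result[Unit] = job.run()
  // On success, metric results have been saved via the storage manager (if
  // configured) and targets have been sent; on failure, outcome carries the
  // list of errors accumulated during the run.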
- val schemas: Map[String, SourceSchema]
- val sources: Seq[Source]
- val storageManager: Option[DqStorageManager]
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- val targets: Seq[TargetConfig]
- val trendMetrics: Seq[TrendMetricConfig]
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()