package appconf
Type Members
-
final
case class
AppConfig(applicationName: Option[NonEmptyString], storage: Option[StorageConfig], email: Option[EmailConfig], mattermost: Option[MattermostConfig], encryption: Option[Encryption], streaming: StreamConfig = StreamConfig(), dateTimeOptions: DateTimeConfig = DateTimeConfig(), enablers: Enablers = Enablers(), defaultSparkOptions: Seq[SparkParam] = Seq.empty) extends Product with Serializable
Application-level configuration
- applicationName
Name of the Checkita Data Quality Spark application
- storage
Defines parameters for connection to history storage.
- email
Defines parameters to send email notifications
- mattermost
Defines parameters to send Mattermost notifications
- encryption
Defines parameters used to encrypt secrets in the job configuration
- streaming
Defines streaming application settings
- dateTimeOptions
Defines datetime representation settings
- enablers
Configures enablers (switchers) that turn some DQ features on or off
- defaultSparkOptions
List of default Spark configuration parameters
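For illustration, a minimal sketch (not taken from the project documentation) of constructing an AppConfig with every optional section omitted, so that the defaulted fields keep their default values; it assumes the appconf members documented on this page are already in scope.
  val minimalAppConf: AppConfig = AppConfig(
    applicationName = None, // no explicit Spark application name
    storage         = None, // no history storage connection
    email           = None, // email notifications are not configured
    mattermost      = None, // Mattermost notifications are not configured
    encryption      = None  // secrets in job configs are left unencrypted
  )
  // streaming, dateTimeOptions, enablers and defaultSparkOptions fall back to
  // StreamConfig(), DateTimeConfig(), Enablers() and Seq.empty respectively.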
-
case class
DateTimeConfig(timeZone: ZoneId = ZoneId.of("UTC"), referenceDateFormat: DateFormat = ..., executionDateFormat: DateFormat = ...) extends Product with Serializable
Application-level configuration describing datetime settings
- timeZone
Timezone used to render date and time
- referenceDateFormat
Date format used to represent reference date
- executionDateFormat
Date format used to represent execution date
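A small sketch of overriding only the time zone while keeping the defaulted date formats; ZoneId is java.time.ZoneId and the zone name is an arbitrary example.
  import java.time.ZoneId
  // Override only the rendering time zone; referenceDateFormat and
  // executionDateFormat keep their defaults.
  val dtConf: DateTimeConfig = DateTimeConfig(timeZone = ZoneId.of("Europe/Berlin"))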
-
final
case class
EmailConfig(host: URI, port: Port, address: Email, name: NonEmptyString, sslOnConnect: Boolean = false, tlsEnabled: Boolean = false, username: Option[NonEmptyString], password: Option[NonEmptyString]) extends Product with Serializable
Application-level configuration describing connection to SMTP server
- host
SMTP host
- port
SMTP port
- address
Email address to send notifications from
- name
Name of the sender
- sslOnConnect
Boolean flag indicating whether to use SSL on connect
- tlsEnabled
Boolean flag indicating whether to enable TLS
- username
Username for connection to SMTP server (if required)
- password
Password for connection to SMTP server (if required)
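As a hedged sketch of how these fields could be consumed, the helper below maps an EmailConfig to standard JavaMail SMTP properties; it only reads the documented fields, and the property keys come from JavaMail rather than from this API.
  import java.util.Properties
  def smtpProperties(cfg: EmailConfig): Properties = {
    val props = new Properties()
    props.setProperty("mail.smtp.host", cfg.host.toString)                  // SMTP host
    props.setProperty("mail.smtp.port", cfg.port.toString)                  // SMTP port
    props.setProperty("mail.smtp.ssl.enable", cfg.sslOnConnect.toString)    // SSL on connect
    props.setProperty("mail.smtp.starttls.enable", cfg.tlsEnabled.toString) // TLS
    cfg.username.foreach(u => props.setProperty("mail.smtp.user", u.toString))
    props
  }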
-
final
case class
Enablers(allowSqlQueries: Boolean = false, allowNotifications: Boolean = false, aggregatedKafkaOutput: Boolean = false, enableCaseSensitivity: Boolean = false, errorDumpSize: PositiveInt = 10000, outputRepartition: PositiveInt = 1, metricEngineAPI: MetricEngineAPI = MetricEngineAPI.RDD, checkFailureTolerance: CheckFailureTolerance = CheckFailureTolerance.None) extends Product with Serializable
Application-level configuration for switchers (enablers)
- allowSqlQueries
Enables arbitrary SQL queries in virtual sources
- allowNotifications
Enables notifications to be sent from DQ application
- aggregatedKafkaOutput
Enables sending aggregated messages for Kafka targets (one message per target type, except checkAlerts, for which one message per checkAlert is sent)
- enableCaseSensitivity
Enables case sensitivity for column names
- errorDumpSize
Maximum number of errors to be collected per single metric per partition.
- outputRepartition
Sets the number of partitions when writing outputs. By default, a single file is written.
- metricEngineAPI
Metric processor API used to process metrics: either Spark RDD or Spark DF.
- checkFailureTolerance
Sets the check failure tolerance for the application, i.e. whether the application should return a non-zero exit code when some of the checks have failed.
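Since every field is defaulted, an Enablers instance can be tuned by overriding only the switchers of interest; a minimal sketch:
  // Turn on notifications and arbitrary SQL queries in virtual sources,
  // keeping every other switcher at its default value.
  val enablers: Enablers = Enablers(
    allowSqlQueries    = true,
    allowNotifications = true
  )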
-
final
case class
Encryption(secret: EncryptionKey, keyFields: Seq[String] = Seq("password", "secret"), encryptErrorData: Boolean = false) extends Product with Serializable
Application-level configuration describing encryption of sensitive fields
- secret
Secret string used to encrypt/decrypt sensitive fields
- keyFields
List of key fields used to identify fields that require encryption/decryption.
- encryptErrorData
Boolean flag indicating whether the rowData field in metric errors (which contains excerpts from data sources) should be encrypted.
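A hypothetical helper sketching how keyFields might drive the decision to encrypt a configuration field; the case-insensitive substring match is an assumption for illustration, not the library's actual matching rule.
  // Returns true if the field name matches any of the configured key fields.
  def requiresEncryption(cfg: Encryption, fieldName: String): Boolean =
    cfg.keyFields.exists(kf => fieldName.toLowerCase.contains(kf.toLowerCase))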
-
final
case class
MattermostConfig(host: URL, token: NonEmptyString) extends Product with Serializable
Application-level configuration describing connection to Mattermost API
- host
Mattermost API host
- token
Mattermost API token (using Bot accounts for notifications is preferable)
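A hedged sketch of deriving a request endpoint and authorization header from these two fields; the "/api/v4/posts" path and the Bearer scheme reflect the public Mattermost API and are assumptions as far as this class is concerned.
  // Build the posts endpoint and the bot-token authorization header value.
  def postEndpoint(cfg: MattermostConfig): (String, String) = {
    val url  = s"${cfg.host.toString.stripSuffix("/")}/api/v4/posts"
    val auth = s"Bearer ${cfg.token}"
    (url, auth)
  }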
-
final
case class
StorageConfig(dbType: DQStorageType, url: URI, username: Option[NonEmptyString], password: Option[NonEmptyString], schema: Option[NonEmptyString], saveErrorsToStorage: Boolean = false) extends Product with Serializable
Application-level configuration describing connection to history database.
- dbType
Type of database used to store DQ data (one of the supported RDBMS)
- url
Connection URL (without protocol identifiers)
- username
Username to connect to database with (if required)
- password
Password to connect to database with (if required)
- schema
Schema where data quality tables are located (if required)
- saveErrorsToStorage
Enables metric errors to be saved to the storage database. Use with care, as storing metric errors may overload the storage.
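A hedged sketch of collecting the optional connection settings into JDBC-style properties; only the documented fields are read, and the "user"/"password"/"currentSchema" keys are illustrative JDBC property names, not part of this API.
  import java.util.Properties
  def connectionProperties(cfg: StorageConfig): Properties = {
    val props = new Properties()
    cfg.username.foreach(u => props.setProperty("user", u.toString))
    cfg.password.foreach(p => props.setProperty("password", p.toString))
    cfg.schema.foreach(s => props.setProperty("currentSchema", s.toString))
    props
  }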
-
case class
StreamConfig(trigger: Duration = Duration("10s"), window: Duration = Duration("10m"), watermark: Duration = Duration("5m"), allowEmptyWindows: Boolean = false, checkpointDir: Option[URI] = None) extends Product with Serializable
Application-level configuration describing streaming settings
- trigger
Trigger interval: defines the time interval for which micro-batches are collected.
- window
Window interval: defines the tumbling window size used to accumulate metrics.
- watermark
Watermark level: defines the time interval after which late records are no longer processed.
- allowEmptyWindows
Boolean flag indicating whether empty windows are allowed. When a window falls below the watermark and some of the processed streams have no results for it, all related checks are skipped if this flag is set to 'true'. Otherwise, the checks are processed and return an error status with a 'metric results were not found' message.
- checkpointDir
Checkpoint directory. If not set, then checkpoints in streaming applications will not be saved.
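A sketch of overriding the trigger and window intervals while keeping the default watermark, assuming Duration here is scala.concurrent.duration.Duration (consistent with the string defaults shown in the signature).
  import scala.concurrent.duration.Duration
  // Collect micro-batches every 30 seconds into 15-minute windows;
  // watermark, allowEmptyWindows and checkpointDir keep their defaults.
  val streamConf: StreamConfig = StreamConfig(
    trigger = Duration("30s"),
    window  = Duration("15m")
  )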