org.checkita.dqf.core.metrics.rdd.regular.BasicStringRDDMetrics

DuplicateValuesRDDMetricCalculator

case class DuplicateValuesRDDMetricCalculator(numDuplicates: Long, uniqueValues: Set[String], failCount: Long = 0, status: CalculatorStatus = CalculatorStatus.Success, failMsg: String = "OK") extends RDDMetricCalculator with Product with Serializable

Calculates the number of duplicate values for a given column or tuple of columns. WARNING: To find duplicates, the unique values processed so far are stored in a set without any trimming or hashing. If the data contains a large number of distinct values, executors with insufficient memory may fail with an OOM error.

numDuplicates

Number of found duplicates

uniqueValues

Set of unique values obtained from already processed rows
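
The set-based counting described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `DuplicateCounter`, its `increment` method, and the separator used to join a tuple of columns into one string key are all assumptions made for this example. It also shows why memory grows with the number of distinct values: every new key is kept in the set in full.

```scala
// Hypothetical, simplified model of set-based duplicate counting.
// It does not use any Checkita classes.
final case class DuplicateCounter(
  numDuplicates: Long = 0L,
  uniqueValues: Set[String] = Set.empty
) {
  // Assumption: a tuple of columns is collapsed into a single string key.
  def increment(values: Seq[Any]): DuplicateCounter = {
    val key = values.map(v => if (v == null) "null" else v.toString).mkString("\u0001")
    if (uniqueValues.contains(key)) copy(numDuplicates = numDuplicates + 1) // seen before
    else copy(uniqueValues = uniqueValues + key)                            // first occurrence
  }
}

object DuplicateCounterDemo {
  def main(args: Array[String]): Unit = {
    val rows: Seq[Seq[Any]] =
      Seq(Seq("a", 1), Seq("b", 2), Seq("a", 1), Seq("a", 2), Seq("b", 2))
    val counter = rows.foldLeft(DuplicateCounter())((acc, row) => acc.increment(row))
    println(counter.numDuplicates)     // 2
    println(counter.uniqueValues.size) // 3
  }
}
```

Because the full string of every distinct value is retained, the size of the set is proportional to the cardinality of the column (or column tuple), which is the source of the OOM risk noted in the warning.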

Linear Supertypes
Serializable, Serializable, Product, Equals, RDDMetricCalculator, AnyRef, Any

Instance Constructors

  1. new DuplicateValuesRDDMetricCalculator()
  2. new DuplicateValuesRDDMetricCalculator(numDuplicates: Long, uniqueValues: Set[String], failCount: Long = 0, status: CalculatorStatus = CalculatorStatus.Success, failMsg: String = "OK")

    numDuplicates

    Number of found duplicates

    uniqueValues

    Set of unique values obtained from already processed rows

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def copyWithError(status: CalculatorStatus, msg: String, failInc: Long = 1): RDDMetricCalculator

    Copy calculator with error status and corresponding message.

    status

    Calculator status to copy with

    msg

    Failure message

    failInc

    Failure increment

    returns

    Copy of this calculator with error status

    Attributes
    protected
    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. val failCount: Long
    Attributes
    protected
    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  9. val failMsg: String
    Attributes
    protected
    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  10. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def getFailCounter: Long

    Gets the current metric failure count

    returns

    Failure count

    Definition Classes
    RDDMetricCalculator
  13. def getFailMessage: String

    Gets current failure or error message

    returns

    Failure message

    Definition Classes
    RDDMetricCalculator
  14. def getStatus: CalculatorStatus

    Gets current metric calculator status

    returns

    Calculator status

    Definition Classes
    RDDMetricCalculator
  15. def increment(values: Seq[Any]): RDDMetricCalculator

    Safely updates metric calculator

    values

    values to process

    returns

    updated calculator

    Definition Classes
    RDDMetricCalculator
  16. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  17. def merge(m2: RDDMetricCalculator): RDDMetricCalculator

    Merges two metric calculators together

    m2

    second metric calculator

    returns

    merged metric calculator

    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  18. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  20. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  21. val numDuplicates: Long
  22. def result(): Map[String, (Double, Option[String])]

    Gets results of calculator in the current state

    returns

    Map of (result_name -> (result, additionalResults))

    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  23. val status: CalculatorStatus
    Attributes
    protected
    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  24. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  25. def tryToIncrement(values: Seq[Any]): RDDMetricCalculator

    Increment metric calculator. May throw an exception.

    values

    values to process

    returns

    updated calculator or throws an exception

    Attributes
    protected
    Definition Classes
    DuplicateValuesRDDMetricCalculator → RDDMetricCalculator
  26. val uniqueValues: Set[String]
  27. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
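
The relationship between `increment`, `tryToIncrement`, `copyWithError`, `merge` and `result` listed above can be sketched with a small stand-alone model. Everything in it is an assumption made for illustration: `Calc` and `Status` are hypothetical stand-ins for the real calculator and `CalculatorStatus`, the null check is an invented failure condition, the result name `"DUPLICATE_VALUES"` is a placeholder, and the merge rule (adding the size of the intersection of the two sets) is simply one rule that keeps a merged result equal to a single sequential pass. The library's actual logic may differ.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins; none of these are the Checkita types.
sealed trait Status
object Status {
  case object Ok extends Status
  case object Error extends Status
}

final case class Calc(
  numDuplicates: Long = 0L,
  uniqueValues: Set[String] = Set.empty,
  failCount: Long = 0L,
  status: Status = Status.Ok,
  failMsg: String = "OK"
) {
  // May throw: nulls are rejected here purely to have a failure case.
  private def tryToIncrement(values: Seq[Any]): Calc = {
    require(values.forall(_ != null), "null value in row")
    val key = values.mkString("\u0001")
    if (uniqueValues(key)) copy(numDuplicates = numDuplicates + 1)
    else copy(uniqueValues = uniqueValues + key)
  }

  // Same state, but with an error status, a message and a bumped failure count.
  def copyWithError(s: Status, msg: String, failInc: Long = 1): Calc =
    copy(failCount = failCount + failInc, status = s, failMsg = msg)

  // "Safe" update: any exception is converted into an error state.
  def increment(values: Seq[Any]): Calc =
    Try(tryToIncrement(values)) match {
      case Success(updated) => updated
      case Failure(e)       => copyWithError(Status.Error, e.getMessage)
    }

  // Values seen once in each partition are duplicates of each other,
  // so the intersection size is added to the per-partition counts.
  def merge(that: Calc): Calc = {
    val crossPartition = uniqueValues.intersect(that.uniqueValues).size
    copy(
      numDuplicates = numDuplicates + that.numDuplicates + crossPartition,
      uniqueValues = uniqueValues ++ that.uniqueValues,
      failCount = failCount + that.failCount
    )
  }

  // Shape mirrors Map[String, (Double, Option[String])]; the key is a placeholder.
  def result(): Map[String, (Double, Option[String])] =
    Map(("DUPLICATE_VALUES", (numDuplicates.toDouble, Option.empty[String])))
}

object CalcDemo {
  def main(args: Array[String]): Unit = {
    val p1: Seq[Seq[Any]] = Seq(Seq("a"), Seq("b"), Seq("a"))
    val p2: Seq[Seq[Any]] = Seq(Seq("b"), Seq("c"))
    val merged = p1.foldLeft(Calc())(_ increment _).merge(p2.foldLeft(Calc())(_ increment _))
    val single = (p1 ++ p2).foldLeft(Calc())(_ increment _)
    println(merged.numDuplicates == single.numDuplicates) // true
    println(merged.result())
  }
}
```

In this model, folding each partition separately and merging gives the same duplicate count as one pass over all rows, which is the property a distributed calculator of this kind needs.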
