org.checkita.dqf.core.metrics.rdd.regular.MultiColumnRDDMetrics

LevenshteinDistanceRDDMetricCalculator

case class LevenshteinDistanceRDDMetricCalculator(cnt: Double, threshold: Double, normalize: Boolean, reversed: Boolean, failCount: Long = 0, status: CalculatorStatus = CalculatorStatus.Success, failMsg: String = "OK") extends RDDMetricCalculator with ReversibleRDDCalculator with Product with Serializable

Calculates the number of rows where the Levenshtein distance between two columns is less than the threshold.

cnt

current success counter

threshold

Threshold (should be within [0, 1] range for normalized results)

normalize

Flag to define whether distance should be normalized over maximum length of two input strings

returns

result map with keys: "LEVENSHTEIN_DISTANCE"
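
The success condition described above can be sketched in a few lines of plain Scala. This is a minimal illustration, not the library's implementation: the object `LevenshteinSketch` and its helpers `levenshtein`, `distance` and `isSuccess` are hypothetical names, and the handling of two empty strings (normalization is skipped) is an assumption.

```scala
object LevenshteinSketch {

  /** Classic dynamic-programming edit distance, keeping only two rows in memory. */
  def levenshtein(a: String, b: String): Int = {
    // prev(j) is the distance between the first i-1 characters of a
    // and the first j characters of b.
    var prev: Array[Int] = Array.range(0, b.length + 1)
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(
          math.min(curr(j - 1) + 1, prev(j) + 1), // insertion, deletion
          prev(j - 1) + cost                      // substitution or match
        )
      }
      prev = curr
    }
    prev(b.length)
  }

  /** Raw distance, or distance divided by the longer string's length when normalize is set. */
  def distance(a: String, b: String, normalize: Boolean): Double = {
    val raw = levenshtein(a, b).toDouble
    val maxLen = math.max(a.length, b.length)
    // Assumption: two empty strings are left un-normalized to avoid dividing by zero.
    if (normalize && maxLen > 0) raw / maxLen else raw
  }

  /** A row counts as a success when the distance is strictly below the threshold. */
  def isSuccess(a: String, b: String, threshold: Double, normalize: Boolean): Boolean =
    distance(a, b, normalize) < threshold
}
```

With `normalize = true` the distance always falls within [0, 1], which is why the threshold is expected to lie in that range for normalized results.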

Linear Supertypes
Serializable, Serializable, Product, Equals, ReversibleRDDCalculator, RDDMetricCalculator, AnyRef, Any

Instance Constructors

  1. new LevenshteinDistanceRDDMetricCalculator(threshold: Double, normalize: Boolean, reversed: Boolean)
  2. new LevenshteinDistanceRDDMetricCalculator(cnt: Double, threshold: Double, normalize: Boolean, reversed: Boolean, failCount: Long = 0, status: CalculatorStatus = CalculatorStatus.Success, failMsg: String = "OK")

    cnt

    current success counter

    threshold

    Threshold (should be within [0, 1] range for normalized results)

    normalize

    Flag to define whether distance should be normalized over maximum length of two input strings

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. val cnt: Double
  7. def copyWithError(status: CalculatorStatus, msg: String, failInc: Long = 1): RDDMetricCalculator

    Copy calculator with error status and corresponding message.

    status

    Calculator status to copy with

    msg

    Failure message

    failInc

    Failure increment

    returns

    Copy of this calculator with error status

    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. val failCount: Long
    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  10. val failMsg: String
    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  11. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. def getFailCounter: Long

    Gets current metric failure counts

    returns

    Failure count

    Definition Classes
    RDDMetricCalculator
  14. def getFailMessage: String

    Gets current failure or error message

    returns

    Failure message

    Definition Classes
    RDDMetricCalculator
  15. def getStatus: CalculatorStatus

    Gets current metric calculator status

    returns

    Calculator status

    Definition Classes
    RDDMetricCalculator
  16. def increment(values: Seq[Any]): RDDMetricCalculator

    Safely updates metric calculator with respect to specified error collection logic (direct or reversed).

    values

    values to process

    returns

    updated calculator

    Definition Classes
    ReversibleRDDCalculator
  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. def merge(m2: RDDMetricCalculator): RDDMetricCalculator

    Merges two metric calculators together

    m2

    second metric calculator

    returns

    merged metric calculator

    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  19. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  20. val normalize: Boolean
  21. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  22. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  23. def result(): Map[String, (Double, Option[String])]

    Gets results of calculator in the current state

    returns

    Map of (result_name -> (result, additionalResults))

    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  24. val reversed: Boolean
    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → ReversibleRDDCalculator
  25. val status: CalculatorStatus
    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  26. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  27. val threshold: Double
  28. def tryToIncrement(values: Seq[Any]): RDDMetricCalculator

    Increment metric calculator. May throw an exception. Direct error collection logic implies that rows where the Levenshtein distance between the two string values is greater than or equal to the provided threshold are treated as metric failures and are collected.

    values

    values to process

    returns

    updated calculator or throws an exception

    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → RDDMetricCalculator
  29. def tryToIncrementReversed(values: Seq[Any]): RDDMetricCalculator

    Increment metric calculator with REVERSED error collection logic. May throw an exception. Reversed error collection logic implies that rows where the Levenshtein distance between the two string values is lower than the provided threshold are treated as metric failures and are collected.

    values

    values to process

    returns

    updated calculator or throws an exception

    Attributes
    protected
    Definition Classes
    LevenshteinDistanceRDDMetricCalculator → ReversibleRDDCalculator
  30. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
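
The interplay of `increment`, `tryToIncrement`, `tryToIncrementReversed` and `merge` listed above can be modelled with a small stand-alone case class. This is only a sketch under stated assumptions, not the library's code: `SketchCalc` and `ReversibleCalcSketch` are hypothetical names, the status field is omitted, the failure messages are invented, and it is assumed that `cnt` counts rows below the threshold in both modes, with `reversed` only changing which rows are reported as failures.

```scala
import scala.util.{Failure, Success, Try}

object ReversibleCalcSketch {

  // Hypothetical stand-in for the calculator; immutable, so every update returns a copy.
  final case class SketchCalc(
      cnt: Double,
      threshold: Double,
      reversed: Boolean,
      failCount: Long = 0,
      failMsg: String = "OK"
  ) {

    // Row-by-row Levenshtein distance built with folds.
    private def dist(a: String, b: String): Int =
      a.foldLeft((0 to b.length).toVector) { (prev, ca) =>
        b.zipWithIndex.foldLeft(Vector(prev.head + 1)) { case (curr, (cb, j)) =>
          val cost = if (ca == cb) 0 else 1
          curr :+ math.min(math.min(curr.last + 1, prev(j + 1) + 1), prev(j) + cost)
        }
      }.last

    // Extracts exactly two strings; throws on any other input.
    private def pair(values: Seq[Any]): (String, String) = values match {
      case Seq(a: String, b: String) => (a, b)
      case other => throw new IllegalArgumentException(s"Expected two strings, got: $other")
    }

    // Direct logic: distance >= threshold is a failure.
    def tryToIncrement(values: Seq[Any]): SketchCalc = {
      val (a, b) = pair(values)
      if (dist(a, b) < threshold) copy(cnt = cnt + 1)
      else copy(failCount = failCount + 1, failMsg = "distance is not below threshold")
    }

    // Reversed logic: distance < threshold is a failure.
    // Assumption: the row is still counted in cnt.
    def tryToIncrementReversed(values: Seq[Any]): SketchCalc = {
      val (a, b) = pair(values)
      if (dist(a, b) < threshold)
        copy(cnt = cnt + 1, failCount = failCount + 1, failMsg = "distance is below threshold")
      else this
    }

    // Safe entry point: picks the logic by the reversed flag and turns exceptions into failures.
    def increment(values: Seq[Any]): SketchCalc =
      Try(if (reversed) tryToIncrementReversed(values) else tryToIncrement(values)) match {
        case Success(next) => next
        case Failure(e)    => copy(failCount = failCount + 1, failMsg = s"error: ${e.getMessage}")
      }

    // Combines partial results, for example from two partitions.
    def merge(other: SketchCalc): SketchCalc =
      copy(cnt = cnt + other.cnt, failCount = failCount + other.failCount)
  }
}
```

Keeping the two `tryToIncrement` variants free to throw and wrapping them once in `increment` means that a malformed row ends up recorded as a failure on the returned copy instead of aborting the whole computation.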
