ml.dmlc.xgboost4j.scala.spark

XGBoostEstimator


class XGBoostEstimator extends Predictor[Vector, XGBoostEstimator, XGBoostModel] with LearningTaskParams with GeneralParams with BoosterParams

An XGBoost estimator that produces an XGBoostModel.

Linear Supertypes
BoosterParams, GeneralParams, LearningTaskParams, Predictor[Vector, XGBoostEstimator, XGBoostModel], PredictorParams, HasPredictionCol, HasFeaturesCol, HasLabelCol, Estimator[XGBoostModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new XGBoostEstimator(uid: String)

  2. new XGBoostEstimator(xgboostParams: Map[String, Any])
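The sketch below shows one way to build and use an estimator with the members listed on this page (`set`, `fit`, `setFeaturesCol`, `setLabelCol`). It assumes xgboost4j-spark and Spark ML are on the classpath and has not been run against a cluster. The object name, method names, uid string, column names and parameter values are all illustrative. The key names expected by the `Map[String, Any]` constructor are not listed on this page, so the sketch configures the typed params instead.

```scala
import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstimator, XGBoostModel}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.Dataset

// Hypothetical helper object; only the XGBoostEstimator members come from this page.
object EstimatorSketch {

  // Build an estimator through the uid constructor and set a few typed params.
  def configured(): XGBoostEstimator = {
    val est = new XGBoostEstimator("xgb-sketch") // illustrative uid
    est.set(est.eta, 0.1)
    est.set(est.maxDepth, 4)
    est.set(est.round, 20)
    est.set(est.objective, "binary:logistic")
    est.setFeaturesCol("features").setLabelCol("label")
  }

  // Fit on a dataset assumed to hold a vector "features" column
  // and a numeric "label" column.
  def train(training: Dataset[_]): XGBoostModel =
    configured().fit(training)

  // Same fit, but with eta overridden for this call only through a ParamMap.
  def trainWithSmallerEta(training: Dataset[_]): XGBoostModel = {
    val est = configured()
    est.fit(training, ParamMap(est.eta -> 0.05))
  }
}
```

Values set this way can be read back with `getOrDefault`, for example `est.getOrDefault(est.eta)`, and `explainParams()` prints the documentation and current value of every param.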

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. val alpha: DoubleParam

    L1 regularization term on weights. Increasing this value makes the model more conservative. [default=0]

    Definition Classes
    BoosterParams
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. val baseScore: DoubleParam

    The initial prediction score of all instances (global bias). [default=0.5]

    Definition Classes
    LearningTaskParams
  8. val boosterType: Param[String]

    Booster to use. Options: {'gbtree', 'gblinear', 'dart'}

    Definition Classes
    BoosterParams
  9. final def clear(param: Param[_]): XGBoostEstimator.this.type

    Definition Classes
    Params
  10. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  11. val colSampleByLevel: DoubleParam

    Subsample ratio of columns for each split, in each level. [default=1] range: (0,1]

    Definition Classes
    BoosterParams
  12. val colSampleByTree: DoubleParam

    Subsample ratio of columns when constructing each tree. [default=1] range: (0,1]

    Definition Classes
    BoosterParams
  13. def copy(extra: ParamMap): XGBoostEstimator

    Definition Classes
    XGBoostEstimator → Predictor → Estimator → PipelineStage → Params
  14. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  15. val customEval: Param[EvalTrait]

    Customized evaluation function provided by the user. default: null

    Definition Classes
    GeneralParams
  16. val customObj: Param[ObjectiveTrait]

    Customized objective function provided by the user. default: null

    Definition Classes
    GeneralParams
  17. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  18. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  20. val eta: DoubleParam

    Step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. [default=0.3] range: [0,1]

    Definition Classes
    BoosterParams
  21. val evalMetric: Param[String]

    Evaluation metrics for validation data. A default metric is assigned according to the objective (rmse for regression, error for classification, mean average precision for ranking). Options: rmse, mae, logloss, error, merror, mlogloss, auc, ndcg, map, gamma-deviance

    Definition Classes
    LearningTaskParams
  22. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  23. def explainParams(): String

    Explains all params of this instance. See explainParam().

    Definition Classes
    BoosterParams → Params
  24. def extractLabeledPoints(dataset: Dataset[_]): RDD[org.apache.spark.ml.feature.LabeledPoint]

    Attributes
    protected
    Definition Classes
    Predictor
  25. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  26. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  27. final val featuresCol: Param[String]

    Definition Classes
    HasFeaturesCol
  28. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  29. def fit(dataset: Dataset[_]): XGBoostModel

    Definition Classes
    Predictor → Estimator
  30. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[XGBoostModel]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  31. def fit(dataset: Dataset[_], paramMap: ParamMap): XGBoostModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  32. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): XGBoostModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  33. val gamma: DoubleParam

    Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger the value, the more conservative the algorithm. [default=0] range: [0, Double.MaxValue]

    Definition Classes
    BoosterParams
  34. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  35. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  36. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  37. final def getFeaturesCol: String

    Definition Classes
    HasFeaturesCol
  38. final def getLabelCol: String

    Definition Classes
    HasLabelCol
  39. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  40. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  41. final def getPredictionCol: String

    Definition Classes
    HasPredictionCol
  42. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  43. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  44. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  45. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  46. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  47. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  48. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  49. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  50. final val labelCol: Param[String]

    Definition Classes
    HasLabelCol
  51. val lambda: DoubleParam

    L2 regularization term on weights. Increasing this value makes the model more conservative. [default=1]

    Definition Classes
    BoosterParams
  52. val lambdaBias: DoubleParam

    Parameter of the linear booster: L2 regularization term on bias. [default=0] (There is no L1 regularization on bias because it is not important.)

    Definition Classes
    BoosterParams
  53. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  54. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  55. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  56. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  57. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  58. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  59. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  60. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  61. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  62. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  63. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  64. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  65. val maxDeltaStep: DoubleParam

    Maximum delta step allowed for each tree's weight estimation. A value of 0 means there is no constraint. A positive value can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when the classes are extremely imbalanced; setting it to a value of 1-10 might help control the update. [default=0] range: [0, Double.MaxValue]

    Definition Classes
    BoosterParams
  66. val maxDepth: IntParam

    Maximum depth of a tree. Increasing this value makes the model more complex and more likely to overfit. [default=6] range: [1, Int.MaxValue]

    Definition Classes
    BoosterParams
  67. val minChildWeight: DoubleParam

    Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node whose sum of instance weight is less than min_child_weight, the building process gives up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger the value, the more conservative the algorithm. [default=1] range: [0, Double.MaxValue]

    Definition Classes
    BoosterParams
  68. val missing: FloatParam

    The value treated as missing. default: Float.NaN

    Definition Classes
    GeneralParams
  69. val nWorkers: IntParam

    Number of workers used to train the XGBoost model. default: 1

    Definition Classes
    GeneralParams
  70. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  71. val normalizeType: Param[String]

    Parameter of the Dart booster: type of normalization algorithm. Options: {'tree', 'forest'}. [default="tree"]

    Definition Classes
    BoosterParams
  72. final def notify(): Unit

    Definition Classes
    AnyRef
  73. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  74. val numThreadPerTask: IntParam

    Number of threads used per worker. default: 1

    Definition Classes
    GeneralParams
  75. val objective: Param[String]

    Specifies the learning task and the corresponding learning objective. Options: reg:linear, reg:logistic, binary:logistic, binary:logitraw, count:poisson, multi:softmax, multi:softprob, rank:pairwise, reg:gamma. default: reg:linear

    Definition Classes
    LearningTaskParams
  76. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  77. final val predictionCol: Param[String]

    Definition Classes
    HasPredictionCol
  78. val rateDrop: DoubleParam

    Parameter of the Dart booster: dropout rate. [default=0.0] range: [0.0, 1.0]

    Definition Classes
    BoosterParams
  79. val round: IntParam

    The number of rounds for boosting.

    Definition Classes
    GeneralParams
  80. val sampleType: Param[String]

    Parameter of the Dart booster: type of sampling algorithm. "uniform": dropped trees are selected uniformly. "weighted": dropped trees are selected in proportion to weight. [default="uniform"]

    Definition Classes
    BoosterParams
  81. val scalePosWeight: DoubleParam

    Controls the balance of positive and negative weights; useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases). [default=0]

    Definition Classes
    BoosterParams
  82. final def set(paramPair: ParamPair[_]): XGBoostEstimator.this.type

    Attributes
    protected
    Definition Classes
    Params
  83. final def set(param: String, value: Any): XGBoostEstimator.this.type

    Attributes
    protected
    Definition Classes
    Params
  84. final def set[T](param: Param[T], value: T): XGBoostEstimator.this.type

    Definition Classes
    Params
  85. final def setDefault(paramPairs: ParamPair[_]*): XGBoostEstimator.this.type

    Attributes
    protected
    Definition Classes
    Params
  86. final def setDefault[T](param: Param[T], value: T): XGBoostEstimator.this.type

    Attributes
    protected
    Definition Classes
    Params
  87. def setFeaturesCol(value: String): XGBoostEstimator

    Definition Classes
    Predictor
  88. def setLabelCol(value: String): XGBoostEstimator

    Definition Classes
    Predictor
  89. def setPredictionCol(value: String): XGBoostEstimator

    Definition Classes
    Predictor
  90. val silent: IntParam

    0 prints running messages; 1 enables silent mode. default: 0

    Definition Classes
    GeneralParams
  91. val sketchEps: DoubleParam

    Used only by the approximate greedy algorithm. This roughly translates into O(1 / sketch_eps) bins. Compared to directly selecting the number of bins, this comes with a theoretical guarantee on sketch accuracy. [default=0.03] range: (0, 1)

    Definition Classes
    BoosterParams
  92. val skipDrop: DoubleParam

    Parameter of the Dart booster: probability of skipping dropout. If a dropout is skipped, new trees are added in the same manner as gbtree. [default=0.0] range: [0.0, 1.0]

    Definition Classes
    BoosterParams
  93. val subSample: DoubleParam

    Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly collects half of the data instances to grow trees, which helps prevent overfitting. [default=1] range: (0,1]

    Definition Classes
    BoosterParams
  94. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  95. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  96. def train(trainingSet: Dataset[_]): XGBoostModel

    Produces an XGBoostModel by fitting the given dataset.

    Definition Classes
    XGBoostEstimator → Predictor
  97. def transformSchema(schema: StructType): StructType

    Definition Classes
    Predictor → PipelineStage
  98. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  99. val treeMethod: Param[String]

    The tree construction algorithm used in XGBoost. Options: {'auto', 'exact', 'approx'} [default='auto']

    Definition Classes
    BoosterParams
  100. val uid: String

    Definition Classes
    XGBoostEstimator → Identifiable
  101. val useExternalMemory: BooleanParam

    Whether to use external memory as cache. default: false

    Definition Classes
    GeneralParams
  102. def validateAndTransformSchema(schema: StructType, fitting: Boolean, featuresDataType: DataType): StructType

    Attributes
    protected
    Definition Classes
    PredictorParams
  103. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  104. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  105. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
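
Two of the parameter descriptions above contain small pieces of arithmetic: `scalePosWeight` suggests sum(negative cases) / sum(positive cases), and `sketchEps` corresponds to roughly 1 / sketch_eps bins. The dependency-free sketch below works through both. The object and function names are hypothetical, labels are assumed to be encoded as 0.0 and 1.0, and rounding the bin count is an illustrative choice, since the description gives only an order of magnitude.

```scala
// Hypothetical helpers that restate the formulas from the parameter descriptions.
object ParamArithmetic {

  // scalePosWeight suggestion: number of negative cases / number of positive cases.
  // Assumes binary labels encoded as 0.0 (negative) and 1.0 (positive).
  def suggestedScalePosWeight(labels: Seq[Double]): Double = {
    val positives = labels.count(_ == 1.0)
    val negatives = labels.count(_ == 0.0)
    require(positives > 0, "need at least one positive label")
    negatives.toDouble / positives
  }

  // sketchEps maps to roughly 1 / sketch_eps bins; the documented range is (0, 1).
  def approxBins(sketchEps: Double): Int = {
    require(sketchEps > 0.0 && sketchEps < 1.0, "sketchEps must lie in (0, 1)")
    math.round(1.0 / sketchEps).toInt
  }

  def main(args: Array[String]): Unit = {
    // 2 positive and 8 negative labels.
    val labels = Seq(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    println(suggestedScalePosWeight(labels)) // 8 / 2 = 4.0
    println(approxBins(0.03))                // 1 / 0.03 is about 33
  }
}
```

The resulting ratio could then be applied with `set(estimator.scalePosWeight, ratio)` on an estimator instance.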

Deprecated Value Members

  1. def validateParams(): Unit

    Definition Classes
    Params
    Annotations
    @deprecated
    Deprecated

    (Since version 2.0.0) Will be removed in 2.1.0. Checks should be merged into transformSchema.
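
Since the deprecation note points to `transformSchema`, a schema can be checked up front by calling the public `transformSchema(schema: StructType)` overload listed above. The sketch below assumes xgboost4j-spark and Spark ML are on the classpath, that the default column names are "features" and "label", and that the features column uses Spark's ML vector type. The object name, method name and uid are hypothetical.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostEstimator
import org.apache.spark.ml.linalg.SQLDataTypes.VectorType
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Hypothetical helper object illustrating a schema check before fitting.
object SchemaCheck {

  // Describe the input columns the estimator is assumed to expect by default.
  val input: StructType = StructType(Seq(
    StructField("features", VectorType, nullable = false),
    StructField("label", DoubleType, nullable = false)))

  // Returns the schema the estimator reports for the given input.
  // A schema the estimator cannot accept is expected to raise an exception here
  // instead of failing later during fit.
  def outputSchema(): StructType =
    new XGBoostEstimator("xgb-schema-sketch").transformSchema(input)
}
```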
