LightGBM Parameters
Official documentation:
- English: https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html#lightgbm.LGBMRegressor
- Chinese: https://lightgbm.cn/docs/6/
The LGBMRegressor constructor accepts the following parameters:
lightgbm.LGBMRegressor(boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=100, subsample_for_bin=200000, objective=None, class_weight=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=0, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=None, importance_type='split', **kwargs)
Recommended parameter ranges and candidate values:
- Learning rate: [0.01, 0.15]
- Maximum depth: [3, 25]
- Feature fraction / colsample_bytree: [0.5, 1]
- Bagging fraction / subsample: [0.5, 1]
- lambda_l1 / reg_alpha: [0, 0.01–0.1, 1]
- lambda_l2 / reg_lambda: [0, 0.1, 0.5, 1]
- min_gain_to_split / min_split_gain: [0, 0.05–0.1, 0.3, 0.5, 0.7, 0.9, 1]
- min_sum_hessian_in_leaf / min_child_weight: [1, 3, 5, 7]
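The ranges above can be collected into a search-space dictionary for use with, e.g., scikit-learn's RandomizedSearchCV. A minimal sketch in plain Python; the specific candidate values chosen are illustrative, and the keys follow the LGBMRegressor constructor names (reg_alpha/reg_lambda correspond to lambda_l1/lambda_l2, min_child_weight to min_sum_hessian_in_leaf):

```python
# Candidate values drawn from the recommended ranges above.
lgbm_search_space = {
    "learning_rate": [0.01, 0.05, 0.1, 0.15],
    "max_depth": [3, 5, 7, 10, 15, 25],
    "colsample_bytree": [0.5, 0.7, 0.9, 1.0],   # feature fraction
    "subsample": [0.5, 0.7, 0.9, 1.0],          # bagging fraction
    "reg_alpha": [0, 0.01, 0.1, 1],             # lambda_l1
    "reg_lambda": [0, 0.1, 0.5, 1],             # lambda_l2
    "min_split_gain": [0, 0.05, 0.1, 0.3, 0.5, 0.7, 0.9, 1],
    "min_child_weight": [1, 3, 5, 7],           # min_sum_hessian_in_leaf
}

for name, values in lgbm_search_space.items():
    print(f"{name}: {len(values)} candidates")
```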
XGBoost
Official documentation: https://xgboost.readthedocs.io/en/latest/parameter.html#general-parameters
Advantages
- Regularization
- Parallel processing
- Customizable optimization objectives and evaluation metrics
- Built-in handling of missing values
- Greedy algorithm for pruning
Key Parameters
- learning_rate: Shrinks the contribution of each tree, making the model more robust. Typical values: 0.01-0.2.
- min_child_weight: Minimum sum of instance weight needed in a child. Prevents overfitting. Default: 1.
- max_depth: Maximum depth of a tree. Prevents overfitting. Typical values: 3-10. Default: 6.
- gamma: Minimum loss reduction required to make a further partition. Default: 0.
- subsample: Subsample ratio of the training instances. Prevents overfitting. Typical values: 0.5-1. Default: 1.
- colsample_bytree: Subsample ratio of columns when constructing each tree. Typical values: 0.5-1. Default: 1.
Visualization
import timeit
import xgboost as xgb

# Feature importance plot (requires matplotlib)
xgb.plot_importance(model)
# Tree visualization (requires graphviz)
xgb.plot_tree(model, num_trees=2)

# Timing execution
start = timeit.default_timer()
# Code to be timed
end = timeit.default_timer()
print(f"Execution time: {end - start} seconds")
Support Vector Regression (SVR) Parameters
Official documentation: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html
When optimizing hyperparameters for SVR, it's often more effective to focus on C and epsilon rather than C and gamma. The former combination typically yields better performance improvements.
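A minimal sketch of that advice using scikit-learn's SVR with GridSearchCV, searching over C and epsilon; the toy data and grid values are illustrative:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Toy regression data (illustrative)
rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = X[:, 0] + 0.1 * rng.rand(100)

# Search over C and epsilon, as recommended above
param_grid = {"C": [0.1, 1, 10], "epsilon": [0.01, 0.1, 0.5]}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```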
Random Forest
Key Parameters
- max_features: Increasing this value generally improves the performance of individual trees but reduces diversity between trees, which can hurt the ensemble and slows training.
- n_estimators: Number of trees in the forest. Higher values yield better performance but slower computation.
- min_samples_leaf: Important parameter. For small datasets: 1-50, for large datasets: 200-300.
- min_samples_split: Range: 2-30.
Random Forest is relatively robust to parameter changes, often producing good results even with default parameters.
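A hedged sketch with the key parameters above set explicitly; the toy data and chosen values are illustrative (min_samples_leaf=5 reflects the small-dataset guidance):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy regression data (illustrative)
rng = np.random.RandomState(0)
X = rng.rand(300, 4)
y = X[:, 0] * 3 + X[:, 1] + rng.rand(300) * 0.1

model = RandomForestRegressor(
    n_estimators=200,       # more trees: better but slower
    max_features="sqrt",    # limits features per split for diversity
    min_samples_leaf=5,     # small dataset -> small value
    min_samples_split=4,    # within the 2-30 range above
    random_state=0,
)
model.fit(X, y)
print(round(model.score(X, y), 3))
```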
Bayesian Optimization with Tree-structured Parzen Estimator
Resources:
- https://optunity.readthedocs.io/en/latest/user/solvers/TPE.html#hyperopt
- https://github.com/WillKoehrsen/hyperparameter_optimization/blob/master/Introduction%20to%20Bayesian%20Optimization%20with%20Hyperopt.ipynb
Parameter Space Definition Functions
- hp.pchoice(label, p_options): Returns one option from a list of (probability, option) pairs, sampled with the given probabilities.
- hp.uniform(label, low, high): Uniform distribution between low and high.
- hp.quniform(label, low, high, q): Returns round(uniform(low, high) / q) * q, i.e. a uniform value quantized to a multiple of q.
- hp.loguniform(label, low, high): Value whose log is uniformly distributed.
- hp.randint(label, upper): Random integer in [0, upper).
- hp.normal(label, mu, sigma): Normal distribution with mean mu and standard deviation sigma.
- hp.qnormal(label, mu, sigma, q): Quantized normal distribution.
- hp.lognormal(label, mu, sigma): Log-normal distribution.
- hp.qlognormal(label, mu, sigma, q): Quantized log-normal distribution.
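The q-prefixed distributions all quantize their base distribution to multiples of q. A hyperopt-free sketch of the hp.quniform rule, round(uniform(low, high) / q) * q:

```python
import random

rng = random.Random(0)

def quniform(low, high, q):
    # hp.quniform rule: sample uniformly, then snap to a multiple of q
    return round(rng.uniform(low, high) / q) * q

# With q=1 this yields integers in [low, high], e.g. for num_leaves-style params
samples = [quniform(3, 10, 1) for _ in range(1000)]
print(sorted(set(samples)))
```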
Optimization Algorithms
- Random search (hyperopt.rand.suggest)
- Simulated annealing (hyperopt.anneal.suggest)
- TPE algorithm (hyperopt.tpe.suggest)
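A conceptual sketch of what random search (hyperopt.rand.suggest) does, in plain Python; hyperopt's fmin wraps this loop and can swap in smarter samplers such as TPE. The toy objective is illustrative:

```python
import random

def objective(x):
    # Toy objective with minimum at x = 2
    return (x - 2) ** 2

rng = random.Random(42)
best_x, best_loss = None, float("inf")
for _ in range(500):
    x = rng.uniform(-10, 10)   # sample uniformly from the search space
    loss = objective(x)
    if loss < best_loss:       # keep the best trial seen so far
        best_x, best_loss = x, loss

print(round(best_x, 2), round(best_loss, 4))
```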
One-Hot Encoding Example
# Reshape a single integer-encoded feature and one-hot encode it
import numpy
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

feature = feature.reshape(X.shape[0], 1)
encoder = OneHotEncoder(sparse_output=False)  # use sparse=False on scikit-learn < 1.2
feature = encoder.fit_transform(feature)

# Encode string input values as integers, then one-hot encode each column
encoded_features = None
for i in range(X.shape[1]):
    label_encoder = LabelEncoder()
    feature = label_encoder.fit_transform(X[:, i])
    feature = feature.reshape(X.shape[0], 1)
    encoder = OneHotEncoder(sparse_output=False)
    feature = encoder.fit_transform(feature)
    if encoded_features is None:
        encoded_features = feature
    else:
        encoded_features = numpy.concatenate((encoded_features, feature), axis=1)
print(f"Encoded features shape: {encoded_features.shape}")
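When the data lives in a DataFrame, the column-by-column loop above can often be replaced by pandas.get_dummies, which one-hot encodes all string columns at once. The toy frame here is illustrative:

```python
import pandas as pd

# Two string columns: "color" has 2 categories, "size" has 2 categories
df = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "M", "S"]})

encoded = pd.get_dummies(df)  # one column per (feature, category) pair
print(encoded.shape)
```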
Data Loading Example
# Load data from URL
dataset = pd.read_csv('https://labfile.oss.aliyuncs.com/courses/1283/adult.data.csv')
print(dataset.head())