Repository Analysis

lightgbm-org/LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

5.5 Low AI signal
Adjusted Score: 5.5
Raw Score: 5.5
Time Factor: 100%
Last Push: 2026-05-30
Stars: 18,409
Language: C++
Lines of Code: 111,619
Files: 347
Pattern Hits: 609
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 6 · MEDIUM 13 · LOW 590

Pattern Findings

609 matches across 10 categories.
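The totals reported on this page can be cross-checked against the per-category and per-severity counts. A minimal sketch in Python; the variable names are illustrative, and every number is copied from this report:

```python
# Hit counts per category, copied from the category headers in this report.
category_hits = {
    "Hyper-Verbose Identifiers": 242,
    "Over-Commented Block": 221,
    "Redundant / Tautological Comments": 34,
    "Unused Imports": 45,
    "Deep Nesting": 43,
    "Cross-Language Confusion": 3,
    "Self-Referential Comments": 7,
    "AI Slop Vocabulary": 8,
    "Cross-File Repetition": 3,
    "Slop Phrases": 3,
}

# Hit counts per severity, copied from the Severity Breakdown.
severity_hits = {"CRITICAL": 0, "HIGH": 6, "MEDIUM": 13, "LOW": 590}

total_by_category = sum(category_hits.values())
total_by_severity = sum(severity_hits.values())

# prints: 10 609 609
print(len(category_hits), total_by_category, total_by_severity)
```

Both sums agree with the 609 pattern hits shown in the summary.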

Hyper-Verbose Identifiers: 242 hits · 200 pts
Severity | File | Line | Snippet
LOW | .ci/parameter-generator.py | 199 | def gen_parameter_description(
LOW | tests/python_package_test/test_engine.py | 2395 | def test_refit_with_one_tree_regression():
LOW | tests/python_package_test/test_engine.py | 2404 | def test_refit_with_one_tree_binary_classification():
LOW | tests/python_package_test/test_engine.py | 2413 | def test_refit_with_one_tree_multiclass_classification():
LOW | tests/python_package_test/test_engine.py | 2422 | def test_refit_dataset_params(rng):
LOW | tests/python_package_test/test_engine.py | 2494 | def test_constant_features_regression():
LOW | tests/python_package_test/test_engine.py | 2501 | def test_constant_features_binary():
LOW | tests/python_package_test/test_engine.py | 2507 | def test_constant_features_multiclass():
LOW | tests/python_package_test/test_engine.py | 2513 | def test_constant_features_multiclassova():
LOW | tests/python_package_test/test_engine.py | 4776 | def test_train_and_cv_raise_informative_error_for_train_set_of_wrong_type():
LOW | tests/python_package_test/test_engine.py | 4784 | def test_train_and_cv_raise_informative_error_for_impossible_num_boost_round(num_boost_round):
LOW | tests/python_package_test/test_engine.py | 4793 | def test_train_raises_informative_error_if_any_valid_sets_are_not_dataset_objects():
LOW | tests/python_package_test/test_engine.py | 161 | def test_missing_value_handle():
LOW | tests/python_package_test/test_engine.py | 181 | def test_missing_value_handle_more_na():
LOW | tests/python_package_test/test_engine.py | 201 | def test_missing_value_handle_na():
LOW | tests/python_package_test/test_engine.py | 232 | def test_missing_value_handle_zero():
LOW | tests/python_package_test/test_engine.py | 263 | def test_missing_value_handle_none():
LOW | tests/python_package_test/test_engine.py | 358 | def test_categorical_handle_na(use_quantized_grad):
LOW | tests/python_package_test/test_engine.py | 408 | def test_categorical_non_zero_inputs(use_quantized_grad):
LOW | tests/python_package_test/test_engine.py | 487 | def test_multiclass_prediction_early_stopping():
LOW | tests/python_package_test/test_engine.py | 659 | def test_ranking_prediction_early_stopping():
LOW | tests/python_package_test/test_engine.py | 744 | def test_ranking_with_position_information_with_file(tmp_path):
LOW | tests/python_package_test/test_engine.py | 797 | def test_ranking_with_position_information_with_dataset_constructor(tmp_path):
LOW | tests/python_package_test/test_engine.py | 893 | def test_early_stopping_ignores_training_set(use_valid):
LOW | tests/python_package_test/test_engine.py | 931 | def test_early_stopping_via_global_params(first_metric_only):
LOW | tests/python_package_test/test_engine.py | 959 | def test_early_stopping_is_not_enabled_for_non_positive_stopping_rounds(early_stopping_round):
LOW | tests/python_package_test/test_engine.py | 1009 | def test_early_stopping_min_delta(first_only, single_metric, greater_is_better):
LOW | tests/python_package_test/test_engine.py | 1088 | def test_early_stopping_min_delta_via_global_params(early_stopping_min_delta):
LOW | tests/python_package_test/test_engine.py | 1110 | def test_early_stopping_can_be_triggered_via_custom_callback():
LOW | tests/python_package_test/test_engine.py | 1113 | def _early_stop_after_seventh_iteration(env):
LOW | tests/python_package_test/test_engine.py | 1158 | def test_continue_train_reused_dataset():
LOW | tests/python_package_test/test_engine.py | 1190 | def test_continue_train_multiclass():
LOW | tests/python_package_test/test_engine.py | 1278 | def test_cv_works_with_init_model(tmp_path):
LOW | tests/python_package_test/test_engine.py | 1452 | def test_feature_name_with_non_ascii(rng, tmp_path):
LOW | tests/python_package_test/test_engine.py | 1469 | def test_parameters_are_loaded_from_model_file(tmp_path, capsys, rng):
LOW | tests/python_package_test/test_engine.py | 1519 | def test_string_serialized_params_retrieval(rng):
LOW | tests/python_package_test/test_engine.py | 1563 | def test_save_load_copy_pickle(tmp_path):
LOW | tests/python_package_test/test_engine.py | 1596 | def test_all_expected_params_are_written_out_to_model_text(tmp_path):
LOW | tests/python_package_test/test_engine.py | 1965 | def test_contribs_sparse_multiclass():
LOW | tests/python_package_test/test_engine.py | 2065 | def train_and_get_predictions(features, labels):
LOW | tests/python_package_test/test_engine.py | 2141 | def test_training_on_constructed_subset_without_params(rng):
LOW | tests/python_package_test/test_engine.py | 2153 | def generate_trainset_for_monotone_constraints_tests(x3_to_category=True):
LOW | tests/python_package_test/test_engine.py | 2186 | def test_monotone_constraints(test_with_categorical_variable):
LOW | tests/python_package_test/test_engine.py | 2223 | def are_interactions_enforced(gbm, feature_sets):
LOW | tests/python_package_test/test_engine.py | 2271 | def are_first_splits_non_monotone(tree, n, monotone_constraints):
LOW | tests/python_package_test/test_engine.py | 2282 | def are_there_monotone_splits(tree, monotone_constraints):
LOW | tests/python_package_test/test_engine.py | 2313 | def test_monotone_penalty_max():
LOW | tests/python_package_test/test_engine.py | 2453 | def test_mape_for_specific_boosting_types(boosting_type):
LOW | tests/python_package_test/test_engine.py | 2947 | def test_multiple_feval_train():
LOW | tests/python_package_test/test_engine.py | 2972 | def test_objective_callable_train_binary_classification():
LOW | tests/python_package_test/test_engine.py | 2985 | def test_objective_callable_train_regression():
LOW | tests/python_package_test/test_engine.py | 2996 | def test_objective_callable_cv_binary_classification():
LOW | tests/python_package_test/test_engine.py | 3008 | def test_objective_callable_cv_regression():
LOW | tests/python_package_test/test_engine.py | 3041 | def test_default_objective_and_metric():
LOW | tests/python_package_test/test_engine.py | 3063 | def test_multiclass_custom_objective(use_weight):
LOW | tests/python_package_test/test_engine.py | 3088 | def test_multiclass_custom_eval(use_weight):
LOW | tests/python_package_test/test_engine.py | 3157 | def test_get_split_value_histogram(rng_fixed_seed):
LOW | tests/python_package_test/test_engine.py | 3246 | def test_early_stopping_for_only_first_metric():
LOW | tests/python_package_test/test_engine.py | 3247 | def metrics_combination_train_regression(valid_sets, metric_list, assumed_iteration, first_metric_only, feval=None):
LOW | tests/python_package_test/test_engine.py | 3266 | def metrics_combination_cv_regression(
182 more matches not shown…
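The report does not state what makes an identifier count as hyper-verbose. As a rough illustration only, the sketch below measures a few of the flagged names by underscore-separated word count and by character length. `describe_identifier` is a hypothetical helper, not the scanner's rule.

```python
def describe_identifier(name: str) -> tuple[int, int]:
    """Return (word_count, character_count) for a snake_case identifier."""
    words = [part for part in name.split("_") if part]
    return len(words), len(name)


# Three names copied from the table above.
flagged = [
    "gen_parameter_description",
    "test_refit_dataset_params",
    "test_train_and_cv_raise_informative_error_for_train_set_of_wrong_type",
]

for name in flagged:
    word_count, char_count = describe_identifier(name)
    print(f"{word_count:2d} words, {char_count:2d} chars: {name}")
```

The shortest flagged names here have only three or four words, so a word-count threshold alone would have to be set quite low to reproduce this list.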

Over-Commented Block: 221 hits · 196 pts
Severity | File | Line | Snippet
LOW | CMakeLists.txt | 801 | if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
LOW | build_r.R | 1 | # For macOS users who have decided to use gcc
LOW | build-python.sh | 1 | #!/bin/sh
LOW | build-python.sh | 21 | # sh ./build-python.sh install --precompile
LOW | build-python.sh | 41 | # --gpu
LOW | build-cran-package.sh | 1 | #!/bin/sh
LOW | .ci/check-workflow-status.sh | 1 | #!/bin/bash
LOW | .ci/set-commit-status.sh | 1 | #!/bin/bash
LOW | .ci/append-comment.sh | 1 | #!/bin/bash
LOW | .ci/install-r-deps.R | 1 | # Install R dependencies, using only base R.
LOW | .ci/check-dynamic-dependencies.sh | 1 | #!/bin/bash
LOW | .ci/rerun-workflow.sh | 1 | #!/bin/bash
LOW | .ci/conda-envs/ci-core.txt | 1 | # [description]
LOW | .ci/conda-envs/ci-core-py310.txt | 1 | # [description]
LOW | R-package/demo/basic_walkthrough.R | 81 | # Since we do not have this file with us, the following line is just for illustration
LOW | R-package/demo/categorical_features_rules.R | 1 | # Here we are going to try training a model with categorical features
LOW | R-package/demo/categorical_features_rules.R | 21 | # $ duration : int 79 220 185 199 226 141 341 151 57 313 ...
LOW | R-package/demo/categorical_features_rules.R | 41 | # $ marital : num 1 2 1 3 3 2 2 2 1 1 ...
LOW | R-package/demo/efficient_many_training.R | 1 | # Efficient training means training without giving up too much RAM
LOW | R-package/tests/testthat/helper.R | 1 | # ref for this file:
LOW | R-package/tests/testthat/test_basic.R | 2281 | )
LOW | R-package/R/lgb.convert_with_rules.R | 61 | #'
LOW | R-package/R/lgb.convert_with_rules.R | 81 | #' new_iris <- lgb.convert_with_rules(data = iris)
LOW | R-package/R/lgb.convert_with_rules.R | 101 | #'
LOW | R-package/R/lgb.interpret.R | 1 | #' @name lgb.interpret
LOW | R-package/R/lgb.interpret.R | 21 | #' \dontshow{data.table::setDTthreads(1L)}
LOW | R-package/R/lgb.interpret.R | 41 | #' )
LOW | R-package/R/utils.R | 161 | params$metric <- as.list(unique(unlist(params$metric)))
LOW | R-package/R/utils.R | 181 | # For example, "num_iterations" can also be provided to lgb.train()
LOW | R-package/R/multithreading.R | 1 | #' @name setLGBMThreads
LOW | R-package/R/multithreading.R | 21 | #' @export
LOW | R-package/R/lgb.restore_handle.R | 1 | #' @name lgb.restore_handle
LOW | R-package/R/lgb.restore_handle.R | 21 | #' \dontshow{setLGBMthreads(2L)}
LOW | R-package/R/lgb.plot.importance.R | 1 | #' @name lgb.plot.importance
LOW | R-package/R/lgb.plot.importance.R | 21 | #' \donttest{
LOW | R-package/R/lgb.model.dt.tree.R | 1 | #' @name lgb.model.dt.tree
LOW | R-package/R/lgb.model.dt.tree.R | 21 | #' for a leaf, it simply labels it as \code{"NA"}}
LOW | R-package/R/lgb.model.dt.tree.R | 41 | #' dtrain <- lgb.Dataset(train$data, label = train$label)
LOW | R-package/R/lgb.train.R | 1 | #' @name lgb.train
LOW | R-package/R/lgb.train.R | 21 | #' train <- agaricus.train
LOW | R-package/R/lightgbm.R | 1 | #' @name lgb_shared_params
LOW | R-package/R/lightgbm.R | 21 | #' The "metric" section of the documentation}
LOW | R-package/R/lightgbm.R | 41 | #' \item{\bold{c. list}:
LOW | R-package/R/lightgbm.R | 61 | #' validation set does not improve for several consecutive iterations.
LOW | R-package/R/lightgbm.R | 81 | #' de-serialized, the underlying C++ model object gets reconstructed from these raw bytes, but will only
LOW | R-package/R/lightgbm.R | 101 | #' than \code{\link{lgb.train}}.
LOW | R-package/R/lightgbm.R | 121 | #' \code{label}).
LOW | R-package/R/lightgbm.R | 141 | #' If passing \code{NULL} (the default), will try to use the number of physical cores in the
LOW | R-package/R/lightgbm.R | 261 | what = lgb.train
LOW | R-package/R/lightgbm.R | 281 | #' https://archive.ics.uci.edu/ml/datasets/Mushroom
LOW | R-package/R/lightgbm.R | 301 | #' \item{\code{label}: the label for each record}
LOW | R-package/R/lightgbm.R | 321 | #' UCI Machine Learning Repository.
LOW | R-package/R/lgb.make_serializable.R | 1 | #' @name lgb.make_serializable
LOW | R-package/R/lgb.Dataset.R | 1 | #' @name lgb_shared_dataset_params
LOW | R-package/R/lgb.Dataset.R | 761 | #' @title Construct \code{lgb.Dataset} object
LOW | R-package/R/lgb.Dataset.R | 781 | #' @param categorical_feature categorical features. This can either be a character vector of feature
LOW | R-package/R/lgb.Dataset.R | 841 | #' a character representing a path to a text file (CSV, TSV, or LibSVM),
LOW | R-package/R/lgb.Dataset.R | 861 | #'
LOW | R-package/R/lgb.Dataset.R | 881 | #' , row.names = FALSE
LOW | R-package/R/lgb.Dataset.R | 921 |
161 more matches not shown…

Redundant / Tautological Comments: 34 hits · 58 pts
Severity | File | Line | Snippet
LOW | R-package/R/lgb.Predictor.R | 27 | # Check if model file is a booster handle already
LOW | R-package/R/lgb.Predictor.R | 74 | # Check if number of iterations is existing - if not, then set it to -1 (use all)
LOW | R-package/R/lgb.Predictor.R | 78 | # Check if start iterations is existing - if not, then set it to 0 (start from the first iteration)
LOW | R-package/R/lgb.Predictor.R | 83 | # Check if data is a file name and not a matrix
LOW | R-package/R/lgb.Predictor.R | 243 | # Check if data is a matrix
LOW | R-package/R/lgb.Predictor.R | 424 | # Check if data is a dgCMatrix (sparse matrix, column compressed format)
LOW | R-package/R/lgb.Predictor.R | 450 | # Check if number of rows is strange (not a multiple of the dataset rows)
LOW | R-package/R/lgb.model.dt.tree.R | 147 | # Check if split index is not null in leaf
LOW | R-package/R/lightgbm.R | 226 | # Set data to a temporary variable
LOW | R-package/R/lgb.Dataset.R | 162 | # Check if more categorical features were output over the feature space
LOW | R-package/R/lgb.Dataset.R | 366 | # Check if dgCMatrix (sparse matrix column compressed)
LOW | R-package/R/lgb.Dataset.R | 417 | # Check if dgCMatrix (sparse matrix column compressed)
LOW | R-package/R/lgb.Dataset.R | 466 | # Check if attribute key is in the known attribute list
LOW | R-package/R/lgb.Dataset.R | 517 | # Check if attribute key is in the known attribute list
LOW | R-package/R/lgb.Dataset.R | 1026 | # Check if invalid element list
LOW | R-package/R/lgb.Dataset.R | 1134 | # Check if dataset is not a dataset
LOW | R-package/R/lgb.importance.R | 71 | # Check if relative values are requested
LOW | R-package/R/lgb.Booster.R | 840 | # Check if there are evaluation metrics
LOW | R-package/R/lgb.Booster.R | 843 | # Check if evaluation metric is a function
LOW | R-package/R/lgb.Booster.R | 850 | # Check if data to assess is existing differently
LOW | R-package/R/lgb.Booster.R | 205 | # Check if objective is empty
LOW | R-package/R/lgb.Booster.R | 403 | # Check if evaluation was not done
LOW | R-package/R/lgb.Booster.R | 745 | # Check if current iteration was already predicted
LOW | R-package/R/lgb.Booster.R | 1544 | # Check if evaluation result is existing
LOW | R-package/R/lgb.Booster.R | 1560 | # Check if error is requested
LOW | R-package/R/callback.R | 79 | # Check if period is at least 1 or more
LOW | R-package/R/callback.R | 85 | # Check if iteration matches moduo
LOW | R-package/R/callback.R | 91 | # Check if message is existing
LOW | R-package/R/callback.R | 135 | # Check if evaluation record exists
LOW | R-package/R/callback.R | 206 | # Check if verbose or not
LOW | R-package/R/callback.R | 264 | # Check if score is better
LOW | R-package/R/callback.R | 278 | # Check if early stopping is required
LOW | R-package/src/install.libs.R | 175 | # Check if Windows installation (for gcc vs Visual Studio)
LOW | python-package/lightgbm/basic.py | 3132 | # Check if the weight contains values other than one
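Most snippets listed in this category open with the same phrase, "Check if". The sketch below tallies the leading two words of a handful of snippets copied from the table. It is only a way to see the pattern; it makes no claim about how the scanner detects these comments, and `leading_phrase` is a hypothetical helper.

```python
from collections import Counter

# A sample of comment snippets copied from the table above.
snippets = [
    "# Check if model file is a booster handle already",
    "# Check if data is a matrix",
    "# Set data to a temporary variable",
    "# Check if score is better",
    "# Check if the weight contains values other than one",
]


def leading_phrase(comment: str, n_words: int = 2) -> str:
    """Return the first n_words of a comment, without the comment marker."""
    text = comment.lstrip("#/ ").strip()
    return " ".join(text.split()[:n_words])


phrase_counts = Counter(leading_phrase(s) for s in snippets)

# prints: [('Check if', 4), ('Set data', 1)]
print(phrase_counts.most_common())
```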

Unused Imports: 45 hits · 42 pts
Severity | File | Line | Snippet
LOW | python-package/lightgbm/compat.py | 171 |
LOW | python-package/lightgbm/compat.py | 172 |
LOW | python-package/lightgbm/compat.py | 205 |
LOW | python-package/lightgbm/compat.py | 213 |
LOW | python-package/lightgbm/compat.py | 222 |
LOW | python-package/lightgbm/compat.py | 225 |
LOW | python-package/lightgbm/compat.py | 226 |
LOW | python-package/lightgbm/compat.py | 227 |
LOW | python-package/lightgbm/compat.py | 227 |
LOW | python-package/lightgbm/compat.py | 282 |
LOW | python-package/lightgbm/compat.py | 283 |
LOW | python-package/lightgbm/compat.py | 284 |
LOW | python-package/lightgbm/compat.py | 285 |
LOW | python-package/lightgbm/compat.py | 329 |
LOW | python-package/lightgbm/compat.py | 176 |
LOW | python-package/lightgbm/compat.py | 178 |
LOW | python-package/lightgbm/__init__.py | 11 |
LOW | python-package/lightgbm/__init__.py | 11 |
LOW | python-package/lightgbm/__init__.py | 11 |
LOW | python-package/lightgbm/__init__.py | 11 |
LOW | python-package/lightgbm/__init__.py | 12 |
LOW | python-package/lightgbm/__init__.py | 12 |
LOW | python-package/lightgbm/__init__.py | 12 |
LOW | python-package/lightgbm/__init__.py | 12 |
LOW | python-package/lightgbm/__init__.py | 12 |
LOW | python-package/lightgbm/__init__.py | 13 |
LOW | python-package/lightgbm/__init__.py | 13 |
LOW | python-package/lightgbm/__init__.py | 13 |
LOW | python-package/lightgbm/__init__.py | 16 |
LOW | python-package/lightgbm/__init__.py | 16 |
LOW | python-package/lightgbm/__init__.py | 16 |
LOW | python-package/lightgbm/__init__.py | 16 |
LOW | python-package/lightgbm/__init__.py | 20 |
LOW | python-package/lightgbm/__init__.py | 20 |
LOW | python-package/lightgbm/__init__.py | 20 |
LOW | python-package/lightgbm/__init__.py | 20 |
LOW | python-package/lightgbm/__init__.py | 20 |
LOW | python-package/lightgbm/__init__.py | 24 |
LOW | python-package/lightgbm/__init__.py | 24 |
LOW | python-package/lightgbm/__init__.py | 24 |
LOW | python-package/lightgbm/basic.py | 49 |
LOW | python-package/lightgbm/basic.py | 53 |
LOW | python-package/lightgbm/basic.py | 55 |
LOW | python-package/lightgbm/sklearn.py | 54 |
LOW | docs/conf.py | 111 |
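The unused-import hits carry no snippet, only a file and line. For orientation, here is a minimal sketch of one common way to detect such imports with Python's standard `ast` module. This is an assumed approach, not the scanner's implementation, and `find_unused_imports` is a hypothetical name.

```python
import ast


def find_unused_imports(source: str) -> list[tuple[str, int]]:
    """Return (name, line) for imported names never referenced in source."""
    tree = ast.parse(source)
    imported: dict[str, int] = {}
    used: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted((n, line) for n, line in imported.items() if n not in used)


example = """\
import os
import json
from math import sqrt, floor

print(os.getcwd(), sqrt(4))
"""

# prints: [('floor', 3), ('json', 2)]
print(find_unused_imports(example))
```

A check this simple also flags names that a module imports only to re-export, which is a common pattern in package `__init__.py` files, so hits of this kind usually need a manual look before they are treated as dead code.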

Deep Nesting: 43 hits · 35 pts
Severity | File | Line | Snippet
LOW | .ci/parameter-generator.py | 16 |
LOW | .ci/parameter-generator.py | 109 |
LOW | .ci/parameter-generator.py | 199 |
LOW | .ci/parameter-generator.py | 263 |
LOW | tests/python_package_test/test_engine.py | 136 |
LOW | tests/python_package_test/test_engine.py | 686 |
LOW | tests/python_package_test/test_engine.py | 688 |
LOW | tests/python_package_test/test_basic.py | 712 |
LOW | tests/python_package_test/test_consistency.py | 13 |
LOW | tests/python_package_test/test_consistency.py | 49 |
LOW | tests/python_package_test/utils.py | 161 |
LOW | tests/python_package_test/utils.py | 174 |
LOW | tests/python_package_test/test_dask.py | 150 |
LOW | tests/python_package_test/test_dask.py | 325 |
LOW | tests/python_package_test/test_dask.py | 817 |
LOW | tests/python_package_test/test_sklearn.py | 72 |
LOW | tests/python_package_test/test_sklearn.py | 1746 |
LOW | tests/python_package_test/test_sklearn.py | 2081 |
LOW | python-package/lightgbm/callback.py | 326 |
LOW | python-package/lightgbm/callback.py | 405 |
LOW | python-package/lightgbm/plotting.py | 458 |
LOW | python-package/lightgbm/plotting.py | 480 |
LOW | python-package/lightgbm/engine.py | 109 |
LOW | python-package/lightgbm/engine.py | 522 |
LOW | python-package/lightgbm/dask.py | 187 |
LOW | python-package/lightgbm/dask.py | 433 |
LOW | python-package/lightgbm/dask.py | 911 |
LOW | python-package/lightgbm/basic.py | 373 |
LOW | python-package/lightgbm/basic.py | 552 |
LOW | python-package/lightgbm/basic.py | 889 |
LOW | python-package/lightgbm/basic.py | 1102 |
LOW | python-package/lightgbm/basic.py | 2079 |
LOW | python-package/lightgbm/basic.py | 2122 |
LOW | python-package/lightgbm/basic.py | 2546 |
LOW | python-package/lightgbm/basic.py | 2755 |
LOW | python-package/lightgbm/basic.py | 2953 |
LOW | python-package/lightgbm/basic.py | 3309 |
LOW | python-package/lightgbm/basic.py | 3479 |
LOW | python-package/lightgbm/basic.py | 3602 |
LOW | python-package/lightgbm/basic.py | 5222 |
LOW | python-package/lightgbm/sklearn.py | 867 |
LOW | python-package/lightgbm/sklearn.py | 973 |
LOW | python-package/lightgbm/sklearn.py | 1591 |
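The report gives no depth threshold or measurement method for this category. As a sketch of how nesting depth could be measured for the Python files listed, the example below walks the syntax tree with the standard `ast` module. Which statement types count as a level is an assumption of this sketch, and `max_nesting_depth` is a hypothetical name.

```python
import ast

# Statement types counted as one nesting level in this sketch (an assumption).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested control-flow statements under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


example = """\
def f(rows):
    for row in rows:
        if row:
            for cell in row:
                if cell > 0:
                    print(cell)
"""

# prints: 4
print(max_nesting_depth(ast.parse(example)))
```

Python represents an `elif` branch as an `If` node nested inside the previous one, so this sketch over-counts long `elif` chains; a more careful version would treat them as a single level.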

Cross-Language Confusion: 3 hits · 22 pts
Severity | File | Line | Snippet
HIGH | docs/conf.py | 274 | sh build-cran-package.sh || exit 1
HIGH | docs/conf.py | 275 | R CMD INSTALL --with-keep.source lightgbm_*.tar.gz || exit 1
HIGH | docs/conf.py | 276 | Rscript .ci/build-docs.R || exit 1

Self-Referential Comments: 7 hits · 21 pts
Severity | File | Line | Snippet
MEDIUM | R-package/demo/categorical_features_rules.R | 69 | # Creating the LightGBM dataset with categorical features
MEDIUM | R-package/R/lgb.train.R | 165 | # Create the predictor set
MEDIUM | R-package/R/lgb.cv.R | 186 | # Create the predictor set
MEDIUM | R-package/R/lgb.cv.R | 541 | # Create a vector of integers from 1:k as many times as possible without
MEDIUM | R-package/inst/make-r-def.R | 2 | # Create a definition file (.def) from a .dll file, using objdump.
MEDIUM | python-package/lightgbm/basic.py | 3954 | # Create the node record, and populate universal data members
MEDIUM | docs/conf.py | 7 | # This file is execfile()d with the current directory set to its

AI Slop Vocabulary: 8 hits · 16 pts
Severity | File | Line | Snippet
LOW | build-python.sh | 310 | # avoid trying to recompile, just use hatchling and copy in relevant files
LOW | R-package/demo/basic_walkthrough.R | 140 | # To load it in, simply call lgb.Dataset
MEDIUM | tests/python_package_test/test_sklearn.py | 1418 | # Verify that eval_metric is robust to receiving a list with None
MEDIUM | tests/python_package_test/test_sklearn.py | 367 | y = y.astype(str) # utilize label encoder at it's max power
MEDIUM | tests/python_package_test/test_sklearn.py | 395 | y = y.astype(str) # utilize label encoder at it's max power
MEDIUM | tests/python_package_test/test_sklearn.py | 425 | y = y.astype(str) # utilize label encoder at it's max power
LOW | python-package/lightgbm/basic.py | 2823 | # If the data is a arrow data, we can just pass it to C
MEDIUM | src/io/dataset_loader.cpp | 54 | // support to get header from parser config, so could utilize following label name to id mapping logic.

Cross-File Repetition: 3 hits · 15 pts
Severity | File | Line | Snippet
HIGH | python-package/lightgbm/sklearn.py | 0 | docstring is set after definition, using a template.
HIGH | python-package/lightgbm/sklearn.py | 0 | docstring is set after definition, using a template.
HIGH | python-package/lightgbm/sklearn.py | 0 | docstring is set after definition, using a template.

Slop Phrases: 3 hits · 6 pts
Severity | File | Line | Snippet
LOW | R-package/R/lgb.Booster.R | 1122 | #' different parameters or prediction type, so make sure to check that the output is what
LOW | include/LightGBM/config.h | 1098 | // desc = **Note**: don't forget to allow this port in firewall settings before training
MEDIUM | src/treelearner/parallel_tree_learner.h | 124 | * When #data is large and #feature is large, you can use this to have better speed-up
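
The page does not explain how the raw score is derived from the findings. One observation, offered as an assumption rather than the tool's documented formula: the category points divided by the repository size in thousands of lines lands close to the raw score shown in the summary. The sketch below works through that arithmetic using only numbers from this report.

```python
# Points per category, copied from the category headers in this report.
category_points = [200, 196, 58, 42, 35, 22, 21, 16, 15, 6]
lines_of_code = 111_619  # "Lines of Code" in the summary
time_factor = 1.00       # "Time Factor" of 100% in the summary

total_points = sum(category_points)

# Assumed formula (not stated on the page): points per 1,000 lines of code.
points_per_kloc = total_points / (lines_of_code / 1000)
adjusted = points_per_kloc * time_factor

# prints: 611 5.47 5.5
print(total_points, round(points_per_kloc, 2), round(adjusted, 1))
```

Under this reading, 611 points over roughly 111.6 thousand lines rounds to the 5.5 raw score, and a 100% time factor leaves the adjusted score unchanged. The match may be coincidental.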