dmlc/xgboost

Adjusted Score: 6.4
Raw Score: 6.4
Time Factor: 100%
Last Push: 2026-07-14
Stars: 28.6K
Language: C++
Lines of Code: 144.7K
Files: 761
Pattern Hits: 824
Scan Date: 2026-07-14
HC Hit Rate: 0.01

What These Metrics Mean

Adjusted Score: Primary synthetic code indicator. Raw score normalised per 1,000 lines of code and multiplied by the temporal discount factor. This is the definitive comparative metric — use it to rank repositories by AI authorship density.
Raw Score: The unmodified sum of all severity-weighted, context-multiplied pattern match scores before temporal discounting. Reflects the absolute signal strength independent of when the repository was last active.
Time Factor: The temporal discount multiplier (0–100%) applied to the raw score. Repositories last updated before ChatGPT's launch (Nov 2022) receive a 5% factor. Full signal is only assigned to repositories active in the post-adoption era (Jan 2024+).
Pattern Hits: Total count of individual pattern matches across all files and categories. A high hit count with a low score may indicate a very large codebase with isolated AI snippets; a low count with a high score indicates dense, concentrated AI signatures.
HC Hit Rate: High+Critical pattern hits per file, averaged across the repository. This orthogonal signal catches repositories where a few files are densely packed with high-severity AI tells — a strong indicator even when the normalised score appears moderate due to codebase size.
Lines of Code / Files: Total lines and files analysed. The scanner examines 94 file extensions. These denominators are used to normalise the score, enabling fair comparison between repositories of vastly different sizes.
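
As a rough check on how these definitions fit together, the sketch below recomputes two of the headline figures from numbers reported on this page. The function names are hypothetical, and the formulas are only one reading of the definitions above, not the scanner's actual code. The 925-point total is the sum of the per-category point totals listed under Pattern Findings, and the High and Critical counts come from the Severity Breakdown.

```python
def adjusted_score(total_points: float, lines_of_code: int, time_factor: float) -> float:
    """Points per 1,000 lines of code, scaled by the temporal discount (0.0 to 1.0)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return total_points / (lines_of_code / 1000) * time_factor


def hc_hit_rate(high_hits: int, critical_hits: int, files: int) -> float:
    """High + Critical pattern hits per analysed file."""
    if files <= 0:
        raise ValueError("files must be positive")
    return (high_hits + critical_hits) / files


# Figures taken from this report for dmlc/xgboost (144.7K lines is itself rounded).
score = adjusted_score(total_points=925, lines_of_code=144_700, time_factor=1.0)
rate = hc_hit_rate(high_hits=6, critical_hits=4, files=761)
print(f"{score:.1f}")  # 6.4
print(f"{rate:.2f}")   # 0.01
```

Under this reading, the displayed Raw Score of 6.4 appears to be expressed per 1,000 lines already, which would explain why it equals the Adjusted Score when the Time Factor is 100%.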

Score History

This chart maps the temporal evolution of the adjusted synthetic code score across successive scan runs. An upward trajectory indicates ongoing incorporation of AI-generated code or expanding LLM-assisted scaffolding; a stable or declining trajectory may reflect active human refactoring, code removal, or the adoption of stricter authorship policies. The dashed secondary line (right axis) independently tracks total raw pattern hit count, which can diverge from the normalised score when codebase size changes significantly between scans.

Severity Breakdown

Classifies detected patterns by their diagnostic confidence and structural impact. CRITICAL patterns (coefficient 10) represent definitive synthetic signatures — hallucinated imports, explicit LLM attribution metadata — virtually never produced by human authors. HIGH (5) indicates strong structural tells such as cross-file repetition or cross-linguistic idioms. MEDIUM (2) covers recognisable conversational padding and AI-specific vocabulary. LOW (1) captures subtle indicators like tautological comments and generic boilerplate that require density to carry independent signal.

CRITICAL 4 · HIGH 6 · MEDIUM 17 · LOW 797
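
The coefficients described above can be applied to the counts in this breakdown to get a base weighted total, before any context or density multipliers. This is a minimal sketch: the dictionary names are illustrative, and the scanner's internal representation is not shown in this report.

```python
# Coefficients as stated in the Severity Breakdown description.
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}

# Hit counts reported for dmlc/xgboost.
hit_counts = {"CRITICAL": 4, "HIGH": 6, "MEDIUM": 17, "LOW": 797}

total_hits = sum(hit_counts.values())
base_points = sum(SEVERITY_COEFFICIENT[sev] * n for sev, n in hit_counts.items())

print(total_hits)   # 824
print(base_points)  # 901
```

The 824 hits match the Pattern Hits figure. The 901 base points sit below the 925 points summed across the categories in Pattern Findings, which is consistent with some matches receiving additional multipliers.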

Directory Score Breakdown

This horizontal bar chart decomposes the repository's raw synthetic code score by top-level directory, allowing you to pinpoint precisely which modules or components carry the highest AI authorship density. Directories with disproportionately high scores relative to their size warrant targeted manual review: concentrated AI signatures often trace back to mass-generated configuration layers, auto-ported test suites, LLM-scaffolded boilerplate classes, or entire subsystems authored under heavy copilot assistance. Use this view to prioritise your human code-review effort.
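
The grouping behind such a chart can be sketched as a sum of per-finding points keyed by the first path component. The function name and the input shape (path, points) are assumptions made for illustration. The sample rows are the six Self-Referential Comments findings from this report, each valued at 3.0 points on the assumption that a MEDIUM match (2) in a comment (×1.5) scores 2 × 1.5.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def score_by_top_level_dir(findings):
    """Sum points per top-level directory; files at the repository root go under '.'."""
    totals = defaultdict(float)
    for path, points in findings:
        parts = PurePosixPath(path).parts
        top = parts[0] if len(parts) > 1 else "."
        totals[top] += points
    # Highest-scoring directories first, as in a ranked bar chart.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# (path, points) pairs; the 3.0 per-row value is an assumed score, not scanner output.
findings = [
    ("demo/guide-python/cat_pipeline.py", 3.0),
    ("R-package/R/utils.R", 3.0),
    ("R-package/inst/make-r-def.R", 3.0),
    ("tests/python-gpu/test_gpu_prediction.py", 3.0),
    ("python-package/xgboost/dask/data.py", 3.0),
    ("python-package/xgboost/testing/data.py", 3.0),
]

totals = score_by_top_level_dir(findings)
for directory, points in totals.items():
    print(f"{directory}: {points}")
```

A real breakdown would also need each directory's size to judge whether a score is disproportionate, which this sketch leaves out.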

Pattern Findings

The scanner identified 824 distinct pattern matches across 18 syntactic categories. Each entry below represents a discrete location in the source code where the engine recorded a statistically significant AI authorship indicator. Expand any category row to inspect the individual file paths, line numbers, code snippets, and the lexical context (CODE, COMMENT, or STRING) in which each match was detected.

Reading the findings table: The Severity column indicates the diagnostic confidence level (CRITICAL / HIGH / MEDIUM / LOW). The Context column identifies whether the match occurred inside executable code, an inline comment, or a string literal — comment-context matches receive a ×1.5 weight because LLMs systematically over-annotate. The ⚡ bolt icon marks clustered matches: three or more patterns within a 10-line window, each receiving an additional ×1.5 density multiplier as dense clusters constitute far stronger evidence of synthetic authorship than isolated hits.
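
The weighting and clustering rules above can be sketched as follows. This is an illustration under stated assumptions, not the scanner's implementation: the function names are hypothetical, only the comment weight (×1.5) is given in the text so CODE and STRING are assumed to be ×1.0, and a "10-line window" is interpreted as line numbers that differ by less than 10.

```python
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
# Only the COMMENT weight is stated in the report; 1.0 for CODE and STRING is assumed.
CONTEXT_WEIGHT = {"CODE": 1.0, "COMMENT": 1.5, "STRING": 1.0}
CLUSTER_MULTIPLIER = 1.5


def match_points(severity: str, context: str, clustered: bool = False) -> float:
    """Severity coefficient x context weight, x1.5 again if the match is clustered."""
    points = SEVERITY_COEFFICIENT[severity] * CONTEXT_WEIGHT[context]
    return points * CLUSTER_MULTIPLIER if clustered else points


def clustered_lines(lines, window=10, min_hits=3):
    """Return line numbers sharing a `window`-line span with at least `min_hits` matches."""
    ordered = sorted(lines)
    flagged = set()
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until it spans fewer than `window` lines.
        while ordered[end] - ordered[start] >= window:
            start += 1
        if end - start + 1 >= min_hits:
            flagged.update(ordered[start:end + 1])
    return flagged


# Hallucination Indicators rows: three CRITICAL/CODE and one CRITICAL/COMMENT.
hallucination = [("CRITICAL", "CODE")] * 3 + [("CRITICAL", "COMMENT")]
print(sum(match_points(sev, ctx) for sev, ctx in hallucination))  # 45.0

# Line numbers listed for tests/python/test_interaction_constraints.py.
print(sorted(clustered_lines([10, 13, 16, 19, 22, 59])))  # [10, 13, 16, 19, 22]
```

The first result agrees with the 45 pts shown for Hallucination Indicators, and the second agrees with the rows marked ⚡ for that test file. The scanner may count matches from several categories when forming clusters and may apply rounding or further rules not described here, so other totals need not reproduce exactly.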

Hyper-Verbose Identifiers: 285 hits · 308 pts

Severity	File	Line	Snippet	Context
LOW	demo/dask/gpu_training.py	42	def using_quantile_device_dmatrix(client: Client, X: da.Array, y: da.Array) -> da.Array:	CODE
LOW	demo/dask/dask_callbacks.py	18	def probability_for_going_backward(epoch: int) -> float:	CODE
LOW	demo/nvflare/horizontal/custom/controller.py	48	def process_result_of_unknown_task(self, client: Client, task_name: str,	CODE
LOW	demo/nvflare/vertical/custom/controller.py	48	def process_result_of_unknown_task(self, client: Client, task_name: str,	CODE
LOW	demo/guide-python/continuation.py	50	def training_continuation_early_stop(tmpdir: str, use_pickle: bool) -> None:	CODE
LOW	demo/guide-python/prediction_intervals.py	138	def plot_prediction_intervals(	CODE
LOW	tests/python/test_ordinal.py	72	def test_training_continuation() -> None:	CODE
LOW	tests/python/test_ordinal.py	80	def test_recode_dmatrix_predict() -> None:	CODE
LOW	tests/python/test_training_continuation.py	37	def run_training_continuation(	CODE
LOW	tests/python/test_training_continuation.py	134	def test_training_continuation_json(self, tmp_path: Path) -> None:	CODE
LOW	tests/python/test_training_continuation.py	139	def test_training_continuation_updaters_json(self, tmp_path: Path) -> None:	CODE
LOW	tests/python/test_training_continuation.py	174	def test_continuation_determinism(kwargs: Any) -> None:	CODE
LOW	tests/python/test_basic.py	191	def test_dmatrix_numpy_init_omp(self):	CODE
LOW	tests/python/test_basic.py	227	def test_cv_explicit_fold_indices(self):	CODE
LOW	tests/python/test_basic.py	241	def test_cv_explicit_fold_indices_labels(self):	CODE
LOW	tests/python/test_early_stopping.py	15	def test_early_stopping_nonparallel(self):	CODE
LOW	tests/python/test_early_stopping.py	122	def test_cv_early_stopping_with_multiple_eval_sets_and_metrics(self):	CODE
LOW	tests/python/test_with_arrow.py	39	def test_arrow_table_with_label(self):	CODE
LOW	tests/python/test_data_iterator.py	337	def test_categorical_extmem_qdm(	CODE
LOW⚡	tests/python/test_interaction_constraints.py	10	def test_exact_interaction_constraints(self) -> None:	CODE
LOW⚡	tests/python/test_interaction_constraints.py	13	def test_hist_interaction_constraints(self) -> None:	CODE
LOW⚡	tests/python/test_interaction_constraints.py	16	def test_approx_interaction_constraints(self) -> None:	CODE
LOW⚡	tests/python/test_interaction_constraints.py	19	def test_hist_multi_interaction_constraints(self) -> None:	CODE
LOW⚡	tests/python/test_interaction_constraints.py	22	def test_interaction_constraints_feature_names(self) -> None:	CODE
LOW	tests/python/test_interaction_constraints.py	59	def test_hist_training_accuracy(self, tree_method: str) -> None:	CODE
LOW	tests/python/test_model_io.py	104	def test_categorical_model_io(self, tmp_path: Path) -> None:	CODE
LOW	tests/python/test_model_io.py	341	def test_with_sklearn_obj_metric(tmp_path: Path) -> None:	CODE
LOW	tests/python/generate_models.py	37	def generate_regression_model() -> None:	CODE
LOW	tests/python/generate_models.py	107	def generate_classification_model() -> None:	CODE
LOW	tests/python/generate_models.py	178	def generate_aft_survival_models() -> None:	CODE
LOW	tests/python/test_callback.py	124	def test_early_stopping_custom_eval(self, breast_cancer: BreastCancer) -> None:	CODE
LOW	tests/python/test_callback.py	144	def test_early_stopping_customize(self, breast_cancer: BreastCancer) -> None:	CODE
LOW	tests/python/test_callback.py	206	def test_early_stopping_custom_eval_skl(self, breast_cancer: BreastCancer) -> None:	CODE
LOW	tests/python/test_callback.py	218	def test_early_stopping_save_best_model(self, breast_cancer: BreastCancer) -> None:	CODE
LOW	tests/python/test_callback.py	258	def test_early_stopping_continuation(	CODE
LOW	tests/python/test_callback.py	292	def test_early_stopping_multiple_metrics(self):	CODE
LOW	tests/python/test_callback.py	314	def test_eta_decay_leaf_output(self, tree_method: str, objective: str) -> None:	CODE
LOW	tests/python/test_basic_models.py	104	def test_boost_from_prediction(self):	CODE
LOW	tests/python/test_basic_models.py	119	def test_boost_from_existing_model(self) -> None:	CODE
LOW	tests/python/test_basic_models.py	227	def test_feature_names_validation(self):	CODE
LOW	tests/python/test_basic_models.py	243	def test_special_model_dump_characters(self) -> None:	CODE
LOW	tests/python/test_survival.py	58	def test_aft_survival_toy_data(	CODE
LOW	tests/python/test_survival.py	133	def test_aft_survival_demo_data():	CODE
LOW	tests/python/test_dmatrix.py	413	def verify_numpy_feature_names():	CODE
LOW	tests/python/test_predict.py	74	def test_base_margin_vs_base_score() -> None:	CODE
LOW	tests/python/test_ranking.py	93	def test_ranking_with_unweighted_data():	CODE
LOW	tests/python/test_ranking.py	127	def test_ranking_with_weighted_data():	CODE
LOW	tests/python/test_ranking.py	184	def test_lambdarank_parameters(params):	CODE
LOW	tests/python/test_multi_target.py	46	def test_shap_multi_output_tree() -> None:	CODE
LOW	tests/python/test_multi_target.py	196	def test_feature_importance_strategy_compare() -> None:	CODE
LOW	tests/python/test_multi_target.py	210	def test_gradient_based_sampling_accuracy() -> None:	CODE
LOW	tests/python/test_interpret.py	7	def test_shap_values_matches_predict() -> None:	CODE
LOW	tests/python/test_interpret.py	20	def test_shap_values_accepts_sklearn_model() -> None:	CODE
LOW	tests/python/test_interpret.py	34	def test_shap_values_uses_sklearn_iteration_range() -> None:	CODE
LOW⚡	tests/python/test_interpret.py	51	def test_shap_values_rejects_background_data() -> None:	CODE
LOW⚡	tests/python/test_interpret.py	61	def test_shap_values_validates_get_booster() -> None:	CODE
LOW⚡	tests/python/test_interpret.py	69	def test_shap_values_uses_missing_for_array_like_data() -> None:	CODE
LOW	tests/python/test_interpret.py	83	def test_shap_values_rejects_missing_with_dmatrix() -> None:	CODE
LOW	tests/python/test_config.py	10	def test_global_config_verbosity(verbosity_level):	CODE
LOW	tests/python/test_config.py	23	def test_global_config_use_rmm(use_rmm):	CODE
225 more matches not shown…

Over-Commented Block: 320 hits · 300 pts

Severity	File	Line	Snippet	Context
LOW	demo/kaggle-higgs/speedtest.R	41	nrounds <- 120	COMMENT
LOW	demo/kaggle-higgs/speedtest.R	61	# 111.682 0.777 35.963	COMMENT
LOW	demo/guide-python/generalized_linear_model.py	21	# alpha is the L1 regularizer	COMMENT
LOW	R-package/R/xgb.DMatrix.save.R	1	#' Save xgb.DMatrix object to binary file	COMMENT
LOW	R-package/R/xgb.plot.multi.trees.R	1	#' Project all trees on one tree	COMMENT
LOW	R-package/R/xgb.plot.multi.trees.R	21	#'	COMMENT
LOW	R-package/R/xgb.plot.multi.trees.R	41	#' nrounds = 30,	COMMENT
LOW	R-package/R/xgb.plot.deepness.R	1	#' Plot model tree depth	COMMENT
LOW	R-package/R/xgb.plot.deepness.R	21	#' Those could be helpful in determining sensible ranges of the `max_depth`	COMMENT
LOW	R-package/R/xgb.plot.deepness.R	41	#'	COMMENT
LOW	R-package/R/xgb.plot.deepness.R	61	#' xgb.plot.deepness(model)	COMMENT
LOW	R-package/R/utils.R	381		COMMENT
LOW	R-package/R/utils.R	421	#' for objects produced by [xgboost()]), outside of its core components, might also keep:	COMMENT
LOW	R-package/R/utils.R	441	#' not used for prediction / importance / plotting / etc.	COMMENT
LOW	R-package/R/utils.R	461	#' preferred for long-term storage.	COMMENT
LOW	R-package/R/utils.R	481	#'	COMMENT
LOW	R-package/R/utils.R	501	#' xgb.save(bst, fname)	COMMENT
LOW	R-package/R/utils.R	521	#' obj2 <- readRDS(fname)	COMMENT
LOW	R-package/R/utils.R	541	#' that code calling xgboost will still work once those are removed in future releases.	COMMENT
LOW	R-package/R/xgb.plot.importance.R	1	#' Plot feature importance	COMMENT
LOW	R-package/R/xgb.plot.importance.R	21	#'	COMMENT
LOW	R-package/R/xgb.plot.importance.R	41	#' - `xgb.ggplot.importance()`: A customizable "ggplot" object.	COMMENT
LOW	R-package/R/xgb.plot.importance.R	61	#' xgb.plot.importance(	COMMENT
LOW	R-package/R/xgb.plot.importance.R	141	# Avoid error messages during CRAN check.	COMMENT
LOW	R-package/R/xgb.save.raw.R	1	#' Save XGBoost model to R's raw vector	COMMENT
LOW	R-package/R/xgb.save.raw.R	21	#' train <- agaricus.train	COMMENT
LOW	R-package/R/xgb.plot.shap.R	1	#' SHAP dependence plots	COMMENT
LOW	R-package/R/xgb.plot.shap.R	21	#' The default (`NULL`) will use up to 100k data points.	COMMENT
LOW	R-package/R/xgb.plot.shap.R	41	#' If `FALSE`, only a list of matrices is returned.	COMMENT
LOW	R-package/R/xgb.plot.shap.R	61	#' a meaningful thing to do.	COMMENT
LOW	R-package/R/xgb.plot.shap.R	81	#' data.table::setDTthreads(nthread)	COMMENT
LOW	R-package/R/xgb.plot.shap.R	101	#'	COMMENT
LOW	R-package/R/xgb.plot.shap.R	121	#' model = model_multiclass,	COMMENT
LOW	R-package/R/xgb.plot.shap.R	141	#' )	COMMENT
LOW	R-package/R/xgb.plot.shap.R	241	#' Visualizes SHAP contributions of different features.	COMMENT
LOW	R-package/R/xgb.plot.shap.R	261	#' and the Python library <https://github.com/shap/shap>.	COMMENT
LOW	R-package/R/xgboost.R	821	if (is.na(early_stopping_rounds) || early_stopping_rounds <= 0L) {	COMMENT
LOW	R-package/R/xgboost.R	841	#'	COMMENT
LOW	R-package/R/xgboost.R	861	#' For package authors using 'xgboost' as a dependency, it is highly recommended to use	COMMENT
LOW	R-package/R/xgboost.R	881	#' Note that categorical features are only supported for `data.frame` inputs, and are automatically	COMMENT
LOW	R-package/R/xgboost.R	901	#' set as the last level.	COMMENT
LOW	R-package/R/xgboost.R	921	#' prediction type (e.g. `multi:softmax` vs. `multi:softprob`) are not allowed, and neither are	COMMENT
LOW	R-package/R/xgboost.R	941	#' - `"reg:gamma"`: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., fo	COMMENT
LOW	R-package/R/xgboost.R	961	#' 2 (info), and 3 (debug).	COMMENT
LOW	R-package/R/xgboost.R	981	#' @param early_stopping_rounds Number of boosting rounds after which training will be stopped	COMMENT
LOW	R-package/R/xgboost.R	1001	#' A value of `+1` for a given feature makes the model predictions / scores constrained to be	COMMENT
LOW	R-package/R/xgboost.R	1021	#' for more information.	COMMENT
LOW	R-package/R/xgboost.R	1041	#' Note that, if it contains more than one column, then columns will not be matched by name to	COMMENT
LOW	R-package/R/xgboost.R	1061	#' - For linear booster:	COMMENT
LOW	R-package/R/xgboost.R	1081	#' @examples	COMMENT
LOW	R-package/R/xgboost.R	1101	#'	COMMENT
LOW	R-package/R/xgboost.R	1241	#'	COMMENT
LOW	R-package/R/xgboost.R	1261	#' probabilities of belonging to the last class in the case of binary classification). Result will	COMMENT
LOW	R-package/R/xgboost.R	1281	#' Output will be a numeric matrix with shape `[nrows, nfeatures+1]`, with the intercept being the	COMMENT
LOW	R-package/R/xgboost.R	1301	#' @param iteration_range Sequence of rounds/iterations from the model to use for prediction, specified by passing	COMMENT
LOW	R-package/R/xgboost.R	1321	#'	COMMENT
LOW	R-package/R/xgboost.R	1541	}	COMMENT
LOW	R-package/R/xgboost.R	1561	#' School of Information and Computer Science.	COMMENT
LOW	R-package/R/xgboost.R	1581	#' <https://archive.ics.uci.edu/ml/datasets/Mushroom>	COMMENT
LOW	R-package/R/xgboost.R	1601	#' @importFrom data.table :=	COMMENT
260 more matches not shown…

Unused Imports: 67 hits · 64 pts

Severity	File	Line	Context
LOW	demo/multiclass_classification/train.py	3	CODE
LOW	demo/dask/dask_learning_to_rank.py	16	CODE
LOW	demo/nvflare/vertical/custom/trainer.py	10	CODE
LOW	demo/guide-python/cat_in_the_dat.py	24	CODE
LOW	demo/guide-python/learning_to_rank.py	19	CODE
LOW	tests/test_distributed/test_with_spark/test_spark.py	6	CODE
LOW	tests/test_distributed/test_gpu_with_dask/conftest.py	21	CODE
LOW	…t_distributed/test_gpu_with_dask/test_gpu_with_dask.py	15	CODE
LOW	tests/test_distributed/test_with_dask/conftest.py	3	CODE
LOW	tests/test_distributed/test_with_dask/test_with_dask.py	1978	CODE
LOW	tests/test_distributed/test_with_dask/test_with_dask.py	2069	CODE
LOW	tests/python-sycl/test_sycl_with_sklearn.py	3	CODE
LOW	tests/python-sycl/test_sycl_prediction.py	1	CODE
LOW	tests/python-sycl/test_sycl_prediction.py	7	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	1	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	2	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	3	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	5	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	7	CODE
LOW	python-package/xgboost/objective.py	28	CODE
LOW	python-package/xgboost/compat.py	39	CODE
LOW	python-package/xgboost/compat.py	40	CODE
LOW	python-package/xgboost/compat.py	41	CODE
LOW	python-package/xgboost/compat.py	81	CODE
LOW	python-package/xgboost/_typing.py	34	CODE
LOW	python-package/xgboost/__init__.py	6	CODE
LOW	python-package/xgboost/__init__.py	6	CODE
LOW	python-package/xgboost/__init__.py	6	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	12	CODE
LOW	python-package/xgboost/__init__.py	20	CODE
LOW	python-package/xgboost/__init__.py	21	CODE
LOW	python-package/xgboost/__init__.py	21	CODE
LOW	python-package/xgboost/__init__.py	24	CODE
LOW	python-package/xgboost/__init__.py	24	CODE
LOW	python-package/xgboost/__init__.py	24	CODE
LOW	python-package/xgboost/__init__.py	25	CODE
LOW	python-package/xgboost/__init__.py	25	CODE
LOW	python-package/xgboost/__init__.py	25	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/__init__.py	26	CODE
LOW	python-package/xgboost/core.py	5	CODE
LOW	python-package/xgboost/training.py	42	CODE
LOW	python-package/xgboost/data.py	82	CODE
LOW	python-package/xgboost/data.py	83	CODE
LOW	python-package/xgboost/data.py	85	CODE
LOW	python-package/xgboost/data.py	85	CODE
LOW	python-package/xgboost/_data_utils.py	41	CODE
LOW	python-package/xgboost/testing/__init__.py	53	CODE
LOW	python-package/xgboost/testing/__init__.py	53	CODE
LOW	python-package/xgboost/spark/params.py	3	CODE
LOW	python-package/xgboost/spark/__init__.py	8	CODE
7 more matches not shown…

Hallucination Indicators: 4 hits · 45 pts

Severity	File	Line	Snippet	Context
CRITICAL	R-package/tests/testthat/test_dmatrix.R	674	expect_equal(xgb.get.DMatrix.num.non.missing(dm1), 10)	CODE
CRITICAL	R-package/tests/testthat/test_dmatrix.R	679	expect_equal(xgb.get.DMatrix.num.non.missing(dm2), 8)	CODE
CRITICAL	python-package/xgboost/data.py	501	# pandas.core.internals.managers.SingleBlockManager.array_values()	COMMENT
CRITICAL	…rc/main/java/ml/dmlc/xgboost4j/java/flink/XGBoost.java	170	return new XGBoostModel(ml.dmlc.xgboost4j.java.XGBoost.loadModel(opened));	CODE

Deep Nesting: 42 hits · 40 pts

Severity	File	Line	Context
LOW	demo/guide-python/model_parser.py	146	CODE
LOW	tests/python/test_model_compatibility.py	29	CODE
LOW	tests/python/test_model_compatibility.py	76	CODE
LOW	tests/python/test_shap.py	12	CODE
LOW	tests/python/test_shap.py	82	CODE
LOW	tests/python/test_tracker.py	222	CODE
LOW	tests/python/test_tracker.py	256	CODE
LOW	tests/python/test_tracker.py	227	CODE
LOW	tests/python/test_tracker.py	268	CODE
LOW	tests/test_distributed/test_with_dask/test_with_dask.py	1741	CODE
LOW	tests/python-sycl/test_sycl_updaters.py	61	CODE
LOW	tests/python-sycl/test_sycl_training_continuation.py	9	CODE
LOW	tests/python-sycl/test_sycl_training_continuation.py	29	CODE
LOW	python-package/xgboost/callback.py	555	CODE
LOW	python-package/xgboost/callback.py	634	CODE
LOW	python-package/xgboost/core.py	2321	CODE
LOW	python-package/xgboost/core.py	3084	CODE
LOW	python-package/xgboost/libpath.py	19	CODE
LOW	python-package/xgboost/sklearn.py	1173	CODE
LOW	python-package/xgboost/sklearn.py	1403	CODE
LOW	python-package/xgboost/sklearn.py	1833	CODE
LOW	python-package/xgboost/data.py	370	CODE
LOW	python-package/xgboost/data.py	534	CODE
LOW	python-package/xgboost/data.py	1074	CODE
LOW	python-package/xgboost/dask/__init__.py	1032	CODE
LOW	python-package/xgboost/dask/__init__.py	1560	CODE
LOW	python-package/xgboost/dask/__init__.py	1975	CODE
LOW	python-package/xgboost/testing/ordinal.py	124	CODE
LOW	python-package/xgboost/testing/ordinal.py	137	CODE
LOW	python-package/xgboost/testing/continuation.py	14	CODE
LOW	python-package/xgboost/testing/continuation.py	38	CODE
LOW	python-package/xgboost/testing/shared.py	53	CODE
LOW	python-package/xgboost/spark/core.py	1362	CODE
LOW	python-package/xgboost/spark/core.py	1388	CODE
LOW	python-package/xgboost/spark/data.py	174	CODE
LOW	python-package/xgboost/spark/data.py	213	CODE
LOW	doc/conf.py	65	CODE
LOW	ops/script/run_clang_tidy.py	103	CODE
LOW	ops/script/lint_cpp.py	13	CODE
LOW	ops/script/lint_cpp.py	135	CODE
LOW	ops/script/change_scala_version.py	7	CODE
LOW	ops/script/test_r_package.py	230	CODE

AI Structural Patterns: 41 hits · 38 pts

Severity	File	Line	Context
LOW	tests/python/test_parse_tree.py	34	CODE
LOW	tests/python-sycl/test_sycl_simple_dask.py	17	CODE
LOW	python-package/xgboost/callback.py	269	CODE
LOW	python-package/xgboost/collective.py	146	CODE
LOW	python-package/xgboost/collective.py	158	CODE
LOW	python-package/xgboost/plotting.py	21	CODE
LOW	python-package/xgboost/plotting.py	154	CODE
LOW	python-package/xgboost/core.py	662	CODE
LOW	python-package/xgboost/core.py	860	CODE
LOW	python-package/xgboost/core.py	909	CODE
LOW	python-package/xgboost/core.py	1102	CODE
LOW	python-package/xgboost/core.py	1480	CODE
LOW	python-package/xgboost/core.py	1965	CODE
LOW	python-package/xgboost/core.py	2406	CODE
LOW	python-package/xgboost/core.py	2999	CODE
LOW	python-package/xgboost/sklearn.py	809	CODE
LOW	python-package/xgboost/sklearn.py	1270	CODE
LOW	python-package/xgboost/sklearn.py	1721	CODE
LOW	python-package/xgboost/sklearn.py	1974	CODE
LOW	python-package/xgboost/sklearn.py	2059	CODE
LOW	python-package/xgboost/sklearn.py	2167	CODE
LOW	python-package/xgboost/training.py	53	CODE
LOW	python-package/xgboost/training.py	435	CODE
LOW	python-package/xgboost/training.py	253	CODE
LOW	python-package/xgboost/data.py	1229	CODE
LOW	python-package/xgboost/dask/__init__.py	256	CODE
LOW	python-package/xgboost/dask/__init__.py	581	CODE
LOW	python-package/xgboost/dask/__init__.py	831	CODE
LOW	python-package/xgboost/dask/__init__.py	1211	CODE
LOW	python-package/xgboost/dask/__init__.py	296	CODE
LOW	python-package/xgboost/dask/__init__.py	591	CODE
LOW	python-package/xgboost/dask/__init__.py	1660	CODE
LOW	python-package/xgboost/dask/__init__.py	1772	CODE
LOW	python-package/xgboost/dask/__init__.py	1975	CODE
LOW	python-package/xgboost/dask/__init__.py	2144	CODE
LOW	python-package/xgboost/dask/__init__.py	2206	CODE
LOW	python-package/xgboost/dask/data.py	206	CODE
LOW	python-package/xgboost/testing/data.py	859	CODE
LOW	python-package/xgboost/spark/estimator.py	195	CODE
LOW	python-package/xgboost/spark/estimator.py	372	CODE
LOW	python-package/xgboost/spark/estimator.py	561	CODE

AI Slop Vocabulary: 11 hits · 24 pts

Severity	File	Line	Snippet	Context
LOW	demo/rmm_plugin/rmm_mgpu_with_dask.py	44	# To use RMM pool allocator with a GPU Dask cluster, just add rmm_pool_size option	COMMENT
MEDIUM	demo/guide-python/custom_softmax.py	70	# suitable for demo. Also the one in native XGBoost core is more robust to	COMMENT
MEDIUM	python-package/xgboost/config.py	58	# Show all messages, including ones pertaining to debugging	COMMENT
LOW	python-package/xgboost/sklearn.py	1851	# If output_margin is active, simply return the scores	STRING
MEDIUM	python-package/xgboost/dask/__init__.py	690	# dask paradigm. But as a side effect, the `evals_result` in single-node API	COMMENT
MEDIUM	python-package/xgboost/testing/__init__.py	662	"""Reassign stdout temporarily in order to test printed statements	STRING
MEDIUM	python-package/xgboost/spark/core.py	935	# Spark-rapids is a project to leverage GPUs to accelerate spark SQL.	COMMENT
LOW	python-package/xgboost/spark/core.py	673	# For now, since we cannot call rdd.getNumPartitions(), we just return	COMMENT
LOW	python-package/xgboost/spark/core.py	1351	# User don't set gpu configurations, just use cpu	COMMENT
MEDIUM	…main/scala/ml/dmlc/xgboost4j/scala/spark/XGBoost.scala	144	// that utilize GPUs alongside training tasks in order to avoid GPU out-of-memory errors.	COMMENT
MEDIUM	src/common/cuda_pinned_allocator.h	73	// This is actually a pinned memory allocator in disguise. We utilize HMM or ATS for	COMMENT

Modern Structural Boilerplate: 20 hits · 20 pts

Severity	File	Line	Snippet	Context
LOW	python-package/xgboost/callback.py	36	__all__ = [	CODE
LOW	python-package/xgboost/interpret.py	127	__all__ = ["shap_values"]	CODE
LOW	python-package/xgboost/config.py	127	def set_config(**new_config: Any) -> None:	STRING
LOW	python-package/xgboost/__init__.py	41	__all__ = [	CODE
LOW	python-package/xgboost/core.py	933	def set_float_info(self, field: str, data: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	948	def set_float_info_npy2d(self, field: str, data: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	964	def set_uint_info(self, field: str, data: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	995	def set_label(self, label: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	1007	def set_weight(self, weight: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	1027	def set_base_margin(self, margin: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	1045	def set_group(self, group: ArrayLike) -> None:	CODE
LOW	python-package/xgboost/core.py	2045	def set_attr(self, **kwargs: Optional[Any]) -> None:	CODE
LOW	python-package/xgboost/core.py	2075	def _set_feature_info(self, features: Optional[FeatureInfo], field: str) -> None:	CODE
LOW	python-package/xgboost/sklearn.py	1265	def _set_evaluation_result(self, evals_result: EvalsLog) -> None:	CODE
LOW	python-package/xgboost/dask/__init__.py	138	__all__ = [	CODE
LOW	python-package/xgboost/spark/__init__.py	17	__all__ = [	CODE
LOW⚡	python-package/xgboost/spark/core.py	358	def _set_predict_params_default(self) -> None:	CODE
LOW	python-package/xgboost/spark/core.py	300	def _set_xgb_params_default(self) -> None:	CODE
LOW	python-package/xgboost/spark/core.py	336	def _set_fit_params_default(self) -> None:	CODE
LOW	python-package/xgboost/spark/estimator.py	41	def set_param_attrs(attr_name: str, param: Param) -> None:	CODE

Self-Referential Comments: 6 hits · 18 pts

Severity	File	Line	Snippet	Context
MEDIUM	demo/guide-python/cat_pipeline.py	66	# Create an encoder based on training data.	COMMENT
MEDIUM	R-package/R/utils.R	2	# This file is for the low level reusable utility functions	COMMENT
MEDIUM	R-package/inst/make-r-def.R	2	# Create a definition file (.def) from a .dll file, using objdump. This	COMMENT
MEDIUM	tests/python-gpu/test_gpu_prediction.py	266	# Create a wide dataset	COMMENT
MEDIUM	python-package/xgboost/dask/data.py	434	# Create the training DMatrix	COMMENT
MEDIUM	python-package/xgboost/testing/data.py	180	# Create a dictionary-backed dataframe, enable this when the roundtrip is	COMMENT

Cross-File Repetition: 3 hits · 15 pts

Severity	File	Snippet	Context
HIGH	demo/guide-python/quantile_data_iterator.py	utility function for obtaining current batch of data.	STRING
HIGH	tests/python-gpu/test_from_cudf.py	utility function for obtaining current batch of data.	STRING
HIGH	python-package/xgboost/dask/data.py	utility function for obtaining current batch of data.	STRING

Cross-Language Confusion: 2 hits · 12 pts

Severity	File	Line	Snippet	Context
HIGH	tests/python/test_with_pandas.py	487	assert df.equals(copy)	CODE
HIGH	python-package/xgboost/testing/data.py	200	# assert pd_catcodes.equals(pa_catcodes)	COMMENT

Excessive Try-Catch Wrapping: 11 hits · 12 pts

Severity	File	Line	Snippet	Context
LOW	demo/guide-python/distributed_extmem_basic.py	189	except Exception as e:	CODE
LOW	tests/python/test_with_shap.py	9	except Exception:	CODE
LOW	tests/python/test_openmp.py	44	except Exception as e:	CODE
LOW	python-package/xgboost/core.py	380	except Exception as e: # pylint: disable=broad-except	CODE
LOW	python-package/xgboost/data.py	1326	except Exception: # pylint: disable=broad-except	CODE
MEDIUM	python-package/xgboost/data.py	1002	def _lazy_load_cudf_is_cat() -> Callable[[Any], bool]:	CODE
LOW	python-package/xgboost/dask/__init__.py	673	except Exception: # pylint: disable=broad-except	CODE
LOW	python-package/xgboost/testing/__init__.py	187	except Exception: # pylint: disable=broad-except	CODE
LOW	python-package/xgboost/testing/__init__.py	757	except Exception as e: # pylint: disable=broad-except	CODE
LOW	python-package/xgboost/spark/core.py	1654	except Exception as e: # pylint: disable=W0703	CODE
LOW	ops/script/change_version.py	163	except Exception as e:	STRING

AI Response Leakage: 1 hit · 8 pts

Severity	File	Line	Snippet	Context
HIGH	R-package/R/xgb.DMatrix.R	664	#' # In this example, batches are obtained by subsetting the 'x' variable.	COMMENT

Decorative Section Separators: 2 hits · 6 pts

Severity	File	Line	Snippet	Context
MEDIUM	python-package/CMakeLists.txt	26	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	python-package/CMakeLists.txt	28	# ---------------------------------------------------------------------------	COMMENT

Redundant / Tautological Comments: 4 hits · 6 pts

Severity	File	Line	Snippet	Context
LOW	R-package/tests/testthat/test_ranking.R	25	# Check if the metric is monotone increasing	COMMENT
LOW	R-package/tests/testthat/test_ranking.R	53	# Check if the metric is monotone increasing	COMMENT
LOW	R-package/tests/testthat/test_ranking.R	63	all(diff(z) <= 0) # Check if z is monotone decreasing	CODE
LOW	jvm-packages/create_jni.py	80	# Set GPU_ARCH_FLAG to override the CUDA architectures.	COMMENT

Slop Phrases: 1 hit · 3 pts

Severity	File	Line	Snippet	Context
MEDIUM	demo/guide-python/sklearn_evals_result.py	25	# Or you can use: clf = xgb.XGBClassifier(**param_dist)	COMMENT

Verbosity Indicators: 2 hits · 3 pts

Severity	File	Line	Snippet	Context
LOW	R-package/R/xgb.train.R	420	#' The purpose of this function is to enable IDE autocompletions and to provide in-package	COMMENT
LOW	python-package/xgboost/spark/core.py	1255	# all the columns specified by features_cols, so we need to check if	COMMENT

Example Usage Blocks: 2 hits · 3 pts

Severity	File	Line	Snippet	Context
LOW	ops/pipeline/build-cuda.sh	5	## Usage:	COMMENT
LOW	ops/pipeline/test-python-wheel.sh	4	## Usage:	COMMENT
