ludwig-ai/ludwig

Adjusted Score: 25.4
Raw Score: 25.4
Time Factor: 100%
Last Push: 2026-07-13
Stars: 11.7K
Language: Python
Lines of Code: 230.8K
Files: 1.6K
Pattern Hits: 3.6K
Scan Date: 2026-07-14
HC Hit Rate: 0.05

What These Metrics Mean

Adjusted Score: Primary synthetic code indicator. Raw score normalised per 1,000 lines of code and multiplied by the temporal discount factor. This is the definitive comparative metric — use it to rank repositories by AI authorship density.
Raw Score: The unmodified sum of all severity-weighted, context-multiplied pattern match scores before temporal discounting. Reflects the absolute signal strength independent of when the repository was last active.
Time Factor: The temporal discount multiplier (0–100%) applied to the raw score. Repositories last updated before ChatGPT's launch (Nov 2022) receive a 5% factor. Full signal is only assigned to repositories active in the post-adoption era (Jan 2024+).
Pattern Hits: Total count of individual pattern matches across all files and categories. A high hit count with a low score may indicate a very large codebase with isolated AI snippets; a low count with a high score indicates dense, concentrated AI signatures.
HC Hit Rate: High+Critical pattern hits per file, averaged across the repository. This orthogonal signal catches repositories where a few files are densely packed with high-severity AI tells — a strong indicator even when the normalised score appears moderate due to codebase size.
Lines of Code / Files: Total lines and files analysed. The scanner examines 94 file extensions. These denominators are used to normalise the score, enabling fair comparison between repositories of vastly different sizes.
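
As a rough illustration of the two normalised metrics described above, the sketch below computes a per-1,000-line adjusted score and the HC hit rate. The function names and the sample raw score are hypothetical; the scanner's actual implementation is not shown on this page. The HC example uses this scan's counts (76 HIGH, 0 CRITICAL) and assumes that "1.6K" files means roughly 1,600.

```python
def adjusted_score(raw_points: float, lines_of_code: int, time_factor: float) -> float:
    """Raw points per 1,000 lines of code, scaled by the temporal discount (0.0 to 1.0)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return raw_points / (lines_of_code / 1000) * time_factor


def hc_hit_rate(high_hits: int, critical_hits: int, files: int) -> float:
    """HIGH plus CRITICAL hits divided by the number of files analysed."""
    if files <= 0:
        raise ValueError("files must be positive")
    return (high_hits + critical_hits) / files


# Hypothetical repository: 500 raw points over 20,000 lines, recently active.
print(adjusted_score(500, 20_000, 1.0))
# The same repository if it had last been pushed before Nov 2022 (5% factor).
print(adjusted_score(500, 20_000, 0.05))
# This scan: 76 HIGH + 0 CRITICAL hits over roughly 1,600 files.
print(hc_hit_rate(76, 0, 1600))
```

Under that file-count assumption, 76 / 1,600 is about 0.0475, which is consistent with the reported HC Hit Rate of 0.05.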

Score History

This chart maps the temporal evolution of the adjusted synthetic code score across successive scan runs. An upward trajectory indicates ongoing incorporation of AI-generated code or expanding LLM-assisted scaffolding; a stable or declining trajectory may reflect active human refactoring, code removal, or the adoption of stricter authorship policies. The dashed secondary line (right axis) independently tracks total raw pattern hit count, which can diverge from the normalised score when codebase size changes significantly between scans.

Severity Breakdown

Classifies detected patterns by their diagnostic confidence and structural impact. CRITICAL patterns (coefficient 10) represent definitive synthetic signatures — hallucinated imports, explicit LLM attribution metadata — virtually never produced by human authors. HIGH (5) indicates strong structural tells such as cross-file repetition or cross-linguistic idioms. MEDIUM (2) covers recognisable conversational padding and AI-specific vocabulary. LOW (1) captures subtle indicators like tautological comments and generic boilerplate that require density to carry independent signal.

CRITICAL 0 · HIGH 76 · MEDIUM 824 · LOW 2706
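
To make the severity coefficients concrete, the sketch below applies them to the counts reported for this scan. It computes only the base weighted sum, before any context or density multipliers, so the result is not the scanner's raw score. The names are illustrative.

```python
# Coefficients as stated in the Severity Breakdown description.
SEVERITY_COEFFICIENTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}

# Hit counts reported for this scan.
scan_counts = {"CRITICAL": 0, "HIGH": 76, "MEDIUM": 824, "LOW": 2706}


def base_weighted_points(counts: dict) -> int:
    """Sum of hit counts weighted by severity coefficient (no multipliers applied)."""
    return sum(SEVERITY_COEFFICIENTS[severity] * n for severity, n in counts.items())


total_hits = sum(scan_counts.values())
base_points = base_weighted_points(scan_counts)
print(total_hits)   # 3606
print(base_points)  # 4734
```

LOW hits make up most of the count (2,706 of 3,606) and, even at coefficient 1, most of the base points.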

Directory Score Breakdown

This horizontal bar chart decomposes the repository's raw synthetic code score by top-level directory, allowing you to pinpoint precisely which modules or components carry the highest AI authorship density. Directories with disproportionately high scores relative to their size warrant targeted manual review: concentrated AI signatures often trace back to mass-generated configuration layers, auto-ported test suites, LLM-scaffolded boilerplate classes, or entire subsystems authored under heavy copilot assistance. Use this view to prioritise your human code-review effort.

Pattern Findings

The scanner identified 3606 distinct pattern matches across 21 syntactic categories. Each entry below represents a discrete location in the source code where the engine recorded a statistically significant AI authorship indicator. Expand any category row to inspect the individual file paths, line numbers, code snippets, and the lexical context (CODE, COMMENT, or STRING) in which each match was detected.

Reading the findings table: The Severity column indicates the diagnostic confidence level (CRITICAL / HIGH / MEDIUM / LOW). The Context column identifies whether the match occurred inside executable code, an inline comment, or a string literal — comment-context matches receive a ×1.5 weight because LLMs systematically over-annotate. The ⚡ bolt icon marks clustered matches: three or more patterns within a 10-line window, each receiving an additional ×1.5 density multiplier as dense clusters constitute far stronger evidence of synthetic authorship than isolated hits.
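
The weighting described above can be sketched as follows. This is a minimal illustration, not the scanner's code: the function names are hypothetical, the window is assumed to span 10 consecutive lines, and only hits from a single category in one file are considered, whereas the scanner may count hits from several categories together.

```python
SEVERITY_COEFFICIENTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
COMMENT_MULTIPLIER = 1.5   # comment-context matches
CLUSTER_MULTIPLIER = 1.5   # matches inside a dense cluster
CLUSTER_MIN_HITS = 3
CLUSTER_WINDOW = 10


def clustered_lines(hit_lines: list[int]) -> set[int]:
    """Return hit lines that share a 10-line window with at least two other hits."""
    lines = sorted(hit_lines)
    clustered: set[int] = set()
    for i in range(len(lines) - CLUSTER_MIN_HITS + 1):
        group = lines[i : i + CLUSTER_MIN_HITS]
        # Assumption: a window covers lines n .. n+9, so the span must be below 10.
        if group[-1] - group[0] < CLUSTER_WINDOW:
            clustered.update(group)
    return clustered


def hit_points(severity: str, context: str, in_cluster: bool) -> float:
    """Severity coefficient times the context and density multipliers."""
    points = float(SEVERITY_COEFFICIENTS[severity])
    if context == "COMMENT":
        points *= COMMENT_MULTIPLIER
    if in_cluster:
        points *= CLUSTER_MULTIPLIER
    return points


# A subset of the separator hits listed for ludwig/collect.py.
collect_hits = [15, 235, 237, 269, 271, 275, 277, 282, 284, 287, 289]
flagged = clustered_lines(collect_hits)
print(sorted(flagged))  # [269, 271, 275, 277, 282, 284, 287, 289]
print(hit_points("MEDIUM", "COMMENT", in_cluster=True))   # 4.5
print(hit_points("MEDIUM", "COMMENT", in_cluster=False))  # 3.0
```

Lines 235 and 237 stay unflagged because only two hits fall in their window, which matches the rows without the ⚡ icon in the table.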

Decorative Section Separators: 767 hits · 2580 pts

Severity	File	Line	Snippet	Context
MEDIUM	ludwig/experiment_utils.py	14	# ==============================================================================	COMMENT
MEDIUM	ludwig/api_types.py	14	# ==============================================================================	COMMENT
MEDIUM⚡	ludwig/collect.py	269	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	271	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	275	# -------------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	277	# -------------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	282	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	284	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	287	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	289	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	351	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	353	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	357	# -------------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	359	# -------------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	364	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	366	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	407	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	409	# ----------------	COMMENT
MEDIUM⚡	ludwig/collect.py	415	# ------------------	COMMENT
MEDIUM⚡	ludwig/collect.py	417	# ------------------	COMMENT
MEDIUM	ludwig/collect.py	15	# ==============================================================================	COMMENT
MEDIUM	ludwig/collect.py	235	# ---------------	COMMENT
MEDIUM	ludwig/collect.py	237	# ---------------	COMMENT
MEDIUM	ludwig/contrib.py	14	# ==============================================================================	COMMENT
MEDIUM	ludwig/forecast.py	68	# ---------------	COMMENT
MEDIUM	ludwig/forecast.py	70	# ---------------	COMMENT
MEDIUM⚡	ludwig/forecast.py	96	# ----------------	COMMENT
MEDIUM⚡	ludwig/forecast.py	98	# ----------------	COMMENT
MEDIUM⚡	ludwig/forecast.py	101	# -------------------------	COMMENT
MEDIUM⚡	ludwig/forecast.py	103	# -------------------------	COMMENT
MEDIUM	ludwig/hyperopt_cli.py	15	# ==============================================================================	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	150	# -------------------	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	152	# -------------------	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	161	# ----------------------------	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	163	# ----------------------------	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	173	# ---------------	COMMENT
MEDIUM⚡	ludwig/hyperopt_cli.py	175	# ---------------	COMMENT
MEDIUM	ludwig/hyperopt_cli.py	226	# ----------------	COMMENT
MEDIUM	ludwig/hyperopt_cli.py	228	# ----------------	COMMENT
MEDIUM	ludwig/hyperopt_cli.py	300	# ------------------	COMMENT
MEDIUM	ludwig/hyperopt_cli.py	302	# ------------------	COMMENT
MEDIUM	ludwig/error.py	14	# ==============================================================================	COMMENT
MEDIUM	ludwig/upload.py	92	# ---------------	COMMENT
MEDIUM	ludwig/upload.py	94	# ---------------	COMMENT
MEDIUM	ludwig/upload.py	111	# ---------------	COMMENT
MEDIUM	ludwig/upload.py	113	# ---------------	COMMENT
MEDIUM	ludwig/preprocess.py	15	# ==============================================================================	COMMENT
MEDIUM	ludwig/preprocess.py	96	# ---------------	COMMENT
MEDIUM	ludwig/preprocess.py	98	# ---------------	COMMENT
MEDIUM	ludwig/preprocess.py	141	# ----------------	COMMENT
MEDIUM	ludwig/preprocess.py	143	# ----------------	COMMENT
MEDIUM	ludwig/preprocess.py	166	# ------------------	COMMENT
MEDIUM	ludwig/preprocess.py	168	# ------------------	COMMENT
MEDIUM	ludwig/globals.py	15	# ==============================================================================	COMMENT
MEDIUM	ludwig/constants.py	15	# ==============================================================================	COMMENT
MEDIUM	ludwig/predict.py	15	# ==============================================================================	COMMENT
MEDIUM	ludwig/predict.py	110	# ---------------	COMMENT
MEDIUM	ludwig/predict.py	112	# ---------------	COMMENT
MEDIUM⚡	ludwig/predict.py	141	# ----------------	COMMENT
MEDIUM⚡	ludwig/predict.py	143	# ----------------	COMMENT
707 more matches not shown…

Hyper-Verbose Identifiers: 1505 hits · 1565 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/experiment_utils.py	61	def get_experiment_description(	CODE
LOW	ludwig/contrib.py	32	def add_contrib_callback_args(parser: argparse.ArgumentParser):	CODE
LOW	ludwig/api.py	224	def _initialize_llm_for_zero_shot(self, random_seed: int = default_random_seed):	CODE
LOW	ludwig/api.py	710	def _tune_batch_size_and_grad_accum(self, trainer, dataset, random_seed: int = default_random_seed):	CODE
LOW	ludwig/api.py	768	def save_dequantized_base_model(self, save_path: str) -> None:	CODE
LOW	ludwig/api.py	867	def _generate_streaming_outputs(	CODE
LOW	ludwig/api.py	903	def _generate_non_streaming_outputs(	CODE
LOW	ludwig/api.py	1772	def _preprocess_for_prediction(	CODE
LOW	ludwig/serve_ray_serve.py	43	def make_ludwig_deployment_class(num_replicas: int = 1, ray_actor_options: dict | None = None):	CODE
LOW⚡	ludwig/config_generation.py	22	def get_ludwig_schema_context() -> str:	CODE
LOW	ludwig/config_sampling/explore_schema.py	173	def generate_possible_configs(config_options: dict[str, Any]):	CODE
LOW	ludwig/config_sampling/explore_schema.py	260	def combine_configs_for_comparator_combiner(	CODE
LOW	ludwig/config_sampling/explore_schema.py	291	def combine_configs_for_sequence_combiner(	CODE
LOW	ludwig/explain/util.py	53	def get_absolute_module_key_from_submodule(module: torch.nn.Module, submodule: torch.nn.Module):	CODE
LOW⚡	ludwig/explain/captum.py	59	def retry_with_halved_batch_size(run_config: ExplanationRunConfig):	CODE
LOW⚡	ludwig/explain/captum.py	69	def retry_with_halved_batch_size_fn(fn):	CODE
LOW⚡	ludwig/explain/captum.py	70	def retry_with_halved_batch_size_wrapper(*args, **kwargs):	CODE
LOW	ludwig/explain/captum_ray.py	177	def get_total_attribution_task(	CODE
LOW⚡	ludwig/config_validation/checks.py	178	def check_class_balance_preprocessing(config: "ModelConfig") -> None:	CODE
LOW⚡	ludwig/config_validation/checks.py	188	def check_sampling_exclusivity(config: "ModelConfig") -> None:	CODE
LOW⚡	ludwig/config_validation/checks.py	197	def check_validation_metric_exists(config: "ModelConfig") -> None:	CODE
LOW⚡	ludwig/config_validation/checks.py	466	def check_llm_finetuning_trainer_config(config: "ModelConfig"):	CODE
LOW⚡	ludwig/config_validation/checks.py	484	def check_llm_finetuning_backend_config(config: "ModelConfig"):	CODE
LOW	ludwig/config_validation/checks.py	53	def get_config_check_registry():	CODE
LOW	ludwig/config_validation/checks.py	75	def check_feature_names_unique(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	88	def check_tied_features_valid(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	114	def check_ray_backend_in_memory_preprocessing(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	136	def check_sequence_concat_combiner_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	154	def check_comparator_combiner_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	224	def check_hf_tokenizer_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	237	def check_hf_encoder_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	250	def check_stacked_transformer_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	274	def check_hyperopt_search_algorithm_dependencies_installed(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	286	def check_hyperopt_scheduler_dependencies_installed(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	298	def check_tagger_decoder_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	329	def check_hyperopt_parameter_dicts(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	372	def check_concat_combiner_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	403	def check_hyperopt_nested_parameter_dicts(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	439	def check_llm_exactly_one_input_text_feature(config: "ModelConfig"):	CODE
LOW	ludwig/config_validation/checks.py	450	def check_llm_finetuning_output_feature_config(config: "ModelConfig"):	CODE
LOW	ludwig/config_validation/checks.py	511	def check_llm_finetuning_adalora_config(config: "ModelConfig"):	CODE
LOW	ludwig/config_validation/checks.py	539	def check_llm_finetuning_adaption_prompt_parameters(config: "ModelConfig"):	CODE
LOW	ludwig/config_validation/checks.py	571	def check_llm_quantization_backend_incompatibility(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	604	def check_llm_text_encoder_is_not_used_with_ecd(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	633	def check_qlora_merge_and_unload_compatibility(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	658	def check_prompt_requirements(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	720	def check_sample_ratio_and_size_compatible(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/checks.py	728	def check_grpo_requires_text_output(config: "ModelConfig") -> None:	CODE
LOW	ludwig/config_validation/preprocessing.py	1	def check_global_max_sequence_length_fits_prompt_template(metadata, global_preprocessing_parameters):	CODE
LOW	ludwig/distributed/__init__.py	40	def get_current_dist_strategy() -> DistributedStrategy:	CODE
LOW	ludwig/distributed/__init__.py	53	def get_default_strategy_name() -> str:	CODE
LOW	ludwig/distributed/accelerate.py	211	def allow_gradient_accumulation(self) -> bool:	CODE
LOW	ludwig/distributed/accelerate.py	234	def extract_model_for_serialization(cls, model):	CODE
LOW	ludwig/distributed/accelerate.py	244	def replace_model_from_serialization(cls, state):	CODE
LOW	ludwig/distributed/base.py	162	def allow_gradient_accumulation(self) -> bool:	CODE
LOW	ludwig/distributed/base.py	191	def extract_model_for_serialization(cls, model: nn.Module) -> nn.Module | tuple[nn.Module, list[dict]]:	CODE
LOW	ludwig/distributed/base.py	195	def replace_model_from_serialization(cls, state: nn.Module | tuple[nn.Module, list[dict]]) -> nn.Module:	CODE
LOW	ludwig/callbacks/__init__.py	99	def on_hyperopt_preprocessing_start(self, experiment_name: str, **kwargs):	CODE
LOW	ludwig/callbacks/__init__.py	106	def on_hyperopt_preprocessing_end(self, experiment_name: str, **kwargs):	CODE
LOW	ludwig/callbacks/__init__.py	212	def on_trainer_train_teardown(self, trainer, progress_tracker, save_path: str, is_coordinator: bool, **kwargs):	CODE
1445 more matches not shown…

Unused Imports: 321 hits · 282 pts

Severity	File	Line	Context
LOW	ludwig/__init__.py	18	CODE
LOW	ludwig/types.py	9	CODE
LOW	ludwig/serve_ray_serve.py	24	CODE
LOW	ludwig/serve_kserve.py	18	CODE
LOW	ludwig/config_validation/validation.py	13	CODE
LOW	ludwig/config_validation/validation.py	14	CODE
LOW	ludwig/config_validation/validation.py	14	CODE
LOW	ludwig/config_validation/validation.py	15	CODE
LOW	ludwig/config_validation/validation.py	16	CODE
LOW	ludwig/config_validation/validation.py	16	CODE
LOW	ludwig/config_validation/validation.py	26	CODE
LOW	ludwig/distributed/accelerate.py	170	CODE
LOW	ludwig/distributed/base.py	1	CODE
LOW	ludwig/datasets/__init__.py	21	CODE
LOW	ludwig/datasets/__init__.py	22	CODE
LOW	ludwig/datasets/loaders/misc_loaders.py	3	CODE
LOW	ludwig/datasets/loaders/multilabel_loader.py	6	CODE
LOW	ludwig/datasets/loaders/qa_loader.py	8	CODE
LOW	ludwig/datasets/loaders/hugging_face.py	15	CODE
LOW	ludwig/datasets/loaders/multiple_choice_loader.py	9	CODE
LOW	ludwig/datasets/loaders/ner_loader.py	8	CODE
LOW	ludwig/datasets/loaders/openml_loader.py	15	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	15	CODE
LOW	ludwig/datasets/loaders/vqa_loader.py	8	CODE
LOW	ludwig/datasets/loaders/code_loader.py	3	CODE
LOW	ludwig/datasets/loaders/translation_loader.py	8	CODE
LOW	ludwig/features/transforms.py	7	CODE
LOW	ludwig/features/feature_registries.py	58	CODE
LOW	ludwig/features/feature_registries.py	59	CODE
LOW	ludwig/features/text_feature.py	16	CODE
LOW	ludwig/features/timeseries_feature.py	37	CODE
LOW	ludwig/contribs/__init__.py	49	CODE
LOW	ludwig/contribs/__init__.py	59	CODE
LOW	ludwig/contribs/__init__.py	69	CODE
LOW	ludwig/combiners/combiners.py	27	CODE
LOW	ludwig/combiners/__init__.py	2	CODE
LOW	ludwig/combiners/tabpfn_v2_combiner.py	20	CODE
LOW	ludwig/combiners/tabpfn_v2_combiner.py	64	CODE
LOW	ludwig/utils/dataset_quality.py	33	CODE
LOW	ludwig/utils/trainer_utils.py	23	CODE
LOW	ludwig/utils/trainer_utils.py	24	CODE
LOW	ludwig/utils/hf_utils.py	1	CODE
LOW	ludwig/utils/misc_utils.py	36	CODE
LOW	ludwig/utils/llm_utils.py	1	CODE
LOW	ludwig/utils/checkpoint_utils.py	27	CODE
LOW	ludwig/utils/checkpoint_utils.py	28	CODE
LOW	ludwig/utils/upload_utils.py	1	CODE
LOW	ludwig/utils/entmax/__init__.py	3	CODE
LOW	ludwig/utils/entmax/__init__.py	3	CODE
LOW	ludwig/utils/entmax/__init__.py	3	CODE
LOW	ludwig/utils/entmax/__init__.py	3	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	4	CODE
LOW	ludwig/utils/entmax/__init__.py	14	CODE
261 more matches not shown…

Modern Structural Boilerplate: 215 hits · 222 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/serve_vllm.py	24	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/collect.py	35	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/forecast.py	14	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/hyperopt_cli.py	29	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/upload.py	10	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/preprocess.py	32	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/presets.py	10	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/predict.py	31	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/serve.py	39	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/inspect_model.py	9	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/export.py	25	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/api.py	115	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/api.py	1832	def set_logging_level(logging_level: int) -> None:	CODE
LOW	ludwig/serve_ray_serve.py	29	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/experiment.py	33	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/serve_kserve.py	23	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/train.py	32	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/evaluate.py	30	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/progress_bar.py	38	def set_postfix(self, ordered_dict: dict | None = None, **kwargs) -> None:	CODE
LOW⚡	ludwig/config_generation.py	19	logger = logging.getLogger(__name__)	CODE
LOW⚡	ludwig/serve_v2.py	27	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/model_inspector.py	15	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/explain/captum.py	41	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/distributed/accelerate.py	38	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/vector_index/__init__.py	6	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/datasets/archives.py	26	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/datasets/loaders/hugging_face.py	26	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/datasets/loaders/openml_loader.py	27	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	38	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/datasets/loaders/mnist.py	28	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/vector_feature.py	40	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/category_feature.py	65	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/binary_feature.py	50	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/text_feature.py	64	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/set_feature.py	45	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/bag_feature.py	30	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/number_feature.py	47	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/date_feature.py	35	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/passthrough_feature.py	78	def update_config_with_metadata(self, feature_config, feature_metadata, *args, **kwargs) -> None:	CODE
LOW	ludwig/features/image_feature.py	109	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/anomaly_feature.py	72	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/anomaly_feature.py	281	def update_metrics(self, targets: torch.Tensor, predictions: dict[str, torch.Tensor]) -> None:	CODE
LOW	ludwig/features/timeseries_feature.py	39	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/sequence_feature.py	64	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/base_feature.py	53	logger = logging.getLogger(__name__)	CODE
LOW⚡	ludwig/features/base_feature.py	214	def update_config_with_metadata(feature_config, feature_metadata, *args, **kwargs) -> None:	CODE
LOW⚡	ludwig/features/base_feature.py	217	def update_config_after_module_init(self, feature_config) -> None:	CODE
LOW	ludwig/features/base_feature.py	411	def _setup_loss(self) -> None:	CODE
LOW	ludwig/features/base_feature.py	415	def _setup_metrics(self) -> None:	CODE
LOW	ludwig/features/base_feature.py	485	def update_metrics(self, targets: Tensor, predictions: dict[str, Tensor]) -> None:	CODE
LOW⚡	ludwig/features/base_feature.py	601	def update_config_with_metadata(feature_config, feature_metadata, *args, **kwargs) -> None:	CODE
LOW	ludwig/features/h3_feature.py	27	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/features/audio_feature.py	52	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/contribs/comet.py	25	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/contribs/wandb.py	24	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/contribs/aim.py	11	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	24	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/contribs/mlflow/__init__.py	16	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/combiners/combiners.py	52	logger = logging.getLogger(__name__)	CODE
LOW	ludwig/combiners/tabpfn_v2_combiner.py	30	logger = logging.getLogger(__name__)	CODE
155 more matches not shown…

Over-Commented Block: 224 hits · 219 pts

Severity	File	Line	Snippet	Context
LOW	.protolint.yaml	1	# Adapted from	COMMENT
LOW	ludwig/experiment_utils.py	1	# Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.	COMMENT
LOW	ludwig/api_types.py	1	# Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.	COMMENT
LOW	ludwig/collect.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/contrib.py	1	# Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.	COMMENT
LOW	ludwig/hyperopt_cli.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/error.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/preprocess.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/globals.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/constants.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/predict.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/__init__.py	1	# Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.	COMMENT
LOW	ludwig/serve.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/export.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/api.py	1	# !/usr/bin/env python	COMMENT
LOW	ludwig/experiment.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/cli.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/train.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/evaluate.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/callbacks/__init__.py	1	# !/usr/bin/env python	COMMENT
LOW	ludwig/datasets/dataset_config.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/archives.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/datasets/loaders/ieee_fraud.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/adult_census_income.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/kdd_loader.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/hugging_face.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/goemotions.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/higgs.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/split_loaders.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/ethos_binary.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/consumer_complaints_loader.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/creditcard_fraud.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/sarcastic_headlines.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/openml_loader.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/naval.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/dataset_loader.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/agnews.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/sarcos.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/insurance_lite.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/rossman_store_sales.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/allstate_claims_severity.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/forest_cover.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/forest_cover.py	61	# Elevation quantitative meters Elevation in meters	COMMENT
LOW	ludwig/datasets/loaders/code_alpaca_loader.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/sst.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/santander_value_prediction.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/flickr8k.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/datasets/loaders/camseq.py	1	# Copyright (c) 2023 Aizen Corp.	COMMENT
LOW	ludwig/datasets/loaders/mnist.py	1	# Copyright (c) 2022 Predibase, Inc.	COMMENT
LOW	ludwig/features/feature_registries.py	1	# Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.	COMMENT
LOW	ludwig/features/vector_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/category_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/binary_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/text_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/text_feature.py	481	#	COMMENT
LOW	ludwig/features/set_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/bag_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/number_feature.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/feature_utils.py	1	#! /usr/bin/env python	COMMENT
LOW	ludwig/features/date_feature.py	1	#! /usr/bin/env python	COMMENT
164 more matches not shown…

Deep Nesting: 185 hits · 166 pts

Severity	File	Line	Context
LOW	ludwig/serve_vllm.py	27	CODE
LOW	ludwig/serve_vllm.py	116	CODE
LOW	ludwig/serve.py	132	CODE
LOW	ludwig/serve.py	421	CODE
LOW	ludwig/inspect_model.py	12	CODE
LOW	ludwig/api.py	1865	CODE
LOW	ludwig/api.py	232	CODE
LOW	ludwig/api.py	1161	CODE
LOW	ludwig/serve_v2.py	96	CODE
LOW	ludwig/config_sampling/explore_schema.py	22	CODE
LOW	ludwig/config_sampling/explore_schema.py	173	CODE
LOW	ludwig/config_sampling/explore_schema.py	291	CODE
LOW	ludwig/config_sampling/parameter_sampling.py	10	CODE
LOW	ludwig/explain/captum.py	351	CODE
LOW	ludwig/explain/captum.py	379	CODE
LOW	ludwig/explain/captum_ray.py	32	CODE
LOW	ludwig/config_validation/checks.py	224	CODE
LOW	ludwig/config_validation/checks.py	237	CODE
LOW	ludwig/config_validation/checks.py	329	CODE
LOW	ludwig/callbacks/studio.py	183	CODE
LOW	ludwig/datasets/archives.py	41	CODE
LOW	ludwig/datasets/archives.py	67	CODE
LOW	ludwig/datasets/archives.py	90	CODE
LOW	ludwig/datasets/loaders/newyorker_caption_contest.py	19	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	256	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	333	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	363	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	406	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	425	CODE
LOW	ludwig/datasets/loaders/sst.py	54	CODE
LOW	ludwig/datasets/loaders/sst.py	278	CODE
LOW	ludwig/datasets/loaders/vqa_loader.py	25	CODE
LOW	ludwig/datasets/loaders/vqa_loader.py	51	CODE
LOW	ludwig/datasets/loaders/flickr8k.py	23	CODE
LOW	ludwig/datasets/loaders/mnist.py	109	CODE
LOW	ludwig/features/vector_feature.py	44	CODE
LOW	ludwig/features/category_feature.py	400	CODE
LOW	ludwig/features/text_feature.py	324	CODE
LOW	ludwig/features/number_feature.py	329	CODE
LOW	ludwig/features/date_feature.py	72	CODE
LOW	ludwig/features/image_feature.py	146	CODE
LOW	ludwig/features/image_feature.py	499	CODE
LOW	ludwig/features/image_feature.py	905	CODE
LOW	ludwig/features/sequence_feature.py	380	CODE
LOW	ludwig/features/audio_feature.py	81	CODE
LOW	ludwig/features/audio_feature.py	242	CODE
LOW	ludwig/features/audio_feature.py	438	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	170	CODE
LOW	ludwig/combiners/combiners.py	254	CODE
LOW	ludwig/combiners/combiners.py	589	CODE
LOW	ludwig/utils/visualization_utils.py	1398	CODE
LOW	ludwig/utils/sequence_packing.py	27	CODE
LOW	ludwig/utils/image_utils.py	127	CODE
LOW	ludwig/utils/dataset_quality.py	268	CODE
LOW	ludwig/utils/batch_size_tuner.py	19	CODE
LOW	ludwig/utils/trainer_utils.py	291	CODE
LOW	ludwig/utils/training_report.py	16	CODE
LOW	ludwig/utils/model_export.py	172	CODE
LOW	ludwig/utils/algorithms_utils.py	19	CODE
LOW	ludwig/utils/misc_utils.py	69	CODE
125 more matches not shown…

Cross-Language Confusion: 30 hits · 161 pts

Severity	File	Line	Snippet	Context
HIGH⚡	ludwig/config_validation/checks.py	692	# TODO: retrieval by default should be set to null, not a default dict:	COMMENT
HIGH	ludwig/config_validation/checks.py	663	# TODO: `prompt` by default should be set to null, not a default dict:	COMMENT
HIGH	ludwig/utils/output_feature_utils.py	118	"features, or disabling the bucketing setting bucketing_field to None / null, "	CODE
HIGH	ludwig/schema/trainer.py	498	"are inversely proportional to this vector. When null, a uniform preference is used."	CODE
HIGH	ludwig/schema/features/loss/loss.py	87	return "[undefined]"	CODE
HIGH	ludwig/schema/features/preprocessing/date.py	38	description="This parameter can either be a datetime format string, or null, in which case the datetime "	CODE
HIGH	ludwig/schema/combiners/common_transformer_options.py	60	description="The number of stacked fully connected layers (only applies if `reduce_output` is not null).",	CODE
HIGH	ludwig/schema/combiners/tabnet.py	80	description="Size of the virtual batch size used by ghost batch norm. If null, regular batch norm is used "	CODE
HIGH	ludwig/schema/encoders/sequence_encoders.py	23	`[{filter_size: 7, pool_size: 3}, {filter_size: 7, pool_size: 3}, {filter_size: 3, pool_size: null},	CODE
HIGH	ludwig/schema/encoders/sequence_encoders.py	24	{filter_size: 3, pool_size: null}, {filter_size: 3, pool_size: null}, {filter_size: 3, pool_size: 3}]`.	CODE
HIGH	ludwig/schema/encoders/sequence_encoders.py	408	description="If stacked_layers is null, this is the number of elements in the stack of parallel convolutional "	CODE
HIGH	ludwig/schema/encoders/image/base.py	216	"each layer. It indicates the normalization applied to the activations and can be null, "	CODE
HIGH	ludwig/schema/encoders/image/base.py	279	"each layer. It indicates the norm of the output and can be null, batch or layer.",	CODE
HIGH	ludwig/schema/llms/peft.py	1456	"Per-source weights; must have the same length as `sources`. If null, all weights default to 1.0."	CODE
HIGH	ludwig/schema/llms/peft.py	1513	"If null, the first entry in `adapters` is used. Set this to a merged adapter "	CODE
HIGH	ludwig/modules/preference_losses.py	135	+ beta * KL(policy || reference)	STRING
HIGH	tests/ludwig/utils/test_dataframe_utils.py	85	assert scalar_df.equals(expected_df)	CODE
HIGH	tests/ludwig/utils/test_data_utils.py	51	assert df.equals(	CODE
HIGH	tests/ludwig/utils/test_data_utils.py	65	assert df.equals(	CODE
HIGH	tests/ludwig/utils/test_data_utils.py	81	assert df.equals(pd.DataFrame([1, 2, 3, 4, 5], columns=["x"]))	CODE
HIGH	tests/ludwig/utils/test_dataset_utils.py	34	assert split_df.equals(	CODE
HIGH	tests/ludwig/utils/test_dataset_utils.py	89	assert split_df.equals(	CODE
HIGH	tests/ludwig/utils/test_dataset_utils.py	144	assert split_df.equals(	CODE
HIGH	tests/ludwig/utils/test_dataset_utils.py	199	assert split_df.equals(	CODE
HIGH⚡	tests/ludwig/data/test_split.py	81	assert not s1.equals(s2)	CODE
HIGH⚡	tests/ludwig/data/test_split.py	85	assert s1.equals(s3)	CODE
HIGH	tests/ludwig/data/test_split.py	228	assert not s1.equals(s2)	CODE
HIGH	tests/ludwig/data/test_split.py	235	assert s1.equals(s3)	CODE
HIGH	tests/integration_tests/test_visualization.py	1605	assert ground_truth_train_split.equals(pd.Series([0]))	CODE
HIGH	tests/integration_tests/test_mlflow.py	81	assert pred_df.equals(expected_df)	CODE

Cross-File Repetition 27 hits · 135 pts

Severity	File	Snippet	Context
HIGH	ludwig/contribs/comet.py	class that defines the methods necessary to hook into process.	STRING
HIGH	ludwig/contribs/wandb.py	class that defines the methods necessary to hook into process.	STRING
HIGH	ludwig/contribs/aim.py	class that defines the methods necessary to hook into process.	STRING
HIGH	ludwig/benchmarking/profiler_callbacks.py	class that defines the methods necessary to hook into process.	STRING
HIGH	tests/integration_tests/test_experiment.py	class that defines the methods necessary to hook into process.	STRING
HIGH	ludwig/utils/torch_utils.py	returns the size of the input tensor without the batch dimension.	STRING
HIGH	ludwig/modules/reduction_modules.py	returns the size of the input tensor without the batch dimension.	STRING
HIGH	ludwig/modules/convolutional_modules.py	returns the size of the input tensor without the batch dimension.	STRING
HIGH	ludwig/models/llm.py	forward pass of the model. args: inputs: inputs to the model. can be a dictionary of input names to input tensors or a t	STRING
HIGH	ludwig/models/ecd.py	forward pass of the model. args: inputs: inputs to the model. can be a dictionary of input names to input tensors or a t	STRING
HIGH	ludwig/models/base.py	forward pass of the model. args: inputs: inputs to the model. can be a dictionary of input names to input tensors or a t	STRING
HIGH	ludwig/models/llm.py	returns init arguments for constructing this model.	STRING
HIGH	ludwig/models/ecd.py	returns init arguments for constructing this model.	STRING
HIGH	ludwig/models/base.py	returns init arguments for constructing this model.	STRING
HIGH	ludwig/visualize/threshold.py	load model data from files to be shown by compare_predictions_distribution. args: predictions: list of prediction result	STRING
HIGH	ludwig/visualize/curves.py	load model data from files to be shown by compare_predictions_distribution. args: predictions: list of prediction result	STRING
HIGH	ludwig/visualize/performance.py	load model data from files to be shown by compare_predictions_distribution. args: predictions: list of prediction result	STRING
HIGH	ludwig/schema/split.py	custom dataclass field that when used inside a dataclass will allow the user to specify a decoder config. returns: initi	STRING
HIGH	ludwig/schema/features/preprocessing/utils.py	custom dataclass field that when used inside a dataclass will allow the user to specify a decoder config. returns: initi	STRING
HIGH	ludwig/schema/encoders/utils.py	custom dataclass field that when used inside a dataclass will allow the user to specify a decoder config. returns: initi	STRING
HIGH	ludwig/schema/decoders/utils.py	custom dataclass field that when used inside a dataclass will allow the user to specify a decoder config. returns: initi	STRING
HIGH	ludwig/schema/features/loss/utils.py	returns a json schema of conditionals to validate against decoder types for specific feature types.	STRING
HIGH	ludwig/schema/encoders/utils.py	returns a json schema of conditionals to validate against decoder types for specific feature types.	STRING
HIGH	ludwig/schema/decoders/utils.py	returns a json schema of conditionals to validate against decoder types for specific feature types.	STRING
HIGH	tests/integration_tests/utils.py	helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu	STRING
HIGH	tests/integration_tests/test_api.py	helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu	STRING
HIGH	tests/integration_tests/test_visualization_api.py	helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu	STRING

Excessive Try-Catch Wrapping 114 hits · 117 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/check.py	32	except Exception:	CODE
LOW	ludwig/serve.py	264	except Exception:	CODE
LOW	ludwig/serve.py	308	except Exception as exc:	CODE
LOW	ludwig/serve.py	331	except Exception:	CODE
LOW	ludwig/serve.py	372	except Exception:	CODE
LOW	ludwig/api.py	601	except Exception:	CODE
LOW	ludwig/api.py	617	except Exception:	CODE
LOW	ludwig/api.py	1551	except Exception:	CODE
LOW	ludwig/config_generation.py	101	except Exception as exc:	CODE
LOW	ludwig/config_generation.py	196	except Exception as e:	CODE
LOW	ludwig/serve_v2.py	324	except Exception as exc:	CODE
LOW	ludwig/serve_v2.py	344	except Exception as exc:	CODE
LOW	ludwig/serve_v2.py	370	except Exception as exc:	CODE
LOW	ludwig/datasets/__init__.py	320	except Exception as e:	CODE
LOW	ludwig/datasets/loaders/openml_loader.py	150	except Exception as exc:	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	264	except Exception as e:	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	274	except Exception as fallback_e:	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	285	except Exception:	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	291	except Exception:	CODE
LOW	ludwig/datasets/loaders/dataset_loader.py	345	except Exception:	CODE
LOW	ludwig/features/binary_feature.py	174	except Exception as e:	CODE
LOW	ludwig/features/text_feature.py	365	except Exception:	CODE
LOW	ludwig/features/date_feature.py	85	except Exception as e:	CODE
MEDIUM	ludwig/features/date_feature.py	72	def date_to_list(date_value, datetime_format, preprocessing_parameters):	CODE
LOW	ludwig/features/anomaly_feature.py	358	except Exception as e:	CODE
LOW	ludwig/features/anomaly_feature.py	367	except Exception as e:	CODE
LOW	ludwig/features/anomaly_feature.py	384	except Exception as e:	CODE
LOW	ludwig/features/base_feature.py	296	except Exception:	CODE
LOW	ludwig/features/base_feature.py	514	except Exception as e:	CODE
LOW	ludwig/contribs/comet.py	50	except Exception:	CODE
LOW	ludwig/contribs/comet.py	105	except Exception:	CODE
LOW	ludwig/contribs/comet.py	112	except Exception:	CODE
LOW	ludwig/contribs/aim.py	42	except Exception:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	73	except Exception:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	90	except Exception as e:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	97	except Exception as e:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	139	except Exception as e:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	166	except Exception:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	189	except Exception:	CODE
LOW	ludwig/contribs/mlflow/mlflow3.py	205	except Exception as e:	CODE
LOW	ludwig/utils/image_utils.py	116	except Exception as e:	CODE
LOW	ludwig/utils/image_utils.py	197	except Exception:	CODE
LOW	ludwig/utils/image_utils.py	209	except Exception:	CODE
LOW	ludwig/utils/image_utils.py	228	except Exception:	CODE
LOW	ludwig/utils/hf_utils.py	108	except Exception:	CODE
LOW	ludwig/utils/hf_utils.py	185	except Exception as e:	CODE
LOW	ludwig/utils/model_export.py	68	except Exception as e:	CODE
LOW	ludwig/utils/model_export.py	111	except Exception as e:	CODE
LOW	ludwig/utils/model_export.py	122	except Exception as e2:	CODE
LOW⚡	ludwig/utils/audio_utils.py	73	except Exception:	CODE
LOW⚡	ludwig/utils/audio_utils.py	84	except Exception:	CODE
LOW	ludwig/utils/output_feature_utils.py	110	except Exception as e:	CODE
LOW	ludwig/utils/checkpoint_utils.py	174	except Exception as e:	CODE
LOW	ludwig/utils/checkpoint_utils.py	394	except Exception:	CODE
LOW	ludwig/utils/fs_utils.py	97	except Exception:	CODE
LOW	ludwig/backend/datasource.py	50	except Exception as e:	CODE
LOW	ludwig/visualize/_utils.py	286	except Exception:	CODE
LOW	ludwig/schema/utils.py	404	except Exception as e:	CODE
LOW	ludwig/schema/features/augmentation/utils.py	141	except Exception as e:	CODE
LOW	ludwig/schema/features/preprocessing/utils.py	80	except Exception as e:	CODE
54 more matches not shown…

Self-Referential Comments 33 hits · 97 pts

Severity	File	Line	Snippet	Context
MEDIUM	ludwig/api.py	177	# Initialize the config object	COMMENT
MEDIUM	ludwig/api.py	804	# Create the LLM model class instance with the loaded LLM if it hasn't been initialized yet.	COMMENT
MEDIUM⚡	ludwig/datasets/loaders/dataset_loader.py	490	) # This function is defined in the Hugging Face dataloader	CODE
MEDIUM	ludwig/features/timeseries_feature.py	89	# Create the list of shifts we want to perform over the series.	COMMENT
MEDIUM	ludwig/combiners/combiners.py	57	"""This class provides an opaque handle to the input features, preventing them from being registered as state.	STRING
MEDIUM	ludwig/utils/hf_utils.py	168	# Create the repo if it doesn't exist. This is a no-op if the repo already exists	COMMENT
MEDIUM	ludwig/utils/audio_utils.py	177	# The following code for FBank is adapted from jameslyons/python_speech_features	COMMENT
MEDIUM	ludwig/utils/llm_quantization_utils.py	25	# Create a new Linear layer with the same shape	COMMENT
MEDIUM	ludwig/utils/fs_utils.py	65	# Create a windows compatible path from url path	COMMENT
MEDIUM	ludwig/backend/datasource.py	55	# Create a dataset from the paths and indices, then map to read files	COMMENT
MEDIUM	ludwig/modules/optimization_modules.py	50	# Create a dict of parameters to be passed to torch (i.e. everything except `type`):	COMMENT
MEDIUM	ludwig/modules/convolutional_modules.py	1273	# The following code for ResNet is adapted from the TensorFlow implementation	COMMENT
MEDIUM	ludwig/data/dataset_synthesizer.py	400	# Create a Random Image	COMMENT
MEDIUM	ludwig/hyperopt/run.py	171	# Initialize config object	COMMENT
MEDIUM	tests/ludwig/utils/test_upload_utils.py	29	# Create a temporary folder designating training output directory.	COMMENT
MEDIUM	tests/ludwig/utils/test_upload_utils.py	48	# Create a temporary folder designating training output directory.	COMMENT
MEDIUM	tests/ludwig/utils/test_hf_utils.py	55	# Create a temporary folder	COMMENT
MEDIUM	tests/ludwig/utils/test_hf_utils.py	58	# Create a file within the temporary folder	COMMENT
MEDIUM	tests/ludwig/utils/test_model_utils.py	21	# Create a sample model	COMMENT
MEDIUM	tests/ludwig/utils/test_model_utils.py	46	# Create a sample model	COMMENT
MEDIUM	tests/ludwig/utils/test_model_utils.py	52	# Create a new device for testing	COMMENT
MEDIUM	tests/ludwig/automl/test_base_config.py	130	# Create a temporary directory to store the parquet file	COMMENT
MEDIUM	tests/ludwig/automl/test_base_config.py	133	# Create a dataframe with all the types	COMMENT
MEDIUM	tests/ludwig/decoders/test_llm_decoders.py	54	# Create a Boolean mask for elements equal to 0 or 2 (padding or output)	COMMENT
MEDIUM	examples/mnist/advanced_model_training.py	15	# ## Import required libraries	COMMENT
MEDIUM	examples/mnist/assess_model_performance.py	9	# ## Import required libraries	COMMENT
MEDIUM	examples/llm_text_generation/simple_model_training.py	9	# Import required libraries	COMMENT
MEDIUM	examples/llm_few_shot_learning/simple_model_training.py	9	# Import required libraries	COMMENT
MEDIUM	examples/class_imbalance/model_training.py	10	# Import required libraries	COMMENT
MEDIUM	examples/titanic/simple_model_training.py	8	# Import required libraries	COMMENT
MEDIUM	examples/titanic/multiple_model_training.py	10	# ## Import required libraries	COMMENT
MEDIUM	examples/insurance_lite/train.py	5	# Import required libraries	COMMENT
MEDIUM	…amples/llm_zero_shot_learning/simple_model_training.py	9	# Import required libraries	COMMENT

Docstring Block Structure 16 hits · 80 pts

Severity	File	Line	Snippet	Context
HIGH	ludwig/api.py	769	Upscales quantized weights of a model to fp16 and saves the result in a specified folder. Args: sav	STRING
HIGH	ludwig/api.py	1497	Preprocess a dataset and return it split into training / validation / test sets. Args: dataset: Sou	STRING
HIGH	ludwig/config_generation.py	112	Generate a Ludwig config from a natural language task description. Uses an LLM to translate the description into a	STRING
HIGH	ludwig/features/image_feature.py	440	Returns a torchvision transform that is compatible with the model variant. Note that the raw torchvision transform	STRING
HIGH	ludwig/utils/date_utils.py	67	Convert a numeric timestamp to a datetime object. `datetime` objects can be created from POSIX timestamps like thos	STRING
HIGH	ludwig/utils/trainer_utils.py	544	Freezes layers in a model whose names match a specified regular expression pattern. This function iterates over all	STRING
HIGH	ludwig/utils/hf_utils.py	128	Uploads a local folder to the Hugging Face Model Hub. Args: repo_id (str): The ID of the target repository	STRING
HIGH	ludwig/schema/utils.py	76	Deserialize a value into a config instance. Handles the common pattern of checking if a value is a raw dict	STRING
HIGH	ludwig/modules/training_hooks.py	25	Abstract method to be implemented by subclasses. This is the method that defines the custom behavior of the trai	STRING
HIGH	ludwig/modules/convolutional_modules.py	1293	Retrieve the size of each block_layer in the ResNet model. The number of block layers used for the Resnet model var	STRING
HIGH	ludwig/data/preprocessing.py	523	Builds a dataset from a dataframe and a list of features. Args: config: A dictionary containing the Ludwig	STRING
HIGH	ludwig/data/preprocessing.py	1036	The purpose of this function is to balance the training dataset using either over-sampling or under- sampling.	STRING
HIGH	tests/ludwig/encoders/test_llm_encoders.py	66	Get the PEFT paramter name prefix for a given adapter type. Args: adapter: A valid config value for	STRING
HIGH	tests/integration_tests/utils.py	212	Helper method to generate synthetic data based on input, output feature specs. Args: input_features: schema	STRING
HIGH	tests/integration_tests/utils.py	1071	Asserts that the preprocessed dataset has the correct shape and dtype for a given feature type. Args: featu	STRING
HIGH	tests/integration_tests/parameter_update_utils.py	24	Reports on the number of parameters in a Ludwig component and their update status. Args: module: (Ludwi	STRING

AI Structural Patterns 94 hits · 76 pts

Severity	File	Line	Context
LOW	ludwig/experiment_utils.py	61	CODE
LOW	ludwig/collect.py	38	CODE
LOW	ludwig/forecast.py	17	CODE
LOW	ludwig/hyperopt_cli.py	32	CODE
LOW	ludwig/preprocess.py	35	CODE
LOW	ludwig/predict.py	34	CODE
LOW	ludwig/api.py	1865	CODE
LOW	ludwig/api.py	232	CODE
LOW	ludwig/api.py	930	CODE
LOW	ludwig/api.py	1021	CODE
LOW	ludwig/api.py	1287	CODE
LOW	ludwig/api.py	1485	CODE
LOW	ludwig/experiment.py	36	CODE
LOW	ludwig/train.py	35	CODE
LOW	ludwig/evaluate.py	33	CODE
LOW	ludwig/utils/strings_utils.py	322	CODE
LOW	ludwig/utils/trainer_utils.py	99	CODE
LOW	ludwig/utils/torch_utils.py	352	CODE
LOW	ludwig/trainers/trainer_llm.py	47	CODE
LOW	ludwig/trainers/trainer_llm.py	414	CODE
LOW	ludwig/trainers/trainer.py	111	CODE
LOW	ludwig/encoders/set_encoders.py	37	CODE
LOW	ludwig/encoders/h3_encoders.py	54	CODE
LOW	ludwig/encoders/h3_encoders.py	228	CODE
LOW	ludwig/encoders/h3_encoders.py	336	CODE
LOW	ludwig/encoders/generic_encoders.py	65	CODE
LOW	ludwig/encoders/sequence_encoders.py	157	CODE
LOW	ludwig/encoders/sequence_encoders.py	282	CODE
LOW	ludwig/encoders/sequence_encoders.py	519	CODE
LOW	ludwig/encoders/sequence_encoders.py	791	CODE
LOW	ludwig/encoders/sequence_encoders.py	1032	CODE
LOW	ludwig/encoders/sequence_encoders.py	1241	CODE
LOW	ludwig/encoders/sequence_encoders.py	1500	CODE
LOW	ludwig/encoders/sequence_encoders.py	1776	CODE
LOW	ludwig/encoders/date_encoders.py	67	CODE
LOW	ludwig/encoders/date_encoders.py	208	CODE
LOW	ludwig/encoders/date_encoders.py	295	CODE
LOW	ludwig/encoders/text_encoders.py	263	CODE
LOW	ludwig/encoders/text_encoders.py	404	CODE
LOW	ludwig/encoders/text_encoders.py	540	CODE
LOW	ludwig/encoders/text_encoders.py	656	CODE
LOW	ludwig/encoders/text_encoders.py	792	CODE
LOW	ludwig/encoders/text_encoders.py	944	CODE
LOW	ludwig/encoders/text_encoders.py	1062	CODE
LOW	ludwig/encoders/text_encoders.py	1209	CODE
LOW	ludwig/encoders/text_encoders.py	1316	CODE
LOW	ludwig/encoders/text_encoders.py	1462	CODE
LOW	ludwig/encoders/text_encoders.py	1588	CODE
LOW	ludwig/encoders/text_encoders.py	1725	CODE
LOW	ludwig/encoders/text_encoders.py	1849	CODE
LOW	ludwig/encoders/text_encoders.py	1978	CODE
LOW	ludwig/encoders/text_encoders.py	2092	CODE
LOW	ludwig/encoders/mamba_hybrid.py	139	CODE
LOW	ludwig/encoders/mamba_hybrid.py	236	CODE
LOW	ludwig/encoders/timeseries_encoders.py	22	CODE
LOW	ludwig/encoders/bag_encoders.py	37	CODE
LOW	ludwig/encoders/image/base.py	47	CODE
LOW	ludwig/encoders/image/base.py	173	CODE
LOW	ludwig/decoders/sequence_decoders.py	378	CODE
LOW	ludwig/decoders/sequence_decoders.py	599	CODE
34 more matches not shown…

Modern AI Meta-Vocabulary 12 hits · 40 pts

Severity	File	Line	Snippet	Context
MEDIUM⚡	ludwig/config_generation.py	28	# Extract just the key parts to fit in context window	COMMENT
MEDIUM⚡	ludwig/serve_v2.py	159	# Model manager (dependency injection)	COMMENT
MEDIUM⚡	ludwig/config_validation/checks.py	476	# If performing zero-shot, we must specify pretrained adapter weights	COMMENT
MEDIUM⚡	ludwig/config_validation/checks.py	676	# If template is NOT provided, then task is required for zero/few shot learning:	COMMENT
MEDIUM⚡	ludwig/config_validation/checks.py	682	# If a template IS provided (i.e. we are not doing a built-in zero/few-shot learning), then...	COMMENT
MEDIUM	ludwig/encoders/text_encoders.py	2441	# by the scaling factor so we can use the extended context window.	COMMENT
MEDIUM	ludwig/data/prompt.py	165	# determine if this is a few-shot or zero-shot prompt	STRING
MEDIUM	ludwig/data/prompt.py	166	# few-shot prompts require a search function that returns samples from some dataset	STRING
MEDIUM	tests/integration_tests/test_sequence_features.py	70	# setup model scaffolding to for testing	COMMENT
MEDIUM	examples/image_encoders/README.md	9	### Why pretrained encoders matter for few-shot learning	COMMENT
MEDIUM	examples/llm_few_shot_learning/simple_model_training.py	6	# a zero shot classification model. It uses the facebook/opt-350m model	COMMENT
MEDIUM	…amples/llm_zero_shot_learning/simple_model_training.py	6	# a zero shot classification model. It uses the facebook/opt-350m model	COMMENT

AI Slop Vocabulary 18 hits · 34 pts

Severity	File	Line	Snippet	Context
MEDIUM	ludwig/presets.py	43	# robust (interquartile) scaling on number features, mild-but-not-trivial FC stack, AdamW	COMMENT
MEDIUM	ludwig/api.py	1940	# use Ludwig's utility to facilitate creating a dataframe	COMMENT
MEDIUM	ludwig/explain/captum.py	372	# For a robust baseline, we take the mean of all samples from the training data.	COMMENT
MEDIUM	ludwig/config_validation/checks.py	65	"""Checks instances of comprehensive (all parameters and defaults filled in) schema-validated config."""	STRING
LOW	ludwig/datasets/loaders/misc_loaders.py	50	# mc1_targets / mc2_targets are dicts; just use best_answer as text	COMMENT
LOW	ludwig/features/vector_feature.py	240	# no overall stats, just return empty dictionary	COMMENT
LOW	ludwig/features/category_feature.py	223	# If no unknown is defined, just use the most popular token's index as the fallback index	STRING
LOW	ludwig/features/set_feature.py	327	# no overall stats, just return empty dictionary	COMMENT
LOW	ludwig/features/number_feature.py	514	# no overall stats, just return empty dictionary	COMMENT
LOW	ludwig/features/image_feature.py	1355	# no overall stats, just return empty dictionary	COMMENT
LOW	ludwig/features/timeseries_feature.py	362	# no overall stats, just return empty dictionary	COMMENT
LOW	ludwig/combiners/combiners.py	674	# todo: can we just use projector_size? # hidden_size,	COMMENT
LOW	ludwig/utils/visualization_utils.py	1497	# just use stripplots since they are categorical scatter plots.	COMMENT
LOW	ludwig/utils/llm_utils.py	573	# and just set it to a tensor of IGNORE_INDEX_TOKEN_ID so that we don't compute loss on this target tensor.	COMMENT
MEDIUM	ludwig/modules/loss_modules.py	150	# robust lambda	COMMENT
MEDIUM	…wig/config_validation/test_validate_config_combiner.py	10	# Essentially verifies that the combiner registry is not empty at import time:	COMMENT
LOW	tests/ludwig/modules/test_metric_modules.py	558	# Correct pattern: just call compute() — sync happens automatically inside.	COMMENT
MEDIUM	tests/integration_tests/test_input_feature_tied.py	26	# note: vocab parameter, below, is made up to facilitate creating input encoders	COMMENT

Redundant / Tautological Comments 20 hits · 30 pts

Severity	File	Line	Snippet	Context
LOW⚡	ludwig/collect.py	276	# Output results parameters	COMMENT
LOW⚡	ludwig/collect.py	358	# Output results parameters	COMMENT
LOW⚡	ludwig/forecast.py	102	# Output results parameters	COMMENT
LOW⚡	ludwig/predict.py	148	# Output results parameters	COMMENT
LOW⚡	ludwig/evaluate.py	151	# Output results parameters	COMMENT
LOW	ludwig/config_validation/checks.py	301	# Check if there is a text or sequence output feature using a tagger decoder	COMMENT
LOW	ludwig/features/category_feature.py	176	# Check if the fallback label is in the vocab, if not add it.	COMMENT
LOW	ludwig/utils/visualization_utils.py	335	# Set ticks to the number of properties (in radians)	COMMENT
LOW⚡	ludwig/utils/fs_utils.py	54	# Check if the cache path exists, if not create it	COMMENT
LOW	ludwig/utils/tokenizers.py	802	# Set it to eos_token to avoid NoneType errors in preprocessing.	COMMENT
LOW	ludwig/models/llm.py	756	# Check if the saved weights are merged (no adapter_config.json) or adapter-only	COMMENT
LOW	ludwig/automl/base_config.py	314	# Check if it is a nullboolean field. We do this since if you read a csv with	STRING
LOW	ludwig/schema/utils.py	152	# Check if the subclass overrides _jsonschema_type_mapping	COMMENT
LOW	ludwig/schema/utils.py	478	# Check if THIS class (or a parent) defines __post_init__	COMMENT
LOW	ludwig/schema/utils.py	1507	# Check if subclass overrides _jsonschema_type_mapping - if so, use	COMMENT
LOW	ludwig/data/batcher/test_batcher.py	42	# Check if string loading works as well	STRING
LOW	ludwig/data/batcher/test_batcher.py	93	# Check if string loading works as well	STRING
LOW	ludwig/hyperopt/run.py	189	# Check if all features are grid type parameters and log UserWarning if needed	COMMENT
LOW⚡	ludwig/hyperopt/execution.py	65	# Check if ConfigSpace 1.x (no 'q' parameter)	COMMENT
LOW	examples/kfold_cv/k-fold_cv_classification.sh	19	# Display results from K-fold cv	COMMENT

Synthetic Comment Markers 3 hits · 18 pts

Severity	File	Line	Snippet	Context
HIGH	ludwig/datasets/configs/hc3.yaml	10	answer is human-written (0) or generated by ChatGPT (1). Each source row is	CODE
HIGH	ludwig/datasets/configs/hc3_chinese.yaml	10	of whether an answer is human-written (0) or generated by ChatGPT (1).	CODE
HIGH	ludwig/schema/metadata/configs/features.yaml	339	# TODO: review metadata generated by Copilot	COMMENT

Verbosity Indicators 9 hits · 14 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/api.py	1199	# Step 1: Preprocess the initial lookback window once	COMMENT
LOW	ludwig/api.py	1230	# Step 2: Incremental prediction loop — O(horizon) steps, each O(1) preprocessing	COMMENT
LOW	ludwig/api.py	1259	# Step 3: Update embeddings incrementally for the next step.	COMMENT
LOW	ludwig/config_validation/checks.py	587	# If the backend is not explicitly set, then we need to check if a Ray process is running	COMMENT
LOW	ludwig/distributed/base.py	153	The purpose of this function is to reduce network overhead.	STRING
LOW⚡	ludwig/encoders/text_encoders.py	2531	# Step 1: Prepare quantized base model for training (freeze + cast).	COMMENT
LOW⚡	ludwig/encoders/text_encoders.py	2536	# Step 2: Initialize adapter on quantized base if not already done	COMMENT
LOW⚡	ludwig/encoders/text_encoders.py	2540	# Step 3: Load adapter weights from checkpoint	COMMENT
LOW	ludwig/data/preprocessing.py	1036	"""The purpose of this function is to balance the training dataset using either over-sampling or under-	STRING

Structural Annotation Overuse 6 hits · 11 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/api.py	1199	# Step 1: Preprocess the initial lookback window once	COMMENT
LOW	ludwig/api.py	1230	# Step 2: Incremental prediction loop — O(horizon) steps, each O(1) preprocessing	COMMENT
LOW	ludwig/api.py	1259	# Step 3: Update embeddings incrementally for the next step.	COMMENT
LOW⚡	ludwig/encoders/text_encoders.py	2531	# Step 1: Prepare quantized base model for training (freeze + cast).	COMMENT
LOW⚡	ludwig/encoders/text_encoders.py	2536	# Step 2: Initialize adapter on quantized base if not already done	COMMENT
LOW⚡	ludwig/encoders/text_encoders.py	2540	# Step 3: Load adapter weights from checkpoint	COMMENT

Fake / Example Data 5 hits · 7 pts

Severity	File	Line	Snippet	Context
LOW	tests/ludwig/schema_fields/test_fields_misc.py	155	schema_utils.StringOptions(options=["placeholder"], default="placeholder", allow_none=False),	CODE
LOW⚡	tests/ludwig/schema_fields/test_fields_misc.py	166	default="placeholder",	CODE
LOW⚡	tests/ludwig/schema_fields/test_fields_misc.py	170	schema_utils.StringOptions(options=["placeholder"], default="placeholder", allow_none=False),	CODE
LOW⚡	tests/ludwig/schema_fields/test_fields_misc.py	176	assert CustomTestSchema2.model_validate({}).foo == "placeholder"	CODE
LOW⚡	tests/ludwig/schema_fields/test_fields_misc.py	178	assert CustomTestSchema2().foo == "placeholder"	CODE

Slop Phrases 1 hit · 2 pts

Severity	File	Line	Snippet	Context
LOW	ludwig/benchmarking/examples/process_config.py	94	# make sure to return the ludwig_config	COMMENT

Example Usage Blocks 1 hit · 2 pts

Severity	File	Line	Snippet	Context
LOW	docker/build_and_push.sh	4	# Usage:	COMMENT
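
Each category heading in this report pairs a hit count with a point total, and the two can diverge: Excessive Try-Catch Wrapping has 114 hits for 117 pts, while Cross-Language Confusion has 30 hits for 161 pts. The sketch below is not part of the scanner. It only shows one way to turn heading strings copied from this report into a points-per-hit figure. The names `parse_heading`, `points_per_hit`, and `HEADING_RE` are hypothetical, and the regular expression assumes category names do not end in a digit.

```python
import re

# Heading strings copied from this report: category name, hit count, point total.
HEADINGS = [
    "Cross-Language Confusion 30 hits · 161 pts",
    "Cross-File Repetition 27 hits · 135 pts",
    "Excessive Try-Catch Wrapping 114 hits · 117 pts",
    "Slop Phrases 1 hit · 2 pts",
]

# Hypothetical pattern (an assumption, not the scanner's own format spec).
# It tolerates a missing space between the name and the count, and accepts
# both "hit" and "hits".
HEADING_RE = re.compile(
    r"^(?P<name>.*?)\s*(?P<hits>\d+) hits? · (?P<pts>\d+) pts$"
)


def parse_heading(line):
    """Split a category heading into (name, hits, pts)."""
    match = HEADING_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised heading: {line!r}")
    return match["name"], int(match["hits"]), int(match["pts"])


def points_per_hit(line):
    """Return (name, average points per hit) for one heading."""
    name, hits, pts = parse_heading(line)
    return name, pts / hits


if __name__ == "__main__":
    for heading in HEADINGS:
        name, ratio = points_per_hit(heading)
        print(f"{name}: {ratio:.2f} pts/hit")
```

A ratio near 1 means the category is mostly LOW-severity matches. A ratio near 5 means most of its matches were scored at a higher severity.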
