Repository Analysis

ludwig-ai/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

24.5 Moderate AI signal

Adjusted Score: 24.5
Raw Score: 24.5
Time Factor: 100%
Last Push: 2026-05-29
Stars: 11,708
Language: Python
Lines of Code: 230,185
Files: 1628
Pattern Hits: 3290
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 119 · MEDIUM 803 · LOW 2368
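As a quick consistency check, the four severity counts can be summed and compared with the reported Pattern Hits figure. The sketch below only restates numbers from this report; the variable names are illustrative.

```python
# Severity counts copied from the breakdown in this report.
severity_counts = {"CRITICAL": 0, "HIGH": 119, "MEDIUM": 803, "LOW": 2368}

# The total should equal the "Pattern Hits" value reported for the scan.
total_hits = sum(severity_counts.values())
print(total_hits)  # 3290

# Share of each severity level, as a percentage of all hits.
shares = {level: round(100 * n / total_hits, 1) for level, n in severity_counts.items()}
print(shares)
```

The total matches the 3290 pattern hits, and LOW findings make up roughly 72% of them.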

Pattern Findings

3290 matches across 16 categories.

Decorative Section Separators: 758 hits · 2516 pts
Severity  File  Line  Snippet
MEDIUM  ludwig/experiment_utils.py  14  # ==============================================================================
MEDIUM  ludwig/api_types.py  14  # ==============================================================================
MEDIUM  ludwig/collect.py  269  # ----------------
MEDIUM  ludwig/collect.py  271  # ----------------
MEDIUM  ludwig/collect.py  275  # -------------------------
MEDIUM  ludwig/collect.py  277  # -------------------------
MEDIUM  ludwig/collect.py  282  # ------------------
MEDIUM  ludwig/collect.py  284  # ------------------
MEDIUM  ludwig/collect.py  287  # ------------------
MEDIUM  ludwig/collect.py  289  # ------------------
MEDIUM  ludwig/collect.py  351  # ----------------
MEDIUM  ludwig/collect.py  353  # ----------------
MEDIUM  ludwig/collect.py  357  # -------------------------
MEDIUM  ludwig/collect.py  359  # -------------------------
MEDIUM  ludwig/collect.py  364  # ------------------
MEDIUM  ludwig/collect.py  366  # ------------------
MEDIUM  ludwig/collect.py  407  # ----------------
MEDIUM  ludwig/collect.py  409  # ----------------
MEDIUM  ludwig/collect.py  415  # ------------------
MEDIUM  ludwig/collect.py  417  # ------------------
MEDIUM  ludwig/collect.py  15  # ==============================================================================
MEDIUM  ludwig/collect.py  235  # ---------------
MEDIUM  ludwig/collect.py  237  # ---------------
MEDIUM  ludwig/contrib.py  14  # ==============================================================================
MEDIUM  ludwig/forecast.py  68  # ---------------
MEDIUM  ludwig/forecast.py  70  # ---------------
MEDIUM  ludwig/forecast.py  96  # ----------------
MEDIUM  ludwig/forecast.py  98  # ----------------
MEDIUM  ludwig/forecast.py  101  # -------------------------
MEDIUM  ludwig/forecast.py  103  # -------------------------
MEDIUM  ludwig/hyperopt_cli.py  15  # ==============================================================================
MEDIUM  ludwig/hyperopt_cli.py  150  # -------------------
MEDIUM  ludwig/hyperopt_cli.py  152  # -------------------
MEDIUM  ludwig/hyperopt_cli.py  161  # ----------------------------
MEDIUM  ludwig/hyperopt_cli.py  163  # ----------------------------
MEDIUM  ludwig/hyperopt_cli.py  173  # ---------------
MEDIUM  ludwig/hyperopt_cli.py  175  # ---------------
MEDIUM  ludwig/hyperopt_cli.py  226  # ----------------
MEDIUM  ludwig/hyperopt_cli.py  228  # ----------------
MEDIUM  ludwig/hyperopt_cli.py  300  # ------------------
MEDIUM  ludwig/hyperopt_cli.py  302  # ------------------
MEDIUM  ludwig/error.py  14  # ==============================================================================
MEDIUM  ludwig/upload.py  92  # ---------------
MEDIUM  ludwig/upload.py  94  # ---------------
MEDIUM  ludwig/upload.py  111  # ---------------
MEDIUM  ludwig/upload.py  113  # ---------------
MEDIUM  ludwig/preprocess.py  15  # ==============================================================================
MEDIUM  ludwig/preprocess.py  96  # ---------------
MEDIUM  ludwig/preprocess.py  98  # ---------------
MEDIUM  ludwig/preprocess.py  141  # ----------------
MEDIUM  ludwig/preprocess.py  143  # ----------------
MEDIUM  ludwig/preprocess.py  166  # ------------------
MEDIUM  ludwig/preprocess.py  168  # ------------------
MEDIUM  ludwig/globals.py  15  # ==============================================================================
MEDIUM  ludwig/constants.py  15  # ==============================================================================
MEDIUM  ludwig/predict.py  15  # ==============================================================================
MEDIUM  ludwig/predict.py  110  # ---------------
MEDIUM  ludwig/predict.py  112  # ---------------
MEDIUM  ludwig/predict.py  141  # ----------------
MEDIUM  ludwig/predict.py  143  # ----------------
698 more matches not shown…
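The report does not document how separator comments are matched. A minimal sketch of one way to flag such lines, assuming a separator is a comment whose body is nothing but a run of five or more `-`, `=`, `#`, or `*` characters (the pattern and the function name `find_separator_lines` are hypothetical):

```python
import re

# Hypothetical rule: a comment line whose body is only a run of 5+ separator characters.
SEPARATOR_RE = re.compile(r"^\s*#\s*[-=#*]{5,}\s*$")


def find_separator_lines(source: str) -> list[int]:
    """Return 1-based line numbers of comment lines that are pure separators."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if SEPARATOR_RE.match(line)
    ]


sample = """\
# ==============================================================================
import os

    # ---------------
    # Load the model
    # ---------------
x = 1  # not a separator
"""
print(find_separator_lines(sample))  # [1, 4, 6]
```

Under this rule, a banner around a section label produces two hits (the line above and the line below the label), which is consistent with the paired line numbers listed above.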
Hyper-Verbose Identifiers: 1493 hits · 1546 pts
Severity  File  Line  Snippet
LOW  ludwig/experiment_utils.py  61  def get_experiment_description(
LOW  ludwig/contrib.py  32  def add_contrib_callback_args(parser: argparse.ArgumentParser):
LOW  ludwig/api.py  224  def _initialize_llm_for_zero_shot(self, random_seed: int = default_random_seed):
LOW  ludwig/api.py  710  def _tune_batch_size_and_grad_accum(self, trainer, dataset, random_seed: int = default_random_seed):
LOW  ludwig/api.py  768  def save_dequantized_base_model(self, save_path: str) -> None:
LOW  ludwig/api.py  867  def _generate_streaming_outputs(
LOW  ludwig/api.py  903  def _generate_non_streaming_outputs(
LOW  ludwig/api.py  1772  def _preprocess_for_prediction(
LOW  ludwig/serve_ray_serve.py  43  def make_ludwig_deployment_class(num_replicas: int = 1, ray_actor_options: dict | None = None):
LOW  ludwig/config_generation.py  22  def get_ludwig_schema_context() -> str:
LOW  ludwig/config_sampling/explore_schema.py  173  def generate_possible_configs(config_options: dict[str, Any]):
LOW  ludwig/config_sampling/explore_schema.py  260  def combine_configs_for_comparator_combiner(
LOW  ludwig/config_sampling/explore_schema.py  291  def combine_configs_for_sequence_combiner(
LOW  ludwig/explain/util.py  53  def get_absolute_module_key_from_submodule(module: torch.nn.Module, submodule: torch.nn.Module):
LOW  ludwig/explain/captum.py  59  def retry_with_halved_batch_size(run_config: ExplanationRunConfig):
LOW  ludwig/explain/captum.py  69  def retry_with_halved_batch_size_fn(fn):
LOW  ludwig/explain/captum.py  70  def retry_with_halved_batch_size_wrapper(*args, **kwargs):
LOW  ludwig/explain/captum_ray.py  177  def get_total_attribution_task(
LOW  ludwig/config_validation/checks.py  178  def check_class_balance_preprocessing(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  188  def check_sampling_exclusivity(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  197  def check_validation_metric_exists(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  53  def get_config_check_registry():
LOW  ludwig/config_validation/checks.py  75  def check_feature_names_unique(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  88  def check_tied_features_valid(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  114  def check_ray_backend_in_memory_preprocessing(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  136  def check_sequence_concat_combiner_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  154  def check_comparator_combiner_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  224  def check_hf_tokenizer_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  237  def check_hf_encoder_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  250  def check_stacked_transformer_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  274  def check_hyperopt_search_algorithm_dependencies_installed(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  286  def check_hyperopt_scheduler_dependencies_installed(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  298  def check_tagger_decoder_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  329  def check_hyperopt_parameter_dicts(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  372  def check_concat_combiner_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  403  def check_hyperopt_nested_parameter_dicts(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  439  def check_llm_exactly_one_input_text_feature(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  450  def check_llm_finetuning_output_feature_config(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  466  def check_llm_finetuning_trainer_config(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  484  def check_llm_finetuning_backend_config(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  511  def check_llm_finetuning_adalora_config(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  539  def check_llm_finetuning_adaption_prompt_parameters(config: "ModelConfig"):
LOW  ludwig/config_validation/checks.py  571  def check_llm_quantization_backend_incompatibility(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  604  def check_llm_text_encoder_is_not_used_with_ecd(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  633  def check_qlora_merge_and_unload_compatibility(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  658  def check_prompt_requirements(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  720  def check_sample_ratio_and_size_compatible(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/checks.py  728  def check_grpo_requires_text_output(config: "ModelConfig") -> None:
LOW  ludwig/config_validation/preprocessing.py  1  def check_global_max_sequence_length_fits_prompt_template(metadata, global_preprocessing_parameters):
LOW  ludwig/distributed/__init__.py  40  def get_current_dist_strategy() -> DistributedStrategy:
LOW  ludwig/distributed/__init__.py  53  def get_default_strategy_name() -> str:
LOW  ludwig/distributed/accelerate.py  211  def allow_gradient_accumulation(self) -> bool:
LOW  ludwig/distributed/accelerate.py  234  def extract_model_for_serialization(cls, model):
LOW  ludwig/distributed/accelerate.py  244  def replace_model_from_serialization(cls, state):
LOW  ludwig/distributed/base.py  162  def allow_gradient_accumulation(self) -> bool:
LOW  ludwig/distributed/base.py  191  def extract_model_for_serialization(cls, model: nn.Module) -> nn.Module | tuple[nn.Module, list[dict]]:
LOW  ludwig/distributed/base.py  195  def replace_model_from_serialization(cls, state: nn.Module | tuple[nn.Module, list[dict]]) -> nn.Module:
LOW  ludwig/callbacks/__init__.py  89  def on_hyperopt_preprocessing_start(self, experiment_name: str, **kwargs):
LOW  ludwig/callbacks/__init__.py  96  def on_hyperopt_preprocessing_end(self, experiment_name: str, **kwargs):
LOW  ludwig/callbacks/__init__.py  202  def on_trainer_train_teardown(self, trainer, progress_tracker, save_path: str, is_coordinator: bool, **kwargs):
1433 more matches not shown…
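The rule behind this category is not stated in the report. One plausible reading is a length cut-off on function names; the function names shown above all appear to be at least 25 characters long. The sketch below assumes such a cut-off. `MAX_NAME_LENGTH` and `find_long_function_names` are hypothetical names, and the threshold is a guess rather than the scanner's documented value.

```python
import ast

# Hypothetical threshold: names longer than this are reported.
MAX_NAME_LENGTH = 24


def find_long_function_names(source: str, max_length: int = MAX_NAME_LENGTH) -> list[tuple[int, str]]:
    """Return (line, name) for every function whose name exceeds max_length characters."""
    tree = ast.parse(source)
    return sorted(
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and len(node.name) > max_length
    )


sample = '''
def get_experiment_description():
    pass

def train():
    pass

class Strategy:
    def allow_gradient_accumulation(self):
        return True
'''
print(find_long_function_names(sample))
# [(2, 'get_experiment_description'), (9, 'allow_gradient_accumulation')]
```

A pure length rule like this would flag descriptive names such as the `check_*` validators listed above regardless of how readable they are, which may explain why every hit in this category is rated LOW.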
Cross-File Repetition: 70 hits · 350 pts
Severity  File  Line  Snippet
HIGH  ludwig/contribs/comet.py  0  class that defines the methods necessary to hook into process.
HIGH  ludwig/contribs/wandb.py  0  class that defines the methods necessary to hook into process.
HIGH  ludwig/contribs/aim.py  0  class that defines the methods necessary to hook into process.
HIGH  ludwig/benchmarking/profiler_callbacks.py  0  class that defines the methods necessary to hook into process.
HIGH  tests/integration_tests/test_experiment.py  0  class that defines the methods necessary to hook into process.
HIGH  tests/integration_tests/test_experiment.py  0  class that defines the methods necessary to hook into process.
HIGH  ludwig/utils/metric_utils.py  0  returns a dict of output_feature_name -> list of metric names.
HIGH  ludwig/utils/metric_utils.py  0  returns a dict of output_feature_name -> list of metric names.
HIGH  ludwig/utils/metric_utils.py  0  returns a dict of output_feature_name -> list of metric names.
HIGH  ludwig/models/llm.py  0  returns init arguments for constructing this model.
HIGH  ludwig/models/ecd.py  0  returns init arguments for constructing this model.
HIGH  ludwig/models/base.py  0  returns init arguments for constructing this model.
HIGH  ludwig/schema/split.py  0  this dataclass generates a schema for the fixed splitting config.
HIGH  ludwig/schema/split.py  0  this dataclass generates a schema for the fixed splitting config.
HIGH  ludwig/schema/split.py  0  this dataclass generates a schema for the fixed splitting config.
HIGH  ludwig/modules/loss_modules.py  0  params: class_weights: list or 1d tensor of length equal to number of classes.
HIGH  ludwig/modules/loss_modules.py  0  params: class_weights: list or 1d tensor of length equal to number of classes.
HIGH  ludwig/modules/loss_modules.py  0  params: class_weights: list or 1d tensor of length equal to number of classes.
HIGH  ludwig/modules/loss_modules.py  0  params: preds: float tensor [b, c] of class logits. target: long tensor [b] of integer class indices.
HIGH  ludwig/modules/loss_modules.py  0  params: preds: float tensor [b, c] of class logits. target: long tensor [b] of integer class indices.
HIGH  ludwig/modules/loss_modules.py  0  params: preds: float tensor [b, c] of class logits. target: long tensor [b] of integer class indices.
HIGH  ludwig/modules/convolutional_modules.py  0  returns the size of the input tensor without the batch dimension.
HIGH  ludwig/modules/convolutional_modules.py  0  returns the size of the input tensor without the batch dimension.
HIGH  ludwig/modules/convolutional_modules.py  0  returns the size of the input tensor without the batch dimension.
HIGH  ludwig/modules/convolutional_modules.py  0  returns the size of the input tensor without the batch dimension.
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/test_visualization.py  0  ensure pdf and png figures from the experiments can be saved. :param csv_filename: csv fixture from tests.conftest.csv_f
HIGH  tests/integration_tests/utils.py  0  helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu
HIGH  tests/integration_tests/test_api.py  0  helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu
HIGH  tests/integration_tests/test_api.py  0  helper method to avoid code repetition in running an experiment. :param input_features: input schema :param output_featu
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
HIGH  tests/integration_tests/test_visualization_api.py  0  ensure pdf and png figures can be saved via visualization api call. :param experiment_to_use: object containing trained
10 more matches not shown…
Unused Imports: 321 hits · 282 pts
Severity  File  Line  Snippet
LOW  ludwig/__init__.py  18
LOW  ludwig/types.py  9
LOW  ludwig/serve_ray_serve.py  24
LOW  ludwig/serve_kserve.py  18
LOW  ludwig/config_validation/validation.py  13
LOW  ludwig/config_validation/validation.py  14
LOW  ludwig/config_validation/validation.py  14
LOW  ludwig/config_validation/validation.py  15
LOW  ludwig/config_validation/validation.py  16
LOW  ludwig/config_validation/validation.py  16
LOW  ludwig/config_validation/validation.py  26
LOW  ludwig/distributed/accelerate.py  170
LOW  ludwig/distributed/base.py  1
LOW  ludwig/datasets/__init__.py  21
LOW  ludwig/datasets/__init__.py  22
LOW  ludwig/datasets/loaders/misc_loaders.py  3
LOW  ludwig/datasets/loaders/multilabel_loader.py  6
LOW  ludwig/datasets/loaders/qa_loader.py  8
LOW  ludwig/datasets/loaders/hugging_face.py  15
LOW  ludwig/datasets/loaders/multiple_choice_loader.py  9
LOW  ludwig/datasets/loaders/ner_loader.py  8
LOW  ludwig/datasets/loaders/openml_loader.py  15
LOW  ludwig/datasets/loaders/dataset_loader.py  15
LOW  ludwig/datasets/loaders/vqa_loader.py  8
LOW  ludwig/datasets/loaders/code_loader.py  3
LOW  ludwig/datasets/loaders/translation_loader.py  8
LOW  ludwig/features/transforms.py  7
LOW  ludwig/features/feature_registries.py  58
LOW  ludwig/features/feature_registries.py  59
LOW  ludwig/features/text_feature.py  16
LOW  ludwig/features/timeseries_feature.py  37
LOW  ludwig/contribs/__init__.py  49
LOW  ludwig/contribs/__init__.py  59
LOW  ludwig/contribs/__init__.py  69
LOW  ludwig/combiners/combiners.py  27
LOW  ludwig/combiners/__init__.py  2
LOW  ludwig/combiners/tabpfn_v2_combiner.py  20
LOW  ludwig/combiners/tabpfn_v2_combiner.py  64
LOW  ludwig/utils/dataset_quality.py  33
LOW  ludwig/utils/trainer_utils.py  23
LOW  ludwig/utils/trainer_utils.py  24
LOW  ludwig/utils/hf_utils.py  1
LOW  ludwig/utils/misc_utils.py  36
LOW  ludwig/utils/llm_utils.py  1
LOW  ludwig/utils/checkpoint_utils.py  27
LOW  ludwig/utils/checkpoint_utils.py  28
LOW  ludwig/utils/upload_utils.py  1
LOW  ludwig/utils/entmax/__init__.py  3
LOW  ludwig/utils/entmax/__init__.py  3
LOW  ludwig/utils/entmax/__init__.py  3
LOW  ludwig/utils/entmax/__init__.py  3
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  4
LOW  ludwig/utils/entmax/__init__.py  14
261 more matches not shown…
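The report lists only file and line for these hits, so the detection rule is unknown. A minimal sketch of a common approach, using the standard-library `ast` module to compare imported names against names referenced in the same file (the function name `find_unused_imports` is hypothetical):

```python
import ast


def find_unused_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for imported names never referenced in the same file."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import statement
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
    # Every bare name that is read or written anywhere in the module.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted((lineno, name) for name, lineno in imported.items() if name not in used)


sample = '''
import os
import sys
from typing import Any, Optional

def cwd() -> Optional[str]:
    return os.getcwd()
'''
print(find_unused_imports(sample))  # [(3, 'sys'), (4, 'Any')]
```

A name-usage check like this one also reports imports that exist only to be re-exported, a common pattern in `__init__.py` files, so hits in such files are worth reviewing by hand before removal.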
Over-Commented Block: 221 hits · 216 pts
Severity  File  Line  Snippet
LOW  .protolint.yaml  1  # Adapted from
LOW  ludwig/experiment_utils.py  1  # Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.
LOW  ludwig/api_types.py  1  # Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.
LOW  ludwig/collect.py  1  #! /usr/bin/env python
LOW  ludwig/contrib.py  1  # Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.
LOW  ludwig/hyperopt_cli.py  1  #! /usr/bin/env python
LOW  ludwig/error.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/preprocess.py  1  #! /usr/bin/env python
LOW  ludwig/globals.py  1  #! /usr/bin/env python
LOW  ludwig/constants.py  1  #! /usr/bin/env python
LOW  ludwig/predict.py  1  #! /usr/bin/env python
LOW  ludwig/__init__.py  1  # Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.
LOW  ludwig/serve.py  1  #! /usr/bin/env python
LOW  ludwig/export.py  1  #! /usr/bin/env python
LOW  ludwig/api.py  1  # !/usr/bin/env python
LOW  ludwig/experiment.py  1  #! /usr/bin/env python
LOW  ludwig/cli.py  1  #! /usr/bin/env python
LOW  ludwig/train.py  1  #! /usr/bin/env python
LOW  ludwig/evaluate.py  1  #! /usr/bin/env python
LOW  ludwig/callbacks/__init__.py  1  # !/usr/bin/env python
LOW  ludwig/datasets/dataset_config.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/archives.py  1  #! /usr/bin/env python
LOW  ludwig/datasets/loaders/ieee_fraud.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/adult_census_income.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/kdd_loader.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/hugging_face.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/goemotions.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/higgs.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/split_loaders.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/ethos_binary.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/consumer_complaints_loader.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/creditcard_fraud.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/sarcastic_headlines.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/openml_loader.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/naval.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/dataset_loader.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/agnews.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/sarcos.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/insurance_lite.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/rossman_store_sales.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/allstate_claims_severity.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/forest_cover.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/forest_cover.py  61  # Elevation quantitative meters Elevation in meters
LOW  ludwig/datasets/loaders/code_alpaca_loader.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/sst.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/santander_value_prediction.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/flickr8k.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/datasets/loaders/camseq.py  1  # Copyright (c) 2023 Aizen Corp.
LOW  ludwig/datasets/loaders/mnist.py  1  # Copyright (c) 2022 Predibase, Inc.
LOW  ludwig/features/feature_registries.py  1  # Copyright (c) 2023 Predibase, Inc., 2019 Uber Technologies, Inc.
LOW  ludwig/features/vector_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/category_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/binary_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/text_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/text_feature.py  481  #
LOW  ludwig/features/set_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/bag_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/number_feature.py  1  #! /usr/bin/env python
LOW  ludwig/features/feature_utils.py  1  #! /usr/bin/env python
LOW  ludwig/features/date_feature.py  1  #! /usr/bin/env python
161 more matches not shown…
Deep Nesting: 184 hits · 170 pts
Severity  File  Line  Snippet
LOW  ludwig/serve_vllm.py  27
LOW  ludwig/serve_vllm.py  116
LOW  ludwig/serve.py  132
LOW  ludwig/serve.py  421
LOW  ludwig/inspect_model.py  12
LOW  ludwig/api.py  1865
LOW  ludwig/api.py  232
LOW  ludwig/api.py  1161
LOW  ludwig/serve_v2.py  96
LOW  ludwig/config_sampling/explore_schema.py  22
LOW  ludwig/config_sampling/explore_schema.py  173
LOW  ludwig/config_sampling/explore_schema.py  291
LOW  ludwig/config_sampling/parameter_sampling.py  10
LOW  ludwig/explain/captum.py  351
LOW  ludwig/explain/captum.py  379
LOW  ludwig/explain/captum_ray.py  32
LOW  ludwig/config_validation/checks.py  224
LOW  ludwig/config_validation/checks.py  237
LOW  ludwig/config_validation/checks.py  329
LOW  ludwig/callbacks/studio.py  183
LOW  ludwig/datasets/archives.py  41
LOW  ludwig/datasets/archives.py  67
LOW  ludwig/datasets/archives.py  90
LOW  ludwig/datasets/loaders/newyorker_caption_contest.py  19
LOW  ludwig/datasets/loaders/dataset_loader.py  256
LOW  ludwig/datasets/loaders/dataset_loader.py  333
LOW  ludwig/datasets/loaders/dataset_loader.py  363
LOW  ludwig/datasets/loaders/dataset_loader.py  406
LOW  ludwig/datasets/loaders/dataset_loader.py  425
LOW  ludwig/datasets/loaders/sst.py  54
LOW  ludwig/datasets/loaders/sst.py  278
LOW  ludwig/datasets/loaders/vqa_loader.py  25
LOW  ludwig/datasets/loaders/vqa_loader.py  51
LOW  ludwig/datasets/loaders/flickr8k.py  23
LOW  ludwig/datasets/loaders/mnist.py  109
LOW  ludwig/features/vector_feature.py  44
LOW  ludwig/features/category_feature.py  400
LOW  ludwig/features/text_feature.py  324
LOW  ludwig/features/number_feature.py  329
LOW  ludwig/features/date_feature.py  72
LOW  ludwig/features/image_feature.py  146
LOW  ludwig/features/image_feature.py  499
LOW  ludwig/features/image_feature.py  905
LOW  ludwig/features/sequence_feature.py  380
LOW  ludwig/features/audio_feature.py  81
LOW  ludwig/features/audio_feature.py  242
LOW  ludwig/features/audio_feature.py  438
LOW  ludwig/contribs/mlflow/mlflow3.py  170
LOW  ludwig/combiners/combiners.py  254
LOW  ludwig/combiners/combiners.py  589
LOW  ludwig/utils/visualization_utils.py  1398
LOW  ludwig/utils/sequence_packing.py  27
LOW  ludwig/utils/image_utils.py  127
LOW  ludwig/utils/dataset_quality.py  268
LOW  ludwig/utils/batch_size_tuner.py  19
LOW  ludwig/utils/trainer_utils.py  291
LOW  ludwig/utils/training_report.py  16
LOW  ludwig/utils/model_export.py  172
LOW  ludwig/utils/algorithms_utils.py  19
LOW  ludwig/utils/misc_utils.py  69
124 more matches not shown…
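Several line numbers in this table coincide with function definitions listed elsewhere in the report (for example `ludwig/config_validation/checks.py` lines 224, 237, and 329), which suggests nesting is measured per function. The sketch below assumes that reading. The set of block statements counted, the threshold of 4, and the function names are all hypothetical.

```python
import ast

# Statements that open a new nesting level in this sketch.
NESTING_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.With, ast.AsyncWith, ast.Try)


def max_nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest block nesting found below `node`.

    Note: an `elif` is stored as an `If` inside `orelse`, so this simple
    walk counts each `elif` as one level deeper than the `if` before it.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


def deeply_nested_functions(source: str, threshold: int = 4) -> list[tuple[int, str, int]]:
    """Return (line, name, depth) for functions nested at least `threshold` levels deep."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting_depth(node)
            if depth >= threshold:
                results.append((node.lineno, node.name, depth))
    return sorted(results)


sample = '''
def load(paths):
    for path in paths:
        if path:
            try:
                with open(path) as handle:
                    return handle.read()
            except OSError:
                pass

def simple(x):
    if x:
        return 1
    return 0
'''
print(deeply_nested_functions(sample))  # [(2, 'load', 4)]
```

Reporting the `def` line rather than the innermost statement keeps one hit per function, which would match the single line number per row shown above.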
Cross-Language Confusion: 30 hits · 158 pts
Severity  File  Line  Snippet
HIGH  ludwig/config_validation/checks.py  663  # TODO: `prompt` by default should be set to null, not a default dict:
HIGH  ludwig/config_validation/checks.py  692  # TODO: retrieval by default should be set to null, not a default dict:
HIGH  ludwig/utils/output_feature_utils.py  118  "features, or disabling the bucketing setting bucketing_field to None / null, "
HIGH  ludwig/schema/trainer.py  498  "are inversely proportional to this vector. When null, a uniform preference is used."
HIGH  ludwig/schema/features/loss/loss.py  87  return "[undefined]"
HIGH  ludwig/schema/features/preprocessing/date.py  38  description="This parameter can either be a datetime format string, or null, in which case the datetime "
HIGH  ludwig/schema/combiners/common_transformer_options.py  60  description="The number of stacked fully connected layers (only applies if `reduce_output` is not null).",
HIGH  ludwig/schema/combiners/tabnet.py  80  description="Size of the virtual batch size used by ghost batch norm. If null, regular batch norm is used "
HIGH  ludwig/schema/encoders/sequence_encoders.py  23  `[{filter_size: 7, pool_size: 3}, {filter_size: 7, pool_size: 3}, {filter_size: 3, pool_size: null},
HIGH  ludwig/schema/encoders/sequence_encoders.py  24  {filter_size: 3, pool_size: null}, {filter_size: 3, pool_size: null}, {filter_size: 3, pool_size: 3}]`.
HIGH  ludwig/schema/encoders/sequence_encoders.py  408  description="If stacked_layers is null, this is the number of elements in the stack of parallel convolutional "
HIGH  ludwig/schema/encoders/image/base.py  216  "each layer. It indicates the normalization applied to the activations and can be null, "
HIGH  ludwig/schema/encoders/image/base.py  279  "each layer. It indicates the norm of the output and can be null, batch or layer.",
HIGH  ludwig/schema/llms/peft.py  1456  "Per-source weights; must have the same length as `sources`. If null, all weights default to 1.0."
HIGH  ludwig/schema/llms/peft.py  1513  "If null, the first entry in `adapters` is used. Set this to a merged adapter "
HIGH  ludwig/modules/preference_losses.py  135  + beta * KL(policy || reference)
HIGH  tests/ludwig/utils/test_dataframe_utils.py  85  assert scalar_df.equals(expected_df)
HIGH  tests/ludwig/utils/test_data_utils.py  51  assert df.equals(
HIGH  tests/ludwig/utils/test_data_utils.py  65  assert df.equals(
HIGH  tests/ludwig/utils/test_data_utils.py  81  assert df.equals(pd.DataFrame([1, 2, 3, 4, 5], columns=["x"]))
HIGH  tests/ludwig/utils/test_dataset_utils.py  34  assert split_df.equals(
HIGH  tests/ludwig/utils/test_dataset_utils.py  89  assert split_df.equals(
HIGH  tests/ludwig/utils/test_dataset_utils.py  144  assert split_df.equals(
HIGH  tests/ludwig/utils/test_dataset_utils.py  199  assert split_df.equals(
HIGH  tests/ludwig/data/test_split.py  81  assert not s1.equals(s2)
HIGH  tests/ludwig/data/test_split.py  85  assert s1.equals(s3)
HIGH  tests/ludwig/data/test_split.py  228  assert not s1.equals(s2)
HIGH  tests/ludwig/data/test_split.py  235  assert s1.equals(s3)
HIGH  tests/integration_tests/test_visualization.py  1605  assert ground_truth_train_split.equals(pd.Series([0]))
HIGH  tests/integration_tests/test_mlflow.py  81  assert pred_df.equals(expected_df)
Excessive Try-Catch Wrapping · 112 hits · 115 pts
Severity File Line Snippet
LOW ludwig/check.py 32 except Exception:
LOW ludwig/serve.py 264 except Exception:
LOW ludwig/serve.py 308 except Exception as exc:
LOW ludwig/serve.py 331 except Exception:
LOW ludwig/serve.py 372 except Exception:
LOW ludwig/api.py 601 except Exception:
LOW ludwig/api.py 617 except Exception:
LOW ludwig/api.py 1551 except Exception:
LOW ludwig/config_generation.py 101 except Exception as exc:
LOW ludwig/config_generation.py 196 except Exception as e:
LOW ludwig/serve_v2.py 324 except Exception as exc:
LOW ludwig/serve_v2.py 344 except Exception as exc:
LOW ludwig/serve_v2.py 370 except Exception as exc:
LOW ludwig/datasets/__init__.py 320 except Exception as e:
LOW ludwig/datasets/loaders/openml_loader.py 150 except Exception as exc:
LOW ludwig/datasets/loaders/dataset_loader.py 264 except Exception as e:
LOW ludwig/datasets/loaders/dataset_loader.py 274 except Exception as fallback_e:
LOW ludwig/datasets/loaders/dataset_loader.py 285 except Exception:
LOW ludwig/datasets/loaders/dataset_loader.py 291 except Exception:
LOW ludwig/datasets/loaders/dataset_loader.py 345 except Exception:
LOW ludwig/features/binary_feature.py 174 except Exception as e:
LOW ludwig/features/text_feature.py 365 except Exception:
LOW ludwig/features/date_feature.py 85 except Exception as e:
MEDIUM ludwig/features/date_feature.py 72 def date_to_list(date_value, datetime_format, preprocessing_parameters):
LOW ludwig/features/anomaly_feature.py 358 except Exception as e:
LOW ludwig/features/anomaly_feature.py 367 except Exception as e:
LOW ludwig/features/anomaly_feature.py 384 except Exception as e:
LOW ludwig/features/base_feature.py 296 except Exception:
LOW ludwig/features/base_feature.py 514 except Exception as e:
LOW ludwig/contribs/comet.py 50 except Exception:
LOW ludwig/contribs/comet.py 105 except Exception:
LOW ludwig/contribs/comet.py 112 except Exception:
LOW ludwig/contribs/aim.py 42 except Exception:
LOW ludwig/contribs/mlflow/mlflow3.py 73 except Exception:
LOW ludwig/contribs/mlflow/mlflow3.py 90 except Exception as e:
LOW ludwig/contribs/mlflow/mlflow3.py 97 except Exception as e:
LOW ludwig/contribs/mlflow/mlflow3.py 139 except Exception as e:
LOW ludwig/contribs/mlflow/mlflow3.py 166 except Exception:
LOW ludwig/contribs/mlflow/mlflow3.py 189 except Exception:
LOW ludwig/contribs/mlflow/mlflow3.py 205 except Exception as e:
LOW ludwig/utils/image_utils.py 116 except Exception as e:
LOW ludwig/utils/image_utils.py 197 except Exception:
LOW ludwig/utils/image_utils.py 209 except Exception:
LOW ludwig/utils/image_utils.py 228 except Exception:
LOW ludwig/utils/hf_utils.py 108 except Exception:
LOW ludwig/utils/hf_utils.py 185 except Exception as e:
LOW ludwig/utils/model_export.py 68 except Exception as e:
LOW ludwig/utils/model_export.py 111 except Exception as e:
LOW ludwig/utils/model_export.py 122 except Exception as e2:
LOW ludwig/utils/audio_utils.py 73 except Exception:
LOW ludwig/utils/audio_utils.py 84 except Exception:
LOW ludwig/utils/output_feature_utils.py 110 except Exception as e:
LOW ludwig/utils/checkpoint_utils.py 174 except Exception as e:
LOW ludwig/utils/checkpoint_utils.py 394 except Exception:
LOW ludwig/utils/fs_utils.py 97 except Exception:
LOW ludwig/backend/datasource.py 50 except Exception as e:
LOW ludwig/visualize/_utils.py 286 except Exception:
LOW ludwig/schema/utils.py 404 except Exception as e:
LOW ludwig/schema/features/augmentation/utils.py 141 except Exception as e:
LOW ludwig/schema/features/preprocessing/utils.py 80 except Exception as e:
52 more matches not shown…
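The report does not show how these hits are detected. As a rough illustration only, the sketch below flags the same kind of handler (`except Exception` and bare `except:`) by walking the Python syntax tree. The function name `find_broad_excepts` and the `sample` snippet are hypothetical and are not part of the scanner or of Ludwig; the scanner's real rule may differ, for example by matching on text rather than on the AST.

```python
import ast


def find_broad_excepts(source: str) -> list[int]:
    """Return the line numbers of broad exception handlers in `source`.

    A handler counts as broad here if it is a bare `except:` or catches
    the name `Exception`. This is an assumed rule, not the scanner's own.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        # node.type is None for a bare `except:` clause.
        is_bare = node.type is None
        is_exception = isinstance(node.type, ast.Name) and node.type.id == "Exception"
        if is_bare or is_exception:
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical input: the first handler is broad, the second is specific.
sample = """\
try:
    risky()
except Exception:
    pass
try:
    risky()
except ValueError as exc:
    raise
"""

print(find_broad_excepts(sample))  # [3]
```

Because the source is only parsed and never executed, the undefined `risky()` call in the sample is harmless. Handlers that catch a tuple such as `(Exception, OSError)` are deliberately not flagged in this sketch.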
Self-Referential Comments · 33 hits · 97 pts
Severity File Line Snippet
MEDIUM ludwig/api.py 177 # Initialize the config object
MEDIUM ludwig/api.py 804 # Create the LLM model class instance with the loaded LLM if it hasn't been initialized yet.
MEDIUM ludwig/datasets/loaders/dataset_loader.py 490 ) # This function is defined in the Hugging Face dataloader
MEDIUM ludwig/features/timeseries_feature.py 89 # Create the list of shifts we want to perform over the series.
MEDIUM ludwig/combiners/combiners.py 57 """This class provides an opaque handle to the input features, preventing them from being registered as state.
MEDIUM ludwig/utils/hf_utils.py 168 # Create the repo if it doesn't exist. This is a no-op if the repo already exists
MEDIUM ludwig/utils/audio_utils.py 177 # The following code for FBank is adapted from jameslyons/python_speech_features
MEDIUM ludwig/utils/llm_quantization_utils.py 25 # Create a new Linear layer with the same shape
MEDIUM ludwig/utils/fs_utils.py 65 # Create a windows compatible path from url path
MEDIUM ludwig/backend/datasource.py 55 # Create a dataset from the paths and indices, then map to read files
MEDIUM ludwig/modules/optimization_modules.py 50 # Create a dict of parameters to be passed to torch (i.e. everything except `type`):
MEDIUM ludwig/modules/convolutional_modules.py 1273 # The following code for ResNet is adapted from the TensorFlow implementation
MEDIUM ludwig/data/dataset_synthesizer.py 400 # Create a Random Image
MEDIUM ludwig/hyperopt/run.py 171 # Initialize config object
MEDIUM tests/ludwig/utils/test_upload_utils.py 29 # Create a temporary folder designating training output directory.
MEDIUM tests/ludwig/utils/test_upload_utils.py 48 # Create a temporary folder designating training output directory.
MEDIUM tests/ludwig/utils/test_hf_utils.py 55 # Create a temporary folder
MEDIUM tests/ludwig/utils/test_hf_utils.py 58 # Create a file within the temporary folder
MEDIUM tests/ludwig/utils/test_model_utils.py 21 # Create a sample model
MEDIUM tests/ludwig/utils/test_model_utils.py 46 # Create a sample model
MEDIUM tests/ludwig/utils/test_model_utils.py 52 # Create a new device for testing
MEDIUM tests/ludwig/automl/test_base_config.py 130 # Create a temporary directory to store the parquet file
MEDIUM tests/ludwig/automl/test_base_config.py 133 # Create a dataframe with all the types
MEDIUM tests/ludwig/decoders/test_llm_decoders.py 54 # Create a Boolean mask for elements equal to 0 or 2 (padding or output)
MEDIUM examples/mnist/advanced_model_training.py 15 # ## Import required libraries
MEDIUM examples/mnist/assess_model_performance.py 9 # ## Import required libraries
MEDIUM examples/llm_text_generation/simple_model_training.py 9 # Import required libraries
MEDIUM examples/llm_few_shot_learning/simple_model_training.py 9 # Import required libraries
MEDIUM examples/class_imbalance/model_training.py 10 # Import required libraries
MEDIUM examples/titanic/simple_model_training.py 8 # Import required libraries
MEDIUM examples/titanic/multiple_model_training.py 10 # ## Import required libraries
MEDIUM examples/insurance_lite/train.py 5 # Import required libraries
MEDIUM …amples/llm_zero_shot_learning/simple_model_training.py 9 # Import required libraries
Docstring Block Structure · 16 hits · 80 pts
Severity File Line Snippet
HIGH ludwig/api.py 769 Upscales quantized weights of a model to fp16 and saves the result in a specified folder. Args: sav
HIGH ludwig/api.py 1497 Preprocess a dataset and return it split into training / validation / test sets. Args: dataset: Sou
HIGH ludwig/config_generation.py 112 Generate a Ludwig config from a natural language task description. Uses an LLM to translate the description into a
HIGH ludwig/features/image_feature.py 440 Returns a torchvision transform that is compatible with the model variant. Note that the raw torchvision transform
HIGH ludwig/utils/date_utils.py 67 Convert a numeric timestamp to a datetime object. `datetime` objects can be created from POSIX timestamps like thos
HIGH ludwig/utils/trainer_utils.py 544 Freezes layers in a model whose names match a specified regular expression pattern. This function iterates over all
HIGH ludwig/utils/hf_utils.py 128 Uploads a local folder to the Hugging Face Model Hub. Args: repo_id (str): The ID of the target repository
HIGH ludwig/schema/utils.py 76 Deserialize a value into a config instance. Handles the common pattern of checking if a value is a raw dict
HIGH ludwig/modules/training_hooks.py 25 Abstract method to be implemented by subclasses. This is the method that defines the custom behavior of the trai
HIGH ludwig/modules/convolutional_modules.py 1293 Retrieve the size of each block_layer in the ResNet model. The number of block layers used for the Resnet model var
HIGH ludwig/data/preprocessing.py 523 Builds a dataset from a dataframe and a list of features. Args: config: A dictionary containing the Ludwig
HIGH ludwig/data/preprocessing.py 991 The purpose of this function is to balance the training dataset using either over-sampling or under- sampling.
HIGH tests/ludwig/encoders/test_llm_encoders.py 66 Get the PEFT paramter name prefix for a given adapter type. Args: adapter: A valid config value for
HIGH tests/integration_tests/utils.py 212 Helper method to generate synthetic data based on input, output feature specs. Args: input_features: schema
HIGH tests/integration_tests/utils.py 1071 Asserts that the preprocessed dataset has the correct shape and dtype for a given feature type. Args: featu
HIGH tests/integration_tests/parameter_update_utils.py 24 Reports on the number of parameters in a Ludwig component and their update status. Args: module: (Ludwi
AI Slop Vocabulary · 18 hits · 34 pts
Severity File Line Snippet
MEDIUM ludwig/presets.py 43 # robust (interquartile) scaling on number features, mild-but-not-trivial FC stack, AdamW
MEDIUM ludwig/api.py 1940 # use Ludwig's utility to facilitate creating a dataframe
MEDIUM ludwig/explain/captum.py 372 # For a robust baseline, we take the mean of all samples from the training data.
MEDIUM ludwig/config_validation/checks.py 65 """Checks instances of comprehensive (all parameters and defaults filled in) schema-validated config."""
LOW ludwig/datasets/loaders/misc_loaders.py 50 # mc1_targets / mc2_targets are dicts; just use best_answer as text
LOW ludwig/features/vector_feature.py 240 # no overall stats, just return empty dictionary
LOW ludwig/features/category_feature.py 223 # If no unknown is defined, just use the most popular token's index as the fallback index
LOW ludwig/features/set_feature.py 327 # no overall stats, just return empty dictionary
LOW ludwig/features/number_feature.py 514 # no overall stats, just return empty dictionary
LOW ludwig/features/image_feature.py 1355 # no overall stats, just return empty dictionary
LOW ludwig/features/timeseries_feature.py 362 # no overall stats, just return empty dictionary
LOW ludwig/combiners/combiners.py 674 # todo: can we just use projector_size? # hidden_size,
LOW ludwig/utils/visualization_utils.py 1497 # just use stripplots since they are categorical scatter plots.
LOW ludwig/utils/llm_utils.py 573 # and just set it to a tensor of IGNORE_INDEX_TOKEN_ID so that we don't compute loss on this target tensor.
MEDIUM ludwig/modules/loss_modules.py 150 # robust lambda
MEDIUM …wig/config_validation/test_validate_config_combiner.py 10 # Essentially verifies that the combiner registry is not empty at import time:
LOW tests/ludwig/modules/test_metric_modules.py 558 # Correct pattern: just call compute() — sync happens automatically inside.
MEDIUM tests/integration_tests/test_input_feature_tied.py 26 # note: vocab parameter, below, is made up to facilitate creating input encoders
Redundant / Tautological Comments · 20 hits · 28 pts
Severity File Line Snippet
LOW ludwig/collect.py 276 # Output results parameters
LOW ludwig/collect.py 358 # Output results parameters
LOW ludwig/forecast.py 102 # Output results parameters
LOW ludwig/predict.py 148 # Output results parameters
LOW ludwig/evaluate.py 151 # Output results parameters
LOW ludwig/config_validation/checks.py 301 # Check if there is a text or sequence output feature using a tagger decoder
LOW ludwig/features/category_feature.py 176 # Check if the fallback label is in the vocab, if not add it.
LOW ludwig/utils/visualization_utils.py 335 # Set ticks to the number of properties (in radians)
LOW ludwig/utils/fs_utils.py 54 # Check if the cache path exists, if not create it
LOW ludwig/utils/tokenizers.py 802 # Set it to eos_token to avoid NoneType errors in preprocessing.
LOW ludwig/models/llm.py 756 # Check if the saved weights are merged (no adapter_config.json) or adapter-only
LOW ludwig/automl/base_config.py 314 # Check if it is a nullboolean field. We do this since if you read a csv with
LOW ludwig/schema/utils.py 152 # Check if the subclass overrides _jsonschema_type_mapping
LOW ludwig/schema/utils.py 478 # Check if THIS class (or a parent) defines __post_init__
LOW ludwig/schema/utils.py 1507 # Check if subclass overrides _jsonschema_type_mapping - if so, use
LOW ludwig/data/batcher/test_batcher.py 42 # Check if string loading works as well
LOW ludwig/data/batcher/test_batcher.py 93 # Check if string loading works as well
LOW ludwig/hyperopt/run.py 189 # Check if all features are grid type parameters and log UserWarning if needed
LOW ludwig/hyperopt/execution.py 65 # Check if ConfigSpace 1.x (no 'q' parameter)
LOW examples/kfold_cv/k-fold_cv_classification.sh 19 # Display results from K-fold cv
Synthetic Comment Markers · 3 hits · 18 pts
Severity File Line Snippet
HIGH ludwig/datasets/configs/hc3.yaml 10 answer is human-written (0) or generated by ChatGPT (1). Each source row is
HIGH ludwig/datasets/configs/hc3_chinese.yaml 10 of whether an answer is human-written (0) or generated by ChatGPT (1).
HIGH ludwig/schema/metadata/configs/features.yaml 339 # TODO: review metadata generated by Copilot
Verbosity Indicators · 9 hits · 14 pts
Severity File Line Snippet
LOW ludwig/api.py 1199 # Step 1: Preprocess the initial lookback window once
LOW ludwig/api.py 1230 # Step 2: Incremental prediction loop — O(horizon) steps, each O(1) preprocessing
LOW ludwig/api.py 1259 # Step 3: Update embeddings incrementally for the next step.
LOW ludwig/config_validation/checks.py 587 # If the backend is not explicitly set, then we need to check if a Ray process is running
LOW ludwig/distributed/base.py 153 The purpose of this function is to reduce network overhead.
LOW ludwig/encoders/text_encoders.py 2531 # Step 1: Prepare quantized base model for training (freeze + cast).
LOW ludwig/encoders/text_encoders.py 2536 # Step 2: Initialize adapter on quantized base if not already done
LOW ludwig/encoders/text_encoders.py 2540 # Step 3: Load adapter weights from checkpoint
LOW ludwig/data/preprocessing.py 991 """The purpose of this function is to balance the training dataset using either over-sampling or under-
Slop Phrases · 1 hit · 2 pts
Severity File Line Snippet
LOW ludwig/benchmarking/examples/process_config.py 94 # make sure to return the ludwig_config
Example Usage Blocks · 1 hit · 2 pts
Severity File Line Snippet
LOW docker/build_and_push.sh 4 # Usage: