Repository Analysis

NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Score: 10.8 (Low AI signal)
Adjusted Score: 10.8
Raw Score: 10.8
Time Factor: 100%
Last Push: 2026-05-30
Stars: 16,514
Language: Python
Lines of Code: 894,390
Files: 2675
Pattern Hits: 6402
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 30 · HIGH 213 · MEDIUM 885 · LOW 5274

Pattern Findings

6402 matches across 19 categories.
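The severity counts can be cross-checked against the reported total. The sketch below copies the four counts and the 6402 total from this report; the helper name `severity_shares` is illustrative and not part of the scanner.

```python
# Counts copied from the Severity Breakdown in this report.
severity_counts = {"CRITICAL": 30, "HIGH": 213, "MEDIUM": 885, "LOW": 5274}
total_hits = 6402  # "Pattern Hits" as reported


def severity_shares(counts):
    """Return each severity's share of all hits as a percentage (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The four severity buckets add up to the reported total.
assert sum(severity_counts.values()) == total_hits
print(severity_shares(severity_counts))
```

Most of the matches fall in the LOW bucket, which is consistent with the low overall score.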

Hyper-Verbose Identifiers: 3468 hits · 3431 pts
Severity | File | Line | Snippet
LOW | pretrain_vlm.py | 227 | def train_valid_test_datasets_provider(train_val_test_num_samples):
LOW | pretrain_vlm.py | 267 | def _preprocess_data_for_llava(data):
LOW | pretrain_vlm.py | 465 | def llava_position_embedding_ranks(pp_ranks):
LOW | train_rl.py | 339 | def train_valid_test_datasets_provider(train_val_test_num_samples):
LOW | pretrain_gpt.py | 267 | def core_gpt_dataset_config_from_args(args: Any) -> GPTDatasetConfig:
LOW | pretrain_gpt.py | 332 | def train_valid_test_datasets_provider(train_val_test_num_samples, vp_stage=None):
LOW | model_provider.py | 62 | def count_parameters_in_layer(model, layer_name):
LOW | pretrain_hybrid.py | 259 | def core_gpt_dataset_config_from_args(args: Any) -> GPTDatasetConfig:
LOW | pretrain_hybrid.py | 301 | def train_valid_test_datasets_provider(train_val_test_num_samples, vp_stage=None):
LOW | gpt_builders.py | 103 | def _get_transformer_layer_spec(use_te, config):
LOW | tasks/finetune_utils.py | 53 | def _cross_entropy_forward_step(batch, model):
LOW | tasks/finetune_utils.py | 95 | def _build_infinite_size_dataloader(dataloader):
LOW | tasks/finetune_utils.py | 106 | def _build_train_valid_dataloaders(train_dataset, valid_dataset,
LOW | tasks/data_utils.py | 35 | def build_tokens_types_paddings_from_text(text_a, text_b,
LOW | tasks/data_utils.py | 49 | def build_tokens_types_paddings_from_ids(text_a_ids, text_b_ids, max_seq_length,
LOW | tasks/eval_utils.py | 65 | def calculate_correct_answers(name, model, dataloader,
LOW | tasks/eval_utils.py | 111 | def correct_answers_forward_step(batch, model):
LOW | tools/run_vlm_text_generation.py | 161 | def generate_and_write_samples(model):
LOW | tools/prepare_cache.py | 54 | def _normalize_prepare_cache_args(args: Any) -> None:
LOW | tools/prepare_cache.py | 65 | def _validate_prepare_cache_args(args: Any) -> None:
LOW | tools/prepare_cache.py | 82 | def _disable_cache_load_only_flags(args: Any) -> Dict[str, bool]:
LOW | tools/prepare_cache.py | 102 | def _print_effective_configuration(
LOW | tools/prepare_cache.py | 120 | def core_gpt_dataset_config_from_args(args: Any) -> GPTDatasetConfig:
LOW | tools/run_inference_performance_test.py | 48 | def add_inference_benchmarking_args(parser):
LOW | tools/build_sequences_per_dataset.py | 60 | def build_sequences_per_dataset(args):
LOW | tools/run_dynamic_text_generation_server.py | 21 | def add_text_generation_server_args(parser: argparse.ArgumentParser):
LOW | tools/run_dynamic_text_generation_server.py | 37 | async def run_text_generation_server(
LOW | tools/checkpoint/saver_hf_llava.py | 176 | def receive_vision_projection(self):
LOW | tools/checkpoint/saver_hf_llava.py | 352 | def save_state_dict_to_hf_checkpoint(self):
LOW | tools/checkpoint/schema_core.py | 10 | def get_core_transformer_block_key(model_key):
LOW | tools/checkpoint/schema_hf.py | 173 | def get_language_model_schema(
LOW | tools/checkpoint/loader_base.py | 30 | def _maybe_parse_additional_megatron_args(self, margs, checkpoint_args):
LOW | tools/checkpoint/loader_base.py | 90 | def _maybe_ensure_additional_required_arguments(self):
LOW | tools/checkpoint/loader_base.py | 107 | def ensure_required_arguments(self):
LOW | tools/checkpoint/loader_base.py | 193 | def get_models_for_pipeline_stage(count, dtype):
LOW | tools/checkpoint/loader_base.py | 425 | def build_checkpoint_metadata(self, true_vocab_size):
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 404 | def initialize_ssm_layer_params(
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 152 | def parse_hybrid_layer_pattern(pattern):
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 183 | def build_layer_index_mapping(layer_types, direction):
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 316 | def validate_pattern_gpt_compatible(layer_types, direction):
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 361 | def validate_source_args_gpt_compatible(source_args, direction):
LOW | tools/checkpoint/dist_checkpoint_io.py | 69 | def resolve_checkpoint_subdir(load_dir):
LOW | tools/checkpoint/dist_checkpoint_io.py | 113 | def ensure_single_rank_process_group():
LOW | tools/checkpoint/dist_checkpoint_io.py | 159 | def load_dist_checkpoint_full(load_dir):
LOW | tools/checkpoint/dist_checkpoint_io.py | 221 | def save_dist_checkpoint_full(
LOW | tools/checkpoint/dist_checkpoint_io.py | 260 | def write_latest_iteration_marker(save_dir, iteration):
LOW | tools/checkpoint/loader_mixtral_hf.py | 28 | def load_args_from_checkpoint(args):
LOW | tools/checkpoint/loader_mixtral_hf.py | 63 | def verify_transformers_version():
LOW | tools/checkpoint/saver_llava.py | 134 | def _maybe_parse_additional_megatron_args(self, margs):
LOW | tools/checkpoint/saver_llava.py | 320 | def receive_vision_projection(self):
LOW | tools/checkpoint/checkpoint_inspector.py | 288 | def save_checkpoint_with_pickle_protocol(state_dict, output_dir, pickle_protocol=4):
LOW | tools/checkpoint/checkpoint_inspector.py | 292 | def transform_object_override(write_item, obj):
LOW | tools/checkpoint/checkpoint_inspector.py | 831 | def convert_torch_dist_to_fsdp_dtensor(
LOW | tools/checkpoint/saver_base.py | 31 | def _maybe_parse_additional_megatron_args(self, margs):
LOW | tools/checkpoint/saver_base.py | 38 | def insert_megatron_path_and_check_te(self):
LOW | tools/checkpoint/saver_base.py | 277 | def receive_checkpoint_metadata(self):
LOW | tools/checkpoint/saver_base.py | 341 | def save_local_models_to_checkpoint(self):
LOW | tools/checkpoint/loader_llava.py | 54 | def _maybe_parse_additional_megatron_args(self, margs, checkpoint_args):
LOW | tools/checkpoint/loader_llava.py | 88 | def _maybe_ensure_additional_required_arguments(self):
LOW | tools/checkpoint/loader_llava.py | 104 | def build_checkpoint_metadata(self, true_vocab_size):
3408 more matches not shown…
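The report does not state the rule behind this category. The shortest function names listed, such as `ensure_required_arguments`, are 25 characters long, so one plausible reading is a simple length threshold on function names. The sketch below works under that assumption; `find_long_function_names` and the `min_length=25` default are hypothetical, not the scanner's code.

```python
import ast


def find_long_function_names(source, min_length=25):
    """Return (line, name) for functions whose name has at least min_length characters.

    The threshold is an assumption inferred from the rows shown in the report.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if len(node.name) >= min_length:
                hits.append((node.lineno, node.name))
    return sorted(hits)


sample = (
    "def ensure_required_arguments(self):\n"
    "    pass\n"
    "\n"
    "def save(self):\n"
    "    pass\n"
)
print(find_long_function_names(sample))
```

A length-only rule like this treats descriptive names in a large codebase as a signal, which may explain why this category carries LOW severity.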
Decorative Section Separators: 696 hits · 2438 pts
Severity | File | Line | Snippet
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 400 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 402 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 132 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 134 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 225 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 227 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 486 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 488 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 530 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 532 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 627 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 629 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 698 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 700 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 722 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/gpt_hybrid_conversion.py | 724 | # ---------------------------------------------------------------------------
MEDIUM | tools/checkpoint/saver_llava.py | 225 | #-----------
MEDIUM | tools/checkpoint/saver_llava.py | 186 | #-----------
MEDIUM | tools/checkpoint/saver_base.py | 377 | #-----------
MEDIUM | tools/checkpoint/saver_base.py | 420 | # --------------
MEDIUM | tools/checkpoint/saver_base.py | 432 | # ------------------
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 360 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 362 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 386 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 388 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 410 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 412 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 433 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 435 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 474 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 476 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 496 | # ================================================================
MEDIUM | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 498 | # ================================================================
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 20 | # ── Parse KEY=VALUE positional args ───────────────────────────────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 50 | # ── Read model_config.yaml ────────────────────────────────────────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 83 | # ── Build model args (substituting ${CHECKPOINT_LOAD_PATH}) ───────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 99 | # ── Make image-bundled extras (mamba-ssm) visible to the cog venv ─────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 128 | # ── Launch the inference server in the background ─────────────────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 185 | # ── Wait for server readiness ─────────────────────────────────────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 205 | # ── Benchmark sweep ───────────────────────────────────────────────────────────
MEDIUM | …ts/performance_tests/shell_test_utils/run_perf_test.sh | 224 | # ── Baseline comparison or recording ──────────────────────────────────────────
MEDIUM | tests/unit_tests/test_optimizer_param_scheduler.py | 302 | # ── get_canonical_lr_for_logging tests ──────────────────────────────────────
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 63 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 65 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 841 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 843 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 1288 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 1290 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 1575 | # ===========================================================================
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 1577 | # ===========================================================================
MEDIUM | tests/unit_tests/test_argument_utils.py | 654 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/test_argument_utils.py | 656 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 640 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 642 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 472 | # ------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 474 | # ------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 512 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/tokenizers/test_tokenizer.py | 514 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/fusions/test_bias_dropout_fusion.py | 8 | # ---------------------------------------------------------------------------
MEDIUM | tests/unit_tests/fusions/test_bias_dropout_fusion.py | 10 | # ---------------------------------------------------------------------------
636 more matches not shown…
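The snippets in this category are comment lines that end in a long run of `-`, `=`, or box-drawing `─` characters, sometimes with a title in between. A minimal sketch of such a check follows; the pattern and the eight-character minimum are assumptions chosen to cover the rows shown, not the scanner's actual rule.

```python
import re

# A comment line that ends in a run of at least eight rule characters.
# The minimum run length of 8 is an assumption, not a documented threshold.
SEPARATOR_RE = re.compile(r"^\s*#.*[-=─]{8,}\s*$")


def is_decorative_separator(line):
    """Return True if the line looks like a decorative comment separator."""
    return SEPARATOR_RE.match(line) is not None


examples = [
    "#-----------",
    "# ── Read model_config.yaml ────────────────────────",
    "# Define the decoder block spec",
    "x = a - b",
]
for text in examples:
    print(is_decorative_separator(text), text)
```

Because the pattern is anchored on a leading `#`, it covers both the Python and the shell files listed in the table.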
Cross-File Repetition: 168 hits · 840 pts
Severity | File | Line | Snippet
HIGH | pretrain_gpt.py | 0 | build the train test and validation datasets. args: train_val_test_num_samples : a list containing the number of samples
HIGH | pretrain_hybrid.py | 0 | build the train test and validation datasets. args: train_val_test_num_samples : a list containing the number of samples
HIGH | examples/post_training/modelopt/finetune.py | 0 | build the train test and validation datasets. args: train_val_test_num_samples : a list containing the number of samples
HIGH | examples/t5/pretrain_t5.py | 0 | build the train test and validation datasets. args: train_val_test_num_samples : a list containing the number of samples
HIGH | megatron/elastification/pretrain_hybrid_flex.py | 0 | build the train test and validation datasets. args: train_val_test_num_samples : a list containing the number of samples
HIGH | tools/checkpoint/saver_hf_llava.py | 0 | required top-level function that creates the saver and calls its .save().
HIGH | tools/checkpoint/saver_core.py | 0 | required top-level function that creates the saver and calls its .save().
HIGH | tools/checkpoint/saver_llava.py | 0 | required top-level function that creates the saver and calls its .save().
HIGH | tools/checkpoint/loader_base.py | 0 | orchestrates loading a megatron checkpoint and sending model parameters over a given multiprocessing queue. args: args:
HIGH | tools/checkpoint/loader_core.py | 0 | orchestrates loading a megatron checkpoint and sending model parameters over a given multiprocessing queue. args: args:
HIGH | tools/checkpoint/loader_llava.py | 0 | orchestrates loading a megatron checkpoint and sending model parameters over a given multiprocessing queue. args: args:
HIGH | tools/checkpoint/loader_base.py | 0 | parse megatron arguments by forcibly overwriting sys.argv. populates self.margs and self.checkpoint_args.
HIGH | tools/checkpoint/saver_base.py | 0 | parse megatron arguments by forcibly overwriting sys.argv. populates self.margs and self.checkpoint_args.
HIGH | tools/checkpoint/loader_llava.py | 0 | parse megatron arguments by forcibly overwriting sys.argv. populates self.margs and self.checkpoint_args.
HIGH | tools/checkpoint/loader_base.py | 0 | construct a sys.argv list for megatron's argument parser. this centralizes the hack of overwriting sys.argv.
HIGH | tools/checkpoint/loader_core.py | 0 | construct a sys.argv list for megatron's argument parser. this centralizes the hack of overwriting sys.argv.
HIGH | tools/checkpoint/saver_base.py | 0 | construct a sys.argv list for megatron's argument parser. this centralizes the hack of overwriting sys.argv.
HIGH | tools/checkpoint/loader_llava.py | 0 | construct a sys.argv list for megatron's argument parser. this centralizes the hack of overwriting sys.argv.
HIGH | tests/unit_tests/test_fp8_param.py | 0 | dp_overlap: (overlap_param_gather, overlap_grad_reduce)
HIGH | tests/unit_tests/test_fp8_param.py | 0 | dp_overlap: (overlap_param_gather, overlap_grad_reduce)
HIGH | tests/unit_tests/test_fp8_param.py | 0 | dp_overlap: (overlap_param_gather, overlap_grad_reduce)
HIGH | tests/unit_tests/test_hyper_comm_grid.py | 0 | set up distributed environment for the entire test class.
HIGH | …it_tests/pipeline_parallel/test_bridge_communicator.py | 0 | set up distributed environment for the entire test class.
HIGH | …sts/pipeline_parallel/test_multimodule_communicator.py | 0 | set up distributed environment for the entire test class.
HIGH | tests/unit_tests/test_argument_utils.py | 0 | config with argparse_meta metadata for testing overrides.
HIGH | tests/unit_tests/test_argument_utils.py | 0 | config with argparse_meta metadata for testing overrides.
HIGH | tests/unit_tests/test_argument_utils.py | 0 | config with argparse_meta metadata for testing overrides.
HIGH | …sts/unit_tests/transformer/test_submodule_callables.py | 0 | runs the model in reference mode and captures outputs and gradients. args: model: the transformer model to run. input_te
HIGH | …sts/unit_tests/a2a_overlap/test_schedule_layer_1f1b.py | 0 | runs the model in reference mode and captures outputs and gradients. args: model: the transformer model to run. input_te
HIGH | …sts/unit_tests/a2a_overlap/test_schedule_layer_1f1b.py | 0 | runs the model in reference mode and captures outputs and gradients. args: model: the transformer model to run. input_te
HIGH | …sts/unit_tests/transformer/test_submodule_callables.py | 0 | runs the model with all-to-all overlap optimization and captures outputs and gradients. args: model: the transformer mod
HIGH | …sts/unit_tests/a2a_overlap/test_schedule_layer_1f1b.py | 0 | runs the model with all-to-all overlap optimization and captures outputs and gradients. args: model: the transformer mod
HIGH | …sts/unit_tests/a2a_overlap/test_schedule_layer_1f1b.py | 0 | runs the model with all-to-all overlap optimization and captures outputs and gradients. args: model: the transformer mod
HIGH | …it_tests/pipeline_parallel/test_bridge_communicator.py | 0 | destroy all tracked grids and bridge communicator pgs.
HIGH | …_tests/pipeline_parallel/test_multimodule_schedules.py | 0 | destroy all tracked grids and bridge communicator pgs.
HIGH | tests/unit_tests/models/test_mimo_1f1b_schedule.py | 0 | destroy all tracked grids and bridge communicator pgs.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | helper method to get the model chunk id given the iteration number.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | helper method to get the model chunk id given the iteration number.
HIGH | megatron/core/pipeline_parallel/schedules.py | 0 | helper method to get the model chunk id given the iteration number.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | helper method to get the microbatch_id within model chunk given the iteration number.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | helper method to get the microbatch_id within model chunk given the iteration number.
HIGH | megatron/core/pipeline_parallel/schedules.py | 0 | helper method to get the microbatch_id within model chunk given the iteration number.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | check if an iteration is the first for a model chunk.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | check if an iteration is the first for a model chunk.
HIGH | megatron/core/pipeline_parallel/schedules.py | 0 | check if an iteration is the first for a model chunk.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | check if an iteration is the last for a model chunk.
HIGH | tests/unit_tests/pipeline_parallel/test_helpers.py | 0 | check if an iteration is the last for a model chunk.
HIGH | megatron/core/pipeline_parallel/schedules.py | 0 | check if an iteration is the last for a model chunk.
HIGH | tests/unit_tests/models/test_mimo_1f1b_schedule.py | 0 | return a transformerconfig for the vision projection mlp.
HIGH | examples/mimo/configs/llava_vlm.py | 0 | return a transformerconfig for the vision projection mlp.
HIGH | examples/mimo/configs/llava_avlm.py | 0 | return a transformerconfig for the vision projection mlp.
HIGH | …unit_tests/dist_checkpointing/models/test_gpt_model.py | 0 | test model loading with different vocab size (caused by tp padding).
HIGH | …nit_tests/dist_checkpointing/models/test_bert_model.py | 0 | test model loading with different vocab size (caused by tp padding).
HIGH | tests/unit_tests/dist_checkpointing/models/common.py | 0 | test model loading with different vocab size (caused by tp padding).
HIGH | …ultimodal/evaluation/evaluate_video_phys_game_bench.py | 0 | merge input files to a format compatible with the evaluator.
HIGH | examples/multimodal/evaluation/evaluate_realworldqa.py | 0 | merge input files to a format compatible with the evaluator.
HIGH | examples/multimodal/evaluation/evaluate_ai2d.py | 0 | merge input files to a format compatible with the evaluator.
HIGH | …es/multimodal/evaluation/evaluate_video_motionbench.py | 0 | merge input files to a format compatible with the evaluator.
HIGH | examples/multimodal/evaluation/evaluate_spdocvqa.py | 0 | merge input files to a format compatible with the evaluator.
HIGH | examples/multimodal/evaluation/evaluate_textvqa.py | 0 | merge input files to a format compatible with the evaluator.
108 more matches not shown…
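The snippets in this category read like docstrings that have been lowercased, had their whitespace collapsed, and been cut at 120 characters, with each text appearing three or more times. A minimal sketch of that kind of grouping follows. The normalization steps, the `min_count=3` default, and the function names are assumptions inferred from the rows shown, not the scanner's implementation.

```python
import ast
from collections import defaultdict


def normalize(text, limit=120):
    """Lowercase, collapse whitespace, and truncate (assumed normalization)."""
    return " ".join(text.lower().split())[:limit]


def find_repeated_docstrings(sources, min_count=3):
    """Group identical normalized docstrings.

    `sources` maps a file name to its source text. Returns a dict from the
    normalized docstring to the list of files where it occurs, keeping only
    docstrings seen at least `min_count` times.
    """
    seen = defaultdict(list)
    for filename, source in sources.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                doc = ast.get_docstring(node)
                if doc:
                    seen[normalize(doc)].append(filename)
    return {doc: files for doc, files in seen.items() if len(files) >= min_count}


template = 'def setup_class(cls):\n    """Set up distributed environment\n    for the entire test class."""\n'
sources = {"a.py": template, "b.py": template, "c.py": template}
print(find_repeated_docstrings(sources))
```

Occurrences are counted rather than distinct files, since several rows in the table repeat the same file.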
Unused Imports: 795 hits · 750 pts
Severity | File | Line | Snippet
LOW | train_rl.py | 4 |
LOW | mamba_builders.py | 14 |
LOW | mamba_builders.py | 15 |
LOW | gpt_builders.py | 19 |
LOW | tools/merge_datasets.py | 3 |
LOW | tools/preprocess_data_nmt.py | 6 |
LOW | tools/run_inference_performance_test.py | 11 |
LOW | tools/run_inference_performance_test.py | 12 |
LOW | tools/run_inference_performance_test.py | 30 |
LOW | tools/run_inference_performance_test.py | 36 |
LOW | tools/run_inference_performance_test.py | 40 |
LOW | tools/run_inference_performance_test.py | 41 |
LOW | tools/linter.py | 11 |
LOW | tools/preprocess_mmdata.py | 12 |
LOW | tools/checkpoint/saver_core.py | 3 |
LOW | tools/checkpoint/saver_core.py | 5 |
LOW | tools/checkpoint/saver_core.py | 7 |
LOW | tools/checkpoint/saver_core.py | 8 |
LOW | tools/checkpoint/remap_gpt_dsa_to_mamba.py | 38 |
LOW | tools/checkpoint/remap_gpt_dsa_to_mamba.py | 39 |
LOW | tools/checkpoint/remap_gpt_dsa_to_mamba.py | 108 |
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 122 |
LOW | tools/checkpoint/dist_checkpoint_io.py | 20 |
LOW | tools/checkpoint/dist_checkpoint_io.py | 31 |
LOW | tools/checkpoint/utils.py | 3 |
LOW | tools/checkpoint/hybrid_conversion.py | 10 |
LOW | tools/checkpoint/loader_mixtral_hf.py | 3 |
LOW | tools/checkpoint/loader_mixtral_hf.py | 167 |
LOW | tools/checkpoint/saver_llava.py | 2 |
LOW | tools/checkpoint/saver_llava.py | 4 |
LOW | tools/checkpoint/saver_llava.py | 173 |
LOW | tools/checkpoint/saver_llava.py | 173 |
LOW | tools/checkpoint/checkpoint_inspector.py | 37 |
LOW | tools/checkpoint/loader_core.py | 3 |
LOW | tools/checkpoint/loader_core.py | 4 |
LOW | tools/checkpoint/loader_core.py | 5 |
LOW | tools/checkpoint/loader_core.py | 6 |
LOW | tools/checkpoint/loader_core.py | 7 |
LOW | tools/checkpoint/saver_base.py | 2 |
LOW | tools/checkpoint/loader_llava.py | 3 |
LOW | tools/checkpoint/loader_llava.py | 7 |
LOW | tools/bert_embedding/embed.py | 3 |
LOW | tools/bert_embedding/embed.py | 8 |
LOW | tools/bert_embedding/embed.py | 15 |
LOW | tools/bert_embedding/embed.py | 16 |
LOW | tools/bert_embedding/embed.py | 19 |
LOW | tools/bert_embedding/embed.py | 21 |
LOW | tools/bert_embedding/embed.py | 21 |
LOW | tools/bert_embedding/__init__.py | 3 |
LOW | tools/bert_embedding/__init__.py | 3 |
LOW | …formance_tests/shell_test_utils/compare_to_baseline.py | 23 |
LOW | tests/unit_tests/test_fp8_param.py | 3 |
LOW | tests/unit_tests/test_fp8_param.py | 38 |
LOW | tests/unit_tests/test_inference.py | 3 |
LOW | tests/unit_tests/test_basic.py | 2 |
LOW | tests/unit_tests/conftest.py | 4 |
LOW | tests/unit_tests/test_emerging_optimizers.py | 11 |
LOW | tests/unit_tests/test_imports.py | 14 |
LOW | tests/unit_tests/test_fp4_param.py | 5 |
LOW | tests/unit_tests/test_fp4_param.py | 11 |
735 more matches not shown…
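The report gives only file and line for these hits. As an illustration of how such a check is commonly done, the sketch below parses a module with `ast`, records every imported name, and reports those never referenced as a bare name. It is a simplified, hypothetical check rather than the scanner's rule: it ignores `__all__`, names used only inside strings, and intentional re-exports.

```python
import ast


def find_unused_imports(source):
    """Return sorted (line, name) pairs for imports never referenced in the module."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                name = (alias.asname or alias.name).split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names
                    imported[alias.asname or alias.name] = node.lineno
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted((line, name) for name, line in imported.items() if name not in used)


sample = (
    "import os\n"
    "import sys\n"
    "from typing import Any, Dict\n"
    "\n"
    "def cwd(x: Any):\n"
    "    return os.getcwd()\n"
)
print(find_unused_imports(sample))
```

A check of this kind also flags imports in `__init__.py` files that exist only to re-export names, which may account for rows such as tools/bert_embedding/__init__.py.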
Deep Nesting: 544 hits · 468 pts
Severity | File | Line | Snippet
LOW | gpt_builders.py | 24 |
LOW | tasks/finetune_utils.py | 147 |
LOW | tasks/eval_utils.py | 65 |
LOW | .gitlab/scripts/check_imports.py | 66 |
LOW | .gitlab/scripts/check_imports.py | 123 |
LOW | tools/preprocess_data.py | 310 |
LOW | tools/run_vlm_text_generation.py | 88 |
LOW | tools/run_text_generation_server.py | 115 |
LOW | tools/checkpoint/saver_hf_llava.py | 69 |
LOW | tools/checkpoint/loader_base.py | 254 |
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 631 |
LOW | tools/checkpoint/gpt_hybrid_conversion.py | 726 |
LOW | tools/checkpoint/dist_checkpoint_io.py | 131 |
LOW | tools/checkpoint/hybrid_conversion.py | 59 |
LOW | tools/checkpoint/hybrid_conversion.py | 114 |
LOW | tools/checkpoint/hybrid_conversion.py | 170 |
LOW | tools/checkpoint/hybrid_conversion.py | 222 |
LOW | tools/checkpoint/saver_llava.py | 183 |
LOW | tools/checkpoint/checkpoint_inspector.py | 74 |
LOW | tools/checkpoint/checkpoint_inspector.py | 335 |
LOW | tools/checkpoint/checkpoint_inspector.py | 951 |
LOW | tools/checkpoint/saver_base.py | 365 |
LOW | tools/checkpoint/loader_llava.py | 161 |
LOW | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 63 |
LOW | …s/common_pile_dataset/create_common_pile_ci_dataset.py | 109 |
LOW | …formance_tests/shell_test_utils/compare_to_baseline.py | 36 |
LOW | tests/unit_tests/test_fp8_param.py | 214 |
LOW | tests/unit_tests/test_optimizer.py | 574 |
LOW | tests/unit_tests/test_utils.py | 311 |
LOW | tests/unit_tests/test_imports.py | 63 |
LOW | tests/unit_tests/test_layer_wise_optimizer.py | 254 |
LOW | tests/unit_tests/test_layer_wise_optimizer.py | 724 |
LOW | tests/unit_tests/ssm/ops/test_ssd_bmm.py | 68 |
LOW | tests/unit_tests/tokenizers/test_tokenizer.py | 290 |
LOW | …t_tests/tools/checkpoint/test_gpt_hybrid_conversion.py | 461 |
LOW | tests/unit_tests/extension/test_kitchen_sdpa.py | 60 |
LOW | …egatron_fsdp/test_mcore_fully_sharded_data_parallel.py | 1067 |
LOW | …egatron_fsdp/test_mcore_fully_sharded_data_parallel.py | 1139 |
LOW | …egatron_fsdp/test_mcore_fully_sharded_data_parallel.py | 1142 |
LOW | …ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py | 176 |
LOW | …ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py | 888 |
LOW | …ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py | 1058 |
LOW | tests/unit_tests/transformer/test_cuda_graphs.py | 450 |
LOW | tests/unit_tests/transformer/test_cuda_graphs.py | 642 |
LOW | tests/unit_tests/transformer/test_transformer_block.py | 405 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 411 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 997 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 1031 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 1117 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 1245 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 1282 |
LOW | …mental_attention_variant/test_attention_variant_dsa.py | 1496 |
LOW | tests/unit_tests/training/config/test_container_base.py | 248 |
LOW | …it_tests/pipeline_parallel/test_bridge_communicator.py | 311 |
LOW | …sts/pipeline_parallel/test_multimodule_communicator.py | 170 |
LOW | …sts/pipeline_parallel/test_multimodule_communicator.py | 609 |
LOW | …_tests/pipeline_parallel/test_multimodule_schedules.py | 258 |
LOW | tests/unit_tests/pipeline_parallel/test_schedules.py | 89 |
LOW | …ne_parallel/test_fine_grained_activation_offloading.py | 335 |
LOW | …ts/unit_tests/models/test_dsa_gpt_mamba_equivalence.py | 221 |
484 more matches not shown…
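The report does not say how nesting depth is measured. The line reported for tasks/eval_utils.py (65) coincides with the `def calculate_correct_answers` line listed under Hyper-Verbose Identifiers, which suggests hits are reported per function. The sketch below measures the deepest chain of nested control-flow blocks inside each function. The set of node types counted and the `max_depth=4` limit are assumptions for illustration.

```python
import ast

# Statement types that open a nested block (an assumed, non-exhaustive set).
NESTING_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.With, ast.AsyncWith, ast.Try)


def max_nesting(node, depth=0):
    """Return the deepest level of nested blocks found below `node`.

    Note: `elif` is parsed as an `If` inside `orelse`, so long elif chains
    count as extra depth in this simplified measure.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def find_deeply_nested_functions(source, max_depth=4):
    """Return sorted (line, name, depth) for functions nested deeper than max_depth."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting(node)
            if depth > max_depth:
                results.append((node.lineno, node.name, depth))
    return sorted(results)


sample = (
    "def shallow(x):\n"
    "    if x:\n"
    "        return 1\n"
    "    return 0\n"
    "\n"
    "def deep(rows):\n"
    "    for row in rows:\n"
    "        if row:\n"
    "            for cell in row:\n"
    "                while cell:\n"
    "                    if cell > 1:\n"
    "                        cell -= 1\n"
)
print(find_deeply_nested_functions(sample))
```

Many of the listed files are tests, where nested `with` blocks and parametrized loops are common, so a depth-only rule is likely to flag them often.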
Hallucination Indicators: 30 hits · 415 pts
Severity | File | Line | Snippet
CRITICAL | tools/checkpoint/loader_mixtral_hf.py | 69 | model.embedding.word_embeddings.weight.data.copy_(
CRITICAL | tools/checkpoint/loader_mixtral_hf.py | 74 | model.decoder.final_layernorm.weight.data.copy_(hf_model.model.norm.weight)
CRITICAL | tools/checkpoint/loader_mixtral_hf.py | 103 | layer.mlp.router.weight.data.copy_(hf_layer.block_sparse_moe.gate.weight)
CRITICAL | tools/checkpoint/loader_mixtral_hf.py | 127 | layer.self_attention.linear_qkv.layer_norm_weight.data.copy_(hf_layer.input_layernorm.weight)
CRITICAL | tools/checkpoint/saver_llava.py | 199 | model.vision_model.conv1.weight.data.copy_(vit_embeddings_msg["conv1 weight"])
CRITICAL | tools/checkpoint/saver_llava.py | 201 | model.vision_model.conv1.bias.data.copy_(vit_embeddings_msg["conv1 bias"])
CRITICAL | tools/checkpoint/saver_llava.py | 202 | model.vision_model.position_embeddings.weight.data.copy_(vit_embeddings_msg["position embeddings"])
CRITICAL | tools/checkpoint/saver_llava.py | 208 | model.vision_model.embedder.weight.data.copy_(embedder_weight[tp_rank])
CRITICAL | tools/checkpoint/saver_llava.py | 210 | model.vision_model.embedder.bias.data.copy_(embedder_bias[tp_rank])
CRITICAL | tools/checkpoint/saver_llava.py | 214 | model.vision_model.ln_pre.weight.data.copy_(vit_embeddings_msg["ln pre weight"])
CRITICAL | tools/checkpoint/saver_llava.py | 215 | model.vision_model.ln_pre.bias.data.copy_(vit_embeddings_msg["ln pre bias"])
CRITICAL | tools/checkpoint/saver_llava.py | 218 | model.vision_model.ln_post.weight.data.copy_(vit_embeddings_msg["ln post weight"])
CRITICAL | tools/checkpoint/saver_llava.py | 219 | model.vision_model.ln_post.bias.data.copy_(vit_embeddings_msg["ln post bias"])
CRITICAL | tools/checkpoint/saver_llava.py | 343 | model.vision_projection.encoder.linear_fc1.weight.data.copy_(
CRITICAL | tools/checkpoint/saver_llava.py | 345 | model.vision_projection.encoder.linear_fc2.weight.data.copy_(
CRITICAL | tools/checkpoint/saver_llava.py | 348 | model.vision_projection.encoder.linear_fc1.layer_norm_weight.data.copy_(
CRITICAL | tools/checkpoint/saver_llava.py | 351 | model.vision_projection.encoder.linear_fc1.layer_norm_bias.data.copy_(
CRITICAL | tools/checkpoint/saver_llava.py | 354 | model.vision_projection.encoder.linear_fc1.bias.data.copy_(
CRITICAL | tools/checkpoint/saver_llava.py | 356 | model.vision_projection.encoder.linear_fc2.bias.data.copy_(vision_projection_l1_bias)
CRITICAL | tests/unit_tests/transformer/moe/test_routers.py | 115 | assert self.sequential_mlp.router.weight.grad.abs().sum() == 0
CRITICAL | tests/unit_tests/transformer/moe/test_routers.py | 121 | assert self.sequential_mlp.router.weight.grad.abs().sum() > 0
CRITICAL | tests/unit_tests/transformer/moe/test_routers.py | 126 | self.sequential_mlp.router.weight.grad.fill_(0)
CRITICAL | tests/unit_tests/transformer/moe/test_routers.py | 129 | assert self.sequential_mlp.router.weight.grad.abs().sum() > 0
CRITICAL | tests/unit_tests/models/test_gpt_model.py | 87 | assert self.gpt_model.embedding.word_embeddings.weight.std().cpu().item() == approx(
CRITICAL | tests/unit_tests/models/test_gpt_model.py | 90 | assert self.gpt_model.embedding.word_embeddings.weight.mean().cpu().item() == approx(
CRITICAL | megatron/core/parallel_state.py | 2158 | and torch.distributed.distributed_c10d._world.pg_map.get(_DATA_PARALLEL_GROUP_GLOO, None)
CRITICAL | megatron/core/parallel_state.py | 2167 | and torch.distributed.distributed_c10d._world.pg_map.get(
CRITICAL | megatron/core/parallel_state.py | 2206 | and torch.distributed.distributed_c10d._world.pg_map.get(
CRITICAL | megatron/core/parallel_state.py | 2220 | and torch.distributed.distributed_c10d._world.pg_map.get(
CRITICAL | megatron/core/transformer/attention.py | 1731 | self.linear_qkv.weight.main_param.data.copy_(
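The rule behind this category is not described in the report. Every snippet shown contains a chain of at least five consecutive attribute accesses (for example `model.embedding.word_embeddings.weight.data.copy_`), so one plausible reading is that the scanner treats long attribute chains as a sign of possibly invented APIs. The sketch below counts chain length with `ast`; the `min_depth=5` threshold and the function names are assumptions inferred from the rows, not the scanner's rule.

```python
import ast


def attribute_chain_depth(node):
    """Count consecutive attribute accesses, e.g. a.b.c has depth 2."""
    depth = 0
    while isinstance(node, ast.Attribute):
        depth += 1
        node = node.value
    return depth


def find_long_attribute_chains(source, min_depth=5):
    """Return sorted (line, depth) for the longest qualifying chain on each line."""
    deepest = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute):
            depth = attribute_chain_depth(node)
            if depth >= min_depth:
                deepest[node.lineno] = max(depth, deepest.get(node.lineno, 0))
    return sorted(deepest.items())


sample = (
    "model.embedding.word_embeddings.weight.data.copy_(x)\n"
    "model.weight.data.copy_(x)\n"
)
print(find_long_attribute_chains(sample))
```

Deep chains of this kind are routine in PyTorch module trees such as the checkpoint converters listed in the table, so hits in this category are worth reviewing by hand before drawing conclusions.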
Self-Referential Comments: 131 hits · 381 pts
Severity | File | Line | Snippet
MEDIUM | gpt_builders.py | 43 | # Define the decoder block spec
MEDIUM | gpt_builders.py | 55 | # Define the decoder layer spec
MEDIUM | gpt_builders.py | 68 | # Define the decoder block spec
MEDIUM | .gitlab/stages/05.publish.yml | 149 | # Define the full refspec for the branch
MEDIUM | tools/run_inference_performance_test.py | 112 | # Create a list of valid token IDs
MEDIUM | tools/common_pile_dataset/setup_common_pile_dataset.sh | 87 | # Create a virtual environment to avoid system package conflicts
MEDIUM | tools/common_pile_dataset/setup_common_pile_dataset.sh | 122 | # Create the output directory
MEDIUM | tests/unit_tests/test_optimizer.py | 790 | # Create a new state_dict with all params set to 3.
MEDIUM | tests/unit_tests/test_optimizer.py | 879 | # Create a simple model for testing
MEDIUM | tests/unit_tests/test_optimizer.py | 961 | # Create a simple model
MEDIUM | tests/unit_tests/test_utils.py | 454 | # Create a straggler_detector with enabled set to false.
MEDIUM | tests/unit_tests/test_hyper_comm_grid.py | 431 | # Create a process group
MEDIUM | tests/unit_tests/test_hyper_comm_grid.py | 434 | # Create a tensor for communication test
MEDIUM | tests/unit_tests/test_hyper_comm_grid.py | 535 | # Create a unique tensor based on rank
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 70 | # Create a simple linear model for testing
MEDIUM | tests/unit_tests/test_emerging_optimizers.py | 640 | # Create a model with QKV-like parameter
MEDIUM | tests/unit_tests/test_training.py | 143 | # Create a mock state_dict with gradients (use deterministic values for reproducibility).
MEDIUM | tests/unit_tests/ssm/ops/test_ssm_kernel.py | 83 | # Create the Mixer instance directly
MEDIUM | tests/unit_tests/fusions/test_torch_softmax.py | 197 | # Create a padding mask
MEDIUM | …/unit_tests/post_training/test_modelopt_module_spec.py | 238 | # Define the expected signature
MEDIUM | …/unit_tests/post_training/test_modelopt_module_spec.py | 277 | # Define the expected signature
MEDIUM | …tests/distributed/test_torch_fully_sharded_parallel.py | 56 | # Create a dummy model and configs.
MEDIUM | …tests/distributed/test_torch_fully_sharded_parallel.py | 63 | # Create the sharded model.
MEDIUM | …tests/distributed/test_torch_fully_sharded_parallel.py | 84 | # Create a dummy model and configs.
MEDIUM | …tests/distributed/test_torch_fully_sharded_parallel.py | 90 | # Create a custom process group (using the default world for testing)
MEDIUM | …tests/distributed/test_torch_fully_sharded_parallel.py | 93 | # Create the sharded model with explicit process group
MEDIUM | …distributed/megatron_fsdp/test_mfsdp_uneven_dtensor.py | 675 | # Create a manual uneven sharding along dim 1 with possible zero-length local on some ranks
MEDIUM | …/unit_tests/transformer/test_multi_token_prediction.py | 774 | # Create a dummy loss tensor
MEDIUM | …tests/transformer/test_transformer_block_custom_pgs.py | 259 | # Create a transformer block with default process groups
MEDIUM | …tests/transformer/test_transformer_block_custom_pgs.py | 285 | # Create a transformer block with custom process groups
MEDIUM | …tests/transformer/test_transformer_block_custom_pgs.py | 737 | # Create a single transformer block
MEDIUM | tests/unit_tests/transformer/test_cuda_graphs.py | 718 | # Create the CUDA graphs - this is where the is_last_layer logic is tested
MEDIUM | tests/unit_tests/transformer/test_cuda_graphs.py | 965 | # Create a mapping of sample_keys to indices
MEDIUM | tests/unit_tests/transformer/test_transformer_block.py | 545 | # Create a new build_layers method that uses interleaved attention
MEDIUM | …ts/unit_tests/transformer/moe/test_token_dispatcher.py | 172 | # Create the answer.
MEDIUM | tests/unit_tests/transformer/moe/test_shared_experts.py | 96 | # Create a dummy input tensor.
MEDIUM | tests/unit_tests/transformer/moe/test_aux_loss.py | 211 | # Create a new config with updated parameters
MEDIUMtests/unit_tests/transformer/moe/test_aux_loss.py214 # Create the router with the updated config
MEDIUM…s/unit_tests/training/config/test_instantiate_utils.py334 # Create a mock that raises an error when used with functools.partial
MEDIUMtests/unit_tests/training/config/test_yaml_utils.py215 # Create a mock torch dtype
MEDIUMtests/unit_tests/training/config/test_yaml_utils.py262 # Create a mock GenerationConfig
MEDIUMtests/unit_tests/utils/test_experimental_log_once.py23 # Define a fresh function with the decorator so it has its own closure state.
MEDIUMtests/unit_tests/models/test_mimo_audio_submodules.py212 # Create a time array
MEDIUMtests/unit_tests/models/test_mimo_audio_submodules.py215 # Create a simple sine wave at 440 Hz (A4)
MEDIUMtests/unit_tests/models/test_gpt_model.py237 # Define the expected signature
MEDIUM…sts/unit_tests/models/test_mimo_embedding_alignment.py22 # Create a minimal MimoModelConfig
MEDIUM…sts/unit_tests/models/test_mimo_embedding_alignment.py73 # Create a simple batch
MEDIUM…sts/unit_tests/models/test_mimo_embedding_alignment.py249 # Create a test case with 2 batches:
MEDIUM…sts/unit_tests/models/test_mimo_embedding_alignment.py285 # Create the unflattened embeddings that would come from a vision encoder
MEDIUMtests/unit_tests/models/test_fastconformer_model.py45 # Create a parameter with the target dtype so ``next(self.parameters()).dtype``
MEDIUMtests/unit_tests/models/test_mimo_submodules.py92 # Create the main module spec
MEDIUMtests/unit_tests/models/test_mimo_submodules.py351 # Create a data batch without images
MEDIUMtests/unit_tests/dist_checkpointing/test_fp8.py33 # Create a quantizer for FP8 conversion
MEDIUM…s/unit_tests/dist_checkpointing/test_fully_parallel.py601 # Create a mock that will do what it's supposed to do,
MEDIUM…/unit_tests/inference/contexts/test_dynamic_context.py317 # Initialize all variables
MEDIUM…/unit_tests/inference/contexts/test_dynamic_context.py896 # Create an active_requests_mask where requests 0, 2, and 4 are finished (0),
MEDIUM…/unit_tests/inference/contexts/test_dynamic_context.py986 # Create an active_requests_mask where all requests are finished
MEDIUM…ts/unit_tests/inference/engines/test_dynamic_engine.py1894 # Create a deterministic mock forward pass that returns logits
MEDIUM…ts/unit_tests/inference/engines/test_dynamic_engine.py1590 # Create a request with length 513
MEDIUMtests/unit_tests/data/test_builder.py93 # Define the class here to avoid pytest warnings
71 more matches not shown…
Excessive Try-Catch Wrapping · 192 hits · 228 pts
Severity File Line Snippet
LOWtasks/finetune_utils.py61 except Exception:
LOWtasks/eval_utils.py114 except Exception:
MEDIUMtasks/eval_utils.py111def correct_answers_forward_step(batch, model):
LOW.gitlab/scripts/check_imports.py117 except Exception:
LOWtools/check_copyright.py22 except Exception as e:
MEDIUMtools/check_copyright.py23 print(f"Error reading {file_path}: {e}")
MEDIUMtools/text_generation_cli.py21 print(f"Error {response.status_code}: {response.json()['message']}")
LOWtools/checkpoint/saver_hf_llava.py400 except Exception as e:
LOWtools/checkpoint/loader_base.py78 except Exception as e:
MEDIUMtools/checkpoint/loader_base.py79 print(f"Error validating Megatron arguments: {e}")
LOWtools/checkpoint/saver_core.py70 except Exception as e:
LOWtools/checkpoint/gpt_hybrid_conversion.py384 except Exception:
LOWtools/checkpoint/dist_checkpoint_io.py215 except Exception:
LOWtools/checkpoint/hybrid_conversion.py322 except Exception:
LOWtools/checkpoint/loader_mixtral_hf.py340 except Exception:
MEDIUMtools/checkpoint/loader_mixtral_hf.py337def load_checkpoint(queue, args):
LOWtools/checkpoint/saver_llava.py404 except Exception as e:
LOWtools/checkpoint/loader_core.py90 except Exception as e:
LOWtools/checkpoint/loader_llava.py360 except Exception as e:
LOWtools/bert_embedding/embed.py217 except Exception:
LOWtools/bert_embedding/embed.py223 except Exception:
LOWtests/unit_tests/conftest.py48 except Exception:
LOWtests/unit_tests/conftest.py96 except Exception as e:
LOWtests/unit_tests/test_hyper_comm_grid.py325 except Exception as e:
LOWtests/unit_tests/test_api_backwards_compat_setup.py106 except Exception as e:
LOWtests/unit_tests/test_api_backwards_compat_setup.py134 except Exception as e:
LOWtests/unit_tests/test_imports.py58 except Exception:
LOWtests/unit_tests/test_imports.py77 except Exception:
LOWtests/unit_tests/test_utilities.py107 except Exception:
LOWtests/unit_tests/tokenizers/test_tokenizer.py649except Exception:
LOWtests/unit_tests/tokenizers/test_tokenizer.py18except Exception:
LOWtests/unit_tests/tokenizers/test_tokenizer.py151 except Exception:
LOWtests/unit_tests/tokenizers/test_tokenizer.py177 except Exception:
LOWtests/unit_tests/tokenizers/test_tokenizer.py319 except Exception:
LOW…egatron_fsdp/test_mcore_fully_sharded_data_parallel.py1156 except Exception:
LOW…ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py73 except Exception as e:
LOWtests/unit_tests/transformer/moe/test_aux_loss.py38except Exception: # pragma: no cover - defensive
LOWtests/unit_tests/transformer/moe/test_routers.py25except Exception: # pragma: no cover - defensive
LOWtests/unit_tests/models/test_mimo_audio_submodules.py123 except Exception as e:
LOWtests/unit_tests/models/test_mimo_audio_submodules.py197 except Exception as e:
LOWtests/unit_tests/models/test_hybrid_moe_model.py405 except Exception:
LOWtests/unit_tests/models/test_mimo_submodules.py55 except Exception as e:
LOWtests/unit_tests/models/test_mimo_submodules.py105 except Exception as e:
LOWtests/unit_tests/models/test_mimo_submodules.py229 except Exception as e:
LOWtests/unit_tests/models/test_mimo_submodules.py284 except Exception as e:
MEDIUMtests/unit_tests/models/test_mimo_model.py147def setup_method(self, method):
MEDIUMtests/unit_tests/models/test_mimo_model.py163def teardown_method(self, method):
MEDIUMtests/unit_tests/models/test_mimo_model.py528def setup_method(self, method):
MEDIUMtests/unit_tests/models/test_mimo_model.py542def teardown_method(self, method):
LOWtests/unit_tests/models/test_mimo_model.py150 except Exception:
LOWtests/unit_tests/models/test_mimo_model.py166 except Exception:
LOWtests/unit_tests/models/test_mimo_model.py531 except Exception:
LOWtests/unit_tests/models/test_mimo_model.py545 except Exception:
LOWtests/unit_tests/dist_checkpointing/test_fp8.py24 except Exception as e:
LOWtests/unit_tests/dist_checkpointing/test_async_save.py30 except Exception as e:
LOWtests/unit_tests/resharding/test_model_swap.py34except Exception:
LOWtests/unit_tests/resharding/test_model_swap.py44except Exception:
LOWtests/unit_tests/resharding/test_model_swap.py365 except Exception:
LOWtests/unit_tests/resharding/test_model_swap.py492 except Exception:
LOWtests/unit_tests/resharding/test_model_swap.py644 except Exception:
132 more matches not shown…
Redundant / Tautological Comments · 127 hits · 194 pts
Severity File Line Snippet
LOWtrain_rl.py62 # Check if fp8_model_init supports preserve_high_precision_init_val
LOWtools/check_copyright.py20 # Check if the expected header is at the start of the file
LOWtools/common_pile_dataset/setup_common_pile_dataset.sh50# Check if create_common_pile_ci_dataset.py was scp'd alongside this script
LOWdocker/common/install_source_wheels.sh25# Check if required arguments are provided
LOWtests/unit_tests/test_optimizer.py34 # Check if FP8 block scaling is available.
LOWtests/unit_tests/test_utils.py457 # Check if configuration was success.
LOWtests/unit_tests/test_utils.py460 # Check if the instance is in disabled state.
LOWtests/unit_tests/test_utils.py464 # Check if all ranks have straggler detector enabled.
LOWtests/unit_tests/conftest.py83 # Check if data directory exists and has content
LOWtests/unit_tests/fusions/test_torch_softmax.py145 # Check if output is a valid probability distribution
LOWtests/unit_tests/transformer/test_cuda_graphs.py516 # Check if cuda graph is correctly setting is first/last layer
LOWtests/unit_tests/transformer/test_attention.py416 # Check if output and bias have the correct shape
LOWtests/unit_tests/transformer/test_attention.py598 # Check if the output is close
LOW…/unit_tests/transformer/test_multi_latent_attention.py1551 # Check if the output is the same
LOW…it_tests/transformer/moe/test_moe_layer_discrepancy.py69 # Check if parameters are the same
LOW…it_tests/transformer/moe/test_moe_layer_discrepancy.py75 # Check if input is the same across all ranks
LOW…it_tests/transformer/moe/test_moe_layer_discrepancy.py150 # Check if output is the same across all ranks
LOW…it_tests/transformer/moe/test_moe_layer_discrepancy.py216 # Check if output is the same across all ranks
LOWtests/unit_tests/transformer/moe/test_moe_layer.py164 # Check if the moe layer is interleaved correctly
LOWtests/unit_tests/models/test_llava_model.py753 # Check if output shape is as expected
LOW…_tests/dist_checkpointing/test_layer_wise_optimizer.py147 # Check if optimizer is ChainedOptimizer (expected for standard setup)
LOW…/unit_tests/inference/contexts/test_dynamic_context.py879 # Assign blocks to the requests (one block per request)
LOW…/unit_tests/inference/contexts/test_dynamic_context.py955 # Assign blocks to the requests:
LOW…ts/unit_tests/inference/engines/test_dynamic_engine.py3183 # Check if any request was evicted during this step
LOW…ts/test_utils/python_scripts/download_golden_values.py282 # Check if we should skip based on only_failing flag
LOW…nal_tests/test_cases/common/ckpt_converter/__main__.py895 # Print results.
LOW…unctional_tests/shell_test_utils/run_batch_ci_tests.sh173 # Check if file is empty (job still running or not started)
LOWtests/functional_tests/shell_test_utils/run_ci_test.sh11# Set umask to 0002 to allow group read/write permissions
LOWtests/functional_tests/shell_test_utils/run_ci_test.sh257 ## Loop over the list of model configs in the params file and run each one in sequence, collecting
LOWdocs/conf.py43# Check if we should skip autodoc generation
LOWexamples/post_training/modelopt/finetune.py271 # Check if this is OpenAI chat data?
LOWexamples/mimo/utils/logging.py40 # Print output projections
LOWexamples/gptoss/02_train.sh55# Check if checkpoint path exists
LOWexamples/gptoss/02_train.sh62# Check if tensorboard logs path exists
LOWexamples/multimodal/dataset_helpers.py121 # Check if all samples fit in the knapsack capacity.
LOWexamples/inference/utils.py235 # Check if we have any prompts (from command line or JSONL)
LOWexamples/inference/advanced/gpt_dynamic_inference.py263 # Check if all requests are finished.
LOWexamples/rl/benchmark_refit.py216 # Print results
LOWexamples/rl/benchmark_refit.py312 # Print results
LOWexamples/rl/environments/countdown/countdown.py34 # Check if all numbers in equation are available
LOWscripts/check_api_backwards_compatibility.py240 # Check if this breakage kind should be ignored globally (not a signature change)
LOWscripts/check_api_backwards_compatibility.py253 # Check if this is a breakage kind we ignore for __init__ methods
LOWscripts/check_api_backwards_compatibility.py263 # Check if it's a child of a filtered object
LOWscripts/check_api_backwards_compatibility.py350 # Print results
LOW.github/actions/check-nvidia-sso-membership/action.yml94 # Check if SSO file is available
LOW.github/actions/check-nvidia-sso-membership/action.yml103 # Check if username exists as a key in the JSON object
LOWmegatron/post_training/model_builder.py225 # Set num_layers to 0 for base model in offline mode
LOWmegatron/core/parallel_state.py836 # Set NCCL_COLLNET_ENABLE to 1 to enable SHARP for the dp group.
LOWmegatron/core/parallel_state.py1275 # Set NCCL_COLLNET_ENABLE to 1 to enable SHARP for the dp_replica group.
LOWmegatron/core/parallel_state.py1304 # Set NCCL_COLLNET_ENABLE to 0 to restrict SHARP application to the dp_replica group.
LOWmegatron/core/fp4_utils.py14# Check if Transformer Engine is installed
LOWmegatron/core/fp4_utils.py25# Check if Transformer Engine has class for fp4 tensors.
LOWmegatron/core/fp8_utils.py23# Check if Transformer Engine is installed
LOWmegatron/core/fp8_utils.py40# Check if Transformer Engine has class for fp8 tensors.
LOWmegatron/core/fp8_utils.py55# Check if Transformer Engine has MXFP8Tensor class
LOWmegatron/core/fp8_utils.py635 # Check if fp8_model_init supports setting recipe
LOWmegatron/core/fp8_utils.py640 # Check if fp8_model_init supports preserve_high_precision_init_val
LOWmegatron/core/utils.py2841 # Check if any deprecated key is present in kwargs
LOWmegatron/core/ssm/ops/causal_conv1d_triton.py243 # Check if input is 2D, temporarily treat as 3D for uniform processing
LOWmegatron/core/tokenizers/megatron_tokenizer.py67 # Check if metadata file exists
67 more matches not shown…
Docstring Block Structure · 35 hits · 175 pts
Severity File Line Snippet
HIGHtools/checkpoint/remap_gpt_dsa_to_mamba.py46Return the HybridModel state-dict key corresponding to *key* from GPTModel. Args: key: A key from the GPTMo
HIGH…egatron_fsdp/test_mcore_fully_sharded_data_parallel.py774 Run a small deterministic (optional) training loop using a mocked MoE/GPT model and optimizer. This hel
HIGHmegatron/core/process_groups_config.py668Get process group collection for a specific module. Args: module_name: Name of the module.
HIGHmegatron/core/hyper_comm_grid.py121Create a process group based on a list of dimension names Note: The unique key used to store the process group
HIGHmegatron/core/utils.py120Validates the request to the experimental function. Args: func (Callable): Callee max_l
HIGHmegatron/core/utils.py186Validates the request to the experimental function. Args: func (Callable): Callee max_l
HIGHmegatron/core/utils.py218Pass-through to callee attribute if experimental flag is enabled. Args: super (supe
HIGHmegatron/core/timers.py385Returns the output string with logged timer values according to configured options. Args: names (Li
HIGH…ore/tokenizers/text/libraries/huggingface_tokenizer.py189 Adds a dictionary of special tokens (eos, pad, cls...). If special tokens are NOT in the vocabulary, th
HIGHmegatron/core/fusions/fused_bias_swiglu.py210Implementation of biased SwiGLU that handles different input shapes. This function reshapes the input if necessary,
HIGHmegatron/core/fusions/fused_bias_geglu.py154Implementation of biased GEGLU that handles different input shapes. This function reshapes the input if necessary,
HIGHmegatron/core/optimizer/optimizer.py414Filter and reorder state_dict parameter groups to match current optimizer groups. Keys used for matching align w
HIGH…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py4187 Release the specified parameter bucket, freeing its associated buffer storage. This function marks or
HIGH…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py4598 Creates a distributed tensor (DTensor) from a local tensor with support for Megatron-FSDP and Tensor Parallel s
HIGH…e/distributed/fsdp/src/megatron_fsdp/uneven_dtensor.py140 Validates the chunk metadata of an uneven DTensor to ensure correctness and boundary coverage. Notes: - `g
HIGH…e/distributed/fsdp/src/megatron_fsdp/uneven_dtensor.py257 Gather a DTensor with potentially uneven sharding across ranks into a full tensor. This function handles DTens
HIGH…core/distributed/fsdp/src/megatron_fsdp/fully_shard.py446 Fully shard the optimizer for Megatron-FSDP. This is an in-place operation on the optimizer instance, which mod
HIGH…tron/core/datasets/blended_megatron_dataset_builder.py497Build the DistributedDataset Return None if and only if the underlying dataset class is not built on the curren
HIGHmegatron/core/datasets/object_storage_utils.py136Ascertain whether the object at the given S3 path exists in S3 Args: client (S3Client): The S3 client
HIGHmegatron/core/datasets/indexed_dataset.py88Get the size of the dtype/code in bytes Args: key (Union[int, Type[numpy.number]]): The dtype or co
HIGHmegatron/core/datasets/indexed_dataset.py798Return from the dataset Args: idx (Union[int, numpy.integer, slice]): The index or index slice into
HIGH…ron/core/pipeline_parallel/multimodule_communicator.py496Compute the total number of pipeline stages across a multi-module chain. Interprets ``topology`` as a directed
HIGHmegatron/core/models/hybrid/hybrid_layer_allocation.py161Count layers by type across the full hybrid pattern (main + MTP). Parses the pattern to extract main and MTP compon
HIGHmegatron/core/models/hybrid/hybrid_layer_allocation.py200Parse a unified hybrid pattern string into main and MTP components. The pattern uses "/" as a separator between the
HIGHmegatron/core/models/hybrid/hybrid_layer_allocation.py301Validate and convert a single pipeline segment pattern to a layer type list. This is used after the main pattern ha
HIGHmegatron/core/models/hybrid/hybrid_layer_allocation.py337Select and validate the pipeline segment for the given PP rank and VP stage. When the main pattern contains '|' pip
HIGHmegatron/core/dist_checkpointing/validation.py294Raises or logs an error in case missing or unexpected keys are non-empty. Args: missing_keys (Set[str]): mi
HIGHmegatron/core/dist_checkpointing/validation.py372Validate if the ShardedTensors and ShardedObjects from multiple processes define correct sharding. Local ShardedTen
HIGH…tron/core/dist_checkpointing/strategies/async_utils.py669Finalizes all available calls. This method must be called on all ranks. Args: blocking (bo
HIGH…ng/nvshmem_copy_service/memory/tensor_pointer_utils.py19 Extract the data pointer from a tensor. Args: tensor: Can be torch.Tensor, CuPy array, or
HIGHmegatron/core/export/trtllm/trtllm_layers.py85Helper function to rename model layer names to TRTLLM Layer names We go through each layer (keys) in the model
HIGHmegatron/training/arguments.py183Validate model config arguments from heterogeneous config. This function takes model arguments and validates them b
HIGHmegatron/training/config/instantiate_utils.py128Instantiate an object or callable from a config object. This function takes a configuration object (dictionary, lis
HIGHmegatron/training/config/instantiate_utils.py232Recursively instantiates a node within a configuration structure. This function handles the instantiation of indivi
HIGHmegatron/rl/rl_utils.py1181Pad trajectories and extract the generation masks. Args: rollouts: Rollouts to extract trajectories from.
Over-Commented Block · 112 hits · 98 pts
Severity File Line Snippet
LOW.gitlab/scripts/check_imports.py1# Copyright (c) 2025, NVIDIA CORPORATION.
LOWtools/bisect.sh1#!/usr/bin/env bash
LOWtools/trigger_internal_ci.py1#!/usr/bin/env python3
LOWtools/checkpoint/convert.py1# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOWtools/checkpoint/convert.py21# full model weights, nothing split.
LOWtools/checkpoint/convert.py41# consumed_train_samples
LOWtools/checkpoint/convert.py61# "mlp l0 bias"
LOWtools/checkpoint/gpt_hybrid_conversion.py221 else:
LOWtools/checkpoint/gpt_hybrid_conversion.py241#
LOWtools/common_pile_dataset/setup_common_pile_dataset.sh1#!/bin/bash
LOW…ts/performance_tests/shell_test_utils/run_perf_test.sh1#!/usr/bin/env bash
LOW…ts/performance_tests/shell_test_utils/run_perf_test.sh101# from /usr/lib/python3.12/dist-packages — but mamba-ssm + causal-conv1d (and
LOWtests/unit_tests/test_imports.py1# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW…s/checkpoint/test_gpt_hybrid_conversion_parallelism.py41# tiny synthetic DCP checkpoint and round-trips it through the converter on
LOW…unit_tests/transformer/test_fsdp_dtensor_checkpoint.py1# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
LOW…s/unit_tests/elastification/test_hybrid_flex_router.py81 tensor_model_parallel_size=1, pipeline_model_parallel_size=1
LOW…s/unit_tests/pipeline_parallel/test_pipeline_layout.py361
LOWtests/unit_tests/inference/test_hybrid_moe.py61# Combinatorial sweep: unordered combinations with repetition of ALL_STATES
LOWtests/test_utils/python_scripts/recipe_parser.py21# cadence remains the trigger axis.
LOWtests/test_utils/recipes/h100/moe.yaml101 - environment: [lts]
LOWtests/test_utils/recipes/h100/moe.yaml221 # Super important mr, mr-github tests that run for both DEV and LTS per mr, mr-github #
LOWtests/test_utils/recipes/h100/gpt.yaml461 # scope: [nightly] # Requires PyT 2.4: #481
LOWtests/test_utils/recipes/gb200/moe.yaml101 platforms: [dgx_gb200]
LOWtests/test_utils/recipes/gb200/moe.yaml201 # - test_case: [gpt3_mcore_te_tp2_pp1_frozen_resume_torch_dist_te_8experts2parallel_dist_optimizer]
LOW…unctional_tests/test_cases/common/moe_perf/__main__.py401
LOW…unctional_tests/test_cases/common/moe_perf/__main__.py421 # test_moe_layer_performance(case, debug_mode=True)
LOW…unctional_tests/shell_test_utils/run_batch_ci_tests.sh1#!/bin/bash
LOWdocs/autodoc2_docstrings_parser.py1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOWdocs/conf.py1# Copyright (c) 2025-2026, NVIDIA CORPORATION. All rights reserved.
LOWexamples/megatron_fsdp/train_llama3_8b_fsdp_h100_fp8.sh121 --ckpt-format fsdp_dtensor
LOWexamples/megatron_fsdp/sbatch_checkpoint_convert.sh21SLURM_LOGS="${OUTPUT_PATH}/slurm_logs"
LOWexamples/mimo/train.py121 # iterator exhausted on all ranks
LOWexamples/inference/run_inference_server.sh1#!/bin/bash
LOWexamples/inference/run_offline_inference.sh1#!/bin/bash
LOW.github/workflows/sync-team-usergroups.yml1# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/cicd-approve-test-queue.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/release-docs.yml1# Copyright (c) 2025, NVIDIA CORPORATION.
LOW.github/workflows/community-bot.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/nightly-sync-main-to-dev.yml1# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/release.yaml1# Copyright (c) 2019-2026, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/sync-skills.yml1# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/close-inactive-issue-pr.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/oncall-assign.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/oncall-rotation.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/cherry-pick-release-commit.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/build-docs.yml1# Copyright (c) 2025, NVIDIA CORPORATION.
LOW.github/workflows/release-nightly-docs.yml1# Copyright (c) 2026, NVIDIA CORPORATION.
LOW.github/workflows/cicd-main.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/install-test.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/release-freeze.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/workflows/copyright-check.yml1# Copyright (c) 2025, NVIDIA CORPORATION.
LOW.github/scripts/oncall_manager.py1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW.github/scripts/sync_team_usergroups.py1# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
LOW.github/actions/action.yml1# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOWskills/nightly-sync/SKILL.md261 # pyproject.toml is allowed to differ ONLY for git source reconciliation
LOWmegatron/core/parallel_state.py41_DATA_PARALLEL_GROUP_GLOO = None
LOWmegatron/core/parallel_state.py1021 global _EMBEDDING_GROUP
LOWmegatron/core/parallel_state.py1041 # UCC backend requires CUDA_DEVICE_MAX_CONNECTIONS variable to be larger than 1,
LOWmegatron/core/config_logger.py1# Copyright (c) 2025, NVIDIA CORPORATION.
LOWmegatron/core/rerun_state_machine.py821 # caller isn't required to have wrapped its iterator in
52 more matches not shown…
Verbosity Indicators · 47 hits · 81 pts
Severity File Line Snippet
LOWtools/checkpoint/saver_hf_llava.py17 # Step 1: Reshape back to (num_head, 3*head_dim, -1)
LOWtools/checkpoint/saver_hf_llava.py20 # Step 2: Slice along the head_dim dimension to get q, k, v
LOWtools/checkpoint/saver_hf_llava.py25 # Step 3: Reshape each back to (num_head * head_dim, -1)
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py361 # Step 1: Get vocabulary files
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py387 # Step 2: Get raw text data
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py411 # Step 3: Preprocess for GPT (GPT2BPETokenizer)
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py434 # Step 4: Preprocess for BERT (BertWordPieceLowerCase + split-sentences)
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py475 # Step 5: Preprocess for T5 (BertWordPieceCase)
LOW…s/common_pile_dataset/create_common_pile_ci_dataset.py497 # Step 6: Clean up and verify
LOW…/unit_tests/inference/contexts/test_dynamic_context.py1796 # Step 1: Forward pass for all 3 requests
LOW…/unit_tests/inference/contexts/test_dynamic_context.py1805 # Step 2: Forward pass where req 10 finishes, req 11 continues. Req 999 is NOT scheduled.
LOW…/unit_tests/inference/contexts/test_dynamic_context.py1827 # Step 3: Add the next chunk. It should sit exactly at the boundary (index 1) and inherit the state.
LOW…/unit_tests/inference/contexts/test_dynamic_context.py1894 # Step 1: All 3 requests are active, process forward pass
LOW…/unit_tests/inference/contexts/test_dynamic_context.py1903 # Step 2: Both decode requests finish, chunked prefill NOT scheduled this step.
LOW…ts/unit_tests/inference/engines/test_dynamic_engine.py2490 # Step 1: Prefill. Processes the 4 prompt tokens.
LOW…l_tests/python_test_utils/compute_golden_statistics.py11 # Step 1: Run batch tests (from megatron-rl directory):
LOW…l_tests/python_test_utils/compute_golden_statistics.py15 # Step 2: Wait for jobs to complete, then compute statistics:
LOW.github/scripts/oncall_manager.py218 # Step 1: Add new oncall first (include current members to avoid removing anyone yet)
LOW.github/scripts/oncall_manager.py228 # Step 2: Now set the usergroup to contain only the new oncall
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1374 # Step 0: Register new FSDP unit modules.
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1407 # Step 1: Group the parameters according to their execution order and attributes.
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1445 # Step 2: Bucket the parameters based on the guide bucket size.
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1494 # Step 3: Split parameter groups to meet communication segmentation requirements.
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1539 # Step 4: Generate the groups of collective buckets, where each group aggregates
LOW…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py4050 # If use double buffer, we need to check if the next bucket
LOW…e/transformer/custom_layers/batch_invariant_kernels.py265 # Step 1: Find maximum value in the row for numerical stability
LOW…e/transformer/custom_layers/batch_invariant_kernels.py277 # Step 2: Compute sum of exp(x - max_val)
LOW…e/transformer/custom_layers/batch_invariant_kernels.py293 # Step 3: Compute final log_softmax values: x - max_val - log_sum_exp
LOWmegatron/core/pipeline_parallel/bridge_communicator.py415 # Step 1: broadcast its shape so receivers can allocate
LOWmegatron/core/pipeline_parallel/bridge_communicator.py421 # Step 2: broadcast the actual tensor
LOW…n/core/dist_checkpointing/strategies/fully_parallel.py238 # Step 3: load part of the checkpoint.
LOW…n/core/dist_checkpointing/strategies/fully_parallel.py264 # Step 4: exchange data between ranks
LOW…gatron/core/resharding/nvshmem_copy_service/service.py232 # Step 1: Segment tasks (break large tasks into chunks)
LOW…gatron/core/resharding/nvshmem_copy_service/service.py242 # Step 2: Pack tasks into workload groups
LOW…gatron/core/resharding/nvshmem_copy_service/service.py249 # Step 3: Schedule workloads to iterations
LOW…gatron/core/resharding/nvshmem_copy_service/service.py258 # Step 4: Prepare iteration schedules
LOW…gatron/core/resharding/nvshmem_copy_service/service.py264 # Step 5: Build GPU execution plans
LOW…gatron/core/resharding/nvshmem_copy_service/service.py273 # Step 6: Create double-buffered events
LOW…harding/nvshmem_copy_service/core/pipeline_executor.py153 # Step 1: Pack NEXT iteration (async)
LOW…harding/nvshmem_copy_service/core/pipeline_executor.py165 # Step 2: Unpack PRIOR iteration (async)
LOW…harding/nvshmem_copy_service/core/pipeline_executor.py180 # Step 3: Send CURRENT iteration
LOW…harding/nvshmem_copy_service/core/pipeline_executor.py223 # Step 5: Wait for async pack to complete (double-buffer safety)
LOW…shmem_copy_service/planning/communication_scheduler.py39 # Step 1: Collect all batches across all PE pairs
LOW…shmem_copy_service/planning/communication_scheduler.py44 # Step 2: Assign batches to iterations using greedy conflict-free algorithm
LOW…shmem_copy_service/planning/communication_scheduler.py49 # Step 3: Exchange detailed workload summaries (Task IDs/Sizes)
LOW…shmem_copy_service/planning/communication_scheduler.py55 # Step 4: Build schedule map for this PE
LOW…e/communication/torch_symm_triton/fused_collectives.py151 # Step 1: - reduce-scatter + residual add for this token + collect sq sum
AI Slop Vocabulary · 34 hits · 76 pts
Severity File Line Snippet
LOWtasks/finetune_utils.py120 # shuffling so we can just use a simple infinite loop.
MEDIUMtests/unit_tests/test_inference.py107 # we are replicating what lm-eval-harness::TemplateLM::_encode_pair does
MEDIUM…s/checkpoint/test_gpt_hybrid_conversion_parallelism.py43# harness launched pytest. When that default PG is multi-rank (e.g. Megatron's
LOW…ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py227 # DP: Only relevant when using HSDP, where we need the flattened DP group for data parallelism. (Otherwise, just pas
LOW…ts/distributed/megatron_fsdp/test_mfsdp_fully_shard.py229 # DP-Shard-CP: Only required if using CP. Otherwise, just pass dp_shard to FSDP.
MEDIUMtests/unit_tests/transformer/test_utils.py331 """Test a comprehensive scenario with multiple configurations."""
MEDIUMtests/unit_tests/training/models/test_base.py171 """from_dict() reconstructs configs from serialized dicts, handles nested dataclasses, and is robust to unknown keys
LOW…s/unit_tests/dist_checkpointing/models/test_mlp_glu.py129 # Load happens in-place, so we can just use the same tensors
MEDIUM…/unit_tests/inference/contexts/test_dynamic_context.py2292 # 6. Verify seamless append (no legacy offset math needed)
MEDIUM…cipes/h100/gpt-dynamic-inference-with-coordinator.yaml92 # skills/run-performance-tests/SKILL.md for the harness it runs under.
MEDIUM…inference_server_smoke_tp1_pp1_dp8_583m/serve_smoke.py43 # the JET harness expects at ``logs/*/*/attempt_0/*/std*.log``) while still
LOW…sts/functional_tests/shell_test_utils/_run_training.sh85 # If value is "true", just use the key
MEDIUM.github/workflows/claude_review.yml94 # Strict review: comprehensive Megatron-LM focused analysis
LOWmegatron/core/optimizer_param_scheduler.py235 # If the learning rate is constant, just return the initial value.
LOWmegatron/core/timers.py476 # polutes the runs list, so we just add each as a scalar
MEDIUMmegatron/core/ssm/triton_cache_manager.py69 # use temp dir to be robust against program interruptions
LOW…ore/tokenizers/text/parsers/qwen3_coder_tool_parser.py16# These map to vLLM types but we just use dictionaries for now
LOWmegatron/core/tensor_parallel/random.py108 # if not using cuda graphs, just use the builtin pytorch function
LOWmegatron/core/tensor_parallel/random.py189 # already graphable, just return it.
LOWmegatron/core/tensor_parallel/random.py195 # already non-graphable, just return it.
MEDIUM…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py1549 # Set aggregate buckets by FSDP units, i.e. buckets pertaining to the same
MEDIUM…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py2149 # to leverage NCCL UBR for high-precision gradient reduction with
MEDIUM…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py3111 # for a seamless user experience and coverage for ZeRO-1 and ZeRO-2?
MEDIUM…ibuted/fsdp/src/megatron_fsdp/param_and_grad_buffer.py4319 # TODO(@cspades): Clean up this logic in conjunction with
LOW…atron/core/distributed/fsdp/src/megatron_fsdp/utils.py198 # if not using cuda graphs, just use the builtin pytorch function
LOWmegatron/core/transformer/transformer_config.py2237 # so just set both if either is specified.
LOWmegatron/core/transformer/transformer_layer.py862 # elements in bias_chunks are the same for all chunks, so we can just use the first one
MEDIUMmegatron/core/transformer/mlp.py207 # Weight resharding across TP sizes will have aforementioned problems.
MEDIUMmegatron/core/datasets/indexed_dataset.py6# Essentially re-written in entirety
MEDIUM…e/models/common/embeddings/language_model_embedding.py145 # the original tensor from being garbage collected. Clone to facilitate GC.
LOWmegatron/core/models/bert/bert_model.py193 # For local layer spec we just use b1ss
LOWmegatron/core/extensions/transformer_engine.py687 # TODO should we ditch normalization config and just use spec to choose LayerNorm vs RMSNorm?
LOW…tron/core/dist_checkpointing/strategies/async_utils.py479 # to simply call `sync_all_async_calls` to check if other ranks complete the writing
MEDIUM…/core/inference/data_parallel_inference_coordinator.py386 # Todo [Siddharth]: Make this more robust to handle invalid messages.
Cross-Language Confusion · 9 hits · 39 pts
Severity File Line Snippet
HIGH …core/transformer/heterogeneous/heterogeneous_config.py 177 "n_heads_in_group": null,
HIGH …e/pipeline_parallel/fine_grained_activation_offload.py 504 self.push(chunk)
HIGH …e/pipeline_parallel/fine_grained_activation_offload.py 655 self.push(cur_chunk)
HIGH megatron/core/inference/inference_request.py 94 Each block hash is computed as SHA-256(parent_digest || block_bytes), where
HIGH megatron/core/inference/unified_memory.py 113 if (device != prev_device && device >= 0) cudaSetDevice(device);
HIGH megatron/core/inference/unified_memory.py 149 if (device != prev_device && prev_device >= 0) cudaSetDevice(prev_device);
HIGH megatron/core/inference/contexts/dynamic_context.py 3718 # : [ XX | XX | 16 XX | 12 72 24 88 XX ] (XX = undefined)
HIGH megatron/core/inference/moe/vllm_fused_moe.py 518 `input` are undefined).
HIGH megatron/training/arguments.py 3220 "n_heads_in_group": null,
Dead Code · 5 hits · 10 pts
Severity File Line Snippet
MEDIUM tools/checkpoint/loader_llava.py 52
MEDIUM …nce_tp1_pp1_583m_cuda_graphs_validation/cuda_graphs.py 183
MEDIUM …nce_tp1_pp1_583m_cuda_graphs_validation/cuda_graphs.py 184
MEDIUM …nce_tp1_pp1_583m_cuda_graphs_validation/cuda_graphs.py 185
MEDIUM …nce_tp1_pp1_583m_cuda_graphs_validation/cuda_graphs.py 186
Synthetic Comment Markers · 1 hit · 5 pts
Severity File Line Snippet
HIGH .github/workflows/claude-copy-to-main.yml 113 --body "🤖 **This PR was auto-generated by Claude** via the \`/claude copy\` command.\n\nCherry-picked f
Example Usage Blocks · 3 hits · 4 pts
Severity File Line Snippet
LOW tools/bisect.sh 5 # Usage:
LOW tools/common_pile_dataset/setup_common_pile_dataset.sh 9 # Usage:
LOW …unctional_tests/shell_test_utils/run_batch_ci_tests.sh 5 # Usage:
Slop Phrases · 2 hits · 2 pts
Severity File Line Snippet
MEDIUM megatron/core/process_groups_config.py 660 """Check if this rank has a language model.
MEDIUM megatron/core/process_groups_config.py 663 True if this rank has a language model, False otherwise.
Overly Generic Function Names · 3 hits · 2 pts
Severity File Line Snippet
LOW …s/unit_tests/training/config/test_instantiate_utils.py 52 def test_function(arg1=None, arg2=None, **kwargs):
LOW tests/unit_tests/training/config/test_yaml_utils.py 45 def test_function():
LOW megatron/core/utils.py 2544 def my_function():