Repository Analysis

modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).

13.1 (Low AI signal)

Adjusted Score: 13.1
Raw Score: 13.1
Time Factor: 100%
Last Push: 2026-05-30
Stars: 14,335
Language: Python
Lines of Code: 162,148
Files: 1223
Pattern Hits: 1935
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 41 · MEDIUM 79 · LOW 1815
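The severity counts can be cross-checked against the Pattern Hits figure in the summary. A minimal Python sketch, using only numbers printed in this report (the variable names are illustrative):

```python
# Severity counts copied from the Severity Breakdown line of this report.
severity_counts = {"CRITICAL": 0, "HIGH": 41, "MEDIUM": 79, "LOW": 1815}

total_hits = sum(severity_counts.values())
low_share = severity_counts["LOW"] / total_hits

print(total_hits)          # 1935, the same value as "Pattern Hits"
print(f"{low_share:.1%}")  # share of all hits rated LOW
```

The four counts add up to the reported total, and the large majority of hits fall in the LOW bucket.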

Pattern Findings

1935 matches across 17 categories.

Unused Imports (877 hits · 760 pts)
Severity | File | Line | Snippet
LOW | setup.py | 7 |
LOW | tests/llm/test_template.py | 49 |
LOW | tests/models/test_mllm.py | 7 |
LOW | tests/models/test_mllm.py | 7 |
LOW | tests/train/test_resume_from_checkpoint.py | 29 |
LOW | tests/train/test_resume_from_checkpoint.py | 29 |
LOW | tests/train/test_freeze.py | 17 |
LOW | tests/train/test_freeze.py | 17 |
LOW | tests/train/test_freeze.py | 34 |
LOW | tests/train/test_freeze.py | 34 |
LOW | tests/train/test_freeze.py | 51 |
LOW | tests/train/test_freeze.py | 51 |
LOW | tests/train/test_freeze.py | 68 |
LOW | tests/train/test_freeze.py | 68 |
LOW | tests/test_align/test_padding_side.py | 50 |
LOW | tests/hub/test_check_model.py | 17 |
LOW | swift/__init__.py | 7 |
LOW | swift/__init__.py | 7 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 8 |
LOW | swift/__init__.py | 11 |
LOW | swift/__init__.py | 11 |
LOW | swift/__init__.py | 12 |
LOW | swift/__init__.py | 12 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 13 |
LOW | swift/__init__.py | 15 |
LOW | swift/__init__.py | 15 |
LOW | swift/__init__.py | 16 |
LOW | swift/__init__.py | 16 |
LOW | swift/__init__.py | 16 |
LOW | swift/__init__.py | 16 |
LOW | swift/__init__.py | 16 |
LOW | swift/__init__.py | 17 |
LOW | swift/__init__.py | 17 |
LOW | swift/__init__.py | 17 |
LOW | swift/__init__.py | 18 |
LOW | swift/__init__.py | 18 |
LOW | swift/__init__.py | 19 |
LOW | swift/__init__.py | 19 |
LOW | swift/__init__.py | 20 |
LOW | swift/__init__.py | 20 |
LOW | swift/__init__.py | 20 |
817 more matches not shown…
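Most of the rows shown for this category point at swift/__init__.py. One common reason an import checker flags a package `__init__.py` is that names are imported only to be re-exported; the report does not say whether that applies here. The sketch below is a deliberately naive checker written for illustration. It is not the scanner's rule, and `unused_imports` is a hypothetical helper. It shows how a name that appears only in `__all__` ends up reported as unused.

```python
import ast
from typing import List


def unused_imports(source: str) -> List[str]:
    """Return imported names that are never referenced as a bare name.

    Naive on purpose: strings inside __all__ are not treated as usage.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


# Hypothetical package __init__.py content, not taken from ms-swift.
sample = '''
import os
import sys
from .utils import get_logger

__all__ = ["get_logger"]
print(sys.argv)
'''

print(unused_imports(sample))  # ['get_logger', 'os']
```

Here `os` is a genuine unused import, while `get_logger` is a re-export that a stricter checker would likely accept. Hits in an `__init__.py` are therefore worth reviewing by hand before removing anything.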

Hyper-Verbose Identifiers (362 hits · 363 pts)
Severity | File | Line | Snippet
LOW | tests/test_utils.py | 74 | def create_dummy_test_dataset(feat, label, num):
LOW | tests/run.py | 77 | def gather_test_suites_in_files(test_dir, case_file_list, list_tests):
LOW | tests/run.py | 132 | def async_run_command_with_popen(cmd, device_id):
LOW | tests/run.py | 315 | def run_non_parallelizable_test_suites(suites, result_dir):
LOW | tests/utils/test_async_rewards.py | 17 | def test_start_and_shutdown_event_loop_in_daemon(self):
LOW | tests/utils/test_async_rewards.py | 38 | def test_run_async_function_in_daemon_loop(self):
LOW | tests/utils/test_async_rewards.py | 57 | def test_async_orm_base_class(self):
LOW | tests/utils/test_async_rewards.py | 80 | def test_async_reward_is_detected(self):
LOW | tests/utils/test_async_rewards.py | 106 | def test_parallel_async_execution(self):
LOW | tests/utils/test_async_rewards.py | 153 | def test_async_reward_function_batch_performance(self):
LOW | tests/utils/test_rewards.py | 38 | def test_multiple_steps_with_boxed(self):
LOW | tests/utils/test_rewards.py | 62 | def test_batch_processing_no_tag(self):
LOW | tests/utils/test_rewards.py | 73 | def test_answer_tag_with_plain_number(self):
LOW | tests/utils/test_rewards.py | 82 | def test_answer_tag_with_latex(self):
LOW | tests/utils/test_rewards.py | 91 | def test_long_text_with_answer_tag(self):
LOW | tests/utils/test_rewards.py | 106 | def test_answer_tag_with_complex_expression(self):
LOW | tests/utils/test_rewards.py | 115 | def test_solution_with_answer_tag(self):
LOW | tests/utils/test_rewards.py | 124 | def test_answer_tag_wrong_answer(self):
LOW | tests/utils/test_rewards.py | 133 | def test_mixed_batch_with_and_without_tags(self):
LOW | tests/utils/test_rewards.py | 170 | def test_answer_tag_with_extra_whitespace(self):
LOW | tests/utils/test_rewards.py | 179 | def test_multiple_answer_tags(self):
LOW | tests/utils/test_rewards.py | 188 | def test_real_world_example_from_user(self):
LOW | tests/utils/test_rewards.py | 212 | def test_equivalent_fractions(self):
LOW | tests/utils/test_rewards.py | 221 | def test_different_forms_same_answer(self):
LOW | tests/utils/test_rewards.py | 230 | def test_latex_inline_math_delimiters(self):
LOW | tests/utils/test_rewards.py | 240 | def test_latex_display_math_delimiters(self):
LOW | tests/utils/test_rewards.py | 249 | def test_mixed_latex_delimiters(self):
LOW | tests/tuners/test_swift_device_map.py | 25 | def test_swift_multiple_adapters(self):
LOW | tests/tuners/test_swift_base.py | 73 | def test_swift_adapter_forward(self):
LOW | tests/tuners/test_swift_base.py | 93 | def test_swift_prompt_forward(self):
LOW | tests/tuners/test_swift_base.py | 110 | def test_swift_restuner_forward(self):
LOW | tests/tuners/test_swift_base.py | 133 | def lora_injection_with_dtype(self, dtype=torch.float32):
LOW | tests/tuners/test_swift_base.py | 185 | def test_swift_lora_injection(self):
LOW | tests/tuners/test_swift_base.py | 257 | def test_swift_multiple_adapters(self):
LOW | tests/tuners/test_swift_base.py | 382 | def test_swift_multiple_adapters_switching(self):
LOW | tests/tuners/test_peft.py | 120 | def test_peft_adalora_injection(self):
LOW | tests/tuners/test_extra_state_dict.py | 26 | def test_swift_extra_state_dict(self):
LOW | tests/tuners/test_extra_state_dict.py | 42 | def test_swift_modules_to_save(self):
LOW | tests/tuners/test_swift_restuning.py | 95 | def test_swift_restuning_diffusers_sd(self):
LOW | tests/tuners/test_scetuning.py | 64 | def test_scetuning_part_mixin(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 32 | def _compute_sequence_level_ratios(self, is_ratio: torch.Tensor, completion_mask: torch.Tensor) -> torch.Tensor:
LOW | tests/train/test_vllm_importance_sampling_basic.py | 49 | def _apply_rollout_importance_sampling(self, rollout_log_ratio: torch.Tensor,
LOW | tests/train/test_vllm_importance_sampling_basic.py | 99 | def _compute_is_correction_metrics(
LOW | tests/train/test_vllm_importance_sampling_basic.py | 166 | def test_token_truncate_basic(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 199 | def test_sequence_truncate_basic(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 237 | def test_threshold_sensitivity(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 345 | def test_clipped_frac_token_truncate(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 359 | def test_clipped_frac_token_mask(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 373 | def test_clipped_frac_sequence_level(self):
LOW | tests/train/test_vllm_importance_sampling_basic.py | 391 | def test_kl_divergence_same_policy(self):
LOW | tests/train/test_resume_from_checkpoint.py | 28 | def test_resume_from_checkpoint():
LOW | tests/train/test_sft.py | 162 | def test_mllm_streaming_mp_ddp():
LOW | tests/train/test_sft.py | 215 | def test_resume_from_checkpoint():
LOW | tests/train/test_sft.py | 275 | def test_predict_with_generate():
LOW | tests/train/test_export_cached_dataset.py | 1 | def test_export_cached_dataset():
LOW | tests/test_align/test_template/test_vision.py | 1063 | def _infer_ernie_vl_thinking_hf(model, processor, messages):
LOW | tests/test_align/test_template/test_gene.py | 7 | def test_deepseek_janus_pro_gene():
LOW | docs/source/BestPractices/MLLM-Registration.md | 218 | def _get_new_tokens_use_audio_in_video(self, i, *, video_grid_thw, video_second_per_grid, audio_lengths,
LOW | docs/source_en/BestPractices/MLLM-Registration.md | 221 | def _get_new_tokens_use_audio_in_video(self, i, *, video_grid_thw, video_second_per_grid, audio_lengths,
LOW | swift/metrics/reranker.py | 111 | def calculate_ndcg_single_query(relevance_scores, ranking):
302 more matches not shown…

Deep Nesting (326 hits · 298 pts)
Severity | File | Line | Snippet
LOW | setup.py | 24 |
LOW | setup.py | 44 |
LOW | setup.py | 78 |
LOW | setup.py | 96 |
LOW | tests/test_utils.py | 124 |
LOW | tests/run.py | 33 |
LOW | tests/run.py | 77 |
LOW | tests/run.py | 185 |
LOW | tests/run.py | 204 |
LOW | tests/run.py | 341 |
LOW | tests/run.py | 454 |
LOW | tests/general/test_arch.py | 1 |
LOW | tests/train/test_vllm_importance_sampling_basic.py | 49 |
LOW | swift/ui/base.py | 268 |
LOW | swift/ui/base.py | 314 |
LOW | swift/ui/llm_export/llm_export.py | 101 |
LOW | swift/ui/llm_grpo/llm_grpo.py | 216 |
LOW | swift/ui/llm_grpo/external_rollout.py | 145 |
LOW | swift/ui/llm_grpo/tuner.py | 18 |
LOW | swift/ui/llm_rlhf/llm_rlhf.py | 232 |
LOW | swift/ui/llm_rlhf/tuner.py | 18 |
LOW | swift/ui/llm_sample/llm_sample.py | 166 |
LOW | swift/ui/llm_eval/llm_eval.py | 104 |
LOW | swift/ui/llm_infer/runtime.py | 133 |
LOW | swift/ui/llm_infer/runtime.py | 265 |
LOW | swift/ui/llm_infer/llm_infer.py | 124 |
LOW | swift/ui/llm_infer/llm_infer.py | 189 |
LOW | swift/ui/llm_train/task.py | 64 |
LOW | swift/ui/llm_train/runtime.py | 295 |
LOW | swift/ui/llm_train/runtime.py | 374 |
LOW | swift/ui/llm_train/runtime.py | 407 |
LOW | swift/ui/llm_train/runtime.py | 542 |
LOW | swift/ui/llm_train/runtime.py | 600 |
LOW | swift/ui/llm_train/llm_train.py | 242 |
LOW | swift/ui/llm_train/llm_train.py | 325 |
LOW | swift/ui/llm_train/llm_train.py | 551 |
LOW | swift/ui/llm_train/optimizer.py | 121 |
LOW | swift/ui/llm_train/tuner.py | 290 |
LOW | swift/loss/embedding.py | 88 |
LOW | swift/loss/embedding.py | 115 |
LOW | swift/callbacks/activation_cpu_offload.py | 330 |
LOW | swift/callbacks/activation_cpu_offload.py | 597 |
LOW | swift/dataset/dataset_meta.py | 120 |
LOW | swift/dataset/packing.py | 32 |
LOW | swift/dataset/loader.py | 70 |
LOW | swift/dataset/dataset/mllm.py | 567 |
LOW | swift/dataset/dataset/mllm.py | 239 |
LOW | swift/dataset/dataset/mllm.py | 715 |
LOW | swift/dataset/dataset/mllm.py | 791 |
LOW | swift/dataset/dataset/mllm.py | 952 |
LOW | swift/dataset/dataset/mllm.py | 1233 |
LOW | swift/dataset/dataset/llm.py | 764 |
LOW | swift/dataset/preprocessor/core.py | 170 |
LOW | swift/dataset/preprocessor/core.py | 491 |
LOW | swift/rewards/orm.py | 238 |
LOW | swift/rlhf_trainers/rollout_mixin.py | 194 |
LOW | swift/rlhf_trainers/rollout_mixin.py | 334 |
LOW | swift/rlhf_trainers/rollout_mixin.py | 471 |
LOW | swift/rlhf_trainers/rollout_mixin.py | 703 |
LOW | swift/rlhf_trainers/rollout_mixin.py | 757 |
266 more matches not shown…

Excessive Try-Catch Wrapping (171 hits · 192 pts)
Severity | File | Line | Snippet
LOW | tests/run.py | 349 | except Exception:
LOW | tests/model_tag.py | 66 | except Exception as e:
LOW | tests/model_tag.py | 94 | except Exception as e:
LOW | tests/model_tag.py | 115 | except Exception as e:
LOW | tests/model_tag.py | 131 | except Exception as e:
MEDIUM | tests/model_tag.py | 53 | def _post_request(self, url, param):
MEDIUM | tests/model_tag.py | 72 | def batch_commit_result(self):
MEDIUM | tests/model_tag.py | 100 | def batch_refresh_stage(self):
MEDIUM | tests/model_tag.py | 121 | def query_model_stage(self):
LOW | tests/general/test_arch.py | 18 | except Exception:
LOW | tests/deploy/test_dataset.py | 12 | except Exception:
LOW | tests/train/test_vllm_importance_sampling_basic.py | 477 | except Exception as e:
LOW | docs/source/BestPractices/GRPO.md | 78 | except Exception as e:
LOW | docs/source/BestPractices/GRPO-Multi-Modal-Training.md | 79 | except Exception:
LOW | docs/source/BestPractices/GRPO-Multi-Modal-Training.md | 96 | except Exception:
LOW | …source/Instruction/GRPO/DeveloperGuide/reward_model.md | 118 | except Exception as e:
LOW | docs/source_en/BestPractices/GRPO.md | 82 | except Exception as e:
LOW | …s/source_en/BestPractices/GRPO-Multi-Modal-Training.md | 85 | except Exception:
LOW | …s/source_en/BestPractices/GRPO-Multi-Modal-Training.md | 102 | except Exception:
LOW | …rce_en/Instruction/GRPO/DeveloperGuide/reward_model.md | 117 | except Exception as e:
LOW | swift/ui/base.py | 347 | except Exception as e:
LOW | swift/ui/base.py | 368 | except Exception:
LOW | swift/ui/llm_grpo/external_runtime.py | 135 | except Exception as e:
LOW | swift/ui/llm_eval/eval.py | 103 | except Exception as e:
LOW | swift/ui/llm_infer/runtime.py | 259 | except Exception as e:
LOW | swift/ui/llm_train/runtime.py | 590 | except Exception as e:
LOW | swift/ui/llm_train/llm_train.py | 406 | except Exception as e:
LOW | swift/ui/llm_train/llm_train.py | 530 | except Exception as err:
LOW | swift/dataset/utils.py | 95 | except Exception as e:
LOW | swift/dataset/packing.py | 151 | except Exception as e:
LOW | swift/dataset/loader.py | 116 | except Exception as e:
LOW | swift/dataset/dataset/mllm.py | 796 | except Exception:
LOW | swift/dataset/preprocessor/core.py | 192 | except Exception as e:
LOW | swift/rewards/prm.py | 90 | except Exception:
LOW | swift/rewards/prm.py | 148 | except Exception:
LOW | swift/rewards/orm.py | 114 | except Exception:
LOW | swift/rewards/orm.py | 251 | except Exception:
LOW | swift/rewards/orm.py | 258 | except Exception:
LOW | swift/rewards/orm.py | 300 | except Exception:
LOW | swift/rewards/orm.py | 374 | except Exception:
LOW | swift/rewards/orm.py | 419 | except Exception:
LOW | swift/rewards/rm_plugin.py | 224 | except Exception as e:
MEDIUM | swift/rlhf_trainers/rollout_mixin.py | 1563 | def infer_task():
MEDIUM | swift/rlhf_trainers/rollout_mixin.py | 1573 | def done(future):
LOW | swift/rlhf_trainers/rollout_mixin.py | 1567 | except Exception as e:
LOW | swift/rlhf_trainers/rollout_mixin.py | 1577 | except Exception as e:
LOW | swift/rlhf_trainers/rollout_mixin.py | 1760 | except Exception as e:
LOW | swift/rlhf_trainers/reward_trainer.py | 102 | except Exception as e:
LOW | swift/rlhf_trainers/gkd_trainer.py | 1221 | except Exception:
LOW | swift/rlhf_trainers/gkd_trainer.py | 1273 | except Exception as e:
MEDIUM | swift/rlhf_trainers/gkd_trainer.py | 1230 | def _fetch_one(idx):
LOW | swift/rlhf_trainers/utils.py | 1012 | except Exception as e: # noqa: BLE001
LOW | swift/rlhf_trainers/utils.py | 1099 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 98 | except Exception:
LOW | swift/rlhf_trainers/vllm_client.py | 172 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 244 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 292 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 354 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 397 | except Exception as e:
LOW | swift/rlhf_trainers/vllm_client.py | 421 | except Exception as e:
111 more matches not shown…
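Nearly every snippet in this category is a bare `except Exception` handler. The report does not state the scanner's exact rule, so the following is only a generic Python illustration of the difference between a broad handler and a narrow one. The function names and the JSON payload shape are hypothetical and are not taken from ms-swift.

```python
import json
from typing import Optional


def parse_score_broad(raw: str) -> Optional[float]:
    """Broad handler: every failure, including programming errors, becomes None."""
    try:
        return float(json.loads(raw)["score"])
    except Exception:
        return None


def parse_score_narrow(raw: str) -> Optional[float]:
    """Narrow handler: only the failures expected from malformed input."""
    try:
        return float(json.loads(raw)["score"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None


print(parse_score_narrow('{"score": "0.5"}'))  # 0.5
print(parse_score_narrow("not json"))          # None
print(parse_score_narrow("{}"))                # None
```

Both versions behave the same on malformed input; the narrow one lets unexpected errors surface instead of hiding them. A broad handler can still be a deliberate choice, for example where a single bad sample must not abort a long training run, so each hit needs to be judged in context.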

Cross-File Repetition (21 hits · 105 pts)
Severity | File | Line | Snippet
HIGH | docs/source/BestPractices/GRPO.md | 0 | evaluates completions based on mathematical correctness of the answer args: completions (list[str]): generated outputs t
HIGH | docs/source_en/BestPractices/GRPO.md | 0 | evaluates completions based on mathematical correctness of the answer args: completions (list[str]): generated outputs t
HIGH | examples/train/grpo/plugin/plugin.py | 0 | evaluates completions based on mathematical correctness of the answer args: completions (list[str]): generated outputs t
HIGH | docs/source/BestPractices/GRPO-Multi-Modal-Training.md | 0 | reward function that checks if the completion is correct. args: completions (list[str]): generated outputs solution (lis
HIGH | …s/source_en/BestPractices/GRPO-Multi-Modal-Training.md | 0 | reward function that checks if the completion is correct. args: completions (list[str]): generated outputs solution (lis
HIGH | examples/train/grpo/plugin/plugin.py | 0 | reward function that checks if the completion is correct. args: completions (list[str]): generated outputs solution (lis
HIGH | docs/source/BestPractices/AMD-support.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source/BestPractices/Qwen3_5-Best-Practice.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source/BestPractices/Qwen3_5-Best-Practice.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source_en/BestPractices/AMD-support.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source_en/BestPractices/Qwen3_5-Best-Practice.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source_en/BestPractices/Qwen3_5-Best-Practice.md | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | examples/models/qwen3_5/mcore_grpo_moe.sh | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | examples/train/grpo/plugin/gsm8k/gsm8k.sh | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | examples/train/grpo/internal/fipo.sh | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | examples/train/grpo/internal/real.sh | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | examples/megatron/grpo/real.sh | 0 | you are a helpful math assistant. solve the problem step by step and put your final answer within \\boxed{}.
HIGH | docs/source/Instruction/GRPO/DeveloperGuide/gym_env.md | 0 | rolloutinferrequest( messages=[ {'role': 'system', 'content': 'a conversation between user and assistant. the user asks
HIGH | …s/source/Instruction/GRPO/DeveloperGuide/multi_turn.md | 0 | rolloutinferrequest( messages=[ {'role': 'system', 'content': 'a conversation between user and assistant. the user asks
HIGH | …s/source_en/Instruction/GRPO/DeveloperGuide/gym_env.md | 0 | rolloutinferrequest( messages=[ {'role': 'system', 'content': 'a conversation between user and assistant. the user asks
HIGH | …ource_en/Instruction/GRPO/DeveloperGuide/multi_turn.md | 0 | rolloutinferrequest( messages=[ {'role': 'system', 'content': 'a conversation between user and assistant. the user asks

Decorative Section Separators (32 hits · 99 pts)
Severity | File | Line | Snippet
MEDIUM | swift/rlhf_trainers/utils.py | 1588 | # ============================================================================
MEDIUM | swift/rlhf_trainers/utils.py | 1590 | # ============================================================================
MEDIUM | swift/rlhf_trainers/grpo_trainer.py | 493 | # --------------------------------------------------
MEDIUM | swift/rlhf_trainers/grpo_trainer.py | 495 | # --------------------------------------------------
MEDIUM | swift/rlhf_trainers/grpo_trainer.py | 579 | # --------------------------------------------------
MEDIUM | swift/rlhf_trainers/grpo_trainer.py | 581 | # --------------------------------------------------
MEDIUM | swift/pipelines/infer/rollout.py | 308 | # ------------------------------------------------------------------
MEDIUM | swift/pipelines/infer/rollout.py | 310 | # ------------------------------------------------------------------
MEDIUM | swift/pipelines/infer/rollout.py | 433 | # ── Step 1: receive + rebuild IPC handle (with reuse) ────────
MEDIUM | swift/pipelines/infer/rollout.py | 471 | # ── Step 2: stream buckets and load_weights per bucket ──────
MEDIUM | swift/model/npu_patch/model.py | 18 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 20 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 177 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 179 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 209 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 211 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 293 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 295 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 383 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 385 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 429 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 431 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 483 | # ---------------------------------------------------------------------------
MEDIUM | swift/model/npu_patch/model.py | 485 | # ---------------------------------------------------------------------------
MEDIUM | swift/ray/megatron/rollout/ray_vllm_engine.py | 136 | # ------------------------------------------------------------------
MEDIUM | swift/ray/megatron/rollout/ray_vllm_engine.py | 138 | # ------------------------------------------------------------------
MEDIUM | swift/ray/megatron/checkpoint_engine/nccl.py | 240 | # ── Send / Receive ───────────────────────────────────────────────────
MEDIUM | swift/ray/megatron/checkpoint_engine/hccl.py | 168 | # ── Core lifecycle ───────────────────────────────────────────────────
MEDIUM | swift/ray/megatron/checkpoint_engine/hccl.py | 276 | # ── Metadata exchange ────────────────────────────────────────────────
MEDIUM | swift/ray/megatron/checkpoint_engine/hccl.py | 310 | # ── Send / Receive ───────────────────────────────────────────────────
MEDIUM | swift/megatron/arguments/megatron_args.py | 170 | # ─────────────────────────── Not Supported Yet ───────────────────────────
MEDIUM | examples/train/rlhf/gkd/teacher_server.sh | 10 | # ========================================================================

Docstring Block Structure (14 hits · 70 pts)
Severity | File | Line | Snippet
HIGH | docs/README.md | 18 | Load data from json/yaml/pickle files. This method provides a unified api for loading data from serialized file
HIGH | swift/dataset/loader.py | 239 | Load and preprocess datasets. This function provides a unified interface to load datasets from various sources (Hug
HIGH | swift/rewards/rm_plugin.py | 123 | Extract the reward score from the model's output. Args: model_output (str): The model's ou
HIGH | swift/rlhf_trainers/rollout_mixin.py | 335 | Split model parameters into batches for synchronized weight transfer. This method divides model parameters into
HIGH | swift/rlhf_trainers/rollout_mixin.py | 1709 | Attempt to encode each input using the template. If encoding fails, resample from a backup iterator unt
HIGH | swift/template/register.py | 81 | Get or create a template instance for model input/output formatting. This function retrieves the appropriate templa
HIGH | swift/loss_scale/mapping.py | 20 | Factory function to create a loss scale object from a string specification. The loss_scale string can be in three f
HIGH | swift/utils/hub_utils.py | 27 | Download model snapshot safely with DDP context protection. This function attempts to download a model from Hugging
HIGH | swift/utils/torch_utils.py | 167 | Get the last valid (non-padding) token position indices for each sample. This function correctly handles seque
HIGH | swift/rollout/multi_turn.py | 30 | Perform asynchronous batched inference for multiple rollout requests. This method serves as the main e
HIGH | swift/rollout/multi_turn.py | 183 | Execute multi-turn conversation rollout with built-in turn management logic. This implements the default multi-
HIGH | swift/model/chunk_gated_delta_rule.py | 256 | Args: q (torch.Tensor): queries of shape `[B, T, H, K]`. k (torch.Tensor):
HIGH | swift/model/register.py | 539 | Load a pretrained model and its processor from a model hub or local path. Args: model_id_or_path: The model
HIGH | swift/megatron/trainers/grpo_trainer.py | 1576 | Attempt to encode each input using the template. If encoding fails, resample from a backup iterator unt

Verbosity Indicators (29 hits · 52 pts)
Severity | File | Line | Snippet
LOW | tests/utils/test_rewards.py | 193 | '### Step 1: Recall the formula\n\n'
LOW | tests/utils/test_rewards.py | 195 | '### Step 2: Use the given terms\n\n'
LOW | tests/utils/test_rewards.py | 200 | '### Step 3: Find $a_9$\n\n'
LOW | swift/metrics/reranker.py | 61 | # Step 1: Find all positive sample indices (query boundaries)
LOW | swift/metrics/reranker.py | 67 | # Step 2: Split into groups (queries)
LOW | swift/metrics/reranker.py | 85 | # Step 3: Calculate metrics for each query independently
LOW | swift/metrics/reranker.py | 135 | # Step 4: Calculate mean metrics across all valid queries
LOW | swift/template/templates/minicpm.py | 429 | # Step 1: Base encode — produces input_ids with -100 for images
LOW | swift/template/templates/minicpm.py | 443 | # Step 2: Process images — replace -100 tokens with image placeholders
LOW | swift/template/templates/minicpm.py | 457 | # Step 3: Process audios — expand audio_start/audio_end pairs with <unk> tokens
LOW | swift/template/templates/minicpm.py | 495 | # Step 4: Compute image_bound using start/end token boundaries
LOW | swift/template/templates/minicpm.py | 520 | # Step 5: Compute audio_bounds
LOW | swift/sequence_parallel/ulysses.py | 511 | # Step 1: Gather from all sequence parallel ranks
LOW | swift/sequence_parallel/ulysses.py | 519 | # Step 2: Gather all rp chunks
LOW | swift/megatron/trainers/vocab_parallel_utils.py | 41 | # Step 1: Find global max for numerical stability
LOW | swift/megatron/trainers/vocab_parallel_utils.py | 45 | # Step 2: Compute exp(logits - max) and sum across all TP ranks
LOW | swift/megatron/trainers/vocab_parallel_utils.py | 50 | # Step 3: Compute log_softmax
LOW | swift/megatron/trainers/grpo_trainer.py | 387 | # Step 2: Compute KL from logps if kl_in_reward is enabled
LOW | swift/megatron/trainers/grpo_trainer.py | 392 | # Step 3: Compute advantages (with KL penalty if kl_in_reward is enabled)
LOW | swift/megatron/trainers/grpo_trainer.py | 396 | # Step 4: Add advantages to encoded batches
LOW | swift/megatron/trainers/grpo_trainer.py | 557 | # Step 1: Update or append assistant message
LOW | swift/megatron/trainers/grpo_trainer.py | 565 | # Step 2: Add token IDs and loss mask
LOW | swift/megatron/trainers/grpo_trainer.py | 574 | # Step 3: Attach rollout extra info
LOW | swift/megatron/trainers/grpo_trainer.py | 578 | # Step 4: Store finish reason (used for truncation filters etc.)
LOW | swift/megatron/trainers/grpo_trainer.py | 583 | # Step 5: Store rollout logprobs for importance sampling correction
LOW | swift/megatron/trainers/grpo_trainer.py | 591 | # Step 6: Store rollout routed_experts for routing replay
LOW | swift/megatron/trainers/grpo_trainer.py | 375 | # Step 1: Encode batches and compute logps first (unified flow like GRPOTrainer)
LOW | swift/megatron/trainers/grpo_trainer.py | 489 | # Step 1: Wake up the engine if it's sleeping (vLLM colocate mode)
LOW | swift/megatron/trainers/grpo_trainer.py | 496 | # Step 2: Load model weights

Self-Referential Comments (18 hits · 48 pts)
Severity | File | Line | Snippet
MEDIUM | swift/rewards/rm_plugin.py | 3 | # This module provides plugins for integrating external reward models,
MEDIUM | swift/rewards/rm_plugin.py | 169 | # Define a mapping for role capitalization if needed
MEDIUM | swift/rlhf_trainers/utils.py | 356 | # Create a DeepSpeedPlugin with the processed config
MEDIUM | swift/rlhf_trainers/grpo_trainer.py | 2014 | # Create a copy to avoid modifying the original table used by other loggers.
MEDIUM | swift/template/base.py | 461 | # This function is only used to handle scenarios where the model needs
MEDIUM | swift/template/base.py | 1587 | """This function is important for multi-modal training, as it registers the post_encode method
MEDIUM | swift/pipelines/train/tuner.py | 127 | """This function is only useful on the vera tuner"""
MEDIUM | swift/pipelines/infer/rollout.py | 144 | # Create a stateless process group to manage communication between training processes and vLLM workers.
MEDIUM | swift/pipelines/eval/utils.py | 140 | # Create a future to receive the result asynchronously
MEDIUM | swift/utils/hf_config.py | 13 | """This class is used to read config from config.json(maybe params.json also)"""
MEDIUM | swift/rollout/multi_turn.py | 478 | # Create a RolloutOutput for the current round
MEDIUM | swift/rollout/multi_turn.py | 543 | # Create a mock inputs object to use the template's _swift_prepare_inputs method
MEDIUM | examples/train/grpo/plugin/plugin.py | 73 | # Define a regex pattern that only allows numbers, operators, parentheses, and whitespace
MEDIUM | examples/train/grpo/plugin/plugin.py | 220 | # Create the sandbox by hand, currently there's no context manager for this version
MEDIUM | examples/train/grpo/plugin/plugin.py | 226 | # Create a list of tasks for running scripts concurrently
MEDIUM | examples/train/grpo/plugin/plugin.py | 205 | # Create a new event loop and set it
MEDIUM | examples/train/think_model/qwen3_demo1.sh | 3 | # This method is also applicable to the Deepseek-R1 series of models.
MEDIUM | examples/custom/my_qwen2_5_omni/my_register.py | 307 | """This function is typically used to solve the zero2/zero3 hanging issue in mixed model training,

Over-Commented Block (48 hits · 47 pts)
Severity | File | Line | Snippet
LOW | tests/general/test_dataset.py | 21 | 'AI-ModelScope/LongAlpaca-12k#1000'
LOW | tests/general/test_dataset.py | 81 |
LOW | tests/train/test_sft.py | 421 | # test_llm_hqq()
LOW | tests/test_align/test_vllm_vlm.py | 201 | # test_ovis2()
LOW | tests/test_align/test_template/test_agent.py | 721 | print(f'labels: {template.safe_decode(encoded2["labels"])}')
LOW | tests/test_align/test_template/test_agent.py | 741 | # test_glm4_5()
LOW | tests/test_align/test_template/test_llm.py | 741 | # test_qwen1_5()
LOW | tests/test_align/test_template/test_llm.py | 761 | # test_phi4()
LOW | tests/test_align/test_template/test_llm.py | 781 | # test_glm4_5()
LOW | tests/test_align/test_template/test_video.py | 401 | # test_qwen2_vl()
LOW | tests/test_align/test_template/test_vision.py | 1261 |
LOW | tests/test_align/test_template/test_vision.py | 1281 | # test_deepseek_vl()
LOW | tests/test_align/test_template/test_vision.py | 1301 | # test_internvl2_5_mpo()
LOW | tests/test_align/test_template/test_vision.py | 1321 | # test_glyph()
LOW | tests/test_align/test_template/test_vision.py | 1341 | # test_mistral_2506()
LOW | tests/test_align/test_template/test_template.py | 161 |
LOW | tests/eval/test_eval.py | 61 | test_eval_llm()
LOW | docs/source/conf.py | 1 | # Configuration file for the Sphinx documentation builder.
LOW | docs/source/conf.py | 81 |
LOW | docs/source_en/conf.py | 1 | # Configuration file for the Sphinx documentation builder.
LOW | docs/source_en/conf.py | 81 |
LOW | docs/source_en/Megatron-SWIFT/LoRA-Training.md | 141 | # swift export
LOW | swift/rlhf_trainers/args_mixin.py | 441 | log_rollout_offpolicy_metrics: bool = False # Log off-policy metrics even when IS correction is disabled
LOW | swift/rlhf_trainers/grpo_trainer.py | 181 | )
LOW | swift/rlhf_trainers/grpo_trainer.py | 2601 | # 2d. Log PPL difference (sequence-level perplexity difference)
LOW | swift/pipelines/infer/rollout.py | 301 | """
LOW | swift/infer_engine/vllm_engine.py | 481 | else:
LOW | examples/ascend/activation_cpu_offload/train.sh | 41 | # {'train_runtime': 79.7064, 'train_samples_per_second': 6.311, 'train_steps_per_second': 0.201, 'train_loss': 1.9164841
LOW | examples/deploy/vllm_dp.sh | 1 | CUDA_VISIBLE_DEVICES=0,1 swift deploy \
LOW | examples/deploy/vllm_dp.sh | 21 | # "temperature": 0
LOW | examples/deploy/vllm.sh | 1 | CUDA_VISIBLE_DEVICES=0 swift deploy \
LOW | examples/models/gpt_oss/internvl3_5_gpt.sh | 41 | --moe_expert_capacity_factor 2 \
LOW | examples/models/qwen3_next/mtp.sh | 41 | --sequence_parallel true \
LOW | examples/models/qwen3_5/mcore.sh | 61 | # CUDA_VISIBLE_DEVICES=0,1,2,3 \
LOW | examples/models/qwen3_5/transformers.sh | 41 | # IMAGE_MAX_TOKEN_NUM=1024 \
LOW | examples/models/qwen3_5/fp8.sh | 61 | # swift infer \
LOW | …mples/train/multimodal/lora_llm_full_vit/merge_lora.sh | 1 | CUDA_VISIBLE_DEVICES=0 \
LOW | …s/train/sequence_parallel/sequence_parallel_qwen3_5.sh | 21 | --logging_steps 1 \
LOW | examples/train/rlhf/opsd/opsd.sh | 1 | # OPSD Training Script
LOW | examples/train/grpo/plugin/run_external_scheduler.sh | 1 | # This script require main branch ms-swift
LOW | examples/train/grpo/plugin/deepeyes/deepeyes.sh | 1 | # 8 * 80G
LOW | examples/train/grpo/internal/full_lmdeploy.sh | 1 | # The LMDeploy backend in GRPO has been deprecated in Swift 3.5.
LOW | examples/train/grpo/external/vllm_multi_turn.sh | 1 | # Exp: https://github.com/modelscope/ms-swift/pull/5307#issuecomment-3219803922
LOW | examples/megatron/mcore_bridge/full/dense.sh | 41 | # VIDEO_MAX_TOKEN_NUM=128 \
LOW | examples/megatron/mcore_bridge/lora/seq_cls.sh | 41 | --no_save_rng true \
LOW | examples/megatron/multimodal/lora_llm_vit_full/sft.sh | 61 | # IMAGE_MAX_TOKEN_NUM=1024 \
LOW | examples/megatron/grpo/dense_server.sh | 1 | # MAX_PIXELS=602112 \
LOW | examples/megatron/lora/mtp.sh | 41 | --dataloader_num_workers 8 \

Redundant / Tautological Comments (21 hits · 34 pts)
Severity | File | Line | Snippet
LOW | swift/callbacks/activation_cpu_offload.py | 603 | # Check if model is wrapped with FSDP
LOW | swift/callbacks/activation_cpu_offload.py | 607 | # Check if fsdp_config is a dictionary and has activation_cpu_offload enabled
LOW | swift/callbacks/lisa.py | 43 | # Check if it's time to switch active layers, including at step 0
LOW | swift/rlhf_trainers/rollout_mixin.py | 985 | # Check if the number of logprobs matches the number of loss_mask=1 tokens
LOW | swift/rlhf_trainers/utils.py | 932 | # Check if already patched (idempotent).
LOW | swift/rlhf_trainers/utils.py | 964 | # Check if inner_model has layers attribute
LOW | swift/rlhf_trainers/grpo_trainer.py | 2323 | # Set max_length to None to disable truncation, as the input length has already been truncated earlier.
LOW | swift/pipelines/eval/utils.py | 168 | # Check if we've reached the desired batch size
LOW | swift/trainers/mixin.py | 406 | # Check if we should delete older checkpoint(s)
LOW | swift/trainers/reranker_trainer.py | 18 | # Check if we have a custom loss function
LOW | swift/rollout/multi_turn.py | 289 | # Check if the number of logprobs matches the number of loss_mask=1 tokens
LOW | swift/infer_engine/utils.py | 539 | # Check if we have a cached last_output from the previous iteration.
LOW | swift/infer_engine/utils.py | 606 | # Check if need to run the usual non-async path
LOW | swift/hub/hub.py | 351 | # Write the file if it has changed
LOW | swift/arguments/rlhf_args.py | 559 | # Check if teacher_deepspeed is a predefined name
LOW | swift/megatron/utils/megatron_lm_utils.py | 520 | # Set bucket_size to infinity if overlap_grad_reduce is False.
LOW | swift/megatron/trainers/grpo_trainer.py | 1103 | # Check if this is the PP last stage (only last stage has labels and computes loss)
LOW | swift/agent_template/minimax_m2.py | 72 | # Check if using react format
LOW | examples/train/grpo/plugin/plugin.py | 69 | # Check if all numbers are used exactly once
LOW | examples/train/grpo/plugin/plugin.py | 81 | # Check if the equation is correct and matches the ground truth
LOW | examples/train/grpo/plugin/plugin.py | 57 | # Check if the format is correct

Slop Phrases (5 hits · 13 pts)
Severity | File | Line | Snippet
LOW | swift/model/models/qwen.py | 1687 | loss.device) # make sure to reside in the same device
MEDIUM | examples/train/multi-gpu/fsdp2_lora/train.sh | 2 | # NOTE: for swift>=3.12, you can use --fsdp fsdp2 instead of accelerate launch
MEDIUM | examples/train/multi-gpu/fsdp_qlora/train.sh | 2 | # NOTE: for swift>=3.12, you can use --fsdp fsdp2 instead of accelerate launch
MEDIUM | examples/megatron/fp8/lora.sh | 3 | # However, you can use BF16 weights to perform Merge-LoRA.
MEDIUM | examples/megatron/fp8/lora.sh | 64 | # Alternatively, you can use BF16 base model + BF16 LoRA for inference

Cross-Language Confusion (3 hits · 12 pts)
Severity | File | Line | Snippet
HIGH | swift/rlhf_trainers/grpo_trainer.py | 2535 | - kl: Direct KL divergence estimator KL(π_rollout || π_training)
HIGH | swift/hub/hub.py | 357 | repo.push(commit_message)
HIGH | swift/megatron/trainers/vocab_parallel_utils.py | 108 | KL(target || input) = sum(target_prob * (target_log_prob - input_log_prob))

Magic Placeholder Names (2 hits · 10 pts)
Severity | File | Line | Snippet
HIGH | docs/source/Instruction/Sample.md | 80 | OPENAI_API_KEY="your_api_key" \
HIGH | docs/source_en/Instruction/Sample.md | 84 | OPENAI_API_KEY="your_api_key" \

AI Slop Vocabulary (4 hits · 9 pts)
Severity | File | Line | Snippet
LOW | tests/train/test_vllm_importance_sampling_basic.py | 20 | # In testing, just return the tensor as-is
MEDIUM | swift/callbacks/perf_log.py | 58 | # TODO Collect comprehensive TFLOPS data. Then provide a fallback strategy based on lookup tables.
MEDIUM | swift/template/templates/minicpm.py | 496 | # This is more robust than finding consecutive <unk> tokens, especially
LOW | examples/megatron/export/lora.sh | 3 | # simply set `--merge_lora true`

Synthetic Comment Markers (1 hit · 5 pts)
Severity | File | Line | Snippet
HIGH | …s/source_en/BestPractices/GRPO-Multi-Modal-Training.md | 278 | This task is based on the experiments in [open-r1-multimodal](https://github.com/EvolvingLMMs-Lab/open-r1-multimodal.git

Overly Generic Function Names (1 hit · 1 pts)
Severity | File | Line | Snippet
LOW | swift/callbacks/activation_cpu_offload.py | 490 | def my_function(*inputs):
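
The per-category hit and point totals in the category headers above can be tallied against the summary figures. The Python sketch below uses only numbers printed in this report. The report does not state how the score is computed; that the point total divided by lines of code (in thousands) rounds to the reported 13.1 is an observation, treated here as an assumption, and it may be coincidental.

```python
# (category, hits, points) copied from the category headers in this report.
categories = [
    ("Unused Imports", 877, 760),
    ("Hyper-Verbose Identifiers", 362, 363),
    ("Deep Nesting", 326, 298),
    ("Excessive Try-Catch Wrapping", 171, 192),
    ("Cross-File Repetition", 21, 105),
    ("Decorative Section Separators", 32, 99),
    ("Docstring Block Structure", 14, 70),
    ("Verbosity Indicators", 29, 52),
    ("Self-Referential Comments", 18, 48),
    ("Over-Commented Block", 48, 47),
    ("Redundant / Tautological Comments", 21, 34),
    ("Slop Phrases", 5, 13),
    ("Cross-Language Confusion", 3, 12),
    ("Magic Placeholder Names", 2, 10),
    ("AI Slop Vocabulary", 4, 9),
    ("Synthetic Comment Markers", 1, 5),
    ("Overly Generic Function Names", 1, 1),
]

total_hits = sum(hits for _, hits, _ in categories)
total_points = sum(points for _, _, points in categories)

# Assumption: score = points per 1,000 lines of code (not documented in the report).
lines_of_code = 162_148
points_per_kloc = total_points / lines_of_code * 1000

print(len(categories), total_hits, total_points)  # 17 1935 2118
print(round(points_per_kloc, 1))                  # 13.1
```

The category count and hit total agree with the "1935 matches across 17 categories" line. With a Time Factor of 100%, an adjusted score equal to the raw score is also consistent with the summary.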