Repository Analysis

liguodongiot/llm-action

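This project aims to share the technical principles and hands-on experience behind large language models (LLM engineering and real-world LLM application deployment).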

9.2 · Low AI signal

Adjusted Score: 9.2
Raw Score: 9.2
Time Factor: 100%
Last Push: 2026-05-25
Stars: 24,391
Language: HTML
Lines of Code: 108,717
Files: 783
Pattern Hits: 425
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 19 · HIGH 53 · MEDIUM 54 · LOW 299
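As a quick consistency check, the four severity counts can be summed and compared with the reported number of pattern hits. The sketch below only restates the numbers from this report. The variable names are illustrative, and the percentage shares are derived here, not reported by the scanner.

```python
# Severity counts copied from the breakdown in this report.
severity_counts = {"CRITICAL": 19, "HIGH": 53, "MEDIUM": 54, "LOW": 299}

# The total should match the "Pattern Hits" figure (425).
total = sum(severity_counts.values())

# Share of each severity level, in percent (derived, not part of the report).
shares = {level: round(100 * n / total, 1) for level, n in severity_counts.items()}

print(total)
for level, pct in shares.items():
    print(f"{level}: {pct}%")
```

The total comes out to 425, which agrees with the Pattern Hits value, and LOW findings make up the bulk of the matches.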

Pattern Findings

425 matches across 14 categories.

Hallucination Indicators · 19 hits · 255 pts
Severity · File · Line · Snippet
CRITICAL · …m-train/pytorch/distribution/tensor-parallel/README.md · 54 · torch.distributed.tensor.parallel.style.make_input_replicate_1d(input, device_mesh=None)
CRITICAL · …-train/pytorch/distribution/pipeline-parallel/1-流水线.md · 73 · - torch.distributed.pipeline.sync.skip.skippable.skippable(stash=(), pop=())
CRITICAL · …-train/pytorch/distribution/pipeline-parallel/1-流水线.md · 113 · - torch.distributed.pipeline.sync.skip.skippable.stash(name, tensor)
CRITICAL · …-train/pytorch/distribution/pipeline-parallel/1-流水线.md · 115 · - torch.distributed.pipeline.sync.skip.skippable.pop(name)
CRITICAL · …-train/pytorch/distribution/pipeline-parallel/1-流水线.md · 117 · - torch.distributed.pipeline.sync.skip.skippable.verify_skippables(module)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 247 · model.language_model.embedding.word_embeddings.weight.data.copy_(out_word_embed[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 249 · model.language_model.embedding.position_embeddings.weight.data.copy_(pos_embed)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 302 · l.self_attention.query_key_value.weight.data.copy_(qkv_weight[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 303 · l.self_attention.dense.weight.data.copy_(dense_weight[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 306 · l.mlp.dense_h_to_4h.weight.data.copy_(mlp_l0_weight[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 307 · l.mlp.dense_4h_to_h.weight.data.copy_(mlp_l1_weight[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 309 · l.self_attention.query_key_value.bias.data.copy_(qkv_bias[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 310 · l.self_attention.dense.bias.data.copy_(dense_bias)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 311 · l.mlp.dense_h_to_4h.bias.data.copy_(mlp_l0_bias[tp_rank])
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 312 · l.mlp.dense_4h_to_h.bias.data.copy_(mlp_l1_bias)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 323 · models[tp_rank].language_model.encoder.final_layernorm.weight.data.copy_(final_layernorm_weight)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 324 · models[tp_rank].language_model.encoder.final_layernorm.bias.data.copy_(final_layernorm_bias)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 352 · models[tp_rank].language_model.pooler.dense.weight.data.copy_(pooler_weight)
CRITICAL · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 353 · models[tp_rank].language_model.pooler.dense.bias.data.copy_(pooler_bias)
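A finding in this category flags a long dotted path, and one way a reader might triage such a hit is to test whether the path resolves in their own environment. The sketch below is an assumption about how that check could be done with the standard library; it is not the scanner's method, and `resolves` is a hypothetical helper name. The result for any `torch.…` path depends on which PyTorch version (if any) is installed, so the example uses standard-library paths only.

```python
import importlib


def resolves(dotted_path: str) -> bool:
    """Return True if dotted_path names an importable module or attribute.

    Hypothetical helper: tries the longest importable module prefix first,
    then walks the remaining parts as attributes.
    """
    parts = dotted_path.split(".")
    for cut in range(len(parts), 0, -1):
        module_name = ".".join(parts[:cut])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            # This prefix is not a module; try a shorter one.
            continue
        for attr in parts[cut:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False


print(resolves("os.path.join"))              # a real function
print(resolves("os.path.no_such_function"))  # does not exist
```

Paths rooted in a local variable, such as `l.self_attention.dense.weight`, cannot be checked this way because they depend on runtime objects, not on importable modules.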
Cross-File Repetition · 47 hits · 235 pts
Severity · File · Line · Snippet
HIGH · …-inference/ascend/mindformers/chatglm3/chatglm-stat.py · 0 · # 2. 自定义修改配置后实例化 config = autoconfig.from_pretrained('/root/workspace/model/chatglm3-6b_ms/run_glm3_6b.yaml') config.use
HIGH · …m-inference/ascend/mindformers/chatglm3/chatglm-gen.py · 0 · # 2. 自定义修改配置后实例化 config = autoconfig.from_pretrained('/root/workspace/model/chatglm3-6b_ms/run_glm3_6b.yaml') config.use
HIGH · …rence/ascend/mindformers/chatglm3/chatglm-inference.py · 0 · # 2. 自定义修改配置后实例化 config = autoconfig.from_pretrained('/root/workspace/model/chatglm3-6b_ms/run_glm3_6b.yaml') config.use
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · collect a list of masked token indices, and labels, and batch them, padding to max length in the batch.
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · collect a list of masked token indices, and labels, and batch them, padding to max length in the batch.
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · collect a list of masked token indices, and labels, and batch them, padding to max length in the batch.
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · given a text string, randomly mask wordpieces for bert mlm training. args: text (str): the input text tokenizer (tokeniz
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · given a text string, randomly mask wordpieces for bert mlm training. args: text (str): the input text tokenizer (tokeniz
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · given a text string, randomly mask wordpieces for bert mlm training. args: text (str): the input text tokenizer (tokeniz
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · a [map style dataset](https://pytorch.org/docs/stable/data.html) for iterating over the wikitext dataset. note that this
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · a [map style dataset](https://pytorch.org/docs/stable/data.html) for iterating over the wikitext dataset. note that this
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · a [map style dataset](https://pytorch.org/docs/stable/data.html) for iterating over the wikitext dataset. note that this
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create the dataloader. args: mask_prob (float): fraction of tokens to mask random_replace_prob (float): fraction of mask
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create the dataloader. args: mask_prob (float): fraction of tokens to mask random_replace_prob (float): fraction of mask
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · create the dataloader. args: mask_prob (float): fraction of tokens to mask random_replace_prob (float): fraction of mask
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · the current `transformers` library does not provide support for masked_token_indices. this function provides the support
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · the current `transformers` library does not provide support for masked_token_indices. this function provides the support
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · the current `transformers` library does not provide support for masked_token_indices. this function provides the support
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · the forward pass for the mlm task args: src_tokens (torch.tensor): the masked token indices. shape: (batch, seq_len) att
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · the forward pass for the mlm task args: src_tokens (torch.tensor): the masked token indices. shape: (batch, seq_len) att
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · the forward pass for the mlm task args: src_tokens (torch.tensor): the masked token indices. shape: (batch, seq_len) att
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create a unique identifier by choosing `length` random characters from list of ascii characters and numbers
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create a unique identifier by choosing `length` random characters from list of ascii characters and numbers
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · create a unique identifier by choosing `length` random characters from list of ascii characters and numbers
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create an experiment directory and save all arguments in it. additionally, also store the githash and gitdiff. finally c
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · create an experiment directory and save all arguments in it. additionally, also store the githash and gitdiff. finally c
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · create an experiment directory and save all arguments in it. additionally, also store the githash and gitdiff. finally c
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · loads the optimizer state dict and model state dict from the load_checkpoint_dir into the passed model and optimizer. se
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · loads the optimizer state dict and model state dict from the load_checkpoint_dir into the passed model and optimizer. se
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · loads the optimizer state dict and model state dict from the load_checkpoint_dir into the passed model and optimizer. se
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · trains a [bert style](https://arxiv.org/pdf/1810.04805.pdf) (transformer encoder only) model for mlm task args: checkpoi
HIGH · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 0 · trains a [bert style](https://arxiv.org/pdf/1810.04805.pdf) (transformer encoder only) model for mlm task args: checkpoi
HIGH · ai-framework/deepspeed/hello_bert/train_bert.py · 0 · trains a [bert style](https://arxiv.org/pdf/1810.04805.pdf) (transformer encoder only) model for mlm task args: checkpoi
HIGH · llm-train/alpaca/train_ddp.py · 0 · resize tokenizer and embedding. note: this is the unoptimized version that may make your embedding size not be divisible
HIGH · llm-train/alpaca/train.py · 0 · resize tokenizer and embedding. note: this is the unoptimized version that may make your embedding size not be divisible
HIGH · llm-train/qlora/qlora.py · 0 · resize tokenizer and embedding. note: this is the unoptimized version that may make your embedding size not be divisible
HIGH · llm-train/chinese-llama-alpaca/run_clm_sft_with_peft.py · 0 · resize tokenizer and embedding. note: this is the unoptimized version that may make your embedding size not be divisible
HIGH · llm-localization/ascend/standford-alpaca/train.py · 0 · resize tokenizer and embedding. note: this is the unoptimized version that may make your embedding size not be divisible
HIGH · llm-train/alpaca/train_ddp.py · 0 · make dataset and collator for supervised fine-tuning.
HIGH · llm-train/alpaca/train.py · 0 · make dataset and collator for supervised fine-tuning.
HIGH · llm-localization/ascend/standford-alpaca/train.py · 0 · make dataset and collator for supervised fine-tuning.
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
HIGH · llm-train/megatron/gpt2/data/cMinhash.cpp · 0 · only used if instantiated manually by the user, or if cython doesn't * know how to convert the type
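The snippets in this category look like lowercased, whitespace-collapsed docstrings and comments that recur across files. A minimal sketch of that kind of grouping, assuming a simple normalize-then-hash approach, is shown below. The scanner's real algorithm is not documented here, and `normalize`, `find_repeats`, and the sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase the text and collapse all runs of whitespace to one space."""
    return " ".join(text.lower().split())


def find_repeats(blocks_by_file):
    """Group identical normalized blocks; keep those seen more than once."""
    seen = defaultdict(list)
    for path, blocks in blocks_by_file.items():
        for block in blocks:
            digest = hashlib.sha1(normalize(block).encode("utf-8")).hexdigest()
            seen[digest].append(path)
    return {digest: paths for digest, paths in seen.items() if len(paths) > 1}


# Illustrative input: two files share one docstring up to case and line breaks.
sample = {
    "train.py": ["Make dataset and collator for supervised fine-tuning."],
    "train_ddp.py": ["make dataset and collator\n    for supervised fine-tuning."],
    "other.py": ["Something unrelated."],
}
repeats = find_repeats(sample)
for paths in repeats.values():
    print(sorted(paths))
```

Exact hashing only catches blocks that are identical after normalization. Near-duplicates would need a fuzzier comparison.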
Unused Imports · 142 hits · 142 pts
Severity · File · Line · Snippet
LOW · llm-data-engineering/sft-dataset/jinja-llm-bloom.py · 2
LOW · llm-data-engineering/sft-dataset/jinja-llm-baichuan2.py · 2
LOW · llm-data-engineering/sft-dataset/jinja-llm-baichuan.py · 2
LOW · llm-data-engineering/sft-dataset/jinja-llm.py · 2
LOW · llm-data-engineering/sft-dataset/jinja-llm-chatglm3.py · 2
LOW · llm-eval/llm-performance/stat_gpu_memory.py · 1
LOW · …performance/hardware-performance/pynvml-stat-memory.py · 1
LOW · …al/llm-performance/vllm/vllm-locust-qwen1.5-7b-long.py · 3
LOW · …al/llm-performance/vllm/vllm-locust-qwen1.5-7b-long.py · 4
LOW · …al/llm-performance/vllm/vllm-locust-qwen1.5-7b-long.py · 4
LOW · …al/llm-performance/mindie/lantency/stat_input_token.py · 4
LOW · …al/llm-performance/mindie/lantency/stat_input_token.py · 4
LOW · …al/llm-performance/mindie/lantency/perfermance-stat.py · 5
LOW · …performance/mindie/locust-lantency-throughput/hello.py · 3
LOW · …performance/mindie/locust-lantency-throughput/hello.py · 4
LOW · …ocust-lantency-throughput/llm-910b4-chatglm3-6b-2tp.py · 3
LOW · …ocust-lantency-throughput/llm-910b4-chatglm3-6b-2tp.py · 4
LOW · …ocust-lantency-throughput/llm-910b4-chatglm3-6b-2tp.py · 4
LOW · …e/locust-lantency-throughput/llm-910b4-qwen-72b-8tp.py · 2
LOW · …e/locust-lantency-throughput/llm-910b4-qwen-72b-8tp.py · 3
LOW · …e/locust-lantency-throughput/llm-910b4-qwen-72b-8tp.py · 3
LOW · …cust-lantency-throughput/llm-910b4-baichuan2-7b-2tp.py · 3
LOW · …cust-lantency-throughput/llm-910b4-baichuan2-7b-2tp.py · 4
LOW · …cust-lantency-throughput/llm-910b4-baichuan2-7b-2tp.py · 4
LOW · …ie/locust-lantency-throughput/llm-910b4-qwen1.5-4tp.py · 2
LOW · …ie/locust-lantency-throughput/llm-910b4-qwen1.5-4tp.py · 3
LOW · …ie/locust-lantency-throughput/llm-910b4-qwen1.5-4tp.py · 3
LOW · …nference/ascend/mindformers/mindsporelite-inference.py · 12
LOW · llm-inference/ascend/mindformers/mindsporelite-stat.py · 3
LOW · llm-inference/ascend/mindformers/mindsporelite-stat.py · 11
LOW · …nference/ascend/mindformers/baichuan2/baichuan-stat.py · 2
LOW · …nce/ascend/mindformers/baichuan2/baichuan-inference.py · 1
LOW · …nce/ascend/mindformers/baichuan2/baichuan-inference.py · 2
LOW · …-inference/ascend/mindformers/chatglm3/chatglm-stat.py · 2
LOW · …m-inference/ascend/mindformers/chatglm3/chatglm-gen.py · 2
LOW · …rence/ascend/mindformers/chatglm3/chatglm-inference.py · 5
LOW · llm-inference/web/flask/llm-qwen-mindspore-lite.py · 4
LOW · llm-inference/web/flask/llm-qwen-mindspore-lite.py · 7
LOW · llm-inference/web/flask/llm-qwen-mindspore-lite.py · 9
LOW · llm-inference/web/fastapi/llm-qwen-mindspore-lite.py · 4
LOW · llm-inference/web/fastapi/llm-qwen-mindspore-lite.py · 7
LOW · llm-inference/web/fastapi/llm-qwen-mindspore-lite.py · 9
LOW · llm-inference/web/fastapi/llm-qwen-mindspore-lite.py · 15
LOW · llm-inference/web/fastapi/llm-qwen-mindspore-lite.py · 17
LOW · …ter-transformer/bloom/firefly_lambada_1w_stat_token.py · 1
LOW · …ter-transformer/bloom/firefly_lambada_1w_stat_token.py · 10
LOW · …ter-transformer/bloom/firefly_lambada_1w_stat_token.py · 12
LOW · …er-transformer/megatron-gpt2/gpt_summarization_stat.py · 7
LOW · …/faster-transformer/megatron-gpt2/gpt_summarization.py · 7
LOW · llm-inference/triton/resnet50/client.py · 5
LOW · …se/distribution-parallelism/moe-parallel/paddle_moe.py · 6
LOW · …se/distribution-parallelism/moe-parallel/paddle_moe.py · 8
LOW · ai-framework/mxnet/oneflow_cnn_mnist.py · 6
LOW · ai-framework/mxnet/mxnet_cnn_mnist.py · 1
LOW · ai-framework/mxnet/mxnet_cnn_mnist.py · 12
LOW · ai-framework/mxnet/mxnet_cnn_mnist.py · 12
LOW · ai-framework/mxnet/mxnet_cnn_mnist.py · 26
LOW · ai-framework/mxnet/mxnet_mlp_mnist.py · 2
LOW · ai-framework/mxnet/mnist.py · 4
LOW · llm-train/alpaca-lora/generate.py · 7
82 more matches not shown…
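The rows above give only a file and a line number, so it can help to reproduce a hit locally. The following is a minimal sketch of an unused-import check built on the standard `ast` module. It is an assumption about how such a check can work, not the scanner's implementation, and `unused_imports` is a hypothetical helper. It ignores `__all__`, star imports, and names referenced only inside strings.

```python
import ast


def unused_imports(source: str):
    """Return sorted (line, name) pairs for imported names never referenced."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import statement
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
    # Every bare name used anywhere in the module (attribute bases included).
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted((line, name) for name, line in imported.items() if name not in used)


sample = "import os\nimport sys\nfrom json import dumps\nprint(sys.argv)\n"
print(unused_imports(sample))  # [(1, 'os'), (3, 'dumps')]
```

Several rows list the same file and line more than once, which is consistent with a single import statement binding more than one unused name.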
Decorative Section Separators · 32 hits · 110 pts
Severity · File · Line · Snippet
MEDIUM · llm-inference/DeepSpeed-Inference.md · 9 · # ---------------------------------------
MEDIUM · llm-inference/DeepSpeed-Inference.md · 11 · # ---------------------------------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 24 · # ----------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 124 · # -------------------------------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 137 · # -------------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 246 · # -----------------------------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 341 · # -------------
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 445 · # -------------------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 19 · # -------------------------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 37 · # ------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 56 · # -------------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 119 · # -------------------------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 196 · # ----------------------------------------------
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 250 · # ---------------------
MEDIUM · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 204 · #-----------
MEDIUM · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 254 · #-------------------
MEDIUM · llm-algo/chatglm/模型架构.md · 220 · # ===================================
MEDIUM · llm-algo/chatglm/模型架构.md · 222 · # ===================================
MEDIUM · llm-algo/chatglm/模型架构.md · 264 · # =========================
MEDIUM · llm-algo/chatglm/模型架构.md · 266 · # =========================
MEDIUM · llm-algo/chatglm2/模型架构.md · 138 · # ===========================
MEDIUM · llm-algo/chatglm2/模型架构.md · 140 · # ===========================
MEDIUM · llm-algo/chatglm2/模型架构.md · 160 · # =========================
MEDIUM · llm-algo/chatglm2/模型架构.md · 162 · # =========================
MEDIUM · llm-algo/chatglm2/模型架构.md · 253 · # =================================================
MEDIUM · llm-algo/chatglm2/模型架构.md · 255 · # =================================================
MEDIUM · llm-algo/chatglm2/模型架构.md · 256 · # =====================
MEDIUM · llm-algo/chatglm2/模型架构.md · 258 · # =====================
MEDIUM · llm-algo/chatglm2/模型架构.md · 322 · # ==================================
MEDIUM · llm-algo/chatglm2/模型架构.md · 324 · # ==================================
MEDIUM · llm-algo/chatglm2/模型架构.md · 328 · # =================
MEDIUM · llm-algo/chatglm2/模型架构.md · 330 · # =================
Over-Commented Block · 71 hits · 62 pts
Severity · File · Line · Snippet
LOW · llm-tools/profiler-recipe.py · 21 · # Name Self CPU CPU total CPU time avg # of Calls
LOW · llm-tools/profiler-recipe.py · 41 · # --------------------------------- ------------ -------------------------------------------
LOW · llm-tools/profiler-recipe.py · 81
LOW · llm-tools/profiler-recipe.py · 121 · # --------------------------------- ------------ ------------ ------------
LOW · llm-tools/profiler-recipe.py · 141 · # aten::empty 94.79 Mb 94.79 Mb 121
LOW · llm-tools/profiler-recipe.py · 181 · with_stack=True,
LOW · llm-pipeline/REAEMD.md · 81 · # --llama \
LOW · llm-pipeline/REAEMD.md · 101 · # --gradient_checkpointing \
LOW · llm-pipeline/REAEMD.md · 121 · # --num_train_epochs 2 \
LOW · llm-train/qlora/accuracy.py · 1 · # Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
LOW · llm-train/chinese-llama-alpaca/run_clm_pt_with_peft.py · 1 · #!/usr/bin/env python
LOW · llm-train/chinese-llama-alpaca/run_clm_sft_with_peft.py · 1 · #!/usr/bin/env python
LOW · …ibution/data-parallel/minGPT-ddp/sbatch_run_sig_opt.sh · 1
LOW · …rch/distribution/data-parallel/minGPT-ddp/multinode.sh · 1 · #!/bin/bash
LOW · …istribution/data-parallel/minGPT-ddp/sbatch_run_sig.sh · 1 · #!/bin/bash
LOW · …ch/distribution/data-parallel/minGPT-ddp/sbatch_run.sh · 1
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 61
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 121
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 141 · ######################################################################
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 221 · ######################################################################
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 241 · # Need batch dimension first for pipeline parallelism.
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 341 · # -------------
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 461 · mp.spawn(run_worker, args=(world_size, ), nprocs=world_size, join=True)
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 481 · # [RANK 1]: -----------------------------------------------------------------------------------------
LOW · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 501 · # [RANK 0]: | epoch 3 | 20/ 50 batches | lr 4.51 | ms/batch 698.27 | loss 12.01 | ppl 164364.60
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 41 · ray.init()
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 181
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 201 · #
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 241 · auto_pipeline_actual_state = auto_pipeline_train_step(state, batch)
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 261 · #
LOW · …egatron/gpt2/merge_ck_and_inference/checkpoint_util.py · 1 · import argparse
LOW · …egatron/gpt2/merge_ck_and_inference/checkpoint_util.py · 21
LOW · …egatron/gpt2/merge_ck_and_inference/checkpoint_util.py · 41 · # consumed_valid_samples
LOW · …egatron/gpt2/merge_ck_and_inference/checkpoint_util.py · 61 · # "mlp l1 weight"
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 21 · END: Cython Metadata */
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 41 · #endif
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 61 · #else
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 81 · #define __Pyx_PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 101 · #endif
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 121 · #endif
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 141 · #define PyObject_Free(p) PyMem_Free(p)
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 161 · #if PY_MAJOR_VERSION >= 3
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 181 · #define PyInt_FromSsize_t PyLong_FromSsize_t
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 201 · #else
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 221 · #define __Pyx_PyType_AsAsync(obj) NULL
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 281
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 301 · #include "stdlib.h"
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 321 · # else
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 341 · #define __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT 0
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 361 · #define __Pyx_sst_abs(value) abs(value)
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 381 · #define __Pyx_PyStr_FromString __Pyx_PyBytes_FromString
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 401 · #else
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 521
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 601 · #endif
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 621 · #define __pyx_atomic_incr_aligned(value, lock) _InterlockedIncrement(value)
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 1041 · #define __Pyx_RefNannyDeclarations void *__pyx_refnanny = NULL;
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 1061 · #define __Pyx_XINCREF(r) do { if((r) != NULL) {__Pyx_INCREF(r); }} while(0)
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 1141 · #endif
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 1161 · int memview_is_new_reference);
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 1181
11 more matches not shown…
Self-Referential Comments · 17 hits · 56 pts
Severity · File · Line · Snippet
MEDIUM · llm-inference/flexflow-serve/benchmark-batch1.py · 43 · # Create the sampling configs
MEDIUM · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 117 · # Create the labels first
MEDIUM · ai-framework/deepspeed/hello_bert/train_bert_ds.py · 941 · # Create the labels first
MEDIUM · ai-framework/deepspeed/hello_bert/train_bert.py · 117 · # Create the labels first
MEDIUM · llm-train/galore/torchrun_main.py · 433 · # The below code is only executed during the update step
MEDIUM · …tribution/tensor-parallel/sequence_parallel_example.py · 49 · # Create a optimizer for the parallelized module.
MEDIUM · …ch/distribution/tensor-parallel/2d_parallel_example.py · 75 · # Create a optimizer for the parallelized module.
MEDIUM · …istribution/tensor-parallel/tensor_parallel_example.py · 57 · # Create a optimizer for the parallelized module.
MEDIUM · …pytorch/distribution/pipeline-parallel/ddp_pipeline.py · 23 · # Define the model
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 101 · # Define the training step
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 127 · # Define a MLP model with manual stage boundaries.
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 154 · # Define the training step.
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 213 · # Define the parallel method.
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 222 · # Define the training step. The function body is the same as the above one.
MEDIUM · llm-compression/quantization/llm-qat/cfd70ff/utils.py · 16 · # Define a utility method for setting the logging parameters of a logger
MEDIUM · llm-compression/quantization/llm-qat/cfd70ff/utils.py · 24 · # Define a formatter for the log messages
MEDIUM · llm-compression/quantization/llm-qat/cfd70ff/utils.py · 29 · # Create a console handler for outputting log messages to the console
Hyper-Verbose Identifiers · 43 hits · 44 pts
Severity · File · Line · Snippet
LOW · llm-data-engineering/sft-dataset/数据集格式.md · 93 · def preprocess_function_train(examples):
LOW · llm-data-engineering/sft-dataset/数据集格式.md · 156 · def build_inputs_with_special_tokens(
LOW · llm-data-engineering/sft-dataset/数据集格式.md · 434 · def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
LOW · …nference/ascend/mindformers/mindsporelite-inference.py · 26 · def pipeline_from_model_paths(args_, tokenizer):
LOW · …nference/ascend/mindformers/mindsporelite-inference.py · 76 · def pipeline_from_infer_config(args_, tokenizer):
LOW · llm-inference/ascend/mindformers/mindsporelite-stat.py · 33 · def pipeline_from_model_paths(args_, tokenizer):
LOW · llm-inference/ascend/mindformers/mindsporelite-stat.py · 83 · def pipeline_from_infer_config(args_, tokenizer):
LOW · llm-train/alpaca-lora/finetune_metrics_epoch.py · 152 · def generate_and_tokenize_prompt(data_point):
LOW · llm-train/alpaca-lora/finetune.py · 146 · def generate_and_tokenize_prompt(data_point):
LOW · llm-train/chatglm/main.py · 176 · def preprocess_function_train(examples):
LOW · llm-train/alpaca/train_ddp.py · 54 · def safe_save_model_for_hf_trainer(trainer: transformers.Trainer, output_dir: str):
LOW · llm-train/alpaca/train_ddp.py · 63 · def smart_tokenizer_and_embedding_resize(
LOW · llm-train/alpaca/train_ddp.py · 173 · def make_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, data_args) -> Dict:
LOW · llm-train/alpaca/train.py · 53 · def safe_save_model_for_hf_trainer(trainer: transformers.Trainer, output_dir: str):
LOW · llm-train/alpaca/train.py · 62 · def smart_tokenizer_and_embedding_resize(
LOW · llm-train/alpaca/train.py · 172 · def make_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, data_args) -> Dict:
LOW · llm-train/qlora/qlora.py · 346 · def print_trainable_parameters(args, model):
LOW · llm-train/qlora/qlora.py · 363 · def smart_tokenizer_and_embedding_resize(
LOW · llm-train/qlora/qlora.py · 438 · def extract_unnatural_instructions_data(examples, extract_reformulations=False):
LOW · llm-train/chinese-llama-alpaca/run_clm_pt_with_peft.py · 75 · def preprocess_logits_for_metrics(logits, labels):
LOW · llm-train/chinese-llama-alpaca/run_clm_pt_with_peft.py · 83 · def fault_tolerance_data_collator(features: List) -> Dict[str, Any]:
LOW · llm-train/chinese-llama-alpaca/run_clm_sft_with_peft.py · 433 · def smart_tokenizer_and_embedding_resize(
LOW · llm-train/alpa/train/pipeshard_parallelism.py · 161 · def manual_pipeline_train_step(state, batch):
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 8142 · * cdef setitem_slice_assign_scalar(self, memoryview dst, value):
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 8186 · * cdef setitem_slice_assign_scalar(self, memoryview dst, value): # <<<<<<<<<<<<<<
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 8213 · * cdef setitem_slice_assign_scalar(self, memoryview dst, value):
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 8451 · * cdef setitem_slice_assign_scalar(self, memoryview dst, value): # <<<<<<<<<<<<<<
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 14045 · * cdef memoryview_copy_from_slice(memoryview memview, __Pyx_memviewslice *memviewslice): # <<<<<<<<<<<<<<
LOW · llm-train/megatron/gpt2/data/cMinhash.cpp · 14149 · * cdef memoryview_copy_from_slice(memoryview memview, __Pyx_memviewslice *memviewslice): # <<<<<<<<<<<<<<
LOW · llm-algo/chatglm/模型架构.md · 176 · def apply_rotary_pos_emb_index(q, k, cos, sin, position_id):
LOW · llm-algo/chatglm/模型架构.md · 362 · def split_tensor_along_last_dim(self, tensor, num_partitions,
LOW · llm-algo/chatglm2/模型架构.md · 749 · def _update_model_kwargs_for_generation(
LOW · llm-algo/chatglm2/模型架构.md · 780 · def prepare_inputs_for_generation(
LOW · llm-compression/quantization/llm-qat/cfd70ff/utils.py · 39 · def safe_save_model_for_hf_trainer(trainer: transformers.Trainer, output_dir: str):
LOW · llm-localization/ascend/standford-alpaca/train.py · 51 · def smart_tokenizer_and_embedding_resize(
LOW · llm-localization/ascend/standford-alpaca/train.py · 161 · def make_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, data_args) -> Dict:
LOW · llm-localization/ascend/mindie/script/model-test.py · 1237 · def __compare_simplified_dataset_results(self):
LOW · llm-localization/ascend/mindie/script/model-test.py · 1370 · def __compare_full_dataset_results(self):
LOW · llm-localization/ascend/mindie/script/model-test.py · 1456 · def __patch_hf_transformers_utils(self):
LOW · llm-localization/ascend/mindie/script/model-test.py · 799 · def process_before_extraction(gen, choice_dict):
LOW · llm-localization/ascend/mindie/script/model-test.py · 977 · def __run_full_dataset_truthfulqa(self):
LOW · llm-localization/ascend/mindie/script/model-test.py · 986 · def format_prompt_with_answer_strings(question, ans):
LOW · llm-localization/ascend/mindie/script/model-test.py · 1153 · def __run_full_dataset_humaneval(self):
Cross-Language Confusion · 5 hits · 38 pts
Severity · File · Line · Snippet
HIGH · llm-train/megatron/gpt2/data/download.py · 251 · url varchar(2048) not null,
HIGH · llm-train/megatron/gpt2/data/download.py · 252 · domain varchar(255) not null,
HIGH · llm-train/megatron/gpt2/data/download.py · 253 · word_count int null,
HIGH · llm-train/megatron/gpt2/data/download.py · 254 · elapsed int null,
HIGH · llm-train/megatron/gpt2/data/download.py · 255 · scraper varchar(255) not null,
Deep Nesting · 36 hits · 36 pts
Severity · File · Line · Snippet
LOW · …er-transformer/megatron-gpt2/gpt_summarization_stat.py · 24
LOW · …er-transformer/megatron-gpt2/gpt_summarization_stat.py · 354
LOW · …/faster-transformer/megatron-gpt2/gpt_summarization.py · 21
LOW · …/faster-transformer/megatron-gpt2/gpt_summarization.py · 327
LOW · ai-framework/mxnet/mxnet_cnn_mnist.py · 122
LOW · llm-train/alpaca-lora/export_state_dict_checkpoint.py · 80
LOW · llm-train/chatglm/main.py · 38
LOW · llm-train/chatglm/main.py · 148
LOW · llm-train/chatglm/main.py · 176
LOW · llm-train/qlora/qlora.py · 262
LOW · llm-train/qlora/qlora.py · 438
LOW · llm-train/qlora/qlora.py · 475
LOW · llm-train/qlora/qlora.py · 490
LOW · llm-train/qlora/qlora.py · 514
LOW · llm-train/qlora/qlora.py · 545
LOW · llm-train/galore/torchrun_main.py · 134
LOW · …/peft/clm/peft_lora_clm_accelerate_ds_zero3_offload.py · 109
LOW · …/chinese-llama-alpaca/merge_llama_with_chinese_lora.py · 67
LOW · …/chinese-llama-alpaca/merge_llama_with_chinese_lora.py · 111
LOW · llm-train/chinese-llama-alpaca/run_clm_pt_with_peft.py · 83
LOW · …2/merge_ck_and_inference/checkpoint_loader_megatron.py · 19
LOW · …t2/merge_ck_and_inference/checkpoint_saver_megatron.py · 22
LOW · llm-train/megatron/gpt2/data/download.py · 193
LOW · llm-localization/ascend/mindie/script/model-test.py · 297
LOW · llm-localization/ascend/mindie/script/model-test.py · 471
LOW · llm-localization/ascend/mindie/script/model-test.py · 535
LOW · llm-localization/ascend/mindie/script/model-test.py · 587
LOW · llm-localization/ascend/mindie/script/model-test.py · 669
LOW · llm-localization/ascend/mindie/script/model-test.py · 789
LOW · llm-localization/ascend/mindie/script/model-test.py · 889
LOW · llm-localization/ascend/mindie/script/model-test.py · 1075
LOW · llm-localization/ascend/mindie/script/model-test.py · 1153
LOW · llm-localization/ascend/mindie/script/model-test.py · 1224
LOW · llm-localization/ascend/mindie/script/model-test.py · 1264
LOW · llm-localization/ascend/mindie/script/model-test.py · 1499
LOW · llm-localization/ascend/mindie/script/model-test.py · 334
Excessive Try-Catch Wrapping · 5 hits · 7 pts
Severity · File · Line · Snippet
MEDIUM · …er-transformer/megatron-gpt2/gpt_summarization_stat.py · 342 · print('Error with datapoint : ', data_point_idx)
MEDIUM · …/faster-transformer/megatron-gpt2/gpt_summarization.py · 325 · print('Error with datapoint : ', data_point_idx)
LOW · llm-train/chinese-llama-alpaca/run_clm_pt_with_peft.py · 463 · except Exception:
MEDIUM · …ron/gpt2/merge_ck_and_inference/text_generation_cli.py · 20 · print(f"Error {response.status_code}: {response.json()['message']}")
LOW · llm-localization/ascend/mindie/script/model-test.py · 258 · except Exception as e:
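Two of the rows above point at broad `except Exception` clauses. A reader who wants to locate such handlers in their own checkout could use a small `ast` walk like the sketch below. It is an illustrative assumption, not the scanner's rule, and `broad_handlers` is a hypothetical helper that only looks at bare `except:` and handlers naming `Exception` or `BaseException`.

```python
import ast


def broad_handlers(source: str):
    """Return sorted line numbers of bare or catch-all except clauses."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            is_bare = node.type is None
            is_catch_all = (
                isinstance(node.type, ast.Name)
                and node.type.id in {"Exception", "BaseException"}
            )
            if is_bare or is_catch_all:
                hits.append(node.lineno)
    return sorted(hits)


sample = (
    "try:\n"
    "    x = 1\n"
    "except Exception:\n"
    "    pass\n"
    "try:\n"
    "    y = 2\n"
    "except ValueError:\n"
    "    pass\n"
)
print(broad_handlers(sample))  # [3]
```

Only the first handler is reported, because the second one names a specific exception type.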
Redundant / Tautological Comments · 4 hits · 6 pts
Severity · File · Line · Snippet
LOW · llm-train/alpaca-lora/finetune_metrics_epoch.py · 104 · # Check if parameter passed or if set within environ
LOW · llm-train/alpaca-lora/finetune.py · 98 · # Check if parameter passed or if set within environ
LOW · …-compression/quantization/llm-qat/f4d873a/datautils.py · 91 · # Loop through the list of dictionaries
LOW · …-compression/quantization/llm-qat/f4d873a/datautils.py · 98 · # Append the value to the list associated with the key in dict_of_lists
Slop Phrases · 2 hits · 6 pts
Severity · File · Line · Snippet
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 44 · # Alternatively, you can use the following command to connect to an existing
MEDIUM · llm-train/alpa/train/pipeshard_parallelism.py · 191 · # device assignment of each stage, you can use the more advanced
Docstring Block Structure · 1 hit · 5 pts
Severity · File · Line · Snippet
HIGH · llm-train/qlora/accuracy.py · 33 · Args: predictions (`list` of `int`): Predicted labels. references (`list` of `int`): Ground truth labels. n
Fake / Example Data · 1 hit · 1 pt
Severity · File · Line · Snippet
LOW · llm-data-engineering/sft-dataset/jinja-demo.py · 10 · result = template.render(name='John Doe')