A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
3300 matches across 18 categories.
| Severity | File | Line | Snippet |
|---|---|---|---|
| HIGH | archive/merge_tensors/merge_safetensor_gguf.py | 0 | :param folder_path: folder path :return: key_to_file_map |
| HIGH | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 0 | :param folder_path: folder path :return: key_to_file_map |
| HIGH | kt-kernel/scripts/check.py | 0 | :param folder_path: folder path :return: key_to_file_map |
| HIGH | archive/csrc/ktransformers_ext/bench/bench_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | …/csrc/ktransformers_ext/bench/bench_attention_torch.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | …kt-sft/csrc/ktransformers_ext/bench/bench_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | …/csrc/ktransformers_ext/bench/bench_attention_torch.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | kt-kernel/bench/bench_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | kt-kernel/bench/bench_attention_torch.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : jianwei dong lastedittime : |
| HIGH | archive/csrc/ktransformers_ext/bench/bench_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …chive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_moe_kernel.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_moe_amx.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_moe_amx_k.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_moe_kml.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/ktransformers_ext/bench/bench_mlp.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …chive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_mlp.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/ktransformers_ext/bench/bench_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …ve/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …ive/csrc/ktransformers_ext/bench/bench_linear_torch.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …sft/csrc/ktransformers_ext/bench/bench_linear_torch.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_linear_torch.py | 0 | description : author : chenht2022 date : 2024-07-25 10:31:59 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/bench/bench_mlp_torch.py | 0 | description : author : chenht2022 date : 2024-07-16 10:43:18 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/ktransformers_ext/examples/test_mlp.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …ive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/examples/test_mlp.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/ktransformers_ext/examples/test_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …ft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …ive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/examples/test_moe_kernel.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/examples/test_moe_kml.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …hive/csrc/ktransformers_ext/examples/test_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 2 |
| HIGH | …-sft/csrc/ktransformers_ext/examples/test_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 2 |
| HIGH | kt-kernel/examples/test_attention.py | 0 | description : author : jianwei dong date : 2024-08-28 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 2 |
| HIGH | archive/csrc/ktransformers_ext/examples/test_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | …/kt-sft/csrc/ktransformers_ext/examples/test_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | kt-kernel/examples/test_linear.py | 0 | description : author : chenht2022 date : 2024-07-25 10:32:05 version : 1.0.0 lasteditors : chenht2022 lastedittime : 202 |
| HIGH | archive/csrc/custom_marlin/utils/format24.py | 0 | class for creating n:m sparsity masks. masks will be created using the n:m ratio, where for every block of m weights, n |
| HIGH | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 0 | class for creating n:m sparsity masks. masks will be created using the n:m ratio, where for every block of m weights, n |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 0 | class for creating n:m sparsity masks. masks will be created using the n:m ratio, where for every block of m weights, n |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 0 | class for creating n:m sparsity masks. masks will be created using the n:m ratio, where for every block of m weights, n |
| HIGH | archive/kt-sft/ktransformers/local_chat.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/optimize/optimize.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/util/utils.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/ktransformers/local_chat_test.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/ktransformers/local_chat.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/ktransformers/optimize/optimize.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/ktransformers/util/utils.py | 0 | description : author : boxin zhang, azure-tang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/local_chat.py | 0 | '): # end multi lines input line = line[:-3] # suffix |
| HIGH | archive/ktransformers/local_chat.py | 0 | '): # end multi lines input line = line[:-3] # suffix |
| HIGH | kt-kernel/examples/test_deepseekv3_prefill.py | 0 | '): # end multi lines input line = line[:-3] # suffix |
| HIGH | archive/kt-sft/ktransformers/operators/attention.py | 0 | description : author : boxin zhang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/operators/base_operator.py | 0 | description : author : boxin zhang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/operators/RoPE.py | 0 | description : author : boxin zhang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| HIGH | archive/kt-sft/ktransformers/util/cuda_graph_runner.py | 0 | description : author : boxin zhang version : 0.1.0 copyright (c) 2024 by kvcache.ai, all rights reserved. |
| 329 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | ktransformers.py | 8 | |
| LOW | ktransformers.py | 29 | |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 18 | |
| LOW | archive/merge_tensors/merge_safetensor_gguf.py | 5 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_moe_torch.py | 12 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_moe_torch.py | 12 | |
| LOW | …ive/csrc/ktransformers_ext/bench/bench_linear_torch.py | 12 | |
| LOW | …ive/csrc/ktransformers_ext/bench/bench_linear_torch.py | 12 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 12 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 12 | |
| LOW | …/csrc/ktransformers_ext/bench/bench_attention_torch.py | 16 | |
| LOW | archive/csrc/ktransformers_ext/cuda/setup.py | 2 | |
| LOW | archive/csrc/ktransformers_ext/cuda/setup.py | 3 | |
| LOW | archive/csrc/ktransformers_ext/cuda/test_dequant.py | 1 | |
| LOW | archive/csrc/ktransformers_ext/examples/test_mlp.py | 13 | |
| LOW | archive/csrc/ktransformers_ext/examples/test_moe.py | 13 | |
| LOW | …hive/csrc/ktransformers_ext/examples/test_attention.py | 13 | |
| LOW | archive/csrc/ktransformers_ext/examples/test_linear.py | 13 | |
| LOW | archive/csrc/custom_marlin/setup.py | 1 | |
| LOW | archive/csrc/custom_marlin/setup.py | 2 | |
| LOW | archive/csrc/custom_marlin/test_cuda_graph.py | 1 | |
| LOW | archive/csrc/custom_marlin/test_cuda_graph.py | 6 | |
| LOW | archive/csrc/custom_marlin/utils/marlin_utils.py | 13 | |
| LOW | archive/kt-sft/withoutKT_PEFT.py | 2 | |
| LOW | archive/kt-sft/withoutKT_PEFT.py | 4 | |
| LOW | archive/kt-sft/withoutKT_PEFT.py | 9 | |
| LOW | archive/kt-sft/withoutKT_PEFT.py | 9 | |
| LOW | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 5 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py | 12 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py | 12 | |
| LOW | …sft/csrc/ktransformers_ext/bench/bench_linear_torch.py | 12 | |
| LOW | …sft/csrc/ktransformers_ext/bench/bench_linear_torch.py | 12 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 12 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 12 | |
| LOW | …/csrc/ktransformers_ext/bench/bench_attention_torch.py | 16 | |
| LOW | archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py | 2 | |
| LOW | archive/kt-sft/csrc/ktransformers_ext/cuda/setup.py | 3 | |
| LOW | …ive/kt-sft/csrc/ktransformers_ext/cuda/test_dequant.py | 1 | |
| LOW | …ive/kt-sft/csrc/ktransformers_ext/examples/test_mlp.py | 13 | |
| LOW | …ive/kt-sft/csrc/ktransformers_ext/examples/test_moe.py | 13 | |
| LOW | …-sft/csrc/ktransformers_ext/examples/test_attention.py | 13 | |
| LOW | …/kt-sft/csrc/ktransformers_ext/examples/test_linear.py | 13 | |
| LOW | archive/kt-sft/csrc/custom_marlin/setup.py | 1 | |
| LOW | archive/kt-sft/csrc/custom_marlin/setup.py | 2 | |
| LOW | archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py | 1 | |
| LOW | archive/kt-sft/csrc/custom_marlin/test_cuda_graph.py | 6 | |
| LOW | archive/kt-sft/csrc/custom_marlin/utils/marlin_utils.py | 13 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 2 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 8 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 9 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 11 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 11 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 11 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 11 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 19 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 21 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 21 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 22 | |
| LOW | archive/kt-sft/ktransformers/moe_test_module.py | 25 | |
| LOW | archive/kt-sft/ktransformers/__init__.py | 18 | |
| 857 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | docker/docker-utils.sh | 1 | #!/usr/bin/env bash |
| LOW | docker/docker-utils.sh | 161 | ################################################################################ |
| LOW | docker/build-docker-tar.sh | 1 | #!/usr/bin/env bash |
| LOW | docker/push-to-dockerhub.sh | 1 | #!/usr/bin/env bash |
| LOW | docker/push-to-dockerhub.sh | 581 | # - Automatic version detection |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 1 | # coding=utf-8 |
| LOW | archive/csrc/ktransformers_ext/ext_bindings.cpp | 21 | #if defined(__x86_64__) && defined(__HAS_AVX512F__) && defined(__HAS_AMX__) |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 1 | #pragma once |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 21 | #define cublasDestroy mublasDestroy |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 41 | #define cudaEventCreateWithFlags musaEventCreateWithFlags |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 61 | #define cudaMallocManaged musaMallocManaged |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 81 | #define cudaStreamWaitEvent musaStreamWaitEvent |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 101 | #define cuMemGetAllocationGranularity muMemGetAllocationGranularity |
| LOW | archive/csrc/ktransformers_ext/vendors/musa.h | 121 | #define cudaGraphExecUpdate musaGraphExecUpdate |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 1 | #pragma once |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 21 | #define CUBLAS_TF32_TENSOR_OP_MATH 0 |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 41 | #define cublasSgemm hipblasSgemm |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 61 | #define cudaGetDevice hipGetDevice |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 81 | #define cudaMemset hipMemset |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 101 | #define cudaStreamCreateWithFlags hipStreamCreateWithFlags |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 121 | #define cudaGraphKernelNodeSetParams hipGraphKernelNodeSetParams |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 141 | #define CUBLAS_STATUS_INTERNAL_ERROR HIPBLAS_STATUS_INTERNAL_ERROR |
| LOW | archive/csrc/ktransformers_ext/vendors/hip.h | 161 | #define RDNA2 |
| LOW | archive/csrc/ktransformers_ext/vendors/cuda.h | 1 | #pragma once |
| LOW | archive/csrc/ktransformers_ext/vendors/vendor.h | 1 | #ifndef CPUINFER_VENDOR_VENDOR_H |
| LOW | …ive/csrc/ktransformers_ext/operators/kvcache/kvcache.h | 21 | #include <fstream> |
| LOW | archive/csrc/ktransformers_ext/operators/amx/la/amx.hpp | 21 | #include <sys/syscall.h> |
| LOW | archive/csrc/ktransformers_ext/operators/amx/la/amx.hpp | 41 | namespace amx { |
| LOW | archive/csrc/ktransformers_ext/cpu_backend/cpuinfer.h | 21 | #ifdef KTRANSFORMERS_USE_CUDA |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 1 | #pragma once |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 21 | #define cublasDestroy mublasDestroy |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 41 | #define cudaEventCreateWithFlags musaEventCreateWithFlags |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 61 | #define cudaMallocManaged musaMallocManaged |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 81 | #define cudaStreamWaitEvent musaStreamWaitEvent |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 101 | #define cuMemGetAllocationGranularity muMemGetAllocationGranularity |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/musa.h | 121 | #define cudaGraphExecUpdate musaGraphExecUpdate |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 1 | #pragma once |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 21 | #define CUBLAS_TF32_TENSOR_OP_MATH 0 |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 41 | #define cublasSgemm hipblasSgemm |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 61 | #define cudaGetDevice hipGetDevice |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 81 | #define cudaMemset hipMemset |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 101 | #define cudaStreamCreateWithFlags hipStreamCreateWithFlags |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 121 | #define cudaGraphKernelNodeSetParams hipGraphKernelNodeSetParams |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 141 | #define CUBLAS_STATUS_INTERNAL_ERROR HIPBLAS_STATUS_INTERNAL_ERROR |
| LOW | …chive/csrc/ktransformers_ext/cpu_backend/vendors/hip.h | 161 | #define RDNA2 |
| LOW | …hive/csrc/ktransformers_ext/cpu_backend/vendors/cuda.h | 1 | #pragma once |
| LOW | …ve/csrc/ktransformers_ext/cpu_backend/vendors/vendor.h | 1 | #ifndef CPUINFER_VENDOR_VENDOR_H |
| LOW | archive/csrc/ktransformers_ext/cuda/binding.cpp | 1 | /** |
| LOW | archive/csrc/ktransformers_ext/cuda/gptq_marlin/ops.h | 21 | |
| LOW | archive/csrc/balance_serve/sched/scheduler.h | 1 | #pragma once |
| LOW | archive/csrc/balance_serve/sched/scheduler.cpp | 1 | #define SPDLOG_ACTIVE_LEVEL SPDLOG_LEVEL_INFO |
| LOW | archive/csrc/balance_serve/sched/metrics.h | 1 | #ifndef Metrics_H |
| LOW | archive/csrc/balance_serve/sched/utils/all.hpp | 1 | #pragma once |
| LOW | archive/csrc/balance_serve/kvc2/test/page_pool_test.cpp | 1 | |
| LOW | …ve/csrc/balance_serve/kvc2/test/kvc2test/lookup-mt.cpp | 61 | // // common prefix |
| LOW | …ve/csrc/balance_serve/kvc2/test/kvc2test/lookup-mt.cpp | 81 | // // insert partly new |
| LOW | …e/csrc/balance_serve/kvc2/test/kvc2test/lookup-gpu.cpp | 101 | cmp_handle_data(k1, k_from_gpu, 3); |
| LOW | …e/csrc/balance_serve/kvc2/test/kvc2test/lookup-gpu.cpp | 121 | |
| LOW | …e/csrc/balance_serve/kvc2/test/kvc2test/lookup-gpu.cpp | 141 | // auto ids2 = random_ids(10 * config.num_token_per_page, gen); |
| LOW | archive/csrc/balance_serve/kvc2/src/async_store.cpp | 1 | |
| 586 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | ktransformers.py | 30 | except Exception: |
| MEDIUM | ktransformers.py | 27 | def has_sft_support() -> bool: |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 47 | except Exception as e: |
| MEDIUM | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 48 | print(f"Error reading Safetensor file {file_path}: {e}") |
| LOW | archive/merge_tensors/merge_safetensor_gguf.py | 48 | except Exception as e: |
| MEDIUM | archive/merge_tensors/merge_safetensor_gguf.py | 49 | print(f"Error reading Safetensor file {file_path}: {e}") |
| LOW | archive/kt-sft/setup.py | 45 | except Exception: |
| LOW | archive/kt-sft/setup.py | 73 | except Exception: |
| LOW | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 48 | except Exception as e: |
| MEDIUM | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 49 | print(f"Error reading Safetensor file {file_path}: {e}") |
| LOW | archive/kt-sft/test_adapter/infer_with_adapter.py | 27 | except Exception as e: |
| LOW | archive/kt-sft/test_adapter/inspect_adapter.py | 60 | except Exception as e: |
| LOW | archive/kt-sft/test_adapter/inspect_adapter.py | 77 | except Exception as e: |
| LOW | archive/kt-sft/test_adapter/inspect_adapter.py | 94 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/local_chat.py | 224 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/util/custom_loader.py | 71 | print(f"Error opening Safetensor file {file_path}: {e}") |
| MEDIUM | archive/kt-sft/ktransformers/util/custom_loader.py | 81 | print(f"Error reading Safetensor file {file_path}: {e}") |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 70 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 80 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 557 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 566 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/util/utils.py | 74 | except Exception: |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 85 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/util/weight_loader.py | 86 | print(f"Error opening Safetensor file {file_path}: {e}") |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 95 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/util/weight_loader.py | 96 | print(f"Error reading Safetensor file {file_path}: {e}") |
| LOW | archive/kt-sft/ktransformers/tests/mmlu_pro_test.py | 150 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/tests/mmlu_pro_test.py | 151 | print(f"Error processing request {i}: {e}") |
| LOW | archive/kt-sft/ktransformers/tests/mmlu_test_multi.py | 156 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/tests/mmlu_test_multi.py | 157 | print(f"Error processing request {index}: {e}") |
| LOW | archive/kt-sft/ktransformers/tests/mmlu_test.py | 142 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/tests/mmlu_test.py | 143 | print(f"Error processing request {i}: {e}") |
| LOW | archive/kt-sft/ktransformers/tests/test_speed.py | 116 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/tests/test_speed.py | 134 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/tests/test_speed.py | 48 | def fetch_event_stream(session, request_id, prompt, max_tokens, model): |
| LOW | archive/kt-sft/ktransformers/tests/test_client.py | 63 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/tests/test_client.py | 73 | except Exception as e: |
| MEDIUM | archive/kt-sft/ktransformers/tests/test_client.py | 15 | def fetch_event_stream(session, payload, request_id, stream): |
| LOW | …chive/kt-sft/ktransformers/tests/humaneval/eval_api.py | 75 | except Exception as e: |
| MEDIUM | …chive/kt-sft/ktransformers/tests/humaneval/eval_api.py | 78 | print(f"Error: {e}") |
| LOW | …chive/kt-sft/ktransformers/tests/AIME_2024/eval_api.py | 110 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/server/utils/sql_utils.py | 97 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/server/utils/sql_utils.py | 108 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/server/utils/sql_utils.py | 123 | except Exception as e: |
| LOW | …kt-sft/ktransformers/server/balance_serve/sched_rpc.py | 98 | except Exception as e: |
| LOW | …serve/inference/distributed/custom_all_reduce_utils.py | 244 | except Exception as e: |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 1248 | except Exception as e: |
| MEDIUM | …/balance_serve/inference/distributed/parallel_state.py | 1249 | print("Error ignored in is_in_the_same_node: %s", e) |
| LOW | …lance_serve/inference/distributed/custom_all_reduce.py | 20 | except Exception: |
| LOW | …/balance_serve/inference/distributed/pynccl_wrapper.py | 193 | except Exception as e: |
| LOW | …s/server/balance_serve/inference/distributed/pynccl.py | 62 | except Exception: |
| LOW | …-sft/ktransformers/server/api/openai/endpoints/chat.py | 379 | except Exception as e: |
| LOW | archive/kt-sft/ktransformers/sft/lora.py | 150 | except Exception: |
| LOW | archive/kt-sft/ktransformers/sft/lora.py | 228 | except Exception: |
| LOW | archive/kt-sft/ktransformers/sft/lora.py | 327 | except Exception: |
| MEDIUM | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 153 | def active_adapters(self) -> list[str]: |
| LOW | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 919 | except Exception: # something went wrong, roll back |
| LOW | …t-sft/ktransformers/sft/flops_utils/lora_test_utils.py | 29 | except Exception as e: |
| LOW | …t-sft/ktransformers/sft/flops_utils/lora_test_utils.py | 40 | except Exception as e: |
| LOW | …t-sft/ktransformers/sft/flops_utils/lora_test_utils.py | 58 | except Exception as e: |
| 255 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | archive/setup.py | 80 | def get_musa_bare_metal_version(self, musa_dir): |
| LOW | archive/setup.py | 90 | def get_rocm_bare_metal_version(self, rocm_dir): |
| LOW | archive/setup.py | 154 | def get_cuda_bare_metal_version(self, cuda_dir): |
| LOW | archive/setup.py | 163 | def get_cuda_version_of_torch(self): |
| LOW | archive/setup.py | 365 | def run_command_with_live_tail(ext: str, command: List[str], output_lines: int = 20, |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 27 | def read_safetensor_keys_from_folder(folder_path) -> dict: |
| LOW | archive/merge_tensors/merge_safetensor_gguf.py | 15 | def read_safetensor_keys_from_folder(folder_path)->dict: |
| LOW | archive/csrc/custom_marlin/utils/format24.py | 21 | def _calculate_meta_reordering_scatter_offsets(m, meta_ncols, meta_dtype, |
| LOW | archive/csrc/custom_marlin/utils/format24.py | 52 | def sparse_semi_structured_from_dense_cutlass(dense): |
| LOW | archive/csrc/custom_marlin/utils/format24.py | 184 | def sparse_semi_structured_to_dense_cutlass(sparse, meta_reordered): |
| LOW | archive/kt-sft/setup.py | 105 | def get_musa_bare_metal_version(self, musa_dir): |
| LOW | archive/kt-sft/setup.py | 115 | def get_rocm_bare_metal_version(self, rocm_dir): |
| LOW | archive/kt-sft/setup.py | 179 | def get_cuda_bare_metal_version(self, cuda_dir): |
| LOW | archive/kt-sft/setup.py | 188 | def get_cuda_version_of_torch(self): |
| LOW | archive/kt-sft/setup.py | 384 | def run_command_with_live_tail(ext: str, command: List[str], output_lines: int = 20, |
| LOW | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 15 | def read_safetensor_keys_from_folder(folder_path)->dict: |
| LOW | …kt-sft/csrc/ktransformers_ext/examples/test_sft_moe.py | 604 | def test_backward_one_vs_many_comparison(): |
| LOW | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 21 | def _calculate_meta_reordering_scatter_offsets(m, meta_ncols, meta_dtype, |
| LOW | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 52 | def sparse_semi_structured_from_dense_cutlass(dense): |
| LOW | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 184 | def sparse_semi_structured_to_dense_cutlass(sparse, meta_reordered): |
| LOW | archive/kt-sft/ktransformers/local_chat.py | 239 | # def first_token_argmax_baseline(model, tokenizer, prompt_text, device): |
| LOW | archive/kt-sft/ktransformers/operators/cpuinfer.py | 328 | def update_importance_one_block( |
| LOW | archive/kt-sft/ktransformers/operators/cpuinfer.py | 473 | def clear_importance_all_layers( |
| LOW | archive/kt-sft/ktransformers/operators/cpuinfer.py | 704 | def get_all_kvcache_one_layer( |
| LOW | …ve/kt-sft/ktransformers/operators/dynamic_attention.py | 271 | def get_preselect_block_table_and_attn_score( |
| LOW | …ive/kt-sft/ktransformers/operators/triton_attention.py | 165 | def _decode_grouped_att_m_fwd( |
| LOW | …ive/kt-sft/ktransformers/operators/triton_attention.py | 313 | def _decode_softmax_reducev_fwd( |
| LOW | …ive/kt-sft/ktransformers/operators/triton_attention.py | 358 | def decode_attention_fwd_grouped( |
| LOW | archive/kt-sft/ktransformers/util/custom_gguf.py | 97 | def quant_shape_to_byte_shape(shape: Sequence[int], quant_type: GGMLQuantizationType): |
| LOW | archive/kt-sft/ktransformers/util/custom_gguf.py | 635 | def translate_name_to_gguf_mixtral(name): |
| LOW | archive/kt-sft/ktransformers/util/custom_gguf.py | 704 | def translate_adapter_name_to_gguf(name): |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 29 | def _compute_default_rope_parameters( |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 71 | def _compute_linear_scaling_rope_parameters( |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 112 | def _compute_dynamic_ntk_parameters( |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 259 | def _compute_longrope_parameters( |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 407 | def _validate_default_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None): |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 415 | def _validate_linear_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None): |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 427 | def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None): |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 441 | def _validate_yarn_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None): |
| LOW | …chive/kt-sft/ktransformers/util/modeling_rope_utils.py | 479 | def _validate_longrope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None): |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 389 | def get_undequanted_tensor_and_ggml_type(self, name): |
| LOW | archive/kt-sft/ktransformers/util/utils.py | 527 | def prefill_and_generate_capture( |
| LOW | …/kt-sft/ktransformers/server/utils/create_interface.py | 38 | def get_thread_context_manager() -> GlobalContextManager: |
| LOW | …kt-sft/ktransformers/server/backend/context_manager.py | 29 | async def get_context_by_run_object(self, run: RunObject) -> ThreadContext: |
| LOW | archive/kt-sft/ktransformers/server/backend/base.py | 57 | def report_last_time_performance(self): |
| LOW | …transformers/server/backend/interfaces/transformers.py | 176 | def format_and_tokenize_input_ids(self, thread_id: ObjectID, messages: List): |
| LOW | …ransformers/server/backend/interfaces/balance_serve.py | 94 | def report_last_time_performance(profiler: Profiler): |
| LOW | …ransformers/server/backend/interfaces/balance_serve.py | 411 | def format_and_tokenize_input_ids(self, thread_id: ObjectID, messages: List): |
| LOW | …/ktransformers/server/schemas/assistants/assistants.py | 133 | def get_related_threads_objects(self) -> List: |
| LOW | …ft/ktransformers/server/schemas/assistants/messages.py | 160 | def stream_response_with_event(self, event: MessageBase.Status) -> MessageStreamResponse: |
| LOW | …t/ktransformers/server/schemas/assistants/streaming.py | 136 | def wrap_async_generator_into_queue(async_events: AsyncIterable) -> asyncio.Queue: |
| LOW | …kt-sft/ktransformers/server/schemas/assistants/runs.py | 105 | def stream_response_with_event(self,event:RunBase.Status)->RunStreamResponse: |
| LOW | …kt-sft/ktransformers/server/schemas/assistants/runs.py | 123 | def create_message_creation_step(self): |
| LOW | …kt-sft/ktransformers/server/balance_serve/sched_rpc.py | 179 | def get_inference_context_raw(self): |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 891 | def init_model_parallel_group( |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 967 | def init_distributed_environment( |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 1014 | def initialize_model_parallel( |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 1091 | def ensure_model_parallel_initialized( |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 1120 | def model_parallel_is_initialized(): |
| LOW | …/balance_serve/inference/distributed/parallel_state.py | 1129 | def patch_tensor_parallel_group(tp_group: GroupCoordinator): |
| 246 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | archive/setup.py | 238 | |
| LOW | archive/setup.py | 490 | |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 27 | |
| LOW | …chive/merge_tensors/merge_safetensor_gguf_for_qwen3.py | 103 | |
| LOW | archive/merge_tensors/merge_safetensor_gguf.py | 15 | |
| LOW | archive/merge_tensors/merge_safetensor_gguf.py | 97 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_moe_torch.py | 80 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_moe.py | 31 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_moe_amx.py | 29 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_mlp.py | 28 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_linear.py | 28 | |
| LOW | …ive/csrc/ktransformers_ext/bench/bench_linear_torch.py | 26 | |
| LOW | archive/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 47 | |
| LOW | archive/kt-sft/setup.py | 259 | |
| LOW | archive/kt-sft/setup.py | 509 | |
| LOW | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 15 | |
| LOW | archive/kt-sft/merge_tensors/merge_safetensor_gguf.py | 97 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_moe_torch.py | 80 | |
| LOW | …chive/kt-sft/csrc/ktransformers_ext/bench/bench_moe.py | 31 | |
| LOW | …e/kt-sft/csrc/ktransformers_ext/bench/bench_moe_amx.py | 29 | |
| LOW | …chive/kt-sft/csrc/ktransformers_ext/bench/bench_mlp.py | 28 | |
| LOW | …ve/kt-sft/csrc/ktransformers_ext/bench/bench_linear.py | 28 | |
| LOW | …sft/csrc/ktransformers_ext/bench/bench_linear_torch.py | 26 | |
| LOW | …kt-sft/csrc/ktransformers_ext/bench/bench_mlp_torch.py | 47 | |
| LOW | …ft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py | 476 | |
| LOW | …ft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py | 501 | |
| LOW | …ft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py | 536 | |
| LOW | …ft/csrc/ktransformers_ext/examples/test_sft_amx_moe.py | 551 | |
| LOW | archive/kt-sft/ktransformers/local_chat.py | 87 | |
| LOW | archive/kt-sft/ktransformers/optimize/optimize.py | 20 | |
| LOW | archive/kt-sft/ktransformers/optimize/optimize.py | 55 | |
| LOW | archive/kt-sft/ktransformers/operators/linear.py | 83 | |
| LOW | archive/kt-sft/ktransformers/operators/cpuinfer.py | 30 | |
| LOW | …ve/kt-sft/ktransformers/operators/dynamic_attention.py | 271 | |
| LOW | …ve/kt-sft/ktransformers/operators/dynamic_attention.py | 605 | |
| LOW | …sft/ktransformers/operators/balance_serve_attention.py | 327 | |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 88 | |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 353 | |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 799 | |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 1022 | |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 1069 | |
| LOW | archive/kt-sft/ktransformers/util/custom_gguf.py | 170 | |
| LOW | archive/kt-sft/ktransformers/util/vendors.py | 23 | |
| LOW | archive/kt-sft/ktransformers/util/vendors.py | 75 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 47 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 100 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 260 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 296 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 426 | |
| LOW | archive/kt-sft/ktransformers/util/custom_loader.py | 508 | |
| LOW | archive/kt-sft/ktransformers/util/utils.py | 166 | |
| LOW | archive/kt-sft/ktransformers/util/utils.py | 61 | |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 59 | |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 190 | |
| LOW | archive/kt-sft/ktransformers/tests/mmlu_test_multi.py | 115 | |
| LOW | archive/kt-sft/ktransformers/tests/test_speed.py | 48 | |
| LOW | archive/kt-sft/ktransformers/tests/test_client.py | 15 | |
| LOW | …chive/kt-sft/ktransformers/tests/humaneval/eval_api.py | 34 | |
| LOW | …/kt-sft/ktransformers/server/utils/create_interface.py | 19 | |
| LOW | …kt-sft/ktransformers/server/backend/context_manager.py | 29 | |
| 274 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| MEDIUM | install.sh | 47 | # ─── Helpers ─────────────────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 81 | # ─── Submodule init ──────────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 97 | # ─── sglang install ─────────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 126 | # ─── kt-kernel install ──────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 145 | # ─── deps install ───────────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 161 | # ─── "all" subcommand ───────────────────────────────────────────────────────── |
| MEDIUM | install.sh | 212 | # ─── Subcommand dispatcher ──────────────────────────────────────────────────── |
| MEDIUM | …/balance_serve/inference/distributed/parallel_state.py | 327 | # -------------------------------------------- |
| MEDIUM | archive/third_party/llamafile/tinyblas_cpu.h | 30 | // ╚═╝ ╚═╝╚═╝ ╚═╝ ╚══╝ ╚═════╝ ╚═══╝╚═╝ ╚═╝╚═════╝ |
| MEDIUM | …ive/ktransformers/operators/ascend/ascend_attention.py | 920 | # ------------------------------------------------------- |
| MEDIUM | …ive/ktransformers/operators/ascend/ascend_attention.py | 922 | # ------------------------------------------------------- |
| MEDIUM | …ive/ktransformers/operators/ascend/ascend_attention.py | 994 | # ------------------------------------------------------- |
| MEDIUM | …ive/ktransformers/operators/ascend/ascend_attention.py | 996 | # ------------------------------------------------------- |
| MEDIUM | archive/ktransformers/tests/UT/test_kdeepseek_ln_npu.py | 12 | # ========================== |
| MEDIUM | archive/ktransformers/tests/UT/test_kdeepseek_ln_npu.py | 14 | # ========================== |
| MEDIUM | …s/tests/UT/test_kdeepseek_attention_w8a8a2serve_npu.py | 221 | # ========================== |
| MEDIUM | …s/tests/UT/test_kdeepseek_attention_w8a8a2serve_npu.py | 223 | # ========================== |
| MEDIUM | …/balance_serve/inference/distributed/parallel_state.py | 327 | # -------------------------------------------- |
| MEDIUM | …sformers/models/ascend/custom_ascend_modeling_qwen3.py | 87 | # --------------------------------------------------- |
| MEDIUM | …sformers/models/ascend/custom_ascend_modeling_qwen3.py | 89 | # --------------------------------------------------- |
| MEDIUM | kt-kernel/setup.py | 61 | # ------------------------- |
| MEDIUM | kt-kernel/setup.py | 63 | # ------------------------- |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 102 | # ============================================================================== |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 104 | # ============================================================================== |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 326 | # ============================================================================== |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 328 | # ============================================================================== |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 408 | # ============================================================================== |
| MEDIUM | kt-kernel/bench/bench_write_buffer.py | 410 | # ============================================================================== |
| MEDIUM | kt-kernel/test/test_native_moe_loader_auto_release.py | 79 | # --------------------------------------------------------------------------- |
| MEDIUM | kt-kernel/test/test_native_moe_loader_auto_release.py | 81 | # --------------------------------------------------------------------------- |
| MEDIUM | kt-kernel/test/test_native_moe_loader_auto_release.py | 199 | # --------------------------------------------------------------------------- |
| MEDIUM | kt-kernel/test/test_native_moe_loader_auto_release.py | 201 | # --------------------------------------------------------------------------- |
| MEDIUM | kt-kernel/python/experts.py | 298 | # ============================================================================= |
| MEDIUM | kt-kernel/python/experts.py | 300 | # ============================================================================= |
| MEDIUM | kt-kernel/python/cli/utils/model_registry.py | 377 | # ============================================================================ |
| MEDIUM | kt-kernel/python/cli/utils/model_registry.py | 379 | # ============================================================================ |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 46 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 48 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 136 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 138 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 408 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/wrapper.py | 410 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/arch.py | 21 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/arch.py | 23 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/arch.py | 42 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/arch.py | 44 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 307 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 309 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 454 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 456 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 580 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 582 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 29 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 31 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 82 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 84 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 132 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 134 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 523 | # ============================================================================= |
| MEDIUM | kt-kernel/python/sft/lora.py | 525 | # ============================================================================= |
| 23 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | docker/docker-utils.sh | 106 | # Check if image exists |
| LOW | docker/docker-utils.sh | 230 | # Check if Docker daemon is running |
| LOW | docker/docker-utils.sh | 240 | # Check if user is logged into Docker registry |
| LOW | docker/docker-utils.sh | 297 | # Check if file/directory exists and is writable |
| LOW | docker/build-docker-tar.sh | 373 | # Check if tar file already exists |
| LOW | docker/push-to-dockerhub.sh | 290 | # Check if we should skip build |
| LOW | docker/push-to-dockerhub.sh | 862 | # Check if we should skip build |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 470 | # Check if we need to allocate or expand buffers |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 177 | # Check if any safetensor files exist in the folder |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 197 | # Check if path exists |
| LOW | archive/kt-sft/ktransformers/util/weight_loader.py | 362 | # Check if any GGUF files exist in the folder |
| LOW | …-sft/ktransformers/server/api/openai/endpoints/chat.py | 203 | # Check if tools are present |
| LOW | …hive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py | 571 | fan_in_fan_out: bool = False, # Set this to True if the layer to replace stores weight like (fan_in, fan_out) |
| LOW | …hive/kt-sft/ktransformers/sft/peft_utils/lora_layer.py | 1021 | fan_in_fan_out: bool = False, # Set this to True if the layer to replace stores weight like (fan_in, fan_out) |
| LOW | archive/ktransformers/util/weight_loader.py | 177 | # Check if any safetensor files exist in the folder |
| LOW | archive/ktransformers/util/weight_loader.py | 197 | # Check if path exists |
| LOW | archive/ktransformers/util/weight_loader.py | 362 | # Check if any GGUF files exist in the folder |
| LOW | …hive/ktransformers/server/api/openai/endpoints/chat.py | 203 | # Check if tools are present |
| LOW | kt-kernel/bench/bench_bf16_moe.py | 222 | # Print results |
| LOW | kt-kernel/bench/bench_fp8_perchannel_moe.py | 234 | # Print results |
| LOW | kt-kernel/bench/bench_fp8_moe.py | 243 | # Print results |
| LOW | kt-kernel/test/per_commit/test_moe_amx_bench_int8.py | 23 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_moe_amx_accuracy_int4.py | 19 | # Check if dependencies are available |
| LOW | …kernel/test/per_commit/test_moe_amx_accuracy_int4_1.py | 19 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_moe_amx_bench_int4.py | 23 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_basic_cpu.py | 17 | # Check if kt_kernel_ext is available |
| LOW | …ernel/test/per_commit/test_moe_amx_accuracy_int4_1k.py | 19 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_moe_amx_accuracy_int8.py | 19 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_moe_amx_bench_int4_1k.py | 24 | # Check if dependencies are available |
| LOW | kt-kernel/test/per_commit/test_moe_amx_bench_int4_1.py | 23 | # Check if dependencies are available |
| LOW | kt-kernel/python/_cpu_detect.py | 87 | # Check if all required flags are present |
| LOW | kt-kernel/python/utils/llamafile.py | 84 | # Check if intermediate_size is divisible by QK_K |
| LOW | kt-kernel/python/utils/loader.py | 213 | # Check if backward weights exist |
| LOW | kt-kernel/python/utils/loader.py | 341 | # Check if any key matches this format pattern |
| LOW | kt-kernel/python/cli/main.py | 373 | # Check if path exists or parent is writable |
| LOW | kt-kernel/python/cli/main.py | 380 | # Check if we can create it (parent writable) |
| LOW | kt-kernel/python/cli/main.py | 407 | # Check if already installed |
| LOW | kt-kernel/python/cli/main.py | 505 | # Check if this is first run |
| LOW | kt-kernel/python/cli/utils/console.py | 142 | # Check if response matches a choice directly |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 226 | # Check if available space >= required * 1.2 (20% buffer) |
| LOW | kt-kernel/python/cli/utils/model_verifier.py | 28 | # Read file in chunks to handle large files |
| LOW | kt-kernel/python/cli/utils/model_verifier.py | 671 | # Check if already verified |
| LOW | kt-kernel/python/cli/utils/model_verifier.py | 683 | # Check if repo_id exists |
| LOW | kt-kernel/python/cli/utils/model_scanner.py | 94 | # Check if size meets minimum threshold |
| LOW | kt-kernel/python/cli/utils/model_scanner.py | 633 | # Check if this root is a parent of any already selected root |
| LOW | kt-kernel/python/cli/utils/kv_cache_calculator.py | 67 | # Check if it's MLA (Multi-head Latent Attention) model |
| LOW | kt-kernel/python/cli/utils/kv_cache_calculator.py | 96 | # Check if it's NSA (Native Sparse Attention) model |
| LOW | kt-kernel/python/cli/utils/model_discovery.py | 101 | # Check if already in registry |
| LOW | kt-kernel/python/cli/utils/model_discovery.py | 105 | # Check if already discovered in this session |
| LOW | kt-kernel/python/cli/utils/tuna_engine.py | 202 | # Check if process has output |
| LOW | kt-kernel/python/cli/utils/tuna_engine.py | 321 | # Check if we got a valid response |
| LOW | kt-kernel/python/cli/utils/tuna_engine.py | 432 | # Check if even 0 doesn't work |
| LOW | kt-kernel/python/cli/utils/download_helper.py | 77 | # Check if filename matches pattern |
| LOW | kt-kernel/python/cli/utils/environment.py | 117 | # Check if venv is available (built into Python) |
| LOW | kt-kernel/python/cli/utils/environment.py | 146 | # Check if env_name appears as a separate word in the output |
| LOW | kt-kernel/python/cli/utils/environment.py | 703 | # Check if writable |
| LOW | kt-kernel/python/cli/utils/environment.py | 742 | # Check if parent exists for paths that don't exist yet |
| LOW | kt-kernel/python/cli/utils/environment.py | 917 | # Check if this directory is a model |
| LOW | kt-kernel/python/cli/utils/model_registry.py | 276 | # Check if query is contained in name |
| LOW | kt-kernel/python/cli/utils/model_registry.py | 280 | # Check if query is contained in aliases |
| 41 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | kt-kernel/operators/moe-sft-tp.hpp | 353 | // Step 1: For each NUMA, allocate and copy partitioned weights |
| LOW | kt-kernel/operators/moe-sft-tp.hpp | 392 | // Step 2: Set weight pointers BEFORE load_weights (Bug #24 fix) |
| LOW | kt-kernel/operators/moe-sft-tp.hpp | 399 | // Step 3: Prepare backward weights (this also clears weight pointers) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 954 | // Step 1: Expert routing (reuse base class logic) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 973 | // Step 2: Buffer pool allocation (reuse base class logic) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1075 | // Step 3: Copy input to expert buffers |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1112 | // Step 4: Quantize input |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1120 | // Step 5: Gate + Up GEMM (base projection) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1227 | // Step 6: Activation (silu(gate) * up) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1284 | // Step 7: Quantize intermediate for down projection |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1293 | // Step 8: Down GEMM |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1345 | // Step 9: Weighted merge |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1559 | // Step 1: Down projection backward |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 1721 | // Step 4: Compute grad_weights (gradient for routing weights) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3374 | // Step 1: input @ lora_A^T -> lora_intermediate |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3406 | // Step 2: Quantize lora_intermediate to BufferA |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3539 | // Step 1: intermediate @ down_lora_A^T -> lora_intermediate |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3568 | // Step 2: Quantize lora_intermediate to BufferA |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3759 | // Step 1: intermediate = input @ lora_A^T (optimized with T_BLOCK=4, R_BLOCK=4) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3765 | // Step 2: output += scale * (intermediate @ lora_B_transposed) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3820 | // Step 1: intermediate = input @ lora_A^T (optimized with T_BLOCK=4, R_BLOCK=4) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 3831 | // Step 2: output += scale * (intermediate @ lora_B_transposed) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4231 | // Step 1: Zero per-expert grad_output buffers |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4243 | // Step 2: Scatter grad_output to per-expert BF16 buffers |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4290 | // Step 3: Quantize scattered grad_output to BufferA |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4383 | // Step 1: grad_output @ down_lora_B_transposed -> [local_num_tokens, rank] |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4390 | // Step 2: grad_times_b @ down_lora_A -> [local_num_tokens, inter_size] (AVX512) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 4402 | // Step 5: LoRA gradient computation (parallelized across blocks) |
| LOW | kt-kernel/operators/amx/sft_moe.hpp | 5414 | // Step 6: grad_A = G_B^T @ X |
| LOW | kt-kernel/operators/amx/test/test_lora_fused_add.cpp | 1657 | // Step 1: Reduce 512 -> 256 by adding high/low halves (8 ops) |
| LOW | kt-kernel/operators/amx/test/test_lora_fused_add.cpp | 1671 | // Step 2: Pack pairs into single 512-bit vectors |
| LOW | kt-kernel/operators/amx/test/test_lora_fused_add.cpp | 1679 | // Step 3: Reduce 256 -> 128 within each pair |
| LOW | kt-kernel/operators/amx/test/test_lora_fused_add.cpp | 1686 | // Step 4: Reduce 128 -> 64 -> 32 within each |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 924 | // Step 1: Interleave 16-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 934 | // Step 2: Interleave 32-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 944 | // Step 3: Interleave 64-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 985 | // Step 1: Interleave 16-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 1003 | // Step 2: Interleave 32-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 1021 | // Step 3: Interleave 64-bit |
| LOW | kt-kernel/operators/amx/la/avx_kernels.hpp | 1039 | // Step 4: Permute 128-bit lanes |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 245 | # Step 1: Select model |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 260 | # Step 2: Configure quantization method |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 263 | # Step 3: Configure CPU parameters |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 266 | # Step 4: Configure output path |
| LOW | kt-kernel/python/cli/utils/quant_interactive.py | 288 | # Step 5: Calculate space requirements and check availability |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 893 | # Step 1: Select model |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 898 | # Step 2: Select inference method |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 993 | # Step 3: Configure NUMA and CPU |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 996 | # Step 4: Configure GPU experts |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 999 | # Step 5: Configure KV Cache (only for raw) |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 1003 | # Step 6: Select GPUs and TP |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 1008 | # Step 7: Configure parsers (optional) |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 1011 | # Step 8: Configure host and port |
| LOW | kt-kernel/python/cli/utils/run_interactive.py | 1035 | # Step 9: Save configuration |
| LOW | kt-kernel/python/cli/commands/run.py | 330 | # Step 2: Resolve model |
| LOW | kt-kernel/python/cli/commands/run.py | 390 | # Step 3: Check quantized weights (only if explicitly requested) |
| LOW | kt-kernel/python/cli/commands/run.py | 414 | # Step 4: Build command |
| LOW | kt-kernel/python/cli/commands/run.py | 514 | # Step 5: Show configuration summary |
| LOW | kt-kernel/python/cli/commands/run.py | 544 | # Step 6: Show or execute |
| LOW | kt-kernel/python/cli/commands/model.py | 2583 | # Step 1: Delete the corrupted/missing file if it exists |
| 8 more matches not shown… | | | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| HIGH | archive/csrc/custom_marlin/utils/format24.py | 142 | -1, idxs0.unsqueeze(-1)) # type: ignore[possibly-undefined] |
| HIGH | archive/csrc/custom_marlin/utils/format24.py | 149 | k // 2) # type: ignore[possibly-undefined] |
| HIGH | archive/csrc/custom_marlin/utils/format24.py | 172 | (m * meta_ncols, )) # type: ignore[possibly-undefined] |
| HIGH | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 142 | -1, idxs0.unsqueeze(-1)) # type: ignore[possibly-undefined] |
| HIGH | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 149 | k // 2) # type: ignore[possibly-undefined] |
| HIGH | archive/kt-sft/csrc/custom_marlin/utils/format24.py | 172 | (m * meta_ncols, )) # type: ignore[possibly-undefined] |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 142 | -1, idxs0.unsqueeze(-1)) # type: ignore[possibly-undefined] |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 149 | k // 2) # type: ignore[possibly-undefined] |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 172 | (m * meta_ncols, )) # type: ignore[possibly-undefined] |
| HIGH | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 1393 | trainable params: 1843200 \|\| all params: 775873280 \|\| trainable%: 0.23756456724479544 |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 142 | -1, idxs0.unsqueeze(-1)) # type: ignore[possibly-undefined] |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 149 | k // 2) # type: ignore[possibly-undefined] |
| HIGH | …xt/operators/custom_marlin/quantize/utils/format_24.py | 172 | (m * meta_ncols, )) # type: ignore[possibly-undefined] |
| HIGH | kt-kernel/python/cli/i18n.py | 306 | "sglang_recommend_source": "Recommend reinstalling with the kvcache-ai fork: pip uninstall sglang -y && pip inst |
| HIGH | kt-kernel/python/cli/i18n.py | 926 | "sglang_recommend_source": "建议重新安装 kvcache-ai 分支: pip uninstall sglang -y && pip install sglang-kt", |
| HIGH | kt-kernel/python/cli/commands/doctor.py | 426 | kt_kernel_hint = "Reinstall SGLang: pip uninstall sglang -y && pip install sglang-kt (or run ./install.sh fr |

| Severity | File | Line | Snippet |
|---|---|---|---|
| MEDIUM | …chive/kt-sft/ktransformers/models/modeling_deepseek.py | 165 | |
| MEDIUM | …chive/kt-sft/ktransformers/models/modeling_deepseek.py | 166 | |
| MEDIUM | …chive/kt-sft/ktransformers/models/modeling_deepseek.py | 195 | |
| MEDIUM | …chive/kt-sft/ktransformers/models/modeling_deepseek.py | 196 | |
| MEDIUM | archive/ktransformers/models/modeling_deepseek.py | 164 | |
| MEDIUM | archive/ktransformers/models/modeling_deepseek.py | 165 | |
| MEDIUM | archive/ktransformers/models/modeling_deepseek.py | 194 | |
| MEDIUM | archive/ktransformers/models/modeling_deepseek.py | 195 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 131 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 133 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 137 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 140 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 148 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 149 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 154 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 155 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 160 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 173 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 176 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 177 | |
| MEDIUM | kt-kernel/python/cli/commands/bench.py | 179 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 298 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 300 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 301 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 302 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 307 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 309 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 310 | |
| MEDIUM | kt-kernel/examples/test_mla.py | 311 | |
| MEDIUM | kt-kernel/examples/test_gate.py | 40 | |
| MEDIUM | kt-kernel/examples/test_gate.py | 42 | |
| MEDIUM | kt-kernel/examples/test_gate.py | 43 | |
| MEDIUM | kt-kernel/examples/test_gate.py | 44 | |
| MEDIUM | kt-kernel/examples/test_mla_quant.py | 32 | |
| MEDIUM | kt-kernel/examples/test_mla_quant.py | 34 | |
| MEDIUM | kt-kernel/examples/test_mla_quant.py | 35 | |
| MEDIUM | kt-kernel/examples/test_mla_quant.py | 36 | |

| Severity | File | Line | Snippet |
|---|---|---|---|
| MEDIUM | archive/csrc/balance_serve/kvc2/test/pytest_load.py | 8 | # Create a kvc2 instance |
| MEDIUM | …/balance_serve/kvc2/test/pytest_raw_insert_and_read.py | 8 | # Create a kvc2 instance |
| MEDIUM | archive/csrc/balance_serve/kvc2/test/pytest_mem_read.py | 8 | # Create a kvc2 instance |
| MEDIUM | …csrc/balance_serve/kvc2/test/pytest_mem_prefix_test.py | 8 | # Create a kvc2 instance |
| MEDIUM | archive/csrc/custom_marlin/utils/quant_utils.py | 41 | # Create a tensor for bitwise right shift operation |
| MEDIUM | archive/kt-sft/csrc/custom_marlin/utils/quant_utils.py | 41 | # Create a tensor for bitwise right shift operation |
| MEDIUM | archive/kt-sft/ktransformers/util/custom_loader.py | 552 | # Create the appropriate loader based on detected file types |
| MEDIUM | archive/kt-sft/ktransformers/util/utils.py | 266 | # This function is to check if we run this model on XPU with FP16 dtype |
| MEDIUM | …/balance_serve/inference/distributed/pynccl_wrapper.py | 1 | # This file is a pure Python wrapper for the NCCL library. |
| MEDIUM | archive/ktransformers/util/custom_loader.py | 579 | # Create the appropriate loader based on detected file types |
| MEDIUM | archive/ktransformers/util/utils.py | 324 | # This function is to check if we run this model on XPU with FP16 dtype |
| MEDIUM | …/balance_serve/inference/distributed/pynccl_wrapper.py | 1 | # This file is a pure Python wrapper for the NCCL library. |
| MEDIUM | kt-kernel/python/experts.py | 81 | # Create a mask where experts 0, 2, 5 are on GPU |
| MEDIUM | kt-kernel/python/experts_base.py | 288 | # Create a new pinned tensor and copy data into it |
| MEDIUM | kt-kernel/python/utils/llamafile.py | 122 | # Initialize base class |
| MEDIUM | kt-kernel/python/utils/moe_kernel.py | 86 | # Initialize base class |
| MEDIUM | kt-kernel/python/utils/amx.py | 238 | # Initialize base class |
| MEDIUM | kt-kernel/python/cli/main.py | 47 | # Create main app with dynamic help |
| MEDIUM | kt-kernel/python/cli/utils/user_model_registry.py | 88 | self.save() # Create the file |
| MEDIUM | kt-kernel/python/cli/commands/model.py | 899 | # Create a sub-row with empty cells except for the first column (7 columns total with #) |
| MEDIUM | kt-kernel/python/cli/commands/model.py | 948 | # Create a sub-row with empty cells except for the first column |
| MEDIUM | kt-kernel/python/sft/weights.py | 171 | # Create a CPU tensor with the correct shape but NO physical memory. |

| Severity | File | Line | Snippet |
|---|---|---|---|
| HIGH | archive/kt-sft/ktransformers/util/custom_loader.py | 509 | Create a model loader for the given path by detecting the model format. The function checks for the pre |
| HIGH | …-sft/ktransformers/ktransformers_ext/triton/fp8gemm.py | 86 | Dequantizes the given weight tensor using the provided scale tensor. Args: x (torch.Tensor): The quant |
| HIGH | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 935 | Get the status of each adapter layer in the model. This method returns a list of `TunerLayerStatus` dataclass i |
| HIGH | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 965 | Get the status of tuners of the model. This method returns a `TunerModelStatus` dataclass instance, which conta |
| HIGH | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 1663 | Get the status of each adapter layer in the model. This function returns a list of `TunerLayerStatus` dataclass ins |
| HIGH | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 1781 | Get the status of tuners of the model. This function returns a `TunerModelStatus` dataclass instance, which contain |
| HIGH | archive/ktransformers/util/custom_loader.py | 536 | Create a model loader for the given path by detecting the model format. The function checks for the pre |
| HIGH | …hive/ktransformers/ktransformers_ext/triton/fp8gemm.py | 86 | Dequantizes the given weight tensor using the provided scale tensor. Args: x (torch.Tensor): The quant |
| HIGH | kt-kernel/python/_cpu_detect.py | 166 | Load the appropriate kt_kernel_ext variant. Tries to import the specified variant, with automatic fallback to |
| HIGH | kt-kernel/python/experts.py | 153 | Factory method to create the appropriate backend implementation. Args: layer_idx: Layer in |
| HIGH | kt-kernel/python/cli/utils/tuna_engine.py | 21 | Get the number of experts per layer from model config. Args: model_path: Path to the model directory |
| HIGH | kt-kernel/python/cli/utils/tuna_engine.py | 397 | Run tuna auto-tuning to find optimal num_gpu_experts. Args: model_path: Path to the model tens |
| HIGH | kt-kernel/python/sft/arch.py | 63 | Get MoE architecture configuration based on model type. Args: config: HuggingFace model configuration |

| Severity | File | Line | Snippet |
|---|---|---|---|
| MEDIUM | …hive/kt-sft/ktransformers/sft/peft_utils/peft_model.py | 831 | # TODO: consider replacing this patching of methods with a more robust mechanism: setting a flag and |
| LOW | …hive/kt-sft/ktransformers/sft/peft_utils/lora_model.py | 128 | # model, just add a `peft_config` dict attribute to your model. |
| LOW | …ive/ktransformers/operators/ascend/ascend_attention.py | 215 | # FIXME this is wrong in random choose pages for sched, currently just use kv without history |
| MEDIUM | archive/ktransformers/models/modeling_smallthinker.py | 1051 | # "unexpected if using padding tokens in conjunction with `inputs_embeds.`" |
| MEDIUM | kt-kernel/bench/compare_moe_performance.py | 291 | """Get comprehensive system information""" |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 94 | // Check BF16 buffer for NaN/Inf (using robust v != v check) |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 99 | // Use val != val for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 121 | // Check FP32 buffer for NaN/Inf (using robust v != v check) |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 126 | // Use val != val for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 1447 | // Use v != v for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 1799 | // Use v != v for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 1839 | // Use fv != fv for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/sft_moe.hpp | 1959 | // Use v != v for robust NaN detection |
| MEDIUM | kt-kernel/operators/amx/test/mmq.cpp | 689 | // pack again with 128 to fully utilize vector length |
| MEDIUM | kt-kernel/operators/amx/test/mmq.cpp | 731 | // pack again with 128 to fully utilize vector length |
| MEDIUM | kt-kernel/operators/amx/test/mmq.cpp | 833 | // pack again with 128 to fully utilize vector length |
| MEDIUM | kt-kernel/operators/amx/test/mmq-test.cpp | 693 | // pack again with 128 to fully utilize vector length |
| MEDIUM | kt-kernel/operators/amx/test/mmq-test.cpp | 735 | // pack again with 128 to fully utilize vector length |
| MEDIUM | kt-kernel/operators/amx/test/mmq-test.cpp | 837 | // pack again with 128 to fully utilize vector length |

| Severity | File | Line | Snippet |
|---|---|---|---|
| CRITICAL | …sformers/models/ascend/custom_ascend_modeling_qwen3.py | 70 | self.model.embed_tokens.weight.data = self.model.embed_tokens.weight.data.to(torch.float16) |
| CRITICAL | …sformers/models/ascend/custom_ascend_modeling_qwen3.py | 73 | self.model.norm.weight.data = self.model.norm.weight.data.to(torch.float16) |
| CRITICAL | …sformers/models/ascend/custom_ascend_modeling_qwen3.py | 75 | self.model.norm.bias.data = self.model.norm.bias.data.to(torch.float16) |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 969 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 1126 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 1475 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/kt-sft/ktransformers/operators/experts.py | 1771 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/kt-sft/ktransformers/models/modeling_mixtral.py | 878 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/kt-sft/ktransformers/models/modeling_mixtral.py | 1465 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | …hive/kt-sft/ktransformers/models/modeling_qwen3_moe.py | 292 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | …hive/kt-sft/ktransformers/models/modeling_qwen3_moe.py | 1166 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | …hive/kt-sft/ktransformers/models/modeling_qwen2_moe.py | 848 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | …hive/kt-sft/ktransformers/models/modeling_qwen2_moe.py | 1455 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | archive/ktransformers/operators/experts.py | 547 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/operators/experts.py | 665 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/operators/experts.py | 863 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/operators/experts.py | 1161 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_smallthinker.py | 116 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_smallthinker.py | 956 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | archive/ktransformers/models/modeling_mixtral.py | 877 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_mixtral.py | 1464 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | archive/ktransformers/models/modeling_qwen3_moe.py | 291 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_qwen3_moe.py | 1165 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| LOW | archive/ktransformers/models/modeling_qwen2_moe.py | 847 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_qwen2_moe.py | 1454 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |
| MEDIUM | archive/ktransformers/models/custom_cache.py | 372 | # you can use following code as check |
| LOW | archive/ktransformers/models/modeling_qwen3_next.py | 853 | # the current expert. We need to make sure to multiply the output hidden |
| LOW | archive/ktransformers/models/modeling_qwen3_next.py | 1255 | loss += self.router_aux_loss_coef * aux_loss.to(loss.device) # make sure to reside in the same device |

| Severity | File | Line | Snippet |
|---|---|---|---|
| HIGH | kt-kernel/examples/test_moe_amx.py | 428 | # Only test BF16 and INT8 as requested |
| HIGH | kt-kernel/examples/test_moe_amx.py | 486 | # Only test BF16 and INT8 as requested |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | docker/build-docker-tar.sh | 15 | # Usage: |
| LOW | docker/push-to-dockerhub.sh | 16 | # Usage: |
| LOW | docker/push-to-dockerhub.sh | 588 | # Usage: |
| LOW | …hive/csrc/balance_serve/kvc2/test/test_cuda_stream.cpp | 86 | // Example usage |