vllm-project/vllm

Adjusted Score: 16.3
Raw Score: 16.3
Time Factor: 100%
Last Push: 2026-07-14
Stars: 86.2K
Language: Python
Lines of Code: 1.6M
Files: 5.5K
Pattern Hits: 18.9K
Scan Date: 2026-07-14
HC Hit Rate: 0.09

What These Metrics Mean

Adjusted Score: Primary synthetic code indicator. Raw score normalised per 1,000 lines of code and multiplied by the temporal discount factor. This is the definitive comparative metric — use it to rank repositories by AI authorship density.
Raw Score: The unmodified sum of all severity-weighted, context-multiplied pattern match scores before temporal discounting. Reflects the absolute signal strength independent of when the repository was last active.
Time Factor: The temporal discount multiplier (0–100%) applied to the raw score. Repositories last updated before ChatGPT's launch (Nov 2022) receive a 5% factor. Full signal is only assigned to repositories active in the post-adoption era (Jan 2024+).
Pattern Hits: Total count of individual pattern matches across all files and categories. A high hit count with a low score may indicate a very large codebase with isolated AI snippets; a low count with a high score indicates dense, concentrated AI signatures.
HC Hit Rate: High+Critical pattern hits per file, averaged across the repository. This orthogonal signal catches repositories where a few files are densely packed with high-severity AI tells — a strong indicator even when the normalised score appears moderate due to codebase size.
Lines of Code / Files: Total lines and files analysed. The scanner examines 94 file extensions. These denominators are used to normalise the score, enabling fair comparison between repositories of vastly different sizes.
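
As a rough illustration of how these definitions fit together, the sketch below normalises a point total per 1,000 lines of code, applies the time factor, and computes the HC hit rate. The function names and the first example's totals are hypothetical, and the scanner's actual implementation may differ. The second example uses this report's own counts (11 CRITICAL and 483 HIGH hits) with the rounded file count of 5.5K.

```python
def adjusted_score(total_points: float, lines_of_code: int, time_factor: float) -> float:
    """Points per 1,000 lines of code, scaled by the temporal discount factor."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    per_kloc = total_points / (lines_of_code / 1000)
    return per_kloc * time_factor


def hc_hit_rate(critical_hits: int, high_hits: int, files: int) -> float:
    """High+Critical hits divided by the number of files analysed."""
    if files <= 0:
        raise ValueError("files must be positive")
    return (critical_hits + high_hits) / files


# Hypothetical repository: 5,000 points over 250,000 lines, last pushed
# before Nov 2022, so the 5% time factor applies.
print(adjusted_score(5_000, 250_000, 0.05))

# This report: 11 CRITICAL + 483 HIGH hits over roughly 5,500 files.
print(round(hc_hit_rate(11, 483, 5_500), 2))  # 0.09
```

Because the file count is rounded in the summary, the second figure only reproduces the reported 0.09 to two decimal places.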

Score History

This chart maps the temporal evolution of the adjusted synthetic code score across successive scan runs. An upward trajectory indicates ongoing incorporation of AI-generated code or expanding LLM-assisted scaffolding; a stable or declining trajectory may reflect active human refactoring, code removal, or the adoption of stricter authorship policies. The dashed secondary line (right axis) independently tracks total raw pattern hit count, which can diverge from the normalised score when codebase size changes significantly between scans.

Severity Breakdown

Classifies detected patterns by their diagnostic confidence and structural impact. CRITICAL patterns (coefficient 10) represent definitive synthetic signatures — hallucinated imports, explicit LLM attribution metadata — virtually never produced by human authors. HIGH (5) indicates strong structural tells such as cross-file repetition or cross-linguistic idioms. MEDIUM (2) covers recognisable conversational padding and AI-specific vocabulary. LOW (1) captures subtle indicators like tautological comments and generic boilerplate that require density to carry independent signal.

CRITICAL 11 · HIGH 483 · MEDIUM 2172 · LOW 16241
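
To make the coefficients concrete, the following sketch applies them to the severity counts reported above. It yields a base weighted total only: the context and density multipliers are not included, so the result should not be read as the scanner's reported score. The names `SEVERITY_COEFFICIENT` and `base_weighted_points` are illustrative.

```python
# Coefficients as stated in the severity description.
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}

# Hit counts reported for this repository.
hit_counts = {"CRITICAL": 11, "HIGH": 483, "MEDIUM": 2172, "LOW": 16241}


def base_weighted_points(counts: dict[str, int]) -> int:
    """Sum of hits times severity coefficient, before any multipliers."""
    return sum(SEVERITY_COEFFICIENT[severity] * n for severity, n in counts.items())


total_hits = sum(hit_counts.values())
print(total_hits)                        # 18907
print(base_weighted_points(hit_counts))  # 23110
```

The hit total of 18907 agrees with the figure given under Pattern Findings. LOW hits make up most of the count, and they also contribute most of the base points (16241 of 23110).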

Directory Score Breakdown

This horizontal bar chart decomposes the repository's raw synthetic code score by top-level directory, allowing you to pinpoint precisely which modules or components carry the highest AI authorship density. Directories with disproportionately high scores relative to their size warrant targeted manual review: concentrated AI signatures often trace back to mass-generated configuration layers, auto-ported test suites, LLM-scaffolded boilerplate classes, or entire subsystems authored under heavy copilot assistance. Use this view to prioritise your human code-review effort.

Pattern Findings

The scanner identified 18907 distinct pattern matches across 27 syntactic categories. Each entry below represents a discrete location in the source code where the engine recorded a statistically significant AI authorship indicator. Expand any category row to inspect the individual file paths, line numbers, code snippets, and the lexical context (CODE, COMMENT, or STRING) in which each match was detected.

Reading the findings table: The Severity column indicates the diagnostic confidence level (CRITICAL / HIGH / MEDIUM / LOW). The Context column identifies whether the match occurred inside executable code, an inline comment, or a string literal — comment-context matches receive a ×1.5 weight because LLMs systematically over-annotate. The ⚡ bolt icon marks clustered matches: three or more patterns within a 10-line window, each receiving an additional ×1.5 density multiplier as dense clusters constitute far stronger evidence of synthetic authorship than isolated hits.
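
The weighting rules in the paragraph above can be sketched as follows. This is a minimal reading of the description, not the scanner's code: it assumes CODE and STRING contexts carry a multiplier of 1.0 (the text only specifies ×1.5 for comments), and it assumes a "10-line window" means hits whose line numbers differ by less than 10. The function names are hypothetical.

```python
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
# Only the COMMENT weight is stated in the report; 1.0 elsewhere is an assumption.
CONTEXT_MULTIPLIER = {"CODE": 1.0, "COMMENT": 1.5, "STRING": 1.0}
CLUSTER_MULTIPLIER = 1.5


def clustered_lines(lines: list[int], window: int = 10, min_hits: int = 3) -> set[int]:
    """Return line numbers belonging to a window holding at least min_hits hits."""
    ordered = sorted(lines)
    flagged: set[int] = set()
    right = 0
    for left in range(len(ordered)):
        right = max(right, left)
        # Extend the window while the next hit is fewer than `window` lines away.
        while right + 1 < len(ordered) and ordered[right + 1] - ordered[left] < window:
            right += 1
        if right - left + 1 >= min_hits:
            flagged.update(ordered[left:right + 1])
    return flagged


def hit_score(severity: str, context: str, clustered: bool) -> float:
    """Severity coefficient times context multiplier times optional density multiplier."""
    score = SEVERITY_COEFFICIENT[severity] * CONTEXT_MULTIPLIER[context]
    return score * CLUSTER_MULTIPLIER if clustered else score


# Line numbers of the first four setup.py hits listed in the findings below.
print(sorted(clustered_lines([57, 62, 66, 77])))  # [57, 62, 66]
print(hit_score("MEDIUM", "COMMENT", clustered=False))  # 3.0
print(hit_score("MEDIUM", "COMMENT", clustered=True))   # 4.5
```

Under these assumptions the three setup.py hits at lines 57, 62 and 66 form a cluster while line 77 does not, which matches the bolt markers in the table. Clusters appear to be counted across categories, so a hit can carry the bolt even when its neighbours belong to a different pattern.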

Hyper-Verbose Identifiers: 10712 hits · 10466 pts

Severity	File	Line	Snippet	Context
LOW⚡	setup.py	57	def should_require_rust_frontend() -> bool:	CODE
LOW⚡	setup.py	62	def get_precompiled_rust_extension_paths() -> list[Path]:	CODE
LOW⚡	setup.py	66	def get_missing_precompiled_rust_extension_modules() -> list[str]:	CODE
LOW	setup.py	77	def has_precompiled_rust_extensions() -> bool:	CODE
LOW	setup.py	494	def fetch_metadata_for_variant(	CODE
LOW	setup.py	528	def detect_system_cuda_variant() -> str:	CODE
LOW	setup.py	582	def fetch_wheel_from_pypi_index(index_url: str, package: str = "vllm") -> str:	CODE
LOW	setup.py	737	def extract_precompiled_and_patch_package(	CODE
LOW	setup.py	860	def get_base_commit_in_main_branch() -> str:	CODE
LOW	csrc/cpu/generate_cpu_attn_dispatch.py	92	def generate_cases_for_isa_group(isa_list: list[str], include_fp8: bool = False) -> str:	CODE
LOW	csrc/libtorch_stable/quantization/machete/generate.py	334	def generate_type_option_name(kernel_types: TypeConfig):	STRING
LOW	csrc/libtorch_stable/quantization/machete/generate.py	370	def unsigned_type_with_bitwidth(num_bits):	STRING
LOW	tools/generate_versions_json.py	86	def generate_bake_native_json(args: dict[str, str]) -> dict:	CODE
LOW	tools/install_nixl_from_source_ubuntu.py	65	def install_system_dependencies():	CODE
LOW	tools/install_nixl_from_source_ubuntu.py	105	def build_and_install_prerequisites(args):	CODE
LOW	tools/build_rust.py	39	def rust_py_extension_module_names() -> list[str]:	CODE
LOW⚡	tools/pre_commit/generate_attention_backend_docs.py	1119	def _expand_flash_attn_variants(	CODE
LOW⚡	tools/pre_commit/generate_attention_backend_docs.py	1242	def parse_cuda_priority_lists() -> dict[str, list[str]]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	264	def _find_exact_cc_in_function(tree: ast.AST, func_name: str) -> str | None:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	320	def parse_mla_prefill_registry() -> dict[str, str]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	340	def parse_mla_prefill_priorities() -> dict[str, list[str]]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	406	def parse_mla_dimensions_call(node: ast.AST) -> str | None:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	436	def parse_supported_mla_dimensions(node: ast.AST | None) -> list[str]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	449	def parse_mla_prefill_backend_file(class_path: str) -> dict[str, Any] | None:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	529	def parse_mla_prefill_backends() -> list[dict[str, Any]]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	932	def parse_flash_attn_features() -> dict[str, dict[str, Any]]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1069	def parse_flashinfer_trtllm_features() -> dict[str, dict[str, Any]]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1171	def _expand_flashinfer_variants(	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1295	def _get_backends_from_return(stmts: list) -> list[str]:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1582	def generate_priority_section(priorities: dict[str, list[str]]) -> str:	CODE
LOW	…s/gumbel_precision/prove_exponential_race_precision.py	31	def measure_exponential_lower_tail(	CODE
LOW	tools/profiler/visualize_layerwise_profile.py	75	def shorten_plot_legend_strings(legend, max_char_len: int):	CODE
LOW	tools/profiler/visualize_layerwise_profile.py	94	def attempt_to_make_names_unique(entries_and_traces):	CODE
LOW	tools/profiler/visualize_layerwise_profile.py	144	def group_trace_by_operations(trace_df: "pd.DataFrame") -> "pd.DataFrame":	CODE
LOW	tools/profiler/nsys_profile_tools/gputrc2graph.py	45	def gen_nonoverlapped_sum_from_gputrace(self, in_file, out_file):	CODE
LOW	tools/profiler/nsys_profile_tools/gputrc2graph.py	66	def sum_non_overlapping_intervals(self, df):	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	20	def extract_version_from_wheel(wheel_name: str) -> str:	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	40	def get_custom_wheel_versions(install_dir: str) -> dict[str, str]:	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	94	def pin_dependencies_in_requirements(requirements_path: str, versions: dict[str, str]):	CODE
LOW	tests/test_sequence.py	9	def test_sequence_intermediate_tensors_equal():	CODE
LOW	tests/test_zen_cpu_platform_detection.py	35	def test_is_amd_zen_cpu_returns_false_when_cpuinfo_missing():	CODE
LOW	tests/test_version.py	36	def test_prev_minor_version_was(version_tuple, version_str, expected):	CODE
LOW⚡	tests/test_ray_env_utils.py	36	def test_arbitrary_var_propagated(self):	CODE
LOW⚡	tests/test_ray_env_utils.py	42	def test_worker_specific_excluded(self):	CODE
LOW⚡	tests/test_ray_env_utils.py	50	def test_non_carry_over_blacklist(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	20	def test_valueerror_converted_to_bypass(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	30	def test_original_valueerror_chained(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	44	def test_non_valueerror_propagates(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	54	def test_normal_return_preserved(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	76	def test_sentinel_attribute_set(self):	CODE
LOW	tests/test_fxgraphcache_pickle_patch.py	90	def test_patch_applied_in_current_environment():	CODE
LOW	tests/conftest.py	219	def init_test_http_connection():	CODE
LOW	tests/conftest.py	266	def should_do_global_cleanup_after_test(request) -> bool:	CODE
LOW	tests/conftest.py	740	def _hidden_states_to_seq_logprobs(	CODE
LOW	tests/conftest.py	763	def _hidden_states_to_logprobs(	CODE
LOW	tests/conftest.py	790	def generate_greedy_logprobs_limit(	CODE
LOW	tests/conftest.py	1070	def _final_steps_generate_w_logprobs(	CODE
LOW	tests/conftest.py	1180	def generate_prompt_perplexity(	CODE
LOW	tests/conftest.py	1289	def _wait_for_rocm_memory_release(self, gpu_memory_utilization: float) -> None:	CODE
LOW	tests/conftest.py	1331	def temporary_enable_log_propagate():	CODE
10652 more matches not shown…

Decorative Section Separators: 1685 hits · 5814 pts

Severity	File	Line	Snippet	Context
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1114	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1116	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1237	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1239	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1747	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1749	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	24	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	26	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	76	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	78	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	283	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	285	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	600	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	602	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	888	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	890	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	1375	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	1381	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	1474	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tools/pre_commit/generate_attention_backend_docs.py	1476	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	106	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	108	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	117	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	121	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	135	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	138	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	151	# -----------------------------------------------------------------------------	COMMENT
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	157	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	65	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	67	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	78	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	81	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	93	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	95	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	175	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	179	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	192	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	199	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	214	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	217	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	240	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	243	# -----------------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_ray_env.py	11	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_ray_env.py	13	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	46	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	48	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	61	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	63	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_ray_env.py	97	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_ray_env.py	99	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	133	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	135	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	147	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_ray_env.py	149	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_jit_monitor.py	34	# ------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/test_jit_monitor.py	36	# ------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_jit_monitor.py	121	# ------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_jit_monitor.py	123	# ------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_jit_monitor.py	434	# ------------------------------------------------------------------	COMMENT
MEDIUM	tests/test_jit_monitor.py	436	# ------------------------------------------------------------------	COMMENT
1625 more matches not shown…

Cross-File Repetition: 385 hits · 1925 pts

Severity	File	Line	Snippet	Context
HIGH	tests/v1/attention/test_mla_prefill_selector.py	0	clear lru cache to ensure each test case runs without caching.	STRING
HIGH	tests/kernels/attention/test_attention_selector.py	0	clear lru cache to ensure each test case runs without caching.	STRING
HIGH	tests/kernels/attention/test_mha_attn.py	0	clear lru cache to ensure each test case runs without caching.	STRING
HIGH	tests/kernels/attention/test_rocm_attention_selector.py	0	clear lru cache to ensure each test case runs without caching.	STRING
HIGH	tests/v1/logits_processors/utils.py	0	fake logit processor to support unit testing and examples	STRING
HIGH	docs/features/custom_logitsprocs.md	0	fake logit processor to support unit testing and examples	STRING
HIGH	examples/features/logits_processor/custom.py	0	fake logit processor to support unit testing and examples	STRING
HIGH	tests/v1/logits_processors/utils.py	0	the request-level logits processor masks out all logits except the token id identified by `target_token`	STRING
HIGH	docs/features/custom_logitsprocs.md	0	the request-level logits processor masks out all logits except the token id identified by `target_token`	STRING
HIGH	examples/features/logits_processor/custom_req.py	0	the request-level logits processor masks out all logits except the token id identified by `target_token`	STRING
HIGH	examples/features/logits_processor/custom_req_init.py	0	the request-level logits processor masks out all logits except the token id identified by `target_token`	STRING
HIGH	tests/v1/logits_processors/utils.py	0	example of wrapping a fake request-level logit processor to create a batch-level logits processor	STRING
HIGH	docs/features/custom_logitsprocs.md	0	example of wrapping a fake request-level logit processor to create a batch-level logits processor	STRING
HIGH	examples/features/logits_processor/custom_req.py	0	example of wrapping a fake request-level logit processor to create a batch-level logits processor	STRING
HIGH	tests/v1/logits_processors/utils.py	0	this method returns a new request-level logits processor, customized to the `target_token` value associated with a parti	STRING
HIGH	docs/features/custom_logitsprocs.md	0	this method returns a new request-level logits processor, customized to the `target_token` value associated with a parti	STRING
HIGH	examples/features/logits_processor/custom_req.py	0	this method returns a new request-level logits processor, customized to the `target_token` value associated with a parti	STRING
HIGH	examples/features/logits_processor/custom_req_init.py	0	this method returns a new request-level logits processor, customized to the `target_token` value associated with a parti	STRING
HIGH	…s/v1/kv_connector/nixl_integration/toy_proxy_server.py	0	lifespan context manager to handle startup and shutdown events.	STRING
HIGH	…cache/disagg_prefill_lmcache_v1/disagg_proxy_server.py	0	lifespan context manager to handle startup and shutdown events.	STRING
HIGH	…regated/mooncake_connector/mooncake_connector_proxy.py	0	lifespan context manager to handle startup and shutdown events.	STRING
HIGH	tests/tool_parsers/test_granite_tool_parser.py	0	<longcat_tool_call>{ "name": "test_function", "arguments": { "string_field": "hello", "int_field": 42, "float_field": 3.	STRING
HIGH	tests/tool_parsers/test_granite_20b_fc_tool_parser.py	0	<longcat_tool_call>{ "name": "test_function", "arguments": { "string_field": "hello", "int_field": 42, "float_field": 3.	STRING
HIGH	tests/tool_parsers/test_phi4mini_tool_parser.py	0	<longcat_tool_call>{ "name": "test_function", "arguments": { "string_field": "hello", "int_field": 42, "float_field": 3.	STRING
HIGH	tests/tool_parsers/test_longcat_tool_parser.py	0	<longcat_tool_call>{ "name": "test_function", "arguments": { "string_field": "hello", "int_field": 42, "float_field": 3.	STRING
HIGH	tests/kernels/quantization/test_block_int8.py	0	sets the default cuda device for all tests in this module.	STRING
HIGH	tests/kernels/quantization/test_int8_kernel.py	0	sets the default cuda device for all tests in this module.	STRING
HIGH	tests/kernels/moe/test_block_int8.py	0	sets the default cuda device for all tests in this module.	STRING
HIGH	tests/kernels/moe/test_triton_moe_ptpc_fp8.py	0	sets the default cuda device for all tests in this module.	STRING
HIGH	tests/distributed/test_pipeline_parallel.py	0	warning: this test runs in both single-node (4 gpus) and multi-node (2 node with 2 gpus each) modes. if the test only us	STRING
HIGH	tests/distributed/test_context_parallel.py	0	warning: this test runs in both single-node (4 gpus) and multi-node (2 node with 2 gpus each) modes. if the test only us	STRING
HIGH	tests/compile/correctness_e2e/test_sequence_parallel.py	0	warning: this test runs in both single-node (4 gpus) and multi-node (2 node with 2 gpus each) modes. if the test only us	STRING
HIGH	tests/models/multimodal/pooling/test_colpali.py	0	create a small solid-color png image and return its base64 data uri.	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3.py	0	create a small solid-color png image and return its base64 data uri.	STRING
HIGH	tests/entrypoints/pooling/scoring/util.py	0	create a small solid-color png image and return its base64 data uri.	STRING
HIGH	tests/models/multimodal/pooling/test_colpali.py	0	build a scoremultimodalparam containing an image (and optional text).	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3.py	0	build a scoremultimodalparam containing an image (and optional text).	STRING
HIGH	tests/entrypoints/pooling/scoring/util.py	0	build a scoremultimodalparam containing an image (and optional text).	STRING
HIGH	tests/models/multimodal/pooling/test_colpali.py	0	verify per-token embedding shape and l2 normalization.	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3.py	0	verify per-token embedding shape and l2 normalization.	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3_5.py	0	verify per-token embedding shape and l2 normalization.	STRING
HIGH	tests/models/multimodal/pooling/test_colpali.py	0	verify that relevant documents score higher than irrelevant ones.	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3.py	0	verify that relevant documents score higher than irrelevant ones.	STRING
HIGH	tests/models/multimodal/pooling/test_colqwen3_5.py	0	verify that relevant documents score higher than irrelevant ones.	STRING
HIGH	tests/models/multimodal/generation/test_phi4siglip.py	0	sanitize vllm output [phi3v] to be comparable with hf output.	STRING
HIGH	tests/models/multimodal/generation/test_phi4mm.py	0	sanitize vllm output [phi3v] to be comparable with hf output.	STRING
HIGH	…/models/multimodal/generation/vlm_utils/model_utils.py	0	sanitize vllm output [phi3v] to be comparable with hf output.	STRING
HIGH	…ts/models/multimodal/generation/test_granite_speech.py	0	sanitize vllm output [phi3v] to be comparable with hf output.	STRING
HIGH	tests/entrypoints/openai/completion/test_completion.py	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	…ypoints/openai/chat_completion/test_chat_completion.py	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	docs/features/structured_outputs.md	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	…atures/structured_outputs/structured_outputs_client.py	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	…tures/structured_outputs/structured_outputs_offline.py	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	benchmarks/benchmark_serving_structured_output.py	0	root ::= select_statement select_statement ::= "select " column " from " table " where " condition column ::= "col_1 " |	STRING
HIGH	tests/lora/test_qwen3moe_tp.py	0	i want you to act as a sql terminal in front of an example database, you need only to return the sql command to me. do n	STRING
HIGH	tests/lora/test_llama_tp.py	0	i want you to act as a sql terminal in front of an example database, you need only to return the sql command to me. do n	STRING
HIGH	tests/lora/test_olmoe_tp.py	0	i want you to act as a sql terminal in front of an example database, you need only to return the sql command to me. do n	STRING
HIGH	vllm/sampling_params.py	0	whether to include the stop strings in output text.	STRING
HIGH	…m/entrypoints/speech_to_text/transcription/protocol.py	0	whether to include the stop strings in output text.	STRING
HIGH	vllm/entrypoints/speech_to_text/translation/protocol.py	0	whether to include the stop strings in output text.	STRING
325 more matches not shown…
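
One way a check of this kind could work is sketched below: normalise each string (lowercase, collapsed whitespace, as the snippets above appear to be), group by normalised text, and report any text found in several files. The report does not document the scanner's actual rule. The three-file threshold is an assumption, chosen because every group visible in the table spans at least three files, and the function names are hypothetical.

```python
from collections import defaultdict


def normalise(text: str) -> str:
    """Lowercase and collapse runs of whitespace into single spaces."""
    return " ".join(text.lower().split())


def repeated_strings(strings_by_file: dict[str, list[str]], min_files: int = 3) -> dict[str, list[str]]:
    """Map each normalised string to the files containing it, keeping only
    strings that occur in at least min_files distinct files."""
    files_by_text: dict[str, set[str]] = defaultdict(set)
    for path, strings in strings_by_file.items():
        for s in strings:
            files_by_text[normalise(s)].add(path)
    return {
        text: sorted(paths)
        for text, paths in files_by_text.items()
        if len(paths) >= min_files
    }


# Illustrative input: the docstring text and three of the paths come from the
# table above; the original casing and the "Unrelated helper." entry are invented.
docstrings = {
    "tests/kernels/quantization/test_block_int8.py": [
        "Sets the default CUDA device for all tests in this module."
    ],
    "tests/kernels/quantization/test_int8_kernel.py": [
        "Sets the default CUDA device  for all tests in this module."
    ],
    "tests/kernels/moe/test_block_int8.py": [
        "sets the default cuda device for all tests in this module.",
        "Unrelated helper.",
    ],
}

for text, paths in repeated_strings(docstrings).items():
    print(len(paths), text)
```

Grouping by exact normalised text flags copied fixtures and shared docstrings alike, so matches of this kind still need manual review before being attributed to any particular authorship.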

Unused Imports: 1388 hits · 1287 pts

Severity	File	Line	Context
LOW	…c/cutlass_extensions/vllm_cutlass_library_extension.py	6	CODE
LOW	tools/build_rust.py	6	CODE
LOW	…s/gumbel_precision/prove_exponential_race_precision.py	18	CODE
LOW	tests/conftest.py	77	CODE
LOW	tests/conftest.py	77	CODE
LOW	tests/conftest.py	78	CODE
LOW	tests/utils.py	63	CODE
LOW	tests/utils.py	2176	CODE
LOW	tests/v1/kv_offload/tiering/p2p/test_sessions.py	13	CODE
LOW	tests/v1/kv_offload/tiering/p2p/test_zmq_transport.py	5	CODE
LOW	tests/v1/kv_offload/tiering/p2p/test_data_transport.py	5	CODE
LOW	tests/v1/kv_offload/tiering/p2p/test_manager.py	9	CODE
LOW	tests/v1/attention/test_mla_backends.py	75	CODE
LOW	tests/v1/attention/test_attention_backends.py	50	CODE
LOW	tests/v1/logits_processors/utils.py	13	CODE
LOW	tests/v1/cudagraph/test_cudagraph_mode.py	38	CODE
LOW	tests/v1/cudagraph/test_breakable_cudagraph.py	7	CODE
LOW	tests/v1/kv_connector/unit/test_tp_mapping.py	10	CODE
LOW	tests/v1/kv_connector/unit/test_hf3fs_client.py	17	CODE
LOW	tests/v1/kv_connector/unit/test_hf3fs_client.py	17	CODE
LOW	tests/v1/kv_connector/unit/test_hf3fs_client.py	17	CODE
LOW	tests/v1/kv_connector/unit/test_hf3fs_client.py	17	CODE
LOW	tests/v1/kv_connector/unit/test_hf3fs_client.py	17	CODE
LOW	tests/v1/kv_connector/unit/test_nixl_push_connector.py	21	CODE
LOW	…/v1/kv_connector/unit/offloading_connector/conftest.py	3	CODE
LOW	tests/v1/spec_decode/test_backup_token_async_spec.py	9	CODE
LOW	tests/v1/sample/test_topk_topp_sampler.py	33	CODE
LOW	tests/v1/engine/conftest.py	20	CODE
LOW	tests/v1/engine/conftest.py	20	CODE
LOW	tests/v1/simple_kv_offload/test_scheduler.py	5	CODE
LOW	tests/v1/simple_kv_offload/test_worker.py	10	CODE
LOW	tests/tool_use/test_gemma4_responses_adjust_request.py	32	CODE
LOW	tests/tool_parsers/test_poolside_v1_tool_parser.py	21	CODE
LOW	tests/renderers/test_chat_utils_prompt_embeds.py	6	CODE
LOW	tests/kernels/core/test_vit_fp8_attn.py	20	CODE
LOW	tests/kernels/core/test_fused_q_kv_rmsnorm.py	11	CODE
LOW	tests/kernels/ir/test_ir_ops.py	11	CODE
LOW	tests/kernels/ir/test_layernorm.py	7	CODE
LOW	tests/kernels/mamba/test_precopy_mamba_align.py	20	CODE
LOW	tests/kernels/mamba/test_gdn_forward_core_split.py	30	CODE
LOW	tests/kernels/mamba/test_ssu_dispatch.py	24	CODE
LOW	tests/kernels/moe/test_moe.py	18	CODE
LOW	tests/distributed/test_eplb_spec_decode.py	3	CODE
LOW	…add_dummy_platform/vllm_add_dummy_platform/__init__.py	10	CODE
LOW	tests/cuda/scripts/check_device_count_respects_env.py	14	CODE
LOW	tests/model_executor/test_oink_integration.py	37	CODE
LOW	tests/models/language/pooling/embed_utils.py	8	CODE
LOW	tests/models/language/pooling/test_reward.py	18	CODE
LOW	tests/models/multimodal/generation/test_pixtral.py	24	CODE
LOW	tests/vllm_test_utils/vllm_test_utils/__init__.py	8	CODE
LOW	tests/vllm_test_utils/vllm_test_utils/__init__.py	8	CODE
LOW	tests/vllm_test_utils/vllm_test_utils/__init__.py	9	CODE
LOW	tests/vllm_test_utils/vllm_test_utils/__init__.py	9	CODE
LOW	tests/parser/engine/test_replay.py	13	CODE
LOW	tests/parser/engine/trace_builder.py	14	CODE
LOW	tests/parser/engine/test_delegating_replay.py	14	CODE
LOW	tests/parser/engine/streaming_helpers.py	5	CODE
LOW	tests/parser/engine/replay_harness.py	10	CODE
LOW	tests/parser/engine/test_parser_engine.py	8	CODE
LOW	tests/parser/engine/test_ufffd_reasoning_transition.py	14	CODE
1328 more matches not shown…

Deep Nesting: 1124 hits · 1014 pts

Severity	File	Line	Context
LOW	use_existing_torch.py	21	CODE
LOW	setup.py	1004	CODE
LOW	setup.py	1048	CODE
LOW	setup.py	197	CODE
LOW	setup.py	638	CODE
LOW	setup.py	737	CODE
LOW	csrc/cpu/generate_cpu_attn_dispatch.py	92	CODE
LOW	…ibtorch_stable/quantization/marlin/generate_kernels.py	173	CODE
LOW	…btorch_stable/moe/marlin_moe_wna16/generate_kernels.py	173	CODE
LOW	tools/report_build_time_ninja.py	151	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	116	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	141	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	162	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	207	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	340	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	449	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	614	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	658	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	932	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1242	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1295	CODE
LOW	tools/pre_commit/validate_config.py	73	CODE
LOW	tools/pre_commit/check_boolean_context_manager.py	21	CODE
LOW	tools/pre_commit/check_spdx_header.py	60	CODE
LOW	tools/pre_commit/check_spdx_header.py	94	CODE
LOW	tools/pre_commit/check_spdx_header.py	139	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	40	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	94	CODE
LOW	tests/conftest.py	527	CODE
LOW	tests/conftest.py	984	CODE
LOW	tests/utils.py	1212	CODE
LOW	tests/utils.py	1545	CODE
LOW	tests/utils.py	1709	CODE
LOW	tests/utils.py	1835	CODE
LOW	tests/utils.py	576	CODE
LOW	tests/utils.py	1715	CODE
LOW	tests/utils.py	1858	CODE
LOW	tests/v1/utils.py	12	CODE
LOW	tests/v1/kv_offload/cpu/test_gpu_worker.py	223	CODE
LOW	tests/v1/tracing/test_tracing.py	22	CODE
LOW	tests/v1/attention/test_mla_backends.py	1039	CODE
LOW	tests/v1/attention/test_sparse_mla_backends.py	551	CODE
LOW	tests/v1/logits_processors/test_correctness.py	304	CODE
LOW	tests/v1/logits_processors/test_correctness.py	456	CODE
LOW	tests/v1/core/test_scheduler.py	2313	CODE
LOW	tests/v1/core/utils.py	200	CODE
LOW	tests/v1/kv_connector/unit/test_nixl_connector.py	2924	CODE
LOW	tests/v1/kv_connector/unit/test_offloading_connector.py	306	CODE
LOW	tests/v1/kv_connector/unit/test_offloading_connector.py	358	CODE
LOW	tests/v1/kv_connector/unit/test_offloading_connector.py	374	CODE
LOW	tests/v1/kv_connector/unit/utils.py	318	CODE
LOW	tests/v1/kv_connector/unit/test_mooncake_connector.py	1214	CODE
LOW	…kv_connector/unit/offloading_connector/test_metrics.py	219	CODE
LOW	tests/v1/determinism/test_batch_invariance.py	28	CODE
LOW	tests/v1/determinism/test_batch_invariance.py	155	CODE
LOW	tests/v1/determinism/test_batch_invariance.py	646	CODE
LOW	tests/v1/determinism/test_nvfp4_batch_invariant.py	45	CODE
LOW	tests/v1/spec_decode/test_eagle.py	48	CODE
LOW	tests/v1/spec_decode/test_acceptance_length.py	181	CODE
LOW	tests/v1/spec_decode/test_acceptance_length.py	231	CODE
1064 more matches not shown…

Self-Referential Comments · 317 hits · 1010 pts

Severity	File	Line	Snippet	Context
MEDIUM	tools/install_deepgemm.sh	84	# Create a temporary directory for the build	COMMENT
MEDIUM	tools/report_build_time_ninja.py	201	# Create a list that is in order by time stamp and has entries for the	COMMENT
MEDIUM	tools/pre_commit/update-dockerfile-graph.sh	26	# Define the target file path	COMMENT
MEDIUM	tests/conftest.py	553	# Create a copy to avoid modifying the original dict	COMMENT
MEDIUM	tests/utils.py	787	# Create a dedicated process group so we can kill	COMMENT
MEDIUM	tests/utils.py	1718	# Create a unique temporary file to store exception info from child	COMMENT
MEDIUM⚡	tests/test_access_log_filter.py	259	# Create a logger with our filter (simulating uvicorn.access)	COMMENT
MEDIUM⚡	tests/test_access_log_filter.py	266	# Create a custom handler that tracks messages	COMMENT
MEDIUM	tests/test_config.py	841	# Create a new mock and run the method with the same S3 URL	COMMENT
MEDIUM	tests/v1/test_tensor_ipc_queue.py	193	# Create a CPU tensor	COMMENT
MEDIUM	tests/v1/test_tensor_ipc_queue.py	511	# Create a CPU tensor	COMMENT
MEDIUM	tests/v1/test_tensor_ipc_queue.py	642	# Create a CPU tensor	COMMENT
MEDIUM	tests/v1/test_tensor_ipc_queue.py	905	# Create a tensor queue	COMMENT
MEDIUM	tests/v1/test_serial_utils.py	189	# Create a sample Python object	COMMENT
MEDIUM	tests/v1/test_serial_utils.py	207	# Create a sample tensor	COMMENT
MEDIUM	tests/v1/test_serial_utils.py	227	# Create a sample numpy array	COMMENT
MEDIUM	tests/v1/test_serial_utils.py	313	# Create a request with a non-multimodal tensor	COMMENT
MEDIUM	tests/v1/test_serial_utils.py	354	# Create a request with None for the tensor field	COMMENT
MEDIUM⚡	tests/v1/kv_offload/test_file_mapper.py	51	# Create a copy of the mock config to avoid modifying the global one	COMMENT
MEDIUM	tests/v1/metrics/test_ray_metrics.py	71	# Create the actor and call the async method	COMMENT
MEDIUM	tests/v1/attention/test_mla_backends.py	285	# Create a realistic slot mapping that corresponds to the block table	COMMENT
MEDIUM	tests/v1/attention/test_mla_backends.py	1524	# Create a summary for the single-line failure message	COMMENT
MEDIUM	tests/v1/attention/test_attention_backends.py	186	# Create a realistic slot mapping that corresponds to the block table	COMMENT
MEDIUM	tests/v1/logits_processors/test_correctness.py	807	# Define a shuffled batch of requests which individually use a different	COMMENT
MEDIUM	tests/v1/logits_processors/test_custom_offline.py	28	# Create a mixture of requests which do and don't utilize the dummy logitproc	COMMENT
MEDIUM	tests/v1/logits_processors/test_custom_offline.py	63	# Create a vLLM instance and load custom logitproc	COMMENT
MEDIUM	tests/v1/logits_processors/test_custom_offline.py	70	# Create a reference vLLM instance without custom logitproc	COMMENT
MEDIUM⚡	tests/v1/core/test_kv_cache_utils.py	258	# Create a list of KVCacheBlock objects	COMMENT
MEDIUM⚡	tests/v1/core/test_kv_cache_utils.py	261	# Create a FreeKVCacheBlockQueue with these blocks	COMMENT
MEDIUM⚡	tests/v1/core/test_kv_cache_utils.py	463	# Create a list of KVCacheBlock objects	COMMENT
MEDIUM⚡	tests/v1/core/test_kv_cache_utils.py	466	# Create a FreeKVCacheBlockQueue with these blocks	COMMENT
MEDIUM	tests/v1/core/test_kv_cache_utils.py	304	# Create an empty FreeKVCacheBlockQueue with these blocks	COMMENT
MEDIUM	tests/v1/core/test_kv_cache_utils.py	352	# Create an empty FreeKVCacheBlockQueue	COMMENT
MEDIUM	tests/v1/core/test_kv_cache_utils.py	406	# Create an empty FreeKVCacheBlockQueue with these blocks	COMMENT
MEDIUM	tests/v1/core/test_kv_cache_utils.py	1358	# Create a VllmConfig	COMMENT
MEDIUM	tests/v1/core/test_kv_cache_utils.py	1394	# Create a VllmConfig	COMMENT
MEDIUM⚡	tests/v1/core/test_scheduler.py	3034	# Create a request and schedule it	COMMENT
MEDIUM	tests/v1/core/test_scheduler.py	3061	# Create a high priority request and schedule it	COMMENT
MEDIUM	tests/v1/core/test_scheduler.py	3831	# Create a request and schedule it (and to be preempted)	COMMENT
MEDIUM	tests/v1/core/test_scheduler.py	3881	# Create a high priority request and schedule it	COMMENT
MEDIUM	tests/v1/core/test_scheduler.py	4751	# Create a text-only request (no mm_features).	COMMENT
MEDIUM	tests/v1/cudagraph/test_cudagraph_dispatch.py	56	# Create a real LoRAConfig with specialize_active_lora enabled	COMMENT
MEDIUM	tests/v1/kv_connector/unit/test_nixl_connector.py	2270	# Create a request that triggers do_remote_decode so that	COMMENT
MEDIUM⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	216	# Create a mock object that is not LMCacheKVEvents	COMMENT
MEDIUM	tests/v1/kv_connector/unit/test_moriio_connector.py	219	# Define a fake remote engine id for testing	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	145	# Create a request with multiple blocks worth of tokens	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	189	# Create a request	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	211	# Create a request with just 1 token	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	229	# Create a request with 2 tokens	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	255	# Create a request with many blocks	COMMENT
MEDIUM	…ts/v1/kv_connector/unit/test_decode_bench_connector.py	338	# Create a request that doesn't align to block boundaries	COMMENT
MEDIUM	tests/v1/kv_connector/unit/test_nixl_connector_hma.py	77	# Create a mock worker with just the required attributes	COMMENT
MEDIUM	tests/v1/kv_connector/unit/test_example_connector.py	146	# Create the LLM instance	COMMENT
MEDIUM	…r/extract_hidden_states_integration/test_extraction.py	45	# Create a minimal Llama config with small dimensions	COMMENT
MEDIUM	…r/extract_hidden_states_integration/test_extraction.py	60	# Create a simple tokenizer	COMMENT
MEDIUM	tests/v1/determinism/test_batch_invariance.py	103	# Create a batch of size `max_batch_size` and insert the needle at	COMMENT
MEDIUM	tests/v1/distributed/test_external_lb_dp.py	154	# Create a client for each server	COMMENT
MEDIUM	tests/v1/distributed/test_hybrid_lb_dp.py	182	# Create a client for each node (each node has its own API endpoint)	COMMENT
MEDIUM	tests/v1/streaming_input/test_async_llm_streaming.py	20	# Create a minimal mock without initializing the full engine	COMMENT
MEDIUM	tests/v1/streaming_input/test_async_llm_streaming.py	50	# Create a mock queue with outputs	COMMENT
257 more matches not shown…

Over-Commented Block · 963 hits · 850 pts

Severity	File	Line	Snippet	Context
LOW	CMakeLists.txt	1	cmake_minimum_required(VERSION 3.26)	COMMENT
LOW	CMakeLists.txt	61	#	COMMENT
LOW	CMakeLists.txt	101	find_program(NVCC_EXECUTABLE nvcc)	COMMENT
LOW	CMakeLists.txt	241	#	COMMENT
LOW	csrc/torch_utils.h	1	#pragma once	COMMENT
LOW	csrc/torch_bindings.cpp	1	// Provides torch::Tensor for ops.h (previously included transitively via	COMMENT
LOW	csrc/cumem_allocator_compat.h	101	} // extern "C"	COMMENT
LOW	csrc/cumem_allocator.cpp	1	// A CUDAPluggableAllocator based on cumem* APIs.	COMMENT
LOW	csrc/cuda_compat.h	41	#define VLLM_LDG(arg) __ldg(arg)	COMMENT
LOW	csrc/cuda_compat.h	61	#endif	COMMENT
LOW	csrc/cuda_utils.h	1	#pragma once	COMMENT
LOW	csrc/spinloop.cpp	1	#include <Python.h>	COMMENT
LOW	csrc/attention/attention_dtypes.h	1	#pragma once	COMMENT
LOW	csrc/core/scalar_type.hpp	1	#pragma once	COMMENT
LOW	csrc/core/registration.h	1	#pragma once	COMMENT
LOW	csrc/cpu/cpu_attn_neon.hpp	1	#ifndef CPU_ATTN_NEON_HPP	COMMENT
LOW	csrc/cpu/utils.cpp	1	#ifndef VLLM_NUMA_DISABLED	COMMENT
LOW	csrc/cpu/cpu_wna16.cpp	1	#include "cpu/cpu_types.hpp"	COMMENT
LOW	csrc/cpu/cpu_fused_moe.cpp	1	#include "cpu/cpu_types.hpp"	COMMENT
LOW	csrc/cpu/cpu_types.hpp	1	#ifndef CPU_TYPES_HPP	COMMENT
LOW	csrc/cpu/cpu_types.hpp	21	#include "cpu_types_scalar.hpp"	COMMENT
LOW	csrc/cpu/cpu_types_riscv.hpp	1	#ifndef CPU_TYPES_RISCV_HPP	COMMENT
LOW	csrc/cpu/cpu_attn_impl.hpp	981	// - q_heads_buffer: [MaxQHeadNumPerIteration, head_dim]	COMMENT
LOW	csrc/cpu/cpu_attn_impl.hpp	1001	// - q_heads_per_kv	COMMENT
LOW	csrc/cpu/cpu_attn_impl.hpp	1121	// print_logits("masked logits", logits_buffer, q_head_num,	COMMENT
LOW	csrc/cpu/cpu_attn_impl.hpp	1821		COMMENT
LOW	csrc/cpu/cpu_attn_impl.hpp	1841		COMMENT
LOW	csrc/cpu/cpu_attn_vxe.hpp	381	} // namespace cpu_attention	COMMENT
LOW	csrc/cpu/generate_cpu_attn_dispatch.py	141	#ifdef CPU_CAPABILITY_AMXBF16	COMMENT
LOW	csrc/cpu/cpu_arch_macros.h	61	#endif	COMMENT
LOW	csrc/cpu/cpu_arch_macros.h	161	#include <riscv_vector.h>	COMMENT
LOW	csrc/cpu/cpu_attn_rvv.hpp	1	// SPDX-License-Identifier: Apache-2.0	COMMENT
LOW	csrc/cpu/cpu_attn_fp8.hpp	1	// SPDX-License-Identifier: Apache-2.0	COMMENT
LOW	csrc/cpu/cpu_types_vxe.hpp	1		COMMENT
LOW	csrc/cpu/cpu_types_vxe.hpp	21	#define vec_sr(a, b) ((a) >> (b)) // Vector Shift Right Algebraic	COMMENT
LOW	csrc/cpu/cpu_types_x86.hpp	1		COMMENT
LOW	csrc/cpu/shm.cpp	1	#include "cpu/cpu_types.hpp"	COMMENT
LOW	csrc/cpu/cpu_types_riscv_impl.hpp	1	#ifndef CPU_TYPES_RISCV_IMPL_HPP	COMMENT
LOW	csrc/cpu/cpu_types_riscv_impl.hpp	1001	#define CPU_KERNEL_GUARD_IN(NAME)	COMMENT
LOW	csrc/cpu/cpu_types_riscv_defs.hpp	1	#ifndef CPU_TYPES_RISCV_DEFS_HPP	COMMENT
LOW	csrc/cpu/cpu_types_riscv_defs.hpp	21	#define BOOL_256 b16	COMMENT
LOW	csrc/cpu/cpu_attn_vsx.hpp	1	// SPDX-License-Identifier: Apache-2.0	COMMENT
LOW	csrc/cpu/cpu_attn_vsx.hpp	361		COMMENT
LOW	csrc/cpu/sgl-kernels/gemm.cpp	81	constexpr int BLOCK_N = block_size_n();	COMMENT
LOW	csrc/cpu/sgl-kernels/fla.cpp	1	// Adapted from	COMMENT
LOW	csrc/cpu/sgl-kernels/gemm.h	1	// Adapted from	COMMENT
LOW	csrc/cpu/sgl-kernels/vec.h	1	// Adapted from	COMMENT
LOW	csrc/cpu/sgl-kernels/moe.cpp	1	// Adapted from	COMMENT
LOW	csrc/cpu/sgl-kernels/moe.cpp	21	// allocates 2 intermediate_caches instead of 3	COMMENT
LOW	csrc/cpu/sgl-kernels/moe.cpp	1261	// unlike triton kernel, we fuse silu with gemm1 so only need 2 intermediate_caches:	COMMENT
LOW	csrc/cpu/sgl-kernels/moe_int4.cpp	61	// num_threads * BLOCK_M * K +	COMMENT
LOW	csrc/cpu/sgl-kernels/common.h	1	// Adapted from	COMMENT
LOW	csrc/cpu/sgl-kernels/common.h	201	}	COMMENT
LOW	csrc/cpu/sgl-kernels/common.h	281	return std::max(1, (actual_nth >> 1) * 2);	COMMENT
LOW	csrc/cpu/sgl-kernels/conv.cpp	141	Unroll<ROWS * COLS>{}(loadb);	COMMENT
LOW	csrc/cpu/sgl-kernels/conv.cpp	541	//	COMMENT
LOW	csrc/libtorch_stable/torch_utils.h	1	#pragma once	COMMENT
LOW	csrc/libtorch_stable/torch_bindings.cpp	1	#include "ops.h"	COMMENT
LOW	csrc/libtorch_stable/cub_helpers.h	1	#pragma once	COMMENT
LOW	csrc/libtorch_stable/launch_bounds_utils.h	1	#pragma once	COMMENT
903 more matches not shown…

Excessive Try-Catch Wrapping · 688 hits · 741 pts

Severity	File	Line	Snippet	Context
MEDIUM	setup.py	134	def find_tcmalloc() -> Path | None:	CODE
MEDIUM	setup.py	860	def get_base_commit_in_main_branch() -> str:	CODE
LOW	setup.py	142	except Exception:	CODE
LOW	setup.py	231	except Exception as e:	CODE
LOW	setup.py	546	except Exception:	CODE
LOW	setup.py	557	except Exception:	CODE
LOW	setup.py	684	except Exception as e:	CODE
LOW	setup.py	914	except Exception as err:	CODE
LOW	setup.py	985	except Exception:	CODE
MEDIUM	tools/generate_cmake_presets.py	168	print(f"Error writing file: {e}")	CODE
LOW	tools/install_nixl_from_source_ubuntu.py	30	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	331	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	352	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	464	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	809	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	823	except Exception as e:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	906	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	943	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1081	except Exception:	CODE
LOW	tools/pre_commit/generate_attention_backend_docs.py	1263	except Exception:	CODE
LOW	tools/profiler/nsys_profile_tools/gputrc2graph.py	239	except Exception:	CODE
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	79	except Exception as e:	CODE
LOW	tests/conftest.py	1316	except Exception:	CODE
LOW	tests/conftest.py	1579	except Exception as e:	CODE
MEDIUM	tests/utils.py	117	def _nvml():	CODE
LOW	tests/utils.py	81	except Exception as e:	CODE
LOW	tests/utils.py	315	except Exception:	CODE
LOW	tests/utils.py	614	except Exception as e:	CODE
LOW	tests/utils.py	700	except Exception:	CODE
LOW	tests/utils.py	1747	except Exception as e:	CODE
LOW	tests/utils.py	1770	except Exception:	CODE
LOW	tests/utils.py	1964	except Exception as e:	CODE
LOW	tests/utils.py	2179	except Exception:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	83	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	120	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	308	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	409	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	443	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	532	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	570	except Exception as e:	CODE
LOW	tests/v1/test_tensor_ipc_queue.py	721	except Exception as e:	CODE
LOW	tests/v1/utils.py	64	except Exception as e:	CODE
LOW⚡	tests/v1/kv_offload/cpu/test_shared_offload_region.py	156	except Exception as e:	CODE
LOW	tests/v1/kv_offload/cpu/test_shared_offload_region.py	117	except Exception as e:	CODE
LOW	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	220	except Exception as e:	CODE
LOW	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	261	except Exception as exc:	CODE
LOW	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	272	except Exception as exc:	CODE
LOW	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	292	except Exception as e:	CODE
MEDIUM	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	255	def _run_decode():	CODE
LOW	tests/v1/shutdown/test_forward_error.py	82	except Exception as e:	CODE
LOW	tests/v1/shutdown/test_processor_error.py	39	except Exception as e:	CODE
LOW⚡	tests/v1/kv_connector/unit/test_hf3fs_client.py	28	except Exception:	CODE
LOW	tests/v1/kv_connector/unit/test_multi_connector.py	390	except Exception as e:	CODE
LOW	tests/v1/kv_connector/unit/utils.py	353	except Exception as e:	CODE
LOW	…/v1/kv_connector/unit/test_mooncake_store_connector.py	700	except Exception:	CODE
LOW	…/v1/kv_connector/unit/test_mooncake_store_connector.py	737	except Exception:	CODE
LOW	tests/v1/kv_connector/unit/test_rixl_gpu_mem_diag.py	36	except Exception:	CODE
LOW	…s/v1/kv_connector/nixl_integration/toy_proxy_server.py	253	except Exception as e:	CODE
MEDIUM	…s/v1/kv_connector/nixl_integration/toy_proxy_server.py	258	print(f"Error occurred in disagg prefill proxy server - {api} endpoint")	CODE
MEDIUM	…/kv_connector/nixl_integration/test_disagg_accuracy.py	159	print(f"Error writing to file: {e}")	CODE
628 more matches not shown…

Redundant / Tautological Comments · 273 hits · 415 pts

Severity	File	Line	Snippet	Context
LOW	setup.py	882	# Check if the upstream_main_commit exists in the local repo	COMMENT
LOW	tools/install_torchcodec_rocm.sh	19	# Check if torchcodec is already installed and working	COMMENT
LOW	tools/pre_commit/generate_attention_backend_docs.py	368	# Check if it's a capability.major == 10 check (Blackwell)	COMMENT
LOW	tools/pre_commit/generate_attention_backend_docs.py	832	# Check if this is an MLA backend by parent class or naming	COMMENT
LOW	tools/pre_commit/generate_attention_backend_docs.py	1280	# Check if this is the "if use_mla:" branch	COMMENT
LOW	tools/pre_commit/check_forbidden_imports.py	99	# Check if it's allowed	COMMENT
LOW	tools/pre_commit/update-dockerfile-graph.sh	10	# Check if docker/Dockerfile is among the provided files	COMMENT
LOW	tools/pre_commit/update-dockerfile-graph.sh	14	# Check if Docker is installed and running	COMMENT
LOW	tools/pre_commit/update-dockerfile-graph.sh	71	# Check if the graph has changed	COMMENT
LOW	tools/vllm-rocm/pin_rocm_dependencies.py	148	# Check if this line is for one of our custom packages	COMMENT
LOW	tests/conftest.py	387	# Set this to avoid hanging issue	COMMENT
LOW	tests/conftest.py	926	# Set this to avoid hanging issue	COMMENT
LOW	tests/utils.py	544	os.kill(spid, 0) # Check if still alive	CODE
LOW	tests/test_config.py	601	# Check if LONGCHAT_ROPE_PARAMETERS entries are in longchat_model_config	COMMENT
LOW	tests/v1/attention/test_mla_backends.py	1158	# Set num_speculative_tokens to query_len - 1	COMMENT
LOW	tests/v1/attention/test_sparse_mla_backends.py	631	# Set some to -1 to test masking	COMMENT
LOW	tests/v1/attention/test_sparse_mla_backends.py	635	# Set some to out of bounds	COMMENT
LOW	tests/v1/attention/test_sparse_mla_backends.py	691	# Set some to -1 to test masking	COMMENT
LOW	tests/v1/attention/test_sparse_mla_backends.py	695	# Set some to out of bounds	COMMENT
LOW	tests/v1/core/test_scheduler.py	2377	# Verify if position length is identical	COMMENT
LOW	tests/v1/core/test_scheduler.py	3290	# Check if scheduled_encoder_inputs is empty as expected	COMMENT
LOW	tests/v1/core/test_scheduler.py	3642	# Set up to test different encoder cache existence scenario after preemption	COMMENT
LOW	tests/v1/core/test_scheduler.py	3972	# Set up to test different encoder cache existence scenario after preemption	COMMENT
LOW	tests/v1/core/utils.py	255	# Verify if position length is identical	COMMENT
LOW	…extract_hidden_states_integration/predictable_llama.py	81	# Check if we need auxiliary hidden states	COMMENT
LOW	…nnector/nixl_integration/config_sweep_accuracy_test.sh	97	# Check if cross-layers is enabled (non-empty)	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	280	# Check if tokens match first	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	559	# Check if tokens match first	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	788	# Check if tokens match	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	806	# Check if logprobs match bitwise	COMMENT
LOW	tests/v1/spec_decode/test_acceptance_length.py	108	# Check if get_valid_backends is actually defined in the platform class	COMMENT
LOW	…1/ec_connector/integration/run_epd_correctness_test.sh	25	# Set 1 to use multimodal prompts; else to use text-only	COMMENT
LOW	…ts/v1/ec_connector/integration/test_epd_correctness.py	218	# Check if server is ready	COMMENT
LOW	tests/v1/e2e/spec_decode/test_async_spec_decode.py	34	# Increment counter	COMMENT
LOW	tests/v1/engine/test_engine_core_client.py	311	# Check if all request IDs in outputs have finished	COMMENT
LOW	tests/v1/engine/utils.py	131	# Check if the sampled_token_id occurs in choice_tensor[1:]	COMMENT
LOW	tests/utils_/test_network_utils.py	79	# Check if IPv6 is supported by trying to create an IPv6 socket	COMMENT
LOW⚡	tests/tool_parsers/test_mistral_tool_parser.py	129	# Check if the slice from the current position matches the target sequence	COMMENT
LOW	tests/kernels/mamba/test_cpu_short_conv.py	111	# Check if KV cache was updated	COMMENT
LOW	tests/kernels/mamba/test_cpu_short_conv.py	169	# Check if KV cache was updated	COMMENT
LOW	tests/kernels/moe/test_moe_layer.py	1819	# Check if enough GPUs available	COMMENT
LOW⚡	tests/kernels/moe/test_rocm_aiter_topk.py	26	# Check if aiter package is installed	COMMENT
LOW⚡	tests/kernels/moe/test_rocm_aiter_topk.py	35	# Check if the op exists in torch.ops.vllm	COMMENT
LOW⚡	tests/kernels/moe/test_rocm_aiter_topk.py	38	# Check if the op is callable	COMMENT
LOW⚡	tests/kernels/moe/test_rocm_aiter_topk.py	44	# Check if the op exists in torch.ops.vllm	COMMENT
LOW⚡	tests/kernels/moe/test_rocm_aiter_topk.py	47	# Check if the op is callable	COMMENT
LOW	tests/evals/gsm8k/gsm8k_eval.py	352	# Print results to terminal	COMMENT
LOW	tests/distributed/test_eplb_execute.py	150	# Check if the weights are correct	COMMENT
LOW	tests/model_executor/test_qwen3_omni.py	29	# Check if it's a special token that should be compressed	COMMENT
LOW	…model_loader/runai_streamer_loader/test_runai_utils.py	56	# Read the file in chunks to handle large files efficiently	COMMENT
LOW⚡	tests/models/multimodal/generation/test_maverick.py	59	# Print the outputs	COMMENT
LOW	tests/quantization/test_gptq_v2.py	43	# Check if gptq_v2 format is correctly loaded	COMMENT
LOW	tests/quantization/test_gptq_v2.py	105	# Print the output sequences if failed	COMMENT
LOW	tests/compile/test_config.py	50	# Check if get_raw_stream exists in builtins	COMMENT
LOW	tests/compile/fusions_e2e/conftest.py	44	# Print the outputs.	COMMENT
LOW	tests/compile/fullgraph/test_full_graph.py	245	# Print the outputs.	COMMENT
LOW	tests/entrypoints/llm/offline_mode/test_offline_mode.py	71	# Set HF to offline mode and ensure we can still construct an LLM	COMMENT
LOW	tests/entrypoints/llm/offline_mode/test_offline_mode.py	140	# Set HF to offline mode and ensure we can still construct an LLM	COMMENT
LOW	tests/entrypoints/serve/utils/test_request_logger.py	95	# Set max_log_len to 10	COMMENT
LOW	…rve/sagemaker/test_sagemaker_middleware_integration.py	326	# Check if environment variable middleware was applied	COMMENT
213 more matches not shown…

Modern Structural Boilerplate · 342 hits · 344 pts

Severity	File	Line	Snippet	Context
LOW	setup.py	35	logger = logging.getLogger(__name__)	CODE
LOW	tools/profiler/nsys_profile_tools/gputrc2graph.py	15	logger = logging.getLogger(__name__)	CODE
LOW	tests/v1/kv_offload/tiering/p2p/p2p_connector_proxy.py	32	logger = logging.getLogger(__name__)	CODE
LOW⚡	tests/v1/kv_connector/unit/test_multi_connector.py	118	def update_state_after_alloc(self, request, blocks, num_tokens) -> None:	CODE
LOW	tests/v1/kv_connector/unit/test_multi_connector.py	88	def update_state_after_alloc(self, request, blocks, num_tokens) -> None:	CODE
LOW	…/v1/kv_connector/unit/offloading_connector/conftest.py	7	__all__ = ["request_runner"]	CODE
LOW	…s/v1/kv_connector/nixl_integration/toy_proxy_server.py	15	logger = logging.getLogger(__name__)	CODE
LOW	tests/v1/worker/test_gpu_worker_weight_transfer.py	28	def update_weights(self, update_info: dict) -> None:	CODE
LOW	tests/v1/engine/test_core_engine_actor_manager.py	45	def _set_visible_devices(self, vllm_config: Any, local_dp_rank: int) -> None:	CODE
LOW	tests/kernels/moe/test_marlin_vs_trtllm_mxint4.py	77	__all__ = [	CODE
LOW⚡	tests/distributed/eplb_utils.py	73	def set_env_vars_and_device(env: dict[str, str]) -> None:	CODE
LOW	tests/models/language/generation/test_hybrid.py	68	def _set_conv_state_layout(monkeypatch, layout: str) -> None:	CODE
LOW	…/models/multimodal/generation/vlm_utils/model_utils.py	37	logger = logging.getLogger(__name__)	CODE
LOW	tests/vllm_test_utils/vllm_test_utils/__init__.py	11	__all__ = ["blame", "BlameResult", "monitor", "MonitoredValues"]	CODE
LOW	tests/parser/engine/replay_harness.py	82	def set_vocab(self, vocab: dict[str, int]) -> None:	CODE
LOW	tests/entrypoints/openai/test_dp_supervisor.py	501	async def _set_healthy(port: int, use_ssl: bool = False) -> None:	CODE
LOW	tests/entrypoints/openai/test_dp_supervisor.py	511	async def _set_unhealthy(port: int, use_ssl: bool = False) -> None:	CODE
LOW	tests/entrypoints/openai/responses/conftest.py	15	logger = logging.getLogger(__name__)	CODE
LOW	…/entrypoints/openai/responses/test_parsable_context.py	21	logger = logging.getLogger(__name__)	CODE
LOW	tests/entrypoints/openai/responses/test_harmony.py	30	logger = logging.getLogger(__name__)	CODE
LOW	vllm/envs.py	533	logger = logging.getLogger(__name__)	CODE
LOW	vllm/env_override.py	443	def _update_scheduler_patched(self) -> None:	CODE
LOW	vllm/sampling_params.py	659	def update_from_tokenizer(self, tokenizer: TokenizerLike) -> None:	CODE
LOW	vllm/__init__.py	76	__all__ = [	CODE
LOW	vllm/v1/request.py	257	def update_block_hashes(self) -> None:	CODE
LOW	vllm/v1/kv_offload/tiering/example/manager.py	30	logger = logging.getLogger(__name__)	CODE
LOW	vllm/v1/kv_offload/tiering/p2p/data/__init__.py	6	__all__ = [	CODE
LOW	vllm/v1/kv_offload/tiering/p2p/control/__init__.py	12	__all__ = [	CODE
LOW	vllm/v1/kv_offload/tiering/p2p/session/__init__.py	10	__all__ = [	CODE
LOW	vllm/v1/kv_offload/tiering/fs/io.py	9	logger = logging.getLogger(__name__)	CODE
LOW	vllm/v1/metrics/stats.py	304	def update_from_output(self, prefill_stats: PrefillStats) -> None:	CODE
LOW	vllm/v1/attention/backends/flash_attn_diffkv.py	41	def set_head_size_v(cls, head_size_v: int) -> None:	CODE
LOW	vllm/v1/attention/backends/triton_attn_diffkv.py	81	def set_head_size_v(cls, head_size_v: int) -> None:	CODE
LOW	vllm/v1/attention/backends/mla/prefill/__init__.py	10	__all__ = [	CODE
LOW	vllm/v1/attention/ops/triton_decode_attention.py	42	logger = logging.getLogger(__name__)	CODE
LOW	vllm/v1/core/sched/async_scheduler.py	19	def _update_after_schedule(self, scheduler_output: SchedulerOutput) -> None:	CODE
LOW	vllm/v1/core/sched/interface.py	111	def update_draft_token_ids(self, draft_token_ids: "DraftTokenIds") -> None:	CODE
LOW	vllm/v1/core/sched/interface.py	204	def set_pause_state(self, pause_state: PauseState) -> None:	CODE
LOW⚡	vllm/v1/core/sched/scheduler.py	2474	def _update_waiting_for_remote_kv(self, request: Request) -> None:	CODE
LOW	vllm/v1/core/sched/scheduler.py	1197	def _update_after_schedule(self, scheduler_output: SchedulerOutput) -> None:	CODE
LOW	vllm/v1/core/sched/scheduler.py	1987	def update_draft_token_ids(self, draft_token_ids: DraftTokenIds) -> None:	CODE
LOW	vllm/v1/core/sched/scheduler.py	2176	def set_pause_state(self, pause_state: PauseState) -> None:	CODE
LOW	vllm/v1/spec_decode/extract_hidden_states.py	89	def set_eplb_state(self, eplb_state: EplbState) -> None:	CODE
LOW⚡	vllm/v1/spec_decode/gemma4.py	61	def set_per_group_block_table(self, gid: int, block_table: torch.Tensor) -> None:	CODE
LOW	vllm/v1/spec_decode/gemma4.py	115	def _setup_centroids_cuda_graphs(self) -> None:	CODE
LOW	vllm/v1/spec_decode/llm_base_proposer.py	343	def set_eplb_state(self, eplb_state: EplbState) -> None:	CODE
LOW	vllm/v1/sample/thinking_budget_state.py	236	def _update_think_state(self, state: dict[str, Any]) -> None:	CODE
LOW	vllm/v1/sample/logits_processor/__init__.py	344	__all__ = [	CODE
LOW	vllm/v1/executor/vllm_net_devices.py	139	def set_worker_gpu_nic_mapping(local_rank: int) -> None:	CODE
LOW	vllm/v1/executor/vllm_net_devices.py	220	def set_worker_net_device(local_rank: int, vllm_config: VllmConfig) -> None:	CODE
LOW	vllm/v1/executor/__init__.py	6	__all__ = ["Executor", "UniProcExecutor"]	CODE
LOW	vllm/v1/worker/gpu_worker.py	441	def update_config(self, overrides: dict[str, Any]) -> None:	CODE
LOW	vllm/v1/worker/gpu_worker.py	702	def update_max_model_len(self, max_model_len: int) -> None:	CODE
LOW	vllm/v1/worker/gpu_worker.py	933	def _set_draft_weight_update_target(self) -> None:	CODE
LOW	vllm/v1/worker/gpu_worker.py	1251	def update_weights(self, update_info: dict) -> None:	CODE
LOW	vllm/v1/worker/gpu_input_batch.py	1021	def update_async_output_token_ids(self) -> None:	CODE
LOW	vllm/v1/worker/gpu_input_batch.py	1068	def update_async_spec_token_ids(self, draft_token_ids: list[list[int]]) -> None:	CODE
LOW	vllm/v1/worker/gpu_model_runner.py	944	def update_max_model_len(self, max_model_len: int) -> None:	CODE
LOW	vllm/v1/worker/gpu_model_runner.py	5219	def update_config(self, overrides: dict[str, Any]) -> None:	CODE
LOW	vllm/v1/worker/gpu_model_runner.py	5410	def _setup_eagle3_aux_hidden_state_outputs(self) -> None:	CODE
282 more matches not shown…

Structural Annotation Overuse · 178 hits · 323 pts

Severity	File	Line	Snippet	Context
LOW⚡	tests/v1/kv_offload/tiering/test_async_lookup.py	136	# Step 1: lookup key 1, flush	COMMENT
LOW⚡	tests/v1/kv_offload/tiering/test_async_lookup.py	142	# Step 2: lookup keys 2 and 3, flush	COMMENT
LOW	tests/v1/logits_processors/test_correctness.py	1180	# Step 1: think-start token appears.	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5221	# Step 1: A's load is admitted; B's is held back by the reservation (B never	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5231	# Step 2: nothing changes until A's recv lands.	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5240	# Step 3: A makes forward progress straight to RUNNING - no preemption was	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	567	# Step 1: Get events from lmcache engine	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	576	# Step 2: Update connector output (simulate receiving from worker)	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	582	# Step 3: Take events	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	433	# Step 2: 5 blocks are in use (2 new for remote blocks).	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	441	# Step 3: finish recving (5 blocks in use)	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	450	# Step 4: try to schedule, remote request is put to running list	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	460	# Step 5: Remote request will be put back to waiting list	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	468	# Step 6: finish the request, free it.	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	477	# Step 7: now we can schedule (with 2 blocks computed),	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	551	# Step 3: finish the request, free it.	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	560	# Step 4: now we can initiate KV transfer (with 2 blocks computed).	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	568	# Step 5: finish recving (5 blocks in use)	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	577	# Step 6: schedule remote request	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	584	# Step 7: free everything.	COMMENT
LOW	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	493	# Step 8: free everything.	COMMENT
LOW	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	540	# Step 2: 3 blocks are in use,	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	701	# Step 1: Run decode and collect logprobs	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	720	# Step 2: For each token position, run prefill and compare	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	373	# Step 2: Schedule creates NewRequestData	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	447	# Step 7: Schedule again - now request uses cached state	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	514	# Step 12: Add new streaming request with seq_id=1	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	388	# Step 3: Simulate model runner caching the prompt_token_ids	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	421	# Step 6: Verify request state after Cycle 1	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	459	# Step 8: Calculate num_tokens like gpu_model_runner.py:1284 does	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	495	# Step 11: Verify request transitioned to WAITING_FOR_STREAMING_REQ	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	526	# Step 13: Scheduler schedules the updated session	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	544	# Step 14: Model runner caches NEW prompt_token_ids reference	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	557	# Step 15: FINAL CRITICAL VERIFICATION	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	84	# Step 1: Add initial request with 3 prompt tokens, all computed	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	100	# Step 2: Create streaming update with extended prompt	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	116	# Step 3: Verify no free_indices leak (old slot recycled)	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	155	# Step 1: Add initial request with one audio feature	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	175	# Step 2: Create streaming update with additional multimodal feature	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	192	# Step 3: Verify no free_indices leak	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	59	# Step 1: Create initial request state with some computed tokens	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	77	# Step 2: Create new request data with extended prompt	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	96	# Step 3: Update the request	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	101	# Step 4: Verify the request state was updated correctly	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	131	# Step 1: Create initial request state with one multimodal feature	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	156	# Step 2: Create new request data with additional multimodal feature	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	176	# Step 3: Update the request	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	181	# Step 4: Verify the request state was updated correctly	COMMENT
LOW⚡	…1/ec_connector/unit/cpu/scheduler/test_step_tracker.py	15	# Step 1: entry was added, committed to slot. Deque fills to maxlen=1,	COMMENT
LOW⚡	…1/ec_connector/unit/cpu/scheduler/test_step_tracker.py	20	# Step 2: deque is full, oldest slot (containing h1) expires.	COMMENT
LOW⚡	…s/v1/ec_connector/unit/cpu/scheduler/test_scheduler.py	276	# Step 3: unpin fires, now eviction works.	COMMENT
LOW⚡	tests/v1/ec_connector/integration/README.md	61	### Step 1: Baseline	COMMENT
LOW⚡	tests/v1/ec_connector/integration/README.md	68	### Step 2: EPD (1E + 1PD)	COMMENT
LOW⚡	tests/v1/ec_connector/integration/README.md	76	### Step 3: EPD (1E + 1P + 1D)	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	459	# Step 1: Run baseline	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	462	# Step 2: Test 1E + 1PD	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	465	# Step 3: Test baseline 1P + 1D	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	468	# Step 4: Test 1E + 1P + 1D	COMMENT
LOW	tests/kernels/test_fused_inv_rope_fp8_quant.py	683	# Step 1: In-place CUDA RoPE (same as production)	COMMENT
LOW	tests/kernels/test_fused_inv_rope_fp8_quant.py	695	# Step 2: Reshape + quant + reshape (same as production)	COMMENT
118 more matches not shown…

AI Structural Patterns: 343 hits · 322 pts

Severity	File	Line	Context
LOW	tests/conftest.py	373	CODE
LOW	tests/conftest.py	411	CODE
LOW	tests/conftest.py	910	CODE
LOW	tests/v1/kv_offload/tiering/p2p/test_manager.py	388	CODE
LOW	tests/v1/attention/utils.py	184	CODE
LOW	tests/v1/attention/test_attention_backends.py	312	CODE
LOW	tests/v1/core/test_kv_cache_utils.py	113	CODE
LOW	tests/v1/core/test_scheduler.py	2190	CODE
LOW	tests/v1/core/test_scheduler.py	2313	CODE
LOW	tests/v1/core/utils.py	45	CODE
LOW	tests/v1/core/utils.py	200	CODE
LOW	tests/v1/kv_connector/unit/test_moriio_connector.py	237	CODE
LOW	tests/v1/kv_connector/unit/test_nixl_push_connector.py	353	CODE
LOW	tests/v1/kv_connector/unit/utils.py	91	CODE
LOW	tests/v1/kv_connector/unit/utils.py	193	CODE
LOW	tests/v1/sample/test_rejection_sampler.py	71	CODE
LOW	tests/v1/e2e/general/test_mamba_prefix_cache.py	178	CODE
LOW	tests/kernels/utils.py	278	CODE
LOW	tests/kernels/utils.py	855	CODE
LOW	tests/kernels/attention/test_mha_attn.py	117	CODE
LOW	tests/kernels/attention/test_attention.py	54	CODE
LOW	…sts/kernels/attention/test_triton_prefill_attention.py	75	CODE
LOW	tests/kernels/mamba/test_mamba_ssm.py	112	CODE
LOW	tests/kernels/quantization/test_nvfp4_quant.py	81	CODE
LOW	tests/kernels/moe/test_deepep_v2_moe.py	194	CODE
LOW	tests/kernels/moe/test_moe_permute_unpermute.py	114	CODE
LOW	tests/kernels/moe/test_moe.py	1540	CODE
LOW	tests/kernels/moe/utils.py	50	CODE
LOW	tests/kernels/moe/test_moe_layer.py	975	CODE
LOW	tests/kernels/moe/test_moe_layer.py	1078	CODE
LOW	tests/evals/mrcr/mrcr_eval.py	161	CODE
LOW	tests/evals/gsm8k/gsm8k_eval.py	208	CODE
LOW	tests/distributed/test_multiproc_executor.py	23	CODE
LOW	…/models/multimodal/generation/vlm_utils/model_utils.py	588	CODE
LOW	tests/models/transformers/fusers/test_moe.py	148	CODE
LOW	tests/quantization/reference_mxfp4.py	88	CODE
LOW	tests/quantization/reference_mxfp4.py	235	CODE
LOW	tests/compile/test_graph_partition.py	103	CODE
LOW	tests/compile/fullgraph/test_toy_llama.py	155	CODE
LOW	tests/compile/passes/test_silu_mul_quant_fusion.py	132	CODE
LOW	tests/compile/passes/test_fusion.py	502	CODE
LOW	tests/benchmarks/test_random_dataset.py	177	CODE
LOW	tests/entrypoints/openai/utils.py	138	CODE
LOW	vllm/sampling_params.py	356	CODE
LOW	vllm/_custom_ops.py	2534	CODE
LOW	vllm/_custom_ops.py	3194	CODE
LOW	vllm/_xpu_ops.py	758	CODE
LOW	vllm/_xpu_ops.py	837	CODE
LOW	vllm/forward_context.py	212	CODE
LOW	vllm/forward_context.py	260	CODE
LOW	vllm/_aiter_ops.py	113	CODE
LOW	vllm/_aiter_ops.py	173	CODE
LOW	vllm/_aiter_ops.py	202	CODE
LOW	vllm/_aiter_ops.py	239	CODE
LOW	vllm/_aiter_ops.py	473	CODE
LOW	vllm/_aiter_ops.py	541	CODE
LOW	vllm/_aiter_ops.py	2192	CODE
LOW	vllm/_aiter_ops.py	2242	CODE
LOW	vllm/_aiter_ops.py	2363	CODE
LOW	vllm/_aiter_ops.py	2719	CODE
283 more matches not shown…

Cross-Language Confusion: 56 hits · 282 pts

Severity	File	Line	Snippet	Context
HIGH	setup.py	705	"build_tag": null,	STRING
HIGH	setup.py	709	"variant": null,	STRING
HIGH	csrc/cpu/generate_cpu_attn_dispatch.py	156	(__riscv_v_min_vlen == 128 || __riscv_v_min_vlen == 256)	CODE
HIGH	csrc/cpu/generate_cpu_attn_dispatch.py	228	"&& (__riscv_v_min_vlen == 128 || __riscv_v_min_vlen == 256)",	CODE
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	512	"M > 256 && K <= 16384 && N <= 4096": ((128, 128), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	515	"M > 128 && K <= 4096 && N <= 4096": ((128, 64), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	516	"M > 128 && K <= 8192 && N <= 8192": ((128, 128), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	519	"M > 64 && K <= 4069 && N <= 4069": ((128, 32), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	520	"M > 64 && K <= 4069 && N <= 8192": ((128, 64), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	521	"M > 64 && K >= 8192 && N >= 12288": ((256, 128), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	524	"M > 32 && K <= 6144 && N <= 6144": ((128, 16), (1, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	525	"M > 32 && K >= 16384 && N >= 12288": ((256, 64), (2, 1, 1)),	STRING
HIGH⚡	csrc/libtorch_stable/quantization/machete/generate.py	528	"M > 16 && K <= 12288 && N <= 8192": ((128, 32), (2, 1, 1)),	STRING
HIGH	tests/v1/attention/test_trtllm_attention_integration.py	149	# Randomly permute blocks (starting from block 1; block 0 is null).	COMMENT
HIGH	tests/v1/attention/test_mla_backends.py	262	# Permute the context blocks (excluding block 0 which is null)	COMMENT
HIGH	tests/v1/attention/test_attention_backends.py	163	# Permute the context blocks (excluding block 0 which is null)	COMMENT
HIGH⚡	tests/v1/core/test_scheduler.py	3025	num_blocks=5, # Can hold 64 tokens (first block is null)	CODE
HIGH	tests/v1/core/test_scheduler.py	3819	num_blocks=15, # can hold 244 tokens with 14 blocks (first block is null)	CODE
HIGH	tests/v1/core/test_scheduler.py	4060	num_blocks=11, # Can hold 160 tokens (first block is null)	CODE
HIGH	tests/v1/cudagraph/test_cudagraph_mode.py	67	# when above code raises, `llm` may be undefined, so we need to catch that	COMMENT
HIGH	tests/v1/cudagraph/test_cudagraph_mode.py	123	# when above code raises, `llm` may be undefined, so we need to catch that	COMMENT
HIGH	…_connector/unit/offloading_connector/test_scheduler.py	2609	# 4 GPU blocks: block 0 is null, blocks 1-3 are usable.	COMMENT
HIGH	tests/tool_parsers/test_granite_tool_parser.py	44	"null_field": null,	CODE
HIGH	tests/tool_parsers/test_granite_20b_fc_tool_parser.py	36	"null_field": null,	CODE
HIGH⚡	tests/tool_parsers/test_olmo3_tool_parser.py	34	"role=null, "	CODE
HIGH⚡	tests/tool_parsers/test_olmo3_tool_parser.py	43	'"role": null, '	CODE
HIGH⚡	tests/tool_parsers/test_pythonic_tool_parser.py	35	'"role": null, '	CODE
HIGH	tests/tool_parsers/test_deepseekv3_tool_parser.py	43	""""bool_field": true, "null_field": null, """	STRING
HIGH	tests/tool_parsers/test_internlm2_tool_parser.py	54	"null_field": null,	CODE
HIGH	tests/tool_parsers/test_llama4_pythonic_tool_parser.py	35	'"role": null, '	CODE
HIGH	tests/tool_parsers/test_phi4mini_tool_parser.py	51	"null_field": null,	CODE
HIGH⚡	tests/tool_parsers/test_lfm2_tool_parser.py	38	'"role": null, '	CODE
HIGH	tests/tool_parsers/test_longcat_tool_parser.py	54	"null_field": null,	CODE
HIGH⚡	tests/tool_parsers/test_hunyuan_a13b_tool_parser.py	45	'<tool_calls>[{"name": "get_weather", "arguments": {"city": "San Francisco", "metric": "celsius"}}, {"name":	CODE
HIGH	tests/tool_parsers/test_qwen3coder_tool_parser.py	594	# Multi non-null: anyOf[string, integer, null] → first non-null is string	COMMENT
HIGH	tests/parser/engine/test_qwen3.py	263	' "multiSelect": false, "answer": null}]'	CODE
HIGH	tests/parser/engine/test_qwen3.py	274	' "multiSelect": false, "answer": null}]',	CODE
HIGH	tests/parser/engine/test_qwen3.py	1143	' "multiSelect": false, "answer": null}]'	CODE
HIGH	tests/parser/engine/test_qwen3.py	1164	' "multiSelect": false, "answer": null}]',	CODE
HIGH⚡	tests/parser/engine/test_parser_engine.py	604	result = engine._fix_arg_types('{"val": null}', "f")	CODE
HIGH⚡	tests/parser/engine/test_gemma4_streaming_reasoning.py	1202	text = "<|tool_call>call:configure{value:null}<tool_call|>"	CODE
HIGH⚡	tests/parser/engine/test_gemma4_streaming_reasoning.py	1210	text = "<|tool_call>call:configure{label:null}<tool_call|>"	CODE
HIGH	…e_out/token_in_token_out/test_return_routed_experts.py	36	'{"sliding_window": null}',	CODE
HIGH	tests/entrypoints/openai/test_return_routed_experts.py	33	'{"sliding_window": null}',	CODE
HIGH	vllm/_custom_ops.py	1690	assert k_times_2 % 2 == 0, "input width must be even (gate || up layout)"	CODE
HIGH	vllm/_custom_ops.py	1799	assert k_times_2 % 2 == 0, "input width must be even (gate || up layout)"	CODE
HIGH	vllm/_custom_ops.py	1670	input_tensor: The input tensor with gate || up layout [m_topk, k*2]	STRING
HIGH	vllm/v1/core/single_type_kv_cache_manager.py	1148	# result [null] [null] ... [null] [hit block 1 (1st block contain	COMMENT
HIGH	vllm/v1/core/single_type_kv_cache_manager.py	462	every (non-null) block — the default for full attention.	STRING
HIGH	vllm/v1/core/single_type_kv_cache_manager.py	1095	[null, null, block 3], otherwise, we return [null, null]	STRING
HIGH	vllm/v1/core/single_type_kv_cache_manager.py	1100	we return 4 blocks[null, null, null, null]	STRING
HIGH	vllm/tool_parsers/utils.py	447	(null, true, false) that some models produce instead of Python	STRING
HIGH	…ibuted/kv_transfer/kv_connector/v1/flexkv_connector.py	47	cd FlexKV && bash build.sh	STRING
HIGH	vllm/benchmarks/datasets/datasets.py	4171	# undefined), language (en), transcription directive (en), punctuation	STRING
HIGH	…isaggregated/flexkv_connector/prefix_caching_flexkv.py	12	2. cd FlexKV && bash build.sh	STRING
HIGH	.buildkite/scripts/generate-nightly-index.py	209	"build_tag": null,	CODE

AI Slop Vocabulary: 113 hits · 277 pts

Severity	File	Line	Snippet	Context
LOW	csrc/libtorch_stable/quantization/machete/generate.py	435	# For now, we can just use the first accumulator type seen since	STRING
MEDIUM⚡	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	145	# More robust: count lines matching our UID.	COMMENT
MEDIUM	tests/v1/logits_processors/test_custom_offline.py	28	# Create a mixture of requests which do and don't utilize the dummy logitproc	COMMENT
LOW	tests/v1/cudagraph/test_breakable_cudagraph.py	235	# Outside capture: decorator should just call through.	COMMENT
LOW⚡	tests/tool_parsers/test_mistral_tool_parser.py	135	# Otherwise, just add the current token and move to the next one	COMMENT
MEDIUM⚡	tests/renderers/test_sparse_tensor_validation.py	58	# explicitly so this fixture is robust to process-wide invariant-check state	COMMENT
MEDIUM	tests/kernels/mamba/test_mamba_mixer2.py	106	# - utilize mock patching to disable TP when	COMMENT
MEDIUM	tests/distributed/test_pynccl.py	442	# Essentially this is an all-gather operation.	COMMENT
MEDIUM	tests/distributed/test_context_parallel.py	46	# .buildkite/lm-eval-harness/configs/DeepSeek-V2-Lite-Chat.yaml	COMMENT
MEDIUM	tests/distributed/test_context_parallel.py	48	# .buildkite/lm-eval-harness/configs/Qwen2.5-1.5B-Instruct.yaml	COMMENT
LOW	tests/model_executor/test_qwen3_omni.py	41	# Regular token, just add it	COMMENT
MEDIUM	tests/models/registry.py	1056	# Keep the init/schema harness eager (these tests never run forward).	COMMENT
MEDIUM	…s/compile/passes/distributed/test_fusion_all_reduce.py	754	# the call site). The patterns in this PR are robust to both Triton and	COMMENT
LOW	tests/lora/test_gptoss_tp.py	95	# For now just use TRITON_UNFUSED kernel	COMMENT
LOW	vllm/env_override.py	356	# functions just return True.	COMMENT
LOW	vllm/v1/attention/backends/flash_attn.py	337	# but for now just set it to `UNIFORM_BATCH` to get use to drop down	COMMENT
LOW	vllm/v1/attention/backends/utils.py	293	# then we can simply use a cdiv for the rest.	COMMENT
MEDIUM	vllm/v1/attention/ops/triton_decode_attention.py	383	# explicitly facilitate overlapping load/compute	COMMENT
LOW	vllm/v1/core/kv_cache_coordinator.py	485	# Single group; useless but just set ``use_eagle`` for consistency regardless.	COMMENT
MEDIUM	vllm/v1/core/encoder_cache_manager.py	321	# utilize the cache and this class will fold into EncoderCacheManager, as	COMMENT
LOW	vllm/v1/spec_decode/llm_base_proposer.py	609	# KV cache in sync, so just return an empty tensor.	COMMENT
LOW	vllm/v1/spec_decode/llm_base_proposer.py	1825	# Therefore, we can just return the logits.	COMMENT
MEDIUM	vllm/v1/worker/gpu_worker.py	807	# CUDAGraph memory size and may not utilize all gpu memory.	COMMENT
MEDIUM	vllm/v1/worker/gpu_worker.py	1104	# Generate the trace name by combining prefix with comprehensive rank suffix	COMMENT
LOW	vllm/v1/worker/gpu/model_runner.py	1314	# For piecewise and eager mode, just call model().	COMMENT
LOW	vllm/v1/engine/parallel_sampling.py	116	# If streaming, just return the current output	COMMENT
MEDIUM	vllm/tool_parsers/llama_tool_parser.py	263	# re-set stuff pertaining to progress in the current tool	COMMENT
MEDIUM	vllm/tool_parsers/jamba_tool_parser.py	218	# re-set stuff pertaining to progress in the current tool	COMMENT
MEDIUM	vllm/tool_parsers/granite_tool_parser.py	188	# re-set stuff pertaining to progress in the current tool	COMMENT
LOW	vllm/tool_parsers/xlam_tool_parser.py	560	# If we encounter an error, just return the delta text as regular content	COMMENT
MEDIUM	vllm/tool_parsers/granite_20b_fc_tool_parser.py	203	# re-set stuff pertaining to progress in the current tool	COMMENT
MEDIUM	vllm/tool_parsers/step3_tool_parser.py	51	# Explicit state flags for robust streaming	COMMENT
LOW	vllm/tokenizers/mistral.py	554	# if underlying tokenizer is sentencepiece, we just add "�".	COMMENT
LOW	vllm/platforms/cuda.py	692	# users can just use IR op priority directly	COMMENT
MEDIUM	vllm/distributed/utils.py	334	"""A robust barrier to synchronize all ranks.	STRING
MEDIUM	…ibuted/kv_transfer/kv_connector/v1/nixl/base_worker.py	730	# we can leverage host_buffer for permute.	COMMENT
MEDIUM	…ransfer/kv_connector/v1/mooncake/mooncake_connector.py	925	# Tasks can await async events, so a surplus (2x is a robust heuristic)	COMMENT
MEDIUM	vllm/config/model.py	311	"""Enable the custom cumem allocator to leverage advanced GPU memory	STRING
MEDIUM	vllm/config/parallel.py	598	# To make the initialization more robust we retry a few times	COMMENT
LOW	vllm/model_executor/layers/mamba/ops/causal_conv1d.py	185	# first chunk and does not have prior-token, so just set to 0	COMMENT
MEDIUM	vllm/model_executor/layers/fused_moe/modular_kernel.py	48	# The goal is to be able to utilize different communication mechanisms with	COMMENT
MEDIUM	…xecutor/layers/fused_moe/prepare_finalize/deepep_ll.py	40	# TODO (varun) : Optimize leverage num_tokens_per_expert counts	COMMENT
MEDIUM	…executor/layers/fused_moe/experts/fused_humming_moe.py	312	# Neighboring nodes are required to utilize distinct workspaces.	COMMENT
MEDIUM	vllm/model_executor/layers/quantization/fp8.py	287	# For GPUs that lack FP8 hardware support, we can leverage the Marlin	COMMENT
MEDIUM	vllm/model_executor/layers/quantization/modelopt.py	323	# Normalize quant_algo for robust matching (ModelOpt may emit lowercase).	COMMENT
MEDIUM	vllm/model_executor/layers/quantization/fbgemm_fp8.py	53	# For GPUs that lack FP8 hardware support, we can leverage the Marlin	COMMENT
MEDIUM	…executor/layers/quantization/utils/marlin_utils_fp4.py	109	# to fully utilize the E4M3 dynamic range (e.g., global_scale=1).	COMMENT
MEDIUM	…executor/layers/quantization/utils/marlin_utils_fp4.py	169	# For GPUs that lack FP4 hardware support, we can leverage the	COMMENT
MEDIUM	…executor/layers/quantization/utils/marlin_utils_fp8.py	60	# For GPUs that lack FP8 hardware support, we can leverage the	COMMENT
LOW	…odel_executor/layers/quantization/utils/quant_utils.py	422	# Unquantized layer: just return base weights	COMMENT
MEDIUM	…rs/quantization/compressed_tensors/transform/module.py	85	# do not fold into weight in order to utilize FWHT	COMMENT
MEDIUM	vllm/model_executor/models/deepseek_ocr.py	140	"""Example of overriding the wrapper class `__init__()` in order to utilize	STRING
MEDIUM	vllm/model_executor/models/deepseek_ocr.py	140	"""Example of overriding the wrapper class `__init__()` in order to utilize	STRING
LOW	vllm/model_executor/models/qwen3_asr.py	538	# No audio features, just return linear positions	COMMENT
LOW	vllm/model_executor/models/transformers/multimodal.py	199	# NOTE: we can't just set caching=False because base class method	COMMENT
LOW	vllm/models/deepseek_v4/xpu/xpu_sparse.py	62	# Profile run: no-op, just return q (no padding needed on XPU).	COMMENT
MEDIUM	…dels/deepseek_v4/nvidia/ops/fused_indexer_q_cutedsl.py	167	# all threads in a warp to be active since we utilize warp shuffle later.	COMMENT
LOW	vllm/reasoning/granite_reasoning_parser.py	195	# corrected; just return the delta text as normal content.	COMMENT
LOW	vllm/reasoning/hunyuan_a13b_reasoning_parser.py	88	# this id is not part of content, so just return [] here.	COMMENT
LOW	vllm/reasoning/olmo3_reasoning_parser.py	268	# this id is not part of content, so just return [] here.	COMMENT
53 more matches not shown…

Verbosity Indicators: 149 hits · 277 pts

Severity	File	Line	Snippet	Context
LOW⚡	tests/v1/kv_offload/tiering/test_async_lookup.py	136	# Step 1: lookup key 1, flush	COMMENT
LOW⚡	tests/v1/kv_offload/tiering/test_async_lookup.py	142	# Step 2: lookup keys 2 and 3, flush	COMMENT
LOW	tests/v1/logits_processors/test_correctness.py	1180	# Step 1: think-start token appears.	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5221	# Step 1: A's load is admitted; B's is held back by the reservation (B never	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5231	# Step 2: nothing changes until A's recv lands.	COMMENT
LOW⚡	tests/v1/core/test_scheduler.py	5240	# Step 3: A makes forward progress straight to RUNNING - no preemption was	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	567	# Step 1: Get events from lmcache engine	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	576	# Step 2: Update connector output (simulate receiving from worker)	COMMENT
LOW⚡	tests/v1/kv_connector/unit/test_lmcache_connector.py	582	# Step 3: Take events	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	433	# Step 2: 5 blocks are in use (2 new for remote blocks).	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	441	# Step 3: finish recving (5 blocks in use)	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	450	# Step 4: try to schedule, remote request is put to running list	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	460	# Step 5: Remote request will be put back to waiting list	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	468	# Step 6: finish the request, free it.	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	477	# Step 7: now we can schedule (with 2 blocks computed),	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	551	# Step 3: finish the request, free it.	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	560	# Step 4: now we can initiate KV transfer (with 2 blocks computed).	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	568	# Step 5: finish recving (5 blocks in use)	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	577	# Step 6: schedule remote request	COMMENT
LOW⚡	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	584	# Step 7: free everything.	COMMENT
LOW	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	493	# Step 8: free everything.	COMMENT
LOW	…/v1/kv_connector/unit/test_remote_prefill_lifecycle.py	540	# Step 2: 3 blocks are in use,	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	701	# Step 1: Run decode and collect logprobs	COMMENT
LOW	tests/v1/determinism/test_batch_invariance.py	720	# Step 2: For each token position, run prefill and compare	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	373	# Step 2: Schedule creates NewRequestData	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	447	# Step 7: Schedule again - now request uses cached state	COMMENT
LOW⚡	tests/v1/streaming_input/test_scheduler_streaming.py	514	# Step 12: Add new streaming request with seq_id=1	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	388	# Step 3: Simulate model runner caching the prompt_token_ids	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	421	# Step 6: Verify request state after Cycle 1	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	459	# Step 8: Calculate num_tokens like gpu_model_runner.py:1284 does	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	495	# Step 11: Verify request transitioned to WAITING_FOR_STREAMING_REQ	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	526	# Step 13: Scheduler schedules the updated session	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	544	# Step 14: Model runner caches NEW prompt_token_ids reference	COMMENT
LOW	tests/v1/streaming_input/test_scheduler_streaming.py	557	# Step 15: FINAL CRITICAL VERIFICATION	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	84	# Step 1: Add initial request with 3 prompt tokens, all computed	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	100	# Step 2: Create streaming update with extended prompt	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	116	# Step 3: Verify no free_indices leak (old slot recycled)	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	155	# Step 1: Add initial request with one audio feature	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	175	# Step 2: Create streaming update with additional multimodal feature	COMMENT
LOW	…/streaming_input/test_gpu_model_runner_v2_streaming.py	192	# Step 3: Verify no free_indices leak	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	59	# Step 1: Create initial request state with some computed tokens	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	77	# Step 2: Create new request data with extended prompt	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	96	# Step 3: Update the request	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	101	# Step 4: Verify the request state was updated correctly	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	131	# Step 1: Create initial request state with one multimodal feature	COMMENT
LOW	…/v1/streaming_input/test_gpu_model_runner_streaming.py	156	# Step 2: Create new request data with additional multimodal feature	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	176	# Step 3: Update the request	COMMENT
LOW⚡	…/v1/streaming_input/test_gpu_model_runner_streaming.py	181	# Step 4: Verify the request state was updated correctly	COMMENT
LOW⚡	…1/ec_connector/unit/cpu/scheduler/test_step_tracker.py	15	# Step 1: entry was added, committed to slot. Deque fills to maxlen=1,	COMMENT
LOW⚡	…1/ec_connector/unit/cpu/scheduler/test_step_tracker.py	20	# Step 2: deque is full, oldest slot (containing h1) expires.	COMMENT
LOW⚡	…s/v1/ec_connector/unit/cpu/scheduler/test_scheduler.py	276	# Step 3: unpin fires, now eviction works.	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	459	# Step 1: Run baseline	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	462	# Step 2: Test 1E + 1PD	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	465	# Step 3: Test baseline 1P + 1D	COMMENT
LOW⚡	…1/ec_connector/integration/run_epd_correctness_test.sh	468	# Step 4: Test 1E + 1P + 1D	COMMENT
LOW	tests/kernels/test_fused_inv_rope_fp8_quant.py	683	# Step 1: In-place CUDA RoPE (same as production)	COMMENT
LOW	tests/kernels/test_fused_inv_rope_fp8_quant.py	695	# Step 2: Reshape + quant + reshape (same as production)	COMMENT
LOW⚡	…nts/multimodal/llm/test_mm_cache_external_injection.py	92	# Step 1: Normal requests to populate the cache	COMMENT
LOW⚡	…nts/multimodal/llm/test_mm_cache_external_injection.py	100	# Step 2: Use a second image to get valid expanded tokens and	COMMENT
LOW⚡	…ntrypoints/weight_transfer/test_weight_transfer_llm.py	262	# Step 1: Initialize weight transfer engine	COMMENT
89 more matches not shown…

Docstring Block Structure: 35 hits · 175 pts

Severity	File	Line	Snippet	Context
HIGH	vllm/v1/attention/backends/registry.py	238	Register or override a backend implementation. Args: backend: The AttentionBackendEnum member to register	STRING
HIGH	vllm/v1/attention/backends/mla/prefill/registry.py	106	Register or override an MLA prefill backend implementation. Args: backend: The MLAPrefillBackendEnum member	STRING
HIGH	vllm/v1/core/single_type_kv_cache_manager.py	1082	For chunked local attention, we need to find the longest cache hit prefix of the blocks that is not lon	STRING
HIGH	vllm/v1/structured_output/utils.py	329	Check if grammar appears to use Lark syntax. Args: grammar_str: Input grammar string Returns:	STRING
HIGH	vllm/v1/structured_output/utils.py	361	Convert a Lark grammar string to EBNF format. EBNF reference: https://github.com/ggerganov/llama.cpp/blob/	STRING
HIGH	vllm/v1/worker/utils.py	286	Select a block size that is supported by all backends and is a factor of kv_manager_block_size. If kv_mana	STRING
HIGH	vllm/tool_parsers/apertus_tool_parser.py	137	Buffers incoming delta chunks to prevent fragmentation of multi-token special tags. If a chunk	STRING
HIGH	vllm/tool_parsers/apertus_tool_parser.py	181	Extracts tool calls from a completely generated model response (Non-Streaming). Args: mode	STRING
HIGH	vllm/tool_parsers/apertus_tool_parser.py	283	Handles streaming chunks Args: previous_text: The complete model text generated prior to t	STRING
HIGH	vllm/tool_parsers/apertus_tool_parser.py	494	Calculates the exact string difference to safely append new tool parameters. This ensures characters l	STRING
HIGH	…ed/kv_transfer/kv_connector/v1/lmcache_mp_connector.py	738	Get number of new tokens that can be loaded from the external KV cache beyond the num_computed_tokens.	STRING
HIGH	vllm/distributed/kv_transfer/kv_connector/v1/base.py	459	Get number of new tokens that can be loaded from the external KV cache beyond the num_computed_tokens.	STRING
HIGH	…ed/kv_transfer/kv_connector/v1/moriio/moriio_engine.py	284	Get remote allocation info for a request. Args: transfer_id:TransferId The request ID Retu	STRING
HIGH	…nector/v1/lmcache_integration/multi_process_adapter.py	198	Submit a new lookup request to LMCache if there is no ongoing request. Supports both token-based and h	STRING
HIGH	…nector/v1/lmcache_integration/multi_process_adapter.py	601	Check and get the finished store and retrieve requests. Args: finished_req_ids_from_engine	STRING
HIGH	vllm/distributed/weight_transfer/factory.py	83	Create a weight transfer engine instance. Args: config: Weight transfer configuration containing th	STRING
HIGH	vllm/distributed/weight_transfer/base.py	118	Construct typed init info from dict with validation. Args: init_dict: Dictionary containin	STRING
HIGH	vllm/distributed/weight_transfer/base.py	138	Construct typed update info from dict with validation. Args: update_dict: Dictionary conta	STRING
HIGH	vllm/logging_utils/formatter.py	22	Shortens a file path for logging display: - Removes leading 'vllm' folder if present.	STRING
HIGH	vllm/model_executor/kernels/linear/__init__.py	514	Choose a _KernelT that can implement the given config for the given compute capability. Attempts to choose the	STRING
HIGH	vllm/model_executor/kernels/linear/__init__.py	692	Choose an MPLinearKernel that can implement the given config for the given compute capability. Attempts to cho	STRING
HIGH	…/model_executor/layers/fused_moe/expert_map_manager.py	337	Map global expert ID to local expert ID. Args: global_id: Global expert ID (0 to global_nu	STRING
HIGH	vllm/model_executor/layers/fla/ops/chunk.py	153	Args: q (torch.Tensor): Queries of shape `[B, T, H, K]`. k (torch.Tensor):	STRING
HIGH	vllm/model_executor/layers/fla/ops/fused_recurrent.py	530	Args: q (torch.Tensor): queries of shape `[B, T, H, K]`. k (torch.Tensor):	STRING
HIGH	vllm/model_executor/models/keye_vl1_5.py	79	Return num_patches per video. Args: grid_thw: Tensor with shape [N, 3] containing temporal, height, wi	STRING
HIGH	vllm/model_executor/models/isaac.py	243	Apply pixel shuffle to a packed vision sequence without unpacking per image. Args: x (`torch.Tensor`):	STRING
HIGH	vllm/multimodal/audio.py	95	Normalize audio to the specified format. This function handles channel reduction for multi-channel audio, suppo	STRING
HIGH	vllm/benchmarks/lib/ready_checker.py	25	Wait for an endpoint to become available before starting benchmarks. Args: request_func: The async req	STRING
HIGH	vllm/entrypoints/chat_utils.py	1466	Parses a given multi-modal content part based on its type. Args: part: A dict containing the content p	STRING
HIGH	vllm/entrypoints/speech_to_text/base/utils.py	18	Read an uploaded file enforcing a size limit before full materialization. The function first checks the Conte	STRING
HIGH	vllm/transformers_utils/processors/isaac.py	197	Convert normalized images into flattened ViT-style patches. Args: image (`torch.Tensor`): Tenso	STRING
HIGH	vllm/lora/resolver.py	72	Get a registered resolver instance by name. Args: resolver_name: Name of the resolver to get.	STRING
HIGH	…ications/chatbot/streamlit_openai_chatbot_webserver.py	111	Generate and stream LLM response with optional reasoning process. Args: messages (list): List of conversati	STRING
HIGH	benchmarks/benchmark_long_document_qa_throughput.py	68	Repeat each prompt in the list for a specified number of times. The order of prompts in the output list depends	STRING
HIGH	benchmarks/attention_benchmarks/batch_spec.py	74	Parse batch specification string into list of BatchRequest objects. Grammar: (<count>?) q<q_len>(k?) (s<seq_le	STRING

Hallucination Indicators (11 hits · 115 pts)

Severity	File	Line	Snippet	Context
CRITICAL	rust/src/cmd/src/cli/unsupported.rs	141	/// - `vllm.entrypoints.cli.serve.ServeSubcommand.subparser_init(...)`	COMMENT
CRITICAL	tests/v1/e2e/general/test_mamba_prefix_cache.py	886	assert engine.llm_engine.engine_core.engine_core.scheduler.reset_prefix_cache()	CODE
CRITICAL	tests/v1/e2e/general/test_mamba_prefix_cache.py	1138	engine.llm_engine.engine_core.engine_core.scheduler.reset_prefix_cache()	CODE
CRITICAL	tests/v1/e2e/general/test_mamba_prefix_cache.py	1163	assert engine.llm_engine.engine_core.engine_core.scheduler.reset_prefix_cache()	CODE
CRITICAL	tests/distributed/test_torchrun_example_moe.py	81	llm.llm_engine.model_executor.driver_worker.worker.model_runner.model.parameters()	CODE
CRITICAL	tests/distributed/test_torchrun_example.py	72	llm.llm_engine.model_executor.driver_worker.worker.model_runner.model.parameters()	CODE
CRITICAL	tests/models/language/generation/test_gemma.py	19	lambda self: self.model_runner.model.language_model.model.normalizer.cpu().item() # noqa: E501	CODE
CRITICAL	tests/models/language/generation/test_gemma.py	24	lambda self: self.model_runner.model.model.normalizer.cpu().item()	CODE
CRITICAL	vllm/v1/spec_decode/llm_base_proposer.py	1449	self.model.model.embed_tokens.weight.cpu(),	CODE
CRITICAL	vllm/model_executor/layers/fla/ops/utils.py	170	triton.runtime.driver.active.utils.get_device_properties(i)[	CODE
CRITICAL	docs/training/layerwise.md	124	model = llm.llm_engine.engine_core.engine_core.model_executor.driver_worker.worker.get_model()	CODE

Fake / Example Data (71 hits · 80 pts)

Severity	File	Line	Snippet	Context
LOW	tests/v1/attention/test_mla_backends.py	1466	["placeholder"],	CODE
LOW	tests/v1/attention/test_dspark_noncausal_sparse_mla.py	320	vllm_config.compilation_config.static_forward_context["placeholder"] = (	CODE
LOW	tests/v1/attention/test_dspark_noncausal_sparse_mla.py	325	kv_cache_spec, ["placeholder"], vllm_config, device	CODE
LOW	tests/v1/attention/test_sparse_mla_backends.py	458	vllm_config.compilation_config.static_forward_context["placeholder"] = (	CODE
LOW	tests/v1/attention/test_sparse_mla_backends.py	463	builder = builder_cls(kv_cache_spec, ["placeholder"], vllm_config, device)	CODE
LOW	tests/v1/attention/test_sparse_mla_backends.py	924	vllm_config.compilation_config.static_forward_context["placeholder"] = (	CODE
LOW	tests/v1/attention/test_sparse_mla_backends.py	929	builder = builder_cls(kv_cache_spec, ["placeholder"], vllm_config, device)	CODE
LOW	tests/v1/attention/test_attention_backends.py	514	["placeholder"],	CODE
LOW	tests/v1/attention/test_attention_backends.py	666	"placeholder": PerLayerParameters(	CODE
LOW	tests/v1/attention/test_attention_backends.py	714	kv_cache_spec, ["placeholder"], vllm_config, device	CODE
LOW⚡	tests/v1/kv_connector/unit/test_nixl_push_connector.py	698	payload = _registration_data("placeholder")	CODE
LOW	tests/tool_parsers/test_kimi_k2_tool_parser.py	85	'{"to": "user@example.com", "subject": "Daily Update"}',	CODE
LOW	tests/tool_parsers/test_kimi_k2_tool_parser.py	92	{"to": "user@example.com", "subject": "Daily Update"},	CODE
LOW	tests/tool_parsers/test_mistral_tool_parser.py	308	"name": "John Doe",	CODE
LOW	tests/tool_parsers/test_mistral_tool_parser.py	698	"name": "John Doe",	CODE
LOW	tests/tool_parsers/test_mistral_tool_parser.py	1061	"name": "John Doe",	CODE
LOW	tests/tool_parsers/test_mistral_tool_parser.py	301	"""[TOOL_CALLS] [{"arguments":{"name": "John Doe"}, "name": "get_age"}]""", # noqa: E501	STRING
LOW	tests/tool_parsers/test_mistral_tool_parser.py	691	"""[TOOL_CALLS] [{"arguments": {"name": "John Doe"}, "name": "get_age"}]""", # noqa: E501	STRING
LOW	tests/tool_parsers/test_mistral_tool_parser.py	1054	"""[TOOL_CALLS] [{"arguments": {"name": "John Doe"}, "name": "get_age"}]""", # noqa: E501	STRING
LOW⚡	tests/tool_parsers/test_olmo3_tool_parser.py	23	"register_user(name='John Doe', "	CODE
LOW⚡	tests/tool_parsers/test_olmo3_tool_parser.py	31	"register_user(name='John Doe', "	CODE
LOW⚡	tests/tool_parsers/test_olmo3_tool_parser.py	40	arguments='{"name": "John Doe", '	CODE
LOW⚡	tests/tool_parsers/test_pythonic_tool_parser.py	23	"register_user(name='John Doe', "	CODE
LOW⚡	tests/tool_parsers/test_pythonic_tool_parser.py	32	arguments='{"name": "John Doe", '	CODE
LOW⚡	tests/tool_parsers/test_lfm2_tool_parser.py	26	"register_user(name='John Doe', "	CODE
LOW⚡	tests/tool_parsers/test_lfm2_tool_parser.py	35	arguments='{"name": "John Doe", '	CODE
LOW	tests/tool_parsers/test_lfm2_tool_parser.py	339	"deliveryAddress='123 Main St')]"	CODE
LOW⚡	tests/tool_parsers/test_hunyuan_a13b_tool_parser.py	45	'<tool_calls>[{"name": "get_weather", "arguments": {"city": "San Francisco", "metric": "celsius"}}, {"name":	CODE
LOW⚡	tests/tool_parsers/test_hunyuan_a13b_tool_parser.py	53	"name": "John Doe",	CODE
LOW	tests/kernels/helion/test_pattern_matching.py	125	input_node = next(n for n in gm.graph.nodes if n.op == "placeholder")	CODE
LOW	tests/kernels/helion/test_pattern_matching.py	200	input_node = next(n for n in gm.graph.nodes if n.op == "placeholder")	CODE
LOW	…rnels/attention/test_rocm_aiter_mla_decode_metadata.py	114	layer_name = "placeholder"	CODE
LOW⚡	…ts/models/multimodal/processing/test_audioflamingo3.py	130	dummy_data = builder.get_dummy_mm_data(100, mm_counts, {})	CODE
LOW⚡	…ts/models/multimodal/processing/test_audioflamingo3.py	132	assert "audio" in dummy_data	CODE
LOW⚡	…ts/models/multimodal/processing/test_audioflamingo3.py	133	assert len(dummy_data["audio"]) == 2	CODE
LOW⚡	…ts/models/multimodal/processing/test_audioflamingo3.py	136	assert len(dummy_data["audio"][0]) == expected_len	CODE
LOW	tests/compile/test_graph_partition.py	64	and node.args[0].op == "placeholder"	CODE
LOW	tests/compile/test_graph_partition.py	125	and node.args[0].op == "placeholder"	CODE
LOW	tests/compile/test_graph_partition.py	193	assert [node.op for node in splitting_gm.graph.nodes] == ["placeholder"] + 2 * [	CODE
LOW	tests/compile/test_graph_partition.py	440	if node.op == "placeholder"	CODE
LOW	tests/compile/test_graph_partition.py	506	if n.op == "placeholder"	CODE
LOW	tests/compile/test_graph_partition.py	512	if n.op == "placeholder"	CODE
LOW	tests/compile/test_graph_partition.py	621	if n.op != "placeholder":	CODE
LOW⚡	tests/compile/passes/ir/test_clone_cleanup.py	225	x_node = [n for n in graph_module.graph.nodes if n.op == "placeholder"][0]	CODE
LOW⚡	tests/compile/passes/ir/test_clone_cleanup.py	237	placeholders = [n for n in graph_module.graph.nodes if n.op == "placeholder"]	CODE
LOW	tests/compile/passes/ir/test_clone_cleanup.py	255	placeholders = [n for n in graph_module.graph.nodes if n.op == "placeholder"]	CODE
LOW⚡	tests/compile/passes/ir/test_clone_cleanup.py	274	placeholders = [n for n in graph_module.graph.nodes if n.op == "placeholder"]	CODE
LOW⚡	tests/compile/passes/ir/test_clone_cleanup.py	292	x_node = [n for n in graph_module.graph.nodes if n.op == "placeholder"][0]	CODE
LOW⚡	tests/compile/passes/ir/test_clone_cleanup.py	311	x_node = [n for n in graph_module.graph.nodes if n.op == "placeholder"][0]	CODE
LOW⚡	tests/benchmarks/test_txt_slices_dataset.py	20	Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor	CODE
LOW⚡	tests/benchmarks/test_txt_slices_dataset.py	20	Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor	CODE
LOW⚡	tests/entrypoints/openai/test_run_batch.py	761	def _make_aiohttp_mocks(response_data: bytes = b"fake-data", status: int = 200):	CODE
LOW	vllm/model_executor/models/transformers/fx_utils.py	52	if node.op == "placeholder" and node.target == "hidden_states":	CODE
LOW	vllm/model_executor/models/transformers/fusers/qkv.py	62	and node.args[0].op == "placeholder"	CODE
LOW	…/model_executor/models/transformers/fusers/rms_norm.py	134	x = find_node(graph, lambda n: n.op == "placeholder")	CODE
LOW	…/model_executor/models/transformers/fusers/rms_norm.py	176	if (x := find_node(graph, lambda n: n.op == "placeholder")) is None:	CODE
LOW	vllm/compilation/piecewise_backend.py	30	if node.op == "placeholder":	CODE
LOW	vllm/compilation/piecewise_backend.py	59	if node.op != "placeholder":	CODE
LOW	vllm/compilation/backends.py	468	input_node.op == "placeholder"	CODE
LOW⚡	vllm/compilation/backends.py	558	if node.op in ("output", "placeholder"):	CODE
11 more matches not shown…

Modern AI Meta-Vocabulary (26 hits · 72 pts)

Severity	File	Line	Snippet	Context
MEDIUM⚡	tools/pre_commit/generate_attention_backend_docs.py	1748	# Top-level orchestration	COMMENT
MEDIUM⚡	tests/v1/kv_offload/tiering/p2p/test_manager.py	1038	# _poll_once orchestration	COMMENT
MEDIUM	tests/v1/kv_connector/unit/test_offloading_connector.py	153	# Use a long prompt that fits within the model's context window.	COMMENT
MEDIUM	tests/v1/engine/test_async_llm.py	893	# Wait for generation to start (event-driven)	COMMENT
MEDIUM	tests/v1/engine/test_async_llm.py	935	# Wait for some tokens (event-driven, handles slow token generation)	COMMENT
MEDIUM	tests/v1/engine/test_async_llm.py	999	# Wait for at least one token across any request (event-driven)	COMMENT
MEDIUM	tests/evals/mrcr/mrcr_eval.py	32	# Skip chain-of-thought on reasoning models; ignored by non-reasoning templates.	COMMENT
MEDIUM	tests/compile/fullgraph/test_basic_correctness.py	57	# embedding model	COMMENT
MEDIUM	tests/entrypoints/openai/responses/test_harmony.py	599	# NOTE: chain-of-thought should be removed.	COMMENT
MEDIUM	…ntrypoints/openai/chat_completion/test_serving_chat.py	1141	# adding max_tokens should exceed the model context window.	COMMENT
MEDIUM	vllm/_xpu_ops.py	853	window_size=(-1, -1), # -1 means infinite context window	CODE
MEDIUM	vllm/v1/attention/ops/vit_attn_wrappers.py	224	# Without it, hallucinations occur with the backend	COMMENT
MEDIUM	vllm/v1/executor/abstract.py	44	uses_ray: bool = False # whether the executor uses Ray for orchestration.	CODE
MEDIUM⚡	…ibuted/kv_transfer/kv_connector/v1/nixl/push_worker.py	245	# progress is event-driven (see module docstring).	COMMENT
MEDIUM	…d/kv_transfer/kv_connector/v1/mooncake/store/worker.py	4	# The transfer-thread scaffolding (KVTransferThread, KVCacheStoreSendingThread,	COMMENT
MEDIUM	vllm/model_executor/models/granite_speech.py	79	# work pretty well with zero shot.	COMMENT
MEDIUM	vllm/model_executor/models/phi3v.py	603	# initialized as an embedding model	COMMENT
MEDIUM	vllm/models/minimax_m3/amd/model.py	1336	# NVIDIA copy — it only orchestrates the shared vision tower + the per-platform	COMMENT
MEDIUM	vllm/reasoning/gemma4_utils.py	27	print(result["thinking"]) # chain-of-thought or None	STRING
MEDIUM	vllm/reasoning/gemma4_utils.py	76	>>> print(result["thinking"]) # chain-of-thought reasoning or None	STRING
MEDIUM	vllm/entrypoints/speech_to_text/base/serving.py	402	# which is a strong sign of hallucination in outputs.	COMMENT
MEDIUM	vllm/vllm_flash_attn/flash_attn_interface.py	138	window_size=(-1, -1), # -1 means infinite context window	CODE
MEDIUM	docs/serving/integrations/claude_code.md	3	[Claude Code](https://code.claude.com/docs/en/quickstart) is Anthropic's official agentic coding tool that lives in your	CODE
MEDIUM	docs/serving/integrations/codex.md	3	[Codex](https://github.com/openai/codex) is OpenAI's official agentic coding tool that lives in your terminal. It can un	CODE
MEDIUM	docs/features/tool_calling.md	221	The tool calling that is supported is the [JSON-based tool calling](https://llama.meta.com/docs/model-cards-and-prompt-f	CODE
MEDIUM	docs/deployment/frameworks/dify.md	3	[Dify](https://github.com/langgenius/dify) is an open-source LLM app development platform. Its intuitive interface combi	CODE

Example Usage Blocks (21 hits · 31 pts)

Severity	File	Line	Snippet	Context
LOW	tools/setup_deepgemm_pythons.sh	6	# Usage:	COMMENT
LOW	tools/vllm-rocm/generate-rocm-wheels-root-index.sh	11	# Usage:	COMMENT
LOW	docker/docker-bake.hcl	5	# Usage:	COMMENT
LOW	docker/docker-bake-rocm.hcl	6	# Usage:	COMMENT
LOW	docker/docker-bake-rocm.hcl	127	# Usage:	COMMENT
LOW	docker/entrypoints/test_vllm_nonroot_entrypoint.sh	9	# Usage:	COMMENT
LOW	…/nixl_integration/run_multi_connector_accuracy_test.sh	15	# Usage:	COMMENT
LOW	…nector/nixl_integration/spec_decode_acceptance_test.sh	13	# Usage:	COMMENT
LOW	…nixl_integration/run_multi_connector_edge_case_test.sh	15	# Usage:	COMMENT
LOW	examples/tool_calling/chat_with_tools_offline.py	44	# Usage:	COMMENT
LOW	examples/ray_serving/run_cluster.sh	8	# Usage:	COMMENT
LOW	examples/ray_serving/multi-node-serving.sh	11	# Example usage:	COMMENT
LOW	examples/generate/multimodal/mistral-small_offline.py	51	# Usage:	COMMENT
LOW	benchmarks/kv_cache_watermark.sh	29	# Usage:	COMMENT
LOW	…ttention_benchmarks/configs/mla_sparse_mha_vs_mqa.yaml	3	# Usage:	COMMENT
LOW	…s/attention_benchmarks/configs/mla_fa4_fp8_output.yaml	6	# Usage:	COMMENT
LOW	…nchmarks/attention_benchmarks/configs/mla_prefill.yaml	14	# Usage:	COMMENT
LOW	.buildkite/scripts/cache-rocm-base-wheels.sh	10	# Usage:	COMMENT
LOW	.buildkite/scripts/ci-bake-rocm.sh	9	# Usage:	COMMENT
LOW	.buildkite/scripts/ci-fetch-log.sh	4	# Usage:	COMMENT
LOW	.buildkite/scripts/tool_call/run-bfcl-eval.sh	5	# Usage:	COMMENT

Magic Placeholder Names (5 hits · 20 pts)

Severity	File	Line	Snippet	Context
HIGH	…les/pooling/embed/openai_embedding_long_text/client.py	22	--api-key your-api-key	STRING
HIGH	…les/pooling/embed/openai_embedding_long_text/client.py	32	--api-key your-api-key	STRING
HIGH	…les/pooling/embed/openai_embedding_long_text/client.py	44	API_KEY = "your-api-key" # Replace with your actual API key	CODE
HIGH	…es/pooling/embed/openai_embedding_long_text/service.sh	19	API_KEY=${API_KEY:-"your-api-key"}	CODE
HIGH	.github/ISSUE_TEMPLATE/400-bug-report.yml	20	Consider redacting or replacing sensitive values with placeholders like `<YOUR_TOKEN_HERE>` when sharing configura	CODE

TODO Padding (11 hits · 16 pts)

Severity	File	Line	Snippet	Context
LOW	csrc/cpu/sgl-kernels/common.h	365	// TODO: implement reverse order of [MB / cache_blocks_mb, NB, cache_blocks_mb]	COMMENT
LOW	…nector/v1/lmcache_integration/multi_process_adapter.py	632	# TODO: add error handling here	COMMENT
LOW	…m/model_executor/kernels/linear/mixed_precision/xpu.py	191	# TODO: implement asym case	COMMENT
LOW	vllm/entrypoints/openai/responses/protocol.py	739	# TODO: implement the other reason for incomplete_details,	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	324	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	357	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	381	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	418	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	448	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	467	# TODO: implement it based on torch ops	COMMENT
LOW	vllm/lora/punica_wrapper/punica_base.py	494	# TODO: implement it based on torch ops	COMMENT

Slop Phrases (7 hits · 16 pts)

Severity	File	Line	Snippet	Context
MEDIUM	tests/models/multimodal/generation/test_common.py	64	# model arch happens to be a substring of another one, you can add a	COMMENT
MEDIUM	tests/models/multimodal/generation/test_common.py	85	# NOTE you can add --collect-only to any of the above commands to see	COMMENT
LOW	…ibuted/kv_transfer/kv_connector/v1/nixl/pull_worker.py	81	# while processing the next batch, we make sure to only set an	COMMENT
MEDIUM	vllm/model_executor/models/interfaces.py	230	as a language model component.	STRING
MEDIUM	…ache/disagg_prefill_lmcache_v1/disagg_vllm_launcher.sh	24	# secure random value. This is set to a fixed value for demonstration purposes only.	COMMENT
MEDIUM	examples/rl/rlhf_ipc.py	17	for demonstration purposes we simply zero out the weights.	STRING
MEDIUM	.buildkite/test-amd.yaml	1	# In this file, you can add more tests to run either by adding a new step or	COMMENT

AI Response Leakage (1 hit · 8 pts)

Severity	File	Line	Snippet	Context
HIGH	tests/v1/sample/test_logprobs.py	914	# Based on user's example: "In this example,"	COMMENT

Synthetic Comment Markers (1 hit · 8 pts)

Severity	File	Line	Snippet	Context
HIGH	…tic_prefix_caching/automatic_prefix_caching_offline.py	26	# A prompt containing a large markdown table. The table is randomly generated by GPT-4.	COMMENT

Dead Code (1 hit · 2 pts)

Severity	File	Line	Snippet	Context
MEDIUM	tests/v1/e2e/general/test_streaming_input.py	502		CODE

Overly Generic Function Names (1 hit · 2 pts)

Severity	File	Line	Snippet	Context
LOW⚡	…gated/disaggregated_serving/moriio_toy_proxy_server.py	227	async def handle_request(api: str, request: Request):	CODE
