rasbt/LLMs-from-scratch

Adjusted Score: 8.0
Raw Score: 8.0
Time Factor: 100%
Last Push: 2026-07-11
Stars: 99.0K
Language: Jupyter Notebook
Lines of Code: 58.0K
Files: 273
Pattern Hits: 323
Scan Date: 2026-07-14
HC Hit Rate: 0.03

What These Metrics Mean

Adjusted Score: The primary synthetic code indicator: the raw score normalised per 1,000 lines of code and multiplied by the temporal discount factor. Use it as the main comparative metric when ranking repositories by AI authorship density.
Raw Score: The unmodified sum of all severity-weighted, context-multiplied pattern match scores before temporal discounting. Reflects the absolute signal strength independent of when the repository was last active.
Time Factor: The temporal discount multiplier (0–100%) applied to the raw score. Repositories last updated before ChatGPT's launch (Nov 2022) receive a 5% factor. Full signal is only assigned to repositories active in the post-adoption era (Jan 2024+).
Pattern Hits: Total count of individual pattern matches across all files and categories. A high hit count with a low score may indicate a very large codebase with isolated AI snippets; a low count with a high score indicates dense, concentrated AI signatures.
HC Hit Rate: High+Critical pattern hits per file, averaged across the repository. This orthogonal signal catches repositories where a few files are densely packed with high-severity AI tells — a strong indicator even when the normalised score appears moderate due to codebase size.
Lines of Code / Files: Total lines and files analysed. The scanner examines 94 file extensions. These denominators are used to normalise the score, enabling fair comparison between repositories of vastly different sizes.
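
As a rough cross-check of the definitions above, the figures in this report can be recombined by hand. The sketch below sums the per-category points listed under Pattern Findings, divides by thousands of lines of code, applies the time factor, and recomputes the HC Hit Rate from the HIGH and CRITICAL counts. The function names are illustrative, and the exact formula and rounding used by the scanner are assumptions.

```python
# Figures copied from this report; the function names are hypothetical.
CATEGORY_POINTS = [108, 89, 47, 40, 39, 35, 30, 28, 24, 16, 3, 2, 2]
LINES_OF_CODE = 58_000   # displayed as 58.0K, so this is a rounded value
FILES = 273
TIME_FACTOR = 1.0        # displayed as 100%
HIGH_HITS, CRITICAL_HITS = 8, 0


def score_per_kloc(points, lines_of_code):
    """Sum of pattern points per 1,000 lines of code."""
    return sum(points) / (lines_of_code / 1000)


def hc_hit_rate(high, critical, files):
    """High + Critical hits per analysed file."""
    return (high + critical) / files


total_points = sum(CATEGORY_POINTS)
adjusted = score_per_kloc(CATEGORY_POINTS, LINES_OF_CODE) * TIME_FACTOR
rate = hc_hit_rate(HIGH_HITS, CRITICAL_HITS, FILES)

print(total_points)        # 463
print(round(adjusted, 1))  # 8.0
print(round(rate, 2))      # 0.03
```

Under these assumptions the per-category points (463 in total) reproduce the displayed score of 8.0 once divided by 58.0K lines, and 8 HIGH hits over 273 files reproduce the displayed HC Hit Rate of 0.03.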

Score History

This chart maps the temporal evolution of the adjusted synthetic code score across successive scan runs. An upward trajectory indicates ongoing incorporation of AI-generated code or expanding LLM-assisted scaffolding; a stable or declining trajectory may reflect active human refactoring, code removal, or the adoption of stricter authorship policies. The dashed secondary line (right axis) independently tracks total raw pattern hit count, which can diverge from the normalised score when codebase size changes significantly between scans.

Severity Breakdown

Classifies detected patterns by their diagnostic confidence and structural impact. CRITICAL patterns (coefficient 10) represent definitive synthetic signatures — hallucinated imports, explicit LLM attribution metadata — virtually never produced by human authors. HIGH (5) indicates strong structural tells such as cross-file repetition or cross-linguistic idioms. MEDIUM (2) covers recognisable conversational padding and AI-specific vocabulary. LOW (1) captures subtle indicators like tautological comments and generic boilerplate that require density to carry independent signal.

CRITICAL: 0 · HIGH: 8 · MEDIUM: 63 · LOW: 252
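
The severity tally can be checked against the headline Pattern Hits figure and combined with the coefficients stated above. This sketch, with illustrative names, computes the total hit count and a base severity-weighted sum before any context or density multipliers. That base sum is not a figure the report itself displays.

```python
# Coefficients and counts as stated in this section; names are illustrative.
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
SEVERITY_HITS = {"CRITICAL": 0, "HIGH": 8, "MEDIUM": 63, "LOW": 252}

total_hits = sum(SEVERITY_HITS.values())
base_weighted = sum(
    SEVERITY_COEFFICIENT[severity] * count
    for severity, count in SEVERITY_HITS.items()
)

print(total_hits)     # 323, matching the Pattern Hits figure
print(base_weighted)  # 418, before context and cluster multipliers
```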

Directory Score Breakdown

This horizontal bar chart decomposes the repository's raw synthetic code score by top-level directory, allowing you to pinpoint precisely which modules or components carry the highest AI authorship density. Directories with disproportionately high scores relative to their size warrant targeted manual review: concentrated AI signatures often trace back to mass-generated configuration layers, auto-ported test suites, LLM-scaffolded boilerplate classes, or entire subsystems authored under heavy copilot assistance. Use this view to prioritise your human code-review effort.
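
A per-directory decomposition of this kind can be approximated by grouping findings on the first path component. The sketch below does so for the eight Cross-File Repetition rows only, using 5 points per hit (40 pts over 8 hits); it illustrates the grouping step and is not the data behind the chart. The helper name is hypothetical.

```python
from collections import defaultdict

# (path, points) pairs taken from the Cross-File Repetition table.
FINDINGS = [
    ("ch06/01_main-chapter-code/gpt_download.py", 5.0),
    ("ch06/02_bonus_additional-experiments/gpt_download.py", 5.0),
    ("ch06/03_bonus_imdb-classification/gpt_download.py", 5.0),
    ("appendix-E/01_main-chapter-code/gpt_download.py", 5.0),
    ("ch05/01_main-chapter-code/gpt_download.py", 5.0),
    ("ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py", 5.0),
    ("appendix-A/01_main-chapter-code/DDP-script.py", 5.0),
    ("appendix-A/01_main-chapter-code/DDP-script-torchrun.py", 5.0),
]


def score_by_top_level_dir(findings):
    """Sum points per top-level directory, highest total first."""
    totals = defaultdict(float)
    for path, points in findings:
        top_level = path.split("/", 1)[0]
        totals[top_level] += points
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


by_dir = score_by_top_level_dir(FINDINGS)
for directory, points in by_dir.items():
    print(f"{directory}: {points}")
```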

Pattern Findings

The scanner identified 323 distinct pattern matches across 13 syntactic categories. Each entry below represents a discrete location in the source code where the engine recorded a statistically significant AI authorship indicator. Expand any category row to inspect the individual file paths, line numbers, code snippets, and the lexical context (CODE, COMMENT, or STRING) in which each match was detected.

Reading the findings table: The Severity column indicates the diagnostic confidence level (CRITICAL / HIGH / MEDIUM / LOW). The Context column identifies whether the match occurred inside executable code, an inline comment, or a string literal — comment-context matches receive a ×1.5 weight because LLMs systematically over-annotate. The ⚡ bolt icon marks clustered matches: three or more patterns within a 10-line window, each receiving an additional ×1.5 density multiplier as dense clusters constitute far stronger evidence of synthetic authorship than isolated hits.
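
The two multipliers described above can be illustrated with the Decorative Section Separators category, which lists six clustered and four unclustered MEDIUM hits, all in comment context. The following is a minimal sketch that applies only the stated comment and cluster multipliers; the function name is hypothetical. Some other categories' totals do not follow from these two multipliers alone, so the scanner presumably applies further rules not described here.

```python
SEVERITY_COEFFICIENT = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
COMMENT_WEIGHT = 1.5   # comment-context matches
CLUSTER_WEIGHT = 1.5   # matches marked with the bolt icon


def hit_points(severity, context, clustered):
    """Points for one hit under the two multipliers stated in the report."""
    points = float(SEVERITY_COEFFICIENT[severity])
    if context == "COMMENT":
        points *= COMMENT_WEIGHT
    if clustered:
        points *= CLUSTER_WEIGHT
    return points


# Six clustered and four unclustered MEDIUM comment hits.
hits = [("MEDIUM", "COMMENT", True)] * 6 + [("MEDIUM", "COMMENT", False)] * 4
separator_points = sum(hit_points(*hit) for hit in hits)

print(separator_points)  # 39.0, matching "10 hits · 39 pts"
```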

Self-Referential Comments: 45 hits · 108 pts

Severity	File	Line	Snippet	Context
MEDIUM	ch07/01_main-chapter-code/previous_chapters.py	467	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch07/01_main-chapter-code/previous_chapters.py	468	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch07/01_main-chapter-code/exercise_experiments.py	286	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch07/01_main-chapter-code/exercise_experiments.py	287	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch07/01_main-chapter-code/ollama_evaluate.py	15	# Create the data payload as a dictionary	COMMENT
MEDIUM	ch07/01_main-chapter-code/gpt_instruction_finetuning.py	135	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch07/01_main-chapter-code/gpt_instruction_finetuning.py	136	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch07/04_preference-tuning-with-dpo/previous_chapters.py	468	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch07/04_preference-tuning-with-dpo/previous_chapters.py	469	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch06/01_main-chapter-code/gpt_class_finetune.py	230	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch06/01_main-chapter-code/gpt_class_finetune.py	231	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch06/01_main-chapter-code/gpt_download.py	111	# Define the block size for reading the file	STRING
MEDIUM	ch06/02_bonus_additional-experiments/gpt_download.py	111	# Define the block size for reading the file	STRING
MEDIUM	…_bonus_imdb-classification/download_prepare_dataset.py	58	# Create a DataFrame for each file and add it to the list	COMMENT
MEDIUM	ch06/03_bonus_imdb-classification/gpt_download.py	111	# Define the block size for reading the file	STRING
MEDIUM	…6/03_bonus_imdb-classification/train_sklearn_logreg.py	65	# Create a dummy classifier with the strategy to predict the most frequent class	COMMENT
MEDIUM	…nal-aws-sagemaker-notebook/cloudformation-template.yml	59	# Create a startup script that will run in the background	COMMENT
MEDIUM	…nal-aws-sagemaker-notebook/cloudformation-template.yml	104	# Create a flag file to indicate setup is complete	COMMENT
MEDIUM	appendix-D/01_main-chapter-code/previous_chapters.py	305	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	appendix-D/01_main-chapter-code/previous_chapters.py	306	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	appendix-E/01_main-chapter-code/previous_chapters.py	540	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	appendix-E/01_main-chapter-code/previous_chapters.py	541	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	appendix-E/01_main-chapter-code/gpt_download.py	111	# Define the block size for reading the file	STRING
MEDIUM	ch05/16_qwen3.5/qwen3_5_transformers.py	97	"""This function is intended to align with the l2norm implementation in the FLA library."""	STRING
MEDIUM	ch05/01_main-chapter-code/gpt_train.py	122	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/01_main-chapter-code/gpt_train.py	123	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch05/01_main-chapter-code/gpt_download.py	111	# Define the block size for reading the file	STRING
MEDIUM	ch05/01_main-chapter-code/gpt_generate.py	77	# Define the block size for reading the file	COMMENT
MEDIUM	ch05/07_gpt_to_llama/tests/tests_rope_and_parts.py	113	# Create a module to store the imported functions and classes	COMMENT
MEDIUM	ch05/10_llm-training-speed/01_opt_single_gpu.py	361	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/10_llm-training-speed/01_opt_single_gpu.py	362	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch05/10_llm-training-speed/00_orig.py	397	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/10_llm-training-speed/00_orig.py	398	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py	426	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py	427	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch05/05_bonus_hparam_tuning/hparam_search.py	18	# Define a grid of hyperparameters to search over	COMMENT
MEDIUM	ch05/18_muon/gpt_train_muon.py	166	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/18_muon/gpt_train_muon.py	167	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	ch05/18_muon/gpt_train.py	122	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	ch05/18_muon/gpt_train.py	123	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	pkg/llms_from_scratch/ch06.py	225	# Create a second x-axis for examples seen	COMMENT
MEDIUM	pkg/llms_from_scratch/ch06.py	226	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE
MEDIUM	pkg/llms_from_scratch/ch07.py	214	# Create the data payload as a dictionary	COMMENT
MEDIUM	pkg/llms_from_scratch/ch05.py	236	# Create a second x-axis for tokens seen	COMMENT
MEDIUM	pkg/llms_from_scratch/ch05.py	237	ax2 = ax1.twiny() # Create a second x-axis that shares the same y-axis	CODE

Hyper-Verbose Identifiers: 89 hits · 89 pts

Severity	File	Line	Snippet	Context
LOW	ch07/01_main-chapter-code/previous_chapters.py	338	def generate_and_print_sample(model, tokenizer, device, start_context):	CODE
LOW	ch07/01_main-chapter-code/exercise_experiments.py	190	def custom_collate_with_masking_fn(	CODE
LOW	ch07/02_dataset-utilities/find-near-duplicates.py	76	def find_print_and_remove_near_duplicates(json_data, remove_duplicates=False, threshold=0.75):	CODE
LOW	ch07/04_preference-tuning-with-dpo/previous_chapters.py	339	def generate_and_print_sample(model, tokenizer, device, start_context):	CODE
LOW	ch06/01_main-chapter-code/gpt_class_finetune.py	26	def download_and_unzip_spam_data(url, zip_path, extracted_path, data_file_path):	CODE
LOW	…_bonus_imdb-classification/download_prepare_dataset.py	31	def download_and_extract_dataset(dataset_url, target_file, directory):	CODE
LOW	…_bonus_imdb-classification/download_prepare_dataset.py	51	def load_dataset_to_dataframe(basepath="aclImdb", labels={"pos": 1, "neg": 0}):	CODE
LOW	appendix-D/01_main-chapter-code/previous_chapters.py	282	def generate_and_print_sample(model, tokenizer, device, start_context):	CODE
LOW	appendix-E/01_main-chapter-code/previous_chapters.py	364	def download_and_unzip_spam_data(url, zip_path, extracted_path, data_file_path):	CODE
LOW	ch04/06_swa/gpt_with_kv_mha.py	253	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/06_swa/tests.py	12	def test_cached_prefill_matches_uncached_swa():	CODE
LOW	ch04/06_swa/tests.py	37	def test_swa_matches_base_model_when_window_equals_context():	CODE
LOW	ch04/06_swa/plot_memory_estimates_swa.py	57	def calc_kv_bytes_total_mha_swa(	CODE
LOW	ch04/06_swa/plot_memory_estimates_swa.py	75	def calc_kv_bytes_total_gqa_swa(	CODE
LOW	ch04/06_swa/gpt_with_kv_swa.py	294	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/07_moe/plot_memory_estimates_moe.py	16	def calc_moe_active_and_total(	CODE
LOW	ch04/07_moe/plot_memory_estimates_moe.py	42	def plot_active_params_vs_experts(	CODE
LOW	ch04/07_moe/gpt_with_kv_moe.py	339	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/07_moe/memory_estimator_moe.py	39	def estimate_params_and_hidden(	CODE
LOW	ch04/07_moe/gpt_with_kv_ffn.py	279	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/04_gqa/gpt_with_kv_mha.py	253	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/04_gqa/gpt_with_kv_gqa.py	265	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/04_gqa/plot_memory_estimates_gqa.py	23	def plot_abs_kv_vs_context_multi_groups():	CODE
LOW	ch04/09_dsa/gpt_with_kv_dsa.py	389	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/09_dsa/test_dsa.py	16	def import_transformers_dsa_model():	CODE
LOW	ch04/09_dsa/test_dsa.py	89	def test_indexer_matches_transformers_reference():	CODE
LOW	ch04/09_dsa/test_dsa.py	163	def dense_attention_reference(attn, x):	CODE
LOW	ch04/09_dsa/test_dsa.py	181	def test_topk_full_equals_dense():	CODE
LOW	ch04/10_kv-sharing/gpt_with_kv_mha.py	253	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/10_kv-sharing/gpt_with_kv_sharing.py	270	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/10_kv-sharing/tests.py	10	def test_kv_sharing_matches_mha_when_all_layers_produce_kv():	CODE
LOW	ch04/10_kv-sharing/tests.py	55	def test_only_producer_layers_store_kv_cache():	CODE
LOW	ch04/10_kv-sharing/tests.py	84	def test_memory_estimator_counts_cached_layers():	CODE
LOW	…04/08_deltanet/plot_memory_estimates_gated_deltanet.py	27	def calc_kv_bytes_total_deltanet_no_conv(batch, emb_dim, n_layers, bytes_per_elem, n_heads):	CODE
LOW	ch04/05_mla/plot_memory_estimates_mla.py	33	def plot_abs_kv_vs_context_multiple():	CODE
LOW	ch04/05_mla/gpt_with_kv_mla.py	261	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/05_mla/gpt_with_kv_mha.py	253	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/03_kv-cache/gpt_with_kv_cache.py	280	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/03_kv-cache/README.md	173	def generate_text_simple_cached(model, idx, max_new_tokens,	CODE
LOW	ch04/03_kv-cache/gpt_with_kv_cache_optimized.py	306	def generate_text_simple_cached(model, idx, max_new_tokens, context_size=None, use_cache=True):	CODE
LOW	ch04/03_kv-cache/tests.py	32	def test_gpt_model_equivalence_not_cached(ModelClass):	CODE
LOW	ch04/03_kv-cache/tests.py	66	def test_gpt_model_equivalence_cached(ModelClass):	CODE
LOW	ch04/03_kv-cache/tests.py	113	def test_context_overflow_bug():	CODE
LOW	ch04/03_kv-cache/tests.py	150	def test_prefill_chunking_basic():	CODE
LOW	ch02/05_bpe-from-scratch/tests.py	11	def import_definitions_from_notebook(fullname, names):	CODE
LOW	ch02/05_bpe-from-scratch/tests.py	185	def test_no_eot_aliasing_and_disallowed_logic(imported_module, gpt2_files):	CODE
LOW	ch02/05_bpe-from-scratch/tests.py	214	def test_newline_roundtrip_and_equivalence(imported_module, gpt2_files, text):	CODE
LOW	ch02/05_bpe-from-scratch/tests.py	234	def test_space_newline_space_patterns(imported_module, gpt2_files):	CODE
LOW	ch02/05_bpe-from-scratch/tests.py	250	def test_multiple_leading_spaces_roundtrip(imported_module, gpt2_files):	CODE
LOW	ch05/16_qwen3.5/qwen3_5_transformers.py	66	def apply_mask_to_padding_states(hidden_states, attention_mask):	CODE
LOW	ch05/16_qwen3.5/qwen3_5_transformers.py	102	def torch_chunk_gated_delta_rule(	CODE
LOW	ch05/16_qwen3.5/qwen3_5_transformers.py	182	def torch_recurrent_gated_delta_rule(	CODE
LOW	ch05/01_main-chapter-code/gpt_train.py	61	def generate_and_print_sample(model, tokenizer, device, start_context):	CODE
LOW	ch05/15_tiny-aya/tests/test_tiny_aya_nb.py	54	def test_dummy_tiny_aya_forward(dummy_cfg_base, dummy_input, import_notebook_defs):	CODE
LOW	ch05/15_tiny-aya/tests/test_tiny_aya_nb.py	64	def test_tiny_aya_base_equivalence_with_transformers(import_notebook_defs):	CODE
LOW	ch05/15_tiny-aya/tests/test_tiny_aya_kvcache_nb.py	55	def test_dummy_tiny_aya_forward(dummy_cfg_base, dummy_input, import_notebook_defs):	CODE
LOW	ch05/15_tiny-aya/tests/test_tiny_aya_kvcache_nb.py	65	def test_tiny_aya_base_equivalence_with_transformers(import_notebook_defs):	CODE
LOW	ch05/07_gpt_to_llama/tests/tests_rope_and_parts.py	97	def import_definitions_from_notebook(notebooks):	CODE
LOW	ch05/10_llm-training-speed/01_opt_single_gpu.py	255	def generate_and_print_sample(model, tokenizer, device, start_context):	CODE
LOW	ch05/10_llm-training-speed/01_opt_single_gpu.py	269	def train_model_simple_with_timing(model, train_loader, val_loader, optimizer, device,	CODE
29 more matches not shown…

Deep Nesting: 47 hits · 47 pts

Severity	File	Line	Context
LOW	ch07/01_main-chapter-code/gpt_download.py	48	CODE
LOW	ch07/01_main-chapter-code/gpt_download.py	49	CODE
LOW	ch07/02_dataset-utilities/find-near-duplicates.py	41	CODE
LOW	ch06/01_main-chapter-code/gpt_download.py	47	CODE
LOW	ch06/01_main-chapter-code/gpt_download.py	48	CODE
LOW	ch06/02_bonus_additional-experiments/gpt_download.py	47	CODE
LOW	ch06/02_bonus_additional-experiments/gpt_download.py	48	CODE
LOW	…bonus_additional-experiments/additional_experiments.py	257	CODE
LOW	…_bonus_imdb-classification/download_prepare_dataset.py	31	CODE
LOW	…_bonus_imdb-classification/download_prepare_dataset.py	51	CODE
LOW	ch06/03_bonus_imdb-classification/gpt_download.py	47	CODE
LOW	ch06/03_bonus_imdb-classification/gpt_download.py	48	CODE
LOW	…nstalling-python-libraries/python_environment_check.py	20	CODE
LOW	appendix-E/01_main-chapter-code/gpt_download.py	47	CODE
LOW	appendix-E/01_main-chapter-code/gpt_download.py	48	CODE
LOW	ch04/07_moe/gpt_with_kv_moe.py	339	CODE
LOW	ch04/07_moe/gpt_with_kv_ffn.py	279	CODE
LOW	ch02/02_bonus_bytepair-encoder/bpe_openai_gpt2.py	148	CODE
LOW	ch05/16_qwen3.5/tests/qwen3_5_layer_debugger.py	248	CODE
LOW	…3_bonus_pretraining_on_gutenberg/pretraining_simple.py	80	CODE
LOW	ch05/12_gemma3/tests/gemma3_layer_debugger.py	206	CODE
LOW	ch05/01_main-chapter-code/gpt_download.py	48	CODE
LOW	ch05/01_main-chapter-code/gpt_download.py	49	CODE
LOW	ch05/01_main-chapter-code/gpt_generate.py	62	CODE
LOW	ch05/15_tiny-aya/tests/tiny_aya_layer_debugger.py	199	CODE
LOW	ch05/07_gpt_to_llama/tests/tests_rope_and_parts.py	96	CODE
LOW	ch05/07_gpt_to_llama/tests/tests_rope_and_parts.py	97	CODE
LOW	ch05/13_olmo3/tests/olmo3_layer_debugger.py	251	CODE
LOW	…/17_gemma4/tests/test_e4b/gemma4_e4b_layer_debugger.py	258	CODE
LOW	…/17_gemma4/tests/test_e2b/gemma4_e2b_layer_debugger.py	270	CODE
LOW	ch05/10_llm-training-speed/01_opt_single_gpu.py	269	CODE
LOW	ch05/10_llm-training-speed/00_orig.py	305	CODE
LOW	ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py	314	CODE
LOW	…qwen3-chat-interface/qwen3-chat-interface-multiturn.py	33	CODE
LOW	…/11_qwen3/qwen3-chat-interface/qwen3-chat-interface.py	32	CODE
LOW	.github/scripts/check_double_quotes.py	76	CODE
LOW	pkg/llms_from_scratch/utils.py	17	CODE
LOW	pkg/llms_from_scratch/utils.py	39	CODE
LOW	pkg/llms_from_scratch/utils.py	110	CODE
LOW	pkg/llms_from_scratch/utils.py	153	CODE
LOW	pkg/llms_from_scratch/qwen3.py	653	CODE
LOW	pkg/llms_from_scratch/ch05.py	280	CODE
LOW	pkg/llms_from_scratch/ch05.py	281	CODE
LOW	pkg/llms_from_scratch/appendix_d.py	23	CODE
LOW	pkg/llms_from_scratch/tests/test_qwen3.py	63	CODE
LOW	pkg/llms_from_scratch/tests/test_qwen3.py	630	CODE
LOW	pkg/llms_from_scratch/tests/test_qwen3.py	704	CODE

Cross-File Repetition: 8 hits · 40 pts

Severity	File	Snippet	Context
HIGH	ch06/01_main-chapter-code/gpt_download.py	def download_file(url, destination): # send a get request to download the file in streaming mode response = requests.get	STRING
HIGH	ch06/02_bonus_additional-experiments/gpt_download.py	def download_file(url, destination): # send a get request to download the file in streaming mode response = requests.get	STRING
HIGH	ch06/03_bonus_imdb-classification/gpt_download.py	def download_file(url, destination): # send a get request to download the file in streaming mode response = requests.get	STRING
HIGH	appendix-E/01_main-chapter-code/gpt_download.py	def download_file(url, destination): # send a get request to download the file in streaming mode response = requests.get	STRING
HIGH	ch05/01_main-chapter-code/gpt_download.py	def download_file(url, destination): # send a get request to download the file in streaming mode response = requests.get	STRING
HIGH	ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py	arguments: rank: a unique process id world_size: total number of processes in the group	STRING
HIGH	appendix-A/01_main-chapter-code/DDP-script.py	arguments: rank: a unique process id world_size: total number of processes in the group	STRING
HIGH	appendix-A/01_main-chapter-code/DDP-script-torchrun.py	arguments: rank: a unique process id world_size: total number of processes in the group	STRING
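
One way repetition like this could be found is by fingerprinting normalised text blocks and grouping files that share a fingerprint. The sketch below is an assumption about the general technique, not the scanner's actual method; the file names and texts in the example are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def repeated_blocks(blocks_by_file, min_files=2):
    """Groups of files whose block normalises to the same text."""
    groups = defaultdict(list)
    for path, text in blocks_by_file.items():
        groups[fingerprint(text)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) >= min_files]


# Hypothetical inputs: two files share a comment that differs only in
# case and whitespace, and a third file is unrelated.
blocks = {
    "a/download.py": "Send a GET request\n    to download the file",
    "b/download.py": "send a get request to download the file",
    "c/other.py": "Parse the command line",
}
repeats = repeated_blocks(blocks)
print(repeats)  # [['a/download.py', 'b/download.py']]
```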

Decorative Section Separators: 10 hits · 39 pts

Severity	File	Line	Snippet	Context
MEDIUM⚡	…qwen3-chat-interface/qwen3-chat-interface-multiturn.py	22	# ============================================================	COMMENT
MEDIUM⚡	…qwen3-chat-interface/qwen3-chat-interface-multiturn.py	24	# ============================================================	COMMENT
MEDIUM⚡	…qwen3-chat-interface/qwen3-chat-interface-multiturn.py	30	# ============================================================	COMMENT
MEDIUM⚡	…/11_qwen3/qwen3-chat-interface/qwen3-chat-interface.py	21	# ============================================================	COMMENT
MEDIUM⚡	…/11_qwen3/qwen3-chat-interface/qwen3-chat-interface.py	23	# ============================================================	COMMENT
MEDIUM⚡	…/11_qwen3/qwen3-chat-interface/qwen3-chat-interface.py	29	# ============================================================	COMMENT
MEDIUM	pkg/llms_from_scratch/llama3.py	208	# ==============================================================================	COMMENT
MEDIUM	pkg/llms_from_scratch/llama3.py	257	# ==============================================================================	COMMENT
MEDIUM	pkg/llms_from_scratch/qwen3.py	332	# ==============================================================================	COMMENT
MEDIUM	pkg/llms_from_scratch/qwen3.py	381	# ==============================================================================	COMMENT
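
The bolt-marked rows above sit at lines 22, 24, 30 and 21, 23, 29 of their files, while the unmarked rows are about 50 lines apart. A minimal sketch of the "three or more patterns within a 10-line window" rule follows. It looks only at the line numbers of one file and treats a window as a span of fewer than 10 lines, both of which are assumptions about how the scanner counts.

```python
def clustered_lines(lines, window=10, min_hits=3):
    """Line numbers belonging to a run of at least `min_hits` hits
    that spans fewer than `window` lines."""
    ordered = sorted(lines)
    flagged = set()
    for start in range(len(ordered) - min_hits + 1):
        group = ordered[start:start + min_hits]
        if group[-1] - group[0] < window:
            flagged.update(group)
    return sorted(flagged)


multiturn = clustered_lines([22, 24, 30])  # three hits within a span of 8 lines
llama3 = clustered_lines([208, 257])       # only two hits, far apart

print(multiturn)  # [22, 24, 30]
print(llama3)     # []
```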

Excessive Try-Catch Wrapping: 29 hits · 35 pts

Severity	File	Line	Snippet	Context
LOW	ch07/01_main-chapter-code/gpt_download.py	91	except Exception as e:	CODE
LOW	ch06/01_main-chapter-code/gpt_download.py	91	except Exception as e:	CODE
LOW	ch06/02_bonus_additional-experiments/gpt_download.py	91	except Exception as e:	CODE
LOW	ch06/03_bonus_imdb-classification/gpt_download.py	91	except Exception as e:	CODE
LOW	…nstalling-python-libraries/python_environment_check.py	90	except Exception as e:	CODE
LOW	appendix-E/01_main-chapter-code/gpt_download.py	91	except Exception as e:	CODE
LOW	ch04/06_swa/plot_memory_estimates_swa.py	38	except Exception:	CODE
LOW	ch04/06_swa/memory_estimator_swa.py	37	except Exception:	CODE
LOW	ch05/16_qwen3.5/tests/qwen3_5_layer_debugger.py	20	except Exception:	CODE
LOW	ch05/16_qwen3.5/tests/qwen3_5_layer_debugger.py	39	except Exception:	CODE
MEDIUM	ch05/16_qwen3.5/tests/qwen3_5_layer_debugger.py	14	def _import_qwen3_5_classes():	CODE
LOW	ch05/16_qwen3.5/tests/test_qwen3_5_nb.py	22	except Exception:	CODE
LOW	ch05/16_qwen3.5/tests/test_qwen3_5_nb.py	43	except Exception:	CODE
MEDIUM	ch05/16_qwen3.5/tests/test_qwen3_5_nb.py	16	def _import_qwen3_5_classes():	CODE
LOW	ch05/01_main-chapter-code/gpt_download.py	91	except Exception as e:	CODE
LOW	ch05/17_gemma4/tests/test_gemma4_nb.py	77	except Exception:	CODE
MEDIUM	ch05/17_gemma4/tests/test_gemma4_nb.py	71	def gemma4_transformers_module():	CODE
LOW	…/17_gemma4/tests/test_e4b/gemma4_e4b_layer_debugger.py	20	except Exception:	CODE
LOW	…/17_gemma4/tests/test_e4b/gemma4_e4b_layer_debugger.py	42	except Exception:	CODE
MEDIUM	…/17_gemma4/tests/test_e4b/gemma4_e4b_layer_debugger.py	15	def _import_gemma4_classes():	CODE
LOW	…/17_gemma4/tests/test_e2b/gemma4_e2b_layer_debugger.py	20	except Exception:	CODE
LOW	…/17_gemma4/tests/test_e2b/gemma4_e2b_layer_debugger.py	42	except Exception:	CODE
MEDIUM	…/17_gemma4/tests/test_e2b/gemma4_e2b_layer_debugger.py	15	def _import_gemma4_classes():	CODE
LOW	.github/scripts/check_double_quotes.py	111	except Exception as e:	CODE
MEDIUM	.github/scripts/check_double_quotes.py	104	def check_file(path):	CODE
LOW	pkg/llms_from_scratch/utils.py	171	except Exception as e:	CODE
LOW	pkg/llms_from_scratch/ch05.py	323	except Exception as e:	CODE
LOW	pkg/llms_from_scratch/tests/test_qwen3.py	77	except Exception:	CODE
LOW	pkg/llms_from_scratch/tests/test_qwen3.py	654	except Exception:	CODE
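
Most rows in this category are `except Exception` handlers. As a hedged illustration of how such handlers can be located, the sketch below walks a module's syntax tree with Python's standard `ast` module and reports bare or `Exception`-typed handlers. The sample source and function name are hypothetical, and the scanner's real detection logic is not documented here.

```python
import ast

# Hypothetical sample: one broad handler and one narrow handler.
SOURCE = '''
def load(path):
    try:
        return open(path).read()
    except Exception as e:
        print(e)
    try:
        return int(path)
    except ValueError:
        return None
'''


def broad_except_lines(source):
    """Line numbers of bare `except:` or `except Exception` handlers."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            is_bare = node.type is None
            is_broad = isinstance(node.type, ast.Name) and node.type.id == "Exception"
            if is_bare or is_broad:
                lines.append(node.lineno)
    return sorted(lines)


broad = broad_except_lines(SOURCE)
print(broad)  # [5]
```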

Redundant / Tautological Comments: 28 hits · 30 pts

Severity	File	Line	Snippet	Context
LOW	ch07/01_main-chapter-code/previous_chapters.py	301	model.train() # Set model to training mode	CODE
LOW	ch07/01_main-chapter-code/gpt_download.py	55	# Check if file exists and has same size	COMMENT
LOW	ch07/04_preference-tuning-with-dpo/previous_chapters.py	302	model.train() # Set model to training mode	CODE
LOW	ch06/01_main-chapter-code/gpt_class_finetune.py	190	model.train() # Set model to training mode	CODE
LOW	ch06/01_main-chapter-code/gpt_download.py	54	# Check if file exists and has same size	COMMENT
LOW	ch06/01_main-chapter-code/gpt_download.py	104	# Check if file exists and has the same size	STRING
LOW	ch06/02_bonus_additional-experiments/gpt_download.py	54	# Check if file exists and has same size	COMMENT
LOW	ch06/02_bonus_additional-experiments/gpt_download.py	104	# Check if file exists and has the same size	STRING
LOW	…bonus_additional-experiments/additional_experiments.py	338	model.train() # Set model to training mode	CODE
LOW	ch06/03_bonus_imdb-classification/train_bert_hf_spam.py	237	model.train() # Set model to training mode	CODE
LOW	ch06/03_bonus_imdb-classification/train_gpt.py	184	model.train() # Set model to training mode	CODE
LOW	ch06/03_bonus_imdb-classification/gpt_download.py	54	# Check if file exists and has same size	COMMENT
LOW	ch06/03_bonus_imdb-classification/gpt_download.py	104	# Check if file exists and has the same size	STRING
LOW	ch06/03_bonus_imdb-classification/train_gpt_muon.py	219	model.train() # Set model to training mode	CODE
LOW	ch06/03_bonus_imdb-classification/train_bert_hf.py	140	model.train() # Set model to training mode	CODE
LOW	…nal-aws-sagemaker-notebook/cloudformation-template.yml	119	# Check if setup is still running or not started	COMMENT
LOW	appendix-E/01_main-chapter-code/previous_chapters.py	500	model.train() # Set model to training mode	CODE
LOW	appendix-E/01_main-chapter-code/gpt_download.py	54	# Check if file exists and has same size	COMMENT
LOW	appendix-E/01_main-chapter-code/gpt_download.py	104	# Check if file exists and has the same size	STRING
LOW	ch05/01_main-chapter-code/gpt_train.py	84	model.train() # Set model to training mode	CODE
LOW	ch05/01_main-chapter-code/gpt_download.py	55	# Check if file exists and has same size	COMMENT
LOW	ch05/01_main-chapter-code/gpt_download.py	104	# Check if file exists and has the same size	STRING
LOW	ch05/01_main-chapter-code/gpt_generate.py	70	# Check if file exists and has the same size	COMMENT
LOW	ch05/18_muon/gpt_train_muon.py	120	model.train() # Set model to training mode	CODE
LOW	ch05/18_muon/gpt_train.py	84	model.train() # Set model to training mode	CODE
LOW	pkg/llms_from_scratch/ch06.py	185	model.train() # Set model to training mode	CODE
LOW	pkg/llms_from_scratch/ch05.py	70	model.train() # Set model to training mode	CODE
LOW	pkg/llms_from_scratch/ch05.py	287	# Check if file exists and has same size	COMMENT

Unused Imports: 28 hits · 28 pts

Severity	File	Line	Context
LOW	.github/scripts/test_check_links.py	3	CODE
LOW	.github/scripts/check_links.py	14	CODE
LOW	pkg/llms_from_scratch/kv_cache/gpt2.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache/generate.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache/llama3.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/generate.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	6	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
LOW	pkg/llms_from_scratch/kv_cache_batched/qwen3.py	7	CODE
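
Several rows above repeat the same file and line, which would be consistent with one import statement that brings in several names, each counted as its own hit. The sketch below shows a simple name-based check built on the standard `ast` module. It is an illustration under that assumption, with a hypothetical sample, and it does not account for names that are imported on purpose for re-export.

```python
import ast

# Hypothetical sample: `os`, `floor`, and `ceil` are imported but never used.
SOURCE = """import os
from math import sqrt, floor, ceil

print(sqrt(4))
"""


def unused_imports(source):
    """(line, name) pairs for imported names never referenced in the module."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                name = (alias.asname or alias.name).split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted((line, name) for name, line in imported.items() if name not in used)


unused = unused_imports(SOURCE)
print(unused)  # [(1, 'os'), (2, 'ceil'), (2, 'floor')]
```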

Over-Commented Block: 24 hits · 24 pts

Severity	File	Line	Snippet	Context
LOW	ch06/01_main-chapter-code/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	…6/02_bonus_additional-experiments/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch06/03_bonus_imdb-classification/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch04/07_moe/gpt_with_kv_ffn.py	141	torch.sqrt(torch.tensor(2.0 / torch.pi)) *	COMMENT
LOW	ch04/09_dsa/gpt_with_kv_dsa.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch04/09_dsa/gpt_with_kv_dsa.py	21		COMMENT
LOW	ch04/05_mla/gpt_with_kv_mla.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch02/02_bonus_bytepair-encoder/bpe_openai_gpt2.py	1	# Source: https://github.com/openai/gpt-2/blob/master/src/encoder.py	COMMENT
LOW	ch05/01_main-chapter-code/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch05/07_gpt_to_llama/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	ch05/10_llm-training-speed/01_opt_single_gpu.py	501	epochs_tensor = torch.linspace(0, OTHER_SETTINGS["num_epochs"], len(train_losses))	COMMENT
LOW	ch05/10_llm-training-speed/00_orig.py	521	###########################	COMMENT
LOW	ch05/10_llm-training-speed/02_opt_multi_gpu_ddp.py	601	# torch.save(model._orig_mod.state_dict(), "model.pth")	COMMENT
LOW	ch05/18_muon/previous_chapters.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	.github/scripts/check_links.py	1	#!/usr/bin/env -S uv run --script	COMMENT
LOW	pkg/llms_from_scratch/ch07.py	41	# if not os.path.exists(file_path):	COMMENT
LOW	pkg/llms_from_scratch/__init__.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	pkg/llms_from_scratch/llama3.py	201	# Combine heads, where self.d_out = self.num_heads * self.head_dim	COMMENT
LOW	pkg/llms_from_scratch/llama3.py	221	# │ │ │ │ │ │ │ │	COMMENT
LOW	pkg/llms_from_scratch/llama3.py	241	# [ x0 x1 x2 x3 x4 x5 x6 x7 ]	COMMENT
LOW	pkg/llms_from_scratch/qwen3.py	341	#	COMMENT
LOW	pkg/llms_from_scratch/qwen3.py	361	# 2) Interleaved (even/odd) style (original paper, Llama repo):	COMMENT
LOW	pkg/llms_from_scratch/kv_cache/__init__.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT
LOW	pkg/llms_from_scratch/kv_cache_batched/__init__.py	1	# Copyright (c) Sebastian Raschka under Apache License 2.0 (see LICENSE.txt).	COMMENT

Structural Annotation Overuse: 11 hits · 16 pts

Severity	File	Line	Snippet	Context
LOW	ch07/06_user_interface/README.md	16	## Step 1: Install dependencies	COMMENT
LOW	ch07/06_user_interface/README.md	27	## Step 2: Run `app` code	COMMENT
LOW	ch06/03_bonus_imdb-classification/README.md	29	## Step 1: Install Dependencies	COMMENT
LOW	ch06/03_bonus_imdb-classification/README.md	38	## Step 2: Download Dataset	COMMENT
LOW	ch06/03_bonus_imdb-classification/README.md	50	## Step 3: Run Models	COMMENT
LOW	ch06/04_user_interface/README.md	16	## Step 1: Install dependencies	COMMENT
LOW	ch06/04_user_interface/README.md	27	## Step 2: Run `app` code	COMMENT
LOW	ch05/06_user_interface/README.md	16	## Step 1: Install dependencies	COMMENT
LOW	ch05/06_user_interface/README.md	27	## Step 2: Run `app` code	COMMENT
LOW	ch05/11_qwen3/qwen3-chat-interface/README.md	16	## Step 1: Install dependencies	COMMENT
LOW	ch05/11_qwen3/qwen3-chat-interface/README.md	34	## Step 2: Run `app` code	COMMENT

AI Slop Vocabulary: 1 hit · 3 pts

Severity	File	Line	Snippet	Context
MEDIUM	pkg/llms_from_scratch/ch07.py	33	# The `requests` version above is more robust	COMMENT

Slop Phrases: 1 hit · 2 pts

Severity	File	Line	Snippet	Context
MEDIUM	…-tuning-with-dpo/instruction-data-with-preference.json	3520	"chosen": "It's worth noting that the most popular vegetable in the world is actually the potato.",	CODE

AI Structural Patterns: 2 hits · 2 pts

Severity	File	Line	Snippet	Context
LOW	ch05/16_qwen3.5/qwen3_5_transformers.py	93		CODE
LOW	ch05/16_qwen3.5/qwen3_5_transformers.py	411		CODE
