NVIDIA-NeMo/NeMo

Adjusted Score: 23.7
Raw Score: 23.7
Time Factor: 100%
Last Push: 2026-07-14
Stars: 17.8K
Language: Python
Lines of Code: 475.1K
Files: 1.9K
Pattern Hits: 8.1K
Scan Date: 2026-07-14
HC Hit Rate: 0.28

What These Metrics Mean

Adjusted Score: Primary synthetic code indicator. Raw score normalised per 1,000 lines of code and multiplied by the temporal discount factor. This is the definitive comparative metric — use it to rank repositories by AI authorship density.
Raw Score: The unmodified sum of all severity-weighted, context-multiplied pattern match scores before temporal discounting. Reflects the absolute signal strength independent of when the repository was last active.
Time Factor: The temporal discount multiplier (0–100%) applied to the raw score. Repositories last updated before ChatGPT's launch (Nov 2022) receive a 5% factor. Full signal is only assigned to repositories active in the post-adoption era (Jan 2024+).
Pattern Hits: Total count of individual pattern matches across all files and categories. A high hit count with a low score may indicate a very large codebase with isolated AI snippets; a low count with a high score indicates dense, concentrated AI signatures.
HC Hit Rate: High+Critical pattern hits per file, averaged across the repository. This orthogonal signal catches repositories where a few files are densely packed with high-severity AI tells — a strong indicator even when the normalised score appears moderate due to codebase size.
Lines of Code / Files: Total lines and files analysed. The scanner examines 94 file extensions. These denominators are used to normalise the score, enabling fair comparison between repositories of vastly different sizes.
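
As a rough illustration of how the definitions above fit together, the following sketch recomputes two of the metrics from raw totals. The function names are hypothetical and this is not the scanner's actual implementation. The time factor is taken as an input because the text does not say how it is derived for repositories last active between Nov 2022 and Jan 2024.

```python
def adjusted_score(total_points: float, lines_of_code: int, time_factor: float) -> float:
    """Points per 1,000 lines of code, scaled by the temporal discount (0.0 to 1.0)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    if not 0.0 <= time_factor <= 1.0:
        raise ValueError("time_factor must lie between 0.0 and 1.0")
    return total_points / (lines_of_code / 1000) * time_factor


def hc_hit_rate(critical_hits: int, high_hits: int, files: int) -> float:
    """High+Critical pattern hits per analysed file."""
    if files <= 0:
        raise ValueError("files must be positive")
    return (critical_hits + high_hits) / files


# Illustrative inputs only (not taken from this scan): 10,000 points over 500,000 lines.
print(adjusted_score(10_000, 500_000, time_factor=1.0))  # 20.0

# Severity counts from this report; 1900 is the rounded "1.9K" file count.
print(round(hc_hit_rate(critical_hits=14, high_hits=526, files=1900), 2))  # 0.28
```

With the rounded file count, the second call reproduces the HC Hit Rate of 0.28 reported for this repository.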

Score History

This chart maps the temporal evolution of the adjusted synthetic code score across successive scan runs. An upward trajectory indicates ongoing incorporation of AI-generated code or expanding LLM-assisted scaffolding; a stable or declining trajectory may reflect active human refactoring, code removal, or the adoption of stricter authorship policies. The dashed secondary line (right axis) independently tracks total raw pattern hit count, which can diverge from the normalised score when codebase size changes significantly between scans.

Severity Breakdown

Classifies detected patterns by their diagnostic confidence and structural impact. CRITICAL patterns (coefficient 10) represent definitive synthetic signatures — hallucinated imports, explicit LLM attribution metadata — virtually never produced by human authors. HIGH (5) indicates strong structural tells such as cross-file repetition or cross-linguistic idioms. MEDIUM (2) covers recognisable conversational padding and AI-specific vocabulary. LOW (1) captures subtle indicators like tautological comments and generic boilerplate that require density to carry independent signal.

CRITICAL 14 · HIGH 526 · MEDIUM 410 · LOW 7117
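
The severity coefficients can be applied to these counts as in the minimal sketch below. It computes only the base severity-weighted sum, before any context or density multipliers, so the result is not expected to equal the reported score. The names `SEVERITY_COEFFICIENTS` and `base_weighted_points` are assumptions made for illustration.

```python
# Coefficients as stated in the Severity Breakdown description.
SEVERITY_COEFFICIENTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}


def base_weighted_points(hit_counts: dict[str, int]) -> int:
    """Sum of hits weighted by severity coefficient, with no other multipliers applied."""
    return sum(SEVERITY_COEFFICIENTS[severity] * count for severity, count in hit_counts.items())


# Hit counts per severity level from this report.
counts = {"CRITICAL": 14, "HIGH": 526, "MEDIUM": 410, "LOW": 7117}

print(sum(counts.values()))          # 8067
print(base_weighted_points(counts))  # 10707
```

The first figure matches the total number of pattern matches reported for this scan. The second shows how far LOW hits dominate the base sum even at coefficient 1.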

Directory Score Breakdown

This horizontal bar chart decomposes the repository's raw synthetic code score by top-level directory, allowing you to pinpoint precisely which modules or components carry the highest AI authorship density. Directories with disproportionately high scores relative to their size warrant targeted manual review: concentrated AI signatures often trace back to mass-generated configuration layers, auto-ported test suites, LLM-scaffolded boilerplate classes, or entire subsystems authored under heavy copilot assistance. Use this view to prioritise your human code-review effort.

Pattern Findings

The scanner identified 8067 distinct pattern matches across 23 syntactic categories. Each entry below represents a discrete location in the source code where the engine recorded a statistically significant AI authorship indicator. Expand any category row to inspect the individual file paths, line numbers, code snippets, and the lexical context (CODE, COMMENT, or STRING) in which each match was detected.

Reading the findings table: The Severity column gives the diagnostic confidence level (CRITICAL / HIGH / MEDIUM / LOW). The Context column records whether the match occurred in executable code (CODE), an inline comment (COMMENT), or a string literal (STRING). Comment-context matches receive a ×1.5 weight because LLMs systematically over-annotate. The ⚡ bolt icon marks clustered matches, meaning three or more patterns within a 10-line window. Each clustered match receives an additional ×1.5 density multiplier, since dense clusters are far stronger evidence of synthetic authorship than isolated hits.
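
The weighting described above could be sketched as follows. This is a hedged reconstruction, not the scanner's code: the `Match` type, the function names, and the reading of "within a 10-line window" as "first and last line fewer than 10 lines apart" are all assumptions, and the line numbers in the example are made up.

```python
from dataclasses import dataclass

SEVERITY_COEFFICIENTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 2, "LOW": 1}
COMMENT_MULTIPLIER = 1.5    # applied to COMMENT-context matches
DENSITY_MULTIPLIER = 1.5    # applied to clustered matches
CLUSTER_MIN_MATCHES = 3
CLUSTER_WINDOW_LINES = 10   # assumed meaning: last line - first line < 10


@dataclass(frozen=True)
class Match:
    severity: str
    line: int
    context: str  # "CODE", "COMMENT", or "STRING"


def clustered_flags(matches: list[Match]) -> list[bool]:
    """For one file, flag every match that shares a window with at least two others."""
    order = sorted(range(len(matches)), key=lambda i: matches[i].line)
    flags = [False] * len(matches)
    start = 0
    for end in range(len(order)):
        # Shrink the window until it spans fewer than CLUSTER_WINDOW_LINES lines.
        while matches[order[end]].line - matches[order[start]].line >= CLUSTER_WINDOW_LINES:
            start += 1
        if end - start + 1 >= CLUSTER_MIN_MATCHES:
            for k in range(start, end + 1):
                flags[order[k]] = True
    return flags


def score_match(match: Match, clustered: bool) -> float:
    """Severity coefficient times the context and density multipliers."""
    score = float(SEVERITY_COEFFICIENTS[match.severity])
    if match.context == "COMMENT":
        score *= COMMENT_MULTIPLIER
    if clustered:
        score *= DENSITY_MULTIPLIER
    return score


# Hypothetical matches in a single file.
matches = [
    Match("MEDIUM", 104, "COMMENT"),
    Match("MEDIUM", 106, "COMMENT"),
    Match("LOW", 110, "CODE"),
    Match("HIGH", 300, "STRING"),
]
flags = clustered_flags(matches)
scores = [score_match(m, f) for m, f in zip(matches, flags)]
print(flags)   # [True, True, True, False]
print(scores)  # [4.5, 4.5, 1.5, 5.0]
```

Under these assumptions an unclustered HIGH match in STRING context scores 5 points, so 444 such matches score 2220, which is consistent with the Cross-File Repetition total in the findings.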

Hyper-Verbose Identifiers · 2999 hits · 3016 pts

Severity	File	Line	Snippet	Context
LOW	tools/nemo_forced_aligner/align_eou.py	248	def get_manifests_for_this_rank(manifest_list, num_nodes, num_gpus, node_idx, gpu_idx):	CODE
LOW	…orced_aligner/tests/test_add_t_start_end_to_utt_obj.py	260	def test_add_t_start_end_to_utt_obj(alignment, expected_output_utterance, output_timestep_duration):	CODE
LOW	tools/nemo_forced_aligner/utils/make_ass_files.py	335	def make_token_level_ass_file(utt_obj, output_dir_root, ass_file_config, audio_dur):	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	135	def _prepare_tokenized_text_for_bpe_model(text: List[str], tokenizer, vocabulary: List[str], blank_idx: int = 0):	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	213	def determine_utterance_segments(config, utt_begin_indices, char_probs, timings, text, char_list):	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	303	def write_labels_for_audacity(	CODE
LOW	tools/speech_data_explorer/data_explorer.py	54	def _ensure_numba_coverage_compatibility():	CODE
LOW	tools/speech_data_explorer/data_explorer.py	293	def expand_sharded_path_without_braceexpand(path_pattern):	CODE
LOW	tools/speech_data_explorer/data_explorer.py	627	def build_tar_index_from_local(tar_path):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	85	def recommend_hyperparameters_human_readable(recommended_hyperparameters):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	92	def recommend_hyperparameters(df, model=None):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	148	def estimating_customization_job_time(df, recommended_hyperparameters):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	165	def warn_completion_is_not_empty(df):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	181	def warn_imbalanced_completion(df):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	304	def convert_into_prompt_completion_only(df, prompt_template="{prompt}", completion_template="{completion}"):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	311	def warn_and_drop_long_samples(df, max_total_char_length):	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	363	def split_into_train_validation(df, val_proportion=0.1):	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	39	def test_recommend_hyperparameters():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	83	def test_warn_completion_is_not_empty():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	106	def test_warn_imbalanced_completion():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	206	def test_warn_duplicated_rows():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	223	def test_drop_duplicated_rows():	CODE
LOW⚡	…ration/tests/test_customization_dataset_preparation.py	262	def test_drop_unrequired_fields():	CODE
LOW⚡	…ration/tests/test_customization_dataset_preparation.py	271	def test_convert_into_template():	CODE
LOW⚡	…ration/tests/test_customization_dataset_preparation.py	295	def test_convert_into_prompt_completion_only():	CODE
LOW⚡	…ration/tests/test_customization_dataset_preparation.py	313	def get_indexes_of_long_examples(df, max_total_char_length):	CODE
LOW⚡	…ration/tests/test_customization_dataset_preparation.py	318	def test_warn_and_drop_long_samples():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	346	def test_show_first_example_in_df():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	356	def test_get_prepared_filename():	CODE
LOW	…ration/tests/test_customization_dataset_preparation.py	374	def test_split_into_train_validation():	CODE
LOW⚡	nemo/lightning/base_callback.py	78	def on_save_checkpoint_success(self, *args, **kwargs) -> None:	CODE
LOW	nemo/lightning/__init__.py	24	def _is_slurm_interactive_mode():	CODE
LOW	nemo/lightning/callback_group.py	141	def hook_class_init_with_callbacks(cls, start_callback: str, end_callback: str) -> None:	CODE
LOW	nemo/lightning/one_logger_callback.py	37	def get_one_logger_init_config() -> Dict[str, Any]:	CODE
LOW	nemo/lightning/one_logger_callback.py	67	def _get_base_callback_config(	CODE
LOW	nemo/lightning/one_logger_callback.py	205	def _should_enable_for_current_rank() -> bool:	CODE
LOW⚡	nemo/core/connectors/save_restore_connector.py	602	def _inject_model_parallel_rank_for_ckpt(self, dirname, basename):	CODE
LOW⚡	nemo/core/connectors/save_restore_connector.py	608	def _make_nemo_file_from_folder(filename, source_dir):	CODE
LOW⚡	nemo/core/connectors/save_restore_connector.py	618	def _make_nemo_file_from_folder_with_multistorageclient(filename, source_dir):	CODE
LOW	nemo/core/connectors/save_restore_connector.py	98	def load_config_and_state_dict(	CODE
LOW	nemo/core/connectors/save_restore_connector.py	224	def load_instance_with_state_dict(self, instance, state_dict, strict):	CODE
LOW	nemo/core/connectors/save_restore_connector.py	512	def check_artifact_and_query_basename_match(query_path: str) -> bool:	CODE
LOW	nemo/core/connectors/save_restore_connector.py	706	def _unpack_nemo_file_with_multistorageclient(	CODE
LOW	nemo/core/connectors/save_restore_connector.py	748	def _load_state_dict_from_disk(model_weights, map_location='cpu'):	CODE
LOW	nemo/core/config/optimizers.py	237	def register_optimizer_params(name: str, optimizer_params: OptimizerParams):	CODE
LOW	nemo/core/config/schedulers.py	234	def register_scheduler_params(name: str, scheduler_params: SchedulerParams):	CODE
LOW	nemo/core/classes/exportable.py	295	def disabled_deployment_input_names(self) -> List[str]:	CODE
LOW	nemo/core/classes/exportable.py	300	def disabled_deployment_output_names(self) -> List[str]:	CODE
LOW	nemo/core/classes/exportable.py	339	def dynamic_shapes_for_export(self, use_dynamo=False):	CODE
LOW	nemo/core/classes/common.py	306	def _get_allowed_target_class(target_path: str):	CODE
LOW	nemo/core/classes/common.py	312	def _validate_config_targets_recursive(config_node: Any):	CODE
LOW	nemo/core/classes/common.py	342	def is_semantic_typecheck_enabled():	CODE
LOW	nemo/core/classes/common.py	539	def _attach_and_validate_output_types(self, out_objects, ignore_collections=False, output_types=None):	CODE
LOW	nemo/core/classes/common.py	834	def _inspect_signature_for_trainer(cls, check_cls):	CODE
LOW	nemo/core/classes/common.py	975	def get_available_model_names(cls) -> List[str]:	CODE
LOW	nemo/core/classes/common.py	1055	def _get_ngc_pretrained_model_info(cls, model_name: str, refresh_cache: bool = False) -> Tuple[type, str]:	CODE
LOW	nemo/core/classes/common.py	1114	def _get_hf_hub_pretrained_model_info(cls, model_name: str, refresh_cache: bool = False) -> Tuple[type, str]:	CODE
LOW	nemo/core/classes/common.py	1397	def set_semantic_check_enabled(enabled: bool = True):	CODE
LOW	nemo/core/classes/modelPT.py	298	def has_native_or_submodules_artifacts(self) -> bool:	CODE
LOW	nemo/core/classes/modelPT.py	577	def setup_multiple_validation_data(self, val_data_config: Union[DictConfig, Dict]):	CODE
2939 more matches not shown…

Cross-File Repetition · 444 hits · 2220 pts

Severity	File	Line	Snippet	Context
HIGH	nemo/core/config/optimizers.py	0	convenience method to obtain an optimizer class and partially instantiate it with optimizer kwargs. args: name: name of	STRING
HIGH	nemo/core/config/schedulers.py	0	convenience method to obtain an optimizer class and partially instantiate it with optimizer kwargs. args: name: name of	STRING
HIGH	nemo/core/optim/lr_scheduler.py	0	convenience method to obtain an optimizer class and partially instantiate it with optimizer kwargs. args: name: name of	STRING
HIGH	nemo/core/optim/optimizers.py	0	convenience method to obtain an optimizer class and partially instantiate it with optimizer kwargs. args: name: name of	STRING
HIGH	nemo/core/classes/exportable.py	0	implement this method to return a set of output names disabled for export	STRING
HIGH	nemo/collections/asr/modules/conv_asr.py	0	implement this method to return a set of output names disabled for export	STRING
HIGH	nemo/collections/asr/modules/rnnt.py	0	implement this method to return a set of output names disabled for export	STRING
HIGH	nemo/collections/tts/models/fastpitch.py	0	implement this method to return a set of output names disabled for export	STRING
HIGH	nemo/core/classes/mixins/adapter_mixins.py	0	add an adapter module to this model. args: name: a globally unique name for the adapter. will be used to access, enable	STRING
HIGH	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	0	add an adapter module to this model. args: name: a globally unique name for the adapter. will be used to access, enable	STRING
HIGH	…llections/tts/parts/mixins/fastpitch_adapter_mixins.py	0	add an adapter module to this model. args: name: a globally unique name for the adapter. will be used to access, enable	STRING
HIGH	nemo/core/classes/mixins/adapter_mixins.py	0	checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat	STRING
HIGH	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	0	checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat	STRING
HIGH	…llections/tts/parts/mixins/fastpitch_adapter_mixins.py	0	checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat	STRING
HIGH	nemo/core/classes/mixins/adapter_mixins.py	0	updated the internal adapter config, determining if an adapter (or all adapters) are either enabled or disabled. a commo	STRING
HIGH	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	0	updated the internal adapter config, determining if an adapter (or all adapters) are either enabled or disabled. a commo	STRING
HIGH	…llections/tts/parts/mixins/fastpitch_adapter_mixins.py	0	updated the internal adapter config, determining if an adapter (or all adapters) are either enabled or disabled. a commo	STRING
HIGH	nemo/core/classes/mixins/adapter_mixins.py	0	utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod	STRING
HIGH	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	0	utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod	STRING
HIGH	…llections/tts/parts/mixins/fastpitch_adapter_mixins.py	0	utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod	STRING
HIGH	nemo/core/classes/mixins/adapter_mixins.py	0	fastpitch adapter mixin that can augment any encoder module with adapter module support. this mixin class should be used	STRING
HIGH	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	0	fastpitch adapter mixin that can augment any encoder module with adapter module support. this mixin class should be used	STRING
HIGH	…llections/tts/parts/mixins/fastpitch_adapter_mixins.py	0	fastpitch adapter mixin that can augment any encoder module with adapter module support. this mixin class should be used	STRING
HIGH	…ts/voice_agent/pipecat/services/nemo/streaming_diar.py	0	configuration parameters for diarization inference.	STRING
HIGH	…asks/diarization/neural_diarizer/e2e_diarize_speech.py	0	configuration parameters for diarization inference.	STRING
HIGH	nemo/collections/asr/parts/mixins/diarization.py	0	configuration parameters for diarization inference.	STRING
HIGH	nemo/agents/voice_agent/pipecat/services/nemo/utils.py	0	update the buffer with the new frame args: frame (frame): frame to update the buffer with	STRING
HIGH	…ence/streaming/buffering/incremental_audio_bufferer.py	0	update the buffer with the new frame args: frame (frame): frame to update the buffer with	STRING
HIGH	…ns/asr/inference/streaming/buffering/audio_bufferer.py	0	update the buffer with the new frame args: frame (frame): frame to update the buffer with	STRING
HIGH	nemo/collections/speechlm2/models/salm_asr_decoder.py	0	returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.	STRING
HIGH	nemo/collections/speechlm2/models/salm_automodel.py	0	returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.	STRING
HIGH	nemo/collections/speechlm2/models/salm.py	0	returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.	STRING
HIGH	nemo/collections/speechlm2/models/salm_asr_decoder.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/speechlm2/models/salm_automodel.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/speechlm2/models/salm.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/speechlm2/models/duplex_s2s_model.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	…ns/speechlm2/models/duplex_s2s_speech_decoder_model.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/asr/models/ssl_models.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/asr/models/asr_eou_models.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/asr/models/aed_multitask_models.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/asr/models/asr_model.py	0	return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	0	return the size of the audio codec codebook including extra speech bos and eos tokens.	STRING
HIGH	nemo/collections/speechlm2/models/duplex_s2s_model.py	0	return the size of the audio codec codebook including extra speech bos and eos tokens.	STRING
HIGH	…ns/speechlm2/models/duplex_s2s_speech_decoder_model.py	0	return the size of the audio codec codebook including extra speech bos and eos tokens.	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	0	indicates start of utterance generation (not start of inference!).	STRING
HIGH	nemo/collections/speechlm2/models/duplex_s2s_model.py	0	indicates start of utterance generation (not start of inference!).	STRING
HIGH	…ns/speechlm2/models/duplex_s2s_speech_decoder_model.py	0	indicates start of utterance generation (not start of inference!).	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	0	indicates start of inference (the very first frame).	STRING
HIGH	nemo/collections/speechlm2/models/duplex_s2s_model.py	0	indicates start of inference (the very first frame).	STRING
HIGH	…ns/speechlm2/models/duplex_s2s_speech_decoder_model.py	0	indicates start of inference (the very first frame).	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	0	text pad id is used as a 'blank' for frames when the model is not speaking and for frames where the model is speaking bu	STRING
HIGH	nemo/collections/speechlm2/models/duplex_s2s_model.py	0	text pad id is used as a 'blank' for frames when the model is not speaking and for frames where the model is speaking bu	STRING
HIGH	…ns/speechlm2/models/duplex_s2s_speech_decoder_model.py	0	text pad id is used as a 'blank' for frames when the model is not speaking and for frames where the model is speaking bu	STRING
HIGH	nemo/collections/speechlm2/parts/metrics/asr_cer_wer.py	0	computes the final score and deallocates asr and partial results.	STRING
HIGH	nemo/collections/speechlm2/parts/metrics/asr_bleu.py	0	computes the final score and deallocates asr and partial results.	STRING
HIGH	nemo/collections/speechlm2/parts/metrics/secs.py	0	computes the final score and deallocates asr and partial results.	STRING
HIGH	nemo/collections/speechlm2/data/s2s_dataset.py	0	strips timestamp tokens from text, e.g. turns: '<|0|> hey <|3|> <|3|> how <|5|> <|7|> are <|8|> <|8|> <|10|> you? <|12|>	STRING
HIGH	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	0	strips timestamp tokens from text, e.g. turns: '<|0|> hey <|3|> <|3|> how <|5|> <|7|> are <|8|> <|8|> <|10|> you? <|12|>	STRING
HIGH	nemo/collections/common/data/lhotse/cutset.py	0	strips timestamp tokens from text, e.g. turns: '<|0|> hey <|3|> <|3|> how <|5|> <|7|> are <|8|> <|8|> <|10|> you? <|12|>	STRING
384 more matches not shown…

Over-Commented Block · 1693 hits · 1671 pts

Severity	File	Line	Snippet	Context
LOW	nemo_dependencies.py	1	#!/usr/bin/env python3	COMMENT
LOW	.pre-commit-config.yaml	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	pyproject.toml	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	.readthedocs.yml	1	# =============================================================================	COMMENT
LOW	setup.py	1	# ! /usr/bin/python	COMMENT
LOW	tools/nemo_forced_aligner/align.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/align_eou.py	1	# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…s/nemo_forced_aligner/tests/test_restore_token_case.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/tests/test_get_utt_obj.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…orced_aligner/tests/test_add_t_start_end_to_utt_obj.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/utils/make_output_manifest.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/utils/constants.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/utils/data_prep.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/utils/make_ass_files.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/nemo_forced_aligner/utils/make_ctm_files.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/speech_data_simulator/multispeaker_simulator.py	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/run_segmentation.sh	1	#!/bin/bash	COMMENT
LOW	tools/ctc_segmentation/run_filter.sh	1	#!/bin/bash	COMMENT
LOW	tools/ctc_segmentation/scripts/prepare_data.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…ols/ctc_segmentation/scripts/get_metrics_and_filter.py	1	# Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/scripts/normalization_helpers.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/scripts/verify_segments.py	1	# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/scripts/cut_audio.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/scripts/utils.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/ctc_segmentation/scripts/run_ctc_segmentation.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/speech_data_explorer/data_explorer.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/customization_dataset_preparation/__init__.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…taset_preparation/customization_dataset_preparation.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…ration/tests/test_customization_dataset_preparation.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	…ls/customization_dataset_preparation/tests/__init__.py	1	# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/rir_corpus_generator/rir_mix_generator.py	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/rir_corpus_generator/rir_corpus_generator.py	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/asr_evaluator/asr_evaluator.py	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	tools/asr_evaluator/utils.py	1	# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/package_info.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/constants.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/__init__.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/lightning/base_callback.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/lightning/__init__.py	1	# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/lightning/callback_group.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/lightning/one_logger_callback.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/lightning/base.py	1	# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/__init__.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/connectors/save_restore_connector.py	1	# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/connectors/__init__.py	1	# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/neural_types/elements.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/neural_types/neural_type.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/neural_types/__init__.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/neural_types/comparison.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/neural_types/axes.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/base_config.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/__init__.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/optimizers.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/pytorch_lightning.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/pytorch.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/modelPT.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/hydra_runner.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/schedulers.py	1	# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/templates/__init__.py	1	# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.	COMMENT
LOW	nemo/core/config/templates/model_card.py	1	# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.	COMMENT
1633 more matches not shown…

Decorative Section Separators · 228 hits · 825 pts

Severity	File	Line	Snippet	Context
MEDIUM	.readthedocs.yml	1	# =============================================================================	COMMENT
MEDIUM	.readthedocs.yml	15	# =============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/vllm/salm/backends.py	48	# ── Base backend ────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/speechlm2/vllm/salm/backends.py	79	# ── Transformer backend (Qwen3, etc.) ────────────────────────────────	COMMENT
MEDIUM	nemo/collections/speechlm2/vllm/salm/backends.py	216	# ── Hybrid backend (NemotronH / Mamba+MoE) ──────────────────────────	COMMENT
MEDIUM⚡	nemo/collections/speechlm2/vllm/salm/backends.py	304	# ── Factory ─────────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/speechlm2/vllm/salm/audio.py	81	# ── Helpers ─────────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/speechlm2/vllm/salm/audio.py	187	# ── Multimodal contract types ───────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	38	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	40	# ==============================================================================	COMMENT
MEDIUM⚡	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	104	# ==============================================================================	COMMENT
MEDIUM⚡	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	106	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	436	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	438	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	34	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	36	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	95	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	97	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	237	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	239	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/broadcasting.py	55	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/broadcasting.py	57	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/broadcasting.py	183	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/broadcasting.py	185	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/audio/modules/transforms.py	383	# ------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/audio/modules/transforms.py	385	# ------------------------------------------------------------------	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	34	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	36	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	60	# ==============================================================================	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	91	# ─── Type aliases ──────────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	100	# ─── Constants ─────────────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	158	# ─── RTTM / UEM parsing ───────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	316	# ─── UEM manipulation helpers ─────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	543	# ─── Speaker segment timeline ─────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	633	# ─── Bipartite speaker matching ───────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	677	# ─── Per-segment speaker scoring ─────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	796	# ─── Main diarization scoring ─────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	904	# ─── Output formatting ────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	973	# ─── Top-level evaluate ───────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	1113	# ─── DER result wrapper ────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/md_eval.py	1119	# ───────────────────────────────────────────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/asr/metrics/der.py	54	# ─── Lhotse-backed annotation helpers ──────────────────────────────────────	COMMENT
MEDIUM	nemo/collections/tts/parts/utils/tts_dataset_utils.py	463	# =============================================================================	COMMENT
MEDIUM	nemo/collections/tts/parts/utils/tts_dataset_utils.py	465	# =============================================================================	COMMENT
MEDIUM⚡	tests/collections/speaker_tasks/utils/test_der.py	612	# ─── Tests: Multi-file scoring ───────────────────────────────────────────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	68	# ─── Helpers ──────────────────────────────────────────────────────────────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	182	# ─── Tests: md_eval low-level engine ──────────────────────────────────────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	417	# ─── Tests: der.py public API (score_labels_from_rttm_labels) ────────────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	672	# ─── Tests: External-engine-verified values (cross-validated) ────────────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	902	# ─── Tests: regression for no-UEM scoring (parity with external lib) ─────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	1057	# ─── Tests: lhotse-based replacement for the external annotation lib ─────	COMMENT
MEDIUM	tests/collections/speaker_tasks/utils/test_der.py	1549	# ─── Tests: audio_end clipping ────────────────────────────────────────────	COMMENT
MEDIUM	tests/collections/speechlm2/test_salm_automodel_lora.py	123	# ---------------------------------------------------------------------------	COMMENT
MEDIUM	tests/collections/speechlm2/test_salm_automodel_lora.py	125	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_salm_automodel_lora.py	231	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_salm_automodel_lora.py	233	# ---------------------------------------------------------------------------	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_to_hf.py	173	# ──────────────────────────────────────────────────────────────────────	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_to_hf.py	175	# ──────────────────────────────────────────────────────────────────────	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_to_hf.py	188	# ──────────────────────────────────────────────────────────────────────	COMMENT
MEDIUM⚡	tests/collections/speechlm2/test_to_hf.py	190	# ──────────────────────────────────────────────────────────────────────	COMMENT
168 more matches not shown…

Unused Imports · 755 hits · 730 pts

Severity	File	Line	Context
LOW	tools/speech_data_explorer/data_explorer.py	20	CODE
LOW	tools/speech_data_explorer/data_explorer.py	39	CODE
LOW	tools/speech_data_explorer/data_explorer.py	40	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/__init__.py	16	CODE
LOW	nemo/lightning/__init__.py	20	CODE
LOW	nemo/lightning/__init__.py	20	CODE
LOW	nemo/core/__init__.py	15	CODE
LOW	nemo/core/__init__.py	16	CODE
LOW	nemo/core/connectors/save_restore_connector.py	15	CODE
LOW	nemo/core/neural_types/__init__.py	16	CODE
LOW	nemo/core/neural_types/__init__.py	17	CODE
LOW	nemo/core/neural_types/__init__.py	18	CODE
LOW	nemo/core/neural_types/__init__.py	19	CODE
LOW	nemo/core/config/__init__.py	15	CODE
LOW	nemo/core/config/__init__.py	16	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	17	CODE
LOW	nemo/core/config/__init__.py	31	CODE
LOW	nemo/core/config/__init__.py	32	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/config/__init__.py	33	CODE
LOW	nemo/core/classes/__init__.py	16	CODE
LOW	nemo/core/classes/__init__.py	17	CODE
LOW	nemo/core/classes/__init__.py	18	CODE
LOW	nemo/core/classes/__init__.py	20	CODE
LOW	nemo/core/classes/__init__.py	20	CODE
LOW	nemo/core/classes/__init__.py	20	CODE
LOW	nemo/core/classes/__init__.py	20	CODE
695 more matches not shown…

Deep Nesting · 720 hits · 677 pts

Severity	File	Line	Context
LOW	nemo_dependencies.py	27	CODE
LOW	nemo_dependencies.py	44	CODE
LOW	nemo_dependencies.py	90	CODE
LOW	nemo_dependencies.py	115	CODE
LOW	tools/nemo_forced_aligner/align_eou.py	445	CODE
LOW	…orced_aligner/tests/test_add_t_start_end_to_utt_obj.py	260	CODE
LOW	tools/nemo_forced_aligner/utils/data_prep.py	68	CODE
LOW	tools/nemo_forced_aligner/utils/make_ass_files.py	111	CODE
LOW	tools/nemo_forced_aligner/utils/make_ass_files.py	179	CODE
LOW	tools/nemo_forced_aligner/utils/make_ass_files.py	335	CODE
LOW	tools/nemo_forced_aligner/utils/make_ctm_files.py	69	CODE
LOW	tools/ctc_segmentation/scripts/prepare_data.py	125	CODE
LOW	tools/ctc_segmentation/scripts/prepare_data.py	221	CODE
LOW	tools/ctc_segmentation/scripts/prepare_data.py	227	CODE
LOW	tools/ctc_segmentation/scripts/cut_audio.py	50	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	167	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	213	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	267	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	303	CODE
LOW	tools/speech_data_explorer/data_explorer.py	552	CODE
LOW	tools/speech_data_explorer/data_explorer.py	695	CODE
LOW	tools/speech_data_explorer/data_explorer.py	829	CODE
LOW	tools/speech_data_explorer/data_explorer.py	996	CODE
LOW	tools/speech_data_explorer/data_explorer.py	1038	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	63	CODE
LOW	…taset_preparation/customization_dataset_preparation.py	237	CODE
LOW	tools/asr_evaluator/utils.py	37	CODE
LOW	tools/asr_evaluator/utils.py	85	CODE
LOW	tools/asr_evaluator/utils.py	270	CODE
LOW	nemo/lightning/one_logger_callback.py	152	CODE
LOW	nemo/lightning/base.py	55	CODE
LOW	nemo/core/connectors/save_restore_connector.py	53	CODE
LOW	nemo/core/connectors/save_restore_connector.py	98	CODE
LOW	nemo/core/connectors/save_restore_connector.py	290	CODE
LOW	nemo/core/connectors/save_restore_connector.py	362	CODE
LOW	nemo/core/connectors/save_restore_connector.py	455	CODE
LOW	nemo/core/neural_types/elements.py	99	CODE
LOW	nemo/core/neural_types/neural_type.py	66	CODE
LOW	nemo/core/neural_types/neural_type.py	93	CODE
LOW	nemo/core/neural_types/neural_type.py	179	CODE
LOW	nemo/core/neural_types/axes.py	60	CODE
LOW	nemo/core/config/hydra_runner.py	53	CODE
LOW	nemo/core/config/hydra_runner.py	69	CODE
LOW	nemo/core/config/hydra_runner.py	71	CODE
LOW	nemo/core/classes/exportable.py	138	CODE
LOW	nemo/core/classes/common.py	118	CODE
LOW	nemo/core/classes/common.py	388	CODE
LOW	nemo/core/classes/common.py	451	CODE
LOW	nemo/core/classes/common.py	539	CODE
LOW	nemo/core/classes/common.py	751	CODE
LOW	nemo/core/classes/common.py	1055	CODE
LOW	nemo/core/classes/modelPT.py	627	CODE
LOW	nemo/core/classes/modelPT.py	794	CODE
LOW	nemo/core/classes/modelPT.py	961	CODE
LOW	nemo/core/classes/modelPT.py	1057	CODE
LOW	nemo/core/classes/modelPT.py	1262	CODE
LOW	nemo/core/classes/modelPT.py	1959	CODE
LOW	nemo/core/classes/modelPT.py	2016	CODE
LOW	nemo/core/classes/mixins/access_mixins.py	77	CODE
LOW	nemo/core/classes/mixins/adapter_mixins.py	125	CODE
660 more matches not shown…

Self-Referential Comments · 104 hits · 302 pts

Severity	File	Line	Snippet	Context
MEDIUM	nemo/core/neural_types/axes.py	86	"""This class represents axis semantics and (optionally) it's dimensionality	STRING
MEDIUM	nemo/core/config/modelPT.py	87	# Create the config builder	STRING
MEDIUM	…e_agent/pipecat/transports/network/websocket_server.py	172	# Create a task to monitor the websocket connection	COMMENT
MEDIUM	nemo/utils/import_utils.py	15	# This file is taken from https://github.com/NVIDIA-NeMo/Curator/blob/dask/nemo_curator/utils/import_utils.py,	COMMENT
MEDIUM	nemo/utils/exp_manager.py	645	# Create the logging directory if it does not exist	COMMENT
MEDIUM	nemo/utils/exp_manager.py	1338	# Create the callback and attach it to trainer	COMMENT
MEDIUM	nemo/utils/metaclasses.py	36	# Create a new object instance - one per class.	COMMENT
MEDIUM	nemo/utils/decorators/deprecated.py	87	# Create a banner	COMMENT
MEDIUM	nemo/collections/speechlm2/models/duplex_ear_tts.py	460	# Create a random dropout decision for each BOS instance	COMMENT
MEDIUM	nemo/collections/speechlm2/models/duplex_ear_tts.py	471	# Create a mask of the same shape as target_text_tokens	COMMENT
MEDIUM	…/collections/speechlm2/parts/metrics/results_logger.py	216	# Create a wav with eou prediction for debug purposes	COMMENT
MEDIUM⚡	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	94	# Create a range tensor from 0 to max_length - 1	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	146	# Create the window tensor on the same device as the waveform.	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	158	# Create a range tensor from 0 to max_length - 1	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	331	# Create a new, dense character vocabulary sorted by the original token ID	COMMENT
MEDIUM	nemo/collections/speechlm2/modules/ear_tts_model.py	834	# Create a padded tensor for the character IDs	COMMENT
MEDIUM	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	715	# Create a deepcopy and update duration	COMMENT
MEDIUM	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	773	# Create a zero tensor of shape [T] (assuming mono audio)	COMMENT
MEDIUM	nemo/collections/common/prompts/canary.py	74	# This method handles a level of indirection for Canary.	COMMENT
MEDIUM	nemo/collections/common/prompts/canary2.py	104	# This method handles a level of indirection for Canary.	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/cutset.py	1504	# Create a stream for each dataset.	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/cutset.py	1671	# Create a new Recording with the extended audio	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/cutset.py	1706	# Create a Recording from the silence audio	COMMENT
MEDIUM	nemo/collections/audio/parts/submodules/diffusion.py	160	"""This class implements the Ornstein-Uhlenbeck SDE with variance exploding noise schedule.	STRING
MEDIUM	nemo/collections/audio/parts/submodules/diffusion.py	512	# Create a copy of SDE	COMMENT
MEDIUM	…o/collections/audio/parts/submodules/conformer_unet.py	159	# Create the self-attention and padding masks	COMMENT
MEDIUM	…lections/audio/parts/submodules/schroedinger_bridge.py	415	# Create a copy of the noise schedule	COMMENT
MEDIUM	nemo/collections/audio/data/data_simulation.py	594	# Create a radom set of microphones	COMMENT
MEDIUM	nemo/collections/audio/data/data_simulation.py	1685	# Define a window around the direct path delay	COMMENT
MEDIUM	nemo/collections/asr/losses/rnnt_pytorch.py	179	"""This function implements Equation 7 in the TDT paper https://arxiv.org/pdf/2304.06795.pdf,	STRING
MEDIUM	nemo/collections/asr/models/aed_multitask_models.py	1196	# This method is a legacy helper for Canary that checks whether prompt slot values were provided	COMMENT
MEDIUM	nemo/collections/asr/parts/features.py	34	# This file contains code artifacts adapted from https://github.com/ryanleary/patter	COMMENT
MEDIUM⚡	nemo/collections/asr/parts/mixins/transcription.py	301	# Create a results of the same type as each element in processed_outputs	COMMENT
MEDIUM⚡	nemo/collections/asr/parts/mixins/transcription.py	308	# Create a results of the same type as each element in processed_outputs	COMMENT
MEDIUM⚡	nemo/collections/asr/parts/mixins/transcription.py	316	# Create a results of the same type as each element in processed_outputs	COMMENT
MEDIUM	nemo/collections/asr/parts/mixins/transcription.py	382	# Create a DataLoader if not already present	COMMENT
MEDIUM	nemo/collections/asr/parts/mixins/diarization.py	255	# Create a results of the same type as each element in processed_outputs	COMMENT
MEDIUM	nemo/collections/asr/parts/mixins/diarization.py	262	# Create a results of the same type as each element in processed_outputs	COMMENT
MEDIUM	nemo/collections/asr/parts/mixins/diarization.py	323	# Create a DataLoader if not already present	COMMENT
MEDIUM	nemo/collections/asr/parts/utils/numba_utils.py	33	# Create an empty output array	COMMENT
MEDIUM⚡	…llections/asr/parts/utils/multispk_transcribe_utils.py	1227	# Initialize the instance manager with the batch size of the chunk audio.	COMMENT
MEDIUM	…llections/asr/parts/utils/multispk_transcribe_utils.py	1114	# Initialize the instance manager with the batch size of the chunk audio.	COMMENT
MEDIUM	nemo/collections/asr/parts/utils/transcribe_utils.py	245	# Create a preprocessor to convert audio samples into raw features,	COMMENT
MEDIUM	nemo/collections/asr/parts/utils/diarization_utils.py	189	# Create a list containing string formatted transcript	COMMENT
MEDIUM	nemo/collections/asr/parts/utils/diarization_utils.py	554	# Create a split segment and add it to the corresponding interval	COMMENT
MEDIUM	nemo/collections/asr/parts/utils/diarization_utils.py	1139	# Create a transscript information json dictionary from the output variables	COMMENT
MEDIUM	…ections/asr/parts/utils/batched_beam_decoding_utils.py	971	# Create a range tensor: [0, 1, 2, ..., max_other_len-1]	COMMENT
MEDIUM	nemo/collections/asr/parts/submodules/spectr_augment.py	199	# Create a mask_tensor with all the indices.	COMMENT
MEDIUM	nemo/collections/asr/parts/submodules/spectr_augment.py	206	# Create a final mask that aligns with the full tensor	COMMENT
MEDIUM	…llections/asr/parts/submodules/multi_head_attention.py	616	# Create a helper tensor to find the local indices of global attention	COMMENT
MEDIUM	nemo/collections/asr/parts/preprocessing/features.py	34	# This file contains code artifacts adapted from https://github.com/ryanleary/patter	COMMENT
MEDIUM	nemo/collections/asr/parts/preprocessing/segment.py	34	# This file contains code artifacts adapted from https://github.com/ryanleary/patter	COMMENT
MEDIUM	nemo/collections/asr/parts/preprocessing/perturb.py	34	# This file contains code artifacts adapted from https://github.com/ryanleary/patter	COMMENT
MEDIUM	nemo/collections/asr/parts/preprocessing/perturb.py	1354	"""This function is used to iterate through utterances with different offsets for each file."""	STRING
MEDIUM	nemo/collections/asr/inference/utils/context_manager.py	170	# Create a dummy context with None values	COMMENT
MEDIUM	nemo/collections/asr/inference/utils/bpe_decoder.py	139	# Create a text segment	COMMENT
MEDIUM	nemo/collections/asr/inference/utils/manifest_io.py	135	# Create a mapping of audio filepaths to their index in the manifest	COMMENT
MEDIUM	nemo/collections/asr/inference/utils/manifest_io.py	140	# Define an order of the audio filepaths	COMMENT
MEDIUM	…ctions/asr/inference/streaming/framing/multi_stream.py	176	# Create a new stream	COMMENT
MEDIUM	nemo/collections/asr/modules/conformer_encoder.py	705	# Create the self-attention and padding masks	COMMENT
44 more matches not shown…

Docstring Block Structure · 60 hits · 300 pts

Severity	File	Line	Snippet	Context
HIGH	nemo/utils/dependency.py	61	Import an optional dependency, raising a clear error if it is not installed. Args: module_name: The module	STRING
HIGH	nemo/collections/speechlm2/models/duplex_ear_tts.py	1108	Returns a dictionary of initial inputs for inference, using registered buffers. Args: B (i	STRING
HIGH	nemo/collections/speechlm2/models/nemotron_voicechat.py	435	Runs full offline duplex speech-to-speech inference. This method performs: 1. Streaming S	STRING
HIGH	nemo/collections/speechlm2/parts/optim_setup.py	170	Utility used to freeze select model parameters, and skip them for the purpose of initializing an optimizer's pa	STRING
HIGH	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	173	Converts a spectrogram back into a waveform using the overlap-add method. This function is an approximate inver	STRING
HIGH	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	362	Computes a Mel-scaled spectrogram from an audio waveform. This function transforms a standard spectrogram into	STRING
HIGH	nemo/collections/speechlm2/data/salm_dataset.py	48	A dataset for Speech-Augmented Language Models (SALM) that processes multimodal conversations containing both t	STRING
HIGH	nemo/collections/speechlm2/data/s2s_dataset.py	29	A dataset for duplex speech-to-speech models that handles bidirectional conversations. This dataset processes	STRING
HIGH	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	34	A dataset for duplex speech-to-speech models that handles bidirectional conversations. This dataset processes	STRING
HIGH	…ctions/common/tokenizers/huggingface/auto_tokenizer.py	236	Adds a dictionary of special tokens (eos, pad, cls...). If special tokens are NOT in the vocabulary, they are	STRING
HIGH	nemo/collections/common/callbacks/ema.py	168	EMAOptimizer is a wrapper for torch.optim.Optimizer that computes Exponential Moving Average of parameters regi	STRING
HIGH	nemo/collections/common/parts/preprocessing/manifest.py	48	Iterate through json lines of provided manifests. NeMo ASR pipelines often assume certain manifest files structure.	STRING
HIGH	nemo/collections/common/parts/preprocessing/parsers.py	229	Creates parser from labels, set of arguments and concise parser name. Args: labels: List of labels to alloc	STRING
HIGH	nemo/collections/asr/metrics/md_eval.py	255	Parse a UEM (Un-partitioned Evaluation Map) file. Args: uem_file: Path to the UEM file. If ``None``, return	STRING
HIGH	nemo/collections/asr/models/sortformer_diar_models.py	748	One-step forward pass for diarization inference in streaming mode. Args: processed_signal	STRING
HIGH	nemo/collections/asr/models/rnnt_models.py	193	Helper method to extract the rnnt loss name, and potentially its kwargs to be passed. Args:	STRING
HIGH	nemo/collections/asr/parts/utils/transcribe_utils.py	348	Prepare audio data for transcription. Args: cfg (DictConfig): Configuration dictionary containing the f	STRING
HIGH	nemo/collections/asr/parts/utils/asr_batching.py	207	Instantiates a Semi Sorted (Batch) Sampler. Args: model: ASR Model. dataset: Dataset which all	STRING
HIGH	nemo/collections/asr/parts/utils/speaker_utils.py	777	Combine overlaps with floating point numbers. Since neighboring integers are considered as continuous range, we	STRING
HIGH	…ections/asr/parts/utils/batched_beam_decoding_utils.py	920	Merge two batched beam hypotheses structures by concatenating transcripts. Used for streaming/chunked i	STRING
HIGH	…ons/asr/parts/submodules/rnnt_maes_batched_computer.py	384	Combines acoustic model log probabilities with language model scores based on the specified blank LM score mode	STRING
HIGH	…ons/asr/parts/submodules/rnnt_maes_batched_computer.py	412	Performs top-k selection and pruning for language model (LM) and automatic speech recognition (ASR) outputs	STRING
HIGH	nemo/collections/asr/parts/submodules/tdnn_attention.py	26	Statistics and time average pooling (TAP) layer This computes mean and, optionally, standard deviation statistics a	STRING
HIGH	…o/collections/asr/inference/pipelines/base_pipeline.py	521	Resolve language_code to a strict prompt index; raise if invalid. Args: language_code: (str	STRING
HIGH	…o/collections/asr/inference/pipelines/base_pipeline.py	554	Build prompt vectors for a batch of states using one-hot encoding. Args: states: (list) Lis	STRING
HIGH	nemo/collections/asr/inference/nmt/llm_translator.py	110	Setup device for the LLM model. Args: device: (str) device to run the model on	STRING
HIGH	nemo/collections/asr/inference/nmt/llm_translator.py	139	Returns prompt template for the LLM model. Args: model_name: (str) name of the model to get	STRING
HIGH	nemo/collections/asr/inference/nmt/llm_translator.py	156	Load NMT model in vLLM format. Args: llm_params: (dict) parameters for the LLM model	STRING
HIGH	nemo/collections/asr/data/audio_to_text_dataset.py	930	Normalize manifest or tarred audio file paths into a ``ListConfig`` of lists. Handles string inputs (comma-sep	STRING
HIGH	nemo/collections/asr/data/audio_to_text_dataset.py	963	Chain multiple bucketed datasets using the specified bucketing strategy. When multiple datasets are provided (	STRING
HIGH	nemo/collections/asr/data/audio_to_text_dataset.py	1015	Calculate per-bucket batch sizes for adaptive bucketing. Supports two modes: linear scaling (integer ``bucketi	STRING
HIGH	nemo/collections/tts/models/magpietts.py	941	Normalize speaker_indices to a tensor of shape (batch_size,). Args: speaker_indices: Speaker select	STRING
HIGH	nemo/collections/tts/models/magpietts.py	993	Get baked context embeddings for a batch, with per-element speaker selection. Args: batch_size: Num	STRING
HIGH	nemo/collections/tts/models/magpietts.py	1362	Convert attention probability matrices to numpy images for logging. Args: attention_prob_m	STRING
HIGH	nemo/collections/tts/models/magpietts.py	1404	Decode audio codes to waveforms and convert to numpy arrays for logging. Args: logits: Mod	STRING
HIGH	nemo/collections/tts/models/magpietts.py	1886	Prepare all context tensors for the decoder. This method orchestrates text encoding, context extraction, and mo	STRING
HIGH	nemo/collections/tts/models/magpietts.py	3719	Generate speech from raw text transcript. This is a convenience method for single-utterance text-to-sp	STRING
HIGH	nemo/collections/tts/parts/utils/tts_dataset_utils.py	484	Split a paragraph into sentences based on sentence-ending punctuation. Sentence separators are chosen from the	STRING
HIGH	nemo/collections/tts/parts/utils/tts_dataset_utils.py	779	Unified text chunking for inference: returns single chunk if below threshold, multiple sentence chunks if above	STRING
HIGH	…o/collections/tts/modules/magpietts_inference/utils.py	314	Load a MagpieTTS model from checkpoint or NeMo archive. Supports two loading modes: 1. Checkpoint mode: hparams	STRING
HIGH	…o/collections/tts/modules/magpietts_inference/utils.py	397	Load an EasyMagpieTTSInferenceModel (decoder-only) from checkpoint or NeMo archive. Uses the inference-only base cl	STRING
HIGH	scripts/asr_language_modeling/ngram_lm/ngram_merge.py	167	Calculates perplexity of a given ngram model on a test file. Args: ngram_mod (str): The pa	STRING
HIGH	scripts/asr_language_modeling/ngram_lm/ngram_merge.py	201	Converts an ngram model in binary format to ARPA format. Args: - ngram_mod (str): The path to	STRING
HIGH	scripts/asr_language_modeling/ngram_lm/ngram_merge.py	342	Function: make_symbol_list Create a symbol table for the input tokenizer model file. Args: nemo_m	STRING
HIGH	…ognition/partial_conversion_to_tarred_audio_dataset.py	66	Selects and returns a subset of shards from the tarred manifest file. Args: manifest_filepath (str): T	STRING
HIGH	…ognition/partial_conversion_to_tarred_audio_dataset.py	136	Creates tarred shards based on the provided configuration. Args: cfg (PartialASRTarredDatasetConfig):	STRING
HIGH	…/speech_recognition/convert_to_tarred_audio_dataset.py	180	Creates a new tarred dataset from a given manifest file. Args: manifest_path (str): Path t	STRING
HIGH	…/speech_recognition/convert_to_tarred_audio_dataset.py	359	Creates a concatenated tarred dataset from the base manifest and additional manifest files. Args:	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	83	Create sample metadata from one filewise metrics item. Args: item: One entry from the filewise metr	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	167	Create benchmark data by discovering benchmark artifacts in storage. Args: benchmark_name: Name of	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	297	Create bucket data by discovering benchmark artifacts in storage. Args: bucket_name: Display name o	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	370	Return the aggregated value of a metric for one benchmark. Args: metric_name: Name of the metric to	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	449	Return filewise samples for a metric from one or all benchmarks. Args: metric_name: Name of the met	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	469	Return generated audio file paths for a benchmark. Args: benchmark_name: Name of the benchmark.	STRING
HIGH	scripts/tts_comparison_report/reporting/models.py	495	Return sample metadata for a benchmark derived from filewise metrics. Args: benchmark_name: Name of	STRING
HIGH	scripts/tts_comparison_report/reporting/orchestrator.py	401	Generate evaluation reports, upload report artifacts to S3, and return report URLs. This method performs the fu	STRING
HIGH	…s_comparison_report/reporting/components/stat_tests.py	92	Run statistical tests for all distribution metrics. Args: bucket_baseline: Baseline bucket data. bu	STRING
HIGH	…comparison_report/reporting/components/audio_report.py	70	Prepare audio pairs for the selected benchmarks. Args: bucket_baseline: Baseline bucket data. bucke	STRING
HIGH	…omparison_report/reporting/components/metrics_table.py	64	Prepare formatted metric rows for one benchmark comparison table. Args: benchmark_name: Name of the benchma	STRING
HIGH	…omparison_report/reporting/components/metrics_table.py	105	Prepare formatted metric rows for the summary comparison table. Args: bucket_baseline: Baseline bucket data	STRING

AI Structural Patterns · 263 hits · 256 pts

Severity	File	Line	Context
LOW	tools/ctc_segmentation/scripts/prepare_data.py	125	CODE
LOW	tools/speech_data_explorer/data_explorer.py	996	CODE
LOW	nemo/core/classes/exportable.py	60	CODE
LOW	nemo/core/classes/exportable.py	138	CODE
LOW	nemo/core/classes/common.py	988	CODE
LOW	nemo/core/classes/mixins/adapter_mixins.py	606	CODE
LOW	nemo/core/classes/mixins/hf_io_mixin.py	118	CODE
LOW	nemo/core/classes/mixins/adapter_mixin_strategies.py	77	CODE
LOW	nemo/core/classes/mixins/adapter_mixin_strategies.py	185	CODE
LOW	nemo/core/classes/mixins/adapter_mixin_strategies.py	219	CODE
LOW	nemo/core/optim/adan.py	80	CODE
LOW	nemo/core/optim/adafactor.py	62	CODE
LOW	nemo/core/optim/novograd.py	48	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	55	CODE
LOW	…nts/voice_agent/pipecat/services/nemo/streaming_asr.py	43	CODE
LOW	…gents/voice_agent/pipecat/services/nemo/turn_taking.py	45	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	498	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	309	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/stt.py	71	CODE
LOW	nemo/utils/export_utils.py	420	CODE
LOW	nemo/utils/export_utils.py	453	CODE
LOW	nemo/utils/timers.py	164	CODE
LOW	nemo/utils/callbacks/cuda_graph.py	139	CODE
LOW	nemo/utils/callbacks/nemo_model_checkpoint.py	47	CODE
LOW	nemo/utils/callbacks/dist_ckpt_io.py	223	CODE
LOW	nemo/collections/speechlm2/vllm/salm/config.py	70	CODE
LOW	nemo/collections/speechlm2/models/nemotron_voicechat.py	421	CODE
LOW	nemo/collections/speechlm2/parts/parallel.py	117	CODE
LOW	nemo/collections/speechlm2/parts/parallel.py	200	CODE
LOW	…/collections/speechlm2/parts/metrics/results_logger.py	158	CODE
LOW	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	704	CODE
LOW	nemo/collections/speechlm2/modules/ear_tts_vae_codec.py	785	CODE
LOW	nemo/collections/speechlm2/modules/ear_tts_model.py	760	CODE
LOW	nemo/collections/speechlm2/modules/ear_tts_model.py	1195	CODE
LOW	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	125	CODE
LOW	…tions/common/losses/latent_speaker_supervision_loss.py	89	CODE
LOW	…llections/common/tokenizers/sentencepiece_tokenizer.py	464	CODE
LOW	nemo/collections/common/tokenizers/tokenizer_utils.py	154	CODE
LOW	nemo/collections/common/tokenizers/char_tokenizer.py	89	CODE
LOW	nemo/collections/common/tokenizers/char_tokenizer.py	443	CODE
LOW	…mo/collections/common/tokenizers/tiktoken_tokenizer.py	105	CODE
LOW	…mo/collections/common/tokenizers/tiktoken_tokenizer.py	282	CODE
LOW	…ctions/common/tokenizers/huggingface/auto_tokenizer.py	35	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	429	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	543	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	782	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	932	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	1109	CODE
LOW	…ons/common/tokenizers/text_to_speech/tts_tokenizers.py	1223	CODE
LOW	nemo/collections/common/parts/utils.py	215	CODE
LOW	nemo/collections/common/parts/rnn.py	25	CODE
LOW	nemo/collections/common/parts/rnn.py	295	CODE
LOW	nemo/collections/common/data/dataset.py	47	CODE
LOW	nemo/collections/common/data/dataset.py	333	CODE
LOW	nemo/collections/audio/losses/audio.py	220	CODE
LOW	nemo/collections/audio/losses/audio.py	314	CODE
LOW	nemo/collections/audio/parts/utils/transforms.py	98	CODE
LOW	nemo/collections/audio/parts/utils/transforms.py	211	CODE
LOW	nemo/collections/audio/parts/submodules/diffusion.py	498	CODE
LOW	nemo/collections/audio/parts/submodules/multichannel.py	182	CODE
203 more matches not shown…

Excessive Try-Catch Wrapping · 214 hits · 241 pts

Severity	File	Line	Snippet	Context
LOW	nemo_dependencies.py	67	except Exception as e:	CODE
MEDIUM	nemo_dependencies.py	68	print(f"Error analyzing {file_path}: {e}")	CODE
LOW	tools/ctc_segmentation/scripts/prepare_data.py	121	except Exception as e:	CODE
MEDIUM	tools/ctc_segmentation/scripts/prepare_data.py	90	def _load_sox_transformer():	CODE
LOW	…ols/ctc_segmentation/scripts/get_metrics_and_filter.py	188	except Exception as e:	CODE
LOW	tools/ctc_segmentation/scripts/utils.py	131	except Exception as e:	CODE
LOW	tools/ctc_segmentation/scripts/run_ctc_segmentation.py	175	except Exception as e:	CODE
LOW	tools/speech_data_explorer/data_explorer.py	520	except Exception as e:	CODE
LOW	tools/speech_data_explorer/data_explorer.py	2793	except Exception as ex:	CODE
LOW	tools/speech_data_explorer/data_explorer.py	2855	except Exception as ex:	CODE
LOW	tools/speech_data_explorer/data_explorer.py	2880	except Exception as ex:	CODE
LOW	tools/speech_data_explorer/data_explorer.py	2907	except Exception as ex:	CODE
LOW	nemo/lightning/callback_group.py	93	except Exception:	CODE
LOW	nemo/core/connectors/save_restore_connector.py	763	except Exception as e:	CODE
LOW⚡	nemo/core/classes/common.py	129	except Exception:	CODE
LOW⚡	nemo/core/classes/common.py	133	except Exception as e2:	CODE
LOW	nemo/core/classes/common.py	601	except Exception:	CODE
LOW	nemo/core/classes/common.py	733	except Exception:	CODE
LOW	nemo/core/classes/common.py	791	except Exception as e:	CODE
LOW	nemo/core/classes/common.py	808	except Exception as e:	CODE
LOW	nemo/core/classes/modelPT.py	769	except Exception as e:	CODE
LOW	nemo/core/utils/cuda_python_utils.py	239	except Exception:	CODE
LOW	nemo/core/utils/numba_utils.py	143	except Exception:	CODE
LOW	nemo/agents/voice_agent/utils/config_manager.py	71	except Exception as e:	CODE
LOW	…o/agents/voice_agent/utils/tool_calling/basic_tools.py	54	except Exception as e:	CODE
LOW	…e_agent/pipecat/transports/network/websocket_server.py	191	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	176	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	181	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	210	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	241	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	223	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	279	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	383	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	393	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	496	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	549	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	608	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	738	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	777	except Exception as e:	CODE
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	808	except Exception as e:	CODE
LOW⚡	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	176	except Exception as e:	CODE
LOW⚡	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	181	except Exception as e:	CODE
LOW⚡	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	184	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	388	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	398	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/tts.py	590	except Exception as e:	CODE
LOW	…ts/voice_agent/pipecat/services/nemo/streaming_diar.py	203	except Exception as e:	CODE
MEDIUM	…ts/voice_agent/pipecat/services/nemo/streaming_diar.py	204	print(f"Error in diarizer streaming step: {e}")	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	174	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	275	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	370	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	593	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	687	except Exception as e:	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/stt.py	258	except Exception as e:	CODE
LOW	nemo/utils/export_utils.py	265	except Exception: # there may ne size mismatch and it may be OK	CODE
LOW	nemo/utils/export_utils.py	287	except Exception: # there may be size mismatch and it may be OK	CODE
LOW	nemo/utils/export_utils.py	400	except Exception:	CODE
LOW	nemo/utils/cast_utils.py	118	except Exception:	CODE
LOW	nemo/utils/env_var_parsing.py	105	except Exception:	CODE
LOW	nemo/utils/cloud.py	145	except Exception as e:	CODE
154 more matches not shown…
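
Most rows above flag a broad `except Exception` handler, and the few MEDIUM rows include a `print` of the error on the line that follows it. The sketch below shows one way such a pair could be matched with a line-by-line regex scan. It is an illustration only: the report does not describe the scanner's actual rules, and the function name `find_broad_excepts`, both patterns, and the severity assignment are hypothetical.

```python
import re

# Hypothetical patterns; the scanner's real rules are not shown in this report.
BROAD_EXCEPT = re.compile(r"^\s*except\s+Exception(\s+as\s+\w+)?\s*:")
PRINT_CALL = re.compile(r"^\s*print\(")


def find_broad_excepts(source: str) -> list[tuple[str, int, str]]:
    """Return (severity, line_number, snippet) tuples; line numbers are 1-indexed."""
    hits = []
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if not BROAD_EXCEPT.match(line):
            continue
        hits.append(("LOW", i + 1, line.strip()))
        # Assumed rule: a print() as the first statement of the handler rates higher.
        if i + 1 < len(lines) and PRINT_CALL.match(lines[i + 1]):
            hits.append(("MEDIUM", i + 2, lines[i + 1].strip()))
    return hits
```

A narrower handler such as `except ValueError:` produces no hit under these patterns, which is why replacing a catch-all with specific exception types would be expected to lower this category's count.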

Modern Structural Boilerplate · 213 hits · 218 pts

Severity	File	Line	Snippet	Context
LOW	tools/ctc_segmentation/scripts/normalization_helpers.py	15	__all__ = ["LATIN_TO_RU", "RU_ABBREVIATIONS"]	CODE
LOW⚡	nemo/lightning/base_callback.py	83	def update_config(self, args, *kwargs) -> None:	CODE
LOW⚡	nemo/lightning/base_callback.py	88	__all__ = ["BaseCallback"]	CODE
LOW	nemo/lightning/__init__.py	35	__all__ = [	CODE
LOW	nemo/lightning/callback_group.py	58	def update_config(self, nemo_version: str, trainer: Any, **kwargs) -> None:	CODE
LOW	nemo/lightning/callback_group.py	185	__all__ = ['CallbackGroup', 'hook_class_init_with_callbacks']	CODE
LOW	nemo/lightning/one_logger_callback.py	34	__all__ = ['OneLoggerNeMoCallback']	CODE
LOW	nemo/lightning/one_logger_callback.py	247	def update_config(self, nemo_version: str, trainer: Trainer, **kwargs) -> None:	CODE
LOW	nemo/lightning/base.py	74	__all__ = ["get_vocab_size", "teardown"]	CODE
LOW	nemo/core/neural_types/elements.py	22	__all__ = [	CODE
LOW	nemo/core/neural_types/neural_type.py	23	__all__ = [	CODE
LOW	nemo/core/neural_types/comparison.py	17	__all__ = ['NeuralTypeComparisonResult']	CODE
LOW	nemo/core/neural_types/axes.py	18	__all__ = ['AxisKindAbstract', 'AxisKind', 'AxisType']	CODE
LOW	nemo/core/config/base_config.py	18	__all__ = ['Config']	CODE
LOW	nemo/core/config/optimizers.py	21	__all__ = [	CODE
LOW	nemo/core/config/pytorch_lightning.py	20	__all__ = ['TrainerConfig']	CODE
LOW	nemo/core/config/pytorch.py	20	__all__ = ['DataLoaderConfig']	CODE
LOW	nemo/core/classes/exportable.py	36	__all__ = ['ExportFormat', 'Exportable']	CODE
LOW	nemo/core/classes/dataset.py	21	__all__ = ['Dataset', 'IterableDataset']	CODE
LOW	nemo/core/classes/loss.py	19	__all__ = ['Loss']	CODE
LOW	nemo/core/classes/common.py	52	__all__ = ['Typing', 'FileIO', 'Model', 'Serialization', 'typecheck', 'PretrainedModelInfo']	CODE
LOW	nemo/core/classes/modelPT.py	45	__all__ = ['ModelPT']	CODE
LOW	nemo/core/classes/module.py	23	__all__ = ['NeuralModule', 'freeze', 'unfreeze']	CODE
LOW	nemo/core/classes/mixins/adapter_mixins.py	403	def set_accepted_adapter_types(self, adapter_types: List[Union[type, str]]) -> None:	CODE
LOW	nemo/core/optim/adafactor.py	28	__all__ = ['Adafactor']	CODE
LOW	nemo/core/optim/optimizers.py	77	__all__ = ['get_optimizer', 'register_optimizer', 'parse_optimizer_args']	CODE
LOW	nemo/core/optim/novograd.py	18	__all__ = ['Novograd']	CODE
LOW	nemo/core/utils/optional_libs.py	20	__all__ = [	CODE
LOW	nemo/agents/voice_agent/pipecat/services/nemo/utils.py	131	def _update_feature_buffer(self, feat_chunk: torch.Tensor) -> None:	CODE
LOW	nemo/utils/nemo_logging.py	30	__all__ = ["Logger", "LogMode"]	CODE
LOW	nemo/utils/env_var_parsing.py	47	__all__ = [	CODE
LOW	nemo/utils/timers.py	25	__all__ = ["NamedTimer", "SimpleTimer"]	CODE
LOW	nemo/utils/import_utils.py	25	logger = logging.getLogger(__name__)	CODE
LOW	nemo/utils/decorators/experimental.py	16	__all__ = ['experimental']	CODE
LOW	nemo/utils/decorators/deprecated.py	16	__all__ = [	CODE
LOW	nemo/utils/decorators/port_docs.py	18	__all__ = [	CODE
LOW	nemo/utils/callbacks/cuda_graph.py	51	__all__ = ["CUDAGraphCallback"]	CODE
LOW	nemo/utils/callbacks/nemo_model_checkpoint.py	495	def set_checkpoint_unfinished_marker(checkpoint_path: Union[Path, str], barrier_after=False) -> None:	CODE
LOW	nemo/utils/formatters/utils.py	21	__all__ = ["check_color_support", "to_unicode"]	CODE
LOW	nemo/utils/formatters/base.py	21	__all__ = ["BaseNeMoFormatter"]	CODE
LOW	nemo/collections/speechlm2/__init__.py	26	__all__ = [	CODE
LOW	nemo/collections/speechlm2/models/__init__.py	23	__all__ = [	CODE
LOW	nemo/collections/speechlm2/streaming/__init__.py	17	__all__ = ['DuplexSTTStreamingInference']	CODE
LOW	nemo/collections/speechlm2/parts/metrics/__init__.py	23	__all__ = [	CODE
LOW	nemo/collections/speechlm2/modules/__init__.py	17	__all__ = [	CODE
LOW	nemo/collections/speechlm2/modules/perception.py	94	def set_activation_checkpointing(self, enabled: bool) -> None:	CODE
LOW	nemo/collections/speechlm2/modules/perception.py	215	def _set_encoder_activation_checkpointing(encoder: nn.Module, enabled: bool) -> None:	CODE
LOW	nemo/collections/speechlm2/modules/perception.py	292	def set_activation_checkpointing(self, enabled: bool) -> None:	CODE
LOW	nemo/collections/speechlm2/data/__init__.py	20	__all__ = [	CODE
LOW	…/collections/common/metrics/classification_accuracy.py	24	__all__ = ['TopKClassificationAccuracy']	CODE
LOW	nemo/collections/common/metrics/perplexity.py	19	__all__ = ['Perplexity']	CODE
LOW	nemo/collections/common/metrics/__init__.py	24	__all__ = [	CODE
LOW	…llections/common/metrics/global_average_loss_metric.py	18	__all__ = ['GlobalAverageLossMetric']	CODE
LOW	…ections/common/metrics/metric_string_to_torchmetric.py	22	__all__ = ['MetricStringToTorchMetric', 'TextMetricsSet', 'ClassificationMetricsSet']	CODE
LOW	nemo/collections/common/losses/cross_entropy.py	22	__all__ = ['CrossEntropyLoss', 'NLLLoss']	CODE
LOW	nemo/collections/common/losses/mse_loss.py	20	__all__ = ['MSELoss']	CODE
LOW	nemo/collections/common/losses/aggregator.py	22	__all__ = ['AggregatorLoss']	CODE
LOW	…tions/common/losses/latent_speaker_supervision_loss.py	22	__all__ = ["LatentSpeakerSupervisionLoss"]	CODE
LOW	nemo/collections/common/losses/multi_similarity_loss.py	24	__all__ = ['MultiSimilarityLoss']	CODE
LOW	nemo/collections/common/losses/spanning_loss.py	20	__all__ = ['SpanningLoss']	CODE
153 more matches not shown…

Redundant / Tautological Comments · 141 hits · 210 pts

Severity	File	Line	Snippet	Context
LOW	tools/speech_data_simulator/conf/data_simulator.yaml	64	add_seg_aug: False # Set True to enable augmentation on each speech segment	CODE
LOW	tools/speech_data_simulator/conf/data_simulator.yaml	72	add_sess_aug: False # Set True to enable audio augmentation on the whole session	CODE
LOW	tools/speech_data_explorer/data_explorer.py	148	# Check if file exists	COMMENT
LOW	nemo/core/connectors/save_restore_connector.py	86	# Check if we are packing the folder into a nemo file	COMMENT
LOW	nemo/core/config/hydra_runner.py	94	# Check if user set the schema.	COMMENT
LOW⚡	nemo/core/classes/common.py	137	# Check if this is a missing dependency issue vs a malicious target	COMMENT
LOW⚡	nemo/core/classes/common.py	145	# Check if the module path is in one of our approved prefixes.	COMMENT
LOW	nemo/core/classes/common.py	488	# Check if keys exists in the defined input types	COMMENT
LOW	nemo/core/classes/common.py	1029	# Check if nemo_model_file_in_cache is a directory	COMMENT
LOW	nemo/core/classes/common.py	1143	# Check if api token exists, use if it does	COMMENT
LOW	nemo/core/classes/common.py	1149	# Check if model exists in HF	COMMENT
LOW	nemo/core/classes/modelPT.py	704	# Check if caller provided optimizer name, default to Adam otherwise	COMMENT
LOW	nemo/core/classes/modelPT.py	722	# Check if caller has optimizer kwargs, default to empty dictionary	COMMENT
LOW	nemo/core/classes/modelPT.py	1364	# Check if model is being resumed or not - only works if `Trainer` is attached to model	COMMENT
LOW	nemo/core/classes/modelPT.py	1550	# Assign trainer to the model	COMMENT
LOW	nemo/core/classes/mixins/adapter_mixins.py	362	# Check if type is supported (if available) and is an enabled adapter	COMMENT
LOW	nemo/core/classes/mixins/adapter_mixins.py	477	# Check if adapter is enabled or not	COMMENT
LOW	nemo/core/classes/mixins/hf_io_mixin.py	108	# Check if api token exists, use if it does	COMMENT
LOW	nemo/core/classes/mixins/adapter_mixin_strategies.py	241	# Check if globally allowed to compute aux loss	COMMENT
LOW	nemo/core/optim/distributed_adam.py	658	# Check if fragment needs to be updated	COMMENT
LOW	nemo/core/utils/process_launcher/launcher.py	269	# Check if all processes are completed or not	COMMENT
LOW	…ice_agent/pipecat/utils/text/simple_text_aggregator.py	74	# Check if the only period is a bullet point (e.g., "1. Alpha" or incomplete "1.")	COMMENT
LOW	…ice_agent/pipecat/utils/text/simple_text_aggregator.py	81	# Check if any of the abbreviations "e.", "i." "g.", "etc." are present in the text	COMMENT
LOW	…ents/voice_agent/pipecat/services/nemo/audio_logger.py	721	# Check if we need to start a new turn or append to existing turn	COMMENT
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	457	# Check if there's already a vLLM process running on the same port and model	COMMENT
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	461	# Check if this process is using the same port and model	COMMENT
LOW	nemo/agents/voice_agent/pipecat/services/nemo/llm.py	574	# Check if process is still running	COMMENT
LOW	nemo/utils/te_utils.py	23	# Check if Transformer Engine has quantized tensor classes	COMMENT
LOW	nemo/utils/exp_manager.py	1404	# Check if cuda is avialable as preemption is supported only on GPUs	COMMENT
LOW	nemo/utils/decorators/deprecated.py	48	# Check if we already warned about that function.	COMMENT
LOW	nemo/utils/callbacks/preemption.py	57	# Check if torch distributed is initialised, required for broadcasting the preemption signal to all the ranks	COMMENT
LOW	nemo/collections/speechlm2/models/duplex_ear_tts.py	1409	# Check if we should use the custom grouping	COMMENT
LOW	nemo/collections/speechlm2/parts/metrics/turn_taking.py	64	# Check if within tolerance	COMMENT
LOW	…o/collections/speechlm2/parts/metrics/mcq_evaluator.py	238	# Check if response is empty	COMMENT
LOW	…o/collections/speechlm2/parts/metrics/mcq_evaluator.py	282	# Check if correct	COMMENT
LOW	nemo/collections/speechlm2/parts/metrics/empty_text.py	49	# Check if hypothesis is empty or only whitespace	COMMENT
LOW	nemo/collections/speechlm2/data/force_align.py	269	# Check if this is a Segment object (has words_and_tokens attribute)	COMMENT
LOW	nemo/collections/speechlm2/data/force_align.py	273	# Check if this is a Word object (has 'text' and timing attributes)	COMMENT
LOW	…o/collections/speechlm2/data/duplex_ear_tts_dataset.py	585	# Check if system prompt exists in custom field	COMMENT
LOW	nemo/collections/common/parts/preprocessing/cleaners.py	220	# Check if there are non-numbers	COMMENT
LOW	nemo/collections/common/parts/preprocessing/cleaners.py	248	# Check if it is a currency	COMMENT
LOW	nemo/collections/common/prompts/formatter.py	343	# Check if the tokenizer is aggregate and perform extra checks.	COMMENT
LOW	nemo/collections/common/data/lhotse/cutset.py	567	# Check if we have any attributes that are propagated downwards to each item in the group.	COMMENT
LOW	nemo/collections/audio/data/data_simulation.py	1306	# Check if a feasible source is found	COMMENT
LOW	nemo/collections/asr/metrics/multitask.py	131	# Check if string is for value assertion or equivalency of values.	COMMENT
LOW	nemo/collections/asr/models/ssl_models.py	565	# Set flag to register tensors	COMMENT
LOW	nemo/collections/asr/models/label_models.py	528	# Check if all outputs are non-empty	COMMENT
LOW	nemo/collections/asr/models/aed_multitask_models.py	579	# Check if only one audio is provided with string	COMMENT
LOW	nemo/collections/asr/models/aed_multitask_models.py	596	# Check if chunking will be enabled	COMMENT
LOW	nemo/collections/asr/models/aed_multitask_models.py	1012	# Check if we have defaults for this role.	COMMENT
LOW	nemo/collections/asr/models/aed_multitask_models.py	1356	# Check if the model_restore_path is already an extracted directory (which happens during restore_from)	COMMENT
LOW⚡	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	89	# Check if default module name is None or not	COMMENT
LOW⚡	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	99	# Check if encoder adapters should be added	COMMENT
LOW⚡	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	106	# Check if module exists	COMMENT
LOW	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	165	# Check if default module name is None or not	COMMENT
LOW	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	174	# Check if encoder adapters should be used	COMMENT
LOW	nemo/collections/asr/parts/mixins/asr_adapter_mixins.py	197	# Check if encoder adapters should be used or are enabled	COMMENT
LOW	nemo/collections/asr/parts/mixins/transcription.py	285	# Check if internal config is valid	COMMENT
LOW	nemo/collections/asr/parts/mixins/transcription.py	365	# Check if internal config is valid	COMMENT
LOW	nemo/collections/asr/parts/mixins/transcription.py	489	# Check if audio is a list of strings (filepaths or manifests)	COMMENT
81 more matches not shown…

Hallucination Indicators · 14 hits · 105 pts

Severity	File	Line	Snippet	Context
CRITICAL	nemo/core/connectors/save_restore_connector.py	124	model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')	STRING
CRITICAL	nemo/core/connectors/save_restore_connector.py	263	model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')	STRING
CRITICAL	nemo/core/connectors/save_restore_connector.py	303	state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './ckpts')	STRING
CRITICAL	nemo/core/connectors/save_restore_connector.py	312	state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from(	STRING
CRITICAL	nemo/core/classes/modelPT.py	462	model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')	STRING
CRITICAL	nemo/core/classes/modelPT.py	1483	state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './ckpts')	STRING
CRITICAL	nemo/core/classes/modelPT.py	1492	state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from(	STRING
CRITICAL	…parts/submodules/aed_decoding/aed_batched_streaming.py	109	pred_tokens_ids, batch_size, _ = self.asr_model.decoding.decoding.greedy_search._prepare_for_search(	CODE
CRITICAL	…parts/submodules/aed_decoding/aed_batched_streaming.py	141	self.asr_model.decoding.decoding.greedy_search._one_step_forward(	CODE
CRITICAL	…parts/submodules/aed_decoding/aed_batched_streaming.py	203	pred_tokens_ids, batch_size, _ = self.asr_model.decoding.decoding.greedy_search._prepare_for_search(	CODE
CRITICAL	…parts/submodules/aed_decoding/aed_batched_streaming.py	231	self.asr_model.decoding.decoding.greedy_search._one_step_forward(	CODE
CRITICAL	nemo/collections/tts/models/fastpitch.py	879	n_speakers = self.fastpitch.speaker_emb.weight.data.size()[0]	CODE
CRITICAL	tests/lightning/test_one_logger_callback.py	101	mock_provider_instance.with_base_config.return_value.with_export_config.return_value.configure_provider.assert_c	CODE
CRITICAL	tests/collections/tts/modules/test_transformer_2501.py	1233	assert moe_ffn.router.router.weight.grad.abs().sum() > 0, "Router weight grad must be non-zero"	CODE
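
Every snippet in this table contains a long dotted attribute chain, for example `self.fastpitch.speaker_emb.weight.data.size()`. The report does not say what triggers the rule, so the sketch below only assumes that chain depth is the signal and shows how that depth could be measured with Python's standard `ast` module. The helper names `attribute_depth` and `max_attribute_depth` are hypothetical.

```python
import ast


def attribute_depth(node: ast.AST) -> int:
    """Count chained attribute accesses, looking through calls such as a.b().c."""
    depth = 0
    while True:
        if isinstance(node, ast.Attribute):
            depth += 1
            node = node.value
        elif isinstance(node, ast.Call):
            node = node.func
        else:
            return depth


def max_attribute_depth(source: str) -> int:
    """Return the deepest attribute chain found anywhere in the source."""
    tree = ast.parse(source)
    depths = (attribute_depth(n) for n in ast.walk(tree) if isinstance(n, ast.Attribute))
    return max(depths, default=0)
```

Under this assumption, the docstring examples and the test assertions in the table would score the same as production code, which is consistent with the STRING and CODE contexts both appearing here. A deep chain is not evidence of a nonexistent API on its own.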

Slop Phrases · 35 hits · 100 pts

Severity	File	Line	Snippet	Context
LOW	nemo/collections/asr/losses/ctc.py	45	# Don't forget to properly call base constructor	COMMENT
LOW	examples/asr/slurm_example.sh	34	CONTAINER=nvcr.io/nvidia/nemo:25.02.rc4 # Adjust to your needs. and make sure you have ngc key in ~/.config/enroot/.cred	CODE
MEDIUM	examples/tts/conf/fastpitch_ssl.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch_align_ipa.yaml	2	# If you want to train a model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch_align_44100_adapter.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/aligner.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch_align_44100.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch_align_v1.05.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch_align_ipa_adapter.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/hifigan_dataset/hifigan_44100.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/hifigan_dataset/hifigan_22050.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/hifigan/hifigan.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/hifigan/hifigan_44100.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch/fastpitch_44100.yaml	2	# If you want to train a model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/fastpitch/fastpitch_22050.yaml	2	# If you want to train a model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/mel_codec_22050.yaml	3	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	…conf/audio_codec/audio_codec_low_frame_rate_22050.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/mel_codec_44100.yaml	3	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/audio_codec_44100.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/audio_codec_22050.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/audio_codec_16000.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/audio_codec_24000.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/audio_codec/encodec_24000.yaml	2	# If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	…es/tts/conf/zh/fastpitch_align_multispeaker_22050.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/zh/fastpitch_align_22050.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/de/fastpitch_align_44100_phoneme.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/de/fastpitch_align_22050_mix.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	…amples/tts/conf/de/fastpitch_align_44100_grapheme.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	…amples/tts/conf/de/fastpitch_align_22050_grapheme.yaml	2	# rate. If you want to train model on other dataset, you can change config values according to your dataset.	COMMENT
MEDIUM	examples/tts/conf/es/fastpitch_align_44100_ipa.yaml	2	# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according	COMMENT
MEDIUM	…mples/tts/conf/es/fastpitch_align_44100_ipa_multi.yaml	2	# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according	COMMENT
MEDIUM	examples/tts/conf/es/fastpitch_align_44100.yaml	2	# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according	COMMENT
MEDIUM	scripts/tokenizers/process_asr_text_tokenizer.py	37	# In either case, you can add commas to concatenate different manifests or different data files.	COMMENT
MEDIUM	…pts/dataset_processing/process_speech_commands_data.py	463	f'\n<<NOTE>> Duration computation was skipped for demonstration purposes on Colaboratory.\n'	CODE
MEDIUM	scripts/installers/install_opengrm.sh	19	# Alternatively, in the Linux Debian you can use: sudo apt install libngram-tools	COMMENT

Verbosity Indicators · 49 hits · 93 pts

Severity	File	Line	Snippet	Context
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	344	# Step 1: Each rank saves its own results with rank suffix	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	352	# Step 2: Synchronize all ranks before merging	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	356	# Step 3: Only rank 0 merges all results and computes final metrics	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	433	# Step 4: Broadcast metrics from rank 0 to all other ranks	COMMENT
LOW	…tions/common/losses/latent_speaker_supervision_loss.py	159	# Step 1: Identify speaker token positions and their speaker indices.	COMMENT
LOW⚡	…tions/common/losses/latent_speaker_supervision_loss.py	170	# Step 2: Forward-fill using cummax on position indices.	COMMENT
LOW⚡	…tions/common/losses/latent_speaker_supervision_loss.py	178	# Step 3: Gather the speaker index at the last speaker position.	COMMENT
LOW	nemo/collections/asr/models/online_diarizer.py	546	# Step 1: Get subsegments for embedding extraction.	COMMENT
LOW⚡	nemo/collections/asr/models/online_diarizer.py	585	# Step 4: Generate RTTM style diarization labels from segment ranges and cluster labels	COMMENT
LOW⚡	…llections/asr/parts/utils/multispk_transcribe_utils.py	1233	# Step 2: diarize or get GT rttms	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1251	# Step 3: update diar states	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1270	# Step 4: find active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1285	# Step 5: generate instance for active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1301	# Step 6:	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1315	# Step 7: ASR forward pass for active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1337	# Step 8: update ASR states	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1352	# Step 9: update seglsts with timestamps	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	689	# Step 1: Initialization	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	699	# Step 2: Get most likely labels for current frame	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	703	# Step 3: Get fusion scores	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	713	# Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	719	# Step 5: Update labels if they initially weren't blank or repeated	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	724	# Step 6: Update fusion states and scores for non-blank and non-repeated labels	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	771	# Step 2: Get most likely labels for current frame	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	776	# Step 3: Get fusion scores	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	785	# Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	791	# Step 5: Update labels if they initially weren't blank or repeated	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	799	# Step 6: Update fusion states and scores for non-blank and non-repeated labels	COMMENT
LOW	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	745	# Step 1: Initialization for fusion models	COMMENT
LOW	nemo/collections/asr/inference/streaming/state/state.py	218	# we need to check if the last token is the same as the first token of the completed output	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1020	# Step 1: Prepare parameters for sentence generation	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1027	# Step 2: Select a speaker	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1040	# Step 3: Generate a sentence	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1044	# Step 4: Generate a timestamp for either silence or overlap	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1541	# Step 1: Prepare parameters for sentence generation	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1548	# Step 2: Select a speaker	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1563	# Step 3: Generate a sentence	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1567	# Step 4: Generate a time-stamp for either silence or overlap	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1062	# Step 6: Build entries for output files	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1132	# Step 7: Normalize and write to disk	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1145	# Step 8: Clean up memory	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1588	# Step 6: Build entries for output files	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1651	# Step 7: Normalize and write to disk	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1906	# Step 1: Encode text input (always needed)	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1909	# Step 2: Get and scale attention prior	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1914	# Step 3: Process context based on model type	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1962	# Step 5: Apply CTC prior layer filtering	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1965	# Step 6: Return typed output	COMMENT
LOW	nemo/collections/tts/models/magpietts.py	1936	# Step 4: Dispatch to model-type-specific handler	COMMENT

Structural Annotation Overuse · 48 hits · 91 pts

Severity	File	Line	Snippet	Context
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	344	# Step 1: Each rank saves its own results with rank suffix	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	352	# Step 2: Synchronize all ranks before merging	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	356	# Step 3: Only rank 0 merges all results and computes final metrics	COMMENT
LOW⚡	…/collections/speechlm2/parts/metrics/results_logger.py	433	# Step 4: Broadcast metrics from rank 0 to all other ranks	COMMENT
LOW	…tions/common/losses/latent_speaker_supervision_loss.py	159	# Step 1: Identify speaker token positions and their speaker indices.	COMMENT
LOW⚡	…tions/common/losses/latent_speaker_supervision_loss.py	170	# Step 2: Forward-fill using cummax on position indices.	COMMENT
LOW⚡	…tions/common/losses/latent_speaker_supervision_loss.py	178	# Step 3: Gather the speaker index at the last speaker position.	COMMENT
LOW	nemo/collections/asr/models/online_diarizer.py	546	# Step 1: Get subsegments for embedding extraction.	COMMENT
LOW⚡	nemo/collections/asr/models/online_diarizer.py	579	# Step 3 - Clustering: Perform an online version of clustering algorithm	COMMENT
LOW⚡	nemo/collections/asr/models/online_diarizer.py	585	# Step 4: Generate RTTM style diarization labels from segment ranges and cluster labels	COMMENT
LOW⚡	…llections/asr/parts/utils/multispk_transcribe_utils.py	1233	# Step 2: diarize or get GT rttms	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1251	# Step 3: update diar states	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1270	# Step 4: find active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1285	# Step 5: generate instance for active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1315	# Step 7: ASR forward pass for active speakers	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1337	# Step 8: update ASR states	COMMENT
LOW	…llections/asr/parts/utils/multispk_transcribe_utils.py	1352	# Step 9: update seglsts with timestamps	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	689	# Step 1: Initialization	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	699	# Step 2: Get most likely labels for current frame	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	703	# Step 3: Get fusion scores	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	713	# Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	719	# Step 5: Update labels if they initially weren't blank or repeated	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	724	# Step 6: Update fusion states and scores for non-blank and non-repeated labels	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	771	# Step 2: Get most likely labels for current frame	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	776	# Step 3: Get fusion scores	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	785	# Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	791	# Step 5: Update labels if they initially weren't blank or repeated	COMMENT
LOW⚡	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	799	# Step 6: Update fusion states and scores for non-blank and non-repeated labels	COMMENT
LOW	…ollections/asr/parts/submodules/ctc_greedy_decoding.py	745	# Step 1: Initialization for fusion models	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1020	# Step 1: Prepare parameters for sentence generation	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1027	# Step 2: Select a speaker	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1040	# Step 3: Generate a sentence	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1044	# Step 4: Generate a timestamp for either silence or overlap	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1541	# Step 1: Prepare parameters for sentence generation	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1548	# Step 2: Select a speaker	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1563	# Step 3: Generate a sentence	COMMENT
LOW⚡	nemo/collections/asr/data/data_simulation.py	1567	# Step 4: Generate a time-stamp for either silence or overlap	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1062	# Step 6: Build entries for output files	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1132	# Step 7: Normalize and write to disk	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1145	# Step 8: Clean up memory	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1588	# Step 6: Build entries for output files	COMMENT
LOW	nemo/collections/asr/data/data_simulation.py	1651	# Step 7: Normalize and write to disk	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1906	# Step 1: Encode text input (always needed)	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1909	# Step 2: Get and scale attention prior	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1914	# Step 3: Process context based on model type	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1962	# Step 5: Apply CTC prior layer filtering	COMMENT
LOW⚡	nemo/collections/tts/models/magpietts.py	1965	# Step 6: Return typed output	COMMENT
LOW	nemo/collections/tts/models/magpietts.py	1936	# Step 4: Dispatch to model-type-specific handler	COMMENT

Cross-Language Confusion · 21 hits · 82 pts

Severity	File	Line	Snippet	Context
HIGH⚡	…ration/tests/test_customization_dataset_preparation.py	268	assert df_dropped_unnecessary_fields.equals(drop_unrequired_fields(df))	CODE
HIGH⚡	…ration/tests/test_customization_dataset_preparation.py	305	assert df_prompt.equals(	CODE
HIGH⚡	…ration/tests/test_customization_dataset_preparation.py	310	assert df_prompt.equals(convert_into_prompt_completion_only(df_prompt))	CODE
HIGH⚡	…ration/tests/test_customization_dataset_preparation.py	326	assert expected_df.equals(warn_and_drop_long_samples(df, 10000)[0])	CODE
HIGH	tools/asr_evaluator/utils.py	45	raise ValueError("decoder_type could only be null, ctc, rnnt or aed")	CODE
HIGH	tools/asr_evaluator/utils.py	119	f"Hybrid models only support rnnt or ctc decoding! Current decoder_type: {cfg.inference.decoder_type	CODE
HIGH	nemo/collections/audio/losses/audio.py	86	min_scale || scale * target - estimate ||^2	STRING
HIGH	nemo/collections/audio/losses/audio.py	136	min_filter || conv(filter, target) - estimate ||^2	STRING
HIGH	nemo/collections/audio/parts/utils/audio.py	441	min_scale || scale * target - estimate ||^2	STRING
HIGH	nemo/collections/audio/parts/utils/audio.py	466	min_filter || conv(filter, target) - estimate ||^2	STRING
HIGH	nemo/collections/asr/parts/utils/vad_utils.py	126	"or a list of {'audio_filepath': i, 'offset': 0, 'duration': null}."	CODE
HIGH	nemo/collections/asr/modules/rnnt_abstract.py	147	Stateful prediction of scores and state for a (possibly null) tokenset.	STRING
HIGH	nemo/collections/asr/modules/rnnt.py	709	Stateful prediction of scores and state for a (possibly null) tokenset.	STRING
HIGH	nemo/collections/asr/data/data_simulation.py	1265	orV_rcv (list or null): Microphone orientations	STRING
HIGH	nemo/collections/asr/data/audio_to_label.py	1335	"duration": null, # not used, will load the whole audio	STRING
HIGH	examples/speechlm2/nemotron_voicechat_eval.py	44	checkpoint_path (str | null)	STRING
HIGH	examples/speechlm2/nemotron_voicechat_eval.py	50	* inference_speaker_reference (str | null): Path to the reference audio used to condition the speaker's voice. S	STRING
HIGH	examples/asr/asr_adapters/train_asr_adapter.py	25	model.adapter.adapter_module_name=<null, or str module. Type: encoder, decoder, joint, or multiple with + between th	STRING
HIGH	examples/asr/asr_adapters/train_asr_adapter.py	51	model.adapter.adapter_module_name=<null, or str module. Type: encoder, decoder, joint, or multiple with + between th	STRING
HIGH	…s/dataset_processing/g2p/convert_cmu_arpabet_to_ipa.py	25	cd NeMo/scripts && python dataset_processing/g2p/convert_cmu_arpabet_to_ipa.py	STRING
HIGH	scripts/installers/setup_os2s_decoders.py	86	+ " >/dev/null 2>/dev/null && rm "	CODE

AI Slop Vocabulary · 35 hits · 66 pts

Severity	File	Line	Snippet	Context
LOW	tools/speech_data_explorer/data_explorer.py	1406	# If using tarred audio, just return the filename as-is.	COMMENT
LOW	nemo/lightning/callback_group.py	162	# If we're already inside a wrapped __init__, just call the original	COMMENT
LOW	nemo/core/connectors/save_restore_connector.py	430	# artifact is optional and we simply return None	COMMENT
LOW	nemo/core/classes/common.py	1353	# If types are not defined, skip type checks and just call the wrapped method	COMMENT
LOW	nemo/core/optim/optimizers.py	143	# If we are provided just a Config object, simply return the dictionary of that object	COMMENT
LOW	nemo/core/optim/optimizers.py	155	# simply return the dictionary that was provided	COMMENT
LOW	nemo/agents/voice_agent/pipecat/services/nemo/diar.py	311	# if diarization is disabled, just pass the frame through	COMMENT
MEDIUM	nemo/collections/speechlm2/models/duplex_ear_tts.py	377	# EOS dropout to make the model more robust	COMMENT
MEDIUM	nemo/collections/speechlm2/models/duplex_ear_tts.py	425	# BOS dropout to make the model more robust	COMMENT
MEDIUM	nemo/collections/speechlm2/models/duplex_ear_tts.py	446	# BOS dropout to make the model more robust	COMMENT
LOW	nemo/collections/speechlm2/modules/speech_generation.py	114	# ToDo: move it to cache to need to just create a 1 frame tensor in inference	COMMENT
LOW	nemo/collections/common/data/dataset.py	464	# if min_monolingual fires, it means we will just return a single, original monolingual utterance	COMMENT
LOW	nemo/collections/common/data/lhotse/dataloader.py	839	# Bucket duration bins are provided: just use them.	COMMENT
MEDIUM	nemo/collections/common/data/lhotse/cutset.py	1078	# Normalize for robust matching	COMMENT
LOW	nemo/collections/common/data/lhotse/nemo_adapters.py	355	# just return self.	COMMENT
MEDIUM	nemo/collections/audio/data/audio_to_audio_lhotse.py	68	# TODO: use fault_tolerant=True for robust loading of target	COMMENT
MEDIUM	nemo/collections/audio/data/audio_to_audio_lhotse.py	72	# TODO: use fault_tolerant=True for robust loading of target	COMMENT
LOW	…ctions/asr/models/hybrid_rnnt_ctc_bpe_models_prompt.py	350	# RNNT Path - just use encoded outputs directly	COMMENT
MEDIUM	nemo/collections/asr/parts/mixins/diarization.py	485	# Be robust to callers accidentally passing "an array of arrays" (dtype=object),	COMMENT
MEDIUM⚡	…mo/collections/asr/parts/utils/asr_confidence_utils.py	404	"""Implemented by subclass in order to aggregate token confidence to a word-level confidence.	STRING
LOW	nemo/collections/asr/parts/submodules/jasper.py	143	# simply return symmetric padding for this scenario	COMMENT
MEDIUM	nemo/collections/asr/parts/submodules/ctc_decoding.py	804	# If the exact timestep information is available, utilize the 1st non-ctc blank token timestep	COMMENT
LOW	…/collections/tts/data/text_to_speech_dataset_lhotse.py	378	# If context audio is not available, just use a dummy context_audio_codes	COMMENT
LOW	nemo/collections/tts/data/text_to_speech_dataset.py	598	# If context audio is not available, just use a dummy context_audio_codes	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1546	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1623	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1650	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1751	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1778	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1868	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	tests/collections/common/test_lhotse_dataloading.py	1895	# in this test we'll just use 0.1 for simplicity	COMMENT
LOW	examples/speechlm2/salm_eval.py	149	# If no user prompt is provided, just use the audio placeholder.	COMMENT
MEDIUM	…t/server/parsers/nemotron_toolcall_parser_streaming.py	509	# re-set stuff pertaining to progress in the current tool	COMMENT
MEDIUM	scripts/tokenizers/conf/tabular_data_tokenizer.yaml	9	transform: yeo-johnson # can be ['yeo-johnson', 'quantile', 'robust'], check https://scikit-learn.org/stable/modul	CODE
MEDIUM	…/speech_recognition/convert_to_tarred_audio_dataset.py	31	# supplied to the config in order to utilize webdataset for efficient large dataset handling.	STRING

Modern AI Meta-Vocabulary · 11 hits · 31 pts

Severity	File	Line	Snippet	Context
MEDIUM	nemo/collections/asr/models/online_diarizer.py	107	# Set speaker embedding model in eval mode	COMMENT
MEDIUM	…ns/asr/parts/context_biasing/boosting_graph_batched.py	65	1.0 # The score for eos token after detected end of context phrase to prevent hallucination for AED models	CODE
MEDIUM	…collections/asr/parts/submodules/multitask_decoding.py	637	hallucinations_detector: bool = True # detect hallucinations in the predicted tokens	CODE
MEDIUM	nemo/collections/asr/parts/submodules/jasper.py	460	# Set default context window	COMMENT
MEDIUM	…parts/submodules/aed_decoding/aed_batched_streaming.py	169	# check for hallucinations	COMMENT
MEDIUM	…parts/submodules/aed_decoding/aed_batched_streaming.py	303	# check for hallucinations	COMMENT
MEDIUM	…parts/submodules/aed_decoding/aed_batched_streaming.py	346	# we need to have at least 8 tokens to run hallucinations detector	COMMENT
MEDIUM	nemo/collections/asr/inference/nmt/llm_translator.py	275	# Remove hallucinations if ASR transcript is empty string	COMMENT
MEDIUM	nemo/collections/tts/models/easy_magpietts_inference.py	388	# This enables keeping the zero-shot conditioning module private at release time.	COMMENT
MEDIUM	…odules/magpietts_inference/evaluate_generated_audio.py	199	# the embedding model doesn't accept NumPy arrays, so we write to a temporary file	COMMENT
MEDIUM	tests/collections/asr/test_parallel_expert_encoder.py	166	# _forward_online orchestration (stubbed ASR encoder, provided spk_targets)	COMMENT

Fake / Example Data · 12 hits · 17 pts

Severity	File	Line	Snippet	Context
LOW	tests/setup/data/create_sample_jsonl.py	35	"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore "	CODE
LOW	tests/setup/data/create_sample_jsonl.py	35	"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore "	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	65	"lorem ipsum dolor sit amet consectetur adipiscing elit",	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	65	"lorem ipsum dolor sit amet consectetur adipiscing elit",	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	79	"lorem ipsum dolor sit amet consectetur adipiscing elit",	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	79	"lorem ipsum dolor sit amet consectetur adipiscing elit",	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	80	"Lorem ipsum dolor sit amet, consectetur adipiscing elit.",	CODE
LOW⚡	tests/collections/asr/test_text_to_text_dataset.py	80	"Lorem ipsum dolor sit amet, consectetur adipiscing elit.",	CODE
LOW⚡	tests/collections/asr/inference/test_bpe_decoder.py	49	"lorem ipsum dolor sit amet",	CODE
LOW⚡	tests/collections/asr/inference/test_bpe_decoder.py	49	"lorem ipsum dolor sit amet",	CODE
LOW⚡	tests/collections/asr/inference/test_bpe_decoder.py	77	"lorem ipsum dolor sit amet",	CODE
LOW⚡	tests/collections/asr/inference/test_bpe_decoder.py	77	"lorem ipsum dolor sit amet",	CODE

Synthetic Comment Markers · 1 hit · 8 pts

Severity	File	Line	Snippet	Context
HIGH	nemo/collections/common/data/lhotse/cutset.py	772	# as requested by pzelasko	COMMENT

TODO Padding · 4 hits · 6 pts

Severity	File	Line	Snippet	Context
LOW	nemo/collections/asr/parts/submodules/subsampling.py	460	# TODO: implement lengths inside conv_split_by_channel	COMMENT
LOW	nemo/collections/asr/data/text_to_text.py	327	# TODO: implement, if we really need normalization inplace	COMMENT
LOW	…s/speaker_tasks/utils/test_data_simul_utils_speaker.py	296	# TODO: add tests for all util functions	COMMENT
LOW	…sts/collections/asr/utils/test_data_simul_utils_asr.py	296	# TODO: add tests for all util functions	COMMENT

Example Usage Blocks · 3 hits · 2 pts

Severity	File	Line	Snippet	Context
LOW	nemo/core/classes/dataset.py	84	# Usage:	STRING
LOW	examples/asr/speech_classification/frame_vad_infer.py	21	## Usage:	STRING
LOW	…/speech_recognition/convert_to_tarred_audio_dataset.py	34	# Usage:	STRING
