Repository Analysis

NVIDIA-NeMo/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

23.2 Moderate AI signal
Adjusted Score: 23.2
Raw Score: 23.2
Time Factor: 100%
Last Push: 2026-05-30
Stars: 17,273
Language: Python
Lines of Code: 469,839
Files: 1901
Pattern Hits: 7397
Scan Date: 2026-05-31
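
The summary figures can be combined into a few derived numbers. The sketch below is illustrative only: the report does not state how the adjusted score is computed, so the product of raw score and time factor is an assumption (it is at least consistent with the values shown), and the two density metrics are not part of the report.

```python
# Figures copied from the summary above.
raw_score = 23.2
time_factor = 1.00          # reported as 100%
lines_of_code = 469_839
files = 1901
pattern_hits = 7397

# Assumption: adjusted score = raw score * time factor (not stated in the report).
adjusted_score = raw_score * time_factor

# Derived densities; the report itself does not list these.
hits_per_kloc = pattern_hits / lines_of_code * 1000
hits_per_file = pattern_hits / files

print(f"adjusted score: {adjusted_score:.1f}")
print(f"hits per 1000 lines: {hits_per_kloc:.2f}")
print(f"hits per file: {hits_per_file:.2f}")
```

Normalizing by lines of code or file count makes the hit total easier to compare across repositories of different sizes.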

Severity Breakdown

CRITICAL 13 · HIGH 609 · MEDIUM 406 · LOW 6369
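
As a quick consistency check, the four severity counts can be summed and compared with the 7397 pattern hits reported in the summary. This is a minimal sketch, assuming each hit carries exactly one severity level; the variable names are illustrative.

```python
# Severity counts as listed in the breakdown above.
severity_counts = {"CRITICAL": 13, "HIGH": 609, "MEDIUM": 406, "LOW": 6369}
reported_total = 7397  # "Pattern Hits" from the summary

# The counts should add up to the reported total if severities partition the hits.
total = sum(severity_counts.values())
print("sum of severities:", total, "matches report:", total == reported_total)

# Share of each severity level, as a percentage of all hits.
shares = {name: 100 * count / total for name, count in severity_counts.items()}
for name, share in shares.items():
    print(f"{name:<8} {share:5.1f}%")
```

The counts do add up to the reported total, and LOW findings account for the large majority of hits.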

Pattern Findings

7397 matches across 18 categories.

Hyper-Verbose Identifiers · 2764 hits · 2790 pts
Severity | File | Line | Snippet
LOW | tools/nemo_forced_aligner/align_eou.py | 248 | def get_manifests_for_this_rank(manifest_list, num_nodes, num_gpus, node_idx, gpu_idx):
LOW | …orced_aligner/tests/test_add_t_start_end_to_utt_obj.py | 260 | def test_add_t_start_end_to_utt_obj(alignment, expected_output_utterance, output_timestep_duration):
LOW | tools/nemo_forced_aligner/utils/make_ass_files.py | 335 | def make_token_level_ass_file(utt_obj, output_dir_root, ass_file_config, audio_dur):
LOW | tools/ctc_segmentation/scripts/utils.py | 135 | def _prepare_tokenized_text_for_bpe_model(text: List[str], tokenizer, vocabulary: List[str], blank_idx: int = 0):
LOW | tools/ctc_segmentation/scripts/utils.py | 213 | def determine_utterance_segments(config, utt_begin_indices, char_probs, timings, text, char_list):
LOW | tools/ctc_segmentation/scripts/utils.py | 303 | def write_labels_for_audacity(
LOW | tools/speech_data_explorer/data_explorer.py | 249 | def expand_sharded_path_without_braceexpand(path_pattern):
LOW | tools/speech_data_explorer/data_explorer.py | 572 | def build_tar_index_from_local(tar_path):
LOW | …taset_preparation/customization_dataset_preparation.py | 85 | def recommend_hyperparameters_human_readable(recommended_hyperparameters):
LOW | …taset_preparation/customization_dataset_preparation.py | 92 | def recommend_hyperparameters(df, model=None):
LOW | …taset_preparation/customization_dataset_preparation.py | 148 | def estimating_customization_job_time(df, recommended_hyperparameters):
LOW | …taset_preparation/customization_dataset_preparation.py | 165 | def warn_completion_is_not_empty(df):
LOW | …taset_preparation/customization_dataset_preparation.py | 181 | def warn_imbalanced_completion(df):
LOW | …taset_preparation/customization_dataset_preparation.py | 304 | def convert_into_prompt_completion_only(df, prompt_template="{prompt}", completion_template="{completion}"):
LOW | …taset_preparation/customization_dataset_preparation.py | 311 | def warn_and_drop_long_samples(df, max_total_char_length):
LOW | …taset_preparation/customization_dataset_preparation.py | 363 | def split_into_train_validation(df, val_proportion=0.1):
LOW | …ration/tests/test_customization_dataset_preparation.py | 39 | def test_recommend_hyperparameters():
LOW | …ration/tests/test_customization_dataset_preparation.py | 83 | def test_warn_completion_is_not_empty():
LOW | …ration/tests/test_customization_dataset_preparation.py | 106 | def test_warn_imbalanced_completion():
LOW | …ration/tests/test_customization_dataset_preparation.py | 206 | def test_warn_duplicated_rows():
LOW | …ration/tests/test_customization_dataset_preparation.py | 223 | def test_drop_duplicated_rows():
LOW | …ration/tests/test_customization_dataset_preparation.py | 262 | def test_drop_unrequired_fields():
LOW | …ration/tests/test_customization_dataset_preparation.py | 271 | def test_convert_into_template():
LOW | …ration/tests/test_customization_dataset_preparation.py | 295 | def test_convert_into_prompt_completion_only():
LOW | …ration/tests/test_customization_dataset_preparation.py | 313 | def get_indexes_of_long_examples(df, max_total_char_length):
LOW | …ration/tests/test_customization_dataset_preparation.py | 318 | def test_warn_and_drop_long_samples():
LOW | …ration/tests/test_customization_dataset_preparation.py | 346 | def test_show_first_example_in_df():
LOW | …ration/tests/test_customization_dataset_preparation.py | 356 | def test_get_prepared_filename():
LOW | …ration/tests/test_customization_dataset_preparation.py | 374 | def test_split_into_train_validation():
LOW | nemo/lightning/base_callback.py | 78 | def on_save_checkpoint_success(self, *args, **kwargs) -> None:
LOW | nemo/lightning/__init__.py | 24 | def _is_slurm_interactive_mode():
LOW | nemo/lightning/callback_group.py | 141 | def hook_class_init_with_callbacks(cls, start_callback: str, end_callback: str) -> None:
LOW | nemo/lightning/one_logger_callback.py | 37 | def get_one_logger_init_config() -> Dict[str, Any]:
LOW | nemo/lightning/one_logger_callback.py | 67 | def _get_base_callback_config(
LOW | nemo/lightning/one_logger_callback.py | 205 | def _should_enable_for_current_rank() -> bool:
LOW | nemo/core/connectors/save_restore_connector.py | 601 | def _inject_model_parallel_rank_for_ckpt(self, dirname, basename):
LOW | nemo/core/connectors/save_restore_connector.py | 607 | def _make_nemo_file_from_folder(filename, source_dir):
LOW | nemo/core/connectors/save_restore_connector.py | 617 | def _make_nemo_file_from_folder_with_multistorageclient(filename, source_dir):
LOW | nemo/core/connectors/save_restore_connector.py | 97 | def load_config_and_state_dict(
LOW | nemo/core/connectors/save_restore_connector.py | 223 | def load_instance_with_state_dict(self, instance, state_dict, strict):
LOW | nemo/core/connectors/save_restore_connector.py | 511 | def check_artifact_and_query_basename_match(query_path: str) -> bool:
LOW | nemo/core/connectors/save_restore_connector.py | 725 | def _unpack_nemo_file_with_multistorageclient(
LOW | nemo/core/connectors/save_restore_connector.py | 767 | def _load_state_dict_from_disk(model_weights, map_location='cpu'):
LOW | nemo/core/config/optimizers.py | 237 | def register_optimizer_params(name: str, optimizer_params: OptimizerParams):
LOW | nemo/core/config/schedulers.py | 234 | def register_scheduler_params(name: str, scheduler_params: SchedulerParams):
LOW | nemo/core/classes/exportable.py | 295 | def disabled_deployment_input_names(self) -> List[str]:
LOW | nemo/core/classes/exportable.py | 300 | def disabled_deployment_output_names(self) -> List[str]:
LOW | nemo/core/classes/exportable.py | 339 | def dynamic_shapes_for_export(self, use_dynamo=False):
LOW | nemo/core/classes/common.py | 146 | def _validate_config_targets_recursive(config_node: Any):
LOW | nemo/core/classes/common.py | 180 | def is_semantic_typecheck_enabled():
LOW | nemo/core/classes/common.py | 377 | def _attach_and_validate_output_types(self, out_objects, ignore_collections=False, output_types=None):
LOW | nemo/core/classes/common.py | 667 | def _inspect_signature_for_trainer(cls, check_cls):
LOW | nemo/core/classes/common.py | 808 | def get_available_model_names(cls) -> List[str]:
LOW | nemo/core/classes/common.py | 888 | def _get_ngc_pretrained_model_info(cls, model_name: str, refresh_cache: bool = False) -> Tuple[type, str]:
LOW | nemo/core/classes/common.py | 947 | def _get_hf_hub_pretrained_model_info(cls, model_name: str, refresh_cache: bool = False) -> Tuple[type, str]:
LOW | nemo/core/classes/common.py | 1230 | def set_semantic_check_enabled(enabled: bool = True):
LOW | nemo/core/classes/modelPT.py | 299 | def has_native_or_submodules_artifacts(self) -> bool:
LOW | nemo/core/classes/modelPT.py | 578 | def setup_multiple_validation_data(self, val_data_config: Union[DictConfig, Dict]):
LOW | nemo/core/classes/modelPT.py | 795 | def setup_optimizer_param_groups(self):
LOW | nemo/core/classes/modelPT.py | 879 | def recursively_propagate_guid(module: "NeuralModule"):
2704 more matches not shown…
Cross-File Repetition · 528 hits · 2640 pts
Severity | File | Line | Snippet
HIGH | tools/nemo_forced_aligner/align_eou.py | 0 | converts text with accented or special latin characters (e.g., ó, ñ, ū, ō) into their closest ascii equivalents.
HIGH | scripts/asr_eou/clean_manifest.py | 0 | converts text with accented or special latin characters (e.g., ó, ñ, ū, ō) into their closest ascii equivalents.
HIGH | scripts/asr_eou/clean_manifest.py | 0 | converts text with accented or special latin characters (e.g., ó, ñ, ū, ō) into their closest ascii equivalents.
HIGH | tools/nmt_grpc_service/api/nmt_pb2_grpc.py | 0 | riva nlp services implement task-specific apis for popular nlp tasks including intent recognition (as well as slot filli
HIGH | tools/nmt_grpc_service/api/nmt_pb2_grpc.py | 0 | riva nlp services implement task-specific apis for popular nlp tasks including intent recognition (as well as slot filli
HIGH | tools/nmt_grpc_service/api/nmt_pb2_grpc.py | 0 | riva nlp services implement task-specific apis for popular nlp tasks including intent recognition (as well as slot filli
HIGH | nemo/core/config/schedulers.py | 0 | base configuration for all schedulers. it is not derived from config as it is not a nemo object (and in particular it do
HIGH | nemo/core/config/schedulers.py | 0 | base configuration for all schedulers. it is not derived from config as it is not a nemo object (and in particular it do
HIGH | nemo/core/config/schedulers.py | 0 | base configuration for all schedulers. it is not derived from config as it is not a nemo object (and in particular it do
HIGH | nemo/core/config/schedulers.py | 0 | base configuration for all schedulers. it is not derived from config as it is not a nemo object (and in particular it do
HIGH | nemo/core/config/schedulers.py | 0 | base configuration for all schedulers. it is not derived from config as it is not a nemo object (and in particular it do
HIGH | nemo/core/classes/exportable.py | 0 | implement this method to return a set of input names disabled for export
HIGH | nemo/collections/asr/modules/conv_asr.py | 0 | implement this method to return a set of input names disabled for export
HIGH | nemo/collections/asr/modules/rnnt.py | 0 | implement this method to return a set of input names disabled for export
HIGH | nemo/collections/tts/models/fastpitch.py | 0 | implement this method to return a set of input names disabled for export
HIGH | nemo/core/classes/mixins/adapter_mixins.py | 0 | checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat
HIGH | nemo/collections/asr/parts/mixins/asr_adapter_mixins.py | 0 | checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat
HIGH | …llections/tts/parts/mixins/fastpitch_adapter_mixins.py | 0 | checks if any adapter module has been instantiated. returns: bool, determining if any adapter module has been instantiat
HIGH | nemo/core/classes/mixins/adapter_mixins.py | 0 | utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod
HIGH | nemo/collections/asr/parts/mixins/asr_adapter_mixins.py | 0 | utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod
HIGH | …llections/tts/parts/mixins/fastpitch_adapter_mixins.py | 0 | utility method to resolve a given global/module adapter name to its components. always returns a tuple representing (mod
HIGH | nemo/core/classes/mixins/adapter_mixin_strategies.py | 0 | compute the output of a single adapter to some input. args: input: original output tensor of the module, or the output o
HIGH | nemo/core/classes/mixins/adapter_mixin_strategies.py | 0 | compute the output of a single adapter to some input. args: input: original output tensor of the module, or the output o
HIGH | …odules/adapters/multi_head_attention_adapter_module.py | 0 | compute the output of a single adapter to some input. args: input: original output tensor of the module, or the output o
HIGH | nemo/agents/voice_agent/pipecat/services/nemo/diar.py | 0 | process audio data and generate transcription frames. args: audio: raw audio bytes to transcribe yields: frame: transcri
HIGH | nemo/agents/voice_agent/pipecat/services/nemo/diar.py | 0 | process audio data and generate transcription frames. args: audio: raw audio bytes to transcribe yields: frame: transcri
HIGH | nemo/agents/voice_agent/pipecat/services/nemo/stt.py | 0 | process audio data and generate transcription frames. args: audio: raw audio bytes to transcribe yields: frame: transcri
HIGH | nemo/agents/voice_agent/pipecat/services/nemo/utils.py | 0 | update the buffer with the new frame args: frame (frame): frame to update the buffer with
HIGH | …ence/streaming/buffering/incremental_audio_bufferer.py | 0 | update the buffer with the new frame args: frame (frame): frame to update the buffer with
HIGH | …ns/asr/inference/streaming/buffering/audio_bufferer.py | 0 | update the buffer with the new frame args: frame (frame): frame to update the buffer with
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property returns the number of gpus in each model parallel group. returns: number of gpus in each model parallel group.
HIGH | nemo/utils/app_state.py | 0 | property sets the number of gpus in each model parallel group. args: size (int): number of gpus in each model parallel g
HIGH | nemo/utils/app_state.py | 0 | property sets the number of gpus in each model parallel group. args: size (int): number of gpus in each model parallel g
HIGH | nemo/utils/app_state.py | 0 | property sets the number of gpus in each model parallel group. args: size (int): number of gpus in each model parallel g
HIGH | nemo/utils/app_state.py | 0 | property sets the number of gpus in each model parallel group. args: size (int): number of gpus in each model parallel g
HIGH | nemo/utils/app_state.py | 0 | property sets the number of gpus in each model parallel group. args: size (int): number of gpus in each model parallel g
HIGH | nemo/utils/callbacks/dist_ckpt_io.py | 0 | override hook to finalize pending checkpoint(s) if they exist.
HIGH | nemo/utils/callbacks/dist_ckpt_io.py | 0 | override hook to finalize pending checkpoint(s) if they exist.
HIGH | nemo/utils/callbacks/dist_ckpt_io.py | 0 | override hook to finalize pending checkpoint(s) if they exist.
HIGH | nemo/collections/speechlm2/models/salm_asr_decoder.py | 0 | returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.
HIGH | nemo/collections/speechlm2/models/salm_automodel.py | 0 | returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.
HIGH | nemo/collections/speechlm2/models/salm.py | 0 | returns the audio duration corresponding to a single frame/token at the output of ``self.perception``.
HIGH | nemo/collections/speechlm2/models/salm_asr_decoder.py | 0 | implements a fully offline forward pass through the entire model. the flow is the following: |speech and text embeddings
HIGH | nemo/collections/speechlm2/models/salm_automodel.py | 0 | implements a fully offline forward pass through the entire model. the flow is the following: |speech and text embeddings
HIGH | nemo/collections/speechlm2/models/salm.py | 0 | implements a fully offline forward pass through the entire model. the flow is the following: |speech and text embeddings
HIGH | nemo/collections/speechlm2/models/salm_asr_decoder.py | 0 | generate llm answers given text or mixed text+audio prompts. example 1. high-level api using ``prompts`` to provide both
HIGH | nemo/collections/speechlm2/models/salm_automodel.py | 0 | generate llm answers given text or mixed text+audio prompts. example 1. high-level api using ``prompts`` to provide both
HIGH | nemo/collections/speechlm2/models/salm.py | 0 | generate llm answers given text or mixed text+audio prompts. example 1. high-level api using ``prompts`` to provide both
HIGH | nemo/collections/speechlm2/models/salm_asr_decoder.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | nemo/collections/speechlm2/models/salm_automodel.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | nemo/collections/speechlm2/models/duplex_ear_tts.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | nemo/collections/speechlm2/models/salm.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | nemo/collections/speechlm2/models/duplex_s2s_model.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | …ns/speechlm2/models/duplex_s2s_speech_decoder_model.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
HIGH | nemo/collections/asr/models/ssl_models.py | 0 | return a typing schema for optimal batch size calibration for various sequence lengths using oomptimizer.
468 more matches not shown…
Over-Commented Block · 1684 hits · 1664 pts
Severity | File | Line | Snippet
LOW | nemo_dependencies.py | 1 | #!/usr/bin/env python3
LOW | .pre-commit-config.yaml | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | pyproject.toml | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | .readthedocs.yml | 1 | # =============================================================================
LOW | setup.py | 1 | # ! /usr/bin/python
LOW | tools/nemo_forced_aligner/align.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/align_eou.py | 1 | # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW | …s/nemo_forced_aligner/tests/test_restore_token_case.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/tests/test_get_utt_obj.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | …orced_aligner/tests/test_add_t_start_end_to_utt_obj.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/utils/make_output_manifest.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/utils/constants.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/utils/data_prep.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/utils/make_ass_files.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nemo_forced_aligner/utils/make_ctm_files.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/speech_data_simulator/multispeaker_simulator.py | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | tools/ctc_segmentation/run_segmentation.sh | 1 | #!/bin/bash
LOW | tools/ctc_segmentation/run_filter.sh | 1 | #!/bin/bash
LOW | tools/ctc_segmentation/scripts/prepare_data.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | …ols/ctc_segmentation/scripts/get_metrics_and_filter.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
LOW | tools/ctc_segmentation/scripts/normalization_helpers.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/ctc_segmentation/scripts/verify_segments.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | tools/ctc_segmentation/scripts/cut_audio.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/ctc_segmentation/scripts/utils.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/ctc_segmentation/scripts/run_ctc_segmentation.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/speech_data_explorer/data_explorer.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/customization_dataset_preparation/__init__.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | …taset_preparation/customization_dataset_preparation.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | …ration/tests/test_customization_dataset_preparation.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | …ls/customization_dataset_preparation/tests/__init__.py | 1 | # Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
LOW | tools/rir_corpus_generator/rir_mix_generator.py | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | tools/rir_corpus_generator/rir_corpus_generator.py | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_webapp/nmt_service.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | tools/asr_evaluator/asr_evaluator.py | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | tools/asr_evaluator/utils.py | 1 | # Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_grpc_service/server.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_grpc_service/client.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_grpc_service/asr_nmt_client.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_grpc_service/api/nmt_pb2.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | tools/nmt_grpc_service/api/nmt_pb2_grpc.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/package_info.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/constants.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/__init__.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/lightning/base_callback.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/lightning/__init__.py | 1 | # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/lightning/callback_group.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/lightning/one_logger_callback.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/lightning/base.py | 1 | # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/__init__.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/connectors/save_restore_connector.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/connectors/__init__.py | 1 | # Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/neural_types/elements.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/neural_types/neural_type.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/neural_types/__init__.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/neural_types/comparison.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/neural_types/axes.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/config/base_config.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/config/__init__.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/config/optimizers.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
LOW | nemo/core/config/pytorch_lightning.py | 1 | # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
1624 more matches not shown…
Decorative Section Separators · 229 hits · 834 pts
Severity | File | Line | Snippet
MEDIUM | .readthedocs.yml | 1 | # =============================================================================
MEDIUM | .readthedocs.yml | 15 | # =============================================================================
MEDIUM | nemo/collections/speechlm2/vllm/salm/backends.py | 48 | # ── Base backend ────────────────────────────────────────────────────
MEDIUM | nemo/collections/speechlm2/vllm/salm/backends.py | 79 | # ── Transformer backend (Qwen3, etc.) ────────────────────────────────
MEDIUM | nemo/collections/speechlm2/vllm/salm/backends.py | 216 | # ── Hybrid backend (NemotronH / Mamba+MoE) ──────────────────────────
MEDIUM | nemo/collections/speechlm2/vllm/salm/backends.py | 304 | # ── Factory ─────────────────────────────────────────────────────────
MEDIUM | nemo/collections/speechlm2/vllm/salm/audio.py | 80 | # ── Helpers ─────────────────────────────────────────────────────────
MEDIUM | nemo/collections/speechlm2/vllm/salm/audio.py | 118 | # ── Multimodal contract types ───────────────────────────────────────
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 38 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 40 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 104 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 106 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 436 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_vae_codec.py | 438 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 34 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 36 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 95 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 97 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 237 | # ==============================================================================
MEDIUM | nemo/collections/speechlm2/modules/ear_tts_model.py | 239 | # ==============================================================================
MEDIUM | …ollections/common/video_tokenizers/cosmos_tokenizer.py | 1 | # -----------------------------------------------------------------------------
MEDIUM | …ollections/common/video_tokenizers/cosmos_tokenizer.py | 8 | # -----------------------------------------------------------------------------
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 249 | # =====================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 251 | # =====================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 256 | # ===================================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 258 | # ===================================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 268 | # ================================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 270 | # ================================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 293 | # ==================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 295 | # ==================================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 323 | # =================
MEDIUM | …mo/collections/common/modules/adapters/mcore_mixins.py | 325 | # =================
MEDIUM | nemo/collections/audio/modules/transforms.py | 383 | # ------------------------------------------------------------------
MEDIUM | nemo/collections/audio/modules/transforms.py | 385 | # ------------------------------------------------------------------
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 34 | # ==============================================================================
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 36 | # ==============================================================================
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 60 | # ==============================================================================
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 91 | # ─── Type aliases ──────────────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 100 | # ─── Constants ─────────────────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 158 | # ─── RTTM / UEM parsing ───────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 316 | # ─── UEM manipulation helpers ─────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 543 | # ─── Speaker segment timeline ─────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 633 | # ─── Bipartite speaker matching ───────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 677 | # ─── Per-segment speaker scoring ─────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 796 | # ─── Main diarization scoring ─────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 904 | # ─── Output formatting ────────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 973 | # ─── Top-level evaluate ───────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 1113 | # ─── DER result wrapper ────────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/md_eval.py | 1119 | # ───────────────────────────────────────────────────────────────────────────
MEDIUM | nemo/collections/asr/metrics/der.py | 54 | # ─── Lhotse-backed annotation helpers ──────────────────────────────────────
MEDIUM | nemo/collections/tts/parts/utils/tts_dataset_utils.py | 459 | # =============================================================================
MEDIUM | nemo/collections/tts/parts/utils/tts_dataset_utils.py | 461 | # =============================================================================
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 612 | # ─── Tests: Multi-file scoring ───────────────────────────────────────────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 68 | # ─── Helpers ──────────────────────────────────────────────────────────────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 182 | # ─── Tests: md_eval low-level engine ──────────────────────────────────────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 417 | # ─── Tests: der.py public API (score_labels_from_rttm_labels) ────────────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 672 | # ─── Tests: External-engine-verified values (cross-validated) ────────────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 902 | # ─── Tests: regression for no-UEM scoring (parity with external lib) ─────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 1057 | # ─── Tests: lhotse-based replacement for the external annotation lib ─────
MEDIUM | tests/collections/speaker_tasks/utils/test_der.py | 1549 | # ─── Tests: audio_end clipping ────────────────────────────────────────────
169 more matches not shown…
Unused Imports · 754 hits · 730 pts
Severity | File | Line | Snippet
LOW | tools/speech_data_explorer/data_explorer.py | 20
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/__init__.py | 16
LOW | nemo/lightning/__init__.py | 20
LOW | nemo/lightning/__init__.py | 20
LOW | nemo/core/__init__.py | 15
LOW | nemo/core/__init__.py | 16
LOW | nemo/core/connectors/save_restore_connector.py | 15
LOW | nemo/core/neural_types/__init__.py | 16
LOW | nemo/core/neural_types/__init__.py | 17
LOW | nemo/core/neural_types/__init__.py | 18
LOW | nemo/core/neural_types/__init__.py | 19
LOW | nemo/core/config/__init__.py | 15
LOW | nemo/core/config/__init__.py | 16
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 17
LOW | nemo/core/config/__init__.py | 31
LOW | nemo/core/config/__init__.py | 32
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/config/__init__.py | 33
LOW | nemo/core/classes/__init__.py | 16
LOW | nemo/core/classes/__init__.py | 17
LOW | nemo/core/classes/__init__.py | 18
LOW | nemo/core/classes/__init__.py | 20
LOW | nemo/core/classes/__init__.py | 20
LOW | nemo/core/classes/__init__.py | 20
LOW | nemo/core/classes/__init__.py | 20
LOW | nemo/core/classes/__init__.py | 20
LOW | nemo/core/classes/__init__.py | 20
694 more matches not shown…
Deep Nesting · 722 hits · 679 pts
Severity File Line Snippet
LOWnemo_dependencies.py27
LOWnemo_dependencies.py44
LOWnemo_dependencies.py90
LOWnemo_dependencies.py115
LOWtools/nemo_forced_aligner/align_eou.py445
LOW…orced_aligner/tests/test_add_t_start_end_to_utt_obj.py260
LOWtools/nemo_forced_aligner/utils/data_prep.py66
LOWtools/nemo_forced_aligner/utils/make_ass_files.py111
LOWtools/nemo_forced_aligner/utils/make_ass_files.py179
LOWtools/nemo_forced_aligner/utils/make_ass_files.py335
LOWtools/nemo_forced_aligner/utils/make_ctm_files.py69
LOWtools/ctc_segmentation/scripts/prepare_data.py111
LOWtools/ctc_segmentation/scripts/prepare_data.py207
LOWtools/ctc_segmentation/scripts/prepare_data.py213
LOWtools/ctc_segmentation/scripts/cut_audio.py50
LOWtools/ctc_segmentation/scripts/utils.py167
LOWtools/ctc_segmentation/scripts/utils.py213
LOWtools/ctc_segmentation/scripts/utils.py267
LOWtools/ctc_segmentation/scripts/utils.py303
LOWtools/speech_data_explorer/data_explorer.py497
LOWtools/speech_data_explorer/data_explorer.py640
LOWtools/speech_data_explorer/data_explorer.py774
LOWtools/speech_data_explorer/data_explorer.py929
LOWtools/speech_data_explorer/data_explorer.py971
LOW…taset_preparation/customization_dataset_preparation.py63
LOW…taset_preparation/customization_dataset_preparation.py237
LOWtools/asr_evaluator/utils.py37
LOWtools/asr_evaluator/utils.py85
LOWtools/asr_evaluator/utils.py270
LOWnemo/lightning/one_logger_callback.py152
LOWnemo/lightning/base.py55
LOWnemo/core/connectors/save_restore_connector.py52
LOWnemo/core/connectors/save_restore_connector.py97
LOWnemo/core/connectors/save_restore_connector.py289
LOWnemo/core/connectors/save_restore_connector.py361
LOWnemo/core/connectors/save_restore_connector.py454
LOWnemo/core/neural_types/elements.py99
LOWnemo/core/neural_types/neural_type.py66
LOWnemo/core/neural_types/neural_type.py93
LOWnemo/core/neural_types/neural_type.py179
LOWnemo/core/neural_types/axes.py60
LOWnemo/core/config/hydra_runner.py53
LOWnemo/core/config/hydra_runner.py69
LOWnemo/core/config/hydra_runner.py71
LOWnemo/core/classes/exportable.py138
LOWnemo/core/classes/common.py90
LOWnemo/core/classes/common.py226
LOWnemo/core/classes/common.py289
LOWnemo/core/classes/common.py377
LOWnemo/core/classes/common.py589
LOWnemo/core/classes/common.py888
LOWnemo/core/classes/modelPT.py628
LOWnemo/core/classes/modelPT.py795
LOWnemo/core/classes/modelPT.py962
LOWnemo/core/classes/modelPT.py1058
LOWnemo/core/classes/modelPT.py1263
LOWnemo/core/classes/modelPT.py1960
LOWnemo/core/classes/modelPT.py2017
LOWnemo/core/classes/mixins/access_mixins.py77
LOWnemo/core/classes/mixins/adapter_mixins.py126
662 more matches not shown…
Self-Referential Comments · 111 hits · 322 pts
Severity File Line Snippet
MEDIUMtools/nmt_grpc_service/api/nmt_pb2_grpc.py64# This class is part of an EXPERIMENTAL API.
MEDIUMnemo/core/neural_types/axes.py86 """This class represents axis semantics and (optionally) it's dimensionality
MEDIUMnemo/core/config/modelPT.py87 # Create the config builder
MEDIUM…e_agent/pipecat/transports/network/websocket_server.py172 # Create a task to monitor the websocket connection
MEDIUMnemo/utils/import_utils.py15# This file is taken from https://github.com/NVIDIA-NeMo/Curator/blob/dask/nemo_curator/utils/import_utils.py,
MEDIUMnemo/utils/exp_manager.py645 # Create the logging directory if it does not exist
MEDIUMnemo/utils/exp_manager.py1338 # Create the callback and attach it to trainer
MEDIUMnemo/utils/metaclasses.py36 # Create a new object instance - one per class.
MEDIUMnemo/utils/decorators/deprecated.py87 # Create a banner
MEDIUMnemo/collections/speechlm2/models/duplex_ear_tts.py459 # Create a random dropout decision for each BOS instance
MEDIUMnemo/collections/speechlm2/models/duplex_ear_tts.py470 # Create a mask of the same shape as target_text_tokens
MEDIUM…/collections/speechlm2/parts/metrics/results_logger.py216 # Create a wav with eou prediction for debug purposes
MEDIUMnemo/collections/speechlm2/modules/ear_tts_vae_codec.py94 # Create a range tensor from 0 to max_length - 1
MEDIUMnemo/collections/speechlm2/modules/ear_tts_vae_codec.py146 # Create the window tensor on the same device as the waveform.
MEDIUMnemo/collections/speechlm2/modules/ear_tts_model.py158 # Create a range tensor from 0 to max_length - 1
MEDIUMnemo/collections/speechlm2/modules/ear_tts_model.py331 # Create a new, dense character vocabulary sorted by the original token ID
MEDIUMnemo/collections/speechlm2/modules/ear_tts_model.py834 # Create a padded tensor for the character IDs
MEDIUM…o/collections/speechlm2/data/duplex_ear_tts_dataset.py715 # Create a deepcopy and update duration
MEDIUM…o/collections/speechlm2/data/duplex_ear_tts_dataset.py773 # Create a zero tensor of shape [T] (assuming mono audio)
MEDIUMnemo/collections/common/parts/skills_utils.py15# This file is maintained in sync with `nemo_skills/pipeline/utils.py`
MEDIUMnemo/collections/common/parts/nemo_run_utils.py124 # Create the remote directory on the cluster
MEDIUMnemo/collections/common/parts/nemo_run_utils.py167 # Create the config file on the local filesystem
MEDIUMnemo/collections/common/parts/nemo_run_utils.py191 # Create the config file on the remote cluster
MEDIUMnemo/collections/common/prompts/canary.py74 # This method handles a level of indirection for Canary.
MEDIUMnemo/collections/common/prompts/canary2.py104 # This method handles a level of indirection for Canary.
MEDIUMnemo/collections/common/data/lhotse/cutset.py1504 # Create a stream for each dataset.
MEDIUMnemo/collections/common/data/lhotse/cutset.py1671 # Create a new Recording with the extended audio
MEDIUMnemo/collections/common/data/lhotse/cutset.py1706 # Create a Recording from the silence audio
MEDIUMnemo/collections/audio/parts/submodules/diffusion.py160 """This class implements the Ornstein-Uhlenbeck SDE with variance exploding noise schedule.
MEDIUMnemo/collections/audio/parts/submodules/diffusion.py512 # Create a copy of SDE
MEDIUM…o/collections/audio/parts/submodules/conformer_unet.py159 # Create the self-attention and padding masks
MEDIUM…lections/audio/parts/submodules/schroedinger_bridge.py415 # Create a copy of the noise schedule
MEDIUMnemo/collections/audio/data/data_simulation.py593 # Create a radom set of microphones
MEDIUMnemo/collections/audio/data/data_simulation.py1682 # Define a window around the direct path delay
MEDIUMnemo/collections/asr/losses/rnnt_pytorch.py179 """This function implements Equation 7 in the TDT paper https://arxiv.org/pdf/2304.06795.pdf,
MEDIUMnemo/collections/asr/models/aed_multitask_models.py1196 # This method is a legacy helper for Canary that checks whether prompt slot values were provided
MEDIUMnemo/collections/asr/parts/features.py34# This file contains code artifacts adapted from https://github.com/ryanleary/patter
MEDIUMnemo/collections/asr/parts/mixins/transcription.py301 # Create a results of the same type as each element in processed_outputs
MEDIUMnemo/collections/asr/parts/mixins/transcription.py308 # Create a results of the same type as each element in processed_outputs
MEDIUMnemo/collections/asr/parts/mixins/transcription.py316 # Create a results of the same type as each element in processed_outputs
MEDIUMnemo/collections/asr/parts/mixins/transcription.py382 # Create a DataLoader if not already present
MEDIUMnemo/collections/asr/parts/mixins/diarization.py255 # Create a results of the same type as each element in processed_outputs
MEDIUMnemo/collections/asr/parts/mixins/diarization.py262 # Create a results of the same type as each element in processed_outputs
MEDIUMnemo/collections/asr/parts/mixins/diarization.py323 # Create a DataLoader if not already present
MEDIUMnemo/collections/asr/parts/utils/numba_utils.py33 # Create an empty output array
MEDIUM…llections/asr/parts/utils/multispk_transcribe_utils.py1114 # Initialize the instance manager with the batch size of the chunk audio.
MEDIUM…llections/asr/parts/utils/multispk_transcribe_utils.py1227 # Initialize the instance manager with the batch size of the chunk audio.
MEDIUMnemo/collections/asr/parts/utils/transcribe_utils.py227 # Create a preprocessor to convert audio samples into raw features,
MEDIUMnemo/collections/asr/parts/utils/diarization_utils.py196 # Create a list containing string formatted transcript
MEDIUMnemo/collections/asr/parts/utils/diarization_utils.py561 # Create a split segment and add it to the corresponding interval
MEDIUMnemo/collections/asr/parts/utils/diarization_utils.py1177 # Create a transscript information json dictionary from the output variables
MEDIUMnemo/collections/asr/parts/submodules/spectr_augment.py199 # Create a mask_tensor with all the indices.
MEDIUMnemo/collections/asr/parts/submodules/spectr_augment.py206 # Create a final mask that aligns with the full tensor
MEDIUM…llections/asr/parts/submodules/multi_head_attention.py609 # Create a helper tensor to find the local indices of global attention
MEDIUMnemo/collections/asr/parts/preprocessing/features.py34# This file contains code artifacts adapted from https://github.com/ryanleary/patter
MEDIUMnemo/collections/asr/parts/preprocessing/segment.py34# This file contains code artifacts adapted from https://github.com/ryanleary/patter
MEDIUMnemo/collections/asr/parts/preprocessing/perturb.py34# This file contains code artifacts adapted from https://github.com/ryanleary/patter
MEDIUMnemo/collections/asr/parts/preprocessing/perturb.py1356 """This function is used to iterate through utterances with different offsets for each file."""
MEDIUMnemo/collections/asr/inference/utils/context_manager.py170 # Create a dummy context with None values
MEDIUMnemo/collections/asr/inference/utils/bpe_decoder.py139 # Create a text segment
51 more matches not shown…
Docstring Block Structure · 59 hits · 295 pts
Severity File Line Snippet
HIGHnemo/collections/speechlm2/models/duplex_ear_tts.py1107 Returns a dictionary of initial inputs for inference, using registered buffers. Args: B (i
HIGHnemo/collections/speechlm2/models/nemotron_voicechat.py441 Runs full offline duplex speech-to-speech inference. This method performs: 1. Streaming S
HIGHnemo/collections/speechlm2/parts/optim_setup.py170 Utility used to freeze select model parameters, and skip them for the purpose of initializing an optimizer's pa
HIGHnemo/collections/speechlm2/modules/ear_tts_vae_codec.py173 Converts a spectrogram back into a waveform using the overlap-add method. This function is an approximate inver
HIGHnemo/collections/speechlm2/modules/ear_tts_vae_codec.py362 Computes a Mel-scaled spectrogram from an audio waveform. This function transforms a standard spectrogram into
HIGHnemo/collections/speechlm2/data/salm_dataset.py40 A dataset for Speech-Augmented Language Models (SALM) that processes multimodal conversations containing both t
HIGHnemo/collections/speechlm2/data/s2s_dataset.py29 A dataset for duplex speech-to-speech models that handles bidirectional conversations. This dataset processes
HIGH…o/collections/speechlm2/data/duplex_ear_tts_dataset.py34 A dataset for duplex speech-to-speech models that handles bidirectional conversations. This dataset processes
HIGH…ctions/common/tokenizers/huggingface/auto_tokenizer.py236 Adds a dictionary of special tokens (eos, pad, cls...). If special tokens are NOT in the vocabulary, they are
HIGHnemo/collections/common/callbacks/ema.py168 EMAOptimizer is a wrapper for torch.optim.Optimizer that computes Exponential Moving Average of parameters regi
HIGHnemo/collections/common/parts/skills_utils.py239Construct the command for starting a reward model server. Args: server_type (str): Type of server to start
HIGHnemo/collections/common/parts/preprocessing/manifest.py48Iterate through json lines of provided manifests. NeMo ASR pipelines often assume certain manifest files structure.
HIGHnemo/collections/common/parts/preprocessing/parsers.py229Creates parser from labels, set of arguments and concise parser name. Args: labels: List of labels to alloc
HIGHnemo/collections/asr/metrics/md_eval.py255Parse a UEM (Un-partitioned Evaluation Map) file. Args: uem_file: Path to the UEM file. If ``None``, return
HIGHnemo/collections/asr/models/sortformer_diar_models.py749 One-step forward pass for diarization inference in streaming mode. Args: processed_signal
HIGHnemo/collections/asr/models/rnnt_models.py193 Helper method to extract the rnnt loss name, and potentially its kwargs to be passed. Args:
HIGHnemo/collections/asr/parts/utils/transcribe_utils.py328 Prepare audio data for transcription. Args: cfg (DictConfig): Configuration dictionary containing the f
HIGHnemo/collections/asr/parts/utils/asr_batching.py207 Instantiates a Semi Sorted (Batch) Sampler. Args: model: ASR Model. dataset: Dataset which all
HIGHnemo/collections/asr/parts/utils/speaker_utils.py777 Combine overlaps with floating point numbers. Since neighboring integers are considered as continuous range, we
HIGH…ons/asr/parts/submodules/rnnt_maes_batched_computer.py327 Combines acoustic model log probabilities with language model scores based on the specified blank LM score mode
HIGH…ons/asr/parts/submodules/rnnt_maes_batched_computer.py355 Performs top-k selection and pruning for language model (LM) and automatic speech recognition (ASR) outputs
HIGHnemo/collections/asr/parts/submodules/tdnn_attention.py26Statistics and time average pooling (TAP) layer This computes mean and, optionally, standard deviation statistics a
HIGH…o/collections/asr/inference/pipelines/base_pipeline.py521 Resolve language_code to a strict prompt index; raise if invalid. Args: language_code: (str
HIGH…o/collections/asr/inference/pipelines/base_pipeline.py554 Build prompt vectors for a batch of states using one-hot encoding. Args: states: (list) Lis
HIGHnemo/collections/asr/inference/nmt/llm_translator.py110 Setup device for the LLM model. Args: device: (str) device to run the model on
HIGHnemo/collections/asr/inference/nmt/llm_translator.py139 Returns prompt template for the LLM model. Args: model_name: (str) name of the model to get
HIGHnemo/collections/asr/inference/nmt/llm_translator.py156 Load NMT model in vLLM format. Args: llm_params: (dict) parameters for the LLM model
HIGHnemo/collections/asr/data/audio_to_text_dataset.py930 Normalize manifest or tarred audio file paths into a ``ListConfig`` of lists. Handles string inputs (comma-sep
HIGHnemo/collections/asr/data/audio_to_text_dataset.py963 Chain multiple bucketed datasets using the specified bucketing strategy. When multiple datasets are provided (
HIGHnemo/collections/asr/data/audio_to_text_dataset.py1015 Calculate per-bucket batch sizes for adaptive bucketing. Supports two modes: linear scaling (integer ``bucketi
HIGHnemo/collections/tts/models/magpietts.py942Normalize speaker_indices to a tensor of shape (batch_size,). Args: speaker_indices: Speaker select
HIGHnemo/collections/tts/models/magpietts.py994Get baked context embeddings for a batch, with per-element speaker selection. Args: batch_size: Num
HIGHnemo/collections/tts/models/magpietts.py1363 Convert attention probability matrices to numpy images for logging. Args: attention_prob_m
HIGHnemo/collections/tts/models/magpietts.py1405 Decode audio codes to waveforms and convert to numpy arrays for logging. Args: logits: Mod
HIGHnemo/collections/tts/models/magpietts.py1887Prepare all context tensors for the decoder. This method orchestrates text encoding, context extraction, and mo
HIGHnemo/collections/tts/models/magpietts.py3720 Generate speech from raw text transcript. This is a convenience method for single-utterance text-to-sp
HIGHnemo/collections/tts/parts/utils/tts_dataset_utils.py480 Split a paragraph into sentences based on sentence-ending punctuation. Sentence separators are chosen from the
HIGHnemo/collections/tts/parts/utils/tts_dataset_utils.py775 Unified text chunking for inference: returns single chunk if below threshold, multiple sentence chunks if above
HIGH…o/collections/tts/modules/magpietts_inference/utils.py314Load a MagpieTTS model from checkpoint or NeMo archive. Supports two loading modes: 1. Checkpoint mode: hparams
HIGH…o/collections/tts/modules/magpietts_inference/utils.py397Load an EasyMagpieTTSInferenceModel (decoder-only) from checkpoint or NeMo archive. Uses the inference-only base cl
HIGHscripts/asr_language_modeling/ngram_lm/ngram_merge.py166 Calculates perplexity of a given ngram model on a test file. Args: ngram_mod (str): The pa
HIGHscripts/asr_language_modeling/ngram_lm/ngram_merge.py200 Converts an ngram model in binary format to ARPA format. Args: - ngram_mod (str): The path to
HIGHscripts/asr_language_modeling/ngram_lm/ngram_merge.py343 Function: make_symbol_list Create a symbol table for the input tokenizer model file. Args: nemo_m
HIGH…ognition/partial_conversion_to_tarred_audio_dataset.py66 Selects and returns a subset of shards from the tarred manifest file. Args: manifest_filepath (str): T
HIGH…ognition/partial_conversion_to_tarred_audio_dataset.py136 Creates tarred shards based on the provided configuration. Args: cfg (PartialASRTarredDatasetConfig):
HIGH…/speech_recognition/convert_to_tarred_audio_dataset.py182 Creates a new tarred dataset from a given manifest file. Args: manifest_path (str): Path t
HIGH…/speech_recognition/convert_to_tarred_audio_dataset.py380 Creates a concatenated tarred dataset from the base manifest and additional manifest files. Args:
HIGHscripts/tts_comparison_report/reporting/models.py83Create sample metadata from one filewise metrics item. Args: item: One entry from the filewise metr
HIGHscripts/tts_comparison_report/reporting/models.py167Create benchmark data by discovering benchmark artifacts in storage. Args: benchmark_name: Name of
HIGHscripts/tts_comparison_report/reporting/models.py297Create bucket data by discovering benchmark artifacts in storage. Args: bucket_name: Display name o
HIGHscripts/tts_comparison_report/reporting/models.py370Return the aggregated value of a metric for one benchmark. Args: metric_name: Name of the metric to
HIGHscripts/tts_comparison_report/reporting/models.py449Return filewise samples for a metric from one or all benchmarks. Args: metric_name: Name of the met
HIGHscripts/tts_comparison_report/reporting/models.py469Return generated audio file paths for a benchmark. Args: benchmark_name: Name of the benchmark.
HIGHscripts/tts_comparison_report/reporting/models.py495Return sample metadata for a benchmark derived from filewise metrics. Args: benchmark_name: Name of
HIGHscripts/tts_comparison_report/reporting/orchestrator.py401Generate evaluation reports, upload report artifacts to S3, and return report URLs. This method performs the fu
HIGH…s_comparison_report/reporting/components/stat_tests.py92Run statistical tests for all distribution metrics. Args: bucket_baseline: Baseline bucket data. bu
HIGH…comparison_report/reporting/components/audio_report.py70Prepare audio pairs for the selected benchmarks. Args: bucket_baseline: Baseline bucket data. bucke
HIGH…omparison_report/reporting/components/metrics_table.py64Prepare formatted metric rows for one benchmark comparison table. Args: benchmark_name: Name of the benchma
HIGH…omparison_report/reporting/components/metrics_table.py105Prepare formatted metric rows for the summary comparison table. Args: bucket_baseline: Baseline bucket data
Redundant / Tautological Comments · 164 hits · 253 pts
Severity File Line Snippet
LOWtools/speech_data_simulator/conf/data_simulator.yaml64 add_seg_aug: False # Set True to enable augmentation on each speech segment
LOWtools/speech_data_simulator/conf/data_simulator.yaml72 add_sess_aug: False # Set True to enable audio augmentation on the whole session
LOWtools/speech_data_explorer/data_explorer.py104 # Check if file exists
LOWnemo/core/connectors/save_restore_connector.py85 # Check if we are packing the folder into a nemo file
LOWnemo/core/connectors/save_restore_connector.py643 # Check if the member is a symbolic link
LOWnemo/core/config/hydra_runner.py94 # Check if user set the schema.
LOWnemo/core/classes/common.py109 # Check if this is a missing dependency issue vs a malicious target
LOWnemo/core/classes/common.py117 # Check if the module path is in our approved prefixes
LOWnemo/core/classes/common.py326 # Check if keys exists in the defined input types
LOWnemo/core/classes/common.py862 # Check if nemo_model_file_in_cache is a directory
LOWnemo/core/classes/common.py976 # Check if api token exists, use if it does
LOWnemo/core/classes/common.py982 # Check if model exists in HF
LOWnemo/core/classes/modelPT.py705 # Check if caller provided optimizer name, default to Adam otherwise
LOWnemo/core/classes/modelPT.py723 # Check if caller has optimizer kwargs, default to empty dictionary
LOWnemo/core/classes/modelPT.py1365 # Check if model is being resumed or not - only works if `Trainer` is attached to model
LOWnemo/core/classes/modelPT.py1551 # Assign trainer to the model
LOWnemo/core/classes/mixins/adapter_mixins.py361 # Check if type is supported (if available) and is an enabled adapter
LOWnemo/core/classes/mixins/adapter_mixins.py476 # Check if adapter is enabled or not
LOWnemo/core/classes/mixins/hf_io_mixin.py108 # Check if api token exists, use if it does
LOWnemo/core/classes/mixins/adapter_mixin_strategies.py241 # Check if globally allowed to compute aux loss
LOWnemo/core/optim/distributed_adam.py658 # Check if fragment needs to be updated
LOWnemo/core/utils/process_launcher/launcher.py269 # Check if all processes are completed or not
LOW…ice_agent/pipecat/utils/text/simple_text_aggregator.py74 # Check if the only period is a bullet point (e.g., "1. Alpha" or incomplete "1.")
LOW…ice_agent/pipecat/utils/text/simple_text_aggregator.py81 # Check if any of the abbreviations "e.", "i." "g.", "etc." are present in the text
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py721 # Check if we need to start a new turn or append to existing turn
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py457 # Check if there's already a vLLM process running on the same port and model
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py461 # Check if this process is using the same port and model
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py574 # Check if process is still running
LOWnemo/utils/te_utils.py23# Check if Transformer Engine has quantized tensor classes
LOWnemo/utils/exp_manager.py1404 # Check if cuda is avialable as preemption is supported only on GPUs
LOWnemo/utils/decorators/deprecated.py48 # Check if we already warned about that function.
LOWnemo/utils/callbacks/preemption.py57 # Check if torch distributed is initialised, required for broadcasting the preemption signal to all the ranks
LOWnemo/collections/speechlm2/models/duplex_ear_tts.py1408 # Check if we should use the custom grouping
LOWnemo/collections/speechlm2/parts/metrics/turn_taking.py64 # Check if within tolerance
LOW…o/collections/speechlm2/parts/metrics/mcq_evaluator.py238 # Check if response is empty
LOW…o/collections/speechlm2/parts/metrics/mcq_evaluator.py282 # Check if correct
LOWnemo/collections/speechlm2/parts/metrics/empty_text.py49 # Check if hypothesis is empty or only whitespace
LOWnemo/collections/speechlm2/data/force_align.py269 # Check if this is a Segment object (has words_and_tokens attribute)
LOWnemo/collections/speechlm2/data/force_align.py273 # Check if this is a Word object (has 'text' and timing attributes)
LOW…o/collections/speechlm2/data/duplex_ear_tts_dataset.py585 # Check if system prompt exists in custom field
LOWnemo/collections/common/parts/skills_utils.py632 # Check if result directory compression is streamable
LOWnemo/collections/common/parts/nemo_run_utils.py57 # Check if the cluster config is provided
LOWnemo/collections/common/parts/nemo_run_utils.py61 # Check if the mounts key is present in the cluster config
LOWnemo/collections/common/parts/nemo_run_utils.py70 # Check if the mount path already exists in the cluster config
LOWnemo/collections/common/parts/nemo_run_utils.py99 # Check if the directory is a string or a list
LOWnemo/collections/common/parts/nemo_run_utils.py103 # Check if the executor is local
LOWnemo/collections/common/parts/nemo_run_utils.py113 # Check if the executor is slurm
LOWnemo/collections/common/parts/nemo_run_utils.py115 # Check if the ssh tunnel config is provided in the cluster config
LOWnemo/collections/common/parts/nemo_run_utils.py151 # Check if the config_name is a string and ends with .yaml
LOWnemo/collections/common/parts/nemo_run_utils.py155 # Check if the config_directory is a string or a list
LOWnemo/collections/common/parts/nemo_run_utils.py163 # Check if the executor is local
LOWnemo/collections/common/parts/nemo_run_utils.py217 # Check if the cluster config is provided
LOWnemo/collections/common/parts/nemo_run_utils.py221 # Check if the directories is a string or a list
LOWnemo/collections/common/parts/nemo_run_utils.py225 # Check if the executor is local
LOWnemo/collections/common/parts/nemo_run_utils.py229 # Check if the directories exist at the source location for mounting
LOWnemo/collections/common/parts/nemo_run_utils.py178 # Check if the executor is slurm
LOWnemo/collections/common/parts/nemo_run_utils.py180 # Check if the ssh tunnel config is provided in the cluster config
LOWnemo/collections/common/parts/nemo_run_utils.py251 # Check if the executor is slurm
LOWnemo/collections/common/parts/nemo_run_utils.py253 # Check if the ssh tunnel config is provided in the cluster config
LOWnemo/collections/common/parts/nemo_run_utils.py265 # Check if the directories exist at the source location for mounting
104 more matches not shown…
Excessive Try-Catch Wrapping · 213 hits · 237 pts
Severity File Line Snippet
LOWnemo_dependencies.py67 except Exception as e:
MEDIUMnemo_dependencies.py68 print(f"Error analyzing {file_path}: {e}")
LOWtools/ctc_segmentation/scripts/prepare_data.py107 except Exception as e:
LOW…ols/ctc_segmentation/scripts/get_metrics_and_filter.py184 except Exception as e:
LOWtools/ctc_segmentation/scripts/utils.py131 except Exception as e:
LOWtools/ctc_segmentation/scripts/run_ctc_segmentation.py175 except Exception as e:
LOWtools/speech_data_explorer/data_explorer.py465 except Exception as e:
LOWtools/speech_data_explorer/data_explorer.py2726 except Exception as ex:
LOWtools/speech_data_explorer/data_explorer.py2788 except Exception as ex:
LOWtools/speech_data_explorer/data_explorer.py2813 except Exception as ex:
LOWtools/speech_data_explorer/data_explorer.py2840 except Exception as ex:
LOWtools/nmt_webapp/nmt_service.py92 except Exception as ex:
MEDIUMtools/nmt_webapp/nmt_service.py67def get_translation():
LOWnemo/lightning/callback_group.py93 except Exception:
LOWnemo/core/connectors/save_restore_connector.py781 except Exception as e:
LOWnemo/core/classes/common.py101 except Exception:
LOWnemo/core/classes/common.py105 except Exception as e2:
LOWnemo/core/classes/common.py439 except Exception:
LOWnemo/core/classes/common.py571 except Exception:
LOWnemo/core/classes/common.py624 except Exception as e:
LOWnemo/core/classes/common.py641 except Exception as e:
LOWnemo/core/classes/modelPT.py770 except Exception as e:
LOWnemo/core/utils/numba_utils.py143 except Exception:
LOWnemo/agents/voice_agent/utils/config_manager.py71 except Exception as e:
LOW…o/agents/voice_agent/utils/tool_calling/basic_tools.py54 except Exception as e:
LOW…e_agent/pipecat/transports/network/websocket_server.py191 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/diar.py176 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/diar.py181 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/diar.py210 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/diar.py241 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py223 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py279 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py383 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py393 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py496 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py549 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py608 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py738 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py777 except Exception as e:
LOW…ents/voice_agent/pipecat/services/nemo/audio_logger.py808 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py176 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py181 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py184 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py388 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py398 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/tts.py590 except Exception as e:
LOW…ts/voice_agent/pipecat/services/nemo/streaming_diar.py203 except Exception as e:
MEDIUM…ts/voice_agent/pipecat/services/nemo/streaming_diar.py204 print(f"Error in diarizer streaming step: {e}")
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py174 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py275 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py370 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py593 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/llm.py687 except Exception as e:
LOWnemo/agents/voice_agent/pipecat/services/nemo/stt.py258 except Exception as e:
LOWnemo/utils/export_utils.py265 except Exception: # there may ne size mismatch and it may be OK
LOWnemo/utils/export_utils.py287 except Exception: # there may be size mismatch and it may be OK
LOWnemo/utils/export_utils.py400except Exception:
LOWnemo/utils/cast_utils.py118 except Exception:
LOWnemo/utils/env_var_parsing.py106 except Exception:
LOWnemo/utils/cloud.py145 except Exception as e:
153 more matches not shown…
Slop Phrases · 35 hits · 100 pts
Severity File Line Snippet
LOWnemo/collections/asr/losses/ctc.py45 # Don't forget to properly call base constructor
LOWexamples/asr/slurm_example.sh34CONTAINER=nvcr.io/nvidia/nemo:25.02.rc4 # Adjust to your needs. and make sure you have ngc key in ~/.config/enroot/.cred
MEDIUMexamples/tts/conf/fastpitch_ssl.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch_align_ipa.yaml2# If you want to train a model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch_align_44100_adapter.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/aligner.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch_align_44100.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch_align_v1.05.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch_align_ipa_adapter.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/hifigan_dataset/hifigan_44100.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/hifigan_dataset/hifigan_22050.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/hifigan/hifigan.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/hifigan/hifigan_44100.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch/fastpitch_44100.yaml2# If you want to train a model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/fastpitch/fastpitch_22050.yaml2# If you want to train a model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/mel_codec_22050.yaml3# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUM…conf/audio_codec/audio_codec_low_frame_rate_22050.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/mel_codec_44100.yaml3# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/audio_codec_44100.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/audio_codec_22050.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/audio_codec_16000.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/audio_codec_24000.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/audio_codec/encodec_24000.yaml2# If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUM…es/tts/conf/zh/fastpitch_align_multispeaker_22050.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/zh/fastpitch_align_22050.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/de/fastpitch_align_44100_phoneme.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/de/fastpitch_align_22050_mix.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUM…amples/tts/conf/de/fastpitch_align_44100_grapheme.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUM…amples/tts/conf/de/fastpitch_align_22050_grapheme.yaml2# rate. If you want to train model on other dataset, you can change config values according to your dataset.
MEDIUMexamples/tts/conf/es/fastpitch_align_44100_ipa.yaml2# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according
MEDIUM…mples/tts/conf/es/fastpitch_align_44100_ipa_multi.yaml2# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according
MEDIUMexamples/tts/conf/es/fastpitch_align_44100.yaml2# 44.1KHz sampling rate. If you want to train model on other dataset, you can change config values according
MEDIUMscripts/tokenizers/process_asr_text_tokenizer.py37# In either case, you can add commas to concatenate different manifests or different data files.
MEDIUM…pts/dataset_processing/process_speech_commands_data.py462 f'\n<<NOTE>> Duration computation was skipped for demonstration purposes on Colaboratory.\n'
MEDIUMscripts/installers/install_opengrm.sh19# Alternatively, in the Linux Debian you can use: sudo apt install libngram-tools
Hallucination Indicators · 13 hits · 95 pts
Severity  File  Line  Snippet
CRITICAL  nemo/core/connectors/save_restore_connector.py  123  model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
CRITICAL  nemo/core/connectors/save_restore_connector.py  262  model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
CRITICAL  nemo/core/connectors/save_restore_connector.py  302  state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './ckpts')
CRITICAL  nemo/core/connectors/save_restore_connector.py  311  state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from(
CRITICAL  nemo/core/classes/modelPT.py  463  model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
CRITICAL  nemo/core/classes/modelPT.py  1484  state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './ckpts')
CRITICAL  nemo/core/classes/modelPT.py  1493  state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from(
CRITICAL  …parts/submodules/aed_decoding/aed_batched_streaming.py  109  pred_tokens_ids, batch_size, _ = self.asr_model.decoding.decoding.greedy_search._prepare_for_search(
CRITICAL  …parts/submodules/aed_decoding/aed_batched_streaming.py  141  self.asr_model.decoding.decoding.greedy_search._one_step_forward(
CRITICAL  …parts/submodules/aed_decoding/aed_batched_streaming.py  203  pred_tokens_ids, batch_size, _ = self.asr_model.decoding.decoding.greedy_search._prepare_for_search(
CRITICAL  …parts/submodules/aed_decoding/aed_batched_streaming.py  231  self.asr_model.decoding.decoding.greedy_search._one_step_forward(
CRITICAL  nemo/collections/tts/models/fastpitch.py  880  n_speakers = self.fastpitch.speaker_emb.weight.data.size()[0]
CRITICAL  tests/lightning/test_one_logger_callback.py  101  mock_provider_instance.with_base_config.return_value.with_export_config.return_value.configure_provider.assert_c
Cross-Language Confusion · 21 hits · 82 pts
Severity  File  Line  Snippet
HIGH  …ration/tests/test_customization_dataset_preparation.py  268  assert df_dropped_unnecessary_fields.equals(drop_unrequired_fields(df))
HIGH  …ration/tests/test_customization_dataset_preparation.py  305  assert df_prompt.equals(
HIGH  …ration/tests/test_customization_dataset_preparation.py  310  assert df_prompt.equals(convert_into_prompt_completion_only(df_prompt))
HIGH  …ration/tests/test_customization_dataset_preparation.py  326  assert expected_df.equals(warn_and_drop_long_samples(df, 10000)[0])
HIGH  tools/asr_evaluator/utils.py  45  raise ValueError("decoder_type could only be null, ctc, rnnt or aed")
HIGH  tools/asr_evaluator/utils.py  119  f"Hybrid models only support rnnt or ctc decoding! Current decoder_type: {cfg.inference.decoder_type
HIGH  nemo/collections/audio/losses/audio.py  86  min_scale || scale * target - estimate ||^2
HIGH  nemo/collections/audio/losses/audio.py  136  min_filter || conv(filter, target) - estimate ||^2
HIGH  nemo/collections/audio/parts/utils/audio.py  441  min_scale || scale * target - estimate ||^2
HIGH  nemo/collections/audio/parts/utils/audio.py  466  min_filter || conv(filter, target) - estimate ||^2
HIGH  nemo/collections/asr/parts/utils/vad_utils.py  127  "or a list of {'audio_filepath': i, 'offset': 0, 'duration': null}."
HIGH  nemo/collections/asr/modules/rnnt_abstract.py  147  Stateful prediction of scores and state for a (possibly null) tokenset.
HIGH  nemo/collections/asr/modules/rnnt.py  709  Stateful prediction of scores and state for a (possibly null) tokenset.
HIGH  nemo/collections/asr/data/data_simulation.py  1265  orV_rcv (list or null): Microphone orientations
HIGH  nemo/collections/asr/data/audio_to_label.py  1335  "duration": null, # not used, will load the whole audio
HIGH  examples/speechlm2/nemotron_voicechat_eval.py  44  checkpoint_path (str | null)
HIGH  examples/speechlm2/nemotron_voicechat_eval.py  50  * inference_speaker_reference (str | null): Path to the reference audio used to condition the speaker's voice. S
HIGH  examples/asr/asr_adapters/train_asr_adapter.py  25  model.adapter.adapter_module_name=<null, or str module. Type: encoder, decoder, joint, or multiple with + between th
HIGH  examples/asr/asr_adapters/train_asr_adapter.py  51  model.adapter.adapter_module_name=<null, or str module. Type: encoder, decoder, joint, or multiple with + between th
HIGH  …s/dataset_processing/g2p/convert_cmu_arpabet_to_ipa.py  25  cd NeMo/scripts && python dataset_processing/g2p/convert_cmu_arpabet_to_ipa.py
HIGH  scripts/installers/setup_os2s_decoders.py  86  + " >/dev/null 2>/dev/null && rm "
Verbosity Indicators · 46 hits · 82 pts
Severity  File  Line  Snippet
LOW  …/collections/speechlm2/parts/metrics/results_logger.py  344  # Step 1: Each rank saves its own results with rank suffix
LOW  …/collections/speechlm2/parts/metrics/results_logger.py  352  # Step 2: Synchronize all ranks before merging
LOW  …/collections/speechlm2/parts/metrics/results_logger.py  356  # Step 3: Only rank 0 merges all results and computes final metrics
LOW  …/collections/speechlm2/parts/metrics/results_logger.py  433  # Step 4: Broadcast metrics from rank 0 to all other ranks
LOW  nemo/collections/asr/models/online_diarizer.py  546  # Step 1: Get subsegments for embedding extraction.
LOW  nemo/collections/asr/models/online_diarizer.py  585  # Step 4: Generate RTTM style diarization labels from segment ranges and cluster labels
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1233  # Step 2: diarize or get GT rttms
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1251  # Step 3: update diar states
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1270  # Step 4: find active speakers
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1285  # Step 5: generate instance for active speakers
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1301  # Step 6:
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1315  # Step 7: ASR forward pass for active speakers
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1337  # Step 8: update ASR states
LOW  …llections/asr/parts/utils/multispk_transcribe_utils.py  1352  # Step 9: update seglsts with timestamps
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  689  # Step 1: Initialization
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  699  # Step 2: Get most likely labels for current frame
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  703  # Step 3: Get fusion scores
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  713  # Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  719  # Step 5: Update labels if they initially weren't blank or repeated
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  724  # Step 6: Update fusion states and scores for non-blank and non-repeated labels
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  745  # Step 1: Initialization for fusion models
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  771  # Step 2: Get most likely labels for current frame
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  776  # Step 3: Get fusion scores
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  785  # Step 4: Get most likely labels with fusion scores. Labels that are blank or repeated are ignored.
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  791  # Step 5: Update labels if they initially weren't blank or repeated
LOW  …ollections/asr/parts/submodules/ctc_greedy_decoding.py  799  # Step 6: Update fusion states and scores for non-blank and non-repeated labels
LOW  nemo/collections/asr/inference/streaming/state/state.py  218  # we need to check if the last token is the same as the first token of the completed output
LOW  nemo/collections/asr/data/data_simulation.py  1020  # Step 1: Prepare parameters for sentence generation
LOW  nemo/collections/asr/data/data_simulation.py  1027  # Step 2: Select a speaker
LOW  nemo/collections/asr/data/data_simulation.py  1040  # Step 3: Generate a sentence
LOW  nemo/collections/asr/data/data_simulation.py  1044  # Step 4: Generate a timestamp for either silence or overlap
LOW  nemo/collections/asr/data/data_simulation.py  1062  # Step 6: Build entries for output files
LOW  nemo/collections/asr/data/data_simulation.py  1132  # Step 7: Normalize and write to disk
LOW  nemo/collections/asr/data/data_simulation.py  1145  # Step 8: Clean up memory
LOW  nemo/collections/asr/data/data_simulation.py  1541  # Step 1: Prepare parameters for sentence generation
LOW  nemo/collections/asr/data/data_simulation.py  1548  # Step 2: Select a speaker
LOW  nemo/collections/asr/data/data_simulation.py  1563  # Step 3: Generate a sentence
LOW  nemo/collections/asr/data/data_simulation.py  1567  # Step 4: Generate a time-stamp for either silence or overlap
LOW  nemo/collections/asr/data/data_simulation.py  1588  # Step 6: Build entries for output files
LOW  nemo/collections/asr/data/data_simulation.py  1651  # Step 7: Normalize and write to disk
LOW  nemo/collections/tts/models/magpietts.py  1907  # Step 1: Encode text input (always needed)
LOW  nemo/collections/tts/models/magpietts.py  1910  # Step 2: Get and scale attention prior
LOW  nemo/collections/tts/models/magpietts.py  1915  # Step 3: Process context based on model type
LOW  nemo/collections/tts/models/magpietts.py  1937  # Step 4: Dispatch to model-type-specific handler
LOW  nemo/collections/tts/models/magpietts.py  1963  # Step 5: Apply CTC prior layer filtering
LOW  nemo/collections/tts/models/magpietts.py  1966  # Step 6: Return typed output
AI Slop Vocabulary · 38 hits · 72 pts
Severity  File  Line  Snippet
LOW  tools/speech_data_explorer/data_explorer.py  1339  # If using tarred audio, just return the filename as-is.
LOW  nemo/lightning/callback_group.py  162  # If we're already inside a wrapped __init__, just call the original
LOW  nemo/core/connectors/save_restore_connector.py  429  # artifact is optional and we simply return None
LOW  nemo/core/classes/common.py  1186  # If types are not defined, skip type checks and just call the wrapped method
LOW  nemo/core/optim/optimizers.py  143  # If we are provided just a Config object, simply return the dictionary of that object
LOW  nemo/core/optim/optimizers.py  155  # simply return the dictionary that was provided
LOW  nemo/agents/voice_agent/pipecat/services/nemo/diar.py  311  # if diarization is disabled, just pass the frame through
MEDIUM  nemo/collections/speechlm2/models/duplex_ear_tts.py  376  # EOS dropout to make the model more robust
MEDIUM  nemo/collections/speechlm2/models/duplex_ear_tts.py  424  # BOS dropout to make the model more robust
MEDIUM  nemo/collections/speechlm2/models/duplex_ear_tts.py  445  # BOS dropout to make the model more robust
LOW  nemo/collections/speechlm2/modules/speech_generation.py  114  # ToDo: move it to cache to need to just create a 1 frame tensor in inference
LOW  …lections/common/video_tokenizers/modules/quantizers.py  284  inds[inds >= self.used.shape[0]] = 0 # simply set to zero
MEDIUM  nemo/collections/common/parts/skills_utils.py  1085  # should also make heterogenous logic very clear and more robust
LOW  nemo/collections/common/modules/megatron_init.py  431  # For this group, we can just return the concatenated
LOW  nemo/collections/common/data/dataset.py  464  # if min_monolingual fires, it means we will just return a single, original monolingual utterance
LOW  nemo/collections/common/data/lhotse/dataloader.py  839  # Bucket duration bins are provided: just use them.
MEDIUM  nemo/collections/common/data/lhotse/cutset.py  1078  # Normalize for robust matching
LOW  nemo/collections/common/data/lhotse/nemo_adapters.py  355  # just return self.
MEDIUM  nemo/collections/audio/data/audio_to_audio_lhotse.py  68  # TODO: use fault_tolerant=True for robust loading of target
MEDIUM  nemo/collections/audio/data/audio_to_audio_lhotse.py  72  # TODO: use fault_tolerant=True for robust loading of target
LOW  …ctions/asr/models/hybrid_rnnt_ctc_bpe_models_prompt.py  350  # RNNT Path - just use encoded outputs directly
MEDIUM  nemo/collections/asr/parts/mixins/diarization.py  485  # Be robust to callers accidentally passing "an array of arrays" (dtype=object),
MEDIUM  …mo/collections/asr/parts/utils/asr_confidence_utils.py  397  """Implemented by subclass in order to aggregate token confidence to a word-level confidence.
LOW  nemo/collections/asr/parts/submodules/jasper.py  143  # simply return symmetric padding for this scenario
MEDIUM  nemo/collections/asr/parts/submodules/ctc_decoding.py  804  # If the exact timestep information is available, utilize the 1st non-ctc blank token timestep
LOW  …/collections/tts/data/text_to_speech_dataset_lhotse.py  378  # If context audio is not available, just use a dummy context_audio_codes
LOW  nemo/collections/tts/data/text_to_speech_dataset.py  580  # If context audio is not available, just use a dummy context_audio_codes
LOW  tests/collections/common/test_lhotse_dataloading.py  1544  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1621  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1648  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1749  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1776  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1866  # in this test we'll just use 0.1 for simplicity
LOW  tests/collections/common/test_lhotse_dataloading.py  1893  # in this test we'll just use 0.1 for simplicity
LOW  examples/speechlm2/salm_eval.py  149  # If no user prompt is provided, just use the audio placeholder.
MEDIUM  …t/server/parsers/nemotron_toolcall_parser_streaming.py  509  # re-set stuff pertaining to progress in the current tool
MEDIUM  scripts/tokenizers/conf/tabular_data_tokenizer.yaml  9  transform: yeo-johnson # can be ['yeo-johnson', 'quantile', 'robust'], check https://scikit-learn.org/stable/modul
MEDIUM  …/speech_recognition/convert_to_tarred_audio_dataset.py  31  # supplied to the config in order to utilize webdataset for efficient large dataset handling.
Fake / Example Data · 12 hits · 17 pts
Severity  File  Line  Snippet
LOW  tests/setup/data/create_sample_jsonl.py  35  "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore "
LOW  tests/setup/data/create_sample_jsonl.py  35  "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore "
LOW  tests/collections/asr/test_text_to_text_dataset.py  64  "lorem ipsum dolor sit amet consectetur adipiscing elit",
LOW  tests/collections/asr/test_text_to_text_dataset.py  64  "lorem ipsum dolor sit amet consectetur adipiscing elit",
LOW  tests/collections/asr/test_text_to_text_dataset.py  78  "lorem ipsum dolor sit amet consectetur adipiscing elit",
LOW  tests/collections/asr/test_text_to_text_dataset.py  78  "lorem ipsum dolor sit amet consectetur adipiscing elit",
LOW  tests/collections/asr/test_text_to_text_dataset.py  79  "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
LOW  tests/collections/asr/test_text_to_text_dataset.py  79  "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
LOW  tests/collections/asr/inference/test_bpe_decoder.py  49  "lorem ipsum dolor sit amet",
LOW  tests/collections/asr/inference/test_bpe_decoder.py  49  "lorem ipsum dolor sit amet",
LOW  tests/collections/asr/inference/test_bpe_decoder.py  77  "lorem ipsum dolor sit amet",
LOW  tests/collections/asr/inference/test_bpe_decoder.py  77  "lorem ipsum dolor sit amet",
Synthetic Comment Markers · 1 hit · 8 pts
Severity  File  Line  Snippet
HIGH  nemo/collections/common/data/lhotse/cutset.py  772  # as requested by pzelasko
Example Usage Blocks · 3 hits · 2 pts
Severity  File  Line  Snippet
LOW  nemo/core/classes/dataset.py  84  # Usage:
LOW  examples/asr/speech_classification/frame_vad_infer.py  21  ## Usage:
LOW  …/speech_recognition/convert_to_tarred_audio_dataset.py  34  # Usage: