8 matches across 2 categories.
| Severity | File | Line | Snippet |
|---|---|---|---|
| HIGH | inference/kernel.py | 90 | Dequantizes the given weight tensor using the provided scale tensor. Args: x (torch.Tensor): The quant |
| HIGH | inference/model.py | 108 | Forward pass for parallel embedding layer. Args: x (torch.Tensor): Input tensor containing |
| HIGH | inference/model.py | 132 | Applies a linear transformation to the incoming data: y = xA^T + b. This function supports specialized implemen |
| HIGH | inference/fp8_cast_bf16.py | 13 | Converts FP8 weights to BF16 and saves the converted weights. This function reads FP8 weights from the specifi |
| HIGH | inference/fp8_cast_bf16.py | 45 | Retrieves a tensor from the cached safetensor files or loads it from disk if not cached. Args: |

| Severity | File | Line | Snippet |
|---|---|---|---|
| LOW | inference/generate.py | 81 | |
| LOW | inference/convert.py | 33 | |
| LOW | inference/fp8_cast_bf16.py | 12 | |