Repository Analysis

datalab-to/surya

OCR, layout analysis, reading order, table recognition in 90+ languages

7.0 Low AI signal
Adjusted Score: 7.0
Raw Score: 7.0
Time Factor: 100%
Last Push: 2026-05-27
Stars: 19,806
Language: Python
Lines of Code: 9,976
Files: 84
Pattern Hits: 69
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 0 · MEDIUM 1 · LOW 68

Pattern Findings

69 matches across 5 categories.

Excessive Try-Catch Wrapping: 25 hits · 26 pts
Severity | File | Line | Snippet
LOW | surya/layout/__init__.py | 88 | except Exception as e:
LOW | surya/common/s3.py | 61 | except Exception as e:
LOW | surya/common/s3.py | 80 | except Exception:
LOW | surya/common/s3.py | 168 | except Exception as e:
LOW | surya/scripts/screenshot_app.py | 76 | except Exception:
LOW | surya/scripts/screenshot_app.py | 141 | except Exception as e:
LOW | surya/scripts/screenshot_app.py | 161 | except Exception as e:
LOW | surya/scripts/screenshot_app.py | 182 | except Exception as e:
LOW | surya/scripts/screenshot_app.py | 216 | except Exception as e:
LOW | surya/inference/__init__.py | 40 | except Exception:
LOW | surya/inference/__init__.py | 58 | except Exception:
LOW | surya/inference/backends/spawn.py | 54 | except Exception:
LOW | surya/inference/backends/spawn.py | 79 | except Exception:
LOW | surya/inference/backends/spawn.py | 104 | except Exception:
LOW | surya/inference/backends/spawn.py | 117 | except Exception:
LOW | surya/inference/backends/spawn.py | 137 | except Exception as e:
LOW | surya/inference/backends/spawn.py | 157 | except Exception as e:
LOW | surya/inference/backends/spawn.py | 168 | except Exception as e:
LOW | surya/inference/backends/openai_client.py | 136 | except Exception as e:
LOW | surya/recognition/__init__.py | 318 | except Exception as e:
LOW | surya/table_rec/__init__.py | 105 | except Exception as e:
LOW | surya/debug/draw.py | 56 | except Exception as e:
MEDIUM | surya/debug/draw.py | 57 | print(f"Error drawing rectangle at {box_position}: {e}")
LOW | tests/conftest.py | 24 | except Exception as exc: # SpawnError, binary missing, port issues, etc.
LOW | tests/conftest.py | 29 | except Exception:
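
To illustrate the pattern behind these hits, here is a minimal sketch contrasting a broad `except Exception` that only prints (the shape of the MEDIUM finding in surya/debug/draw.py) with a narrower handler that logs. The surrounding code in surya is not shown in this report, so `draw_fn`, both function names, and the choice of `ValueError`/`TypeError` are assumptions made for illustration, not the project's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def draw_box_broad(draw_fn, box_position):
    """The flagged shape: every failure is caught and only printed."""
    try:
        draw_fn(box_position)  # draw_fn is a hypothetical drawing callable
    except Exception as e:
        print(f"Error drawing rectangle at {box_position}: {e}")


def draw_box_narrow(draw_fn, box_position):
    """Catch only the errors assumed to come from bad coordinates.

    Returns True on success and False when drawing was skipped, so the
    caller can tell the two cases apart. Any other exception propagates.
    """
    try:
        draw_fn(box_position)
        return True
    except (ValueError, TypeError) as e:
        logger.warning("Error drawing rectangle at %s: %s", box_position, e)
        return False
```

Whether narrowing is appropriate depends on the call site. In debug or cleanup paths, a broad handler can be a deliberate choice, which may be why most of these hits are rated LOW.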

Unused Imports: 21 hits · 21 pts
Severity | File | Line | Snippet
LOW | surya/layout/__init__.py | 1 |
LOW | surya/ocr_error/model/encoder.py | 1 |
LOW | surya/common/blank.py | 14 |
LOW | surya/scripts/streamlit_app.py | 4 |
LOW | surya/scripts/screenshot_app.py | 10 |
LOW | surya/inference/__init__.py | 9 |
LOW | surya/inference/parsers.py | 3 |
LOW | surya/inference/prompts.py | 4 |
LOW | surya/inference/prompts.py | 5 |
LOW | surya/inference/prompts.py | 8 |
LOW | surya/inference/prompts.py | 9 |
LOW | surya/inference/backends/vllm.py | 3 |
LOW | surya/inference/backends/__init__.py | 1 |
LOW | surya/inference/backends/__init__.py | 2 |
LOW | surya/inference/backends/spawn.py | 7 |
LOW | surya/inference/backends/openai_client.py | 7 |
LOW | surya/inference/backends/base.py | 1 |
LOW | surya/inference/backends/llamacpp.py | 8 |
LOW | surya/recognition/__init__.py | 7 |
LOW | surya/table_rec/__init__.py | 10 |
LOW | surya/detection/model/encoderdecoder.py | 12 |
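
The report does not say how unused imports are detected. As a rough way to re-check hits like these locally, the sketch below uses Python's standard `ast` module to list imported names that are never referenced in a source string. The function name `unused_imports` is hypothetical, and this is a simplification rather than the scanner's method.

```python
import ast


def unused_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, name) pairs for imported names never referenced."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import statement
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
    # Every plain identifier in the module; `os.path` counts as a use of `os`
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(
        (line, name) for name, line in imported.items() if name not in used
    )


example = "import os\nimport sys\nfrom typing import List\nprint(sys.argv)\n"
print(unused_imports(example))
```

A check this simple ignores names listed in `__all__` and imports kept only for re-export, so hits in `__init__.py` files may be intentional and are worth reviewing by hand before removal.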

Deep Nesting: 12 hits · 12 pts
Severity | File | Line | Snippet
LOW | surya/layout/__init__.py | 39 |
LOW | surya/ocr_error/tokenizer.py | 342 |
LOW | surya/ocr_error/tokenizer.py | 479 |
LOW | surya/ocr_error/model/encoder.py | 637 |
LOW | surya/ocr_error/model/encoder.py | 839 |
LOW | surya/common/polygon.py | 83 |
LOW | surya/common/s3.py | 31 |
LOW | surya/inference/backends/spawn.py | 172 |
LOW | surya/inference/backends/spawn.py | 293 |
LOW | surya/inference/backends/openai_client.py | 70 |
LOW | surya/table_rec/__init__.py | 70 |
LOW | surya/detection/__init__.py | 61 |

Hyper-Verbose Identifiers: 9 hits · 9 pts
Severity | File | Line | Snippet
LOW | surya/ocr_error/__init__.py | 21 | def batch_ocr_error_detection(
LOW | surya/ocr_error/tokenizer.py | 185 | def build_inputs_with_special_tokens(
LOW | surya/ocr_error/tokenizer.py | 245 | def create_token_type_ids_from_sequences(
LOW | surya/ocr_error/model/encoder.py | 40 | def create_sinusoidal_embeddings(n_pos: int, dim: int, out: torch.Tensor):
LOW | surya/ocr_error/model/encoder.py | 677 | def resize_position_embeddings(self, new_num_position_embeddings: int):
LOW | surya/ocr_error/model/encoder.py | 825 | def resize_position_embeddings(self, new_num_position_embeddings: int):
LOW | surya/common/pretrained.py | 9 | def _check_and_adjust_attn_implementation(
LOW | surya/inference/backends/llamacpp.py | 33 | def _resolve_llama_server_binary() -> str:
LOW | tests/test_layout.py | 1 | def test_layout_returns_blocks(layout_predictor, test_image):

Over-Commented Block: 2 hits · 2 pts
Severity | File | Line | Snippet
LOW | surya/settings.py | 61 | # (8192) generation + ~2k prompt/chat-template overhead ≈ 12k. Below this
LOW | surya/detection/processor.py | 1 | # coding=utf-8