Repository Analysis

run-llama/liteparse

A fast, helpful, and open-source document parser

6.1 Low AI signal

Adjusted Score: 6.1
Raw Score: 6.1
Time Factor: 100%
Last Push: 2026-05-30
Stars: 7,910
Language: Rust
Lines of Code: 25,195
Files: 148
Pattern Hits: 116
Scan Date: 2026-05-31

Score History

No multi-scan history yet — run the scanner again to build trend data.

Severity Breakdown

CRITICAL 0 · HIGH 3 · MEDIUM 10 · LOW 103

Pattern Findings

116 matches across 12 categories.

Unused Imports: 42 hits · 42 pts
Severity | File | Line
LOW | scripts/bump-version.py | 35
LOW | packages/python/liteparse/__init__.py | 1
LOW | packages/python/liteparse/__init__.py | 1
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/__init__.py | 2
LOW | packages/python/liteparse/types.py | 3
LOW | packages/python/liteparse/types.py | 6
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | dataset_eval_utils/src/liteparse_eval/__init__.py | 3
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 1
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 1
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 1
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …et_eval_utils/src/liteparse_eval/providers/__init__.py | 2
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 1
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 2
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 3
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 4
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 5
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 6
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 7
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 8
LOW | …utils/src/liteparse_eval/providers/parsers/__init__.py | 9
LOW | …val_utils/src/liteparse_eval/providers/llm/__init__.py | 1
LOW | …val_utils/src/liteparse_eval/providers/llm/__init__.py | 1
LOW | …val_utils/src/liteparse_eval/providers/llm/__init__.py | 2

Hyper-Verbose Identifiers: 25 hits · 30 pts
Severity | File | Line | Snippet
LOW | ocr/paddleocr/test_server.py | 48 | def test_server_health_endpoint(server: PaddleOCRServer) -> None:
LOW | ocr/paddleocr/test_server.py | 77 | def test_server_normalizes_documented_language_aliases(
LOW | ocr/easyocr/test_server.py | 38 | def test_server_health_endpoint(server: EasyOCRServer) -> None:
LOW | packages/python/tests/test_parse_e2e.py | 20 | def test_parse_returns_parse_result(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 24 | def test_parse_result_has_pages(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 29 | def test_parse_result_has_text(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 34 | def test_parse_result_has_json(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 52 | async def test_parse_async_bytes_input(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 135 | def test_multi_page_text_joined(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_parse_e2e.py | 162 | async def test_file_not_found_async(self, parser: LiteParse):
LOW | packages/python/tests/test_screenshot_e2e.py | 19 | def test_screenshot_returns_batch_result(
LOW | packages/python/tests/test_screenshot_e2e.py | 25 | def test_screenshot_has_screenshots(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_screenshot_e2e.py | 30 | def test_screenshot_result_fields(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_screenshot_e2e.py | 39 | def test_screenshot_output_dir(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_screenshot_e2e.py | 47 | def test_screenshot_png_format(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_screenshot_e2e.py | 52 | def test_screenshot_jpg_format(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_screenshot_e2e.py | 58 | async def test_screensho_async_basic(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_batch_e2e.py | 18 | def test_batch_parse_returns_batch_result(
LOW | packages/python/tests/test_batch_e2e.py | 32 | def test_batch_parse_creates_output_files(
LOW | packages/python/tests/test_batch_e2e.py | 48 | def test_batch_parse_json_format(self, parser: LiteParse, invoice_pdf: Path):
LOW | packages/python/tests/test_batch_e2e.py | 85 | async def test_input_dir_not_found_async(self, parser: LiteParse):
LOW | dataset_eval_utils/src/liteparse_eval/processing.py | 80 | def analyze_image_with_claude(
LOW | dataset_eval_utils/src/liteparse_eval/report.py | 378 | def _generate_navigation_html(self) -> str:
LOW | dataset_eval_utils/src/liteparse_eval/report.py | 411 | def _generate_all_documents_html(self) -> str:
LOW | dataset_eval_utils/src/liteparse_eval/report.py | 461 | def _generate_pdf_preview_html(self, pdf_path: Path) -> str:

Excessive Try-Catch Wrapping: 17 hits · 21 pts
Severity | File | Line | Snippet
LOW | ocr/paddleocr/server.py | 91 | except Exception as e:
LOW | packages/python/liteparse/parser.py | 153 | except Exception as e:
LOW | packages/python/liteparse/parser.py | 199 | except Exception as e:
LOW | packages/python/liteparse/cli.py | 13 | except Exception as e:
MEDIUM | packages/python/liteparse/cli.py | 14 | print(f"Error: {e}", file=sys.stderr)
MEDIUM | packages/python/liteparse/cli.py | 8 | def main() -> None:
LOW | dataset_eval_utils/src/liteparse_eval/benchmark.py | 157 | except Exception as e:
LOW | dataset_eval_utils/src/liteparse_eval/benchmark.py | 170 | except Exception:
LOW | dataset_eval_utils/src/liteparse_eval/benchmark.py | 178 | except Exception as e:
MEDIUM | dataset_eval_utils/src/liteparse_eval/benchmark.py | 254 | print(f"Error: Not a directory: {args.input_dir}")
LOW | dataset_eval_utils/src/liteparse_eval/evaluation.py | 170 | except Exception as e:
LOW | dataset_eval_utils/src/liteparse_eval/evaluation.py | 254 | except Exception as e:
LOW | dataset_eval_utils/src/liteparse_eval/evaluation.py | 275 | except Exception as e:
LOW | dataset_eval_utils/src/liteparse_eval/evaluation.py | 345 | except Exception as e:
LOW | dataset_eval_utils/src/liteparse_eval/processing.py | 192 | except Exception as e:
MEDIUM | dataset_eval_utils/src/liteparse_eval/processing.py | 250 | print(f"Error: Input directory does not exist: {args.input_dir}")
LOW | dataset_eval_utils/src/liteparse_eval/report.py | 466 | except Exception as e:

Docstring Block Structure: 3 hits · 15 pts
Severity | File | Line | Snippet
HIGH | packages/python/liteparse/parser.py | 129 | Parse a document file. Args: file_data: Path to the document file, or raw PDF bytes.
HIGH | packages/python/liteparse/parser.py | 162 | Generate screenshots of document pages. Supports PDFs natively. Non-PDF formats (DOCX, XLSX, images, e
HIGH | dataset_eval_utils/src/liteparse_eval/report.py | 535 | Convert first page of PDF to base64-encoded image. Uses JPEG compression if PIL is available, otherwis

Over-Commented Block: 10 hits · 10 pts
Severity | File | Line | Snippet
LOW | crates/liteparse-wasm/src/wasi_stubs.rs | 1 | //! Stub implementations of libc functions that pdfium's statically-linked
LOW | crates/liteparse/src/types.rs | 41 | /// Whether the font has buggy encoding (private-use codepoints, TT subset, etc.)
LOW | crates/liteparse/src/conversion.rs | 741 | // Dropping the TempDir removes the directory.
LOW | crates/liteparse/src/lib.rs | 21 | // ── Internal modules (available for binding crates, hidden from docs) ──
LOW | crates/liteparse/src/extract.rs | 81 | println!("{}", serde_json::to_string(page)?);
LOW | crates/pdfium-sys/wrapper.h | 1 | #include "fpdfview.h"
LOW | scripts/create-dataset.sh | 1 | #!/usr/bin/env bash
LOW | scripts/upload-dataset.sh | 1 | #!/usr/bin/env bash
LOW | packages/python/scripts/copy-pdfium.sh | 1 | #!/usr/bin/env bash
LOW | packages/node/scripts/copy-pdfium.sh | 1 | #!/usr/bin/env bash

Decorative Section Separators: 3 hits · 9 pts
Severity | File | Line | Snippet
MEDIUM | crates/liteparse/src/conversion.rs | 749 | // ── find_pdf_in_dir ──────────────────────────────────────────────────────
MEDIUM | crates/liteparse/src/lib.rs | 7 | // ── Public API re-exports ──────────────────────────────────────────────
MEDIUM | crates/liteparse/src/lib.rs | 14 | // ── Modules with user-facing types (visible in docs) ───────────────────

Self-Referential Comments: 2 hits · 6 pts
Severity | File | Line | Snippet
MEDIUM | dataset_eval_utils/src/liteparse_eval/evaluation.py | 368 | # Create a mapping of file paths to results
MEDIUM | dataset_eval_utils/src/liteparse_eval/processing.py | 16 | # Define the output schema using Pydantic-like structure

Deep Nesting: 5 hits · 5 pts
Severity | File | Line
LOW | ocr/paddleocr/server.py | 49
LOW | ocr/paddleocr/server.py | 55
LOW | scripts/bump-version.py | 166
LOW | dataset_eval_utils/src/liteparse_eval/benchmark.py | 126
LOW | dataset_eval_utils/src/liteparse_eval/report.py | 534

Example Usage Blocks: 3 hits · 4 pts
Severity | File | Line | Snippet
LOW | scripts/create-dataset.sh | 12 | # Usage:
LOW | scripts/compare-dataset.sh | 4 | # Usage:
LOW | scripts/upload-dataset.sh | 4 | # Usage:

Redundant / Tautological Comments: 3 hits · 4 pts
Severity | File | Line | Snippet
LOW | scripts/compare-dataset.sh | 60 | # Check if file exists
LOW | scripts/compare-dataset.sh | 146 | # Check if error was expected
LOW | …al_utils/src/liteparse_eval/providers/llm/anthropic.py | 70 | # Check if the response is "<pass>" or "<fail>"

AI Slop Vocabulary: 1 hit · 3 pts
Severity | File | Line | Snippet
MEDIUM | crates/liteparse/src/conversion.rs | 456 | /// `.pdf` entry is more robust than constructing a fixed `<stem>.pdf` path.

Verbosity Indicators: 2 hits · 3 pts
Severity | File | Line | Snippet
LOW | scripts/upload-dataset.sh | 38 | # Step 1: Regenerate dataset from documents in the dataset directory
LOW | scripts/upload-dataset.sh | 42 | # Step 2: Upload to HuggingFace
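
Consistency check

The per-category hit counts and the severity breakdown in this report can be cross-checked against the headline total of 116 pattern hits. The sketch below simply re-adds the numbers copied from this page. The variable names are illustrative and are not part of the scanner, and nothing here reflects how the scanner itself computes its score.

```python
# Figures copied from this report; names are illustrative assumptions.
severity_counts = {"CRITICAL": 0, "HIGH": 3, "MEDIUM": 10, "LOW": 103}

category_hits = {
    "Unused Imports": 42,
    "Hyper-Verbose Identifiers": 25,
    "Excessive Try-Catch Wrapping": 17,
    "Docstring Block Structure": 3,
    "Over-Commented Block": 10,
    "Decorative Section Separators": 3,
    "Self-Referential Comments": 2,
    "Deep Nesting": 5,
    "Example Usage Blocks": 3,
    "Redundant / Tautological Comments": 3,
    "AI Slop Vocabulary": 1,
    "Verbosity Indicators": 2,
}

reported_total = 116       # "Pattern Hits" in the summary
reported_categories = 12   # "116 matches across 12 categories"

total_by_severity = sum(severity_counts.values())
total_by_category = sum(category_hits.values())

# Both tallies should agree with the headline figures.
print(total_by_severity == reported_total)
print(total_by_category == reported_total)
print(len(category_hits) == reported_categories)
```

All three comparisons print True for the figures listed here.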