Repository Analysis

HKUDS/RAG-Anything

"RAG-Anything: All-in-One RAG Framework"

27.8 Moderate AI signal
Adjusted Score: 27.8
Raw Score: 27.8
Time Factor: 100%
Last Push: 2026-05-29
Stars: 20,773
Language: Python
Lines of Code: 28,843
Files: 78
Pattern Hits: 626
Scan Date: 2026-05-31
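
The page does not state how the score is computed. One reading that fits the figures shown here is that the raw score is the total of the per-category points (listed under Pattern Findings) per 1,000 lines of code, and that the adjusted score is the raw score multiplied by the time factor. The sketch below checks that reading against the numbers on this page. The formula is an inference from this data, not documented scanner behavior.

```python
# Per-category point totals as listed under Pattern Findings on this page.
category_points = [276, 149, 104, 78, 44, 33, 30, 30, 18, 15, 14, 10, 1]
lines_of_code = 28_843
time_factor = 1.00  # shown as "100%" in the summary

total_points = sum(category_points)

# Assumed formula (inferred, not documented): points per 1,000 lines of code.
raw_score = total_points / lines_of_code * 1000
adjusted_score = raw_score * time_factor

print(total_points, round(raw_score, 1), round(adjusted_score, 1))
```

Under this assumption the computed values round to the 27.8 reported for both scores.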

Severity Breakdown

CRITICAL 0 · HIGH 21 · MEDIUM 49 · LOW 556

Pattern Findings

626 matches across 13 categories.
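
As a quick consistency check, the per-category hit counts and the severity counts on this page can each be summed and compared with the reported total. The sketch below only restates numbers from this report; the variable names are illustrative.

```python
# Hit counts per category, copied from the category headings in this report.
category_hits = {
    "Hyper-Verbose Identifiers": 266,
    "Excessive Try-Catch Wrapping": 166,
    "Decorative Section Separators": 30,
    "Magic Placeholder Names": 16,
    "Deep Nesting": 64,
    "Self-Referential Comments": 9,
    "Unused Imports": 31,
    "Redundant / Tautological Comments": 19,
    "Verbosity Indicators": 12,
    "Cross-File Repetition": 3,
    "AI Slop Vocabulary": 7,
    "Docstring Block Structure": 2,
    "Over-Commented Block": 1,
}

# Severity counts, copied from the Severity Breakdown.
severity_counts = {"CRITICAL": 0, "HIGH": 21, "MEDIUM": 49, "LOW": 556}

reported_total = 626
hits_total = sum(category_hits.values())
severity_total = sum(severity_counts.values())

print(len(category_hits), hits_total, severity_total)
```

Both sums agree with the 626 pattern hits reported in the summary.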

Hyper-Verbose Identifiers · 266 hits · 276 pts
Severity | File | Line | Snippet
LOW | README.md | 473 | async def process_multimodal_content():
LOW | README.md | 600 | async def process_multimodal_content(self, modal_content, content_type, file_path, entity_name):
LOW | README.md | 818 | async def insert_content_list_example():
LOW | README_zh.md | 452 | async def process_multimodal_content():
LOW | README_zh.md | 579 | async def process_multimodal_content(self, modal_content, content_type, file_path, entity_name):
LOW | README_zh.md | 796 | async def insert_content_list_example():
LOW | reproduce/llm_answer_evaluator.py | 46 | def get_accuracy_evaluation_prompt(
LOW | reproduce/llm_answer_evaluator.py | 491 | async def evaluate_multiple_results(
LOW | reproduce/llm_answer_evaluator.py | 848 | async def evaluate_standard_rag_results(
LOW | reproduce/llm_answer_evaluator.py | 1058 | def generate_evaluation_report(self, total_time: float):
LOW | reproduce/llm_answer_evaluator.py | 1489 | def load_existing_evaluations(self):
LOW | reproduce/llm_answer_evaluator.py | 1544 | def update_evaluation_results_list(self, new_results: List[Dict[str, Any]]):
LOW | reproduce/llm_answer_evaluator.py | 1566 | def persist_single_evaluation(self, evaluation_result: Dict[str, Any]):
LOW | reproduce/llm_answer_evaluator.py | 89 | def get_comprehensive_evaluation_prompt(
LOW | reproduce/llm_answer_evaluator.py | 213 | def create_fallback_evaluation(self, response: str) -> Dict[str, Any]:
LOW | tests/test_full_entities_merge.py | 10 | def _load_merge_logic_function_source():
LOW | tests/test_full_entities_merge.py | 52 | def test_implementation_preserves_existing_metadata_fields():
LOW | tests/test_full_entities_merge.py | 59 | def test_create_new_full_entities_entry():
LOW | tests/test_full_entities_merge.py | 74 | def test_preserves_existing_fields_when_merging_multimodal_entities():
LOW | tests/test_full_entities_merge.py | 95 | def test_deduplicates_new_entity_names_without_reordering_existing_ones():
LOW | tests/test_callbacks.py | 36 | def on_multimodal_item_complete(
LOW | tests/test_callbacks.py | 64 | def test_register_and_dispatch(self):
LOW | tests/test_callbacks.py | 90 | def test_register_rejects_non_callback(self):
LOW | tests/test_callbacks.py | 95 | def test_dispatch_handles_callback_errors(self):
LOW | tests/test_callbacks.py | 105 | def test_dispatch_unknown_event(self):
LOW | tests/test_callbacks.py | 113 | def test_event_log_disabled_by_default(self):
LOW | tests/test_callbacks.py | 184 | def test_process_document_emits_callbacks(self, monkeypatch, tmp_path):
LOW | tests/test_callbacks.py | 239 | def test_query_emits_callbacks(self, monkeypatch):
LOW | tests/test_resilience.py | 21 | def test_retries_on_transient_error(self):
LOW | tests/test_resilience.py | 39 | def test_raises_after_max_attempts(self):
LOW | tests/test_resilience.py | 51 | def test_does_not_retry_non_retryable(self):
LOW | tests/test_resilience.py | 110 | async def test_retries_on_transient_error(self):
LOW | tests/test_resilience.py | 129 | async def test_raises_after_max_attempts(self):
LOW | tests/test_resilience.py | 143 | def test_retry_rejects_invalid_max_attempts(self):
LOW | tests/test_resilience.py | 156 | def test_retry_rejects_invalid_delays(self):
LOW | tests/test_resilience.py | 170 | async def test_async_retry_rejects_invalid_max_attempts(self):
LOW | tests/test_resilience.py | 184 | async def test_async_retry_rejects_invalid_delays(self):
LOW | tests/test_resilience.py | 199 | def test_default_retry_does_not_retry_oserror_subclasses(self):
LOW | tests/test_resilience.py | 217 | def test_closed_state_allows_calls(self):
LOW | tests/test_resilience.py | 227 | def test_opens_after_threshold(self):
LOW | tests/test_resilience.py | 243 | def test_half_open_after_timeout(self):
LOW | tests/test_resilience.py | 259 | def test_failure_threshold_resets_outside_window(self):
LOW | tests/test_resilience.py | 291 | def test_application_error_does_not_trip_breaker(self):
LOW | tests/test_resilience.py | 307 | def test_half_open_allows_single_trial_call(self):
LOW | tests/test_insert_content_list.py | 22 | def _install_minimal_lightrag_stubs():
LOW | tests/test_insert_content_list.py | 116 | async def _ensure_lightrag_initialized(self):
LOW | tests/test_insert_content_list.py | 124 | def _generate_content_based_doc_id(self, content_list):
LOW | tests/test_insert_content_list.py | 127 | async def _process_multimodal_content(self, multimodal_items, file_ref, doc_id):
LOW | tests/test_insert_content_list.py | 131 | def test_insert_content_list_defers_status_until_after_text_insert():
LOW | tests/test_insert_content_list.py | 157 | def test_process_document_complete_defers_status_until_after_text_insert():
LOW | tests/test_insert_content_list.py | 179 | def test_process_document_complete_keeps_status_for_multimodal_only_content():
LOW | tests/test_insert_content_list.py | 196 | def test_insert_content_list_keeps_status_for_multimodal_only_content():
LOW | tests/test_doc_status_creation.py | 75 | async def test_insert_text_content_with_multimodal_falls_back_for_old_lightrag(
LOW | tests/test_doc_status_creation.py | 121 | async def test_process_document_complete_bootstraps_doc_status(
LOW | tests/test_doc_status_creation.py | 169 | async def fake_ensure_lightrag_initialized():
LOW | tests/test_doc_status_creation.py | 196 | async def test_image_only_document_falls_back_when_multimodal_flag_is_unsupported(
LOW | tests/test_doc_status_creation.py | 246 | async def fake_ensure_lightrag_initialized():
LOW | tests/test_doc_status_creation.py | 263 | async def fake_process_multimodal_content(multimodal_items, file_name, doc_id):
LOW | tests/test_doc_status_creation.py | 292 | async def test_compatibility_multimodal_cache_prevents_repeat_processing(
LOW | tests/test_doc_status_creation.py | 340 | async def fake_ensure_lightrag_initialized():
206 more matches not shown…
Excessive Try-Catch Wrapping · 166 hits · 149 pts
Severity | File | Line | Snippet
LOW | reproduce/query.py | 270 | except Exception as e:
LOW | reproduce/index.py | 218 | except Exception as e:
MEDIUM | reproduce/llm_answer_evaluator.py | 1810 | print(f"Error: result file does not exist: {file_path}")
MEDIUM | reproduce/llm_answer_evaluator.py | 1797 | print("Error: an OpenAI API key is required")
LOW | reproduce/llm_answer_evaluator.py | 442 | except Exception as e:
LOW | reproduce/llm_answer_evaluator.py | 1518 | except Exception as e:
LOW | reproduce/llm_answer_evaluator.py | 1601 | except Exception as e:
LOW | reproduce/llm_answer_evaluator.py | 1621 | except Exception:
LOW | reproduce/llm_answer_evaluator.py | 1868 | except Exception as e:
LOW | reproduce/llm_answer_evaluator.py | 209 | except Exception as e:
LOW | tests/test_close_event_loop.py | 37 | except Exception:
LOW | tests/test_close_event_loop.py | 41 | except Exception:
LOW | tests/testparser_kwargs.py | 74 | except Exception:
LOW | docs/enhanced_markdown.md | 485 | except Exception as e:
LOW | raganything/modalprocessors.py | 116 | except Exception as e:
LOW | raganything/modalprocessors.py | 444 | except Exception as e:
LOW | raganything/modalprocessors.py | 856 | except Exception as e:
LOW | raganything/modalprocessors.py | 954 | except Exception as e:
LOW | raganything/modalprocessors.py | 1017 | except Exception as e:
LOW | raganything/modalprocessors.py | 1150 | except Exception as e:
LOW | raganything/modalprocessors.py | 1212 | except Exception as e:
LOW | raganything/modalprocessors.py | 1338 | except Exception as e:
LOW | raganything/modalprocessors.py | 1395 | except Exception as e:
LOW | raganything/modalprocessors.py | 1512 | except Exception as e:
LOW | raganything/modalprocessors.py | 1558 | except Exception as e:
LOW | raganything/raganything.py | 169 | except Exception:
LOW | raganything/raganything.py | 173 | except Exception:
LOW | raganything/raganything.py | 340 | except Exception as e:
LOW | raganything/raganything.py | 412 | except Exception as e:
LOW | raganything/raganything.py | 417 | except Exception as e:
LOW | raganything/raganything.py | 468 | except Exception as e:
LOW | raganything/raganything.py | 567 | except Exception as e:
LOW | raganything/raganything.py | 605 | except Exception as e:
LOW | raganything/omml_extractor.py | 196 | except Exception as e: # pragma: no cover - defensive
LOW | raganything/query.py | 172 | except Exception as exc:
LOW | raganything/query.py | 288 | except Exception as e:
LOW | raganything/query.py | 331 | except Exception as e:
LOW | raganything/query.py | 343 | except Exception as e:
LOW | raganything/query.py | 464 | except Exception as e:
LOW | raganything/query.py | 501 | except Exception as e:
LOW | raganything/query.py | 653 | except Exception:
LOW | raganything/query.py | 663 | except Exception:
LOW | raganything/query.py | 697 | except Exception as e:
LOW | raganything/query.py | 821 | except Exception as e:
LOW | raganything/batch.py | 138 | except Exception as e:
LOW | raganything/batch.py | 398 | except Exception as e:
LOW | raganything/processor.py | 313 | except Exception as e:
LOW | raganything/processor.py | 383 | except Exception as e:
LOW | raganything/processor.py | 551 | except Exception as e:
LOW | raganything/processor.py | 677 | except Exception as e:
LOW | raganything/processor.py | 714 | except Exception as e:
LOW | raganything/processor.py | 799 | except Exception as e:
LOW | raganything/processor.py | 839 | except Exception as e:
LOW | raganything/processor.py | 904 | except Exception:
LOW | raganything/processor.py | 980 | except Exception as e:
LOW | raganything/processor.py | 1184 | except Exception as e:
LOW | raganything/processor.py | 1204 | except Exception as e:
LOW | raganything/processor.py | 1298 | except Exception as e:
LOW | raganything/processor.py | 1356 | except Exception as e:
LOW | raganything/processor.py | 1528 | except Exception as e:
106 more matches not shown…
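
Most rows in this category are catch-all `except Exception` handlers. As a hypothetical illustration of the distinction such a finding points at (this is not code from the repository, and the function names are invented), the sketch below contrasts a catch-all handler with one that names the expected failure.

```python
import json


def parse_broad(payload):
    """Catch-all handler: hides every failure, including programming errors."""
    try:
        return json.loads(payload)
    except Exception:
        return {}


def parse_narrow(payload):
    """Handle only the expected failure; anything else propagates to the caller."""
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return {}


# Malformed JSON is handled the same way by both versions.
print(parse_broad("{bad"), parse_narrow("{bad"))

# Passing None is a caller bug: the broad version silently returns {},
# while the narrow version lets the TypeError surface.
print(parse_broad(None))
```

Whether a given handler in the repository should be narrowed depends on what the surrounding code can actually raise, which this report does not show.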
Decorative Section Separators · 30 hits · 104 pts
Severity | File | Line | Snippet
MEDIUM | tests/test_close_event_loop.py | 15 | # ── Replicate the close() logic under test ──────────────────────────────
MEDIUM | tests/test_close_event_loop.py | 45 | # ── Tests ──────────────────────────────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 118 | # ── Content Separation Tests ─────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 177 | # ── Image Encoding Tests ─────────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 23 | # ── DocStatus Tests ──────────────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 44 | # ── RAGAnythingConfig Tests ──────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 201 | # ── Image Validation Tests ───────────────────────────────────────
MEDIUM | tests/test_core_modules.py | 238 | # ── Processor Type Mapping Tests ─────────────────────────────────
MEDIUM | tests/test_core_modules.py | 289 | # ── BatchProcessingResult Tests ──────────────────────────────────
MEDIUM | tests/test_core_modules.py | 352 | # ── BatchParser Initialization Tests ─────────────────────────────
MEDIUM | tests/test_minimax_integration.py | 167 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 169 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 367 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 369 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 21 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 23 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 113 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 115 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 282 | # ---------------------------------------------------------------------------
MEDIUM | tests/test_minimax_integration.py | 284 | # ---------------------------------------------------------------------------
MEDIUM | raganything/batch.py | 30 | # ==========================================
MEDIUM | raganything/batch.py | 32 | # ==========================================
MEDIUM | raganything/batch.py | 173 | # ==========================================
MEDIUM | raganything/batch.py | 175 | # ==========================================
MEDIUM | raganything/callbacks.py | 71 | # ── Parsing stage ─────────────────────────────────────────────
MEDIUM | raganything/callbacks.py | 90 | # ── Text insertion stage ──────────────────────────────────────
MEDIUM | raganything/callbacks.py | 101 | # ── Multimodal processing stage ───────────────────────────────
MEDIUM | raganything/callbacks.py | 126 | # ── Query stage ───────────────────────────────────────────────
MEDIUM | raganything/callbacks.py | 149 | # ── Document complete ─────────────────────────────────────────
MEDIUM | raganything/callbacks.py | 168 | # ── Batch processing ──────────────────────────────────────────
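
The snippets in this category are comment lines that end in a long run of `─`, `-`, or `=` characters. A minimal sketch of how such lines could be matched follows. The regular expression and the ten-character threshold are assumptions chosen to fit the snippets above, not the scanner's actual rule.

```python
import re

# A comment line whose tail is a run of at least ten separator characters.
# The threshold of ten is an assumption for this sketch.
SEPARATOR = re.compile(r"^\s*#.*[─\-=]{10,}\s*$")


def find_separators(source):
    """Return 1-based line numbers of decorative separator comments."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if SEPARATOR.match(line)
    ]


sample = "\n".join(
    [
        "# ── Tests " + "─" * 20,
        "x = 1",
        "    # " + "=" * 42,
        "# Step 1: Parse documents in batch",
    ]
)
print(find_separators(sample))
```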
Magic Placeholder Names · 16 hits · 78 pts
Severity | File | Line | Snippet
HIGH | README.md | 337 | api_key = "your-api-key"
HIGH | README.md | 475 | api_key = "your-api-key"
HIGH | README.md | 693 | api_key = "your-api-key"
HIGH | README.md | 820 | api_key = "your-api-key"
HIGH | README.md | 1019 | python examples/raganything_example.py path/to/document.pdf --api-key YOUR_API_KEY --parser mineru
HIGH | README.md | 1022 | python examples/modalprocessors_example.py --api-key YOUR_API_KEY
HIGH | README_zh.md | 314 | api_key = "your-api-key"
HIGH | README_zh.md | 454 | api_key = "your-api-key"
HIGH | README_zh.md | 671 | api_key = "your-api-key"
HIGH | README_zh.md | 798 | api_key = "your-api-key"
HIGH | README_zh.md | 997 | python examples/raganything_example.py path/to/document.pdf --api-key YOUR_API_KEY --parser mineru
HIGH | README_zh.md | 1000 | python examples/modalprocessors_example.py --api-key YOUR_API_KEY
HIGH | examples/minimax_integration_example.py | 29 | export MINIMAX_API_KEY=your-api-key
HIGH | examples/minimax_integration_example.py | 72 | "Set it with: export MINIMAX_API_KEY=your-api-key"
HIGH | examples/minimax_integration_example.py | 154 | print(" Set it with: export MINIMAX_API_KEY=your-api-key")
HIGH | examples/minimax_integration_example.py | 265 | " MINIMAX_API_KEY=your-api-key\n"
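
All of these hits are documentation or usage-message placeholders such as `api_key = "your-api-key"`. One common way to keep literal placeholders out of example code is to read the key from the environment and fail with a clear message when it is missing. A minimal sketch follows; `load_api_key` is a hypothetical helper, not part of the repository, and `MINIMAX_API_KEY` is used only because it appears in the snippets above.

```python
import os

# Placeholder spellings seen in the snippets above, lowercased for comparison.
PLACEHOLDERS = {"your-api-key", "your_api_key"}


def load_api_key(var_name="MINIMAX_API_KEY"):
    """Return the API key from the environment, rejecting unset or placeholder values."""
    key = os.environ.get(var_name, "").strip()
    if not key or key.lower() in PLACEHOLDERS:
        raise RuntimeError(f"Set {var_name} to a real API key before running.")
    return key
```

A caller would wrap `load_api_key()` at startup and exit with the error message, which also catches the case where a reader copied the placeholder verbatim.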
Deep Nesting · 64 hits · 44 pts
Severity | File | Line | Snippet
LOW | setup.py | 14 |
LOW | reproduce/query.py | 90 |
LOW | reproduce/llm_answer_evaluator.py | 159 |
LOW | reproduce/llm_answer_evaluator.py | 213 |
LOW | reproduce/llm_answer_evaluator.py | 286 |
LOW | reproduce/llm_answer_evaluator.py | 491 |
LOW | reproduce/llm_answer_evaluator.py | 759 |
LOW | reproduce/llm_answer_evaluator.py | 848 |
LOW | reproduce/llm_answer_evaluator.py | 1058 |
LOW | tests/test_full_entities_merge.py | 10 |
LOW | tests/test_close_event_loop.py | 18 |
LOW | raganything/modalprocessors.py | 68 |
LOW | raganything/modalprocessors.py | 139 |
LOW | raganything/modalprocessors.py | 179 |
LOW | raganything/modalprocessors.py | 212 |
LOW | raganything/modalprocessors.py | 244 |
LOW | raganything/modalprocessors.py | 285 |
LOW | raganything/modalprocessors.py | 603 |
LOW | raganything/raganything.py | 143 |
LOW | raganything/raganything.py | 258 |
LOW | raganything/omml_extractor.py | 111 |
LOW | raganything/omml_extractor.py | 218 |
LOW | raganything/query.py | 26 |
LOW | raganything/query.py | 195 |
LOW | raganything/query.py | 475 |
LOW | raganything/query.py | 589 |
LOW | raganything/query.py | 708 |
LOW | raganything/query.py | 618 |
LOW | raganything/processor.py | 200 |
LOW | raganything/processor.py | 386 |
LOW | raganything/processor.py | 607 |
LOW | raganything/processor.py | 725 |
LOW | raganything/processor.py | 1107 |
LOW | raganything/processor.py | 1302 |
LOW | raganything/processor.py | 1391 |
LOW | raganything/processor.py | 2098 |
LOW | raganything/parser.py | 2525 |
LOW | raganything/parser.py | 92 |
LOW | raganything/parser.py | 194 |
LOW | raganything/parser.py | 344 |
LOW | raganything/parser.py | 714 |
LOW | raganything/parser.py | 960 |
LOW | raganything/parser.py | 1151 |
LOW | raganything/parser.py | 1373 |
LOW | raganything/parser.py | 1566 |
LOW | raganything/parser.py | 1846 |
LOW | raganything/parser.py | 2099 |
LOW | raganything/parser.py | 2205 |
LOW | raganything/parser.py | 2237 |
LOW | raganything/parser.py | 2111 |
LOW | raganything/enhanced_markdown.py | 370 |
LOW | raganything/enhanced_markdown.py | 403 |
LOW | raganything/batch_parser.py | 376 |
LOW | raganything/batch_parser.py | 110 |
LOW | raganything/batch_parser.py | 203 |
LOW | raganything/resilience.py | 146 |
LOW | raganything/resilience.py | 188 |
LOW | raganything/resilience.py | 190 |
LOW | examples/office_document_test.py | 49 |
LOW | examples/minimax_integration_example.py | 150 |
4 more matches not shown…
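
The report gives no snippets or depth threshold for this category. As a rough sketch of how nesting depth can be measured for Python source, the function below walks the syntax tree with the standard `ast` module and counts nested control-flow blocks. The set of node types counted is an assumption for illustration; the scanner's own definition is not shown on this page.

```python
import ast

# Block types that add one nesting level. This choice is an assumption.
NESTING_NODES = (
    ast.If, ast.For, ast.While, ast.With, ast.Try,
    ast.AsyncFor, ast.AsyncWith,
)


def max_nesting_depth(node, depth=0):
    """Return the largest number of nested control-flow blocks under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


sample = '''
def handle(items):
    for item in items:
        if item:
            try:
                print(item)
            except ValueError:
                pass
'''
print(max_nesting_depth(ast.parse(sample)))
```

For the sample, the `for`, `if`, and `try` blocks give a depth of 3. Note that an `elif` is parsed as an `if` nested inside the `else` branch, so this simple counter treats long `elif` chains as deeper than they look.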
Self-Referential Comments · 9 hits · 33 pts
Severity | File | Line | Snippet
MEDIUM | reproduce/llm_answer_evaluator.py | 1813 | # Create the configuration.
MEDIUM | reproduce/llm_answer_evaluator.py | 1816 | # Create the evaluator.
MEDIUM | raganything/query.py | 41 | # Create a normalized representation of the query parameters
MEDIUM | raganything/processor.py | 231 | # Create a content signature
MEDIUM | raganything/parser.py | 134 | # Create a temporary file with the correct extension
MEDIUM | examples/batch_processing_example.py | 323 | # Create a directory structure with nested files
MEDIUM | examples/batch_processing_example.py | 326 | # Create main directory files
MEDIUM | scripts/create_tiktoken_cache.py | 4 | # Define the directory where you want to store the cache
MEDIUM | scripts/create_tiktoken_cache.py | 9 | # Create the directory if it doesn't exist
Unused Imports · 31 hits · 30 pts
Severity | File | Line | Snippet
LOW | tests/test_asset_urls.py | 3 |
LOW | tests/test_raganything_example.py | 1 |
LOW | tests/test_omml_extractor.py | 8 |
LOW | raganything/omml_extractor.py | 40 |
LOW | raganything/batch.py | 16 |
LOW | raganything/prompt_manager.py | 18 |
LOW | raganything/__init__.py | 1 |
LOW | raganything/__init__.py | 2 |
LOW | raganything/__init__.py | 5 |
LOW | raganything/__init__.py | 9 |
LOW | raganything/__init__.py | 9 |
LOW | raganything/__init__.py | 9 |
LOW | raganything/__init__.py | 9 |
LOW | raganything/__init__.py | 21 |
LOW | raganything/__init__.py | 21 |
LOW | raganything/__init__.py | 21 |
LOW | raganything/__init__.py | 35 |
LOW | raganything/__init__.py | 35 |
LOW | raganything/__init__.py | 35 |
LOW | raganything/__init__.py | 35 |
LOW | raganything/__init__.py | 48 |
LOW | raganything/__init__.py | 48 |
LOW | raganything/__init__.py | 48 |
LOW | raganything/__init__.py | 48 |
LOW | raganything/__init__.py | 48 |
LOW | raganything/parser.py | 24 |
LOW | raganything/asset_urls.py | 14 |
LOW | raganything/callbacks.py | 23 |
LOW | raganything/prompts_zh.py | 11 |
LOW | raganything/prompt.py | 8 |
LOW | raganything/resilience.py | 11 |
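
Nineteen of these 31 hits fall in `raganything/__init__.py`, often several on the same line. That pattern would be consistent with multi-name import statements used to re-export a package's public API, although the report does not show the lines themselves. The sketch below is a simplified, hypothetical unused-import check built on the standard `ast` module. It treats names listed in `__all__` as used, which is one way a scanner could avoid flagging re-exports. It is not the scanner's actual logic.

```python
import ast


def unused_imports(source):
    """Return imported names that are never referenced or listed in __all__."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
        elif isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets and isinstance(node.value, (ast.List, ast.Tuple)):
                # Names exported through __all__ count as used.
                for element in node.value.elts:
                    if isinstance(element, ast.Constant) and isinstance(element.value, str):
                        used.add(element.value)
    return sorted(imported - used)


# Hypothetical package __init__ contents, invented for this example.
sample = (
    "import os\n"
    "import json\n"
    "from .core import Engine\n"
    "__all__ = ['Engine']\n"
    "print(os.getcwd())\n"
)
print(unused_imports(sample))
```

In the sample only `json` is reported: `os` is referenced and `Engine` is exported through `__all__`.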
Redundant / Tautological Comments · 19 hits · 30 pts
Severity | File | Line | Snippet
LOW | reproduce/query.py | 305 | # Check if API key is provided
LOW | reproduce/index.py | 253 | # Check if API key is provided
LOW | raganything/modalprocessors.py | 163 | # Check if item is within context window and matches filter criteria
LOW | raganything/query.py | 125 | # Check if VLM enhanced query should be used
LOW | raganything/query.py | 636 | # Check if it's in the current working directory or subdirectories
LOW | raganything/processor.py | 652 | # Check if multimodal content is already processed
LOW | raganything/parser.py | 986 | # Check if this subdirectory contains the expected JSON output file
LOW | raganything/parser.py | 1590 | # Check if input is a URL
LOW | raganything/enhanced_markdown.py | 36 | # Check if pandoc module exists (not used directly, just for detection)
LOW | raganything/enhanced_markdown.py | 104 | # Check if pandoc is installed on system
LOW | raganything/enhanced_markdown.py | 511 | # Check if input file is provided
LOW | raganything/utils.py | 174 | # Check if file exists and is not a symlink (for security)
LOW | examples/office_document_test.py | 54 | # Check if file exists and is a supported Office format
LOW | examples/raganything_example.py | 307 | # Check if API key is provided
LOW | examples/batch_processing_example.py | 162 | # Display results
LOW | examples/batch_processing_example.py | 220 | # Display results
LOW | examples/image_format_test.py | 60 | # Check if file exists and is a supported image format
LOW | examples/insert_content_list_example.py | 415 | # Check if API key is provided
LOW | examples/text_format_test.py | 43 | # Check if file exists and is a supported text format
Verbosity Indicators · 12 hits · 18 pts
Severity | File | Line | Snippet
LOW | raganything/batch.py | 356 | # Step 1: Parse documents in batch
LOW | raganything/batch.py | 367 | # Step 2: Process with RAG
LOW | raganything/processor.py | 1702 | # Step 1: Parse document
LOW | raganything/processor.py | 1711 | # Step 2: Separate text and multimodal content
LOW | raganything/processor.py | 1735 | # Step 3: Insert pure text content with all parameters
LOW | raganything/processor.py | 1772 | # Step 4: Process multimodal content (using specialized processors)
LOW | raganything/processor.py | 1968 | # Step 1: Parse document
LOW | raganything/processor.py | 2015 | # Step 2: Separate text and multimodal content
LOW | raganything/processor.py | 2027 | # Step 3: Insert pure text content and multimodal content with all parameters
LOW | raganything/processor.py | 2173 | # Step 1: Separate text and multimodal content
LOW | raganything/processor.py | 2197 | # Step 2: Insert pure text content with all parameters
LOW | raganything/processor.py | 2233 | # Step 3: Process multimodal content (using specialized processors)
Cross-File Repetition · 3 hits · 15 pts
Severity | File | Line | Snippet
HIGH | reproduce/query.py | 0 | process document with raganything args: file_path: path to the document output_dir: output directory for rag results api
HIGH | reproduce/index.py | 0 | process document with raganything args: file_path: path to the document output_dir: output directory for rag results api
HIGH | examples/raganything_example.py | 0 | process document with raganything args: file_path: path to the document output_dir: output directory for rag results api
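
The three snippets above are the same lowercased, whitespace-collapsed docstring text found in three files. A minimal sketch of one way to detect that kind of repetition is to normalize each docstring and group files by a hash of the result. The normalization rules, function names, and sample docstrings here are assumptions for illustration, not the scanner's implementation.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Lowercase the text and collapse all whitespace runs to single spaces."""
    return " ".join(text.lower().split())


def find_repeats(docstrings):
    """Group file paths whose normalized docstrings are identical."""
    groups = defaultdict(list)
    for path, doc in docstrings.items():
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        groups[digest].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


# Invented docstrings; only the file names come from the table above.
docstrings = {
    "reproduce/query.py": "Process document with RAGAnything\n\n    Args:\n        file_path: Path to the document",
    "reproduce/index.py": "Process document with RAGAnything\n  Args:\n    file_path: Path to the document",
    "examples/raganything_example.py": "process document with raganything args: file_path: path to the document",
    "raganything/utils.py": "Check if file exists",
}
print(find_repeats(docstrings))
```

Exact hashing only catches copies that match after normalization; near-duplicates with small wording changes would need a similarity measure instead.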
AI Slop Vocabulary · 7 hits · 14 pts
Severity | File | Line | Snippet
MEDIUM | reproduce/llm_answer_evaluator.py | 270 | # For comprehensive evaluation, try to extract additional metrics.
LOW | tests/test_custom_parser.py | 167 | # Do not touch the filesystem; just return a dummy result.
MEDIUM | raganything/modalprocessors.py | 1410 | """Parse equation analysis response with robust JSON handling"""
MEDIUM | raganything/omml_extractor.py | 52 | # them to construct fully qualified tag names so that lookups are robust to
MEDIUM | examples/enhanced_markdown_example.py | 30 | """Create comprehensive sample markdown content for testing"""
MEDIUM | examples/enhanced_markdown_example.py | 898 | # Demonstrate robust conversion with fallbacks
MEDIUM | examples/enhanced_markdown_example.py | 920 | # Test robust conversion
Docstring Block Structure · 2 hits · 10 pts
Severity | File | Line | Snippet
HIGH | raganything/query.py | 203 | Multimodal query - combines text and multimodal content for querying Args: query: Base que
HIGH | raganything/parser.py | 2494 | Get a parser instance by name. Checks built-in parsers first, then falls back to the custom parser registry pop
Over-Commented Block · 1 hit · 1 pt
Severity | File | Line | Snippet
LOW | requirements.txt | 1 | huggingface_hub