Repository Analysis

HKUDS/LightRAG

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

27.8 Moderate AI signal

Adjusted Score: 27.8
Raw Score: 27.8
Time Factor: 100%
Last Push: 2026-05-30
Stars: 35,969
Language: Python
Lines of Code: 218,598
Files: 584
Pattern Hits: 4290
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 193 · MEDIUM 627 · LOW 3470

Pattern Findings

4290 matches across 17 categories.
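
As a quick consistency check, the four severity counts should add up to the total number of pattern hits. The sketch below uses only figures from this report; the variable names are illustrative and not part of any tool.

```python
# Counts copied from the Severity Breakdown and the summary figures above.
severity_counts = {"CRITICAL": 0, "HIGH": 193, "MEDIUM": 627, "LOW": 3470}
reported_pattern_hits = 4290

total_hits = sum(severity_counts.values())

# Share of all hits per severity level, as a fraction between 0 and 1.
severity_share = {
    level: count / total_hits for level, count in severity_counts.items()
}

for level, share in severity_share.items():
    print(f"{level:<8} {severity_counts[level]:>5} {share:6.1%}")
```

The counts do sum to the reported total, and LOW findings make up the large majority of hits.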

Decorative Section Separators: 517 hits · 1906 pts
Severity  File  Line  Snippet
MEDIUM  lightrag/pipeline.py  204  # ============================================================
MEDIUM  lightrag/pipeline.py  206  # ============================================================
MEDIUM  lightrag/pipeline.py  2410  # ============================================================
MEDIUM  lightrag/pipeline.py  2412  # ============================================================
MEDIUM  lightrag/pipeline.py  3026  # ============================================================
MEDIUM  lightrag/pipeline.py  3028  # ============================================================
MEDIUM  lightrag/pipeline.py  1162  # ============================================================
MEDIUM  lightrag/pipeline.py  1164  # ============================================================
MEDIUM  lightrag/pipeline.py  1432  # ============================================================
MEDIUM  lightrag/pipeline.py  1434  # ============================================================
MEDIUM  lightrag/pipeline.py  1731  # ============================================================
MEDIUM  lightrag/pipeline.py  1733  # ============================================================
MEDIUM  lightrag/pipeline.py  2591  # ============================================================
MEDIUM  lightrag/pipeline.py  2593  # ============================================================
MEDIUM  lightrag/pipeline.py  3703  # ============================================================
MEDIUM  lightrag/pipeline.py  3705  # ============================================================
MEDIUM  lightrag/multimodal_context.py  87  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  91  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  131  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  134  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  187  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  189  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  266  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  268  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  462  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  464  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  686  # ---------------------------------------------------------------------------
MEDIUM  lightrag/multimodal_context.py  688  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  565  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  567  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  592  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  594  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  440  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  442  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  490  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  492  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  630  # ---------------------------------------------------------------------------
MEDIUM  lightrag/utils_pipeline.py  632  # ---------------------------------------------------------------------------
MEDIUM  lightrag/llm/binding_options.py  421  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  423  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  433  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  587  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  589  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  596  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  633  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  635  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  641  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  663  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  665  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  675  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  32  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  34  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  68  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  538  # =============================================================================
MEDIUM  lightrag/llm/binding_options.py  540  # =============================================================================
MEDIUM  lightrag/parser/routing.py  67  # ---------------------------------------------------------------------------
MEDIUM  lightrag/parser/routing.py  69  # ---------------------------------------------------------------------------
MEDIUM  lightrag/parser/routing.py  166  # ---------------------------------------------------------------------------
MEDIUM  lightrag/parser/routing.py  178  # ---------------------------------------------------------------------------
MEDIUM  lightrag/parser/external/docling/ir_builder.py  97  # ------------------------------------------------------------------
457 more matches not shown…
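
The report does not state how separator lines are detected. A minimal sketch, assuming a hit is any comment line made up of one punctuation character repeated at least ten times, could look like the following. The names `SEPARATOR_RE` and `find_decorative_separators` and the length threshold are assumptions for illustration, not the scanner's actual rule.

```python
import re

# Assumption: a "decorative separator" is a comment whose body is a single
# punctuation character repeated 10 or more times, e.g. "# ==========".
SEPARATOR_RE = re.compile(r"^\s*#\s*([=\-*~_#])\1{9,}\s*$")


def find_decorative_separators(source: str) -> list[int]:
    """Return the 1-based line numbers of separator-style comment lines."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if SEPARATOR_RE.match(line)
    ]


example = "\n".join(
    [
        "# " + "=" * 60,
        "# Section title",
        "# " + "=" * 60,
        "value = 1  # --- inline comments are ignored",
    ]
)
print(find_decorative_separators(example))
```

Under this rule a banner of the kind listed above produces two hits, one for each rule line around the title, which is consistent with the paired line numbers in the table (204 and 206, 2410 and 2412, and so on).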
Hyper-Verbose Identifiers: 2123 hits · 1789 pts
Severity  File  Line  Snippet
LOW  reproduce/Step_3_openai_compatible.py  57  def run_queries_and_save_to_json(
LOW  reproduce/Step_3.py  26  def run_queries_and_save_to_json(
LOW  lightrag/rerank.py  22  def chunk_documents_for_rerank(
LOW  lightrag/addon_params.py  30  def _emit_deprecated_addon_warnings(params: Mapping[str, Any]) -> None:
LOW  lightrag/llm_roles.py  125  def register_role_llm_builder(
LOW  lightrag/llm_roles.py  157  def _get_effective_role_llm_kwargs(self, role: str) -> dict[str, Any]:
LOW  lightrag/llm_roles.py  165  def _get_effective_role_llm_timeout(self, role: str) -> int:
LOW  lightrag/llm_roles.py  169  def _get_effective_role_llm_max_async(self, role: str) -> int:
LOW  lightrag/llm_roles.py  206  def _rebuild_single_role_llm_func(self, role: str) -> None:
LOW  lightrag/llm_roles.py  222  def _schedule_retired_llm_queue_cleanup(
LOW  lightrag/llm_roles.py  250  def _finalize_retired_llm_queue_cleanup(self, task: asyncio.Task) -> None:
LOW  lightrag/llm_roles.py  259  async def wait_for_retired_llm_queues(self) -> None:
LOW  lightrag/llm_roles.py  270  def _apply_llm_role_config_update(
LOW  lightrag/llm_roles.py  564  async def get_embedding_queue_status(self) -> dict[str, Any]:
LOW  lightrag/lightrag.py  654  def _set_runtime_addon_params(self, addon_params: Mapping[str, Any] | None) -> None:
LOW  lightrag/lightrag.py  658  def _apply_chunk_size_overlay(self) -> None:
LOW  lightrag/lightrag.py  780  def _refresh_addon_params_cache(self) -> None:
LOW  lightrag/lightrag.py  800  def _ensure_addon_params_cache(self) -> None:
LOW  lightrag/lightrag.py  833  def _build_role_llm_cache_identity(
LOW  lightrag/lightrag.py  1438  async def _process_extract_entities(
LOW  lightrag/lightrag.py  2208  async def _update_delete_retry_state(
LOW  lightrag/lightrag.py  2253  async def _get_existing_llm_cache_ids(self, cache_ids: list[str]) -> list[str]:
LOW  lightrag/operate.py  84  def _get_relationship_vdb_timeout_seconds(global_config: dict[str, Any]) -> float:
LOW  lightrag/operate.py  101  def _format_relation_edge_label(edge_key: tuple[str, str] | list[str]) -> str:
LOW  lightrag/operate.py  109  def _truncate_entity_identifier(
LOW  lightrag/operate.py  187  async def _handle_entity_relation_summary(
LOW  lightrag/operate.py  410  def _handle_single_entity_extraction(
LOW  lightrag/operate.py  497  def _handle_single_relationship_extraction(
LOW  lightrag/operate.py  584  def _normalize_text_extraction_record_attributes(
LOW  lightrag/operate.py  606  def _looks_like_json_extraction_result(result: str) -> bool:
LOW  lightrag/operate.py  622  async def _process_json_extraction_result(
LOW  lightrag/operate.py  809  async def rebuild_knowledge_from_chunks(
LOW  lightrag/operate.py  993  async def _locked_rebuild_relationship(src, tgt, chunk_ids):
LOW  lightrag/operate.py  1096  async def _get_cached_extraction_results(
LOW  lightrag/operate.py  1186  async def _process_extraction_result(
LOW  lightrag/operate.py  1317  async def _rebuild_from_extraction_result(
LOW  lightrag/operate.py  1618  async def _rebuild_single_relationship(
LOW  lightrag/operate.py  2900  async def _locked_process_entity_name(entity_name, entities):
LOW  lightrag/operate.py  3938  def _strip_markdown_code_fence(text: str) -> str:
LOW  lightrag/operate.py  5016  async def _find_most_related_edges_from_entities(
LOW  lightrag/operate.py  5072  async def _find_related_text_unit_from_entities(
LOW  lightrag/operate.py  5290  async def _find_most_related_entities_from_relationships(
LOW  lightrag/operate.py  5323  async def _find_related_text_unit_from_relations(
LOW  lightrag/utils.py  97  def _patch_ascii_colors_console_handler() -> None:
LOW  lightrag/utils.py  137  async def safe_vdb_operation_with_exception(
LOW  lightrag/utils.py  703  def serialize_llm_cache_identity(identity: Any) -> str:
LOW  lightrag/utils.py  708  def _validate_cached_response_format(response_format: Any | None) -> None:
LOW  lightrag/utils.py  734  def get_unique_filename_in_parsed(target_dir: Path, original_name: str) -> str:
LOW  lightrag/utils.py  856  def priority_limit_async_func_call(
LOW  lightrag/utils.py  1383  def wrap_embedding_func_with_attrs(**kwargs):
LOW  lightrag/utils.py  1481  def _sanitize_string_for_json(text: str) -> str:
LOW  lightrag/utils.py  1721  def pack_user_ass_to_openai_messages(*args: str):
LOW  lightrag/utils.py  1728  def split_string_by_multi_markers(content: str, markers: list[str]) -> list[str]:
LOW  lightrag/utils.py  1741  def truncate_list_by_token_size(
LOW  lightrag/utils.py  1780  def split_text_units_for_hard_fallback(text: str) -> list[str]:
LOW  lightrag/utils.py  1796  def split_text_by_token_limit(
LOW  lightrag/utils.py  1850  def _normalized_child_offsets(
LOW  lightrag/utils.py  1938  def enforce_chunk_token_limit_before_embedding(
LOW  lightrag/utils.py  2798  def sanitize_and_normalize_extracted_text(
LOW  lightrag/utils.py  2951  def sanitize_text_for_encoding(text: str, replacement_char: str = "") -> str:
2063 more matches not shown…
Excessive Try-Catch Wrapping: 673 hits · 539 pts
Severity  File  Line  Snippet
LOW  reproduce/Step_1_openai_compatible.py  50  except Exception as e:
LOW  reproduce/Step_0.py  41  except Exception as e:
MEDIUM  reproduce/Step_0.py  42  print(f"An error occurred while processing file {filename}: {e}")
LOW  reproduce/Step_0.py  54  except Exception as e:
MEDIUM  reproduce/Step_0.py  55  print(f"An error occurred while saving to the file {output_filename}: {e}")
LOW  reproduce/Step_1.py  19  except Exception as e:
LOW  reproduce/Step_3_openai_compatible.py  53  except Exception as e:
MEDIUM  reproduce/Step_3_openai_compatible.py  49  def process_query(query_text, rag_instance, query_param):
LOW  reproduce/Step_3.py  22  except Exception as e:
MEDIUM  reproduce/Step_3.py  18  def process_query(query_text, rag_instance, query_param):
LOW  lightrag/rerank.py  58  except Exception as e:
LOW  lightrag/rerank.py  544  except Exception as e:
LOW  lightrag/rerank.py  559  except Exception as e:
LOW  lightrag/rerank.py  574  except Exception as e:
LOW  lightrag/llm_roles.py  256  except Exception as e:
LOW  lightrag/llm_roles.py  342  except Exception:
LOW  lightrag/lightrag.py  12  except Exception: # pragma: no cover - optional dependency
LOW  lightrag/lightrag.py  1210  except Exception as e:
LOW  lightrag/lightrag.py  1451  except Exception as e:
LOW  lightrag/lightrag.py  1756  except Exception as e:
LOW  lightrag/lightrag.py  2167  except Exception as e:
LOW  lightrag/lightrag.py  2268  except Exception as verification_error:
LOW  lightrag/lightrag.py  2302  except Exception as e:
LOW  lightrag/lightrag.py  2482  except Exception as e:
LOW  lightrag/lightrag.py  2637  except Exception as e:
LOW  lightrag/lightrag.py  2654  except Exception as e:
LOW  lightrag/lightrag.py  2687  except Exception as e:
LOW  lightrag/lightrag.py  2750  except Exception as e:
LOW  lightrag/lightrag.py  2757  except Exception as e:
LOW  lightrag/lightrag.py  2778  except Exception as e:
LOW  lightrag/lightrag.py  2786  except Exception as e:
LOW  lightrag/lightrag.py  3000  except Exception as cache_err:
LOW  lightrag/lightrag.py  3013  except Exception as e:
LOW  lightrag/lightrag.py  3072  except Exception as cache_collect_error:
LOW  lightrag/lightrag.py  3091  except Exception as status_write_error:
LOW  lightrag/lightrag.py  3171  except Exception as e:
LOW  lightrag/lightrag.py  3349  except Exception as e:
LOW  lightrag/lightrag.py  3370  except Exception as e:
LOW  lightrag/lightrag.py  3408  except Exception as e:
LOW  lightrag/lightrag.py  3505  except Exception as e:
LOW  lightrag/lightrag.py  3513  except Exception as e:
LOW  lightrag/lightrag.py  3536  except Exception as e:
LOW  lightrag/lightrag.py  3574  except Exception as cache_delete_error:
LOW  lightrag/lightrag.py  3591  except Exception as e:
LOW  lightrag/lightrag.py  3606  except Exception as e:
LOW  lightrag/lightrag.py  3619  except Exception as e:
LOW  lightrag/lightrag.py  3638  except Exception as status_update_error:
LOW  lightrag/lightrag.py  3662  except Exception as persistence_error:
LOW  lightrag/operate.py  490  except Exception as e:
LOW  lightrag/operate.py  577  except Exception as e:
LOW  lightrag/operate.py  648  except Exception as e:
LOW  lightrag/operate.py  720  except Exception as e:
LOW  lightrag/operate.py  800  except Exception as e:
LOW  lightrag/operate.py  943  except Exception as e:
LOW  lightrag/operate.py  984  except Exception as e:
LOW  lightrag/operate.py  1022  except Exception as e:
LOW  lightrag/operate.py  1067  except Exception as e:
LOW  lightrag/operate.py  1441  except Exception as e:
LOW  lightrag/operate.py  1847  except Exception as e:
LOW  lightrag/operate.py  1876  except Exception as e:
613 more matches not shown…
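
Most rows in this category point at an `except Exception` line. A comparable check can be sketched with Python's standard `ast` module, as below. This is an illustration under assumptions: the function name `find_broad_excepts` is hypothetical, and the report does not say how the scanner assigns the MEDIUM rows that point at `print` calls or function definitions.

```python
import ast

# Exception classes treated as "catch everything" in this sketch.
BROAD_NAMES = {"Exception", "BaseException"}


def find_broad_excepts(source: str) -> list[int]:
    """Return line numbers of bare `except:` and `except Exception` handlers."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        is_bare = node.type is None
        is_broad = isinstance(node.type, ast.Name) and node.type.id in BROAD_NAMES
        if is_bare or is_broad:
            hits.append(node.lineno)
    return sorted(hits)


example = """\
try:
    risky()
except ValueError:
    pass
try:
    risky()
except Exception as e:
    print(e)
"""
print(find_broad_excepts(example))
```

Handlers that catch a tuple of exception types are not inspected here; only a single broad class name or a bare `except:` counts as a hit.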
Cross-File Repetition: 106 hits · 530 pts
Severity  File  Line  Snippet
HIGH  lightrag/base.py  0  get all edges in the graph. returns: a list of all edges, where each edge is a dictionary of its properties
HIGH  lightrag/kg/networkx_impl.py  0  get all edges in the graph. returns: a list of all edges, where each edge is a dictionary of its properties
HIGH  lightrag/kg/mongo_impl.py  0  get all edges in the graph. returns: a list of all edges, where each edge is a dictionary of its properties
HIGH  lightrag/kg/memgraph_impl.py  0  get all edges in the graph. returns: a list of all edges, where each edge is a dictionary of its properties
HIGH  lightrag/kg/neo4j_impl.py  0  get all edges in the graph. returns: a list of all edges, where each edge is a dictionary of its properties
HIGH  lightrag/kg/qdrant_impl.py  0  buffered vector upsert waiting for embedding and/or bulk flush.
HIGH  lightrag/kg/mongo_impl.py  0  buffered vector upsert waiting for embedding and/or bulk flush.
HIGH  lightrag/kg/opensearch_impl.py  0  buffered vector upsert waiting for embedding and/or bulk flush.
HIGH  lightrag/kg/milvus_impl.py  0  buffered vector upsert waiting for embedding and/or bulk flush.
HIGH  lightrag/kg/qdrant_impl.py  0  buffer an entity vector delete by computing its hash id.
HIGH  lightrag/kg/mongo_impl.py  0  buffer an entity vector delete by computing its hash id.
HIGH  lightrag/kg/opensearch_impl.py  0  buffer an entity vector delete by computing its hash id.
HIGH  lightrag/kg/milvus_impl.py  0  buffer an entity vector delete by computing its hash id.
HIGH  lightrag/kg/qdrant_impl.py  0  get multiple vector data by their ids (read-your-writes), preserving order.
HIGH  lightrag/kg/mongo_impl.py  0  get multiple vector data by their ids (read-your-writes), preserving order.
HIGH  lightrag/kg/milvus_impl.py  0  get multiple vector data by their ids (read-your-writes), preserving order.
HIGH  lightrag/kg/networkx_impl.py  0  get all nodes in the graph. returns: a list of all nodes, where each node is a dictionary of its properties
HIGH  lightrag/kg/mongo_impl.py  0  get all nodes in the graph. returns: a list of all nodes, where each node is a dictionary of its properties
HIGH  lightrag/kg/postgres_impl.py  0  get all nodes in the graph. returns: a list of all nodes, where each node is a dictionary of its properties
HIGH  lightrag/kg/memgraph_impl.py  0  get all nodes in the graph. returns: a list of all nodes, where each node is a dictionary of its properties
HIGH  lightrag/kg/neo4j_impl.py  0  get all nodes in the graph. returns: a list of all nodes, where each node is a dictionary of its properties
HIGH  lightrag/kg/mongo_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/mongo_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/postgres_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/postgres_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/redis_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/redis_impl.py  0  check if the storage is empty for the current workspace and namespace returns: bool: true if storage is empty, false oth
HIGH  lightrag/kg/mongo_impl.py  0  drop the storage by removing all documents in the collection. returns: dict[str, str]: status of the operation with keys
HIGH  lightrag/kg/mongo_impl.py  0  drop the storage by removing all documents in the collection. returns: dict[str, str]: status of the operation with keys
HIGH  lightrag/kg/mongo_impl.py  0  drop the storage by removing all documents in the collection. returns: dict[str, str]: status of the operation with keys
HIGH  lightrag/kg/mongo_impl.py  0  get counts of documents in each status for all documents returns: dictionary mapping status names to counts, including '
HIGH  lightrag/kg/postgres_impl.py  0  get counts of documents in each status for all documents returns: dictionary mapping status names to counts, including '
HIGH  lightrag/kg/json_doc_status_impl.py  0  get counts of documents in each status for all documents returns: dictionary mapping status names to counts, including '
HIGH  lightrag/kg/redis_impl.py  0  get counts of documents in each status for all documents returns: dictionary mapping status names to counts, including '
HIGH  lightrag/kg/mongo_impl.py  0  get document by file path args: file_path: the file path to search for returns: union[dict[str, any], none]: document da
HIGH  lightrag/kg/postgres_impl.py  0  get document by file path args: file_path: the file path to search for returns: union[dict[str, any], none]: document da
HIGH  lightrag/kg/redis_impl.py  0  get document by file path args: file_path: the file path to search for returns: union[dict[str, any], none]: document da
HIGH  lightrag/kg/mongo_impl.py  0  get the total degree (sum of relationships) of two nodes. args: src_id: label of the source node tgt_id: label of the ta
HIGH  lightrag/kg/memgraph_impl.py  0  get the total degree (sum of relationships) of two nodes. args: src_id: label of the source node tgt_id: label of the ta
HIGH  lightrag/kg/neo4j_impl.py  0  get the total degree (sum of relationships) of two nodes. args: src_id: label of the source node tgt_id: label of the ta
HIGH  lightrag/kg/postgres_impl.py  0  get documents with pagination support args: status_filter: filter by document status, none for all statuses page: page n
HIGH  lightrag/kg/json_doc_status_impl.py  0  get documents with pagination support args: status_filter: filter by document status, none for all statuses page: page n
HIGH  lightrag/kg/redis_impl.py  0  get documents with pagination support args: status_filter: filter by document status, none for all statuses page: page n
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  set -euo pipefail source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" reset_state load_existing_env_if_pr
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_misc.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
HIGH  tests/setup/test_validate.py  0  source "{repo_root}/scripts/setup/setup.sh" repo_root="{tmp_path}" security_check_env_file
46 more matches not shown…
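
The snippets in this category are lowercased with whitespace collapsed, which suggests the text is normalized before comparison. A minimal sketch of that idea, restricted to Python docstrings, is shown below. The function names, the normalization steps, and the two-occurrence threshold are all assumptions; the scanner's actual matching rule is not documented in this report, and it evidently also compares string content such as the shell snippets in the test files.

```python
import ast
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase the text and collapse every run of whitespace to one space."""
    return " ".join(text.lower().split())


def find_repeated_docstrings(
    files: dict[str, str], min_occurrences: int = 2
) -> dict[str, list[str]]:
    """Map each normalized docstring to the files it occurs in.

    `files` maps a path to its source text. A path is listed once per
    occurrence, so a docstring repeated inside one file also counts.
    """
    occurrences: dict[str, list[str]] = defaultdict(list)
    for path, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                doc = ast.get_docstring(node)
                if doc:
                    occurrences[normalize(doc)].append(path)
    return {
        doc: sorted(paths)
        for doc, paths in occurrences.items()
        if len(paths) >= min_occurrences
    }
```

Many of the rows above are storage backends repeating the docstring of a shared interface method, so a result grouped by docstring text makes it easy to see which duplicates come from one base class contract.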
Redundant / Tautological Comments: 166 hits · 243 pts
Severity  File  Line  Snippet
LOW  docker-build-push.sh  38  # Check if buildx builder exists, create if not
LOW  lightrag/lightrag.py  2136  # Check if query_result is None
LOW  lightrag/operate.py  1055  # Check if any task raised an exception and ensure all exceptions are retrieved
LOW  lightrag/operate.py  2453  # Check if this is a placeholder record
LOW  lightrag/operate.py  3251  # Check if JSON structured output mode is enabled
LOW  lightrag/operate.py  3610  # Check if any task raised an exception and ensure all exceptions are retrieved
LOW  lightrag/operate.py  3878  # Check if pre-defined keywords are already provided
LOW  lightrag/operate.py  5416  # Check if any relations still have chunks after deduplication
LOW  lightrag/utils.py  373  # Check if record has the required attributes for an access log
LOW  lightrag/utils.py  550  # Check if func is already an EmbeddingFunc instance and unwrap it
LOW  lightrag/utils.py  573  # Check if user provided embedding_dim parameter
LOW  lightrag/utils.py  597  # Check if underlying function supports max_token_size and inject if not provided
LOW  lightrag/utils.py  610  # Check if total elements can be evenly divided by embedding_dim
LOW  lightrag/utils.py  953  # Check if task was cancelled before worker started
LOW  lightrag/utils.py  2089  # Check if we already have identical content cached
LOW  lightrag/utils.py  3078  # Check if there are still unused chunks
LOW  lightrag/utils.py  3326  # Check if results are in the new index-based format
LOW  lightrag/pipeline.py  1253  # Check if corresponding content exists in full_docs
LOW  lightrag/pipeline.py  1256  # Check if this is a failed document that should be preserved
LOW  lightrag/pipeline.py  1329  # Check if document has corresponding content in full_docs (consistency check)
LOW  lightrag/pipeline.py  1332  # Check if document is in interrupted status
LOW  lightrag/storage_migrations.py  34  # Check if migration is needed:
LOW  lightrag/storage_migrations.py  54  # Check if full_entities and full_relations are empty
LOW  lightrag/utils_graph.py  97  # Check if the entity exists
LOW  lightrag/utils_graph.py  198  # Check if the relation exists
LOW  lightrag/utils_graph.py  849  # Check if storage has existing data
LOW  lightrag/utils_graph.py  963  # Check if entity already exists
LOW  lightrag/utils_graph.py  1082  # Check if both entities exist
LOW  lightrag/utils_graph.py  1091  # Check if relation already exists
LOW  lightrag/base.py  237  # Check if model_name exists (model_name is optional in EmbeddingFunc)
LOW  lightrag/tools/clean_llm_query_cache.py  205  # Check if config.ini has configuration
LOW  lightrag/tools/clean_llm_query_cache.py  969  # Check if choice is valid
LOW  lightrag/tools/clean_llm_query_cache.py  1067  # Check if user cancelled
LOW  lightrag/tools/clean_llm_query_cache.py  1129  # Check if there are any records to delete
LOW  lightrag/tools/download_cache.py  39  # Check if TIKTOKEN_CACHE_DIR is already set in environment
LOW  lightrag/tools/check_initialization.py  101  # Print results
LOW  lightrag/tools/prepare_qdrant_legacy_data.py  548  # Print result
LOW  lightrag/tools/migrate_llm_cache.py  211  # Check if storage requires configuration
LOW  lightrag/tools/migrate_llm_cache.py  218  # Check if has environment variables
LOW  lightrag/tools/migrate_llm_cache.py  223  # Check if has config.ini configuration
LOW  lightrag/tools/migrate_llm_cache.py  190  # Check if config.ini has configuration
LOW  lightrag/tools/migrate_llm_cache.py  1060  # Check if choice is valid
LOW  lightrag/tools/migrate_llm_cache.py  1429  # Check if user cancelled (setup_storage returns None for all fields)
LOW  lightrag/tools/migrate_llm_cache.py  1433  # Check if there are at least 2 storage types available
LOW  lightrag/tools/lightrag_visualizer/graph_visualizer.py  920  # Check if node is behind camera
LOW  lightrag/llm/jina.py  28  # Check if the error response is HTML (common for 502, 503, etc.)
LOW  lightrag/llm/gemini.py  258  # Check if this part is thought content using the 'thought' attribute
LOW  lightrag/llm/binding_options.py  213  # Check if this is a dataclass and use dataclass fields
LOW  lightrag/llm/openai.py  516  # Check if this chunk has usage information (final chunk)
LOW  lightrag/llm/openai.py  523  # Check if choices exists and is not empty
LOW  lightrag/llm/openai.py  531  # Check if delta exists
LOW  lightrag/llm/openai.py  47  # Check if required Langfuse environment variables are set
LOW  lightrag/llm/openai.py  714  # Check if we should include reasoning content
LOW  lightrag/parser/docx/numbering_resolver.py  199  # Check if this style has numPr
LOW  lightrag/parser/docx/parse_document.py  601  # Check if this block can be absorbed (table_chunk_role constraints)
LOW  lightrag/parser/docx/parse_document.py  617  # Check if combined size doesn't exceed MAX
LOW  lightrag/parser/docx/parse_document.py  892  # Check if this block starts with a split table chunk (has _chunk_heading metadata)
LOW  lightrag/parser/docx/parse_document.py  1579  # Check if this is a heading using the new function
LOW  lightrag/parser/docx/parse_document.py  1761  # Check if table needs splitting (disabled in fixlevel mode)
LOW  lightrag/parser/docx/utils.py  361  # Check if it mentions billing which indicates permanent quota issue
106 more matches not shown…
Deep Nesting: 307 hits · 216 pts
Severity  File  Line  Snippet
LOW  reproduce/Step_0.py  7
LOW  reproduce/Step_3_openai_compatible.py  57
LOW  reproduce/Step_3.py  26
LOW  lightrag/rerank.py  22
LOW  lightrag/rerank.py  116
LOW  lightrag/rerank.py  182
LOW  lightrag/table_markup.py  94
LOW  lightrag/llm_roles.py  270
LOW  lightrag/lightrag.py  848
LOW  lightrag/lightrag.py  1182
LOW  lightrag/lightrag.py  1496
LOW  lightrag/lightrag.py  2043
LOW  lightrag/lightrag.py  2383
LOW  lightrag/lightrag.py  2794
LOW  lightrag/operate.py  187
LOW  lightrag/operate.py  809
LOW  lightrag/operate.py  1186
LOW  lightrag/operate.py  1371
LOW  lightrag/operate.py  1618
LOW  lightrag/operate.py  1901
LOW  lightrag/operate.py  2230
LOW  lightrag/operate.py  2815
LOW  lightrag/operate.py  3221
LOW  lightrag/operate.py  3951
LOW  lightrag/operate.py  4169
LOW  lightrag/operate.py  5072
LOW  lightrag/operate.py  5323
LOW  lightrag/operate.py  3309
LOW  lightrag/utils.py  137
LOW  lightrag/utils.py  856
LOW  lightrag/utils.py  1796
LOW  lightrag/utils.py  1938
LOW  lightrag/utils.py  2180
LOW  lightrag/utils.py  3093
LOW  lightrag/utils.py  3273
LOW  lightrag/utils.py  3357
LOW  lightrag/utils.py  3778
LOW  lightrag/utils.py  887
LOW  lightrag/utils.py  924
LOW  lightrag/utils.py  1016
LOW  lightrag/pipeline.py  208
LOW  lightrag/pipeline.py  990
LOW  lightrag/pipeline.py  1436
LOW  lightrag/pipeline.py  1608
LOW  lightrag/pipeline.py  1735
LOW  lightrag/pipeline.py  3164
LOW  lightrag/pipeline.py  3707
LOW  lightrag/pipeline.py  4592
LOW  lightrag/pipeline.py  439
LOW  lightrag/storage_migrations.py  30
LOW  lightrag/storage_migrations.py  100
LOW  lightrag/storage_migrations.py  197
LOW  lightrag/multimodal_context.py  192
LOW  lightrag/multimodal_context.py  467
LOW  lightrag/multimodal_context.py  591
LOW  lightrag/utils_graph.py  262
LOW  lightrag/utils_graph.py  528
LOW  lightrag/utils_graph.py  737
LOW  lightrag/utils_graph.py  1198
LOW  lightrag/utils_graph.py  1626
247 more matches not shown…
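
Several line numbers in this category coincide with function definitions listed in other categories (for example `lightrag/operate.py` line 187), so the hits appear to be reported per function. A rough way to measure nesting per function with the standard `ast` module is sketched below. The set of node types counted, the threshold of 4, and the function names are assumptions, not the scanner's documented rule.

```python
import ast

# Statements that open a new indentation level in this sketch.
# Note: an `elif` is parsed as an `If` inside `orelse`, so it counts as deeper.
NESTING_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While, ast.With, ast.AsyncWith, ast.Try,
)


def max_nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested control-flow blocks below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


def find_deeply_nested_functions(
    source: str, threshold: int = 4
) -> list[tuple[int, str, int]]:
    """Return (line, name, depth) for functions nested at least `threshold` deep."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting_depth(node)
            if depth >= threshold:
                hits.append((node.lineno, node.name, depth))
    return sorted(hits)
```

Because the reported line is the `def` line, a finding of this kind identifies the function to refactor rather than the innermost block.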
Docstring Block Structure · 43 hits · 215 pts
Severity | File | Line | Snippet
HIGH | lightrag/utils.py | 1246 | Execute function with enhanced priority-based concurrency control and timeout handling Args:
HIGH | lightrag/pipeline.py | 221 | Pipeline for Processing Documents 1. Validate ids if provided or generate MD5 hash IDs and remove dupl
HIGH | lightrag/tools/clean_llm_query_cache.py | 250 | Initialize storage instance with fallback to config.ini and defaults Args: storage_name: Storage im
HIGH | lightrag/tools/migrate_llm_cache.py | 263 | Initialize storage instance with fallback to config.ini and defaults Args: storage_name: Storage im
HIGH | lightrag/llm/jina.py | 85 | Generate embeddings for a list of texts using Jina AI's API. Args: texts: List of texts to embed. m
HIGH | lightrag/llm/voyageai.py | 61 | Generate embeddings for a list of texts using VoyageAI's API. Args: texts: List of texts to embed.
HIGH | lightrag/llm/gemini.py | 302 | Complete a prompt using Gemini's API with Chain of Thought (COT) support. This function supports automatic int
HIGH | lightrag/llm/gemini.py | 624 | Generate embeddings for a list of texts using Gemini's API. This function uses Google's Gemini embedding model to g
HIGH | lightrag/llm/openai.py | 259 | Complete a prompt using OpenAI's API with caching support and Chain of Thought (COT) integration. This function sup
HIGH | lightrag/llm/openai.py | 912 | Generate embeddings for a list of texts using OpenAI's API with automatic text truncation. This function supports b
HIGH | lightrag/parser/docx/utils.py | 94 | Create Gemini client for AI Studio or Vertex AI. Supports two modes: - AI Studio (default): Uses GOOGLE_AP
HIGH | lightrag/parser/docx/utils.py | 171 | Create OpenAI client with optional custom base URL. Environment variables: - OPENAI_API_KEY: Required API
HIGH | lightrag/kg/postgres_impl.py | 519 | Execute a database operation with automatic retry for transient failures. Args: operation:
HIGH | lightrag/kg/memgraph_impl.py | 134 | Check if a node exists in the graph. Args: node_id: The ID of the node to check.
HIGH | lightrag/kg/memgraph_impl.py | 174 | Check if an edge exists between two nodes in the graph. Args: source_node_id: The ID of th
HIGH | lightrag/kg/memgraph_impl.py | 222 | Get node by its label identifier, return only node properties Args: node_id: The node label to look
HIGH | lightrag/kg/memgraph_impl.py | 277 | Get the degree (number of relationships) of a node with the given label. If multiple nodes have the same label,
HIGH | lightrag/kg/memgraph_impl.py | 364 | Retrieves all edges (relationships) for a particular node identified by its label. Args: source_nod
HIGH | lightrag/kg/memgraph_impl.py | 430 | Get edge properties between two nodes. Args: source_node_id: Label of the source node t
HIGH | lightrag/kg/neo4j_impl.py | 440 | Check if a node with the given label exists in the database Args: node_id: Label of the no
HIGH | lightrag/kg/neo4j_impl.py | 474 | Check if an edge exists between two nodes Args: source_node_id: Label of the source node
HIGH | lightrag/kg/neo4j_impl.py | 516 | Get node by its label identifier, return only node properties Args: node_id: The node label to look
HIGH | lightrag/kg/neo4j_impl.py | 607 | Get the degree (number of relationships) of a node with the given label. If multiple nodes have the same label,
HIGH | lightrag/kg/neo4j_impl.py | 744 | Get edge properties between two nodes. Args: source_node_id: Label of the source node t
HIGH | lightrag/kg/neo4j_impl.py | 881 | Retrieves all edges (relationships) for a particular node identified by its label. Args: source_nod
HIGH | lightrag/api/auth.py | 121 | Validate JWT token Args: token: JWT token Returns: dict: Dictionary c
HIGH | lightrag/api/routers/query_routes.py | 330 | Comprehensive RAG query endpoint with non-streaming response. Parameter "stream" is ignored. This endp
HIGH | lightrag/api/routers/query_routes.py | 540 | Advanced RAG query endpoint with flexible streaming response. This endpoint provides the most flexible
HIGH | lightrag/api/routers/query_routes.py | 1043 | Advanced data retrieval endpoint for structured RAG analysis. This endpoint provides raw retrieval res
HIGH | lightrag/api/routers/document_routes.py | 113 | Sanitize uploaded filename to prevent Path Traversal attacks. Args: filename: The original filename fr
HIGH | lightrag/api/routers/document_routes.py | 1576 | Extract PDF content using pypdf (synchronous). Args: file_bytes: PDF file content as bytes password
HIGH | lightrag/api/routers/document_routes.py | 3094 | Upload a file to the input directory and index it. This API endpoint accepts a file through an HTTP PO
HIGH | lightrag/api/routers/document_routes.py | 3336 | Insert text into the RAG system. This endpoint allows you to insert text data into the RAG system for
HIGH | lightrag/api/routers/document_routes.py | 3440 | Insert multiple texts into the RAG system. This endpoint allows you to insert multiple text entries in
HIGH | lightrag/api/routers/document_routes.py | 3990 | Delete documents and all their associated data by their IDs using background processing. Deletes speci
HIGH | lightrag/api/routers/document_routes.py | 4080 | Clear all cache data from the LLM response cache storage. This endpoint clears all cached LLM response
HIGH | lightrag/api/routers/document_routes.py | 4114 | Delete an entity and all its relationships from the knowledge graph. Args: request (Delete
HIGH | lightrag/api/routers/document_routes.py | 4150 | Delete a relationship between two entities from the knowledge graph. Args: request (Delete
HIGH | lightrag/api/routers/document_routes.py | 4189 | Get the processing status of documents by tracking ID. This endpoint retrieves all documents associate
HIGH | lightrag/api/routers/document_routes.py | 4265 | Get documents with pagination support. This endpoint retrieves documents with pagination, filtering, a
HIGH | lightrag/evaluation/eval_rag_quality.py | 295 | Generate RAG response by calling LightRAG API. Args: question: The user query.
HIGH | lightrag/chunker/paragraph_semantic.py | 1275 | Paragraph Semantic Chunking — the ``chunking="P"`` strategy. Reads structured blocks emitted by the docx native par
HIGH | tests/api/test_lightrag_ollama_chat.py | 147 | Send an HTTP request with retry mechanism Args: url: Request URL data: Request data stream:
Magic Placeholder Names · 35 hits · 182 pts
Severity | File | Line | Snippet
HIGH | tests/setup/test_misc.py | 297 | "LLM_BINDING_API_KEY=your_api_key",
HIGH | tests/setup/test_misc.py | 302 | "EMBEDDING_BINDING_API_KEY=your_api_key",
HIGH | docs/LightRAG-API-Server.md | 118 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 124 | # EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 135 | # LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 143 | # EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 265 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 276 | EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 759 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server.md | 996 | LLM_BINDING_API_KEY=your-api-key
HIGH | docs/ProgramingWithCore.md | 852 | export OPENAI_API_KEY=your-api-key
HIGH | docs/ProgramingWithCore.md | 866 | OPENAI_API_KEY=your-api-key \
HIGH | docs/RoleSpecificLLMConfiguration-zh.md | 26 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration-zh.md | 122 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration-zh.md | 203 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration-zh.md | 220 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration-zh.md | 333 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 118 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 124 | # EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 135 | # LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 143 | # EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 265 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 276 | EMBEDDING_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 759 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/LightRAG-API-Server-zh.md | 996 | LLM_BINDING_API_KEY=your-api-key
HIGH | docs/RoleSpecificLLMConfiguration.md | 26 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration.md | 122 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration.md | 203 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration.md | 220 | LLM_BINDING_API_KEY=your_api_key
HIGH | docs/RoleSpecificLLMConfiguration.md | 333 | LLM_BINDING_API_KEY=your_api_key
HIGH | examples/lightrag_gemini_postgres_demo.py | 28 | GEMINI_API_KEY=your-api-key
HIGH | examples/lightrag_gemini_demo.py | 37 | "Please set it with: export GEMINI_API_KEY='your-api-key'"
HIGH | examples/lightrag_openai_opensearch_graph_demo.py | 23 | OPENAI_API_KEY=your-api-key
HIGH | …s/unofficial-sample/lightrag_llamaindex_direct_demo.py | 29 | OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "your-api-key-here")
HIGH | examples/unofficial-sample/lightrag_cloudflare_demo.py | 20 | cloudflare_api_key = "YOUR_API_KEY"
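Most rows above are documentation or test fixtures, but two of them (the `os.environ.get(..., "your-api-key-here")` default and the hard-coded `"YOUR_API_KEY"`) sit in runnable example scripts, where a placeholder can reach a real API call. The sketch below shows one way a script might reject such values up front. The regex and both function names are hypothetical, written only to cover the four spellings listed in this table. This is not the scanner's actual rule, which the report does not show, and it is not part of LightRAG.

```python
import os
import re

# Assumed pattern: covers your_api_key, your-api-key, your-api-key-here
# and YOUR_API_KEY, the spellings that appear in the findings above.
_PLACEHOLDER = re.compile(r"^your[-_]api[-_]key([-_]here)?$", re.IGNORECASE)


def is_placeholder(value: str) -> bool:
    """Return True if value looks like one of the placeholder keys above."""
    cleaned = value.strip().strip("'\"")
    return bool(_PLACEHOLDER.match(cleaned))


def require_api_key(name: str, env=None) -> str:
    """Read an API key from the environment and fail fast on placeholders."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value or is_placeholder(value):
        raise RuntimeError(f"{name} is unset or still a placeholder value")
    return value
```

A script could then call `require_api_key("OPENAI_API_KEY")` instead of supplying a placeholder default, so a missing key surfaces as a clear error before any request is sent.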
Self-Referential Comments · 50 hits · 148 pts
Severity | File | Line | Snippet
MEDIUM | reproduce/Step_1_openai_compatible.py | 77 | # Initialize RAG instance
MEDIUM | reproduce/Step_1.py | 42 | # Initialize RAG instance
MEDIUM | lightrag/lightrag.py | 964 | # Create a NEW EmbeddingFunc instance with the wrapped func to avoid mutating the caller's object
MEDIUM | lightrag/lightrag.py | 1951 | # Create a copy of param to avoid modifying the original
MEDIUM | lightrag/operate.py | 5619 | # Create a preliminary system prompt with empty content_data to calculate overhead
MEDIUM | lightrag/utils.py | 3980 | # Create a list of (file_path, count, first_index) tuples
MEDIUM | lightrag/pipeline.py | 790 | # Create a new record with unique ID for this duplicate attempt
MEDIUM | lightrag/tools/migrate_llm_cache.py | 701 | # Create a snapshot of matching items while holding the lock
MEDIUM | lightrag/llm/openai.py | 173 | # Create a merged config dict with precedence: explicit params > client_configs
MEDIUM | lightrag/llm/openai.py | 205 | # Create a merged config dict with precedence: explicit params > client_configs > defaults
MEDIUM | lightrag/llm/openai.py | 375 | # Create the OpenAI client (supports both OpenAI and Azure)
MEDIUM | lightrag/llm/openai.py | 996 | # Create the OpenAI client (supports both OpenAI and Azure)
MEDIUM | lightrag/llm/bedrock.py | 320 | # Create a session that will be used throughout the streaming process
MEDIUM | lightrag/llm/bedrock.py | 330 | # Define the generator function that will manage the client lifecycle
MEDIUM | lightrag/kg/mongo_impl.py | 2409 | # Create the improved search index (async, no waiting)
MEDIUM | lightrag/kg/mongo_impl.py | 363 | # Create a copy of v for $set operation, excluding create_time to avoid conflicts
MEDIUM | lightrag/kg/mongo_impl.py | 2738 | # Define the aggregation pipeline with the converted query vector
MEDIUM | lightrag/kg/postgres_impl.py | 1431 | # Define the field changes needed
MEDIUM | lightrag/kg/faiss_impl.py | 233 | # Create an empty Faiss index for inner product (useful for normalized vectors = cosine similarity).
MEDIUM | lightrag/kg/shared_storage.py | 17 | # Define a direct print function for critical logs that must be visible in all processes
MEDIUM | lightrag/kg/shared_storage.py | 1286 | # Create a shared list object for history_messages
MEDIUM | lightrag/kg/shared_storage.py | 1358 | # Create a simple mutable object to store boolean value for compatibility with mutiprocess
MEDIUM | lightrag/kg/json_kv_impl.py | 253 | # Create a copy to avoid modifying the original data
MEDIUM | lightrag/kg/json_kv_impl.py | 268 | # Create a copy to avoid modifying the original data
MEDIUM | lightrag/api/run_with_gunicorn.py | 118 | # Define a custom application class that loads our config
MEDIUM | lightrag/api/run_with_gunicorn.py | 248 | # Create the application
MEDIUM | lightrag/api/lightrag_server.py | 1834 | # Create the EmbeddingFunc instance (now returns complete EmbeddingFunc with max_token_size)
MEDIUM | lightrag/api/routers/ollama_api.py | 149 | # Create an instance of the model
MEDIUM | lightrag/api/routers/query_routes.py | 431 | # Create a mapping from reference_id to chunk content
MEDIUM | lightrag/api/routers/query_routes.py | 685 | # Create a mapping from reference_id to chunk content
MEDIUM | tests/kg/test_graph_storage.py | 108 | # Initialize the storage instance
MEDIUM | tests/kg/test_graph_storage.py | 1475 | # Initialize storage instance
MEDIUM | tests/kg/postgres_impl/test_postgres_index_name.py | 84 | # Create a table name that results in exactly 63 bytes
MEDIUM | tests/api/auth/test_token_auto_renewal.py | 16 | # Create a simple token renewal cache for testing
MEDIUM | tests/api/auth/test_token_auto_renewal.py | 323 | # Create a mock JWT payload
MEDIUM | tests/chunker/test_rerank_chunking.py | 39 | # Create a very long document that exceeds character limit
MEDIUM | examples/graph_visual_with_html.py | 15 | # Create a Pyvis network
MEDIUM | examples/lightrag_openai_compatible_demo.py | 150 | # Initialize RAG instance
MEDIUM | examples/lightrag_openai_mongodb_graph_demo.py | 72 | # Initialize RAG instance
MEDIUM | examples/lightrag_openai_demo.py | 119 | # Initialize RAG instance
MEDIUM | examples/lightrag_ollama_demo.py | 139 | # Initialize RAG instance
MEDIUM | examples/graph_visual_with_neo4j.py | 159 | # Create a Neo4j driver
MEDIUM | …mples/unofficial-sample/lightrag_embedding_prefixes.py | 156 | # Initialize RAG instance
MEDIUM | …/unofficial-sample/lightrag_llamaindex_litellm_demo.py | 103 | # Initialize RAG instance
MEDIUM | …cial-sample/lightrag_openai_neo4j_milvus_redis_demo.py | 73 | # Initialize RAG instance
MEDIUM | …s/unofficial-sample/lightrag_llamaindex_direct_demo.py | 101 | # Initialize RAG instance
MEDIUM | …ficial-sample/lightrag_llamaindex_litellm_opik_demo.py | 114 | # Initialize RAG instance
MEDIUM | examples/unofficial-sample/lightrag_cloudflare_demo.py | 251 | # Initialize RAG instance
MEDIUM | examples/unofficial-sample/lightrag_lmdeploy_demo.py | 69 | # Initialize RAG instance
MEDIUM | examples/unofficial-sample/lightrag_nvidia_demo.py | 123 | # Initialize RAG instance
Unused Imports · 140 hits · 132 pts
Severity | File | Line | Snippet
LOW | lightrag/rerank.py | 1 |
LOW | lightrag/file_atomic.py | 29 |
LOW | lightrag/addon_params.py | 15 |
LOW | lightrag/table_markup.py | 17 |
LOW | lightrag/llm_roles.py | 11 |
LOW | lightrag/lightrag.py | 1 |
LOW | lightrag/lightrag.py | 140 |
LOW | lightrag/__init__.py | 3 |
LOW | lightrag/types.py | 1 |
LOW | lightrag/operate.py | 1 |
LOW | lightrag/utils.py | 1 |
LOW | lightrag/utils.py | 301 |
LOW | lightrag/utils.py | 301 |
LOW | lightrag/utils.py | 301 |
LOW | lightrag/pipeline.py | 12 |
LOW | lightrag/storage_migrations.py | 13 |
LOW | lightrag/exceptions.py | 1 |
LOW | lightrag/multimodal_context.py | 64 |
LOW | lightrag/utils_graph.py | 1 |
LOW | lightrag/prompt.py | 1 |
LOW | lightrag/utils_pipeline.py | 10 |
LOW | lightrag/namespace.py | 1 |
LOW | lightrag/chunk_schema.py | 24 |
LOW | lightrag/base.py | 1 |
LOW | lightrag/prompt_multimodal.py | 31 |
LOW | lightrag/llm/gemini.py | 10 |
LOW | lightrag/llm/_vision_utils.py | 20 |
LOW | lightrag/llm/azure_openai.py | 12 |
LOW | lightrag/llm/azure_openai.py | 12 |
LOW | lightrag/llm/azure_openai.py | 12 |
LOW | lightrag/parser/cli.py | 17 |
LOW | lightrag/parser/_markdown.py | 10 |
LOW | lightrag/parser/routing.py | 1 |
LOW | lightrag/parser/debug.py | 16 |
LOW | lightrag/parser/external/_manifest.py | 18 |
LOW | lightrag/parser/external/_zip.py | 10 |
LOW | lightrag/parser/external/_common.py | 11 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 16 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 24 |
LOW | lightrag/parser/external/__init__.py | 33 |
LOW | lightrag/parser/external/docling/ir_builder.py | 43 |
LOW | lightrag/parser/external/docling/manifest.py | 11 |
LOW | lightrag/parser/external/docling/client.py | 20 |
LOW | lightrag/parser/external/docling/cache.py | 30 |
LOW | lightrag/parser/external/docling/__init__.py | 8 |
LOW | lightrag/parser/external/docling/__init__.py | 23 |
LOW | lightrag/parser/external/docling/__init__.py | 26 |
LOW | lightrag/parser/external/docling/__init__.py | 29 |
LOW | lightrag/parser/external/mineru/ir_builder.py | 43 |
80 more matches not shown…
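The rows above carry no snippet, so the report does not show which imported name was flagged or how the check was made. As a rough illustration of how an unused-import check can work, the sketch below parses a module with Python's standard `ast` module and reports imported names that are never referenced. The function name `find_unused_imports` is hypothetical, and this is an assumed approach, not the scanner's actual method.

```python
import ast


def find_unused_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for imports never referenced in source."""
    tree = ast.parse(source)
    imported: dict[str, int] = {}

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds only the top-level name `a`.
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names
                    imported[alias.asname or alias.name] = node.lineno

    # Every bare identifier in the module; `os.path` contributes `os`.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}

    return sorted(
        (line, name) for name, line in imported.items() if name not in used
    )
```

A check this simple also flags names that are imported only to be re-exported (for example in an `__init__.py`) or used only inside string annotations, so its output needs manual review before anything is deleted.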
Verbosity Indicators · 31 hits · 50 pts
Severity | File | Line | Snippet
LOW | lightrag/lightrag.py | 945 | # Step 1: Capture embedding_func and max_token_size before applying rate_limit decorator
LOW | lightrag/lightrag.py | 963 | # Step 2: Apply priority wrapper decorator to EmbeddingFunc's inner func
LOW | lightrag/operate.py | 5094 | # Step 1: Collect all text chunks for each entity
LOW | lightrag/operate.py | 5121 | # Step 2: Count chunk occurrences and deduplicate (keep chunks from earlier positioned entities)
LOW | lightrag/operate.py | 5138 | # Step 3: Sort chunks for each entity by occurrence count (higher count = higher priority)
LOW | lightrag/operate.py | 5151 | # Step 4: Apply the selected chunk selection algorithm
LOW | lightrag/operate.py | 5205 | # Step 5: Batch retrieve chunk data
LOW | lightrag/operate.py | 5211 | # Step 6: Build result chunks with valid data and update chunk tracking
LOW | lightrag/operate.py | 5345 | # Step 1: Collect all text chunks for each relationship
LOW | lightrag/operate.py | 5380 | # Step 2: Count chunk occurrences and deduplicate (keep chunks from earlier positioned relationships)
LOW | lightrag/operate.py | 5429 | # Step 3: Sort chunks for each relationship by occurrence count (higher count = higher priority)
LOW | lightrag/operate.py | 5444 | # Step 4: Apply the selected chunk selection algorithm
LOW | lightrag/operate.py | 5500 | # Step 5: Batch retrieve chunk data
LOW | lightrag/operate.py | 5506 | # Step 6: Build result chunks with valid data and update chunk tracking
LOW | lightrag/utils.py | 3572 | # Step 1: Remove chunks that are no longer needed
LOW | lightrag/utils.py | 3577 | # Step 2: Add new chunks (preserving order from new_chunk_ids)
LOW | lightrag/kg/milvus_impl.py | 1089 | # Step 3: Rename origin collection (keep for safety)
LOW | lightrag/kg/milvus_impl.py | 1109 | # Step 4: Rename temporary collection to original name
LOW | lightrag/kg/milvus_impl.py | 1004 | # Step 1: Create temporary collection with new schema
LOW | lightrag/kg/milvus_impl.py | 1025 | # Step 2: Copy data using query_iterator (solves query window limitation)
LOW | lightrag/api/lightrag_server.py | 1503 | # Step 1: Import provider function and extract default attributes
LOW | lightrag/api/lightrag_server.py | 1556 | # Step 2: Apply priority (user config > provider default)
LOW | lightrag/api/lightrag_server.py | 1576 | # Step 3: Create optimized embedding function (calls underlying function directly)
LOW | lightrag/api/lightrag_server.py | 1771 | # Step 4: Wrap in EmbeddingFunc and return
LOW | lightrag/chunker/paragraph_semantic.py | 849 | # Step 1: expand each oversized table paragraph into row-bounded
LOW | lightrag/chunker/paragraph_semantic.py | 873 | # Step 2: greedy-pack pieces into chunks ≤ target_max. A piece
LOW | tests/kg/postgres_impl/test_postgres_migration.py | 662 | # Step 1: Simulate workspace_a initialization (Case 3 - only legacy exists)
LOW | tests/kg/postgres_impl/test_postgres_migration.py | 760 | # Step 2: Simulate workspace_b initialization (Case 3 - both exist, but legacy has B's data)
LOW | examples/lightrag_ag2_multiagent_demo.py | 267 | # Step 1: Set up LightRAG (async, runs on the background loop)
LOW | examples/lightrag_ag2_multiagent_demo.py | 270 | # Step 2: Create AG2 agents with LightRAG tools
LOW | examples/lightrag_ag2_multiagent_demo.py | 273 | # Step 3: Ask a complex question
Over-Commented Block · 58 hits · 42 pts
Severity | File | Line | Snippet
LOW | docker-compose.podman.yml | 1 | # Podman-compatible compose file for LightRAG
LOW | lightrag/lightrag.py | 721 | if "chunk_overlap_token_size" not in sub:
LOW | lightrag/lightrag.py | 741 | chunker_cfg["paragraph_semantic"].setdefault(
LOW | lightrag/lightrag.py | 1581 | # Relationship storage is undirected, so keep only the last update
LOW | lightrag/lightrag.py | 4081 | # `addon_params` is declared as an InitVar on the dataclass so it can still be
LOW | lightrag/constants.py | 101 | )
LOW | lightrag/constants.py | 281 | # because both engines are resource-intensive (GPU/CPU + memory) and tend to
LOW | lightrag/utils.py | 3801 |
LOW | lightrag/pipeline.py | 281 | # processing loop because (a) full_docs is upserted before
LOW | lightrag/pipeline.py | 661 | # this, two enqueues for the same content (e.g. /upload during
LOW | lightrag/pipeline.py | 1041 | "Another process is already processing the document queue. Request queued."
LOW | lightrag/pipeline.py | 1481 | # Stamp parsing_start_time on the in-memory status_doc so
LOW | lightrag/pipeline.py | 1661 | # Mirror analyze-stage outcome as a 3-way decision so the
LOW | lightrag/pipeline.py | 1841 | # follows the standardized file-chunker contract
LOW | lightrag/pipeline.py | 2061 | # Explicit selector in process_options: reflect
LOW | lightrag/pipeline.py | 2121 | "Applied hard fallback split before embedding for "
LOW | lightrag/utils_pipeline.py | 181 | # tables missing ``w14:paraId``) that admins should be able to surface
LOW | lightrag/utils_pipeline.py | 201 | # operators can distinguish "analyze actually completed" from "analyze
LOW | lightrag/utils_pipeline.py | 701 | # over all chunk_results. It has been moved into
LOW | lightrag/llm/binding_options.py | 41 | # - Handles default values and type information for each parameter
LOW | lightrag/llm/binding_options.py | 421 | # =============================================================================
LOW | lightrag/llm/binding_options.py | 661 |
LOW | lightrag/llm/openai.py | 461 | # A "could not parse JSON body" 400 is transient (corrupted/truncated
LOW | lightrag/parser/routing.py | 161 | skip_kg=PROCESS_OPTION_SKIP_KG in chars,
LOW | lightrag/parser/external/docling/cache.py | 41 |
LOW | lightrag/kg/mongo_impl.py | 1081 |
LOW | lightrag/kg/mongo_impl.py | 1101 | # "target_node_id" : "ProductX",
LOW | lightrag/kg/postgres_impl.py | 5081 |
LOW | lightrag/kg/postgres_impl.py | 6041 | target_node_id (str): Label of the target node (used as identifier)
LOW | lightrag/kg/faiss_impl.py | 241 | # Minimal pending area for deferred embedding: custom-id -> _PendingFaissDoc.
LOW | lightrag/kg/shared_storage.py | 1301 | # Exclusive subset of ``scanning``: only True during the
LOW | lightrag/kg/milvus_impl.py | 1361 | self._validate_embedding_func()
LOW | lightrag/kg/milvus_impl.py | 1381 | # "hnsw_m": 32,
LOW | lightrag/api/routers/document_routes.py | 2581 | except Exception as move_error:
LOW | lightrag/api/routers/document_routes.py | 2681 | # True for the rest of the task lifecycle (releases in
LOW | lightrag/api/routers/document_routes.py | 3161 | Raises:
LOW | lightrag/chunker/paragraph_semantic.py | 821 | candidates.append({"index": idx, "text": text, "position": cumulative})
LOW | tests/chunker/test_chunking.py | 661 | split_by_character=None,
LOW | tests/chunker/test_chunking.py | 1021 | chunk_overlap_token_size=2,
LOW | …ts/chunker/test_paragraph_semantic_split_long_block.py | 61 | # recursive-character splitting so ``target_max`` is honored without
LOW | docs/LightRAG-API-Server.md | 1001 | VLM_PROCESS_ENABLE=false
LOW | docs/LightRAG-API-Server.md | 1021 | # EMBEDDING_DOCUMENT_PREFIX="search_document: "
LOW | docs/RoleSpecificLLMConfiguration-zh.md | 241 | # have a valid OpenAI configuration.
LOW | docs/LightRAG-API-Server-zh.md | 1001 | VLM_PROCESS_ENABLE=false
LOW | docs/LightRAG-API-Server-zh.md | 1021 | # EMBEDDING_DOCUMENT_PREFIX="search_document: "
LOW | docs/RoleSpecificLLMConfiguration.md | 241 | # have a valid OpenAI configuration.
LOW | examples/milvus_kwargs_configuration_demo.py | 21 | # os.environ["MILVUS_USER"] = "root"
LOW | examples/milvus_kwargs_configuration_demo.py | 41 | "index_type": "HNSW",
LOW | k8s-deploy/databases/neo4j/values.yaml | 1 | # Version
LOW | k8s-deploy/databases/neo4j/values.yaml | 21 | # description: Memory, the unit is Gi.
LOW | k8s-deploy/databases/postgresql/values.yaml | 1 | ## description: service version.
LOW | k8s-deploy/databases/postgresql/values.yaml | 21 | ## default: 0.5
LOW | k8s-deploy/databases/redis/values.yaml | 1 | ## description: Cluster version.
LOW | k8s-deploy/databases/redis/values.yaml | 21 |
LOW | k8s-deploy/databases/mongodb/values.yaml | 1 | ## description: Cluster version.
LOW | k8s-deploy/databases/qdrant/values.yaml | 1 | ## description: The version of Qdrant.
LOW | k8s-deploy/databases/elasticsearch/values.yaml | 1 | ## description: The version of ElasticSearch.
LOW | k8s-deploy/lightrag/values.yaml | 1 | replicaCount: 1
Cross-Language Confusion · 9 hits · 40 pts
Severity | File | Line | Snippet
HIGH | lightrag/pipeline.py | 2722 | # ``positions: [{"type": "paraid", "range": null}]``.
HIGH | lightrag/kg/postgres_impl.py | 6082 | " $1::text || E'\\x01' ||"
HIGH | lightrag/api/lightrag_server.py | 123 | if (!data || data.type !== 'lightrag:set-docs-theme') return;
HIGH | lightrag/api/routers/graph_routes.py | 308 | "merge_error": null,
HIGH | lightrag/api/routers/graph_routes.py | 310 | "target_entity": null,
HIGH | lightrag/api/routers/graph_routes.py | 336 | "merge_error": null,
HIGH | …sts/kg/postgres_impl/test_postgres_cypher_injection.py | 135 | "formula": "x < 5 && y > 3",
HIGH | …sts/kg/postgres_impl/test_postgres_cypher_injection.py | 146 | assert '`formula`: "x < 5 && y > 3"' in call["sql"]
HIGH | …s/kg/postgres_impl/test_postgres_upsert_edge_cypher.py | 171 | assert "$1::text || E'\\x01' ||" in lock_sql
AI Slop Vocabulary · 11 hits · 23 pts
Severity | File | Line | Snippet
LOW | lightrag/pipeline.py | 1038 | # Another process is busy, just set request flag and return
MEDIUM | lightrag/tools/clean_llm_query_cache.py | 870 | """Print comprehensive cleanup report
MEDIUM | lightrag/tools/migrate_llm_cache.py | 1517 | # Print comprehensive migration report
MEDIUM | lightrag/tools/migrate_llm_cache.py | 1349 | """Print comprehensive migration report
MEDIUM | lightrag/kg/mongo_impl.py | 2153 | """Try Atlas Search using compound query for comprehensive matching."""
MEDIUM | lightrag/kg/shared_storage.py | 984 | """Release all locks with comprehensive error handling, protected from cancellation"""
MEDIUM | lightrag/api/routers/document_routes.py | 4239 | # Handle both DocStatus enum and string cases for robust deserialization
MEDIUM | tests/api/routes/test_aquery_data_endpoint.py | 493 | "mode": "mix", # Use mixed mode to get the most comprehensive results
MEDIUM | tests/chunker/test_chunking_raw_lightrag_parity.py | 56 | # Shared fixtures (mirrors the harness used by test_pipeline_release_closure)
MEDIUM | lightrag_webui/src/features/RetrievalTesting.tsx | 269 | // Use the new robust COT parsing function
MEDIUM | lightrag_webui/src/features/RetrievalTesting.tsx | 635 | // Handle copying message content with robust clipboard support
Fake / Example Data · 17 hits · 14 pts
Severity | File | Line | Snippet
LOW | lightrag/api/routers/document_routes.py | 680 | "metadata": {"author": "John Doe", "year": 2025},
LOW | lightrag/api/routers/document_routes.py | 744 | "metadata": {"author": "John Doe"},
LOW | lightrag/api/routers/document_routes.py | 786 | "metadata": {"author": "John Doe", "year": 2025},
LOW | lightrag/api/routers/document_routes.py | 904 | "metadata": {"author": "John Doe", "year": 2025},
LOW | tests/kg/json_impl/test_write_json_optimization.py | 30 | "name": "John Doe",
LOW | tests/api/routes/test_description_api_validation.py | 228 | ["entity", "Alice", "Acme Corp", "founded", "Alice founded Acme Corp."],
LOW | tests/api/routes/test_description_api_validation.py | 240 | assert relation["tgt_id"] == "Acme Corp"
LOW | tests/api/routes/test_description_api_validation.py | 261 | ["entity", "Alice", "Acme Corp", "founded", " "],
LOW | tests/api/routes/test_description_api_validation.py | 276 | ["edge", "Alice", "Acme Corp", "founded", "Alice founded Acme Corp."],
LOW | tests/extraction/test_entity_extraction_stability.py | 420 | assert set(nodes) == {"Alice", "Acme Corp"}
LOW | tests/extraction/test_entity_extraction_stability.py | 421 | assert ("Alice", "Acme Corp") in edges
LOW | tests/extraction/test_entity_extraction_stability.py | 612 | assert set(entities.keys()) == {"Alice", "Acme Corp"}
LOW | tests/extraction/test_entity_extraction_stability.py | 616 | assert relation_data["tgt_id"] == "Acme Corp"
LOW | tests/extraction/test_entity_extraction_stability.py | 184 | "name": "Acme Corp",
LOW | tests/extraction/test_entity_extraction_stability.py | 192 | "target": "Acme Corp",
LOW | tests/extraction/test_entity_extraction_stability.py | 570 | assert next(iter(relationships.keys())) == ("Alice", "Acme Corp")
LOW | tests/extraction/test_entity_extraction_stability.py | 593 | assert relation_data["tgt_id"] == "Acme Corp"
Slop Phrases · 2 hits · 5 pts
Severity | File | Line | Snippet
MEDIUM | examples/unofficial-sample/lightrag_cloudflare_demo.py | 32 | WORKING_DIR = "../dickens" # you can change output as desired
MEDIUM | examples/unofficial-sample/lightrag_nvidia_demo.py | 112 | # so you can adjust to be able to fit the NVIDIA model (future work)
Example Usage Blocks · 2 hits · 4 pts
Severity | File | Line | Snippet
LOW | docker-compose.podman.yml | 3 | # Usage:
LOW | lightrag/llm/binding_options.py | 671 | # Usage: