Repository Analysis

opendatalab/MinerU

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

8.6 Low AI signal

Adjusted Score: 8.6
Raw Score: 8.6
Time Factor: 100%
Last Push: 2026-05-28
Stars: 65,653
Language: Python
Lines of Code: 137,017
Files: 310
Pattern Hits: 1125
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 0 · HIGH 21 · MEDIUM 20 · LOW 1084

Pattern Findings

1125 matches across 12 categories.
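The headline total can be cross-checked against both the severity breakdown and the per-category hit counts in this report. The sketch below only transcribes those figures into dictionaries and sums them. The helper name `totals_agree` is hypothetical and not part of the scanning tool.

```python
# Figures transcribed from this report; nothing here is computed by the scanner.
severity_counts = {"CRITICAL": 0, "HIGH": 21, "MEDIUM": 20, "LOW": 1084}

category_hits = {
    "Hyper-Verbose Identifiers": 603,
    "Deep Nesting": 190,
    "Excessive Try-Catch Wrapping": 140,
    "Unused Imports": 132,
    "Cross-File Repetition": 14,
    "Over-Commented Block": 28,
    "Magic Placeholder Names": 5,
    "Self-Referential Comments": 4,
    "AI Slop Vocabulary": 3,
    "Redundant / Tautological Comments": 4,
    "Cross-Language Confusion": 1,
    "Docstring Block Structure": 1,
}

REPORTED_TOTAL = 1125


def totals_agree(severity: dict, categories: dict, total: int) -> bool:
    """Hypothetical helper: True when both breakdowns sum to the reported total."""
    return sum(severity.values()) == total and sum(categories.values()) == total


print(len(category_hits), totals_agree(severity_counts, category_hits, REPORTED_TOTAL))
```

Both breakdowns sum to 1125, and the category list has 12 entries, which matches the summary line.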

Hyper-Verbose Identifiers: 603 hits · 572 pts
Severity  File  Line  Snippet
LOW  demo/demo.py  81  def prepare_local_api_temp_dir() -> None:
LOW  tests/unittest/test_e2e.py  23  def test_pipeline_with_two_config():
LOW  mineru/utils/model_utils.py  116  def remove_nested_ocr_text_blocks(
LOW  mineru/utils/model_utils.py  148  def get_res_list_from_layout_res(layout_res, overlap_threshold=0.8):
LOW  mineru/utils/pdfium_guard.py  30  def get_pdfium_document_page_count(pdf_doc) -> int:
LOW  mineru/utils/pdfium_guard.py  42  def rewrite_pdf_bytes_with_pdfium(
LOW  mineru/utils/models_download_utils.py  9  def auto_download_and_get_model_root_path(relative_path: str, repo_mode='pipeline') -> str:
LOW  mineru/utils/check_sys_env.py  26  def is_mac_os_version_supported(min_version: str = "13.5") -> bool:
LOW  mineru/utils/llm_aided.py  20  def _get_title_line_avg_height(block):
LOW  mineru/utils/llm_aided.py  44  def _collect_title_block_refs(page_info_list):
LOW  mineru/utils/llm_aided.py  71  def _build_title_optimize_prompt(title_dict):
LOW  mineru/utils/llm_aided.py  114  def _build_relative_title_optimize_prompt(title_dict):
LOW  mineru/utils/llm_aided.py  235  def _get_title_block_identity(block):
LOW  mineru/utils/llm_aided.py  247  def _sync_para_titles_to_preproc(page_info_list):
LOW  mineru/utils/llm_aided.py  270  def _run_single_pass_title_leveling(title_block_refs, title_aided_config):
LOW  mineru/utils/llm_aided.py  276  def _split_paragraph_title_groups(title_block_refs):
LOW  mineru/utils/llm_aided.py  295  def _offset_paragraph_title_levels(levels_by_index):
LOW  mineru/utils/llm_aided.py  305  def _request_paragraph_group_levels(title_block_refs, title_aided_config):
LOW  mineru/utils/llm_aided.py  315  def _run_grouped_title_leveling(title_block_refs, title_aided_config):
LOW  mineru/utils/config_reader.py  122  def get_ocr_det_mask_inline_formula_enable(enable):
LOW  mineru/utils/config_reader.py  128  def get_processing_window_size(default: int = 64) -> int:
LOW  mineru/utils/config_reader.py  142  def get_max_concurrent_requests(default: int = 3) -> int:
LOW  mineru/utils/config_reader.py  165  def get_latex_delimiter_config():
LOW  mineru/utils/magic_model_utils.py  79  def calc_effective_index_diff(obj_index: int, sub_index: int) -> int:
LOW  mineru/utils/visual_magic_model_utils.py  101  def fallback_inline_caption_fragments(blocks, visual_main_types):
LOW  mineru/utils/visual_magic_model_utils.py  132  def fallback_leading_table_continuation_captions(blocks, visual_main_types):
LOW  mineru/utils/visual_magic_model_utils.py  175  def _is_leading_continuation_text_block(block):
LOW  mineru/utils/visual_magic_model_utils.py  193  def is_transparent_visual_relation_block(block):
LOW  mineru/utils/visual_magic_model_utils.py  204  def _is_leading_continuation_cluster_near_table(leading_blocks, table_block):
LOW  mineru/utils/visual_magic_model_utils.py  226  def fallback_stacked_table_caption_fragments(blocks, visual_main_types):
LOW  mineru/utils/visual_magic_model_utils.py  267  def find_stacked_table_caption_cluster(table_block, blocks):
LOW  mineru/utils/visual_magic_model_utils.py  303  def find_last_caption_position(caption_cluster):
LOW  mineru/utils/visual_magic_model_utils.py  311  def is_horizontally_near_table(block, table_block):
LOW  mineru/utils/visual_magic_model_utils.py  323  def is_single_line_caption_fragment(block):
LOW  mineru/utils/visual_magic_model_utils.py  333  def find_previous_effective_block(ordered_blocks, pos):
LOW  mineru/utils/visual_magic_model_utils.py  342  def find_next_effective_block(ordered_blocks, pos):
LOW  mineru/utils/visual_magic_model_utils.py  351  def is_inline_caption_fragment(previous_caption, text_block, next_visual):
LOW  mineru/utils/visual_magic_model_utils.py  477  def absorb_image_block_members(blocks):
LOW  mineru/utils/visual_magic_model_utils.py  622  def effective_visual_index_diff(
LOW  mineru/utils/visual_magic_model_utils.py  678  def is_block_outside_visual_gap(between_block, child_block, main_block):
LOW  mineru/utils/visual_magic_model_utils.py  697  def vertical_gap_between_blocks(first_block, second_block):
LOW  mineru/utils/visual_magic_model_utils.py  708  def is_bbox_intersecting_vertical_gap(bbox, vertical_gap):
LOW  mineru/utils/visual_magic_model_utils.py  714  def is_bbox_overlapping_visual_relation_block(bbox, child_bbox, main_bbox):
LOW  mineru/utils/guess_suffix_or_lang.py  41  def _normalize_text_for_language_guess(code: str) -> str:
LOW  mineru/utils/guess_suffix_or_lang.py  94  def _ooxml_relationship_targets(root: ElementTree.Element) -> list[str]:
LOW  mineru/utils/guess_suffix_or_lang.py  113  def _ooxml_content_type_overrides(root: ElementTree.Element) -> dict[str, str]:
LOW  mineru/utils/guess_suffix_or_lang.py  129  def _guess_ooxml_suffix_from_zip(package: ZipFile) -> str | None:
LOW  mineru/utils/guess_suffix_or_lang.py  142  def _guess_ooxml_suffix_by_bytes(file_bytes: bytes) -> str | None:
LOW  mineru/utils/guess_suffix_or_lang.py  158  def _guess_ooxml_suffix_by_path(file_path: Path) -> str | None:
LOW  mineru/utils/span_pre_proc.py  192  def _candidate_indices_for_block(self, block_bbox):
LOW  mineru/utils/office_rich_text.py  76  def has_non_visible_text_style(format_obj: Any) -> bool:
LOW  mineru/utils/office_rich_text.py  86  def normalize_format_for_text(
LOW  mineru/utils/office_rich_text.py  172  def is_valid_hyperlink_target(hyperlink: Any) -> bool:
LOW  mineru/utils/office_rich_text.py  180  def format_text_with_hyperlink(
LOW  mineru/utils/office_rich_text.py  195  def _format_hyperlink_segments(group: list[OfficeRichTextSegment]) -> str:
LOW  mineru/utils/office_rich_text.py  300  def build_rich_text_from_segments(
LOW  mineru/utils/office_rich_text.py  350  def build_text_mappings_from_elements(
LOW  mineru/utils/table_continuation.py  23  def is_table_continuation_text(text: str) -> bool:
LOW  mineru/utils/table_continuation.py  38  def _matches_continuation_end_marker(text: str, marker: str) -> bool:
LOW  mineru/utils/table_merge.py  354  def _serialize_table_state_html(state: TableMergeState) -> None:
543 more matches not shown…

Deep Nesting: 190 hits · 165 pts
Severity  File  Line  Snippet
LOW  mineru/utils/model_utils.py  148
LOW  mineru/utils/model_utils.py  183
LOW  mineru/utils/model_utils.py  217
LOW  mineru/utils/pdfium_guard.py  42
LOW  mineru/utils/llm_aided.py  160
LOW  mineru/utils/config_reader.py  75
LOW  mineru/utils/magic_model_utils.py  32
LOW  mineru/utils/guess_suffix_or_lang.py  41
LOW  mineru/utils/guess_suffix_or_lang.py  185
LOW  mineru/utils/span_pre_proc.py  34
LOW  mineru/utils/span_pre_proc.py  280
LOW  mineru/utils/office_rich_text.py  300
LOW  mineru/utils/office_rich_text.py  350
LOW  mineru/utils/engine_utils.py  10
LOW  mineru/utils/table_merge.py  78
LOW  mineru/utils/table_merge.py  402
LOW  mineru/utils/table_merge.py  483
LOW  mineru/utils/table_merge.py  769
LOW  mineru/utils/table_merge.py  868
LOW  mineru/utils/table_merge.py  938
LOW  mineru/utils/pdf_classify.py  249
LOW  mineru/utils/draw_bbox.py  102
LOW  mineru/utils/draw_bbox.py  146
LOW  mineru/utils/draw_bbox.py  317
LOW  mineru/utils/draw_bbox.py  325
LOW  mineru/utils/boxbase.py  40
LOW  mineru/backend/pipeline/model_json_to_middle_json.py  204
LOW  mineru/backend/pipeline/model_json_to_middle_json.py  247
LOW  mineru/backend/pipeline/batch_analyze.py  347
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  18
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  124
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  365
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  518
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  550
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  609
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  745
LOW  …eru/backend/pipeline/pipeline_middle_json_mkcontent.py  968
LOW  mineru/backend/pipeline/para_split.py  60
LOW  mineru/backend/pipeline/para_split.py  265
LOW  mineru/backend/pipeline/para_split.py  309
LOW  mineru/backend/pipeline/para_split.py  369
LOW  mineru/backend/pipeline/para_split.py  415
LOW  mineru/backend/pipeline/pipeline_magic_model.py  202
LOW  mineru/backend/pipeline/pipeline_magic_model.py  225
LOW  mineru/backend/pipeline/model_init.py  164
LOW  mineru/backend/pipeline/pipeline_analyze.py  157
LOW  mineru/backend/pipeline/pipeline_analyze.py  325
LOW  mineru/backend/office/model_output_to_middle_json.py  11
LOW  mineru/backend/office/model_output_to_middle_json.py  94
LOW  mineru/backend/office/model_output_to_middle_json.py  126
LOW  mineru/backend/office/office_magic_model.py  227
LOW  mineru/backend/office/office_magic_model.py  393
LOW  mineru/backend/office/office_magic_model.py  441
LOW  mineru/backend/office/office_magic_model.py  668
LOW  mineru/backend/office/office_magic_model.py  737
LOW  mineru/backend/office/office_magic_model.py  14
LOW  mineru/backend/office/mkcontent/inline_renderer.py  515
LOW  mineru/backend/office/mkcontent/inline_renderer.py  889
LOW  mineru/backend/office/mkcontent/output_builders.py  79
LOW  mineru/backend/office/mkcontent/output_builders.py  104
130 more matches not shown…

Excessive Try-Catch Wrapping: 140 hits · 148 pts
Severity  File  Line  Snippet
LOW  update_version.py  15  except Exception as e:
MEDIUM  tests/clean_coverage.py  16  print(f"Error deleting file '{path}': {e}")
MEDIUM  tests/clean_coverage.py  22  print(f"Error deleting directory '{path}': {e}")
LOW  tests/unittest/test_e2e.py  148  except Exception as e:
MEDIUM  tests/unittest/test_e2e.py  144  def validate_html(html_content):
LOW  mineru/utils/pdf_reader.py  28  except Exception as e:
LOW  mineru/utils/pdf_text_tool.py  52  except Exception:
LOW  mineru/utils/llm_aided.py  212  except Exception as e:
LOW  mineru/utils/config_reader.py  88  except Exception as e:
LOW  mineru/utils/config_reader.py  92  except Exception as e:
LOW  mineru/utils/config_reader.py  96  except Exception as e:
LOW  mineru/utils/config_reader.py  100  except Exception as e:
LOW  mineru/utils/config_reader.py  104  except Exception as e:
LOW  mineru/utils/guess_suffix_or_lang.py  81  except Exception:
LOW  mineru/utils/guess_suffix_or_lang.py  199  except Exception as e:
LOW  mineru/utils/span_pre_proc.py  41  except Exception as exc:
LOW  mineru/utils/pdf_image_tools.py  311  except Exception:
LOW  mineru/utils/pdf_image_tools.py  320  except Exception:
LOW  mineru/utils/pdf_image_tools.py  332  except Exception:
LOW  mineru/utils/pdf_image_tools.py  340  except Exception:
LOW  mineru/utils/pdf_classify.py  153  except Exception as e:
LOW  mineru/backend/pipeline/batch_analyze.py  481  except Exception as e:
LOW  mineru/backend/pipeline/batch_analyze.py  493  except Exception as e:
LOW  mineru/backend/pipeline/model_init.py  328  except Exception as e:
LOW  mineru/backend/pipeline/pipeline_analyze.py  102  except Exception:
LOW  mineru/backend/pipeline/pipeline_analyze.py  341  except Exception as e:
LOW  mineru/backend/utils/office_chart.py  78  except Exception:
LOW  mineru/backend/utils/office_chart.py  299  except Exception:
LOW  mineru/backend/utils/ocr_det_utils.py  15  except Exception as e:
MEDIUM  mineru/backend/utils/ocr_det_utils.py  12  def get_ch_lite_ocr_det_model():
LOW  mineru/backend/utils/html_image_utils.py  46  except Exception as exc:
LOW  mineru/backend/utils/html_image_utils.py  62  except Exception:
LOW  mineru/backend/utils/html_image_utils.py  87  except Exception as exc:
LOW  mineru/backend/vlm/vlm_magic_model.py  60  except Exception as e:
LOW  mineru/backend/vlm/vlm_analyze.py  320  except Exception as exc:
LOW  mineru/backend/vlm/vlm_analyze.py  419  except Exception:
LOW  mineru/backend/vlm/utils.py  108  except Exception as e:
MEDIUM  mineru/backend/vlm/utils.py  95  def set_default_batch_size() -> int:
LOW  mineru/backend/hybrid/hybrid_analyze.py  709  except Exception:
LOW  mineru/backend/hybrid/hybrid_magic_model.py  121  except Exception as e:
LOW  mineru/cli/client.py  350  except Exception as exc:
LOW  mineru/cli/client.py  386  except Exception as exc:
LOW  mineru/cli/client.py  409  except Exception as exc:
LOW  mineru/cli/client.py  750  except Exception as exc:
LOW  mineru/cli/client.py  847  except Exception as exc:
MEDIUM  mineru/cli/client.py  344  def create_visualization_context() -> Optional[VisualizationContext]:
LOW  mineru/cli/models_download.py  182  except Exception as e:
LOW  mineru/cli/visualization.py  52  except Exception as exc:
LOW  mineru/cli/visualization.py  76  except Exception as exc:
MEDIUM  mineru/cli/fast_api.py  106  def is_main_multiprocessing_process() -> bool:
MEDIUM  mineru/cli/fast_api.py  289  def shutdown_runtime_resources() -> None:
MEDIUM  mineru/cli/fast_api.py  1092  def _dispatcher_loop(self) -> None:
MEDIUM  mineru/cli/fast_api.py  1111  def _cleanup_loop(self) -> None:
LOW  mineru/cli/fast_api.py  109  except Exception:
LOW  mineru/cli/fast_api.py  121  except Exception:
LOW  mineru/cli/fast_api.py  292  except Exception as exc:
LOW  mineru/cli/fast_api.py  297  except Exception as exc:
LOW  mineru/cli/fast_api.py  272  except Exception:
LOW  mineru/cli/fast_api.py  357  except Exception as e:
LOW  mineru/cli/fast_api.py  611  except Exception:
80 more matches not shown…
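
Most of the LOW rows in this category point at `except Exception` lines. The report does not show the scanner's actual detection rule, so the following is only a minimal sketch of how such handlers could be located with Python's standard `ast` module. The function name `find_broad_handlers` and the sample source are hypothetical.

```python
import ast


def find_broad_handlers(source: str) -> list[int]:
    """Return line numbers of bare `except:` and `except Exception` handlers.

    Assumption: "broad" means a bare handler or one that names Exception or
    BaseException directly. The real scanner may use a different rule.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:  # bare `except:`
            lines.append(node.lineno)
        elif isinstance(node.type, ast.Name) and node.type.id in {"Exception", "BaseException"}:
            lines.append(node.lineno)
    return sorted(lines)


# Hypothetical input: only parsed, never executed.
sample = '''\
try:
    risky()
except Exception as e:
    print(e)
try:
    other()
except ValueError:
    pass
'''

print(find_broad_handlers(sample))  # [3]
```

The narrower `except ValueError` handler on line 7 of the sample is not reported, which is the distinction a rule like this is meant to draw.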

Unused Imports: 132 hits · 132 pts
Severity  File  Line  Snippet
LOW  tests/get_coverage.py  6
LOW  mineru/utils/engine_utils.py  2
LOW  mineru/utils/engine_utils.py  41
LOW  mineru/utils/engine_utils.py  50
LOW  mineru/utils/engine_utils.py  63
LOW  mineru/utils/engine_utils.py  54
LOW  mineru/utils/cli_parser.py  4
LOW  mineru/utils/title_level_postprocess.py  2
LOW  mineru/backend/pipeline/para_split.py  3
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  2
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_middle_json_mkcontent.py  10
LOW  mineru/backend/office/office_magic_model.py  7
LOW  mineru/backend/vlm/vlm_middle_json_mkcontent.py  4
LOW  mineru/cli/vlm_server.py  32
LOW  mineru/cli/vlm_server.py  48
LOW  mineru/cli/vlm_server.py  55
LOW  mineru/cli/vlm_server.py  38
LOW  mineru/cli/client_side_output.py  2
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  2
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  2
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  2
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  3
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  3
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  3
LOW  mineru/model/mfr/unimernet/unimernet_hf/__init__.py  4
LOW  …model/mfr/unimernet/unimernet_hf/modeling_unimernet.py  8
LOW  …el/mfr/unimernet/unimernet_hf/unimer_mbart/__init__.py  2
LOW  …el/mfr/unimernet/unimernet_hf/unimer_mbart/__init__.py  3
LOW  …el/mfr/unimernet/unimernet_hf/unimer_mbart/__init__.py  3
LOW  …del/mfr/unimernet/unimernet_hf/unimer_swin/__init__.py  2
LOW  …del/mfr/unimernet/unimernet_hf/unimer_swin/__init__.py  3
LOW  …del/mfr/unimernet/unimernet_hf/unimer_swin/__init__.py  4
LOW  mineru/model/mfr/pp_formulanet_plus_m/processors.py  11
LOW  …neru/model/utils/pytorchocr/modeling/necks/__init__.py  19
LOW  …neru/model/utils/pytorchocr/modeling/necks/__init__.py  19
LOW  …neru/model/utils/pytorchocr/modeling/necks/__init__.py  19
LOW  …neru/model/utils/pytorchocr/modeling/necks/__init__.py  20
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  20
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  20
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  23
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  24
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  25
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  28
LOW  …/utils/pytorchocr/modeling/heads/rec_unimernet_head.py  4
LOW  …/utils/pytorchocr/modeling/heads/rec_unimernet_head.py  5
LOW  …ils/pytorchocr/modeling/heads/rec_ppformulanet_head.py  17
LOW  …ils/pytorchocr/modeling/heads/rec_ppformulanet_head.py  18
LOW  …ils/pytorchocr/modeling/heads/rec_ppformulanet_head.py  19
72 more matches not shown…

Cross-File Repetition: 14 hits · 70 pts
Severity  File  Line  Snippet
HIGH  …ernet/unimernet_hf/unimer_swin/modeling_unimer_swin.py  0  drop paths (stochastic depth) per sample (when applied in main path of residual blocks).
HIGH  …del/utils/pytorchocr/modeling/backbones/rec_svtrnet.py  0  drop paths (stochastic depth) per sample (when applied in main path of residual blocks).
HIGH  …/utils/pytorchocr/modeling/backbones/rec_donut_swin.py  0  drop paths (stochastic depth) per sample (when applied in main path of residual blocks).
HIGH  mineru/data/data_reader_writer/filebase.py  0  read at offset and limit. args: path (str): the path of file, if the path is relative path, it will be joined with paren
HIGH  mineru/data/io/s3.py  0  read at offset and limit. args: path (str): the path of file, if the path is relative path, it will be joined with paren
HIGH  mineru/data/io/base.py  0  read at offset and limit. args: path (str): the path of file, if the path is relative path, it will be joined with paren
HIGH  mineru/data/data_reader_writer/filebase.py  0  write file with data. args: path (str): the path of file, if the path is relative path, it will be joined with parent_di
HIGH  mineru/data/io/http.py  0  write file with data. args: path (str): the path of file, if the path is relative path, it will be joined with parent_di
HIGH  mineru/data/io/s3.py  0  write file with data. args: path (str): the path of file, if the path is relative path, it will be joined with parent_di
HIGH  mineru/data/io/base.py  0  write file with data. args: path (str): the path of file, if the path is relative path, it will be joined with parent_di
HIGH  mineru/data/data_reader_writer/base.py  0  read the file. args: path (str): file path to read returns: bytes: the content of the file
HIGH  mineru/data/io/http.py  0  read the file. args: path (str): file path to read returns: bytes: the content of the file
HIGH  mineru/data/io/s3.py  0  read the file. args: path (str): file path to read returns: bytes: the content of the file
HIGH  mineru/data/io/base.py  0  read the file. args: path (str): file path to read returns: bytes: the content of the file

Over-Commented Block: 28 hits · 28 pts
Severity  File  Line  Snippet
LOW  demo/demo.py  201
LOW  demo/demo.py  221  # Available options:
LOW  mineru/model/ocr/seal_det_warp.py  1  # Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
LOW  mineru/model/ocr/seal_crop.py  1  # Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
LOW  …net/unimernet_hf/unimer_mbart/modeling_unimer_mbart.py  1  # coding=utf-8
LOW  …net/unimernet_hf/unimer_mbart/modeling_unimer_mbart.py  421  if self.is_decoder:
LOW  …nimernet_hf/unimer_mbart/configuration_unimer_mbart.py  1  # coding=utf-8
LOW  …ernet/unimernet_hf/unimer_swin/modeling_unimer_swin.py  1  # coding=utf-8
LOW  …/unimernet_hf/unimer_swin/configuration_unimer_swin.py  1  # coding=utf-8
LOW  mineru/model/utils/tools/__init__.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  …neru/model/utils/pytorchocr/modeling/necks/__init__.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  mineru/model/utils/pytorchocr/modeling/necks/rnn.py  21  # def forward(self, x):
LOW  …el/utils/pytorchocr/modeling/architectures/__init__.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  …neru/model/utils/pytorchocr/modeling/heads/__init__.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  …ils/pytorchocr/modeling/heads/rec_ppformulanet_head.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  …/model/utils/pytorchocr/modeling/backbones/__init__.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  …del/utils/pytorchocr/modeling/backbones/rec_lcnetv3.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  …/model/utils/pytorchocr/postprocess/rec_postprocess.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  mineru/model/utils/pytorchocr/data/imaug/__init__.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  mineru/model/utils/pytorchocr/data/imaug/operators.py  1  """
LOW  mineru/model/table/rec/slanet_plus/matcher.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  mineru/model/table/rec/slanet_plus/table_structure.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  …u/model/table/rec/slanet_plus/table_structure_utils.py  1  # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
LOW  mineru/model/table/rec/slanet_plus/matcher_utils.py  1  # Copyright (c) Opendatalab. All rights reserved.
LOW  mineru/model/table/rec/slanet_plus/matcher_utils.py  121  if not has_span_in_head:
LOW  mineru/model/table/rec/unet_table/main.py  161  # sorted_polygons: np.ndarray,
LOW  .github/workflows/cla.yml  41  #custom-allsigned-prcomment: 'pull request comment when all contributors has signed, defaults to **CLA Assista
LOW  .github/workflows/cli.yml  41  # notify_to_feishu:

Magic Placeholder Names: 5 hits · 25 pts
Severity  File  Line  Snippet
HIGH  mineru.template.json  18  "api_key": "your_api_key",
HIGH  docs/zh/usage/quick_usage.md  138  "api_key": "your_api_key",
HIGH  docs/zh/usage/quick_usage.md  148  "api_key": "your_api_key",
HIGH  docs/en/usage/quick_usage.md  138  "api_key": "your_api_key",
HIGH  docs/en/usage/quick_usage.md  148  "api_key": "your_api_key",

Self-Referential Comments: 4 hits · 10 pts
Severity  File  Line  Snippet
MEDIUM  mineru/utils/model_utils.py  59  # Create a white background array
MEDIUM  mineru/utils/model_utils.py  69  # Create a white background array
MEDIUM  mineru/utils/draw_bbox.py  91  rect = cal_canvas_rect(page, bbox) # Define the rectangle
MEDIUM  mineru/utils/draw_bbox.py  110  rect = cal_canvas_rect(page, bbox) # Define the rectangle

AI Slop Vocabulary: 3 hits · 8 pts
Severity  File  Line  Snippet
MEDIUM  …ils/pytorchocr/modeling/heads/rec_ppformulanet_head.py  971  # 1. Check whether the user has defined `decoder_input_ids` manually. To facilitate in terms of input naming,
MEDIUM  …/model/utils/pytorchocr/postprocess/rec_postprocess.py  112  ): # grouping word with '-', such as 'state-of-the-art'
MEDIUM  mineru/model/docx/main.py  19  # provide a more robust command-line interface and resolve the demo

Redundant / Tautological Comments: 4 hits · 6 pts
Severity  File  Line  Snippet
LOW  demo/demo.py  209  # Set this to an existing MinerU FastAPI base URL, for example:
LOW  mineru/utils/model_utils.py  95  # Check if intersection is valid
LOW  mineru/utils/model_utils.py  112  # Check if overlap exceeds threshold
LOW  mineru/model/xlsx/xlsx_converter.py  389  # Check if file exists in zip

Cross-Language Confusion: 1 hit · 5 pts
Severity  File  Line  Snippet
HIGH  mineru/model/mfr/utils.py  312  r'\\(?:lefteqn|boldmath|ensuremath|centering|textsubscript|sides|textsl|textcent|emph|protect|null)')

Docstring Block Structure: 1 hit · 5 pts
Severity  File  Line  Snippet
HIGH  …l/utils/pytorchocr/modeling/backbones/rec_pphgnetv2.py  546  use 'handle_func' to modify the sub-layer(s) specified by 'layer_name_pattern'. Args: layer_name_pa