Repository Analysis

microsoft/data-formulator

🪄 Create rich visualizations with AI

37.6 Strong AI signal
Adjusted Score: 37.6
Raw Score: 37.6
Time Factor: 100%
Last Push: 2026-05-30
Stars: 15,760
Language: TypeScript
Lines of Code: 180,376
Files: 604
Pattern Hits: 3667
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL 2 · HIGH 14 · MEDIUM 1284 · LOW 2367
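
As a quick consistency check, the four severity counts should add up to the Pattern Hits figure reported at the top of the page. The sketch below does that arithmetic and derives each severity's share; the variable names are illustrative and not part of the scanner.

```python
# Severity counts and the Pattern Hits total, copied from the report above.
severity_counts = {"CRITICAL": 2, "HIGH": 14, "MEDIUM": 1284, "LOW": 2367}
reported_pattern_hits = 3667

# The breakdown is consistent if the counts sum to the reported total.
total = sum(severity_counts.values())
is_consistent = total == reported_pattern_hits

# Share of all hits per severity, as a percentage rounded to one decimal.
shares = {
    name: round(100 * count / total, 1)
    for name, count in severity_counts.items()
}

print(total, is_consistent)
print(shares)
```

Under these figures the MEDIUM and LOW findings together account for nearly all hits, which is worth keeping in mind when reading the per-category tables.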

Pattern Findings

3667 matches across 18 categories.

Decorative Section Separators · 1247 hits · 4384 pts
Severity | File | Line | Snippet
MEDIUM | py-src/data_formulator/agent_config.py | 39 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/agent_config.py | 41 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/agent_config.py | 48 | # ── Heavy: code-gen, multi-step, tool-using ─────────────────────────────
MEDIUM | py-src/data_formulator/agent_config.py | 56 | # ── Light: single-turn extractors / classifiers / formatters ────────────
MEDIUM | py-src/data_formulator/error_handler.py | 51 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 53 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 103 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 105 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 133 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 135 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 201 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 203 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 244 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/error_handler.py | 246 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 52 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 54 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 1172 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 1174 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 1951 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 1953 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 308 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 310 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 667 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 669 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 698 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 700 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 2162 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/data_connector.py | 2164 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/url_allowlist.py | 46 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/url_allowlist.py | 48 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/url_allowlist.py | 67 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/url_allowlist.py | 69 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/code_signing.py | 35 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/code_signing.py | 37 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/code_signing.py | 96 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/code_signing.py | 98 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 37 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 39 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 56 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 58 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 106 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 108 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 190 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/security/log_sanitizer.py | 192 | # ---------------------------------------------------------------------------
MEDIUM | py-src/data_formulator/auth/token_store.py | 31 | # ── Core interface ────────────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 107 | # ── Store / clear ─────────────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 182 | # ── Internal: cache ───────────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 192 | # ── Internal: refresh ─────────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 227 | # ── Internal: SSO exchange ────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 263 | # ── Internal: vault ───────────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 308 | # ── Internal: SSO refresh ─────────────────────────────────
MEDIUM | py-src/data_formulator/auth/token_store.py | 340 | # ── Internal: auth_config lookup ──────────────────────────
MEDIUM | py-src/data_formulator/auth/gateways/oidc_gateway.py | 203 | # ── Token management routes (work in all AUTH_MODE settings) ──────
MEDIUM | py-src/data_formulator/auth/vault/local_vault.py | 49 | # ------------------------------------------------------------------
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 1762 | # ── 2. Tooltips ───────────────────────────────────────────────────────
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 1765 | # ── 3. Canvas sizing defaults ─────────────────────────────────────────
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 1770 | # ── 4. Axis label limits (prevent long labels from overflowing) ──────
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 1776 | # ── 5. Step-based sizing for wide discrete axes ───────────────────────
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 731 | # ── Apply semantic enhancements when available ─────────────────────
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 878 | # ---------------------------------------------------------------------------
1187 more matches not shown…
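
The report does not say how separator comments are matched. As a rough sketch, a rule of this kind could be approximated with a regular expression that looks for a comment line containing a long run of one repeated rule character. The pattern, the run length of 10, and the function name `find_decorative_separators` are all assumptions made for illustration.

```python
import re

# Assumed rule (not taken from the scanner): a line that starts with a comment
# marker and contains a run of 10 or more identical rule characters.
SEPARATOR_RE = re.compile(r"^\s*(?:#|//).*?([-=─*~])\1{9,}")


def find_decorative_separators(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each separator-style comment."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if SEPARATOR_RE.search(line)
    ]


# Sample input modelled on the two comment styles visible in the table above.
sample = "\n".join([
    "# " + "-" * 75,
    "# Error classification",
    "# " + "-" * 75,
    "    # ── Core interface " + "─" * 40,
    "total = a - b  # plain trailing comment",
])
hits = find_decorative_separators(sample)
print([number for number, _ in hits])
```

Both the plain `# -----` rules and the labelled `# ── Title ────` banners match under this pattern, while ordinary comments and trailing comments do not.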
Hyper-Verbose Identifiers · 1448 hits · 1450 pts
Severity | File | Line | Snippet
LOW | .cursor/skills/path-safety/SKILL.md | 118 | def _enforce_deployment_restrictions():
LOW | py-src/data_formulator/workspace_factory.py | 30 | def _build_azure_container_client(cfg: dict):
LOW | py-src/data_formulator/workspace_factory.py | 78 | def _get_user_workspaces_root(identity_id: str) -> Path:
LOW | py-src/data_formulator/error_handler.py | 76 | def classify_and_wrap_llm_error(exc: Exception) -> AppError:
LOW | py-src/data_formulator/data_connector.py | 56 | def classify_and_raise_connector_error(error: Exception, *, operation: str = "") -> None:
LOW | py-src/data_formulator/data_connector.py | 1448 | def connector_get_catalog_tree():
LOW | py-src/data_formulator/data_connector.py | 135 | def _lightweight_tree_for_response(tree: list[dict[str, Any]]) -> list[dict[str, Any]]:
LOW | py-src/data_formulator/data_connector.py | 193 | def _is_sensitive_or_auth_param(
LOW | py-src/data_formulator/data_connector.py | 272 | def _resolve_connector_with_key(data: dict[str, Any]) -> tuple[str, "DataConnector"]:
LOW | py-src/data_formulator/data_connector.py | 1506 | def connector_get_cached_catalog_tree():
LOW | py-src/data_formulator/data_connector.py | 1563 | def connector_sync_catalog_metadata():
LOW | py-src/data_formulator/security/sanitize.py | 25 | def _extract_traceback_summary(message: str) -> str:
LOW | py-src/data_formulator/security/sanitize.py | 54 | def _structured_error_response(code: str, message: str, status_code: int):
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 22 | def field_metadata_to_semantic_types(
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 539 | def assign_aesthetic_channels():
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1225 | def _post_process_candlestick(
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1581 | def _post_process_streamgraph(spec: dict, encodings: dict, config: dict | None) -> None:
LOW | py-src/data_formulator/workflows/chart_semantics.py | 570 | def resolve_channel_semantics(
LOW | py-src/data_formulator/workflows/chart_semantics.py | 184 | def _looks_like_year_integers(values: List[Any]) -> bool:
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 385 | def _build_connector_summary_block(
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 811 | def _tool_show_user_data_preview(self, args, scratch_jail):
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1125 | def _normalize_load_plan_candidate(self, candidate):
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1205 | def _format_valid_sources_hint(self) -> str:
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1238 | def _normalize_load_plan_filters(filters):
LOW | py-src/data_formulator/agents/client_utils.py | 72 | def _strip_images_from_messages(self, messages):
LOW | py-src/data_formulator/agents/client_utils.py | 85 | def _is_image_deserialize_error(self, error_text: str) -> bool:
LOW | py-src/data_formulator/agents/client_utils.py | 90 | def _is_reasoning_effort_error(self, error_text: str) -> bool:
LOW | py-src/data_formulator/agents/client_utils.py | 165 | def get_completion_with_tools(self, messages, tools, stream=False,
LOW | py-src/data_formulator/agents/agent_utils.py | 83 | def accumulate_reasoning_content(
LOW | py-src/data_formulator/agents/agent_utils.py | 109 | def _source_table_matches_catalog_entry(
LOW | py-src/data_formulator/agents/agent_utils.py | 124 | def build_catalog_metadata_lookups(
LOW | py-src/data_formulator/agents/agent_utils.py | 207 | def format_dataframe_sample_with_budget(
LOW | py-src/data_formulator/agents/agent_utils.py | 241 | def field_name_to_ts_variable_name(field_name):
LOW | py-src/data_formulator/agents/agent_utils.py | 292 | def extract_code_from_gpt_response(code_raw, language):
LOW | py-src/data_formulator/agents/agent_utils.py | 372 | def _fix_json_trailing_commas(s: str) -> str:
LOW | py-src/data_formulator/agents/agent_utils.py | 738 | def ensure_output_variable_in_code(code: str, output_variable: str) -> tuple[str, bool, str]:
LOW | …src/data_formulator/agents/agent_experience_distill.py | 395 | def _add_fallback_front_matter(
LOW | py-src/data_formulator/agents/agent_utils_sql.py | 17 | def create_duckdb_conn_with_parquet_views(workspace, input_tables: list[dict]):
LOW | py-src/data_formulator/agents/context.py | 29 | def _ensure_no_auth_catalogs_cached(user_home: Any) -> None:
LOW | py-src/data_formulator/agents/context.py | 84 | def _get_workspace_metadata_lookups(workspace: Any) -> tuple[dict[str, str], dict[str, dict[str, str]], dict[str, str]]:
LOW | py-src/data_formulator/agents/context.py | 110 | def build_focused_thread_context(focused_thread: list[dict[str, Any]]) -> str:
LOW | py-src/data_formulator/agents/context.py | 149 | def build_peripheral_thread_context(other_threads: list[dict[str, Any]]) -> str:
LOW | py-src/data_formulator/agents/context.py | 166 | def build_lightweight_table_context(
LOW | py-src/data_formulator/agents/context.py | 301 | def handle_inspect_source_data(
LOW | py-src/data_formulator/agents/context.py | 342 | def handle_read_catalog_metadata(
LOW | py-src/data_formulator/agents/semantic_types.py | 264 | def generate_semantic_types_prompt() -> str:
LOW | py-src/data_formulator/agents/data_agent.py | 77 | def _rescue_unpack_json_strings(data: dict) -> None:
LOW | py-src/data_formulator/agents/data_agent.py | 1475 | def _build_focused_thread_context(
LOW | py-src/data_formulator/agents/data_agent.py | 1480 | def _build_peripheral_thread_context(
LOW | py-src/data_formulator/agents/data_agent.py | 1485 | def _build_lightweight_table_context(
LOW | py-src/data_formulator/agents/data_agent.py | 1888 | def _search_relevant_knowledge(
LOW | py-src/data_formulator/agents/data_agent.py | 817 | def _sanitize_clarification_options(cls, raw_options: Any) -> list[dict[str, Any]]:
LOW | py-src/data_formulator/agents/data_agent.py | 852 | def _sanitize_clarification_questions(cls, raw_questions: Any) -> list[dict[str, Any]]:
LOW | py-src/data_formulator/agents/data_agent.py | 888 | def _normalize_clarify_action(cls, action: dict[str, Any]) -> dict[str, Any]:
LOW | py-src/data_formulator/agents/data_agent.py | 895 | def _normalize_explain_action(cls, action: dict[str, Any]) -> dict[str, Any]:
LOW | py-src/data_formulator/agents/data_agent.py | 919 | def _normalize_delegate_action(cls, action: dict[str, Any]) -> dict[str, Any]:
LOW | py-src/data_formulator/agents/data_agent.py | 1916 | def _load_active_session_experience(self) -> dict[str, Any] | None:
LOW | py-src/data_formulator/agents/web_utils.py | 278 | def get_html_meta_description(html_content: str) -> str | None:
LOW | py-src/data_formulator/agents/agent_language.py | 71 | def inject_language_instruction(
LOW | py-src/data_formulator/agents/agent_language.py | 104 | def build_language_instruction(language: str, *, mode: str = "full") -> str:
1388 more matches not shown…
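
The scanner's definition of a "hyper-verbose" identifier is not given in the report. One plausible approximation is a simple length threshold on function names. The sketch below uses Python's `ast` module for that; the threshold of 25 characters is an assumption, chosen only so that names shown in the table such as `_post_process_candlestick` would qualify, and `find_long_function_names` is a hypothetical helper.

```python
import ast


def find_long_function_names(source: str, min_length: int = 25) -> list[tuple[int, str]]:
    """Return (line_number, name) for functions whose name is at least min_length long."""
    tree = ast.parse(source)
    hits = [
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and len(node.name) >= min_length  # assumed threshold, not the scanner's
    ]
    return sorted(hits)


sample = (
    "def _post_process_candlestick(spec):\n"
    "    pass\n"
    "\n"
    "def run():\n"
    "    pass\n"
)
long_names = find_long_function_names(sample)
print(long_names)
```

A length-only rule like this says nothing about whether a name is actually unclear, which may be why every finding in this category is rated LOW.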
Excessive Try-Catch Wrapping · 400 hits · 358 pts
Severity | File | Line | Snippet
LOW | .cursor/skills/error-handling/SKILL.md | 77 | except Exception as e:
LOW | .cursor/skills/error-handling/SKILL.md | 117 | except Exception as e:
LOW | .cursor/skills/error-handling/SKILL.md | 267 | except Exception as e:
LOW | .cursor/skills/error-handling/SKILL.md | 285 | except Exception as e:
MEDIUM | .cursor/skills/error-handling/SKILL.md | 113 | def generate():
MEDIUM | .cursor/skills/error-handling/SKILL.md | 263 | def my_table_op():
LOW | py-src/data_formulator/_startup_spinner.py | 32 | except Exception:
LOW | py-src/data_formulator/app.py | 318 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1141 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1147 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1157 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1164 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1438 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1443 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1946 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 282 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 472 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 487 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 499 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 510 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 571 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 605 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 618 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 650 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 860 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 876 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 887 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 975 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1022 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1056 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1060 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1110 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1123 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1267 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1284 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1288 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1319 | except Exception as exc:
LOW | py-src/data_formulator/data_connector.py | 1328 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1364 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1375 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1486 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1501 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1554 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1617 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1648 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1676 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1717 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1753 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1763 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1817 | except Exception:
LOW | py-src/data_formulator/data_connector.py | 1835 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1870 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 1932 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 2000 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 2068 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 2096 | except Exception as e:
LOW | py-src/data_formulator/data_connector.py | 2114 | except Exception as e:
LOW | py-src/data_formulator/auth/token_store.py | 89 | except Exception:
LOW | py-src/data_formulator/auth/token_store.py | 223 | except Exception as exc:
LOW | py-src/data_formulator/auth/token_store.py | 259 | except Exception as exc:
340 more matches not shown…
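
Every LOW snippet in this table is an `except Exception` clause, so the LOW findings appear to be broad exception handlers. A minimal sketch of how such handlers could be located with Python's `ast` module follows; `find_broad_excepts` is a hypothetical helper, and treating bare `except:` and `BaseException` as equally broad is an assumption rather than something the report states. The rule behind the two MEDIUM rows, which point at `def` lines, is not covered here.

```python
import ast

# Exception names treated as "broad" in this sketch (an assumption).
BROAD_NAMES = {"Exception", "BaseException"}


def find_broad_excepts(source: str) -> list[int]:
    """Return the line numbers of handlers that catch everything."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:  # bare "except:"
            hits.append(node.lineno)
        elif isinstance(node.type, ast.Name) and node.type.id in BROAD_NAMES:
            hits.append(node.lineno)
    return sorted(hits)


sample = (
    "try:\n"
    "    risky()\n"
    "except Exception as e:\n"
    "    log(e)\n"
    "try:\n"
    "    other()\n"
    "except ValueError:\n"
    "    pass\n"
)
broad_lines = find_broad_excepts(sample)
print(broad_lines)
```

The sample is only parsed, never executed, so the undefined names `risky`, `other`, and `log` are harmless. Handlers that name a specific exception, like the `ValueError` clause, are left alone.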
Unused Imports · 333 hits · 272 pts
Severity | File | Line | Snippet
LOW | py-src/data_formulator/_startup_spinner.py | 14
LOW | py-src/data_formulator/workspace_factory.py | 24
LOW | py-src/data_formulator/agent_config.py | 32
LOW | py-src/data_formulator/app.py | 24
LOW | py-src/data_formulator/app.py | 25
LOW | py-src/data_formulator/app.py | 26
LOW | py-src/data_formulator/app.py | 26
LOW | py-src/data_formulator/app.py | 31
LOW | py-src/data_formulator/error_handler.py | 19
LOW | py-src/data_formulator/errors.py | 15
LOW | py-src/data_formulator/security/url_allowlist.py | 38
LOW | py-src/data_formulator/security/log_sanitizer.py | 28
LOW | py-src/data_formulator/security/log_sanitizer.py | 30
LOW | py-src/data_formulator/security/path_safety.py | 14
LOW | py-src/data_formulator/security/sanitize.py | 6
LOW | py-src/data_formulator/security/sanitize.py | 11
LOW | py-src/data_formulator/auth/token_store.py | 12
LOW | py-src/data_formulator/auth/token_store.py | 17
LOW | py-src/data_formulator/auth/token_store.py | 17
LOW | py-src/data_formulator/auth/providers/oidc.py | 37
LOW | py-src/data_formulator/auth/providers/oidc.py | 51
LOW | py-src/data_formulator/auth/providers/github_oauth.py | 18
LOW | py-src/data_formulator/auth/providers/azure_easyauth.py | 17
LOW | py-src/data_formulator/auth/providers/__init__.py | 16
LOW | py-src/data_formulator/auth/providers/base.py | 10
LOW | py-src/data_formulator/auth/gateways/github_gateway.py | 12
LOW | py-src/data_formulator/auth/gateways/oidc_gateway.py | 20
LOW | py-src/data_formulator/auth/vault/__init__.py | 17
LOW | py-src/data_formulator/auth/vault/base.py | 5
LOW | py-src/data_formulator/auth/vault/local_vault.py | 12
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 2
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 9
LOW | py-src/data_formulator/workflows/chart_semantics.py | 23
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 16
LOW | py-src/data_formulator/agents/reasoning_log.py | 22
LOW | …rc/data_formulator/agents/agent_interactive_explore.py | 11
LOW | py-src/data_formulator/agents/__init__.py | 4
LOW | py-src/data_formulator/agents/__init__.py | 5
LOW | py-src/data_formulator/agents/__init__.py | 7
LOW | py-src/data_formulator/agents/__init__.py | 8
LOW | py-src/data_formulator/agents/__init__.py | 9
LOW | py-src/data_formulator/agents/__init__.py | 10
LOW | py-src/data_formulator/agents/__init__.py | 11
LOW | py-src/data_formulator/agents/__init__.py | 12
LOW | …src/data_formulator/agents/agent_experience_distill.py | 17
LOW | py-src/data_formulator/agents/agent_report_gen.py | 23
LOW | py-src/data_formulator/agents/agent_report_gen.py | 35
LOW | py-src/data_formulator/agents/agent_report_gen.py | 35
LOW | py-src/data_formulator/agents/agent_report_gen.py | 35
LOW | py-src/data_formulator/agents/agent_report_gen.py | 35
LOW | py-src/data_formulator/agents/agent_report_gen.py | 35
LOW | py-src/data_formulator/agents/agent_diagnostics.py | 11
LOW | py-src/data_formulator/agents/agent_simple.py | 14
LOW | py-src/data_formulator/agents/agent_data_rec.py | 4
LOW | py-src/data_formulator/agents/web_utils.py | 8
LOW | py-src/data_formulator/agents/web_utils.py | 9
LOW | py-src/data_formulator/agents/agent_code_explanation.py | 6
LOW | py-src/data_formulator/datalake/table_names.py | 29
LOW | py-src/data_formulator/datalake/catalog_cache.py | 24
LOW | py-src/data_formulator/datalake/workspace_metadata.py | 11
273 more matches not shown…
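
The table lists only file and line, so the exact check is not visible. A common way to approximate an unused-import rule is to collect the names each import binds and compare them with the names the module actually references. The sketch below does this with `ast`; `find_unused_imports` is a hypothetical helper and the logic is a simplification, not the scanner's implementation.

```python
import ast


def find_unused_imports(source: str) -> list[tuple[int, str]]:
    """Return (line_number, bound_name) for imports never referenced in the module."""
    tree = ast.parse(source)
    imported: dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                imported[alias.asname or alias.name.split(".")[0]] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
    # Every plain name reference, including annotations and attribute bases.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted((line, name) for name, line in imported.items() if name not in used)


sample = (
    "import os\n"
    "import sys\n"
    "from typing import Any, List\n"
    "\n"
    "def cwd(x: Any):\n"
    "    return os.getcwd()\n"
)
unused = find_unused_imports(sample)
print(unused)
```

A naive check like this reports one hit per unused name, so several hits can share a line when one `from ... import a, b, c` statement binds multiple unused names. It also flags imports that exist only to be re-exported, which is a common pattern in `__init__.py` files; whether the `__init__.py` hits listed above fall into that group cannot be determined from the report.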
Deep Nesting · 118 hits · 88 pts
Severity | File | Line | Snippet
LOW | py-src/data_formulator/data_connector.py | 234
LOW | py-src/data_formulator/data_connector.py | 763
LOW | py-src/data_formulator/data_connector.py | 942
LOW | py-src/data_formulator/data_connector.py | 1448
LOW | py-src/data_formulator/data_connector.py | 1563
LOW | py-src/data_formulator/data_connector.py | 1768
LOW | py-src/data_formulator/data_connector.py | 2092
LOW | py-src/data_formulator/auth/token_store.py | 346
LOW | py-src/data_formulator/auth/providers/__init__.py | 30
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 309
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 610
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1032
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1078
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1735
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 1855
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 375
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 426
LOW | py-src/data_formulator/workflows/chart_semantics.py | 460
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 501
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 644
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 733
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 964
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1238
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1291
LOW | …-src/data_formulator/agents/agent_data_loading_chat.py | 1362
LOW | py-src/data_formulator/agents/client_utils.py | 10
LOW | py-src/data_formulator/agents/client_utils.py | 58
LOW | py-src/data_formulator/agents/agent_utils.py | 124
LOW | py-src/data_formulator/agents/agent_utils.py | 250
LOW | py-src/data_formulator/agents/agent_utils.py | 315
LOW | py-src/data_formulator/agents/agent_utils.py | 388
LOW | py-src/data_formulator/agents/agent_utils.py | 429
LOW | py-src/data_formulator/agents/agent_utils.py | 559
LOW | …rc/data_formulator/agents/agent_interactive_explore.py | 108
LOW | …src/data_formulator/agents/agent_experience_distill.py | 335
LOW | py-src/data_formulator/agents/agent_report_gen.py | 234
LOW | py-src/data_formulator/agents/context.py | 29
LOW | py-src/data_formulator/agents/context.py | 84
LOW | py-src/data_formulator/agents/context.py | 110
LOW | py-src/data_formulator/agents/agent_data_rec.py | 127
LOW | py-src/data_formulator/agents/data_agent.py | 437
LOW | py-src/data_formulator/agents/data_agent.py | 919
LOW | py-src/data_formulator/agents/data_agent.py | 1041
LOW | py-src/data_formulator/agents/data_agent.py | 1342
LOW | py-src/data_formulator/agents/data_agent.py | 1552
LOW | py-src/data_formulator/agents/data_agent.py | 2063
LOW | py-src/data_formulator/agents/agent_chart_restyle.py | 75
LOW | py-src/data_formulator/agents/agent_data_transform.py | 124
LOW | py-src/data_formulator/agents/agent_chart_insight.py | 38
LOW | py-src/data_formulator/datalake/catalog_cache.py | 155
LOW | py-src/data_formulator/datalake/catalog_cache.py | 207
LOW | py-src/data_formulator/datalake/workspace_metadata.py | 555
LOW | py-src/data_formulator/datalake/cache_manager.py | 170
LOW | py-src/data_formulator/datalake/cache_manager.py | 267
LOW | py-src/data_formulator/datalake/cache_manager.py | 335
LOW | …ta_formulator/datalake/azure_blob_workspace_manager.py | 116
LOW | py-src/data_formulator/datalake/workspace.py | 117
LOW | py-src/data_formulator/datalake/workspace.py | 481
LOW | py-src/data_formulator/datalake/workspace.py | 785
LOW | py-src/data_formulator/datalake/workspace.py | 801
58 more matches not shown…
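
Several line numbers in this table coincide with `def` lines listed under Hyper-Verbose Identifiers (for example `data_connector.py` lines 1448 and 1563), which suggests that nesting is reported once per function. A sketch of a per-function nesting measure built on `ast` follows. The set of block statements counted, the threshold of 4, and both function names are assumptions made for illustration.

```python
import ast

# Statements that open a new nesting level in this sketch (an assumption).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try,
                 ast.AsyncFor, ast.AsyncWith)


def max_nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested block statements below node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest


def deeply_nested_functions(source: str, threshold: int = 4) -> list[tuple[int, str, int]]:
    """Return (def_line, name, depth) for functions nested at least threshold deep."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting_depth(node)
            if depth >= threshold:
                hits.append((node.lineno, node.name, depth))
    return sorted(hits)


sample = (
    "def flat(x):\n"
    "    return x\n"
    "\n"
    "def nested(items):\n"
    "    for item in items:\n"
    "        if item:\n"
    "            with open(item) as handle:\n"
    "                while True:\n"
    "                    break\n"
)
nested_hits = deeply_nested_functions(sample)
print(nested_hits)
```

One caveat of this simplification: Python's AST represents `elif` as an `If` inside the parent's `orelse`, so long `if`/`elif` chains count as progressively deeper nesting even though they read as flat.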
Over-Commented Block · 67 hits · 56 pts
Severity | File | Line | Snippet
LOW | py-src/data_formulator/errors.py | 61
LOW | py-src/data_formulator/security/__init__.py | 1 | # Copyright (c) Microsoft Corporation.
LOW | py-src/data_formulator/auth/__init__.py | 1 | # Copyright (c) Microsoft Corporation.
LOW | py-src/data_formulator/auth/gateways/__init__.py | 1 | # Copyright (c) Microsoft Corporation.
LOW | py-src/data_formulator/datalake/azure_blob_workspace.py | 121
LOW | py-src/data_formulator/knowledge/__init__.py | 1 | # Copyright (c) Microsoft Corporation.
LOW | …rc/data_formulator/data_loader/external_data_loader.py | 601 | * ``path`` *(optional)* — explicit hierarchy path as a list
LOW | py-src/data_formulator/routes/__init__.py | 1 | # Copyright (c) Microsoft Corporation.
LOW | docs/docs-cn/config-examples/superset/oauth_config.py | 421 | },
LOW | src/icons.tsx | 161 | // sx={{
LOW | src/scss/EncodingShelf.scss | 161 | }
LOW | src/components/VirtualizedCatalogTree.tsx | 101 | // - Namespace (folder-like, no semantic icon): chevron itself acts as the
LOW | src/components/ConnectorTablePreview.tsx | 441 | <Typography sx={{ fontSize: 14, fontWeight: 600 }} noWrap>{displayName}</Typography>
LOW | src/lib/agents-chart/core/compute-layout.ts | 341 | // --- Gas pressure stretch for continuous non-banded axes ---
LOW | src/lib/agents-chart/core/compute-layout.ts | 721 | if (count <= 0) continue;
LOW | src/lib/agents-chart/core/compute-layout.ts | 1121 | // When a busy discrete axis makes each subplot wider than the
LOW | src/lib/agents-chart/core/compute-layout.ts | 1221 | const baseMinSubplot = options.minSubplotSize ?? 60;
LOW | src/lib/agents-chart/core/semantic-types.ts | 221 | // getAncestorTypes, isSubtypeOf) have been removed. They were unused
LOW | …/lib/agents-chart/test-data/line-area-stretch-tests.ts | 381 | // -----------------------------------------------------------------------
LOW | src/lib/agents-chart/test-data/area-tests.ts | 1 | // Copyright (c) Microsoft Corporation.
LOW | src/lib/agents-chart/test-data/scatter-tests.ts | 1 | // Copyright (c) Microsoft Corporation.
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 121 | // 3. Domain constraint + Tick constraint (Rating [1,5])
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 221 | // Only triggers when data spans ≥ 4 orders of magnitude (10,000×)
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 321 | },
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 421 | // Semantic type: { semanticType: "Rating", intrinsicDomain: [1, 5] }
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 641 | genScoreColorDivergingTest(),
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 741 | }
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 881 | x: makeEncodingItem('stock'),
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 921 | sales: { type: Type.Number, semanticType: 'Amount', levels: [] },
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1061 | },
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1221 | // 25. Duration with unit suffix — additive measure, "min" suffix
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1261 | encodingMap: {
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1321 | };
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1361 | annual_cost: { semanticType: 'Amount', unit: 'USD' },
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1461 | y: makeEncodingItem('population'),
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1561 | }
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1601 | },
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1661 | //
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1761
LOW | src/lib/agents-chart/test-data/semantic-tests.ts | 1861 | };
LOW | src/lib/agents-chart/test-data/bar-tests.ts | 1 | // Copyright (c) Microsoft Corporation.
LOW | src/lib/agents-chart/test-data/line-tests.ts | 1 | // Copyright (c) Microsoft Corporation.
LOW | src/lib/agents-chart/vegalite/instantiate-spec.ts | 121 | // (on by default) already provides breathing room with clean
LOW | src/lib/agents-chart/vegalite/instantiate-spec.ts | 661 | for (const [ch, cs] of Object.entries(channelSemantics)) {
LOW | src/lib/agents-chart/vegalite/instantiate-spec.ts | 721 | // Full constraints (both min+max) set scale.domain directly.
LOW | src/lib/agents-chart/vegalite/instantiate-spec.ts | 841 | if (min !== undefined && max !== undefined) {
LOW | src/lib/agents-chart/vegalite/instantiate-spec.ts | 881 | if (cs.tickConstraint.exactTicks && !enc.axis.values) {
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 81 | // (many rows) leaves bars only a sliver wide.
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 261 | const scopeKey = hasFacet ? scopeKeyOf(r) : '';
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 281 | // we show it as-is and let VL apply its default number rendering.
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 361
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 401 | // (e.g. Month, Day-of-week, Rank); otherwise rank by aggregated x.
LOW | src/lib/agents-chart/vegalite/templates/bar-table.ts | 661 | const othersTextTest = canTrim
LOW | src/lib/agents-chart/vegalite/templates/kpi-card.ts | 141 | } else {
LOW | src/lib/agents-chart/vegalite/templates/kpi-card.ts | 181 | const H = rows * tileH + (rows - 1) * spacing;
LOW | src/views/SessionDistill.tsx | 61 | isLeafDerivedTable,
LOW | src/views/EncodingShelfCard.tsx | 401 | // Intent-classifier round-trip in progress. Distinct from isRestyling so
LOW | src/views/DataSourceSidebar.tsx | 921
LOW | src/views/InteractionEntryCard.tsx | 321 | && !isActiveAgentPause
LOW | src/views/InteractionEntryCard.tsx | 341 | // than "in-progress discussion".
7 more matches not shown…
Docstring Block Structure · 9 hits · 45 pts
Severity | File | Line | Snippet
HIGH | py-src/data_formulator/workflows/create_vl_plots.py | 618 | Assemble a Vega-Lite chart specification from a dataframe, chart type, and encodings. Parameters: - df
HIGH | py-src/data_formulator/agents/web_utils.py | 57 | Validate a URL to prevent SSRF attacks. Performs the following checks: 1. Protocol validation (HTTP/HT
HIGH | py-src/data_formulator/agents/web_utils.py | 114 | Download HTML content from a given URL with SSRF protection. This function implements comprehensive SSRF p
HIGH | py-src/data_formulator/datalake/parquet_utils.py | 44 | Unicode-safe filename sanitisation for data files. Prevents path traversal by extracting the basename while preserv
HIGH | py-src/data_formulator/datalake/workspace.py | 460 | Get the filename for a table, suitable for use in generated code. Returns just the filename (e.g. "sal
HIGH | py-src/data_formulator/datalake/workspace.py | 482 | Read a table from the workspace as a pandas DataFrame. Automatically selects the appropriate reader ba
HIGH | py-src/data_formulator/datalake/workspace_manager.py | 337 | Open an existing workspace and return a Workspace instance. Args: workspace_id: Workspace
HIGH | py-src/data_formulator/datalake/file_manager.py | 208 | Save an uploaded file to the workspace. The file is stored as-is without conversion. Metadata is added to
HIGH | …rc/data_formulator/data_loader/external_data_loader.py | 397 | Fetch data from the external source as a PyArrow Table. This is the primary method for data fe
AI Slop Vocabulary · 9 hits · 28 pts
Severity | File | Line | Snippet
MEDIUM | py-src/data_formulator/workflows/create_vl_plots.py | 848 | # Uses the robust convert_temporal_data which handles datetime objects,
MEDIUM | py-src/data_formulator/workflows/chart_semantics.py | 267 | # Date string detection (mirrors TS looksLikeDateString, much more robust)
MEDIUM | public/df_global_energy.json | 1 | {"tables": [{"kind": "table", "id": "global-energy-20-small.csv", "displayId": "energy-co2", "names": ["Year", "Entity",
MEDIUM | public/df_global_energy.json | 1 | {"tables": [{"kind": "table", "id": "global-energy-20-small.csv", "displayId": "energy-co2", "names": ["Year", "Entity",
MEDIUM | public/df_global_energy.json | 1 | {"tables": [{"kind": "table", "id": "global-energy-20-small.csv", "displayId": "energy-co2", "names": ["Year", "Entity",
MEDIUM | public/df_stock_prices_live.json | 1 | {"tables": [{"kind": "table", "id": "history", "displayId": "stock-hist", "names": ["symbol", "date", "open", "high", "l
MEDIUM | public/df_stock_prices_live.json | 1 | {"tables": [{"kind": "table", "id": "history", "displayId": "stock-hist", "names": ["symbol", "date", "open", "high", "l
MEDIUM | public/df_unemployment.json | 1 | {"tables": [{"kind": "table", "id": "unemployment-across-industries", "displayId": "unemp-by-ind", "names": ["series", "
MEDIUM | public/df_unemployment.json | 1 | {"tables": [{"kind": "table", "id": "unemployment-across-industries", "displayId": "unemp-by-ind", "names": ["series", "
Self-Referential Comments · 6 hits · 21 pts
Severity | File | Line | Snippet
MEDIUM | py-src/data_formulator/agents/web_utils.py | 163 | # Create a custom adapter to hook into redirect handling
MEDIUM | tests/backend/benchmarks/benchmark_sandbox.py | 195 | # Create a temporary workspace directory
MEDIUM | tests/backend/data/test_workspace_manager.py | 435 | # Create a legacy source workspace (no workspace_meta.json)
MEDIUM | tests/backend/data/test_all_loader_verification.py | 103 | # Create a minimal stub that has params but doesn't connect
MEDIUM | tests/database-dockers/docker-compose.test.yml | 3 | # This file is intentionally additive: existing per-service docker-compose.yml
MEDIUM | tests/database-dockers/superset/init-superset.sh | 4 | # This file is kept as a reference if you need to customize init further.
Hallucination Indicators · 2 hits · 20 pts
Severity | File | Line | Snippet
CRITICAL | py-src/data_formulator/datalake/azure_blob_workspace.py | 169 | from azure.core.exceptions import ResourceNotFoundError
CRITICAL | …ta_formulator/datalake/azure_blob_workspace_manager.py | 77 | from azure.core.exceptions import ResourceNotFoundError
Synthetic Comment Markers · 2 hits · 16 pts
Severity | File | Line | Snippet
HIGH | tests/backend/benchmarks/benchmark_sandbox.py | 26 | # Realistic Data Formulator code snippets (typical AI-generated transforms)
HIGH | src/components/ComponentType.tsx | 371 | insight?: ChartInsight, // AI-generated insight about the visualization
Redundant / Tautological Comments · 9 hits · 12 pts
Severity | File | Line | Snippet
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 101 | # Check if values look like 4-digit years (1000-2999).
LOW | py-src/data_formulator/workflows/create_vl_plots.py | 107 | # Check if it looks like a discrete categorical variable
LOW | py-src/data_formulator/agents/agent_utils.py | 754 | # Check if output_variable appears as an assignment target (= but not ==, !=, <=, >=)
LOW | py-src/data_formulator/agents/data_agent.py | 1435 | # Check if any step in the focused thread has a chart thumbnail
LOW | py-src/data_formulator/agents/web_utils.py | 29 | # Check if IP is private, loopback, link-local, multicast, reserved, or unspecified
LOW | py-src/data_formulator/agents/web_utils.py | 100 | # Check if this resolved IP is private/internal
LOW | py-src/data_formulator/datalake/file_manager.py | 256 | # Write the file
LOW | py-src/data_formulator/data_loader/kusto_data_loader.py | 78 | # Check if values look like datetime strings or timestamp numbers
LOW | tests/database-dockers/cosmosdb/test_cosmosdb_loader.py | 52 | # Check if the database exists (seed_data.py must have run)
Example Usage Blocks · 6 hits · 9 pts
Severity | File | Line | Snippet
LOW | tests/database-dockers/bigquery/start.sh | 4 | # Usage:
LOW | tests/database-dockers/mongodb/start.sh | 4 | # Usage:
LOW | tests/database-dockers/postgres/start.sh | 4 | # Usage:
LOW | tests/database-dockers/cosmosdb/start.sh | 4 | # Usage:
LOW | tests/database-dockers/mysql/start.sh | 4 | # Usage:
LOW | tests/database-dockers/superset/start.sh | 4 | # Usage:
Slop Phrases · 3 hits · 9 pts
Severity | File | Line | Snippet
MEDIUM | public/df_global_energy.json | 1 | {"tables": [{"kind": "table", "id": "global-energy-20-small.csv", "displayId": "energy-co2", "names": ["Year", "Entity",
MEDIUM | public/df_stock_prices_live.json | 1 | {"tables": [{"kind": "table", "id": "history", "displayId": "stock-hist", "names": ["symbol", "date", "open", "high", "l
MEDIUM | public/df_unemployment.json | 1 | {"tables": [{"kind": "table", "id": "unemployment-across-industries", "displayId": "unemp-by-ind", "names": ["series", "
Cross-Language Confusion · 2 hits · 9 pts
Severity | File | Line | Snippet
HIGH | py-src/data_formulator/routes/agents.py | 1253 | raw = raw.replace(': NaN,', ': null,').replace(': NaN}', ': null}').replace(':NaN,', ':null,').replace('
HIGH | tests/backend/agents/test_agent_diagnostics.py | 222 | """The front-end currently does JSON.stringify(diagnostics, null, 2)
Verbosity Indicators · 4 hits · 6 pts
Severity | File | Line | Snippet
LOW | src/app/utils.tsx | 502 | // Step 2: Group by and aggregate
LOW | src/views/SessionDistill.tsx | 230 | // Step 1: drop tool_call events.
LOW | src/views/SessionDistill.tsx | 243 | // Step 2: shrink each create_table.sample_rows to 1 row.
LOW | src/views/SessionDistill.tsx | 260 | // Step 3: drop oldest threads (first in render order).
Cross-Language Confusion (JS/TS) · 1 hit · 5 pts
Severity | File | Line | Snippet
HIGH | tests/database-dockers/mongodb/init_data.js | 65 | print("Test database initialized: products(12), customers(10), orders(10), app_settings(4)");
Fake / Example Data · 1 hit · 1 pts
Severity | File | Line | Snippet
LOW | tests/backend/auth/test_credential_vault.py | 140 | "username": "user@example.com",