Repository Analysis

apache/arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

16.8 Moderate AI signal
Adjusted Score: 16.8
Raw Score: 16.8
Time Factor: 100%
Last Push: 2026-05-29
Stars: 16,801
Language: C++
Lines of Code: 645,528
Files: 3065
Pattern Hits: 8106
Scan Date: 2026-05-31

Severity Breakdown

CRITICAL: 4 · HIGH: 1054 · MEDIUM: 151 · LOW: 6897
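
As a quick consistency check, the four severity counts above can be summed and compared with the Pattern Hits figure (8106) from the summary. The sketch below is illustrative only: the variable names are hypothetical and it is not part of the scanner. It also derives each severity's share of the total.

```python
# Severity counts copied from the Severity Breakdown above.
severity_counts = {"CRITICAL": 4, "HIGH": 1054, "MEDIUM": 151, "LOW": 6897}

# "Pattern Hits" figure copied from the summary block.
reported_total = 8106

# Sum the per-severity counts and compare against the reported total.
total = sum(severity_counts.values())
print("counts match reported total:", total == reported_total)

# Share of each severity level, as a percentage of all hits.
shares = {name: 100 * count / total for name, count in severity_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The counts do add up to 8106, and LOW findings account for the large majority of hits.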

Pattern Findings

8106 matches across 17 categories.

Over-Commented Block: 4408 hits · 4337 pts
Severity | File | Line | Snippet
LOW | .asf.yaml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | .pre-commit-config.yaml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | .hadolint.yaml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | compose.yaml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | compose.yaml | 21 | # The docker compose file is parametrized using environment variables, the
LOW | compose.yaml | 41 | # $ sudo sysctl -w kernel.core_pattern=/tmp/core.%e.%p
LOW | compose.yaml | 81 |
LOW | compose.yaml | 101 |
LOW | compose.yaml | 841 | command: *c-glib-command
LOW | cmake-format.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | cmake-format.py | 21 |
LOW | cmake-format.py | 61 |
LOW | .rubocop.yml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | CPPLINT.cfg | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/CMakeLists.txt | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyproject.toml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/setup.cfg | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/_build_backend/__init__.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | …examples/parquet_encryption/sample_vault_kms_client.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/examples/minimal_build/compose.yaml | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/examples/minimal_build/build_venv.sh | 1 | #!/usr/bin/env bash
LOW | python/examples/minimal_build/build_conda.sh | 1 | #!/usr/bin/env bash
LOW | python/examples/dataset/write_dataset_encrypted.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/examples/flight/server.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/examples/flight/client.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/examples/flight/middleware.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/array_ops.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/convert_builtins.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/parquet.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/io.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/__init__.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/microbenchmarks.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/convert_pandas.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/common.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/benchmarks/streaming.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/scripts/update_stub_docstrings.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/scripts/test_imports.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/scripts/test_leak.py | 1 | #!/usr/bin/env python
LOW | python/scripts/run_emscripten_tests.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/orc.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/conftest.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/benchmark.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/_compute_docstrings.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/ipc.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/util.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/flight.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/cffi.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/substrait.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/__init__.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/types.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/dataset.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/cuda.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/feather.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/pandas_compat.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/fs.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/acero.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/csv.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/jvm.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/json.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
LOW | python/pyarrow/compute.py | 1 | # Licensed to the Apache Software Foundation (ASF) under one
4348 more matches not shown…
Cross-Language Confusion: 1042 hits · 4272 pts
Severity | File | Line | Snippet
HIGH | …examples/parquet_encryption/sample_vault_kms_client.py | 140 | assert table.equals(result_table)
HIGH | python/scripts/run_emscripten_tests.py | 56 | window.python_done_callback = undefined;
HIGH | python/scripts/run_emscripten_tests.py | 62 | window.python_done_callback = undefined;
HIGH | python/scripts/run_emscripten_tests.py | 67 | evt.data.print.forEach((x)=>{window.python_logs.push(x)});
HIGH | python/pyarrow/_compute_docstrings.py | 40 | null,
HIGH | python/pyarrow/__init__.py | 154 | from pyarrow.lib import (null, bool_,
HIGH | python/pyarrow/feather.py | 72 | if not self.schema.equals(table.schema):
HIGH | python/pyarrow/compute.py | 502 | null,
HIGH | python/pyarrow/tests/test_tensor.py | 114 | assert result.equals(tensor)
HIGH | python/pyarrow/tests/test_tensor.py | 130 | assert result.equals(tensor)
HIGH | python/pyarrow/tests/test_tensor.py | 150 | assert result.equals(tensor)
HIGH | python/pyarrow/tests/test_tensor.py | 155 | assert a.equals(b)
HIGH | python/pyarrow/tests/test_tensor.py | 160 | assert not a.equals(b)
HIGH | python/pyarrow/tests/test_ipc.py | 410 | assert reader.schema.equals(batches[0].schema)
HIGH | python/pyarrow/tests/test_ipc.py | 414 | assert next_batch.equals(batches[i])
HIGH | python/pyarrow/tests/test_ipc.py | 521 | assert reader.schema.equals(batches[0].schema)
HIGH | python/pyarrow/tests/test_ipc.py | 525 | assert next_batch.equals(batches[i])
HIGH | python/pyarrow/tests/test_ipc.py | 770 | assert msg.equals(restored)
HIGH | python/pyarrow/tests/test_ipc.py | 771 | assert msg.equals(restored2)
HIGH | python/pyarrow/tests/test_ipc.py | 772 | assert msg.equals(restored3)
HIGH | python/pyarrow/tests/test_ipc.py | 773 | assert msg.equals(restored4)
HIGH | python/pyarrow/tests/test_ipc.py | 807 | assert result.equals(message)
HIGH | python/pyarrow/tests/test_ipc.py | 813 | assert schema.equals(batches[1].schema)
HIGH | python/pyarrow/tests/test_ipc.py | 821 | assert read_batch.equals(batch)
HIGH | python/pyarrow/tests/test_ipc.py | 928 | assert result.equals(expected)
HIGH | python/pyarrow/tests/test_ipc.py | 1022 | assert result.schema.equals(table.schema)
HIGH | python/pyarrow/tests/test_ipc.py | 103 | assert reader.schema.equals(batches[0].schema)
HIGH | python/pyarrow/tests/test_ipc.py | 189 | assert result.equals(expected)
HIGH | python/pyarrow/tests/test_ipc.py | 205 | assert result1.equals(result2)
HIGH | python/pyarrow/tests/test_ipc.py | 206 | assert result1.equals(result3)
HIGH | python/pyarrow/tests/test_ipc.py | 238 | assert t1.equals(t2)
HIGH | python/pyarrow/tests/test_ipc.py | 297 | assert result1.equals(result2)
HIGH | python/pyarrow/tests/test_ipc.py | 298 | assert result1.equals(result3)
HIGH | python/pyarrow/tests/test_ipc.py | 711 | assert result.equals(expected)
HIGH | python/pyarrow/tests/test_ipc.py | 915 | assert reader_schema.equals(writer_batches[0].schema)
HIGH | python/pyarrow/tests/test_ipc.py | 1121 | assert recons_batch.equals(batch)
HIGH | python/pyarrow/tests/test_ipc.py | 1136 | assert recons_schema.equals(schema)
HIGH | python/pyarrow/tests/test_ipc.py | 1165 | assert table.schema.equals(schema)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 90 | assert arr1.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 96 | assert arr1.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 102 | assert arr1.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 108 | assert arr1.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 111 | assert arr1.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 117 | assert arr1.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 120 | assert arr1.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 196 | assert arr1.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 204 | assert arr1.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 452 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 458 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 469 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 472 | assert arr_inferred.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 546 | assert arr.equals(arr2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 554 | assert arr3.equals(arr4)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 728 | assert arr.type.equals(ty)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 744 | assert arr.type.equals(ty)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 756 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 770 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 775 | assert arr2.equals(expected2)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 805 | assert arr.equals(expected)
HIGH | python/pyarrow/tests/test_convert_builtin.py | 810 | assert arr.equals(expected)
982 more matches not shown…
Hyper-Verbose Identifiers: 1728 hits · 1174 pts
Severity | File | Line | Snippet
LOW | …examples/parquet_encryption/sample_vault_kms_client.py | 86 | def parquet_write_read_with_vault(parquet_filename):
LOW | python/benchmarks/parquet.py | 47 | def time_write_binary_table_uncompressed(self):
LOW | python/benchmarks/parquet.py | 51 | def time_write_binary_table_no_dictionary(self):
LOW | python/benchmarks/parquet.py | 55 | def time_convert_pandas_and_write_binary_table(self):
LOW | python/benchmarks/common.py | 180 | def _generate_varying_sequences(self, random_factory, n, min_size,
LOW | python/benchmarks/common.py | 202 | def generate_fixed_binary_list(self, n, size, none_prob=DEFAULT_NONE_PROB):
LOW | python/benchmarks/common.py | 209 | def generate_varying_binary_list(self, n, min_size, max_size,
LOW | python/benchmarks/common.py | 218 | def generate_ascii_string_list(self, n, min_size, max_size,
LOW | python/benchmarks/common.py | 227 | def generate_unicode_string_list(self, n, min_size, max_size,
LOW | python/scripts/update_stub_docstrings.py | 221 | def _create_importable_pyarrow(pyarrow_pkg, source_dir, install_pyarrow_dir):
LOW | python/pyarrow/ipc.py | 130 | def _get_legacy_format_default(options):
LOW | python/pyarrow/ipc.py | 145 | def _ensure_default_ipc_read_options(options):
LOW | python/pyarrow/util.py | 210 | def _break_traceback_cycle_from_frame(frame):
LOW | python/pyarrow/util.py | 250 | def download_tzdata_on_windows():
LOW | python/pyarrow/__init__.py | 315 | def _get_pkg_config_executable():
LOW | python/pyarrow/__init__.py | 328 | def _read_pkg_config_variable(pkgname, cli_args):
LOW | python/pyarrow/dataset.py | 816 | def _ensure_write_partitioning(part, schema, flavor):
LOW | python/pyarrow/pandas_compat.py | 92 | def get_numpy_logical_type_map():
LOW | python/pyarrow/pandas_compat.py | 114 | def get_logical_type_from_numpy(pandas_collection):
LOW | python/pyarrow/pandas_compat.py | 306 | def _get_simple_index_descriptor(level, name):
LOW | python/pyarrow/pandas_compat.py | 450 | def _get_columns_to_convert_given_schema(df, schema, preserve_index):
LOW | python/pyarrow/pandas_compat.py | 534 | def _get_range_index_descriptor(level):
LOW | python/pyarrow/pandas_compat.py | 550 | def _resolve_columns_of_interest(df, schema, columns):
LOW | python/pyarrow/pandas_compat.py | 645 | def _can_definitely_zero_copy(arr):
LOW | python/pyarrow/pandas_compat.py | 949 | def _check_data_column_metadata_consistency(all_columns):
LOW | python/pyarrow/pandas_compat.py | 960 | def _deserialize_column_index(block_table, all_columns, column_indexes):
LOW | python/pyarrow/pandas_compat.py | 1065 | def _backwards_compatible_index_name(raw_name, logical_name):
LOW | python/pyarrow/pandas_compat.py | 1094 | def get_pandas_logical_type_map():
LOW | python/pyarrow/pandas_compat.py | 1113 | def _pandas_type_to_numpy_type(pandas_type):
LOW | python/pyarrow/pandas_compat.py | 1136 | def _reconstruct_columns_from_metadata(columns, column_indexes):
LOW | python/pyarrow/fs.py | 130 | def _resolve_filesystem_and_path(path, filesystem=None, *, memory_map=False):
LOW | python/pyarrow/compute.py | 122 | def _scrape_options_class_doc(options_class):
LOW | python/pyarrow/compute.py | 129 | def _decorate_compute_function(wrapper, exposed_name, func, options_class):
LOW | python/pyarrow/tests/test_tensor.py | 87 | def test_tensor_numpy_roundtrip(dtype_str, arrow_type):
LOW | python/pyarrow/tests/test_tensor.py | 102 | def test_tensor_ipc_roundtrip(tmpdir):
LOW | python/pyarrow/tests/test_tensor.py | 118 | def test_tensor_ipc_read_from_compressed(tempdir):
LOW | python/pyarrow/tests/test_ipc.py | 405 | def test_stream_simple_roundtrip(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 424 | def test_compression_roundtrip():
LOW | python/pyarrow/tests/test_ipc.py | 511 | def test_stream_options_roundtrip(stream_fixture, options):
LOW | python/pyarrow/tests/test_ipc.py | 782 | def test_message_repr_shows_actual_values(example_messages):
LOW | python/pyarrow/tests/test_ipc.py | 816 | def test_message_read_record_batch(example_messages):
LOW | python/pyarrow/tests/test_ipc.py | 824 | def test_read_record_batch_on_stream_error_message():
LOW | python/pyarrow/tests/test_ipc.py | 935 | def test_ipc_file_stream_has_eos():
LOW | python/pyarrow/tests/test_ipc.py | 1027 | def test_get_record_batch_size():
LOW | python/pyarrow/tests/test_ipc.py | 1037 | def _check_serialize_pandas_round_trip(df, use_threads=False):
LOW | python/pyarrow/tests/test_ipc.py | 1044 | def test_pandas_serialize_round_trip():
LOW | python/pyarrow/tests/test_ipc.py | 1081 | def test_serialize_pandas_empty_dataframe():
LOW | python/pyarrow/tests/test_ipc.py | 1087 | def test_pandas_serialize_round_trip_not_string_columns():
LOW | python/pyarrow/tests/test_ipc.py | 1095 | def test_serialize_pandas_no_preserve_index():
LOW | python/pyarrow/tests/test_ipc.py | 167 | def test_file_simple_roundtrip(file_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 192 | def test_open_file_from_buffer(file_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 249 | def test_read_year_month_nano_interval(tmpdir):
LOW | python/pyarrow/tests/test_ipc.py | 268 | def test_stream_categorical_roundtrip(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 284 | def test_open_stream_from_buffer(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 327 | def test_open_stream_with_wrong_options(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 352 | def test_open_file_with_wrong_options(file_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 361 | def test_stream_write_dispatch(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 382 | def test_stream_write_table_batches(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 568 | def test_read_options_included_fields(stream_fixture):
LOW | python/pyarrow/tests/test_ipc.py | 668 | def test_envvar_set_legacy_ipc_format():
1668 more matches not shown…
Unused Imports: 564 hits · 378 pts
Severity | File | Line | Snippet
LOW | python/_build_backend/__init__.py | 40 |
LOW | python/scripts/test_imports.py | 18 |
LOW | python/pyarrow/conftest.py | 114 |
LOW | python/pyarrow/conftest.py | 120 |
LOW | python/pyarrow/conftest.py | 135 |
LOW | python/pyarrow/conftest.py | 141 |
LOW | python/pyarrow/conftest.py | 147 |
LOW | python/pyarrow/conftest.py | 159 |
LOW | python/pyarrow/conftest.py | 165 |
LOW | python/pyarrow/conftest.py | 171 |
LOW | python/pyarrow/conftest.py | 177 |
LOW | python/pyarrow/conftest.py | 183 |
LOW | python/pyarrow/conftest.py | 189 |
LOW | python/pyarrow/conftest.py | 195 |
LOW | python/pyarrow/conftest.py | 201 |
LOW | python/pyarrow/conftest.py | 207 |
LOW | python/pyarrow/conftest.py | 213 |
LOW | python/pyarrow/conftest.py | 129 |
LOW | python/pyarrow/conftest.py | 223 |
LOW | python/pyarrow/conftest.py | 256 |
LOW | python/pyarrow/benchmark.py | 21 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/ipc.py | 24 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
LOW | python/pyarrow/flight.py | 19 |
504 more matches not shown…
Decorative Section Separators: 78 hits · 195 pts
Severity | File | Line | Snippet
MEDIUM | cmake-format.py | 22 | # -----------------------------
MEDIUM | cmake-format.py | 24 | # -----------------------------
MEDIUM | cmake-format.py | 62 | # ------------------------------------------------
MEDIUM | cmake-format.py | 64 | # ------------------------------------------------
MEDIUM | python/examples/minimal_build/build_venv.sh | 21 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_venv.sh | 40 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_venv.sh | 62 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_conda.sh | 21 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_conda.sh | 33 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_conda.sh | 62 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_conda.sh | 69 | #----------------------------------------------------------------------
MEDIUM | python/examples/minimal_build/build_conda.sh | 90 | #----------------------------------------------------------------------
MEDIUM | python/pyarrow/__init__.py | 287 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/__init__.py | 298 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/pandas_compat.py | 714 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/pandas_compat.py | 1285 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/acero.py | 18 | # ---------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_ipc.py | 931 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_ipc.py | 837 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 73 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1524 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 380 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1031 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1069 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1597 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1820 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_io.py | 1898 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3400 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3411 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4031 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4440 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 5211 | # ------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 5224 | # ------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 5262 | # ------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 5267 | # ------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 1743 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3501 | # ---------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3774 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3823 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 3860 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4010 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4143 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4362 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/test_pandas.py | 4724 | # ----------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 56 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 113 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 208 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 252 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 401 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/tests/parquet/test_data_types.py | 434 | # -----------------------------------------------------------------------------
MEDIUM | python/pyarrow/parquet/core.py | 204 | # ----------------------------------------------------------------------
MEDIUM | cpp/CMakeLists.txt | 225 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/CMakeLists.txt | 154 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/io/CMakeLists.txt | 18 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/compute/kernels/CMakeLists.txt | 20 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/compute/kernels/CMakeLists.txt | 39 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/compute/kernels/CMakeLists.txt | 100 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/compute/kernels/CMakeLists.txt | 146 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/arrow/compute/kernels/CMakeLists.txt | 159 | # ----------------------------------------------------------------------
MEDIUM | cpp/src/parquet/CMakeLists.txt | 115 | # ----------------------------------------------------------------------
18 more matches not shown…
Self-Referential Comments: 55 hits · 161 pts
Severity | File | Line | Snippet
MEDIUM | python/pyarrow/dataset.py | 829 | # Create a partitioning factory with those field names.
MEDIUM | python/pyarrow/tests/arrow_39313.py | 18 | # This file is called from a test in test_pandas.py.
MEDIUM | python/pyarrow/tests/arrow_16597.py | 18 | # This file is called from a test in test_flight.py.
MEDIUM | python/pyarrow/tests/test_gandiva.py | 333 | # Create a table with some sample data
MEDIUM | python/pyarrow/tests/strategies.py | 472 | # Define the same rules as above for pandas tests by excluding certain types
MEDIUM | python/pyarrow/tests/arrow_7980.py | 18 | # This file is called from a test in test_schema.py.
MEDIUM | python/pyarrow/tests/util.py | 412 | # Create a limited user with a specific policy ...
MEDIUM | python/pyarrow/tests/util.py | 422 | # Create a protected bucket for testing no-delete-bucket policy
MEDIUM | python/pyarrow/tests/test_jvm.py | 59 | # Create a Java buffer
MEDIUM | python/pyarrow/tests/test_fs.py | 554 | # Create a file in the protected bucket then try to delete it
MEDIUM | python/pyarrow/tests/pandas_threaded_import.py | 18 | # This file is called from a test in test_pandas.py.
MEDIUM | python/pyarrow/tests/test_dataset.py | 3877 | # Create a dataset where f1 is incrementing from 0 to 100 spread across
MEDIUM | python/pyarrow/tests/test_cuda.py | 147 | # Creating a device buffer from another device buffer, view:
MEDIUM | python/pyarrow/tests/test_cuda.py | 160 | # Creating a device buffer from another device buffer, copy:
MEDIUM | python/pyarrow/tests/test_cuda.py | 181 | # Creating a device buffer from a slice of host buffer
MEDIUM | python/pyarrow/tests/test_cuda.py | 196 | # Creating a device buffer from a slice of an array
MEDIUM | python/pyarrow/tests/test_cuda.py | 207 | # Creating a device buffer from a slice of bytes
MEDIUM | python/pyarrow/tests/test_cuda.py | 215 | # Creating a device buffer from size
MEDIUM | python/pyarrow/tests/test_cuda.py | 366 | # Create a buffer in host containing range(size)
MEDIUM | python/pyarrow/tests/test_cuda.py | 378 | # Create a device buffer of the same size and copy from host
MEDIUM | python/pyarrow/tests/test_feather.py | 454 | # Create a nan that is not numpy.nan
MEDIUM | python/pyarrow/tests/read_record_batch.py | 18 | # This file is called from a test in test_ipc.py.
MEDIUM | python/pyarrow/tests/parquet/test_basic.py | 228 | # Create a non-empty table to infer the types correctly, then slice to 0
MEDIUM | python/pyarrow/interchange/dataframe.py | 212 | # Create an iterator of RecordBatches
MEDIUM | python/pyarrow/interchange/column.py | 497 | # Define the dtype of the returned buffer
MEDIUM | python/pyarrow/interchange/column.py | 523 | # Define the dtype of the returned buffer
MEDIUM | python/pyarrow/vendored/version.py | 4 | # This file is dual licensed under the terms of the Apache License, Version
MEDIUM | python/pyarrow/vendored/docscrape.py | 4 | # This file is licensed under the BSD License. See the LICENSE.txt file
MEDIUM | ci/scripts/rust_build.sh | 26 | # This file is used to build the rust binaries needed for the archery
MEDIUM | ci/scripts/nanoarrow_build.sh | 26 | # This file is used to build the nanoarrow binaries needed for the archery
MEDIUM | r/tools/nixlibs.R | 822 | # Create a new env_var_list, with the values of turn_off set.
MEDIUM | r/R/arrow-datum.R | 358 | # Create a new array that encodes missing values as 1 and non-missing values
MEDIUM | r/R/dplyr-datetime-helpers.R | 506 | # This function handles round/ceil/floor when unit is week. The fn argument
MEDIUM | r/R/dplyr-eval.R | 251 | # Create a data mask for evaluating a dplyr expression
MEDIUM | r/R/dplyr-mutate.R | 49 | # Create a mask with aggregation functions in it
MEDIUM | r/R/dplyr-mutate.R | 163 | # This function is a copy of dplyr:::check_transmute_args at
MEDIUM | r/R/metadata.R | 91 | # This function is used both when saving and loading metadata.
MEDIUM | r/R/schema.R | 252 | #' # Create a schema using a list of pairs of field names and data types
MEDIUM | r/R/dataset.R | 156 | #' # Create a temporary directory and write example dataset
MEDIUM | r/R/extension.R | 301 | #' # Create the R6 type whose methods control how Array objects are
MEDIUM | r/R/extension.R | 338 | #' # Create a helper type constructor that calls new_extension_type()
MEDIUM | r/R/extension.R | 348 | #' # Create a helper array constructor that calls new_extension_array()
MEDIUM | r/R/dplyr-funcs-type.R | 104 | # Create a data frame/tibble/struct column
MEDIUM | r/R/udf.R | 104 | # Create a small wrapper function that is easier to call from C++.
MEDIUM | r/R/dplyr-select.R | 50 | # Create a mask for evaluating expressions in tidyselect helpers
MEDIUM | r/R/dplyr-glimpse.R | 20 | # This function is inspired by pillar:::glimpse.tbl(), with some adaptations
MEDIUM | r/data-raw/codegen.R | 18 | # This file is used to generate code in the files
MEDIUM | docs/source/conf.py | 20 | # This file is execfile()d with the current directory set to its
MEDIUM | cpp/build-support/asan_symbolize.py | 6 | # This file is distributed under the University of Illinois Open Source
MEDIUM | cpp/src/arrow/acero/hash_join_graphs.py | 172 | # Create a graph per lowest-cardinality arg
MEDIUM | cpp/src/arrow/acero/hash_join_graphs.py | 173 | # Create a line per second-lowest-cardinality arg
MEDIUM | dev/release/post-01-tag.sh | 31 | # Create the release tag and trigger the Publish Release workflow.
MEDIUM | dev/release/verify-release-candidate.sh | 359 | # Creating a separate conda environment
MEDIUM | dev/archery/archery/benchmark/compare.py | 19 | # Define a global regression threshold as 5%. This is purely subjective and
MEDIUM | dev/archery/archery/crossbow/core.py | 754 | # Create a Configuration object with necessary parameters
Deep Nesting: 94 hits · 78 pts
Severity | File | Line | Snippet
LOW | python/benchmarks/microbenchmarks.py | 30 |
LOW | python/benchmarks/common.py | 284 |
LOW | python/scripts/update_stub_docstrings.py | 39 |
LOW | python/scripts/update_stub_docstrings.py | 177 |
LOW | python/pyarrow/conftest.py | 220 |
LOW | python/pyarrow/util.py | 161 |
LOW | python/pyarrow/__init__.py | 390 |
LOW | python/pyarrow/dataset.py | 121 |
LOW | python/pyarrow/dataset.py | 277 |
LOW | python/pyarrow/dataset.py | 297 |
LOW | python/pyarrow/dataset.py | 320 |
LOW | python/pyarrow/dataset.py | 580 |
LOW | python/pyarrow/dataset.py | 846 |
LOW | python/pyarrow/pandas_compat.py | 75 |
LOW | python/pyarrow/pandas_compat.py | 326 |
LOW | python/pyarrow/pandas_compat.py | 450 |
LOW | python/pyarrow/pandas_compat.py | 564 |
LOW | python/pyarrow/pandas_compat.py | 599 |
LOW | python/pyarrow/pandas_compat.py | 718 |
LOW | python/pyarrow/pandas_compat.py | 866 |
LOW | python/pyarrow/pandas_compat.py | 993 |
LOW | python/pyarrow/pandas_compat.py | 1136 |
LOW | python/pyarrow/pandas_compat.py | 1224 |
LOW | python/pyarrow/fs.py | 85 |
LOW | python/pyarrow/acero.py | 82 |
LOW | python/pyarrow/jvm.py | 72 |
LOW | python/pyarrow/jvm.py | 127 |
LOW | python/pyarrow/jvm.py | 154 |
LOW | python/pyarrow/jvm.py | 199 |
LOW | python/pyarrow/compute.py | 129 |
LOW | python/pyarrow/tests/strategies.py | 310 |
LOW | python/pyarrow/tests/test_orc.py | 49 |
LOW | python/pyarrow/tests/test_array.py | 3432 |
LOW | python/pyarrow/tests/test_fs.py | 779 |
LOW | python/pyarrow/tests/test_fs.py | 2188 |
LOW | python/pyarrow/tests/test_fs.py | 66 |
LOW | python/pyarrow/tests/test_dataset.py | 4341 |
LOW | python/pyarrow/tests/test_pandas.py | 4147 |
LOW | python/pyarrow/tests/test_pandas.py | 862 |
LOW | python/pyarrow/tests/test_pandas.py | 2854 |
LOW | python/pyarrow/tests/test_pandas.py | 2877 |
LOW | python/pyarrow/tests/test_exec_plan.py | 103 |
LOW | python/pyarrow/tests/test_flight.py | 2339 |
LOW | python/pyarrow/tests/test_flight.py | 441 |
LOW | python/pyarrow/tests/test_flight.py | 464 |
LOW | python/pyarrow/tests/test_flight.py | 499 |
LOW | python/pyarrow/tests/test_flight.py | 1302 |
LOW | python/pyarrow/tests/test_json.py | 314 |
LOW | python/pyarrow/tests/test_compute.py | 2303 |
LOW | python/pyarrow/tests/test_compute.py | 2415 |
LOW | python/pyarrow/tests/test_csv.py | 1563 |
LOW | python/pyarrow/tests/parquet/test_parquet_writer.py | 129 |
LOW | python/pyarrow/tests/parquet/test_data_types.py | 360 |
LOW | python/pyarrow/tests/parquet/test_data_types.py | 414 |
LOW | python/pyarrow/interchange/from_dataframe.py | 121 |
LOW | python/pyarrow/interchange/from_dataframe.py | 425 |
LOW | python/pyarrow/vendored/version.py | 240 |
LOW | python/pyarrow/vendored/version.py | 441 |
LOW | python/pyarrow/vendored/docscrape.py | 696 |
LOW | python/pyarrow/vendored/docscrape.py | 288 |
34 more matches not shown…
Cross-File Repetition · 12 hits · 60 pts
Severity | File | Line | Snippet
HIGH | python/pyarrow/tests/test_schema.py | 0 | \ foo: int32 bar: string baz: list<item: int8> child 0, item: int8
HIGH | python/pyarrow/tests/test_schema.py | 0 | \ foo: int32 bar: string baz: list<item: int8> child 0, item: int8
HIGH | python/pyarrow/tests/test_schema.py | 0 | \ foo: int32 bar: string baz: list<item: int8> child 0, item: int8
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/test_flight.py | 0 | test sending/receiving multiple (binary-valued) headers.
HIGH | python/pyarrow/tests/parquet/test_pandas.py | 0 | \ carat cut color clarity depth table price x y z 0.23 ideal e si2 61.5 55.0 326 3.95 3.98 2.43 0.21 premium e si1 59.8
HIGH | python/pyarrow/tests/parquet/test_pandas.py | 0 | \ carat cut color clarity depth table price x y z 0.23 ideal e si2 61.5 55.0 326 3.95 3.98 2.43 0.21 premium e si1 59.8
HIGH | python/pyarrow/tests/parquet/test_pandas.py | 0 | \ carat cut color clarity depth table price x y z 0.23 ideal e si2 61.5 55.0 326 3.95 3.98 2.43 0.21 premium e si1 59.8
Example Usage Blocks · 49 hits · 56 pts
Severity | File | Line | Snippet
LOW | compose.yaml | 221 | # Usage:
LOW | compose.yaml | 253 | # Usage:
LOW | compose.yaml | 272 | # Usage:
LOW | compose.yaml | 307 | # Usage:
LOW | compose.yaml | 344 | # Usage:
LOW | compose.yaml | 375 | # Usage:
LOW | compose.yaml | 412 | # Usage:
LOW | compose.yaml | 570 | # Usage:
LOW | compose.yaml | 624 | # Usage:
LOW | compose.yaml | 673 | # Usage:
LOW | compose.yaml | 707 | # Usage:
LOW | compose.yaml | 733 | # Usage:
LOW | compose.yaml | 762 | # Usage:
LOW | compose.yaml | 793 | # Usage:
LOW | compose.yaml | 821 | # Usage:
LOW | compose.yaml | 850 | # Usage:
LOW | compose.yaml | 880 | # Usage:
LOW | compose.yaml | 906 | # Usage:
LOW | compose.yaml | 936 | # Usage:
LOW | compose.yaml | 968 | # Usage:
LOW | compose.yaml | 1020 | # Usage:
LOW | compose.yaml | 1042 | # Usage:
LOW | compose.yaml | 1064 | # Usage:
LOW | compose.yaml | 1088 | # Usage:
LOW | compose.yaml | 1106 | # Usage:
LOW | compose.yaml | 1138 | # Usage:
LOW | compose.yaml | 1460 | # Usage:
LOW | compose.yaml | 1490 | # Usage:
LOW | compose.yaml | 1521 | # Usage:
LOW | compose.yaml | 1553 | # Usage:
LOW | compose.yaml | 1580 | # Usage:
LOW | compose.yaml | 1606 | # Usage:
LOW | compose.yaml | 1657 | # Usage:
LOW | compose.yaml | 1717 | # Usage:
LOW | compose.yaml | 1745 | # Usage:
LOW | compose.yaml | 1769 | # Usage:
LOW | compose.yaml | 1800 | # Usage:
LOW | compose.yaml | 1833 | # Usage:
LOW | compose.yaml | 1868 | # Usage:
LOW | compose.yaml | 1944 | # Usage:
LOW | compose.yaml | 1985 | # Usage:
LOW | compose.yaml | 2036 | # Usage:
LOW | compose.yaml | 2065 | # Usage:
LOW | cpp/src/arrow/visit_array_inline.h | 41 | /// Example usage:
LOW | cpp/src/arrow/visit_scalar_inline.h | 44 | /// Example usage:
LOW | cpp/src/arrow/visit_type_inline.h | 41 | /// Example usage:
LOW | cpp/src/arrow/util/rows_to_batches.h | 82 | /// Example usage:
LOW | cpp/src/arrow/util/bpacking_simd_codegen.py | 20 | # Usage:
LOW | cpp/src/arrow/util/bpacking_scalar_codegen.py | 23 | # Usage:
Hallucination Indicators · 4 hits · 55 pts
Severity | File | Line | Snippet
CRITICAL | …/matlab/+arrow/+array/+internal/+list/TableValidator.m | 20 | VariableValidators arrow.array.internal.list.Validator = arrow.array.internal.list.Validator.empty(1, 0)
CRITICAL | dev/archery/archery/integration/tester_java.py | 133 | self.java_arrow.vector.util.Validator.compareSchemas(
CRITICAL | dev/archery/archery/integration/tester_java.py | 137 | self.java_arrow.vector.util.Validator.compareVectorSchemaRoot(
CRITICAL | dev/archery/archery/integration/tester_java.py | 141 | self.java_arrow.vector.util.Validator.compareDictionaryProviders(
AI Slop Vocabulary · 17 hits · 44 pts
Severity | File | Line | Snippet
LOW | r/tests/testthat/test-dplyr-funcs-string.R | 675 | # so here we can just call the functions directly
LOW | r/tests/testthat/helper-arrow.R | 39 | # We know we're LANGUAGE=en because we just set it above.
MEDIUM | r/tests/testthat/test-dataset-dplyr.R | 325 | # This one is more nuanced
LOW | r/R/filesystem.R | 400 | # from_uri needs a non-empty path, so just use a placeholder of /_
MEDIUM | r/R/dplyr-datetime-helpers.R | 378 | #' row-wise we can leverage this behaviour and introduce a condition. If `x` has
LOW | r/R/query-engine.R | 234 | # just use it to take the random slice
MEDIUM | r/src/r_to_arrow.cpp | 1355 | // leverage concurrency. Maybe some refactoring needed.
MEDIUM | …arty/flatbuffers/include/flatbuffers/vector_downward.h | 41 | // Essentially, this supports 2 std::vectors in a single buffer.
MEDIUM | cpp/src/arrow/util/async_generator.h | 40 | // The methods in this file create, modify, and utilize AsyncGenerator which is an
MEDIUM | cpp/src/arrow/io/interfaces.h | 360 | /// For robust prefetching, use ReadAt() or ReadAsync().
MEDIUM | cpp/src/arrow/compute/expression.h | 194 | /// These transform bound expressions. Some transforms utilize a guarantee, which is
MEDIUM | cpp/src/arrow/compute/exec.h | 89 | /// \brief If true, then utilize multiple threads where relevant for function
MEDIUM | cpp/src/arrow/c/bridge.h | 479 | /// \param[in] handler the handler whose callbacks to utilize as data is available
MEDIUM | cpp/src/arrow/c/abi.h | 387 | // wants to utilize it after this function returns, it must copy or move the contents
MEDIUM | dev/archery/archery/benchmark/runner.py | 238 | # default repetitions is 5 for Java microbenchmark harness
LOW | dev/archery/archery/crossbow/core.py | 1074 | # shortcut to avoid pattern validation and just set all artifacts
MEDIUM | …t/lib/arrow-format/org/apache/arrow/flatbuf/feature.rb | 29 | # to facilitate exchanging and comparing bitmaps for supported
Excessive Try-Catch Wrapping · 35 hits · 37 pts
Severity | File | Line | Snippet
MEDIUM | python/examples/flight/client.py | 74 | print("Error calling action:", e)
LOW | python/pyarrow/feather.py | 189 | except Exception:
LOW | python/pyarrow/pandas_compat.py | 284 | except Exception as e:
LOW | python/pyarrow/tests/conftest.py | 187 | except Exception as err:
LOW | python/pyarrow/tests/test_orc.py | 160 | except Exception as e:
LOW | python/pyarrow/tests/test_orc.py | 196 | except Exception as e:
LOW | python/pyarrow/tests/test_feather.py | 389 | except Exception:
LOW | python/pyarrow/tests/test_pandas.py | 2893 | except Exception:
LOW | python/pyarrow/tests/test_csv.py | 2117 | except Exception as e:
LOW | python/pyarrow/tests/parquet/test_basic.py | 785 | except Exception as e:
LOW | python/pyarrow/tests/parquet/conftest.py | 61 | except Exception:
LOW | python/pyarrow/tests/parquet/test_parquet_writer.py | 150 | except Exception as e:
LOW | python/pyarrow/tests/parquet/test_datetime.py | 442 | except Exception:
MEDIUM | python/pyarrow/parquet/core.py | 1884 | def read_table(source, *, columns=None, use_threads=True,
LOW | python/pyarrow/parquet/core.py | 2056 | except Exception:
LOW | cpp/gdb_arrow.py | 2405 | except Exception:
LOW | cpp/gdb_arrow.py | 2416 | except Exception:
LOW | cpp/gdb_arrow.py | 2430 | except Exception:
LOW | cpp/build-support/asan_symbolize.py | 93 | except Exception:
LOW | cpp/build-support/asan_symbolize.py | 130 | except Exception:
LOW | cpp/src/arrow/util/bpacking_simd_codegen.py | 275 | except Exception as e:
LOW | dev/merge_arrow_pr.py | 135 | except Exception as e:
LOW | dev/merge_arrow_pr.py | 642 | except Exception:
LOW | dev/archery/archery/bot.py | 292 | except Exception as e:
MEDIUM | dev/archery/archery/bot.py | 251 | def handle(self, event, payload):
LOW | dev/archery/archery/docker/core.py | 245 | except Exception as e:
LOW | dev/archery/archery/docker/core.py | 278 | except Exception as e:
LOW | dev/archery/archery/crossbow/core.py | 1352 | except Exception as e:
LOW | dev/archery/archery/integration/runner.py | 320 | except Exception:
LOW | dev/archery/archery/integration/runner.py | 450 | except Exception:
LOW | dev/archery/archery/integration/runner.py | 524 | except Exception:
LOW | dev/archery/archery/integration/runner.py | 574 | except Exception:
LOW | dev/archery/archery/utils/lint.py | 100 | except Exception:
LOW | dev/archery/archery/lang/python.py | 180 | except Exception as orig_error:
LOW | dev/archery/archery/lang/python.py | 183 | except Exception:
Slop Phrases · 6 hits · 12 pts
Severity | File | Line | Snippet
MEDIUM | r/tests/testthat/test-dplyr-mutate.R | 275 | # As well as adding new variables, you can use mutate() to
MEDIUM | r/R/dplyr-funcs-agg.R | 29 | # you can use list_compute_functions("^hash_")
LOW | r/R/install-arrow.R | 159 | #' make a source bundle with this function, make sure to set the first repo in
LOW | r/R/install-arrow.R | 271 | # make sure to suppress warnings and ignore the stderr so that this is silent where proc_translated doesn't exist
LOW | cpp/build-support/run-test.sh | 105 | # So we make sure to detect this and exit 1.
LOW | format/substrait/extension_types.yaml | 82 | # later to support these types, the consumer will need to make sure to continue supporting
Redundant / Tautological Comments · 9 hits · 11 pts
Severity | File | Line | Snippet
LOW | python/pyarrow/tests/test_array.py | 1274 | # Check if offset in offsets > 0
LOW | python/pyarrow/tests/parquet/test_basic.py | 610 | # Check if both LZ4 compressors are working
LOW | python/pyarrow/tests/parquet/test_basic.py | 989 | # Read file and verify that the data is correct
LOW | python/pyarrow/tests/parquet/test_basic.py | 1051 | # Read file and verify that the data is correct
LOW | r/tools/nixlibs.R | 1053 | # Check if we're authorized to download
LOW | r/R/array.R | 92 | #' # Check if value is null; zero-indexed
LOW | …arrow/flight/sql/odbc/install/unix/install_odbc_ini.sh | 54 | # Check if [ODBC Data Sources] section exists
LOW | …src/arrow/flight/sql/odbc/install/unix/install_odbc.sh | 69 | # Check if [ODBC Drivers] section exists
LOW | dev/archery/archery/crossbow/cli.py | 599 | # Check if reference is a remote reference to point
Fake / Example Data · 2 hits · 3 pts
Severity | File | Line | Snippet
LOW | python/pyarrow/tests/test_schema.py | 280 | Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla accumsan vel
LOW | python/pyarrow/tests/test_schema.py | 280 | Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla accumsan vel
Verbosity Indicators · 2 hits · 3 pts
Severity | File | Line | Snippet
LOW | r/R/arrow-tabular.R | 213 | # We need to check if `i` is in names in case it is an active binding (e.g.
LOW | …rrow/flight/sql/odbc/odbc_impl/flight_sql_ssl_config.h | 46 | /// \brief Tells if we need to check if the certificate is in the system trust store.
Dead Code · 1 hit · 2 pts
Severity | File | Line | Snippet
MEDIUM | python/pyarrow/tests/test_udf.py | 76