Repository Analysis

tesseract-ocr/tesseract

Tesseract Open Source OCR Engine (main repository)

4.4 Likely human-written

Adjusted Score: 4.4
Raw Score: 4.4
Time Factor: 100%
Last Push: 2026-05-29
Stars: 74,380
Language: C++
Lines of Code: 211,076
Files: 570
Pattern Hits: 913
Scan Date: 2026-05-31
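The page does not say how the score is derived. The sketch below tries one reading that happens to reproduce the figures shown: the raw score as total pattern points per 1,000 lines of code, and the adjusted score as the raw score scaled by the time factor. Both formulas and the function names `raw_score` and `adjusted_score` are assumptions made for illustration; the point values are the per-category totals listed under Pattern Findings.

```python
# Figures copied from this report. The formulas are assumptions, not the
# scanner's documented method.
LINES_OF_CODE = 211_076
TIME_FACTOR = 1.00  # displayed as 100%
CATEGORY_POINTS = [901, 12, 12, 3, 2]  # points per category under Pattern Findings


def raw_score(total_points: int, lines_of_code: int) -> float:
    """Hypothetical formula: pattern points per 1,000 lines of code."""
    return total_points / lines_of_code * 1000


def adjusted_score(raw: float, time_factor: float) -> float:
    """Hypothetical formula: raw score scaled by the time factor."""
    return raw * time_factor


raw = raw_score(sum(CATEGORY_POINTS), LINES_OF_CODE)
adjusted = adjusted_score(raw, TIME_FACTOR)
print(f"raw={raw:.1f} adjusted={adjusted:.1f}")
```

Under these assumptions both values round to 4.4, which agrees with the Raw Score and Adjusted Score shown here, but agreement on one repository does not confirm the formula.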

Severity Breakdown

CRITICAL 0 · HIGH 0 · MEDIUM 5 · LOW 908

Pattern Findings

913 matches across 5 categories.

Over-Commented Block: 901 hits · 901 pts
Severity | File | Line | Snippet
LOW | autogen.sh | 1 | #!/bin/sh
LOW | autogen.sh | 21 | # The whole thing is quite complex...
LOW | autogen.sh | 101 | fi
LOW | nsis/winpath.cpp | 1 | // Copyright (C) 2024 Stefan Weil
LOW | unittest/doubleptr.h | 1 | // Copyright 2012 Google Inc. All Rights Reserved.
LOW | unittest/doubleptr.h | 21 |
LOW | unittest/include_gunit.h | 1 | // (C) Copyright 2017, Google Inc.
LOW | unittest/include_gunit.h | 81 | # define CHECK_GT(test, value) CHECK((test) > (value))
LOW | unittest/lstm_test.h | 1 | // (C) Copyright 2017, Google Inc.
LOW | unittest/lstm_test.h | 21 | #include "helpers.h"
LOW | unittest/log.h | 1 | ///////////////////////////////////////////////////////////////////////
LOW | unittest/capiexample_c_test.c | 1 | ///////////////////////////////////////////////////////////////////////
LOW | unittest/cycletimer.h | 1 | // (C) Copyright 2017, Google Inc.
LOW | unittest/normstrngs_test.h | 1 | // (C) Copyright 2017, Google Inc.
LOW | unittest/util/utf8/unilib.h | 21 | // point to exactly the same memory.
LOW | unittest/util/utf8/unicodetext.h | 21 | #include <iterator> // for bidirectional_iterator_tag, etc
LOW | unittest/util/utf8/unicodetext.h | 41 | // is changed.
LOW | unittest/util/utf8/unicodetext.h | 61 | // UnicodeText tests for interchange-validity, and will substitute a
LOW | unittest/util/utf8/unicodetext.h | 81 | // validity check. They are used internally and by friend-functions
LOW | unittest/util/utf8/unicodetext.h | 101 | //
LOW | unittest/util/utf8/unicodetext.h | 321 | // interchange-valid, a LOG(WARNING) is issued, and each
LOW | unittest/util/utf8/unicodetext.h | 401 | // UnicodeTextRange is a pair of iterators, useful for specifying text
LOW | unittest/util/utf8/unicodetext.h | 421 | int byte_capacity) {
LOW | unittest/util/utf8/unilib_utf8_utils.h | 21 | // They are also exported from unilib.h for legacy reasons.
LOW | unittest/third_party/utf/utf.h | 41 | * utf (7)
LOW | unittest/third_party/utf/utf.h | 61 |
LOW | unittest/third_party/utf/utf.h | 81 |
LOW | unittest/third_party/utf/utf.h | 101 |
LOW | unittest/third_party/utf/utf.h | 121 | // byte terminating a string is considered to be part of the string s.
LOW | unittest/third_party/utf/utf.h | 161 |
LOW | unittest/fuzzers/oss-fuzz-build.sh | 1 | #!/bin/bash -eu
LOW | unittest/fuzzers/fuzzer-api.cpp | 1 | #include <allheaders.h>
LOW | unittest/syntaxnet/base.h | 21 | #include <string>
LOW | include/tesseract/ltrresultiterator.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/ltrresultiterator.h | 21 | #include "export.h" // for TESS_API
LOW | include/tesseract/ltrresultiterator.h | 41 | // See tesseract/publictypes.h for the definition of PageIteratorLevel.
LOW | include/tesseract/ltrresultiterator.h | 81 | // object at the given level. Use delete [] to free after use.
LOW | include/tesseract/ltrresultiterator.h | 181 | public:
LOW | include/tesseract/ltrresultiterator.h | 201 | // data. If only LSTM traineddata is used the value range is 0.0f - 1.0f. All
LOW | include/tesseract/pageiterator.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/pageiterator.h | 161 | * equal to other: 0
LOW | include/tesseract/export.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/export.h | 21 | # if defined(TESS_EXPORTS)
LOW | include/tesseract/osdetect.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/renderer.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/publictypes.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/unichar.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/unichar.h | 81 | const char *utf8() const {
LOW | include/tesseract/unichar.h | 101 | // char buf[5];
LOW | include/tesseract/resultiterator.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/baseapi.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/capi.h | 1 | // SPDX-License-Identifier: Apache-2.0
LOW | include/tesseract/capi.h | 21 | #ifdef __cplusplus
LOW | java/com/google/scrollview/ScrollView.java | 141 | // true or false -> Boolean.
LOW | .github/workflows/codeql-analysis.yml | 1 | # For most projects, this workflow file will not need changing; you simply need
LOW | doc/generate_manpages.sh | 1 | #!/bin/bash
LOW | src/svpaint.cpp | 1 | // Copyright 2007 Google Inc. All Rights Reserved.
LOW | src/svpaint.cpp | 21 | // - A LMB dragging either draws a line, a rectangle or ellipse.
LOW | src/svpaint.cpp | 221 |
LOW | src/tesseract.cpp | 21 | # include "config_auto.h"
841 more matches not shown…
AI Slop Vocabulary: 4 hits · 12 pts
Severity | File | Line | Snippet
MEDIUM | src/wordrec/lm_pain_points.cpp | 3 | // Description: Functions that utilize the knowledge about the properties
MEDIUM | src/wordrec/language_model.h | 3 | // Description: Functions that utilize the knowledge about the properties,
MEDIUM | src/wordrec/language_model.cpp | 3 | // Description: Functions that utilize the knowledge about the properties,
MEDIUM | src/wordrec/lm_pain_points.h | 3 | // Description: Functions that utilize the knowledge about the properties
Verbosity Indicators: 6 hits · 12 pts
Severity | File | Line | Snippet
LOW | src/ccmain/resultiterator.cpp | 146 | // Step 1: Scan for and mark European Number sequences
LOW | src/ccmain/resultiterator.cpp | 186 | // Step 2: Convert all remaining types to either L or R.
LOW | src/arch/intsimdmatrix.h | 43 | // Step 1: 8x4=32 results are computed,
LOW | src/arch/intsimdmatrix.h | 44 | // Step 2: 8x4=32 again, total 64,
LOW | src/arch/intsimdmatrix.h | 45 | // Step 3: 2x4=8 (since 8x4 is too many, so is 4x4), total 72,
LOW | src/arch/intsimdmatrix.h | 46 | // Step 4: 1x3, total 75.
Decorative Section Separators: 1 hit · 3 pts
Severity | File | Line | Snippet
MEDIUM | .github/workflows/autotools-macos.yml | 111 | # ============================================================================================
Slop Phrases: 1 hit · 2 pts
Severity | File | Line | Snippet
LOW | include/tesseract/ltrresultiterator.h | 72 | // versions, but if new data members are added, don't forget to add them!
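
The per-category figures above can be cross-checked against the headline totals with a short tally. This is a minimal sketch: the hit and point counts are copied from the category headers, and the severity assigned to each category is read off the rows that are visible. For Over-Commented Block that assumes the 841 hidden matches are also LOW, which the page does not show.

```python
from collections import Counter

# (category, severity, hits, points) copied from the category headers.
# Severity is taken from the visible rows; for "Over-Commented Block" the
# hidden rows are ASSUMED to share the LOW severity of the visible ones.
CATEGORIES = [
    ("Over-Commented Block", "LOW", 901, 901),
    ("AI Slop Vocabulary", "MEDIUM", 4, 12),
    ("Verbosity Indicators", "LOW", 6, 12),
    ("Decorative Section Separators", "MEDIUM", 1, 3),
    ("Slop Phrases", "LOW", 1, 2),
]

total_hits = sum(hits for _, _, hits, _ in CATEGORIES)
total_points = sum(points for _, _, _, points in CATEGORIES)

# Sum hits per severity level to compare with the Severity Breakdown line.
hits_by_severity = Counter()
for _, severity, hits, _ in CATEGORIES:
    hits_by_severity[severity] += hits

print(total_hits, total_points, dict(hits_by_severity))
```

The tally gives 913 hits in total, with 908 LOW and 5 MEDIUM, which matches the Pattern Hits figure and the Severity Breakdown reported on this page.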