tesseract v5.3.3.20231005
fixspace.cpp File Reference
#include "fixspace.h"
#include "blobs.h"
#include "boxword.h"
#include "errcode.h"
#include "normalis.h"
#include "pageres.h"
#include "params.h"
#include "ratngs.h"
#include "rect.h"
#include "stepblob.h"
#include "tesseractclass.h"
#include "tessvars.h"
#include "tprintf.h"
#include "unicharset.h"
#include "werd.h"
#include <tesseract/ocrclass.h>
#include <tesseract/unichar.h>
#include <cstdint>


Namespaces

namespace  tesseract
 

Macros

#define PERFECT_WERDS   999
 

Functions

fix_fuzzy_spaces()

Walk over the page finding sequences of words joined by fuzzy spaces. Extract each sequence as a sublist, process the sublist to find the optimal arrangement of spaces, then replace the sublist in the ROW_RES.

Parameters
monitor       progress monitor
word_count    count of words in doc
[out] page_res
void tesseract::initialise_search (WERD_RES_LIST &src_list, WERD_RES_LIST &new_list)
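
The first step described for fix_fuzzy_spaces(), finding sequences of words joined by fuzzy spaces, can be sketched roughly as follows. This is a minimal illustration, not Tesseract's code: `SketchWord`, its `fuzzy_space_before` flag, and `find_fuzzy_runs` are hypothetical names, and a plain `std::vector` stands in for the real word lists.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a recognized word; not Tesseract's WERD_RES.
struct SketchWord {
  std::string text;
  // Assumption: true when the gap before this word is uncertain ("fuzzy").
  bool fuzzy_space_before;
};

// Returns half-open index ranges [first, last) covering each run of two or
// more words that are joined to one another by fuzzy spaces.
std::vector<std::pair<std::size_t, std::size_t>> find_fuzzy_runs(
    const std::vector<SketchWord>& row) {
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  std::size_t i = 0;
  while (i < row.size()) {
    std::size_t j = i + 1;
    // Extend the run while the next word is attached by a fuzzy space.
    while (j < row.size() && row[j].fuzzy_space_before) {
      ++j;
    }
    if (j - i > 1) {
      runs.emplace_back(i, j);  // only multi-word runs need respacing
    }
    i = j;
  }
  return runs;
}
```

Each returned range would then be extracted as a sublist, respaced, and written back in place of the original words.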
 
transform_to_next_perm()

Examines the current word list to find the smallest word gap size. Then walks the word list closing any gaps of this size by either inserting new combination words or extending existing ones.

The routine COULD be limited to stop it building words longer than N blobs.

If there are no more gaps then it DELETES the entire list and returns the empty list to cause termination.

void tesseract::transform_to_next_perm (WERD_RES_LIST &words)
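
The permutation step described above can be illustrated with a simplified sketch. It assumes each word records the gap to its predecessor as an integer; `GapWord`, `gap_before`, and `close_smallest_gaps` are hypothetical names, and string concatenation stands in for building combination words from blobs.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a word plus the gap that precedes it.
struct GapWord {
  std::string text;
  int gap_before;  // assumption: gap to the previous word; unused for the first word
};

// Closes every gap equal to the smallest gap in the list by joining the words
// on either side. If no gaps remain, the list is emptied to signal termination.
void close_smallest_gaps(std::vector<GapWord>& words) {
  if (words.size() < 2) {
    words.clear();  // no more gaps: return the empty list
    return;
  }
  int min_gap = std::numeric_limits<int>::max();
  for (std::size_t i = 1; i < words.size(); ++i) {
    min_gap = std::min(min_gap, words[i].gap_before);
  }
  std::vector<GapWord> merged;
  merged.push_back(words[0]);
  for (std::size_t i = 1; i < words.size(); ++i) {
    if (words[i].gap_before == min_gap) {
      merged.back().text += words[i].text;  // extend the current combination
    } else {
      merged.push_back(words[i]);  // larger gap: keep as a separate word
    }
  }
  words = std::move(merged);
}
```

Calling the function repeatedly yields successively coarser spacings until a single word remains, after which the next call empties the list.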
 
fix_sp_fp_word()

Test the current word to see if it can be split by deleting noise blobs. If so, perform the split. Return with the iterator pointing to the same place if the word is unchanged, or to the last of the replacement words if it was split.

void tesseract::fixspace_dbg (WERD_RES *word)
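
The idea of splitting a word by deleting noise blobs, as described for fix_sp_fp_word(), can be sketched as below. This is an assumption-laden illustration only: `SketchBlob`, its `is_noise` flag, and `split_at_noise` are hypothetical, and how a blob is judged to be noise is not covered here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a blob; the noise decision is assumed to be
// made elsewhere and stored in is_noise.
struct SketchBlob {
  char glyph;
  bool is_noise;
};

// Drops noise blobs and splits the word at each position where one was
// removed. A word without noise blobs comes back as a single piece.
std::vector<std::string> split_at_noise(const std::vector<SketchBlob>& word) {
  std::vector<std::string> pieces;
  std::string current;
  for (const SketchBlob& blob : word) {
    if (blob.is_noise) {
      if (!current.empty()) {
        pieces.push_back(current);  // close the piece before the noise blob
        current.clear();
      }
    } else {
      current.push_back(blob.glyph);
    }
  }
  if (!current.empty()) {
    pieces.push_back(current);
  }
  return pieces;
}
```

A caller could compare the number of pieces with one to decide whether the word was changed and replacement words need to be inserted.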
 

Macro Definition Documentation

◆ PERFECT_WERDS

#define PERFECT_WERDS   999

Definition at line 48 of file fixspace.cpp.