tesseract 4.00.00dev
#include "oldlist.h"
#include "efio.h"
#include "emalloc.h"
#include "featdefs.h"
#include "tessopt.h"
#include "ocrfeatures.h"
#include "clusttool.h"
#include "cluster.h"
#include <string.h>
#include <stdio.h>
#include <math.h>
#include "unichar.h"
#include "commontraining.h"
Macros | |
| #define | PROGRAM_FEATURE_TYPE "cn" |
Functions | |
| DECLARE_STRING_PARAM_FLAG (D) | |
| int | main (int argc, char **argv) |
| void | WriteNormProtos (const char *Directory, LIST LabeledProtoList, const FEATURE_DESC_STRUCT *feature_desc) |
| void | WriteProtos (FILE *File, uinT16 N, LIST ProtoList, BOOL8 WriteSigProtos, BOOL8 WriteInsigProtos) |
| int | main (int argc, char *argv[]) |
Variables | |
| CLUSTERCONFIG | CNConfig |
| #define PROGRAM_FEATURE_TYPE "cn" |
Definition at line 40 of file cntraining.cpp.
| DECLARE_STRING_PARAM_FLAG (D) |
| int main (int argc, char **argv) |
This program reads in a text file consisting of feature samples from a training page in the following format:
FontName UTF8-char-str xmin ymin xmax ymax page-number
NumberOfFeatureTypes(N)
FeatureTypeName1 NumberOfFeatures(M)
Feature1
...
FeatureM
FeatureTypeName2 NumberOfFeatures(M)
Feature1
...
FeatureM
...
FeatureTypeNameN NumberOfFeatures(M)
Feature1
...
FeatureM
FontName CharName ...
The result of this program is a binary inttemp file used by the OCR engine.
| argc | number of command line arguments |
| argv | array of command line arguments |
Definition at line 428 of file tesseractmain.cpp.
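The sample header line in the format above (FontName UTF8-char-str xmin ymin xmax ymax page-number) can be read with a few lines of standard C++. The following is only a minimal sketch, not the parser Tesseract uses: the SampleHeader struct and the ParseSampleHeader function are hypothetical names introduced for illustration, and the sketch assumes the fields are separated by whitespace and that the font name contains no spaces.

```cpp
#include <sstream>
#include <string>

// Hypothetical record for one sample header line; not a Tesseract type.
struct SampleHeader {
  std::string font_name;
  std::string utf8_char;
  int xmin = 0, ymin = 0, xmax = 0, ymax = 0;
  int page_number = 0;
};

// Parses "FontName UTF8-char-str xmin ymin xmax ymax page-number".
// Assumes whitespace-separated fields (an assumption of this sketch).
// Returns false and leaves *out untouched if a field is missing or
// a numeric field cannot be read.
bool ParseSampleHeader(const std::string& line, SampleHeader* out) {
  std::istringstream in(line);
  SampleHeader h;
  if (in >> h.font_name >> h.utf8_char >> h.xmin >> h.ymin >> h.xmax >>
      h.ymax >> h.page_number) {
    *out = h;
    return true;
  }
  return false;
}
```

A caller would pass each header line to ParseSampleHeader and then read the NumberOfFeatureTypes(N) line and the N feature blocks that follow it.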
| int main (int argc, char *argv[]) |
This program reads in a text file consisting of feature samples from a training page in the following format:
FontName CharName NumberOfFeatureTypes(N)
FeatureTypeName1 NumberOfFeatures(M)
Feature1
...
FeatureM
FeatureTypeName2 NumberOfFeatures(M)
Feature1
...
FeatureM
...
FeatureTypeNameN NumberOfFeatures(M)
Feature1
...
FeatureM
FontName CharName ...
It then appends these samples to a separate file for each character. The name of the file is
DirectoryName/FontName/CharName.FeatureTypeName
The DirectoryName can be specified via a command line argument. If not specified, it defaults to the current directory. The format of the resulting files is:
NumberOfFeatures(M)
Feature1
...
FeatureM
NumberOfFeatures(M)
...
Each output file has a header that describes the type of feature the file contains. This header is in the format required by the clusterer. A command line argument can also be used to specify that only the first N samples of each class should be used.
| argc | number of command line arguments |
| argv | array of command line arguments |
Definition at line 133 of file cntraining.cpp.
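The file naming rule (DirectoryName/FontName/CharName.FeatureTypeName, with the current directory as the default) and the per-sample output layout (a NumberOfFeatures(M) line followed by M feature lines) can be sketched as below. This is an illustration under stated assumptions, not code from cntraining.cpp: SampleFilePath and FormatSampleBlock are hypothetical helpers, "/" is assumed as the path separator, and each feature is treated as an opaque line of text because the description above does not specify the layout of a feature.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds DirectoryName/FontName/CharName.FeatureTypeName.
// An empty directory falls back to "." (the current directory).
std::string SampleFilePath(const std::string& directory,
                           const std::string& font_name,
                           const std::string& char_name,
                           const std::string& feature_type) {
  const std::string base = directory.empty() ? "." : directory;
  return base + "/" + font_name + "/" + char_name + "." + feature_type;
}

// Hypothetical helper: formats one sample as
//   NumberOfFeatures(M)
//   Feature1
//   ...
//   FeatureM
// Each feature is kept as an opaque text line (assumption of this sketch).
std::string FormatSampleBlock(const std::vector<std::string>& feature_lines) {
  std::string block = std::to_string(feature_lines.size()) + "\n";
  for (const std::string& feature : feature_lines) {
    block += feature + "\n";
  }
  return block;
}
```

Appending the string returned by FormatSampleBlock to the file named by SampleFilePath, once per sample, yields the repeated NumberOfFeatures(M) blocks described above. The feature-type header at the top of each file is not covered by this sketch.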
| void WriteNormProtos (const char *Directory, LIST LabeledProtoList, const FEATURE_DESC_STRUCT *feature_desc) |
This routine writes the specified samples into files which are organized according to the font name and character name of the samples.
| Directory | directory to place sample files into |
| LabeledProtoList | list of labeled protos |
| feature_desc | description of the features |
Definition at line 224 of file cntraining.cpp.
| void WriteProtos (FILE *File, uinT16 N, LIST ProtoList, BOOL8 WriteSigProtos, BOOL8 WriteInsigProtos) |
Definition at line 262 of file cntraining.cpp.
| CLUSTERCONFIG CNConfig |
Definition at line 76 of file cntraining.cpp.