NW.NGramTextClassificationClient

Revision History

Date	Author	Description
2021-10-13	numbworks	Created.
2022-09-27	numbworks	Updated to v3.0.0.
2022-10-29	numbworks	Updated to v3.5.0.
2022-11-04	numbworks	Updated to v3.6.0.

Introduction

NW.NGramTextClassificationClient (ngramtc) is a command-line application that performs text classification tasks using the NW.NGramTextClassification library.

Overview

The command-line interface for NW.NGramTextClassificationClient is summarized by the following table:

Command	Sub Command	Options	Exit Codes
about			Success
session			Success
session	classify	--labeledexamples:{filename} --textsnippets:{filename} --folderpath:{path} --tokenizerruleset:{filename} --minaccuracysingle:{number} --minaccuracymultiple:{number} --savesession --cleanlabeledexamples --disableindexserialization	Success Failure

The regular font indicates mandatory options, while the italic font indicates optional ones.

The exit codes are summarized below:

Label	Value
Success	0
Failure	1
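
A calling script can branch on these exit codes. The following is a minimal sketch in Python, not part of ngramtc itself: the names describe_exit_code and run_and_describe are hypothetical, and the child process used in the example is only a stand-in for the real executable.

```python
import subprocess
import sys
from typing import Sequence

# Exit codes as documented in the table above.
EXIT_CODES = {0: "Success", 1: "Failure"}


def describe_exit_code(code: int) -> str:
    """Map a process exit code to its documented label."""
    return EXIT_CODES.get(code, f"Unknown ({code})")


def run_and_describe(command: Sequence[str]) -> str:
    """Run a command and return the label for its exit code."""
    completed = subprocess.run(list(command), check=False)
    return describe_exit_code(completed.returncode)


if __name__ == "__main__":
    # Stand-in for ngramtc: a child Python process that exits with code 1.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_and_describe(stand_in))
```

To use it with the real application, pass the ngramtc command line as the list of arguments instead of the stand-in.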

Getting started

In this document I'll use Windows as the reference OS, but the procedures are exactly the same on Linux and Mac.

To get started:

download the application from the Releases page on GitHub and unzip it;
open a command prompt, such as Windows Terminal;
navigate to the application folder;
familiarize yourself with each Command, Sub Command and Option provided by the application, for example:

PS C:\ngramtc>.\ngramtc.exe
PS C:\ngramtc>.\ngramtc.exe session
PS C:\ngramtc>.\ngramtc.exe session classify --help
PS C:\ngramtc>.\ngramtc.exe about

Commands: session classify

The simplest command you can run is session classify, which performs a text classification task on the data you provide. At the very least, the command will look like:

PS C:\ngramtc>.\ngramtc.exe session classify --labeledexamples:LabeledExamples.json --textsnippets:TextSnippets.json

The command above requires that the two required files (LabeledExamples.json and TextSnippets.json) are located in the same folder as the application, which by default is the working folder for all the application's activities.

The command above will log something like this to the console:

...
[2022-10-29 22:11:47:640] Attempting to load a collection of 'LabeledExample' objects from: C:\ngramtc\LabeledExamples.json.
[2022-10-29 22:11:47:747] A collection of 'LabeledExample' objects has been successfully loaded.
[2022-10-29 22:11:47:748] Attempting to load a collection of 'TextSnippet' objects from: C:\ngramtc\TextSnippets.json.
[2022-10-29 22:11:47:749] A collection of 'TextSnippet' objects has been successfully loaded.
[2022-10-29 22:11:47:750] The provided snippets are: '2'.
[2022-10-29 22:11:47:767] The provided snippet has been tokenized into '65' INGram object.
[2022-10-29 22:11:47:769] The provided LabeledExample objects have been thru the tokenization process.
...
[2022-10-29 22:11:47:792] The 'SimilarityIndexAverage' object with the highest value is: '[ Label: 'sv', Value: '1' ]'.
[2022-10-29 22:11:47:792] The result of the classification task is: 'sv'.
[2022-10-29 22:11:47:792] The classification task has been successful.
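
If you capture the console output, the final label can be extracted from it. The following is a minimal sketch in Python, assuming the result line keeps the wording shown in the sample log above; the pattern and the function name extract_label are hypothetical and may need adjusting for other versions.

```python
import re
from typing import Optional

# Pattern for the result line, modeled on the sample log above (an assumption).
RESULT_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"The result of the classification task is: '(?P<label>[^']*)'\.$"
)


def extract_label(log_text: str) -> Optional[str]:
    """Return the label from the last result line, or None if there is none."""
    label = None
    for line in log_text.splitlines():
        match = RESULT_PATTERN.match(line.strip())
        if match:
            label = match.group("label")
    return label


sample = """\
[2022-10-29 22:11:47:792] The 'SimilarityIndexAverage' object with the highest value is: '[ Label: 'sv', Value: '1' ]'.
[2022-10-29 22:11:47:792] The result of the classification task is: 'sv'.
[2022-10-29 22:11:47:792] The classification task has been successful.
"""
print(extract_label(sample))  # prints: sv
```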

If you wish to store the files elsewhere, you can specify a different working folder with the folderpath option, for example: --folderpath:C:\Documents
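
When the command is assembled by a script, the options can be built programmatically. The following is a minimal sketch in Python that uses only the option names listed in the Overview table; the function build_classify_args and its parameter names are hypothetical, and only a few of the available options are covered.

```python
from typing import List, Optional


def build_classify_args(
    labeled_examples: str,
    text_snippets: str,
    folder_path: Optional[str] = None,
    save_session: bool = False,
) -> List[str]:
    """Build the argument list for 'session classify'."""
    args = [
        "session",
        "classify",
        f"--labeledexamples:{labeled_examples}",
        f"--textsnippets:{text_snippets}",
    ]
    # Optional working folder; when omitted, the application folder is used.
    if folder_path is not None:
        args.append(f"--folderpath:{folder_path}")
    # Flag without a value, as listed in the Overview table.
    if save_session:
        args.append("--savesession")
    return args


args = build_classify_args(
    "LabeledExamples.json", "TextSnippets.json", folder_path=r"C:\Documents"
)
print(" ".join([r".\ngramtc.exe"] + args))
```

The resulting list can be passed to a process launcher as-is, which avoids quoting issues when paths contain spaces.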

Markdown Toolset

Suggested toolset to view and edit this Markdown file:

Visual Studio Code
Markdown Preview Enhanced
Markdown PDF
