In this article I will explain, with an example, how to read (extract) text from an image using the Tesseract OCR library in a Windows Forms (WinForms) application using C# and VB.Net.
The process of reading or extracting text from images is known as Optical Character Recognition (OCR).
 
 

Installing and configuring Tesseract Library

Installing Tesseract Library

You will need to install the Tesseract package using the following command.
Install-Package Tesseract -Version 5.2.0
 
For more details on how to install a package from NuGet, please refer to my article, Install Nuget package in Visual Studio 2017, 2019, 2022.
 

Downloading and configuring Tesseract Data Files

You will need to download the Tesseract Data files from the following link.
Once downloaded, unzip it.
 
Then copy the unzipped folder to the project root folder and rename it to tessdata.
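
Since the engine cannot start without its language file, it can help to confirm that the tessdata folder is where the application expects it. The following is a minimal sketch and not part of the original sample: TessdataLocator and Find are hypothetical names I have introduced, and it assumes the unzipped folder contains a file named eng.traineddata, which is how Tesseract language files are named.

```csharp
using System;
using System.IO;

public static class TessdataLocator
{
    // Walks up from startDir and returns the first "tessdata" folder that
    // contains "<language>.traineddata". Returns null when none is found.
    public static string Find(string startDir, string language)
    {
        DirectoryInfo dir = new DirectoryInfo(startDir);
        while (dir != null)
        {
            string candidate = Path.Combine(dir.FullName, "tessdata");
            string dataFile = Path.Combine(candidate, language + ".traineddata");
            if (File.Exists(dataFile))
            {
                return candidate;
            }

            // Parent is null once the drive or file system root is reached.
            dir = dir.Parent;
        }

        return null;
    }
}
```

Calling TessdataLocator.Find(Application.StartupPath, "eng") from the form would return the folder path when the setup is correct, or null when the folder or the language file is missing.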
 
 

Form Design

The following Windows Form consists of a Button, a Label and an OpenFileDialog control.
Note: For more details on how to use OpenFileDialog, please refer to my article, Using OpenFileDialog in C# and VB.Net.
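
By default the OpenFileDialog lets the user pick any file. The following is a minimal sketch, not part of the original sample, of restricting the dialog to common image types through its Filter property. ImageDialogSetup and ConfigureForImages are hypothetical names, and the list of extensions is my own assumption about which formats you will want to offer.

```csharp
using System;
using System.Windows.Forms;

public static class ImageDialogSetup
{
    // Limits the dialog to common image extensions and a single selection.
    // Filter format: "Description|pattern;pattern|Description|pattern".
    public static void ConfigureForImages(OpenFileDialog dialog)
    {
        dialog.Title = "Select an image";
        dialog.Filter = "Image Files|*.png;*.jpg;*.jpeg;*.bmp;*.tif;*.tiff|All Files|*.*";
        dialog.Multiselect = false;
    }
}
```

It could be called once, for example from the Form Load event handler, as ImageDialogSetup.ConfigureForImages(openFileDialog1).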
 
 
 

Namespaces

You will need to import the following namespaces.
C#
using System.IO;
using Tesseract;
 
VB.Net
Imports System.IO
Imports Tesseract
 
 

Reading Text from Image File using C# and VB.Net

Inside the Button Click event handler, the path of the selected file is read from the FileName property of the OpenFileDialog and passed to the ExtractTextFromImage method.
Inside the ExtractTextFromImage method, the Tesseract engine is first initialized with the tessdata folder path and the language (eng, i.e. English).
Then, the image is loaded from the path into a Tesseract Pix object and the engine processes it into a Tesseract Page object, from which the text is extracted.
Finally, the extracted text is assigned to the Label control.
C#
private void btnSelect_Click(object sender, EventArgs e)
{
    // Run OCR only when the user confirms a file selection.
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        string fileName = Path.GetFileName(openFileDialog1.FileName);
        string filePath = openFileDialog1.FileName;
        string extractText = this.ExtractTextFromImage(filePath);
        lblText.Text = extractText;
    }
}

private string ExtractTextFromImage(string filePath)
{
    // The tessdata folder is in the project root, so strip bin\Debug from the startup path.
    string tessdataPath = Application.StartupPath.Replace("\\bin\\Debug", "") + Path.DirectorySeparatorChar + "tessdata";
    using (TesseractEngine engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
    {
        // Load the image file into a Pix object.
        using (Pix pix = Pix.LoadFromFile(filePath))
        {
            // Run recognition and return the text found on the page.
            using (Tesseract.Page page = engine.Process(pix))
            {
                return page.GetText();
            }
        }
    }
}
 
VB.Net
Private Sub btnSelect_Click(ByVal sender As Object, ByVal e As EventArgs) Handles btnSelect.Click
    ' Run OCR only when the user confirms a file selection.
    If OpenFileDialog1.ShowDialog() = DialogResult.OK Then
        Dim fileName As String = Path.GetFileName(OpenFileDialog1.FileName)
        Dim filePath As String = OpenFileDialog1.FileName
        Dim extractText As String = Me.ExtractTextFromImage(filePath)
        lblText.Text = extractText
    End If
End Sub

Private Function ExtractTextFromImage(ByVal filePath As String) As String
    ' The tessdata folder is in the project root, so strip bin\Debug from the startup path.
    Dim tessdataPath As String = Application.StartupPath.Replace("\bin\Debug", "") & Path.DirectorySeparatorChar & "tessdata"
    Using engine As TesseractEngine = New TesseractEngine(tessdataPath, "eng", EngineMode.Default)
        ' Load the image file; the type is fully qualified because the variable shares its name.
        Using pix As Pix = Tesseract.Pix.LoadFromFile(filePath)
            ' Run recognition and return the text found on the page.
            Using page As Tesseract.Page = engine.Process(pix)
                Return page.GetText()
            End Using
        End Using
    End Using
End Function
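
OCR output is not always accurate, so it can be useful to return the engine's confidence along with the text. The following is a minimal sketch, not part of the original sample: OcrResult, OcrHelper and IsReliable are hypothetical names I have introduced, and I am assuming that the GetMeanConfidence method of the Page object reports a fraction between 0 and 1, as it does in the versions of the wrapper I am aware of.

```csharp
using System;
using Tesseract;

// Hypothetical container for the recognized text and its mean confidence.
public sealed class OcrResult
{
    public string Text { get; private set; }
    public float Confidence { get; private set; }

    public OcrResult(string text, float confidence)
    {
        Text = text;
        Confidence = confidence;
    }

    // True when the confidence reaches the caller's threshold.
    public bool IsReliable(float threshold)
    {
        return Confidence >= threshold;
    }
}

public static class OcrHelper
{
    // Same engine, Pix and Page steps as ExtractTextFromImage,
    // but the mean confidence is returned together with the text.
    public static OcrResult Extract(string tessdataPath, string filePath)
    {
        using (TesseractEngine engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
        using (Pix pix = Pix.LoadFromFile(filePath))
        using (Page page = engine.Process(pix))
        {
            return new OcrResult(page.GetText(), page.GetMeanConfidence());
        }
    }
}
```

The form could then show the text only when result.IsReliable(0.6f) returns true. The 0.6 threshold is an arbitrary value chosen for illustration and should be tuned against your own images.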
 
 

Screenshots

Image with some text

[Screenshot of the input image]
 

The extracted Text

[Screenshot of the extracted text displayed in the Label]
 
 

Downloads