In this article I will explain, with an example, how to read (extract) text from an image using the Tesseract OCR library in ASP.Net with C# and VB.Net.
The process of reading or extracting text from images is known as Optical Character Recognition (OCR).
 
 
Installing and configuring Tesseract Library
Installing Tesseract Library
You will need to install the Tesseract package from NuGet by running the following command in the Package Manager Console.
Install-Package Tesseract -Version 4.7.0
 
Note: For more details on how to install a package from NuGet, please refer to my article, Install Nuget package in Visual Studio 2017, 2019, 2022.
 
Downloading and configuring Tesseract Data Files
You will need to download the Tesseract Data files from the following link.
Once downloaded, unzip it.
Tesseract OCR: Read (Extract) Text from Image in ASP.Net using C# and VB.Net
 
Then copy the unzipped folder to the project root folder and rename it to tessdata.
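A missing or misplaced tessdata folder is a common reason for the engine failing to start, so it can help to check the folder before using it. The following is a minimal sketch, not part of the original sample: the TessdataCheck class and its Validate method are hypothetical names, and it only relies on the Tesseract convention that the data file for a language is named <language>.traineddata (for example eng.traineddata).

```csharp
using System;
using System.IO;

public static class TessdataCheck
{
    // Returns null when the data file for the given language is present,
    // otherwise a message describing what is missing.
    public static string Validate(string tessdataPath, string language)
    {
        if (!Directory.Exists(tessdataPath))
        {
            return "tessdata folder not found: " + tessdataPath;
        }

        // Tesseract language data files are named <language>.traineddata.
        string dataFile = Path.Combine(tessdataPath, language + ".traineddata");
        if (!File.Exists(dataFile))
        {
            return "Language data file not found: " + dataFile;
        }

        return null;
    }
}
```

In the page, this could be called as TessdataCheck.Validate(Server.MapPath("~/tessdata"), "eng") and any returned message shown to the user instead of running the OCR.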
 
 
HTML Markup
The following HTML Markup consists of an ASP.Net FileUpload control, a Button and a Label control.
Select File:
<asp:FileUpload ID="fuUpload" runat="server" />
<asp:Button Text="Upload" runat="server" OnClick="OnUpload" />
<hr />
<asp:Label ID="lblText" runat="server" />
 
 
Namespaces
You will need to import the following namespaces.
C#
using System.IO;
using Tesseract;
 
VB.Net
Imports System.IO
Imports Tesseract
 
 
Reading or extracting text from image
When the Upload Button is clicked, the selected file is saved inside the Uploads Folder (Directory) and its path is passed to the ExtractTextFromImage method.
Inside the ExtractTextFromImage method, the Tesseract Engine is first initialized with the path of the tessdata folder and the Language (eng for English).
Then, the saved file is loaded into a Tesseract Pix object and processed by the engine, which returns a Tesseract Page object from which the text is extracted.
Finally, the extracted text is displayed in the Label control, with line breaks converted to <br /> tags.
C#
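protected void OnUpload(object sender, EventArgs e)
{
    // Nothing to do if no file was selected.
    if (!fuUpload.HasFile)
    {
        return;
    }

    // Make sure the Uploads folder exists before saving the file.
    string uploadFolder = Server.MapPath("~/Uploads/");
    Directory.CreateDirectory(uploadFolder);
    string filePath = Path.Combine(uploadFolder, Path.GetFileName(fuUpload.PostedFile.FileName));
    fuUpload.SaveAs(filePath);

    string extractText = this.ExtractTextFromImage(filePath);

    // Encode the text before rendering it as HTML. The text returned by Tesseract
    // typically uses "\n" line endings, so normalize "\r\n" first and then convert to <br />.
    lblText.Text = Server.HtmlEncode(extractText).Replace("\r\n", "\n").Replace("\n", "<br />");
}
 
private string ExtractTextFromImage(string filePath)
{
    // Folder containing the language data files (e.g. eng.traineddata).
    string tessdataPath = Server.MapPath("~/tessdata");
    using (TesseractEngine engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
    using (Pix pix = Pix.LoadFromFile(filePath))
    using (Tesseract.Page page = engine.Process(pix))
    {
        return page.GetText();
    }
}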
 
VB.Net
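Protected Sub OnUpload(ByVal sender As Object, ByVal e As EventArgs)
    ' Nothing to do if no file was selected.
    If Not fuUpload.HasFile Then
        Return
    End If

    ' Make sure the Uploads folder exists before saving the file.
    Dim uploadFolder As String = Server.MapPath("~/Uploads/")
    Directory.CreateDirectory(uploadFolder)
    Dim filePath As String = Path.Combine(uploadFolder, Path.GetFileName(fuUpload.PostedFile.FileName))
    fuUpload.SaveAs(filePath)

    Dim extractText As String = Me.ExtractTextFromImage(filePath)

    ' Encode the text before rendering it as HTML, then convert line breaks to <br />.
    lblText.Text = Server.HtmlEncode(extractText).Replace(vbCrLf, vbLf).Replace(vbLf, "<br />")
End Sub
 
Private Function ExtractTextFromImage(ByVal filePath As String) As String
    ' Named tessdataPath because VB.Net is case-insensitive and a local
    ' variable called path would hide the System.IO.Path class.
    Dim tessdataPath As String = Server.MapPath("~/tessdata")
    Using engine As TesseractEngine = New TesseractEngine(tessdataPath, "eng", EngineMode.Default)
        Using img As Pix = Pix.LoadFromFile(filePath)
            Using page As Tesseract.Page = engine.Process(img)
                Return page.GetText()
            End Using
        End Using
    End Using
End Function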
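The upload handler above passes whatever file the user selects to Tesseract. A minimal sketch of an extension check that could run before saving the file is given below. The UploadValidator class, the IsAllowedImage method and the list of extensions are assumptions made for illustration and are not part of the Tesseract library; adjust the list to the image formats your installation can actually load.

```csharp
using System;
using System.IO;
using System.Linq;

public static class UploadValidator
{
    // Hypothetical allow-list of image extensions; not defined by Tesseract itself.
    private static readonly string[] AllowedExtensions =
    {
        ".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"
    };

    // Returns true when the file name ends with one of the allowed extensions,
    // compared without regard to letter case.
    public static bool IsAllowedImage(string fileName)
    {
        if (string.IsNullOrWhiteSpace(fileName))
        {
            return false;
        }

        string extension = Path.GetExtension(fileName);
        return AllowedExtensions.Contains(extension, StringComparer.OrdinalIgnoreCase);
    }
}
```

Inside OnUpload, the check could be applied to fuUpload.FileName before the call to SaveAs, showing a message in the Label control when it returns false. An extension check only filters obvious mistakes; it does not prove that the file content is a valid image.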
 
 
Screenshots
Image with some text
 
The extracted Text
 
 