In this article I will explain, with an example, how to read (extract) text from an image using Microsoft Office Document Imaging (MODI) in ASP.Net MVC.
The process of reading or extracting text from images is known as Optical Character Recognition (OCR).
An Image containing some text is uploaded, the text is read from the Image using OCR, and the extracted text is displayed in the View.
 
 
Downloading and installing the Microsoft Office Document Imaging (MODI) library
For details on downloading and installing the Microsoft Office Document Imaging (MODI) library, please refer to my article Download and install the Microsoft Office Document Imaging (MODI) Library.
 
 
Adding Reference of Microsoft Office Document Imaging (MODI) to your project
To add a reference to Microsoft Office Document Imaging (MODI) to your project, right-click the project in Solution Explorer and select Add, then Add Reference….
In the Reference Manager dialog, expand the COM tab, locate Microsoft Office Document Imaging 12.0 Type Library in the list, check its CheckBox and click OK.
 
Once the reference has been added successfully, the MODI library will be listed in the References of your project.
 
 
Namespaces
You will need to import the following namespaces.
using MODI;
using System.IO;
 
 
Controller
The Controller consists of the following two Action methods.
Action method for handling GET operation
Inside this Action method, the View is simply returned.
 
Action method for handling POST operation for reading or extracting text from image
Inside this Action method, first a check is performed to determine whether the Uploads Folder (Directory) exists; if it does not, it is created. The selected file is then saved inside the Uploads Folder (Directory).
Then the file path is passed to the ExtractTextFromImage method.
Inside the ExtractTextFromImage method, the file is read from the saved path using the MODI Document object, the text is extracted from the image using the MODI Image object, and the extracted text is returned.
Finally, the extracted text is set in the ViewBag object, which is later displayed in the View.
Note: Before assigning to the ViewBag object, each new line is replaced with "<br />" so that line breaks are displayed on the web page.
 
public class HomeController : Controller
{
    // GET: Home
    public ActionResult Index()
    {
        return View();
    }
 
    // POST: Home
    [HttpPost]
    public ActionResult Index(HttpPostedFileBase postedFile)
    {
        if (postedFile != null)
        {
            string path = Server.MapPath("~/Uploads/");
            if (!Directory.Exists(path))
            {
                Directory.CreateDirectory(path);
            }
 
            // Path.GetFileName keeps only the file name and drops any directory information.
            string filePath = Path.Combine(path, Path.GetFileName(postedFile.FileName));
            postedFile.SaveAs(filePath);
            ViewBag.Message = this.ExtractTextFromImage(filePath).Replace(Environment.NewLine, "<br />");
        }
 
        return View();
    }
 
    private string ExtractTextFromImage(string filePath)
    {
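        // Load the saved image into a MODI Document and run OCR for English.
        Document modiDocument = new Document();
        modiDocument.Create(filePath);
        modiDocument.OCR(MiLANGUAGES.miLANG_ENGLISH);

        // Read the recognized text from the first image of the Document.
        MODI.Image modiImage = (modiDocument.Images[0] as MODI.Image);
        string extractedText = modiImage.Layout.Text;
        modiDocument.Close();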
 
        return extractedText;
    }
}
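 
Since the extracted text is rendered in the View without encoding, any characters in the image that look like HTML (for example < or &) would be written to the page as markup. The following is a minimal sketch of one way to handle this, not part of the original sample: the OcrTextFormatter class and its ToHtml method are hypothetical names introduced here for illustration, and the sketch assumes that HTML encoding the text before inserting the "<br />" tags is acceptable for your page.

```csharp
using System;
using System.Net;

// Hypothetical helper (an assumption for illustration, not part of the original sample).
public static class OcrTextFormatter
{
    // HTML encodes the OCR text and then converts new lines to <br /> tags.
    public static string ToHtml(string extractedText)
    {
        if (string.IsNullOrEmpty(extractedText))
        {
            return string.Empty;
        }

        // Encode first, so that characters such as < and & are displayed literally.
        string encoded = WebUtility.HtmlEncode(extractedText);

        // Normalize Windows line endings, then insert the line break tags.
        return encoded.Replace("\r\n", "\n").Replace("\n", "<br />");
    }
}
```

With such a helper, the assignment inside the Action method could be written as ViewBag.Message = OcrTextFormatter.ToHtml(this.ExtractTextFromImage(filePath)); so that only the "<br />" tags reach the page as raw HTML.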
 
 
View
The View consists of an HTML Form which has been created using the Html.BeginForm method with the following parameters.
ActionName – Name of the Action. In this case the name is Index.
ControllerName – Name of the Controller. In this case the name is Home.
FormMethod – It specifies the Form Method i.e. GET or POST. In this case it is set to POST.
HtmlAttributes – This object allows additional Form attributes to be specified. In this case enctype is set to "multipart/form-data", which is necessary for uploading files.
Inside the Form, there is an HTML file input element, a Submit Button and a SPAN element for displaying the extracted text.
When the Upload Button is clicked, the Form is submitted and the extracted text held in the ViewBag object is displayed using Razor syntax.
Note: The Html.Raw Helper Method is used to display HTML in raw format i.e. without encoding in ASP.Net MVC Razor. For more details please refer to my article Using Html.Raw Helper Method in ASP.Net MVC.
 
@{
    Layout = null;
}
 
<!DOCTYPE html>
 
<html>
<head>
    <meta name="viewport" content="width=device-width" />
    <title>Index</title>
</head>
<body>
    <div>
        @using (Html.BeginForm("Index", "Home", FormMethod.Post, new { enctype = "multipart/form-data" }))
        {
            <span>Select File:</span>
            <input type="file" name="postedFile" />
            <input type="submit" value="Upload" />
            <hr />
            <span>@Html.Raw(ViewBag.Message)</span>
        }
    </div>
</body>
</html>
 
 
Screenshots
[Screenshot: Image with some text]
 
[Screenshot: The extracted Text]
 
 