In this article I will explain, with an example, how to convert Microsoft Office Word documents (DOC and DOCX files) to HTML and display them in the browser in ASP.Net using C# and VB.Net.
The Word document is uploaded first, then converted to HTML using the Microsoft Office Interop library, and finally the converted HTML is displayed in the browser.
Installing Microsoft.Office.Interop.Word package using NuGet
HTML Markup
The HTML Markup consists of the following controls:
FileUpload – For selecting the file.
Button – For uploading the selected file.
div – For displaying the file in HTML format.
The Button has been assigned an OnClick event handler.
<asp:FileUpload ID="fuUpload" runat="server" />
<asp:Button ID="btnUpload" runat="server" Text="submit" OnClick="Upload" />
<hr />
<div id="dvWord" runat="server"></div>
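The article targets DOC and DOCX files, but the FileUpload control above accepts any file. As a minimal sketch, a check like the following could be run on the posted file name before the document is handed to Word. The class name UploadValidation and the method IsWordDocument are hypothetical helpers introduced here for illustration; they are not part of the original sample.

```csharp
using System;
using System.IO;

public static class UploadValidation
{
    // Hypothetical helper: returns true only for .doc and .docx file names.
    public static bool IsWordDocument(string fileName)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            return false;
        }

        // Path.GetExtension returns the extension including the leading dot,
        // or an empty string when the name has no extension.
        string extension = Path.GetExtension(fileName);
        return extension.Equals(".doc", StringComparison.OrdinalIgnoreCase)
            || extension.Equals(".docx", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsWordDocument("Report.DOCX"));
        Console.WriteLine(IsWordDocument("Photo.png"));
    }
}
```

Inside the Button's event handler, such a check would typically be called with fuUpload.PostedFile.FileName, and the conversion skipped when it returns false.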
Namespaces
You will need to import the following namespaces.
C#
using System.IO;
using System.Text.RegularExpressions;
using Microsoft.Office.Interop.Word;
VB.Net
Imports System.IO
Imports System.Text.RegularExpressions
Imports Microsoft.Office.Interop.Word
Converting Office Word Document to HTML and displaying it in browser in ASP.Net
When the Upload Button is clicked, a check is first performed whether the Temp Folder (Directory) exists inside the project. If it does not exist, it is created, and the uploaded Word document is then saved into the Temp folder.
Next, an object of the Word Application class is created and the file is opened using the Open method, with fileSavePath passed as an argument. The Word document is then saved as an HTML file using the SaveAs method, and the document is closed.
Then, the contents of the saved HTML file are read into the wordHTML string using the ReadAllText method of the File class.
Then, a FOR EACH loop is executed over all matches of a Regular Expression pattern found within the wordHTML string using Regex, and each matched image path is prefixed with the Temp folder name so that the images resolve correctly in the browser.
After that, the uploaded Word file is deleted using the Delete method of the File class.
Finally, the HTML string is assigned to the InnerHtml property of the HTML DIV.
C#
protected void Upload(object sender, EventArgs e)
{
    //8 corresponds to WdSaveFormat.wdFormatHTML.
    object documentFormat = 8;
    string randomName = DateTime.Now.Ticks.ToString();
    object htmlFilePath = Server.MapPath("~/Temp/") + randomName + ".htm";
    object fileSavePath = Server.MapPath("~/Temp/") + Path.GetFileName(fuUpload.PostedFile.FileName);
    //If Directory not present, create it.
    if (!Directory.Exists(Server.MapPath("~/Temp/")))
    {
        Directory.CreateDirectory(Server.MapPath("~/Temp/"));
    }
    //Upload the word document and save to Temp folder.
    fuUpload.PostedFile.SaveAs(fileSavePath.ToString());
    //Open the word document in background.
    _Application applicationclass = new Application();
    applicationclass.Documents.Open(ref fileSavePath);
    applicationclass.Visible = false;
    Document document = applicationclass.ActiveDocument;
    //Save the word document as HTML file.
    document.SaveAs(ref htmlFilePath, ref documentFormat);
    //Close the word document and quit Word so that no WINWORD.EXE process is left running.
    document.Close();
    applicationclass.Quit();
    //Read the saved Html File.
    string wordHTML = File.ReadAllText(htmlFilePath.ToString());
    //Loop and replace the Image Path.
    foreach (Match match in Regex.Matches(wordHTML, "<v:imagedata.+?src=[\"'](.+?)[\"'].*?>", RegexOptions.IgnoreCase))
    {
        //Escape the path so that it is treated as literal text and not as a pattern.
        wordHTML = Regex.Replace(wordHTML, Regex.Escape(match.Groups[1].Value), "Temp/" + match.Groups[1].Value);
    }
    //Delete the Uploaded Word File.
    File.Delete(fileSavePath.ToString());
    dvWord.InnerHtml = wordHTML;
}
VB.Net
Protected Sub Upload(ByVal sender As Object, ByVal e As EventArgs)
    '8 corresponds to WdSaveFormat.wdFormatHTML.
    Dim documentFormat As Object = 8
    Dim randomName As String = DateTime.Now.Ticks.ToString()
    Dim htmlFilePath As Object = Server.MapPath("~/Temp/") & randomName & ".htm"
    Dim fileSavePath As Object = Server.MapPath("~/Temp/") & Path.GetFileName(fuUpload.PostedFile.FileName)
    'If Directory not present, create it.
    If Not Directory.Exists(Server.MapPath("~/Temp/")) Then
        Directory.CreateDirectory(Server.MapPath("~/Temp/"))
    End If
    'Upload the word document and save to Temp folder.
    fuUpload.PostedFile.SaveAs(fileSavePath.ToString())
    'Open the word document in background.
    Dim applicationclass As _Application = New Application()
    applicationclass.Documents.Open(fileSavePath)
    applicationclass.Visible = False
    Dim document As Document = applicationclass.ActiveDocument
    'Save the word document as HTML file.
    document.SaveAs(htmlFilePath, documentFormat)
    'Close the word document and quit Word so that no WINWORD.EXE process is left running.
    document.Close()
    applicationclass.Quit()
    'Read the saved Html File.
    Dim wordHTML As String = File.ReadAllText(htmlFilePath.ToString())
    'Loop and replace the Image Path.
    For Each match As Match In Regex.Matches(wordHTML, "<v:imagedata.+?src=[""'](.+?)[""'].*?>", RegexOptions.IgnoreCase)
        'Escape the path so that it is treated as literal text and not as a pattern.
        wordHTML = Regex.Replace(wordHTML, Regex.Escape(match.Groups(1).Value), "Temp/" & match.Groups(1).Value)
    Next
    'Delete the Uploaded Word File.
    File.Delete(fileSavePath.ToString())
    dvWord.InnerHtml = wordHTML
End Sub
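The image path step can be tried on its own, without Word installed. The following is a minimal sketch of that step, assuming a hypothetical helper named PrefixImagePaths and a hand-written sample string in place of real Word output. It differs slightly from the handler: each distinct path is collected once and replaced with String.Replace, so a path that is referenced by several v:imagedata tags is not prefixed twice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImagePathDemo
{
    // Hypothetical helper: prefixes every image path referenced by a
    // <v:imagedata> tag with the given folder, wherever that path occurs.
    public static string PrefixImagePaths(string html, string folder)
    {
        // Collect each distinct path once so that it is only prefixed once.
        HashSet<string> paths = new HashSet<string>();
        foreach (Match match in Regex.Matches(html, "<v:imagedata.+?src=[\"'](.+?)[\"'].*?>", RegexOptions.IgnoreCase))
        {
            paths.Add(match.Groups[1].Value);
        }

        // Replace every occurrence of the path, including fallback <img> tags.
        foreach (string path in paths)
        {
            html = html.Replace(path, folder + path);
        }

        return html;
    }

    public static void Main()
    {
        // Hand-written sample, not actual Word output.
        string sample =
            "<v:imagedata src=\"doc_files/image001.png\" o:title=\"\"/>" +
            "<img src=\"doc_files/image001.png\">";

        Console.WriteLine(PrefixImagePaths(sample, "Temp/"));
    }
}
```

Because the whole path is replaced everywhere it appears, the src of the fallback img tag is updated along with the v:imagedata tag, which is the tag most browsers actually render.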