C# – Create DOCX from HTML code


Hello everyone. Today I was playing around with converting HTML code to Word documents (.docx), and I think I found a very simple way to do it. All you need is Visual Studio and a DLL that you can download from here.

For my example I created a new Windows Forms Application.

[Image: DOCXToHTML_project]

The next step is to add a reference to the DLL “HTMLtoDOCX.dll” to the project.

[Image: DOCXToHTML_reference_dll]

Now here is the code I used for my example application:

// First file: the Form1 class
using System;
using System.Windows.Forms;

namespace CreateDOCXFromHTML
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        // Lets the user choose where the .docx file will be saved.
        private void browse_Click(object sender, EventArgs e)
        {
            saveFileDialog1.Title = "Save file as...";
            saveFileDialog1.Filter = "Word Document (*.docx)|*.docx|All files (*.*)|*.*";
            saveFileDialog1.RestoreDirectory = true;

            if (saveFileDialog1.ShowDialog() == DialogResult.OK)
            {
                path.Text = saveFileDialog1.FileName;
            }
        }

        // Downloads the HTML from the entered URL and writes it out as a .docx file.
        private void save_Click(object sender, EventArgs e)
        {
            Function f = new Function();
            string data = f.GetHTMLfromUrl(url.Text);

            NoInkSoftware.HTMLtoDOCX NewFile = new NoInkSoftware.HTMLtoDOCX();
            NewFile.CreateFileFromHTML(data, path.Text);

            MessageBox.Show("Save to DOCX finished.");

            // Reset both text boxes for the next conversion.
            url.Text = "";
            path.Text = "";
        }
    }
}

// Second file: the Function class
using System.IO;
using System.Net;
using System.Text;

namespace CreateDOCXFromHTML
{
    class Function
    {
        // Downloads the page at Url and returns its HTML.
        // Returns an empty string if the status code is not 200 OK.
        public string GetHTMLfromUrl(string Url)
        {
            string data = string.Empty;

            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);

            // The using blocks close the response and the streams even if reading fails.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode == HttpStatusCode.OK)
                {
                    // Use the charset announced by the server; fall back to UTF-8
                    // when it is missing (null or empty).
                    Encoding encoding = string.IsNullOrEmpty(response.CharacterSet)
                        ? Encoding.UTF8
                        : Encoding.GetEncoding(response.CharacterSet);

                    using (Stream receiveStream = response.GetResponseStream())
                    using (StreamReader readStream = new StreamReader(receiveStream, encoding))
                    {
                        data = readStream.ReadToEnd();
                    }
                }
            }

            return data;
        }
    }
}
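
GetHTMLfromUrl picks the text encoding from response.CharacterSet. Encoding.GetEncoding throws an ArgumentException for a name it does not know, so a server that sends an unusual charset would make the download fail. Here is a minimal sketch of a more forgiving lookup. The class name CharsetHelper and the choice of UTF-8 as the fallback are my assumptions, not part of the original project.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original project.
public static class CharsetHelper
{
    // Returns the encoding named by charset, or UTF-8 (an assumed fallback)
    // when the name is missing or not recognised by the runtime.
    public static Encoding Resolve(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return Encoding.UTF8;
        }

        try
        {
            return Encoding.GetEncoding(charset.Trim());
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported charset name.
            return Encoding.UTF8;
        }
    }
}
```

With this in place, the StreamReader could be created with CharsetHelper.Resolve(response.CharacterSet) instead of calling Encoding.GetEncoding directly.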

My example application needs two inputs to work: the URL of the web page to convert and the path of the .docx file to save.
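
Since save_Click passes url.Text and path.Text straight to the download and the converter, it can help to check both values first. The following is only a sketch of such a check. The class InputValidator, its messages and the rule that the file must end in .docx are my assumptions and not part of the original example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original project.
public static class InputValidator
{
    // Returns null when both inputs look usable,
    // otherwise a message that can be shown to the user.
    public static string Validate(string url, string path)
    {
        Uri uri;
        bool isWebAddress =
            Uri.TryCreate(url, UriKind.Absolute, out uri) &&
            (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

        if (!isWebAddress)
        {
            return "Please enter a full http:// or https:// address.";
        }

        if (string.IsNullOrWhiteSpace(path))
        {
            return "Please choose where to save the document.";
        }

        // Assumed rule: the target file should carry the .docx extension.
        if (!string.Equals(Path.GetExtension(path), ".docx", StringComparison.OrdinalIgnoreCase))
        {
            return "The target file should end in .docx.";
        }

        return null;
    }
}
```

In save_Click you could call InputValidator.Validate(url.Text, path.Text) first and, if the result is not null, show it with MessageBox.Show and return before anything is downloaded.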

Here is an example document I created from my previous post:

[Image: DOCXToHTML_finished_docx]

As you can see, the result is not perfect but still very good. I hope my post was useful and interesting for you 🙂

And as always you can download my example from here.

Important: I did not create this solution on my own. I got the code from here. Please give this guy some credit for his great work.

Sources:

http://www.codeproject.com/Articles/91894/HTML-as-a-Source-for-a-DOCX-File

http://stackoverflow.com/questions/16642196/get-html-code-from-a-website-c-sharp

http://www.csharpdeveloping.net/Snippet/How_to_create_file_save_dialog
