From Word to the DEC

Wanting to build on my initial code, I did some research into other ways of copying content from a Word document so that it can be stored in a database.

What I've created consists of two files: loaddocument.html and findworddoc.html.

Download the code

Open up loaddocument.html in your browser. Click on the Load button and you will see a dialog box prompting for two parameters: the path to the file and the type of formatting to be applied.

By pasting into the DEC instead of a textarea, as in my previous code example, we can easily retain the formatting seen in Word. All the parsing is handled by this switch statement:

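switch (ptype) {
  case 1 :
    // Take the document's content in one go and paste it as HTML.
    txt = worddoc.Content;
    document.all.tbContentElement.DOM.selection.createRange().pasteHTML(txt);
    break;
  case 2 :
    // Wrap each paragraph in <p> tags. Word collections are 1-based,
    // so the loop must include Paragraphs.Count to reach the last paragraph.
    txt = "";
    for(var i=1;i<=worddoc.Paragraphs.Count;i++){
      txt = txt + "<p>" + worddoc.Paragraphs(i).Range + "</p>\n";
    }
    document.all.tbContentElement.DOM.selection.createRange().pasteHTML(txt);
    break;
  case 3 :
    // Copy the document to the clipboard, then paste it into the DEC.
    worddoc.Range(0).Copy();
    document.all.tbContentElement.ExecCommand(5032);
    break;
}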

In the first case, we simply take the contents of the Word document, assign them to our variable and paste that into the DEC with pasteHTML. In case two, we step through the Word document one paragraph at a time, wrap each paragraph in <p> tags and paste the result the same way. This parsing is the most time-consuming of the three routines. In the third case, we copy the contents of the Word document to the Windows clipboard and then paste them using the ExecCommand method of the DEC.
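To make the second case a little more concrete, here is a minimal sketch of the paragraph wrapping on its own, with Word taken out of the picture. It is not the code from the download: paragraphsToHtml and escapeHtml are hypothetical helper names, the plain array of strings stands in for the text of each worddoc.Paragraphs(i).Range, and the HTML escaping and the trimming of the trailing paragraph mark are my own assumptions about what you would probably want before calling pasteHTML.

```javascript
// Hypothetical helper (not part of the original download):
// escape the characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap each paragraph's text in <p> tags.
// `paragraphs` is a plain array of strings standing in for the text of
// worddoc.Paragraphs(1) through worddoc.Paragraphs(Count).
function paragraphsToHtml(paragraphs) {
  var parts = [];
  for (var i = 0; i < paragraphs.length; i++) {
    // Assumption: each paragraph's text ends with a paragraph mark
    // (a carriage return), so trailing line breaks are stripped here.
    var text = String(paragraphs[i]).replace(/[\r\n]+$/, "");
    parts.push("<p>" + escapeHtml(text) + "</p>");
  }
  return parts.join("\n");
}

// Example with made-up paragraph text.
var html = paragraphsToHtml(["First paragraph\r", "a < b & c\r"]);
console.log(html);
```

Collecting the pieces in an array and joining them once also avoids the repeated string concatenation in the loop, which may help a little with long documents.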

Because this code is Internet Explorer specific anyway, I used showModalDialog instead of window.open.

Addition as of Jan 22, 2003: Check out the article on Pasting Content into the MSHTML for more ideas.

Published November 06, 2001 · Updated September 17, 2005
Categorized as MSHTML and DEC
Short URL: https://snook.ca/s/109