Pasting Content into the MSHTML

Those of you who have read the article on importing content from Word into the MSHTML editor will have noticed that the import is handled through a modal dialog box. A more common request, however, has been: "How do you parse the content using the onpaste event?" Many have tried to retrieve the content with the clipboardData object and its getData method, only to be disappointed to find that only plain text can be pulled from the clipboard.
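
For reference, the plain-text route that falls short looks roughly like the sketch below. readClipboardText is a hypothetical helper name of mine; window.clipboardData.getData("Text") is the Internet Explorer call, and it hands back the text without any of the markup.

```javascript
// Sketch: read the clipboard as plain text (Internet Explorer only).
// readClipboardText is a hypothetical helper, not part of any API.
function readClipboardText(win) {
    // clipboardData only exists on the IE window object
    if (!win || !win.clipboardData) {
        return null;
    }
    // "Text" returns plain text; the formatting from Word is lost
    return win.clipboardData.getData("Text");
}

// Usage inside an onpaste handler (browser only):
// var text = readClipboardText(window);
```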

My solution to this problem is to create a "sandbox." The sandbox is a hidden editing area that serves as temporary storage where the pasted markup can be manipulated.

The Sandbox example:

<script language="javascript">
function i1onbeforepaste(){
    i2.setActive();
}
function i2onpaste(){
    // put parsing code here
    i1.innerText = i2.innerHTML;
}
</script>
<div id="i1"
    onbeforepaste="i1onbeforepaste();"
    style="height: 300px; width: 300px; border:1px solid black; overflow:scroll;"
    contenteditable="true">
</div>
<div id="i2"
    onpaste="setTimeout('i2onpaste()',100);"
    style="height: 1px; width: 1px; border:0px solid black; overflow:hidden;"
    contenteditable="true"></div>

As you can see, this is a really simple example. There are two DIVs, both editable, but the second one is effectively invisible (note the 1px dimensions and overflow:hidden in its style attribute). You cannot set the focus to a hidden object, so instead we make it extremely, if not impossibly, difficult to see.

When you press CTRL-V or paste from the Edit menu, the onbeforepaste event on i1 fires. Its handler moves the focus to our hidden editable region, so the clipboard contents are pasted into the hidden DIV instead, which fires the onpaste event on i2. You'll notice that we pause for a fraction of a second (100 milliseconds) before running our event handler. After some troubleshooting, it seems that the onpaste event fires before the content actually appears in the DIV, so we add a slight delay to let the content appear before we work with it.
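
The delayed read can also be written with a function reference instead of a string passed to setTimeout. This is only a sketch: deferRead, callback and schedule are names I've made up for illustration, and schedule exists purely so the timer can be swapped out.

```javascript
// Sketch: build an onpaste handler that reads the sandbox only after a
// short delay. deferRead, callback and schedule are hypothetical names.
function deferRead(sandbox, callback, delayMs, schedule) {
    // default to the regular timer when no scheduler is supplied
    schedule = schedule || function (fn, ms) { return setTimeout(fn, ms); };
    return function () {
        // The pasted content has not reached the sandbox yet at this point,
        // so innerHTML is read later, inside the scheduled function.
        schedule(function () {
            callback(sandbox.innerHTML);
        }, delayMs);
    };
}

// Usage in the page (browser only):
// i2.onpaste = deferRead(i2, function (html) { i1.innerText = html; }, 100);
```

The 100 millisecond figure is the same one used in the example above; I haven't measured how short the delay can safely be.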

In the example above, the i2onpaste() function just copies the contents of i2 into i1 as text, so that you can see the raw HTML, purely for demonstration purposes. The following example gives a quick method of cleaning some of the HTML that comes along when copying and pasting from Microsoft Word.

Cleaning Word example

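function i2onpaste(){
    // walk every element in the sandbox (all is IE's element collection)
    for (var intLoop = 0; intLoop < i2.all.length; intLoop++) {
        var el = i2.all[intLoop];
        // second argument 0 = case-insensitive attribute name match
        el.removeAttribute("className", 0);
        el.removeAttribute("style", 0);
    }
    // copy the cleaned markup into the visible editor
    i1.innerHTML = i2.innerHTML;
}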

In this example, I've extended the onpaste event handler to remove all class and style attributes from every tag that is in the sandbox before copying it to the editable DIV.
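
If you want to strip more than those two attributes, the loop generalizes to a small helper. Treat this as a sketch: stripAttributes is a hypothetical name, and the fallback to getElementsByTagName("*") is my assumption for browsers that lack the all collection.

```javascript
// Sketch: remove a list of attributes from every element under root.
// stripAttributes is a hypothetical helper name.
function stripAttributes(root, names) {
    // root.all is the IE collection; otherwise fall back to the DOM method
    var els = root.all || root.getElementsByTagName("*");
    for (var i = 0; i < els.length; i++) {
        for (var j = 0; j < names.length; j++) {
            els[i].removeAttribute(names[j]);
        }
    }
    return els.length; // number of elements visited
}

// Usage (browser only). As far as I know, older IE wants "className"
// where other browsers want "class", so pass both:
// stripAttributes(i2, ["className", "class", "style"]);
```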

Ideas for improving on this include cleaning out the sandbox before and/or after it is used, and handling content from sources other than Microsoft Word.
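
Sandbox cleanup could look something like the sketch below. All of the names (createSandbox, beforePaste, afterPaste, clean) are hypothetical; clean is assumed to be a function that takes the sandbox's HTML string and returns the markup you want to keep.

```javascript
// Sketch: empty the sandbox before it receives a paste and again after
// its content has been handed over. All names here are hypothetical.
function createSandbox(editor, sandbox, clean) {
    return {
        // call from the editor's onbeforepaste handler
        beforePaste: function () {
            sandbox.innerHTML = "";   // drop leftovers from an earlier paste
            sandbox.setActive();      // IE-only: move focus to the sandbox
        },
        // call from the delayed onpaste handler on the sandbox
        afterPaste: function () {
            editor.innerHTML = clean(sandbox.innerHTML);
            sandbox.innerHTML = "";   // leave the sandbox empty
        }
    };
}

// Usage (browser only):
// var sb = createSandbox(i1, i2, function (html) { return html; });
// then call sb.beforePaste() and sb.afterPaste() from the two handlers
```

Like the examples above, afterPaste replaces the editor's contents rather than inserting at the cursor.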

Have Fun!

Published February 07, 2003 · Updated September 17, 2005
Categorized as MSHTML and DEC
Short URL: https://snook.ca/s/115

Conversation

2 Comments
asdf said on March 04, 2004

are you the author of this piece or are these guys? http://www.directions.com.au/articlehtmlxid_242

just wanted to give a heads up if they ripped off your work

dhananjay said on February 09, 2006

This is fine if I want to paste an entire file from Word into MSHTML, but if I want to paste something in between existing text in MSHTML from Word, this code will always append the pasted code to the end of the file. Please tell me what we should do to paste text in between existing text.
