Pasting Content into the MSHTML Editor

Those of you who have read my article on importing content from Word into the MSHTML editor will have noticed that the import is handled through a modal dialog box. A more common request, however, has been: "How do you parse the content using the onpaste event?" Many have tried to retrieve the content with the clipboardData object and its getData method, but have been disappointed to find that only plain text can be pulled from the clipboard.
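For reference, the clipboardData attempt usually looks something like the sketch below. The function name readClipboardText and the win parameter are made up for illustration; the only real API here is Internet Explorer's clipboardData.getData, called with the "Text" format.

```javascript
// Sketch only: `win` stands in for the browser's window object.
function readClipboardText(win) {
    // clipboardData is an Internet Explorer extension; other hosts lack it
    if (!win || !win.clipboardData) {
        return null;
    }
    // "Text" asks for the plain-text form, so the formatting is not included
    return win.clipboardData.getData("Text");
}
```

In a page you would call readClipboardText(window) from a paste handler. Whatever was copied, the result is a plain string, which is why this approach is a dead end for parsing rich content.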

My solution to this problem is to create a "sandbox." The sandbox is a hidden editing area that can be used as temporary storage in which to manipulate the pasted code.

The Sandbox example:

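<script language="javascript">
function i1onbeforepaste(){
    // redirect the paste: move the focus to the hidden sandbox
    i2.focus();
}
function i2onpaste(){
    // put parsing code here
    i1.innerText = i2.innerHTML;
}
</script>
<!-- the 10 ms delay below is an assumed value; any short pause should do -->
<div id="i1" contenteditable="true" onbeforepaste="i1onbeforepaste()"
    style="height: 300px; width: 300px; border:1px solid black; overflow:scroll;"></div>
<div id="i2" contenteditable="true" onpaste="setTimeout('i2onpaste()', 10)"
    style="height: 1px; width: 1px; border:0px solid black; overflow:hidden;"></div>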

As you can see, this is a really simple example. There are two DIVs, both editable, but the second one is invisible (sort of; see the 1px dimensions and the overflow:hidden in its style attribute). You cannot set the focus to a hidden object, so instead we simply make it extremely, if not impossibly, difficult to see.

When you press CTRL-V or use the Edit menu to paste, the onbeforepaste event on i1 fires. Its handler moves the focus to our hidden editable region, which causes the clipboard contents to be pasted into the hidden DIV instead. This in turn fires the onpaste event on i2. You'll notice that we pause a fraction of a second before we run our event handler. After some troubleshooting, it seems that the onpaste event fires before the content actually appears in the DIV. So, we create a slight delay to allow the content to appear and then we can work with it.
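The same sequence can also be wired up from script instead of inline attributes. This is only a sketch: wireSandbox, its parameters, and the injectable schedule argument are names I've invented for illustration, with editor and sandbox standing in for i1 and i2.

```javascript
// Hypothetical helper: connects an editable region to a hidden sandbox.
// `schedule` is an assumption added so the delay mechanism can be swapped
// out; it falls back to the global setTimeout.
function wireSandbox(editor, sandbox, handler, delayMs, schedule) {
    schedule = schedule || setTimeout;
    editor.onbeforepaste = function () {
        // move the focus so the paste lands in the sandbox
        sandbox.focus();
    };
    sandbox.onpaste = function () {
        // onpaste fires before the content appears, so defer the handler
        schedule(function () {
            handler(editor, sandbox);
        }, delayMs);
    };
}
```

In the page above this would be called once after both DIVs exist, for example wireSandbox(i1, i2, function () { i2onpaste(); }, 10). The 10 ms figure is a guess at "a fraction of a second" and may need tuning.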

In the example above, the i2onpaste() function copies the raw HTML of i2 into i1 as text, purely for demonstration purposes. The following example gives a quick method of cleaning up some of the markup that comes along when copying and pasting from Microsoft Word.

Cleaning Word example

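function i2onpaste(){
    for (var intLoop = 0; intLoop < i2.all.length; intLoop++) {
        var el = i2.all[intLoop];
        el.removeAttribute("className"); // legacy MSHTML exposes class as className
        el.removeAttribute("style");
    }
    i1.innerHTML = i2.innerHTML;
}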

In this example, I've extended the onpaste event handler to remove all class and style attributes from every tag that is in the sandbox before copying it to the editable DIV.
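If you want to strip more than those two attributes, the loop generalizes easily. The helper below is a minimal sketch; stripAttributes and its parameters are hypothetical names, and it assumes only that each element offers a removeAttribute method.

```javascript
// Hypothetical helper: removes each named attribute from every element
// in an array-like collection (for example, i2.all).
function stripAttributes(elements, names) {
    for (var i = 0; i < elements.length; i++) {
        for (var j = 0; j < names.length; j++) {
            elements[i].removeAttribute(names[j]);
        }
    }
}
```

Inside the handler this would be called as stripAttributes(i2.all, ["className", "style"]) before the content is copied across. Removing attributes does not change the length of the collection, so iterating over it while modifying the elements is safe.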

Some ideas for improving on this include clearing the sandbox before and/or after it has been used, as well as handling content from sources other than Microsoft Word.
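Both ideas could be combined along these lines. Treat it as a rough sketch rather than a finished solution: finishPaste and the clean callback are names I've made up, and like the examples above it replaces the editor's whole content rather than inserting at the caret.

```javascript
// Hypothetical helper: runs an optional cleaner over the sandbox, copies
// the result into the editor, and always empties the sandbox afterwards.
function finishPaste(editor, sandbox, clean) {
    try {
        if (clean) {
            // the cleaner is swappable, so non-Word sources can get their own
            clean(sandbox);
        }
        editor.innerHTML = sandbox.innerHTML;
    } finally {
        // clear the sandbox so the next paste starts from an empty region
        sandbox.innerHTML = "";
    }
}
```

Passing the cleaner in as a function keeps the Word-specific logic separate, so a different cleaner can be chosen depending on where the content came from.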

Have Fun!