Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

createContextualFragment doesn't respect the context #67

Closed
timdown opened this issue Mar 22, 2014 · 10 comments
Closed

createContextualFragment doesn't respect the context #67

timdown opened this issue Mar 22, 2014 · 10 comments

Comments

@timdown
Copy link
Owner

timdown commented Mar 22, 2014

From accou...@aleks.cc on August 22, 2011 20:21:10

What steps will reproduce the problem? 1) Create a range as follows: <xmp>f[o]o</xmp>
2) Let frag equal range.createContextualFragment('abc')
3) Call range.insertNode(frag) What is the expected output? What do you see instead? The fragment should be parsed as raw text and escaped per the fragment parsing spec. Instead, an actual bold node is inserted at the start of the range. What version of the product are you using? On what operating system? Rangy 1.2beta2 and Chromium 12.0.742.112 on Ubuntu. Please provide any additional information below. The problem appears to be that Rangy's createContextualFragment algorithm always inserts html into a generic DIV container, so the html fragment parsing algorithm doesn't 'see' the context. The fix is to use the range's start container node as the context per the DOM parsing spec ( http://html5.org/specs/dom-parsing.html ).

Original issue: http://code.google.com/p/rangy/issues/detail?id=67

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From accou...@aleks.cc on August 22, 2011 15:08:39

I wrote up a quick spec-compliant implementation below that works on my test cases. Feel free to use it.

rangy.rangePrototype.createContextualFragment = function(html) {
  // "Let node the context object's start's node."
  var node = this.startContainer;
  var doc = dom.getDocument(node);

  // "If the context object's start's node is null, raise an INVALID_STATE_ERR
  // exception and abort these steps."
  if (!node) {
    throw new rangy.DOMException("INVALID_STATE_ERR");
  }

  // "Let element be as follows, depending on node's interface:"
  // Document, Document Fragment: null
  var el = null;

  // "Element: node"
  if (node.nodeType === ELEMENT) {
    el = node;

  // "Text, Comment: node's parentElement"
  } else if (dom.isCharacterDataNode(node)) {
    el = dom.parentElement(node);
  }

  // "If either element is null or element's ownerDocument is an HTML document 
  // and element's local name is "html" and element's namespace is the HTML
  // namespace"
  if (el === null || (
    el.nodeName === 'HTML'
    && dom.isHtmlElement(dom.getDocument(el).documentElement)
    && dom.isHtmlNamespace(el.namespaceURI)
  )) {

  // "let element be a new Element with "body" as its local name and the HTML 
  // namespace as its namespace.""
    el = doc.createElement('body');
  }

  // "If the node's document is an HTML document: Invoke the HTML fragment
  // parsing algorithm."
  // "If the node's document is an XML document: Invoke the XML fragment parsing 
  // algorithm."
  // "In either case, the algorithm must be invoked with fragment as the input
  // and element as the context element."
  var context = el.cloneNode();
  context.innerHTML = html;

  // "If this raises an exception, then abort these steps. Otherwise, let new 
  // children be the nodes returned."

  // "Let fragment be a new DocumentFragment."
  var fragment = doc.createDocumentFragment(), child;

  // "Append all new children to fragment."
  while ((child = context.firstChild)) {
    fragment.appendChild(child);
  }

  // "Return fragment."
  return fragment;
}

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 22, 2011 17:44:27

You're right. I wasn't aware of the contextual aspect of createContextualFragment() (in fact, I'd always thought it was essentially useless, which is possibly reflected in the rather tossed-off implementation in Rangy). Your code looks good to me, so I'll look into this and get back to you. Thank you for the helpful bug report and code.

Now I wish I'd held off another day before releasing 1.2...

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 23, 2011 05:31:07

Having had a deeper look at this issue, it's opened a can of worms. The problem is that you're completely dependent on the browser's innerHTML implementation and whether it follows the HTML fragment parsing spec. For example, consider the following script:

var el = document.createElement("title");
el.innerHTML = "foo";
alert(el.firstChild.nodeType);

Recent WebKit and Mozilla correctly alert 3 (text node). Opera 11 alerts 1, thus not conforming. IE 6 and 7 know enough to throw an "Unknown runtime error". There are probably other situations in which IE throws an error for innerHTML.

Native browser implementations of createContextualFragment follow the same pattern: WebKit and Mozilla appear to do the right thing, Opera 11 still creates elements where they are not valid. I can't test on IE 9 right now, but I'm pretty sure it doesn't implement createContextualFragment at all.

So what to do? Options:

  1. Keep the old, incorrect implementation. This works by and large the same across browsers (modulo innerHTML implementation differences) but does not conform to the spec.
  2. Keep the new implementation based on your code. For your example it will work in recent Mozilla and WebKit but be incorrect in Opera and throw in IE.
  3. Create a more complicated implementation that uses a lookup table of elements that require text parsing. Possibly the method could defer to the underlying native range's implementation where one exists, based on a feature test. However, it looks like this will require implementing various algorithms from the HTML parsing spec, which will be time-consuming and add masses of code that will be used only in rare cases.
  4. Remove createContextualFragment from Rangy entirely.

I honestly don't know what's best. Thoughts?

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From accou...@aleks.cc on August 23, 2011 11:25:43

Yes, I believe IE9 lacks createContextualFragment support.

So I had a look at what jQuery does in older browsers and surprisingly the answer is not much. If you issue $(el).html('foo') in IE7 it will simply catch the unknown runtime error and happily fall back to appending the element to the title element.

Realistically, I think it's beyond the scope of this project to paper over browser differences w.r.t something as fundamental as HTML parsing. However, I do think it's worth taking advantage of spec-compliant behavior on newer browsers where possible. So, how about a jQuery-style compromise without the nasty performance hogging try-catch block?

On browsers that support createContextualFragment, just go ahead and use it. Opera 11's behavior is no-worse than the original fallback, so no worries. For the short bus browsers fall back to the original innerHTML div container, or perhaps a container with even looser content restrictions. Do all the feature detection at startup so it's nice and fast. Overall it's a net improvement for rangy.

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 23, 2011 15:50:02

That sounds sensible, and is consistent with my approach to other browser failings. It's not quite ideal but an improvement, as you say.

I'd be interested to see your implementation of isHtmlElement, if you could point me in the right direction. I'm interesting in isHtmlNamespace too, but I have a working implementation of that from another module.

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 24, 2011 02:20:36

I've implemented this in SVN.

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 24, 2011 02:20:52

Status: Started

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From accou...@aleks.cc on August 25, 2011 13:42:27

Looks good. I'll run the fix on my test cases soon. FYI, I cribbed isHtmlElement and isHtmlNamespace from aryeh's definitions. I think your definitions are more thorough.

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on August 25, 2011 15:45:34

Aryeh's stuff is designed to work on only the most recent version of browsers, a luxury Rangy doesn't have. My isHtmlNamespace came initially from Aryeh and has my added fudges for older browsers, but does the test where it can.

@timdown
Copy link
Owner Author

timdown commented Mar 22, 2014

From timd...@gmail.com on October 08, 2011 09:06:45

Fixed in version 1.2.1.

Status: Fixed

@timdown timdown closed this as completed Mar 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant