XWiki Commons / XCOMMONS-1965

HTML/4.01 parser fails on HTML containing an invalid attribute


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 11.10.4
    • Fix Version/s: None
    • Component/s: XML
    • Labels:
      None
    • Difficulty:
      Unknown

      Description

      The following Groovy script fails when the HTML contains an attribute with an invalid name (here font-family:):

      {{groovy}}
      
         def html = """<html>
      <body>
      <a font-family:="" />
      </body>
      </html>"""
       
         // Look up the HTML 4.01 parser
         def parser = services.component.getComponentManager().getInstance(org.xwiki.rendering.parser.Parser.class, "html/4.01")
         // Remove everything that is before the <body> tag.
         def htmlWithoutHeader = html.substring(html.indexOf("<body>"))
      
         // Parse into an XDOM (first step of converting to xwiki/2.1); this call throws
         def xdom = parser.parse(new java.io.StringReader(htmlWithoutHeader))
      {{/groovy}}
      

      It fails with the following exception:

      Failed to execute the [groovy] macro. Cause: [The name "" is not legal for JDOM/XML attributes: XML names cannot be null or empty.]. Click on this message for details.
      
      Caused by: javax.script.ScriptException: javax.script.ScriptException: org.jdom.IllegalNameException: The name "" is not legal for JDOM/XML attributes: XML names cannot be null or empty.
      	at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:158)
      	at org.xwiki.rendering.macro.script.AbstractJSR223ScriptMacro.eval(AbstractJSR223ScriptMacro.java:351)
      	at org.xwiki.rendering.macro.script.AbstractJSR223ScriptMacro.evaluateBlock(AbstractJSR223ScriptMacro.java:249)
      	at org.xwiki.rendering.macro.script.AbstractJSR223ScriptMacro.evaluateBlock(AbstractJSR223ScriptMacro.java:197)
      	... 185 more
      Caused by: javax.script.ScriptException: org.jdom.IllegalNameException: The name "" is not legal for JDOM/XML attributes: XML names cannot be null or empty.
      	at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:320)
      	at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:155)
      	... 188 more
      Caused by: org.jdom.IllegalNameException: The name "" is not legal for JDOM/XML attributes: XML names cannot be null or empty.
      	at org.jdom.Attribute.setName(Attribute.java:363)
      	at org.jdom.Attribute.<init>(Attribute.java:227)
      	at org.jdom.Attribute.<init>(Attribute.java:205)
      	at org.jdom.DefaultJDOMFactory.attribute(DefaultJDOMFactory.java:80)
      	at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:345)
      	at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:360)
      	at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:360)
      	at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:360)
      	at org.jdom.input.DOMBuilder.buildTree(DOMBuilder.java:173)
      	at org.jdom.input.DOMBuilder.build(DOMBuilder.java:138)
      	at org.xwiki.xml.html.HTMLUtils.toString(HTMLUtils.java:243)
      	at org.xwiki.xml.html.HTMLUtils.toString(HTMLUtils.java:229)
      	at org.xwiki.rendering.internal.parser.html.HTMLParser.parse(HTMLParser.java:64)
      	at org.xwiki.rendering.parser.Parser$parse.call(Unknown Source)
      	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
      	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
      	at Script214.run(Script214.groovy:14)
      	at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:317)
      
      

      I believe the issue is that the HTML cleaner keeps font-family: as an attribute whose name ends up empty, and HTMLUtils.toString() then fails when JDOM's DOMBuilder rejects that name.
      I would expect the cleaner to remove that attribute entirely rather than return content that cannot be converted.
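
To illustrate the behaviour I would expect, here is a minimal sketch in plain Java (JDK only) that strips such attributes from a W3C DOM tree before it is handed to JDOM. The names AttributeNameFilter, isUsableName and stripUnusableAttributes are hypothetical, not existing XWiki or JDOM APIs. The rule it applies (reject a name that is empty or that starts or ends with ':') is an assumption about how the empty name arises. It is not a full XML name validator, and I have not checked where in the cleaner such a filter would belong.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

/** Hypothetical helper for illustration; not part of XWiki or JDOM. */
final class AttributeNameFilter {

    /**
     * Assumption: a qualified name is unusable when it is empty or when it
     * starts or ends with ':', because splitting it into prefix and local
     * name would then leave one of the two parts empty.
     */
    static boolean isUsableName(String qualifiedName) {
        return qualifiedName != null
            && !qualifiedName.isEmpty()
            && !qualifiedName.startsWith(":")
            && !qualifiedName.endsWith(":");
    }

    /**
     * Removes unusable attributes from the element and all of its
     * descendants, and returns how many attributes were removed.
     */
    static int stripUnusableAttributes(Element element) {
        int removed = 0;
        NamedNodeMap attributes = element.getAttributes();
        // Iterate backwards: the map is live and shrinks on removal.
        for (int i = attributes.getLength() - 1; i >= 0; i--) {
            String name = attributes.item(i).getNodeName();
            if (!isUsableName(name)) {
                attributes.removeNamedItem(name);
                removed++;
            }
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                removed += stripUnusableAttributes((Element) child);
            }
        }
        return removed;
    }

    public static void main(String[] args) throws Exception {
        // Rebuild the <body><a font-family:="" /></body> case from this report.
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element body = document.createElement("body");
        Element anchor = document.createElement("a");
        anchor.setAttribute("font-family:", "");
        anchor.setAttribute("href", "#");
        body.appendChild(anchor);
        document.appendChild(body);

        System.out.println("removed: " + stripUnusableAttributes(body));
        System.out.println("href kept: " + anchor.hasAttribute("href"));
    }
}
```

Dropping the attribute, rather than renaming or escaping it, matches the expectation above: a name like font-family: carries no usable information, so the cleaned document loses nothing and can still be converted.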

            People

            Assignee: Unassigned
            Reporter: Ludovic Dubost