Not directly related to this issue, but maybe worth discussing...
I did some experiments with the rendering engine, trying to reuse the information extracted by the renderer to perform the syntax highlighting. Two problems make this approach inapplicable:
1) Text offsets are not retrievable from the rendering API, so apparently there is no way to find out where a given block starts and ends (at least I haven't managed to find one). This information is needed by the Eclipse text framework.
2) The "transformation" source_xwiki2 -> XDOM -> source_xwiki2 is not an identity. There is some additional logic that "sanitizes" the source when it is rendered back. For example, the source "*a ##b* c##" becomes "*a ##b##* ##c##" or something like that once rendered. So even if we were able to determine a block's offsets, they would be relative to a source that is not the original one.
Another idea I had was to reuse the low-level grammar (http://bit.ly/6KLjkK) to build a parser that could parse the editor's content and give directly usable offsets. Even in this case I found some problems:
1) The grammar has embedded logic for generating parsing events (onXYZ, beginXYZ, endXYZ), and most of that logic is useless in our context. It would be nice to have a cleaned-up grammar containing only the specification, so that we could plug in our "offset finding" logic more easily. However, this is not trivial.
2) JavaCC doesn't provide linear offsets; each token carries its begin and end positions as line and column. Nothing too difficult, but quite annoying.
3) There may be performance problems, because the parser has to reparse everything at each keystroke (unless it is incremental).
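For point 2 above, converting line/column pairs into the linear offsets the Eclipse text framework wants can probably be done with a small line-start index. Below is a minimal sketch, assuming 1-based lines and columns, an inclusive end column, and that every character counts as exactly one column (as far as I know the char stream generated by JavaCC can expand tabs, so tabs would need extra handling). The class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of JavaCC or XWiki): maps JavaCC-style
 * (line, column) positions to 0-based linear offsets into the parsed text.
 */
public final class LineOffsetIndex {
    /** lineStarts[i] is the offset of the first character of line i + 1. */
    private final int[] lineStarts;

    public LineOffsetIndex(CharSequence text) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0); // line 1 always starts at offset 0
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                i++; // treat "\r\n" as a single line break
            }
            if (c == '\n' || c == '\r') {
                starts.add(i + 1); // the next line starts right after the break
            }
        }
        lineStarts = new int[starts.size()];
        for (int i = 0; i < lineStarts.length; i++) {
            lineStarts[i] = starts.get(i);
        }
    }

    /** Converts a 1-based (line, column) pair to a 0-based offset. */
    public int toOffset(int line, int column) {
        if (line < 1 || line > lineStarts.length || column < 1) {
            throw new IllegalArgumentException("Bad position " + line + ":" + column);
        }
        return lineStarts[line - 1] + (column - 1);
    }

    /** Length of a region whose end position is inclusive (assumed token convention). */
    public int regionLength(int beginLine, int beginColumn, int endLine, int endColumn) {
        return toOffset(endLine, endColumn) - toOffset(beginLine, beginColumn) + 1;
    }
}
```

With this, a token's begin position gives the region offset via toOffset(beginLine, beginColumn), and regionLength(...) gives its length. Inside Eclipse, IDocument.getLineOffset(int), which takes a 0-based line, should provide the same line-start lookup without a separate index.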
Anyway, this solution implies maintaining a parallel version of the JavaCC grammars that has to be kept in sync with the main one.
Since there seems to be no alternative to maintaining another grammar, the third idea is that it might be worth using XText (http://www.eclipse.org/Xtext/). XText is a framework for writing DSLs, and it can automatically generate an editor for the specified language. So basically, once we specify the XWiki grammar, we get the editor with syntax highlighting, completion, etc. for free. However, I don't know how flexible the generated code is or how complex it is to extend (for example, we might want to plug in the Groovy highlighter when we are inside a Groovy macro block).
Of course, we would have to rewrite the XWiki grammar using XText's grammar definition language (http://www.eclipse.org/Xtext/documentation/latest/xtext.html#grammarLanguage), which might be difficult.