XWiki Platform / XWIKI-9270

Don't display raw content results by default, only if the user asks for it

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1-milestone-2
    • Fix Version/s: 5.1
    • Component/s: Search - Solr
    • Difficulty: Unknown

      Description

      The idea is:

      • By default display search results for doccontent only
      • Have an advanced search checkbox to display doccontentraw results

          Activity

          Marius Dumitru Florea added a comment -

          Note that this affects only the document content and not the TextArea properties because the syntax of such a property is specific to the application that uses it and thus unknown to the search engine. Only the code that uses the property knows how to interpret its value/content.

          This improvement won't be very visible because most of the XWiki applications (e.g. Blog) use TextArea properties more than they use the document content.

          Thomas Mortagne added a comment -

          Actually, we are supposed to know that a TextArea property contains wiki syntax (in the document syntax) when its type is neither "puretext" nor "velocitycode". The issue is that all the properties from all objects are concatenated into a single field in a DOCUMENT entry (they would have to be put in 2 fields, as is done for the document content).

          Thomas Mortagne added a comment -

          Looks like this causes a new form of XWIKI-9271. If raw content is disabled we should not search in it, but searching in it seems to be what happens here.

          Marius Dumitru Florea added a comment -

          I gave it some more thought and I think this is really just a display issue. It's just a matter of what extract/match to display (what to highlight). Right now, when you search for documents (default setting) I'm using this priority list to look for the first document 'field' that has an extract/match:

          ['doccontent', 'comment', 'objcontent', 'attcontent']
          

          So if a document is returned because the search keyword is found (only) in the comments then the extract will be displayed from the 'comment' field. Same if the keyword is found in objects or attachments. The raw content should not be handled differently. It's just another document 'field'. So as long as we don't have an option to search (only) in document content and not in its comments, objects or attachments then I don't think we should have a dedicated option to search (only) in the rendered document content and not in its raw content. You can always use the boost input if you want to search in some particular field. What we need to do is just display the raw content extract/match when a rendered version is not available. This means simply using:

          ['doccontent', 'doccontentraw', 'comment', 'objcontent', 'attcontent']
          

          This way, if the keyword is found in the rendered content then the extract for the rendered content will be displayed. If the keyword is found in the raw content but not in the rendered content then the raw content extract will be displayed. And so on for comments, objects and attachments. There's no need for a raw/rendered content switch.
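
          The lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the actual XWiki code: the function name `pick_extract` and the shape of the `highlights` mapping (field name to a list of highlighted snippets) are assumptions made for the example.

          ```python
          from typing import Mapping, Optional, Sequence, Tuple

          # Priority list quoted in the comment above: rendered content first,
          # raw content second, then comments, objects and attachments.
          FIELD_PRIORITY = ['doccontent', 'doccontentraw', 'comment', 'objcontent', 'attcontent']


          def pick_extract(
              highlights: Mapping[str, Sequence[str]],
              priority: Sequence[str] = FIELD_PRIORITY,
          ) -> Optional[Tuple[str, str]]:
              """Return (field, snippet) for the first field in `priority` that has a match.

              `highlights` is a hypothetical mapping from field name to the highlighted
              snippets found for one search result. Returns None when no listed field
              has a snippet.
              """
              for field in priority:
                  snippets = highlights.get(field)
                  if snippets:  # skips both missing fields and empty snippet lists
                      return field, snippets[0]
              return None


          # A match that exists only in the raw content and in a comment:
          # the raw content extract wins because it comes earlier in the list.
          result = pick_extract({
              'doccontentraw': ['{{velocity}}<em>foo</em>{{/velocity}}'],
              'comment': ['some <em>foo</em> comment'],
          })
          ```

          With this ordering no raw/rendered switch is needed: the raw extract is used only when the rendered content has no match of its own.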

          Marius Dumitru Florea added a comment -

          I removed the raw/rendered content switch and I'm displaying the raw content extract when there is a match in the raw content but not in the rendered content.


            People

            • Assignee: Marius Dumitru Florea
            • Reporter: Vincent Massol
            • Votes: 0
            • Watchers: 3
