Details
Type: Improvement
Resolution: Fixed
Priority: Major
None
Description
The collection reader tries to detect automatically the language in which a text is written.
However, this can produce wrong results when the text is too short. The typical symptom is that a document known to be in a given language is not analyzed by any annotator, because it was recognized as being written in another language. When we see this behavior, we tend to think that there is a bug in the framework.
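
The failure mode can be reproduced with a small sketch. It is not the framework's code: `Document`, `run_annotators`, `choose_language`, the `min_chars` threshold and the annotator names are all hypothetical, and the misdetected language is hard-coded rather than coming from a real detector. It shows how a wrong guess leaves a document silently unannotated, and one possible guard that falls back to a configured language for short texts.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    language: str
    annotations: list = field(default_factory=list)


def run_annotators(doc, annotators):
    """Apply only the annotators registered for the document's language."""
    for lang, name in annotators:
        if lang == doc.language:
            doc.annotations.append(name)
    return doc


def choose_language(text, detected, default_language, min_chars=50):
    """Trust the detected language only when the text is long enough.

    min_chars is an illustrative threshold, not a recommended value.
    """
    if len(text.strip()) < min_chars:
        return default_language
    return detected


# Hypothetical pipeline that only knows how to process French.
ANNOTATORS = [("fr", "french_tokenizer")]

short_text = "Chat noir."
misdetected = "en"  # assumed wrong guess for a short French text

# Without a guard: no annotator matches, and nothing signals a problem.
silent = run_annotators(Document(short_text, misdetected), ANNOTATORS)

# With the guard: the short text falls back to the configured language.
guarded_language = choose_language(short_text, misdetected, "fr")
guarded = run_annotators(Document(short_text, guarded_language), ANNOTATORS)
```

In this sketch `silent.annotations` stays empty while `guarded.annotations` contains the French annotator, which matches the symptom described above: the pipeline runs without error, yet the document comes out untouched.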