Testing and Issues

You can test this app and submit issues during the testing period of the Data Clustering Contest.

Entries with serious issues will not be able to win the contest, but even minor issues might be important for overall results.


Fair Leopard Feb 28, 2020 at 15:11
Final score for this submission (out of 100):

Languages: 13.34
News EN: 27.15
News RU: 13.33
Categories EN: 13.29
Categories RU: 16.61
Threads EN: 13.25
Threads RU: 13.31

Unfortunately, this submission didn't get a high enough score to be evaluated for Top news (task 5).

These data reflect the relative accuracy, precision and speed of the algorithm as compared to the other submissions.
Fair Leopard Feb 6, 2020 at 16:03
In our preliminary tests, this submission received the following scores (out of 100):

Languages: 82
News EN: 66
News RU: 59
Categories EN: 37
Categories RU: 43
Threads EN: 38
Threads RU: 13

Unfortunately, this submission didn't get a high enough score for the final task (top news) to be evaluated.

This is not the final result; please stay tuned for updates. We apologize for the delay.
Eager Cobra Dec 13, 2019 at 13:29
This submission initially failed to launch due to missing "article:published_time" attributes in the test data set. This issue was fixed on our side and will not result in penalties during final scoring.
Fair Leopard Dec 16, 2019 at 13:40
This submission was unable to deliver results for stages 2-5 of the test (news/categories/threads/top) due to an issue on our side. The issue has been fixed and will not result in a penalty during final scoring.

The algorithm has been relaunched. Kindly check the new results.
Only full article duplicates appear to be grouped into threads, resulting in many threads on the same subject.
Cuddly Kangaroo Dec 16, 2019 at 22:14
The similarity threshold is pretty high... so they are just very, very similar :)
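
To illustrate the effect being discussed, here is a minimal sketch of how a similarity threshold can change thread grouping. It is not this submission's actual algorithm, which is not shown on this page: the Jaccard word-overlap measure, the greedy grouping, the function names, the sample titles and both threshold values are all assumptions chosen for illustration.

```python
# Hypothetical illustration only: not the submission's real code.

def tokens(text):
    """Lowercase a title and split it into a set of words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Word-overlap similarity between two titles, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def group_into_threads(titles, threshold):
    """Greedily attach each title to the first thread whose first
    title is at least `threshold` similar; otherwise start a new thread."""
    threads = []
    for title in titles:
        for thread in threads:
            if jaccard(title, thread[0]) >= threshold:
                thread.append(title)
                break
        else:
            threads.append([title])
    return threads

# Made-up sample titles: two exact duplicates, one related story,
# and one unrelated story.
titles = [
    "Storm hits the coast overnight",
    "Storm hits the coast overnight",
    "Powerful storm hits the northern coast",
    "Central bank raises interest rates",
]

strict = group_into_threads(titles, 0.9)  # assumed "pretty high" threshold
loose = group_into_threads(titles, 0.4)   # assumed lower threshold

print(len(strict), "threads at 0.9")
print(len(loose), "threads at 0.4")
```

In this toy data the two storm stories share 4 of 7 distinct words (similarity of about 0.57), so the strict threshold groups only the exact duplicates and leaves the related story in its own thread, while the lower threshold merges all three storm titles into one thread.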