Data Clustering Contest – Developer Challenges

Testing and Issues

You can test this app and submit issues during the testing period of the Data Clustering Contest.

Entries with serious issues will not be able to win the contest, but even minor issues might be important for overall results.

Issues

Fair Leopard Feb 28, 2020 at 15:11

Final score for this submission (out of 100):

Languages: 87.61
News EN: 0
News RU: 13.06
Categories EN: 0
Categories RU: 0
Threads EN: 0
Threads RU: 0

Unfortunately, this submission didn't get a high enough score to be evaluated for Top news (task 5).

These data reflect the relative accuracy, precision and speed of the algorithm as compared to the other submissions.

Fair Leopard Feb 6, 2020 at 16:03

In our preliminary tests, this submission received the following scores (out of 100):

Languages: 100
News EN: 87
News RU: 93
Categories EN: 0
Categories RU: 0
Threads EN: 0
Threads RU: 0

Unfortunately, this submission didn't get a high enough score for the final task (top news) to be evaluated.

This is not the final result, please stay tuned for updates. We apologize for the delay.

Fair Leopard Dec 12, 2019 at 16:41

We had to fix the following issues before running the algorithm and will apply relevant penalties during the final scoring:
- missing modules, built new environment

The following issues have been discovered during preliminary testing:
- invalid categories output format (titles instead of filenames)
- ValueError: Found array with 0 sample(s) (shape=(0, 5000)) while a minimum of 1 is required.

Funky Kiwi Dec 13, 2019 at 16:39

Hard way, without changing the sources:
1) run top and threads on a dataset with en AND ru in the same directory.

Easy way, with the sources changed:
1) to fix the categories output, change line 376 in main.py from "current_state.append(val[-3])" to "current_state.append(val[0])"
2) to fix the shape problem, lines 392 and 438 need to be updated with
"if not target_texts: continue"
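The guard in step 2 can be sketched as follows. This is a minimal illustration, not the submission's actual code: `fit_texts`, `process_groups` and the sample `groups` data are hypothetical stand-ins, and `fit_texts` only imitates a fitting step that rejects empty input. It assumes `target_texts` is a plain Python list, where `not target_texts` is true for an empty list.

```python
from typing import Dict, List


def fit_texts(texts: List[str]) -> int:
    """Hypothetical stand-in for a vectorize-and-fit step.

    It rejects empty input, imitating the ValueError reported above.
    """
    if len(texts) == 0:
        raise ValueError(
            "Found array with 0 sample(s) while a minimum of 1 is required."
        )
    return len(texts)


def process_groups(groups: Dict[str, List[str]]) -> Dict[str, int]:
    """Fit every non-empty group and skip the empty ones."""
    results = {}
    for name, target_texts in groups.items():
        if not target_texts:
            # The suggested guard: nothing to fit for this group, so skip it.
            continue
        results[name] = fit_texts(target_texts)
    return results


# Assumed sample data: one language has articles, the other has none.
groups = {"en": ["first article", "second article"], "ru": []}
print(process_groups(groups))
```

Without the guard, the empty "ru" group would reach `fit_texts` and raise; with it, the loop simply moves on to the next group.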
-------------------------------------------------------
General concern: Python relies on C libraries to speed up computation, so we need to know which CPU is used, for example to install MKL.

Fair Leopard Dec 15, 2019 at 11:39

#issue9796
We had to re-run your algorithm with extra articles and will apply relevant penalties during the final scoring.

Funky Kiwi Dec 16, 2019 at 19:51

Pls run with at least 100 samples of the Ru and En languages
