
Knowledge Pit (2013) is the only platform for organizing competitions in the field of data science in Poland. Today, it has nearly 2000 registered participants from all over the world, and one of its most popular challenges attracted over 150 entries.

The project was initiated by Andrzej Janusz and Dominik Ślęzak, who decided to create a tool for their students.

“We wanted students to be able to face real-world problems. They liked the new platform very much, and their engagement during classes increased significantly.”

Advantages for business and participants

The platform brings together more than just students: data science specialists from all over the world also take part in the competitions. What can they count on?

  • gaining valuable experience,
  • a publication opportunity,
  • rewards, which are offered by business partners,
  • last but not least – fame and glory.

It is a win-win relationship, because the companies we cooperate with in organizing competitions can count on:

  • an easy way of outsourcing work to the community,
  • a reliable feasibility study,
  • a significant reduction of research costs.

Our objectives:

  • stimulating data mining research,
  • attracting students/new researchers,
  • sharing “insights” and knowledge about data mining practices,
  • establishing connections between industry and academia,
  • promoting interesting events and conferences,
  • providing commercial services to companies that seek state-of-the-art ML solutions.

Each competition is different, but we follow a certain pattern when organizing them. Thanks to this, our partners know exactly what to expect from a competition and from its results.

A typical competition schema:

  • The available data set is divided into the training and test parts.
  • Target values (e.g. labels) for the test set are hidden from participants – they have to be predicted.
  • Participants submit solutions which are assessed on a sample from the test set.
  • Participants select their most reliable models and write short reports.
  • The final solutions are evaluated on the remaining test data.
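
The schema above can be illustrated with a short Python sketch. It is only a toy model of the protocol, not the platform's actual implementation: the function names, the split fractions, the accuracy metric, and the synthetic data are all assumptions made for this example.

```python
import random


def split_competition_data(records, test_fraction=0.3, preliminary_fraction=0.2, seed=0):
    """Split (id, features, label) records into train, preliminary-test and final-test parts."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Only a sample of the test set is used for scoring during the competition.
    n_preliminary = int(len(test) * preliminary_fraction)
    return train, test[:n_preliminary], test[n_preliminary:]


def hide_targets(test_part):
    """Return what participants see (ids, features) and what the organizer keeps (labels)."""
    public = [(rec_id, features) for rec_id, features, _ in test_part]
    hidden = {rec_id: label for rec_id, _, label in test_part}
    return public, hidden


def accuracy(submission, hidden_labels):
    """Score a submission (dict: id -> predicted label) against the hidden labels."""
    hits = sum(submission.get(rec_id) == label for rec_id, label in hidden_labels.items())
    return hits / len(hidden_labels)


# Synthetic data (assumption): the label is 1 when the single feature exceeds 0.5.
data_rng = random.Random(42)
values = [data_rng.random() for _ in range(1000)]
records = [(i, x, int(x > 0.5)) for i, x in enumerate(values)]

train, preliminary, final = split_competition_data(records)
preliminary_public, preliminary_hidden = hide_targets(preliminary)
final_public, final_hidden = hide_targets(final)

# A participant's "model": a fixed threshold rule standing in for a trained model.
submission = {rec_id: int(x > 0.5) for rec_id, x in preliminary_public + final_public}

# Feedback available while the competition is running.
preliminary_score = accuracy(submission, preliminary_hidden)
# Computed once, after the deadline, on the remaining test data.
final_score = accuracy(submission, final_hidden)
print(preliminary_score, final_score)
```

Keeping the final part of the test set out of the running feedback means that a solution tuned to the preliminary score gains nothing in the final ranking unless it also generalizes to unseen data.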

To date, the project has held competitions in a number of industries and areas:

  • firefighting (2014, 2015),
  • financial/retail industry (2015, 2017),
  • coal mining (2015, 2016),
  • video games (2017, 2018, 2019),
  • cyber-security and hardware monitoring (2019, 2020),
  • customer service (2020).

Our partners: mBank, Information Builders, Security on Demand, and others.

Want to know more? Check out

Soon we are going to launch a new competition for data scientists through our platform. For now, we have decided to give you access to all evaluation files from the Network Device Workload Prediction challenge. If you would like to start or continue research related to this competition and evaluate your new results, go to