FAQ

What is Machine Learning?

In the broad sense, machine learning is a field of science that aims to create machines (algorithms) that can learn from the data provided to them without being explicitly programmed for each task.

In the narrower sense, as a process, «machine learning» is the work of a mathematical algorithm that searches for the set of model parameters with which the final model predicts the outcome under study as accurately as possible.
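As a minimal sketch of what "finding the optimal parameters" can mean, the example below fits a straight line y = w * x + b to a handful of points by least squares. The function names and the data are illustrative assumptions, not part of any particular library, and real models usually have many more parameters.

```python
# Minimal sketch: "learning" as a search for the parameters (w, b) of a
# linear model y = w * x + b that minimise the mean squared error.
# fit_line, mean_squared_error and the data are illustrative assumptions.

def fit_line(xs, ys):
    """Return (w, b) minimising the mean squared error of w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single feature.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared difference between predictions and observed values."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # synthetic points lying on y = 2x + 1
w, b = fit_line(xs, ys)
print(w, b, mean_squared_error(xs, ys, w, b))
```

Here the "optimal set of parameters" is the pair (w, b) with the smallest error on the given data. More complex algorithms follow the same idea with a different model and a different search procedure.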

What are the benefits of machine learning?

Machine learning technologies can work with large amounts of data, including automatically generated data that has a complex branched structure.

Machine learning algorithms can find complex relationships between a large number of features, which often makes it possible to build more accurate models than classical statistical approaches alone.

How much data is required for machine learning?

A large amount of data is not strictly required for machine learning: algorithms can be run on relatively small datasets. However, larger datasets tend to produce more accurate models that generalize better. During machine learning, the data is usually divided into three groups: the «training sample», the «validation sample» and the «test sample».

What are training, validation, and test samples?

The «training sample» is the dataset used to fit the machine learning model.

The «validation sample» is the dataset used during model development to find the optimal set of hyperparameters.

The «test sample» is a control dataset that is not used for training the model or for fitting hyperparameters. It is used only to evaluate the finished model.
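The division into three samples can be sketched as follows. The 60/20/20 proportions, the fixed seed and the function name split_dataset are assumptions made for illustration. In practice the proportions depend on the dataset size and the task.

```python
import random


def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle records and split them into training, validation and test samples.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    shuffled = list(records)               # copy so the input is left unchanged
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything that remains
    return train, val, test


records = list(range(100))  # stand-in for 100 patient records
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))
```

Each record ends up in exactly one sample, so the test sample stays untouched while the model is trained and its hyperparameters are tuned.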

What medical problems can be solved using machine learning?

Modern machine learning technologies can solve several types of medical problems. The most common type is classification, which underlies new diagnostic and prognostic methods. Classification problems can be binary, when a choice must be made between two states, or multiclass (multinomial), when there are more than two possible states. An example of the latter is distinguishing between the clinical forms of anemia.
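The difference between a problem with two states and a problem with more than two states shows up in the model's output. The sketch below assumes a classifier that returns one probability per possible state; the class names and the probabilities are made up for illustration, and predict_label is a hypothetical helper.

```python
def predict_label(probabilities):
    """Return the state with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)


# Two possible states (illustrative values).
two_states = {"disease": 0.8, "no disease": 0.2}

# More than two possible states (illustrative values).
several_states = {
    "iron-deficiency anemia": 0.7,
    "B12-deficiency anemia": 0.2,
    "hemolytic anemia": 0.1,
}

print(predict_label(two_states))
print(predict_label(several_states))
```

In both cases the answer is one state out of a fixed list; only the number of states on that list differs.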

Less commonly, machine learning in medicine is used to solve regression problems. They differ from classification problems in that the answer is a real number or a numerical vector rather than the probability that the patient has or will develop one of the possible conditions. An example is predicting the duration of a patient's treatment (number of days).
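Because a regression model returns a number, its quality is usually judged by how far the predictions are from the true values. As a minimal sketch, the mean absolute error below is measured in the same unit as the outcome (days). The values are invented for illustration, and the choice of this particular metric is an assumption.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual_days = [5, 10, 7, 14]             # observed treatment durations (illustrative)
predicted_days = [6.0, 9.0, 7.5, 12.0]   # model outputs (illustrative)

mae = mean_absolute_error(actual_days, predicted_days)
print(mae)
```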