Since the inception of Wikipedia, more and more research has been devoted to this free encyclopedia. Scientific work in the field of assessing the quality of information is especially important.

Although Wikipedia is often criticized for poor quality, it remains one of the most popular knowledge bases in the world. Wikipedia currently contains more than 56 million articles on various topics. At the same time, each language version is edited separately, so the quality of information may vary depending on the language.

According to Ethnologue, people in the world speak more than 7 thousand languages, of which almost 3 thousand are endangered. In comparison, Wikipedia articles are available in 319 languages.

Every day the number of articles in Wikipedia is growing. They can be created and edited even by anonymous users. Authors do not need to formally demonstrate their skills, education and experience in certain areas. Wikipedia does not have a central editorial team or a group of reviewers who could comprehensively check all new and existing texts. For these and other reasons, people often criticize the concept of Wikipedia, in particular pointing out the poor quality of information.

Wikipedia can be edited in each language independently, which leads to problems such as:

the same object (city, person, event, etc.) can be described in different ways,
the user usually needs to understand these languages in order to check/compare information.

Additionally, the assessment of the quality of information itself is subjective and depends on the Wikipedia language:

each language edition defines its own rules and standards,
standards can change over time.

So, on Wikipedia you can sometimes find valuable information, depending on the language version and subject. Practically every language version has a system of awards for the best articles. However, the number of these articles is relatively small (less than one percent). Some language versions also have other quality grades. Still, the overwhelming majority of articles are unevaluated (in some languages more than 99%).

Each language edition of Wikipedia may define its own quality rating system for articles. Often, a language version has a special mark for articles that are considered the best: “Featured Articles”. There is also a mark for decent articles that do not meet the criteria for Featured Articles: “Good Articles”.

Some language versions of Wikipedia also have other quality ratings that may reflect the “maturity” of an article. In the English Wikipedia, in addition to the highest marks “FA” and “GA”, there are also “A-class”, “B-class”, “C-class”, “Start” and “Stub”. In the Russian Wikipedia, in addition to the two highest marks, there are also “Solid article”, “I level”, “II level”, “III level” and “IV level”. The Polish Wikipedia has three additional classes: “Four”, “Start” and “Stub”.

Even when class names are the same, equivalent classes in different language versions may be judged by different standards. For example, in some language versions there is a limit on the length of the article for high marks. Therefore, each language version can have its own quality model, even if these languages have the same number of grades.

Automatic quality assessment of Wikipedia articles

Because many Wikipedia articles have no quality grade, readers must analyze their content manually. Automatic quality assessment of Wikipedia articles is a known topic in the scientific world. Most scientific works describe the most developed language version of Wikipedia, English, which already contains more than 6 million articles.

Since Wikipedia's founding, and with its growing popularity, more and more scientific publications on this subject have been published. One of the first studies showed that measuring the volume of content can help determine the degree of “maturity” of a Wikipedia article. Work in this direction shows that, in general, higher-quality articles are long, use many references, are edited by hundreds of authors and have thousands of edits.
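
Volume measures of the kind mentioned above, such as article length and number of references, can be approximated directly from an article's wikitext. The following is a minimal sketch, not the method of any particular study: the function name extract_volume_measures, the chosen measures and the sample text are all assumptions made for illustration. Measures such as the number of authors or edits would need the revision history and are not covered here.

```python
import re

def extract_volume_measures(wikitext: str) -> dict:
    """Return a few rough size-related measures for raw wikitext (illustrative only)."""
    # Opening <ref> tags, including named and self-closing ones.
    # "<references />" is not counted because \b requires the tag name to end after "ref".
    references = len(re.findall(r"<ref\b", wikitext, flags=re.IGNORECASE))

    # Section headings such as "== History ==" on a line of their own.
    headings = len(re.findall(r"^=+[^=\n]+=+[ \t]*$", wikitext, flags=re.MULTILINE))

    # Drop self-closing refs, then reference bodies, then any remaining tags,
    # to get a rough approximation of the prose (wiki markup is left in place).
    text = re.sub(r"<ref\b[^>]*/>", "", wikitext, flags=re.IGNORECASE)
    text = re.sub(r"<ref\b[^>]*>.*?</ref>", "", text, flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<[^>]+>", "", text)

    return {
        "characters": len(text),
        "words": len(text.split()),
        "references": references,
        "headings": headings,
    }

# Hypothetical sample article, made up for this example.
sample = """'''Example''' is a hypothetical article.<ref>Source A</ref>

== History ==
Some text here.<ref name="b">Source B</ref> More text.<ref name="b" />

== References ==
<references />
"""

measures = extract_volume_measures(sample)
print(measures)
```

Regular expressions are only a rough approximation of wikitext parsing; templates, comments and nested markup would need a real parser.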

The task of automatic quality assessment can be solved by machine learning algorithms, especially by classification models based on the comparison of Wikipedia articles with different quality grades that were assigned by Wikipedia users. Such models can use over 200 quality measures related to completeness, credibility, objectivity, reliability, readability, relevance, style and timeliness. Some of them are language-dependent and can be obtained by using NLP techniques. Additionally, such models can use SEO-related metrics: visibility index, PageRank, CheiRank, 2D rank, social signals and others.
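
The idea of learning from articles that users have already graded can be sketched with a very small classifier. The example below is an assumption-laden illustration, not the model used in the literature: it uses a nearest-centroid classifier written in plain Python, only three hypothetical measures (length in characters, number of references, number of authors) instead of the 200 or so mentioned above, and training values that are invented for the example rather than measured on Wikipedia.

```python
from statistics import mean, pstdev

def fit_nearest_centroid(samples, labels):
    """Learn per-feature scaling and one centroid per quality grade."""
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a feature has zero spread, so division is always defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in samples]

    centroids = {}
    for grade in set(labels):
        rows = [row for row, label in zip(scaled, labels) if label == grade]
        centroids[grade] = [mean(col) for col in zip(*rows)]
    return means, stds, centroids

def predict(model, sample):
    """Return the grade whose centroid is closest to the scaled sample."""
    means, stds, centroids = model
    scaled = [(x - m) / s for x, m, s in zip(sample, means, stds)]

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(scaled, centroid))

    return min(centroids, key=lambda grade: squared_distance(centroids[grade]))

# Invented training data: [length in characters, references, authors].
# These numbers are illustrative only and are NOT real Wikipedia measurements.
training_samples = [
    [60000, 150, 400],
    [45000, 120, 300],
    [52000, 95, 250],
    [1500, 1, 3],
    [900, 0, 2],
    [2500, 2, 5],
]
training_labels = ["FA", "FA", "FA", "Stub", "Stub", "Stub"]

model = fit_nearest_centroid(training_samples, training_labels)
print(predict(model, [48000, 110, 280]))
print(predict(model, [1200, 1, 4]))
```

Features are standardized before distances are computed because the measures live on very different scales; without that step, article length would dominate the other two. A realistic setup would use many more measures, more grades and a stronger model, and would be trained separately per language version, since each version applies its own standards.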

More information can be found in scientific publications. Some of the results are implemented in different tools.