Snowball stemming library collection for Python

snowballstemmer-2.2.0-1-any

This package provides 29 stemmers for 28 languages generated from Snowball algorithms.

Stemming maps different forms of the same word to a common *stem* - for example, the English stemmer maps *connection*, *connections*, *connective*, *connected*, and *connecting* to *connect*. So a search for *connected* would also find documents which only have the other forms.
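As a rough illustration of why this helps search, the sketch below indexes documents by stem rather than by surface form. The names `toy_stem`, `build_index`, and `search` are hypothetical helpers, and `toy_stem` is only a lookup table built from the five word forms listed above; it is a stand-in, not the Snowball algorithm.

```python
from collections import defaultdict

# Stand-in stemmer for illustration only: a lookup table holding the
# five word forms from the example above. This is NOT the Snowball algorithm.
TOY_STEMS = {
    form: "connect"
    for form in ("connection", "connections", "connective",
                 "connected", "connecting")
}


def toy_stem(word):
    """Return the stem for a known form, else the lowercased word."""
    word = word.lower()
    return TOY_STEMS.get(word, word)


def build_index(docs, stem):
    """Map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[stem(word)].add(doc_id)
    return index


def search(index, query, stem):
    """Stem the query the same way as the documents, then look it up."""
    return sorted(index.get(stem(query), set()))


docs = {
    1: "the connection was lost",
    2: "connecting two devices",
    3: "an unrelated note",
}
index = build_index(docs, toy_stem)

# Neither document 1 nor 2 contains the literal word "connected".
print(search(index, "connected", toy_stem))  # [1, 2]
```

With this package installed, `snowballstemmer.stemmer("english").stemWord` could be passed in place of `toy_stem`; the indexing and lookup logic stays the same.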

This stem form is often a word itself, but not always: text search systems, which are the intended field of use, do not require it. We also aim to conflate words with the same meaning, rather than all words with a common linguistic root (so *awe* and *awful* don't have the same stem). Because over-stemming is more problematic than under-stemming, we tend not to stem in cases that are hard to resolve. If you want to always reduce words to a root form, or to get a root form which is itself a word, then Snowball's stemming algorithms likely aren't the right answer.

Name: snowballstemmer
Repository: HaikuPorts
Repository source: haikuports_x86_64
Version: 2.2.0-1
Download size: 1.2 KB
Source code available: Yes
Categories: None
Version views: 5