docs: replace retired US_UK_PROFI with US_UK outside benchmarking history
@@ -54,7 +54,7 @@ Radixor is especially attractive when you want something more adaptable than sim

 Radixor includes a JMH benchmark suite for both its own algorithmic core and a side-by-side English comparison against the Snowball Porter stemmer family.

-On the current English comparison workload, Radixor with bundled `US_UK_PROFI` reaches approximately **31 to 32 million tokens per second**. Snowball original Porter reaches approximately **8 million tokens per second**, and Snowball English (Porter2) approximately **5 to 5.5 million tokens per second**.
+On the current English comparison workload, Radixor with bundled `US_UK` reaches approximately **31 to 32 million tokens per second**. Snowball original Porter reaches approximately **8 million tokens per second**, and Snowball English (Porter2) approximately **5 to 5.5 million tokens per second**.

 That places Radixor at approximately:

@@ -137,7 +137,7 @@ The repository keeps the front page concise and places detailed documentation un
 A practical first guide to loading, compiling, and using Radixor.

 - [Built-in Languages](docs/built-in-languages.md)
-Overview of bundled language resources such as `US_UK` and `US_UK_PROFI`.
+Overview of bundled language resources such as `US_UK`.

 - [Dictionary Format](docs/dictionary-format.md)
 How to write and normalize stemming dictionaries.

@@ -13,7 +13,7 @@ The benchmark suite currently covers two categories:

 The comparison benchmark processes the same deterministic English token stream through:

-- Radixor with bundled `US_UK_PROFI`,
+- Radixor with bundled `US_UK` (older benchmark snapshots used the now-retired `US_UK_PROFI` resource),
 - Snowball original Porter,
 - Snowball English, commonly referred to as Porter2.

@@ -37,7 +37,7 @@ For that reason, the published badge values should be treated primarily as a com

 A recent JMH run on JDK 21.0.10 with JMH 1.37, one thread, three warmup iterations, and five measurement iterations produced the following approximate throughput ranges:

-| Workload | Radixor `US_UK_PROFI` | Snowball Porter | Snowball English |
+| Workload | Radixor `US_UK` *(historical runs: `US_UK_PROFI`)* | Snowball Porter | Snowball English |
 | --- | ---: | ---: | ---: |
 | About 12,000 generated tokens | 30.99 M tokens/s | 8.21 M tokens/s | 5.46 M tokens/s |
 | About 60,000 generated tokens | 32.25 M tokens/s | 8.02 M tokens/s | 5.11 M tokens/s |
@@ -83,7 +83,7 @@ The workload intentionally mixes:
 - simple inflections,
 - common derivational forms,
 - US and UK spelling families,
-- lexical forms appropriate for `US_UK_PROFI`.
+- lexical forms appropriate for the current bundled `US_UK` resource (with historical continuity from earlier `US_UK_PROFI` runs).

 This design keeps runs reproducible across environments and avoids accidental drift caused by changing external corpora.

@@ -21,7 +21,7 @@ public final class BundledLanguageExample {

     public static void main(final String[] arguments) throws IOException {
         final FrequencyTrie<String> trie = StemmerPatchTrieLoader.load(
-                StemmerPatchTrieLoader.Language.US_UK_PROFI,
+                StemmerPatchTrieLoader.Language.US_UK,
                 true,
                 ReductionMode.MERGE_SUBTREES_WITH_EQUIVALENT_RANKED_GET_ALL_RESULTS);
     }
@@ -32,7 +32,7 @@ public final class BundledStemmerExample {

     public static void main(final String[] arguments) throws IOException {
         final FrequencyTrie<String> trie = StemmerPatchTrieLoader.load(
-                StemmerPatchTrieLoader.Language.US_UK_PROFI,
+                StemmerPatchTrieLoader.Language.US_UK,
                 true,
                 ReductionMode.MERGE_SUBTREES_WITH_EQUIVALENT_RANKED_GET_ALL_RESULTS);

@@ -104,7 +104,7 @@ public final class SingleStemExample {

     public static void main(final String[] arguments) throws IOException {
         final FrequencyTrie<String> trie = StemmerPatchTrieLoader.load(
-                StemmerPatchTrieLoader.Language.US_UK_PROFI,
+                StemmerPatchTrieLoader.Language.US_UK,
                 true,
                 ReductionMode.MERGE_SUBTREES_WITH_EQUIVALENT_RANKED_GET_ALL_RESULTS);

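Note on the benchmark figures touched by this change: the README sentence "That places Radixor at approximately:" introduces relative speedups that are not visible in the hunks shown here. As a rough cross-check, the sketch below simply divides the throughput values from the benchmarking table (30.99 and 32.25 M tokens/s for Radixor, 8.21 and 8.02 for Snowball Porter, 5.46 and 5.11 for Snowball English). The class name `ThroughputRatios` and its `ratio` helper are hypothetical and are not part of Radixor; the result is only as precise as the rounded table values.

```java
import java.util.Locale;

public final class ThroughputRatios {

    // Throughput in million tokens per second, copied from the benchmarking table.
    // Index 0: about 12,000 generated tokens; index 1: about 60,000 generated tokens.
    static final double[] RADIXOR = {30.99, 32.25};
    static final double[] SNOWBALL_PORTER = {8.21, 8.02};
    static final double[] SNOWBALL_ENGLISH = {5.46, 5.11};

    private ThroughputRatios() {
        // Utility class, not meant to be instantiated.
    }

    /** Returns how many times higher {@code candidate} throughput is than {@code baseline}. */
    static double ratio(final double candidate, final double baseline) {
        if (baseline <= 0.0) {
            throw new IllegalArgumentException("baseline throughput must be positive");
        }
        return candidate / baseline;
    }

    public static void main(final String[] arguments) {
        final String[] workloads = {"about 12,000 tokens", "about 60,000 tokens"};
        for (int i = 0; i < workloads.length; i++) {
            // Print one line per workload with both relative speedups.
            System.out.printf(
                    Locale.ROOT,
                    "%s: %.1fx vs Snowball Porter, %.1fx vs Snowball English%n",
                    workloads[i],
                    ratio(RADIXOR[i], SNOWBALL_PORTER[i]),
                    ratio(RADIXOR[i], SNOWBALL_ENGLISH[i]));
        }
    }
}
```

With the table values as given, the ratios come out at a little under 4x against Snowball Porter on the smaller workload and a little over 4x on the larger one, and roughly 5.7x to 6.3x against Snowball English. Renaming the bundled resource from `US_UK_PROFI` to `US_UK` does not change these inputs, since the table values themselves are untouched by this commit.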