Commit Graph

18 Commits

SHA1 Message Date
39969463a2 fix: filter phrase entries from stemmer dictionary generation 2026-04-26 15:03:41 +02:00
9eee321fef feat(trie): add diacritic processing modes with strip normalization 2026-04-24 00:43:43 +02:00
3e0f786042 fix: regression-golden updated to the latest data format 2026-04-23 23:51:47 +02:00
041b7f43fb Practical improvements
fix: cli-compilation doc is missing some params
chore: ExperimentCli is not relevant for JaCoCo
feat: human-readable format of trie metadata
fix: add some new JUnit tests
2026-04-23 23:43:25 +02:00
8785f2b7cb feat: Apply metadata-driven case normalization in get/getAll 2026-04-23 22:32:05 +02:00
4d939f5b6e feat: Prepare TrieMetadata and new stemmer data integration 2026-04-23 20:21:46 +02:00
a9d15fa3ae test: add regression coverage for trailing SKIP omission in patch encoding 2026-04-20 00:08:07 +02:00
db446932fc docs: refine footer branding and improve Javadoc overview
- remove Material for MkDocs generator branding from the site footer
- keep footer presentation aligned with the project's professional documentation style
- improve Javadoc overview content for the API landing page
- align Javadoc introductory text with the main project site messaging
- clarify project scope, documentation purpose, and license information
2026-04-18 15:04:37 +02:00
7e1aea72bf refactor: apply minor Radixor refinements and refresh dependency locks 2026-04-16 21:31:01 +02:00
594abe2c4b feat: add jqwik property-based coverage for trie and patch invariants
test: add property-based tests for FrequencyTrie determinism across repeated compilation
test: verify semantic alignment of get(), getAll(), and getEntries()
test: verify binary serialization and compressed persistence round-trip stability
test: verify builder reconstruction preserves observable trie behavior
test: add property-based tests for PatchCommandEncoder encode/apply round-trip and determinism
test: add generated stemmer-trie properties ensuring returned patches reconstruct only acceptable stems
test: introduce bounded reusable jqwik generators and scenario builders for maintainable property coverage
build: add jqwik to test dependencies and integrate it with the existing JUnit Platform setup
test: replace Jupiter display and tag annotations in jqwik suites with jqwik-native metadata to remove discovery warnings
2026-04-16 19:40:29 +02:00
953ce2226a feat(test): add deterministic fuzz-style coverage for trie compilation and stemming
* add fixed-seed fuzz scenario generator for bounded trie and dictionary inputs
* validate compilation stability across repeated builds and binary round-trips
* validate generated stemming dictionaries for non-crashing compilation and acceptable stem reconstruction
* add CI-safe semantic invariants for reduced trie reconstruction using get() and getAll()
* avoid unstable count-preservation assertions for builder reconstruction from reduced shared tries
2026-04-16 18:51:39 +02:00
5730babd06 feat: add Maven Central packaging and release publishing
refactor: move Maven POM and publication logic into gradle/maven-pom.gradle
feat: publish signed mavenJava artifacts with sources and Javadoc jars
feat: add Central staging, checksum generation, and centralBundle packaging
feat: add packageReleaseCandidate task for clean local release verification
docs: define Maven POM metadata for org.egothor:radixor
docs: switch project licensing metadata and repository license file to BSD-3-Clause
ci: build signed Central bundle in tagged release workflow
ci: upload Central bundle to Maven Central via Sonatype Portal API
ci: attach Central bundle to GitHub release assets
2026-04-16 02:00:59 +02:00
56d5da6b95 feat: add end-to-end Compile CLI integration tests and normalize us_uk dictionary encoding
test: add CompileIntegrationTest for remark-aware fixture and bundled dictionaries
test: verify compilation, gzip serialization, reload, overwrite handling, and lookup semantics
test: cover store-original behavior with dedicated remark-aware test resource
fix: normalize us_uk stemmer dictionary entry encoding for UTF-8 CLI parsing
fix: unblock compilation of bundled dictionaries through Compile integration workflow
2026-04-14 21:25:39 +02:00
ad8fe0ea1b feat: add deterministic compiled-trie artifact regression tooling
test: add deterministic regression coverage for compiled trie artifacts
test: add golden artifact resources and SHA-256 sidecar validation
test: add compiled trie artifact generator utility for regression preparation
build: add Gradle task for regression artifact generation
chore: add bash script to generate golden compiled trie regression files
fix: normalize SHA-256 sidecar output to use artifact basename only
fix: harden test resource loading for regression classpath access
fix: reconstruct stems from patch commands in golden artifact semantic probes
2026-04-14 19:12:51 +02:00
6b3559097a feat: add JMH comparison benchmarks for Radixor vs Snowball Porter stemmers
build: isolate Snowball benchmark integration into dedicated Gradle script
docs: highlight benchmarked throughput advantage in README
docs: add detailed benchmarking guide and execution notes
2026-04-14 18:25:41 +02:00
85e33f2f60 feat: add JMH benchmarks 2026-04-14 02:40:30 +02:00
038514bad0 Refine stemmer core, compiled trie workflow, tests, and public documentation
feat: implement Compile CLI for building binary stemmer tables from source dictionaries
feat: add loading support for persisted compiled tries, including GZip-compressed binaries
feat: add a builder path for recreating a writable trie from a compiled trie
feat: expose read-only value/count access for compiled trie entries
feat: support deterministic NOOP patch encoding for identical source and target words

fix: make value selection deterministic for equal frequencies using length and lexical tie-breakers
fix: preserve valid alternative reductions during trie optimization and reduction
fix: correct patch command edge cases discovered in round-trip and malformed-input tests
fix: address persistence and compiled-trie handling defects found during implementation review
fix: resolve test failures and behavioral regressions uncovered by PMD and JUnit runs

refactor: reorganize trie-related support types into dedicated packages and classes
refactor: simplify the core FrequencyTrie design toward a cleaner practical architecture
refactor: improve compiled/read-only trie boundaries without restoring mutability
refactor: clean up internal reduction, serialization, and helper structure

test: add professional JUnit coverage for stemmer core classes
test: split trie tests into dedicated test classes per production type
test: improve parameterized tests for readability, diagnostics, and edge-case traceability
test: cover positive, negative, malformed, persistence, and round-trip scenarios
test: verify compiled dictionaries against source inputs using getAll semantics

docs: write public README and supplementary Markdown documentation for project publishing
docs: document architecture, reduction model, built-in languages, and operational guidance
docs: clarify reverse-word storage, mutable construction, and compiled-trie runtime behavior
docs: remove placeholders, vague buzzwords, and unexplained terminology from the documentation
docs: improve examples and wording for professional reader-facing project guidance

chore: align project materials with the practical Radix scope and Egothor/Stempel lineage
chore: raise overall project quality through documentation review and test hardening
2026-04-13 02:10:46 +02:00
15248c92c9 Eclipse IDE configuration and setup 2026-04-12 13:15:27 +02:00