Examples

Lindera includes several example programs that demonstrate common use cases. The source code is available in the examples directory on GitHub.

Before running the examples, download a pre-built IPADIC dictionary from GitHub Releases and extract it to a local directory.

Available Examples

tokenize

Basic tokenization using an external IPADIC dictionary. Segments input text and prints each token with its part-of-speech details.

cargo run --example=tokenize

tokenize_with_user_dict

Tokenization with a user dictionary. Shows how to supplement the dictionary with custom entries for domain-specific terms.

cargo run --example=tokenize_with_user_dict

tokenize_with_filters

Tokenization with character filters and token filters. Demonstrates the text processing pipeline, including Unicode normalization, part-of-speech filtering, and other transformations.

cargo run --example=tokenize_with_filters

tokenize_with_config

Tokenization using a YAML configuration file. Shows how to configure the tokenizer declaratively instead of programmatically.

cargo run --example=tokenize_with_config