Installation
Install via Cargo
You can install the binary via cargo:
% cargo install lindera-cli
Download from GitHub Releases
Alternatively, you can download a pre-built binary from the release page:
Obtaining Dictionaries
Lindera does not bundle dictionaries with the binary. You need to download a pre-built dictionary separately from the GitHub Releases page:
# Example: download and extract the IPADIC dictionary
% curl -LO https://github.com/lindera/lindera/releases/download/<version>/lindera-ipadic-<version>.zip
% unzip lindera-ipadic-<version>.zip -d /path/to/ipadic
Then specify the dictionary path when using the CLI:
% lindera tokenize --dictionary /path/to/ipadic "関西国際空港限定トートバッグ"
Build from Source
Build without dictionaries (default)
Build a binary containing only the tokenizer and trainer without embedded dictionaries:
% cargo build --release
Build with all features
% cargo build --release --all-features
Build with Embedded Dictionaries (Advanced)
For advanced users who want to embed dictionaries directly into the binary, use the embed-* feature flags. This eliminates the need for external dictionary files at runtime but increases the binary size.
IPADIC (Japanese dictionary)
% cargo build --release --features=embed-ipadic
IPADIC NEologd (Japanese dictionary)
% cargo build --release --features=embed-ipadic-neologd
UniDic (Japanese dictionary)
% cargo build --release --features=embed-unidic
ko-dic (Korean dictionary)
% cargo build --release --features=embed-ko-dic
CC-CEDICT (Chinese dictionary)
% cargo build --release --features=embed-cc-cedict
Jieba (Chinese dictionary)
% cargo build --release --features=embed-jieba
[!TIP] After building with an
embed-*feature flag, use theembedded://scheme to load the embedded dictionary:% lindera tokenize --dictionary embedded://ipadic "関西国際空港限定トートバッグ"See Feature Flags for details.