Installation

Installing from PyPI

Pre-built wheels are available on PyPI:

pip install lindera-python

[!NOTE] The PyPI package does not include dictionaries. See Obtaining Dictionaries below.

Obtaining Dictionaries

Lindera does not bundle dictionaries with the package. You need to obtain a pre-built dictionary separately.

Download from GitHub Releases

Pre-built dictionaries are available on the GitHub Releases page. Download and extract the dictionary archive to a local directory:

# Example: download and extract the IPADIC dictionary
curl -LO https://github.com/lindera/lindera/releases/download/<version>/lindera-ipadic-<version>.zip
unzip lindera-ipadic-<version>.zip -d /path/to/ipadic
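
The extraction step can also be scripted from Python, for example in a provisioning script. The following is a minimal sketch using only the standard library. The helper name extract_dictionary is hypothetical (it is not part of lindera), and the sketch assumes the archive has already been downloaded:

```python
import zipfile
from pathlib import Path


def extract_dictionary(archive_path: str, target_dir: str) -> Path:
    """Extract a downloaded dictionary archive into target_dir.

    Hypothetical helper, not part of lindera. It mirrors
    `unzip <archive> -d <target_dir>`.
    """
    archive = Path(archive_path)
    target = Path(target_dir)
    # Fail early on a missing, truncated, or non-zip download.
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a valid zip archive")
    # Create the destination directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

Calling extract_dictionary("lindera-ipadic-<version>.zip", "/path/to/ipadic") would then populate the same directory as the unzip command above.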

Building from Source

To build from source (e.g., to enable specific feature flags), you need:

  • Python 3.10 or later (up to 3.14)
  • Rust toolchain -- Install via rustup
  • maturin -- Python package for building Rust-based Python extensions

Install maturin with pip:

pip install maturin
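
Before building, it can help to confirm that the prerequisites are in place. The sketch below checks the interpreter version against the range listed above and looks for the build tools on PATH. The function names are hypothetical, and it assumes that rustup exposes an executable named cargo and that pip installs an executable named maturin:

```python
import shutil
import sys

# Supported range taken from the prerequisites above: 3.10 through 3.14.
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 14)


def python_version_supported(version=None) -> bool:
    """Return True if the (major, minor) version is within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def missing_build_tools(tools=("cargo", "maturin")) -> list:
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("Missing build tools:", missing_build_tools() or "none")
```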

Development Build

Build and install lindera-python in development mode:

cd lindera-python
maturin develop

Or use the project Makefile:

make python-develop

Build with Training Support

The train feature enables CRF-based dictionary training. It is enabled by default; to request it explicitly, pass the feature flag:

maturin develop --features train

Feature Flags

Feature               Description                                                         Default
--------------------  ------------------------------------------------------------------  --------
train                 CRF training functionality                                          Enabled
embed-ipadic          Embed Japanese dictionary (IPADIC) into the binary                  Disabled
embed-unidic          Embed Japanese dictionary (UniDic) into the binary                  Disabled
embed-ipadic-neologd  Embed Japanese dictionary (IPADIC NEologd) into the binary          Disabled
embed-ko-dic          Embed Korean dictionary (ko-dic) into the binary                    Disabled
embed-cc-cedict       Embed Chinese dictionary (CC-CEDICT) into the binary                Disabled
embed-jieba           Embed Chinese dictionary (Jieba) into the binary                    Disabled
embed-cjk             Embed all CJK dictionaries (IPADIC, ko-dic, Jieba) into the binary  Disabled

Multiple features can be combined:

maturin develop --features "train,embed-ipadic,embed-ko-dic"
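
When builds are driven from a script, the feature list can be validated before maturin is invoked. This is a minimal sketch with a hypothetical helper, maturin_develop_command. The feature names are copied from the Feature Flags table above, and the helper only assembles the argument list without running anything:

```python
# Feature names copied from the Feature Flags table above.
KNOWN_FEATURES = {
    "train",
    "embed-ipadic",
    "embed-unidic",
    "embed-ipadic-neologd",
    "embed-ko-dic",
    "embed-cc-cedict",
    "embed-jieba",
    "embed-cjk",
}


def maturin_develop_command(features):
    """Build the `maturin develop` argument list for the given feature names.

    Hypothetical helper: raises ValueError on names missing from the table.
    """
    features = list(features)
    unknown = sorted(set(features) - KNOWN_FEATURES)
    if unknown:
        raise ValueError(f"unknown feature(s): {', '.join(unknown)}")
    command = ["maturin", "develop"]
    if features:
        # maturin accepts a comma-separated list after --features.
        command += ["--features", ",".join(features)]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True, cwd="lindera-python") to run the build.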

[!TIP] To embed a dictionary directly into the binary (advanced usage), enable the corresponding embed-* feature flag at build time and load the dictionary with the embedded:// scheme:

from lindera import load_dictionary

# Requires a build with the embed-ipadic feature enabled
dictionary = load_dictionary("embedded://ipadic")

See the Feature Flags table above for details.

Verifying the Installation

After installation, verify that lindera is available in Python:

import lindera

print(lindera.version())
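
In automated setups (CI jobs, container builds), a check that reports a missing package instead of raising ImportError may be more convenient. The sketch below is one way to do that with the standard library. The helper name installed_version is hypothetical, and it relies only on the lindera.version() call shown above:

```python
import importlib
import importlib.util


def installed_version(module_name: str = "lindera"):
    """Return the value of the module's version() function.

    Returns None if the module cannot be found or does not expose
    a callable named `version`.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    version = getattr(module, "version", None)
    return version() if callable(version) else None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"lindera {found}")
    else:
        print("lindera is not importable in this environment")
```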