Installation

[!NOTE] lindera-ruby is not yet published to RubyGems. You need to build from source.

Prerequisites

  • Ruby 3.1 or later
  • Rust toolchain -- Install via rustup
  • Bundler -- Ruby dependency manager (gem install bundler)
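
The prerequisites above can be checked from Ruby before starting a build. The following is a minimal sketch and not part of lindera-ruby: the helper names `ruby_new_enough?` and `tool_on_path?` are hypothetical, and it assumes the Rust toolchain installed by rustup exposes `cargo` on PATH.

```ruby
# Hypothetical pre-build check; not part of lindera-ruby.
require 'rubygems'

MINIMUM_RUBY = Gem::Version.new('3.1')

# True when the given Ruby version string meets the 3.1 minimum.
def ruby_new_enough?(version = RUBY_VERSION)
  Gem::Version.new(version) >= MINIMUM_RUBY
end

# True when an executable file with this name exists in a PATH directory.
# Assumption: plain executable names (no .exe handling for Windows).
def tool_on_path?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

if __FILE__ == $PROGRAM_NAME
  puts "Ruby #{RUBY_VERSION}: #{ruby_new_enough? ? 'ok' : 'too old (need 3.1+)'}"
  %w[cargo bundle].each do |tool|
    puts "#{tool}: #{tool_on_path?(tool) ? 'found' : 'missing'}"
  end
end
```

A missing `cargo` usually means the Rust toolchain is not installed or the shell has not picked up the PATH changes made by rustup.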

Obtaining Dictionaries

Lindera does not bundle dictionaries with the package by default. Obtain a pre-built dictionary separately, or embed one at build time with an embed-* feature flag (see Feature Flags).

Download from GitHub Releases

Pre-built dictionaries are available on the GitHub Releases page. Download and extract the dictionary archive to a local directory:

# Example: download and extract the IPADIC dictionary
curl -LO https://github.com/lindera/lindera/releases/download/<version>/lindera-ipadic-<version>.zip
unzip lindera-ipadic-<version>.zip -d /path/to/ipadic
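
If the download is scripted from Ruby, the same URL can be assembled from its two variable parts. This is a sketch under stated assumptions: `dictionary_archive_url` and `download_commands` are hypothetical helpers, only the URL template comes from the example above, and it assumes archives for other dictionaries follow the same `lindera-<name>-<version>.zip` naming. Check the Releases page for the exact tag and file names.

```ruby
# Hypothetical helpers; only the URL template comes from the shell example.
RELEASE_BASE = 'https://github.com/lindera/lindera/releases/download'

# Builds the archive URL for a dictionary name and release version.
# Both arguments are placeholders to fill in from the Releases page.
def dictionary_archive_url(name, version)
  if name.to_s.empty? || version.to_s.empty?
    raise ArgumentError, 'name and version must be non-empty'
  end

  "#{RELEASE_BASE}/#{version}/lindera-#{name}-#{version}.zip"
end

# Returns the curl and unzip argument lists that mirror the shell example.
def download_commands(name, version, dest_dir)
  url = dictionary_archive_url(name, version)
  [
    ['curl', '-LO', url],
    ['unzip', File.basename(url), '-d', dest_dir]
  ]
end
```

Each returned list can be passed to `system(*cmd, exception: true)`, which avoids shell quoting issues and raises if a command fails.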

Development Build

Build and install lindera-ruby in development mode:

cd lindera-ruby
bundle install
bundle exec rake compile

Or use the project Makefile:

make ruby-develop

Build with Training Support

The train feature enables CRF-based dictionary training. Enable it at compile time:

LINDERA_FEATURES="train" bundle exec rake compile

Feature Flags

Features are specified through the LINDERA_FEATURES environment variable as a comma-separated list.

| Feature | Description | Default |
| --- | --- | --- |
| train | CRF training functionality | Disabled |
| embed-ipadic | Embed Japanese dictionary (IPADIC) into the binary | Disabled |
| embed-unidic | Embed Japanese dictionary (UniDic) into the binary | Disabled |
| embed-ipadic-neologd | Embed Japanese dictionary (IPADIC NEologd) into the binary | Disabled |
| embed-ko-dic | Embed Korean dictionary (ko-dic) into the binary | Disabled |
| embed-cc-cedict | Embed Chinese dictionary (CC-CEDICT) into the binary | Disabled |
| embed-jieba | Embed Chinese dictionary (Jieba) into the binary | Disabled |
| embed-cjk | Embed all CJK dictionaries (IPADIC, ko-dic, Jieba) into the binary | Disabled |

Multiple features can be combined:

LINDERA_FEATURES="train,embed-ipadic,embed-ko-dic" bundle exec rake compile
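
When the build is driven from a Ruby script, the comma-separated value can be assembled and validated before compiling. The sketch below is illustrative only: `lindera_features` and `compile_with` are hypothetical helpers, and the list of accepted names is simply copied from the features listed above.

```ruby
# Hypothetical build helper; flag names are copied from the feature list above.
KNOWN_FEATURES = %w[
  train embed-ipadic embed-unidic embed-ipadic-neologd
  embed-ko-dic embed-cc-cedict embed-jieba embed-cjk
].freeze

# Joins feature names into the comma-separated form used by LINDERA_FEATURES,
# rejecting any name that is not in the list above.
def lindera_features(*names)
  names = names.flatten.map(&:to_s).uniq
  unknown = names - KNOWN_FEATURES
  unless unknown.empty?
    raise ArgumentError, "unknown feature(s): #{unknown.join(', ')}"
  end

  names.join(',')
end

# Runs the documented compile command with the variable set for that
# child process only; raises if the command fails.
def compile_with(*names)
  system({ 'LINDERA_FEATURES' => lindera_features(*names) },
         'bundle', 'exec', 'rake', 'compile', exception: true)
end
```

Calling `compile_with('train', 'embed-ipadic', 'embed-ko-dic')` from the lindera-ruby directory should be equivalent to the shell command above. Validating first catches typos in flag names before a long Rust build starts.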

[!TIP] If you want to embed a dictionary directly into the binary (advanced usage), enable the corresponding embed-* feature flag and load it using the embedded:// scheme:

dictionary = Lindera.load_dictionary("embedded://ipadic")

See the Feature Flags table above for the available embed-* flags.
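
A script that should work with or without an embedded dictionary can try several locations in order. This is a sketch, not the library's documented behavior: `first_loadable_dictionary` is a hypothetical helper, it assumes load failures surface as StandardError subclasses, and passing a filesystem path such as /path/to/ipadic to `Lindera.load_dictionary` is an assumption rather than something confirmed here.

```ruby
# Sketch only. Lindera.load_dictionary and the embedded:// scheme come from
# the tip above; loading from a filesystem path is an assumption.

# Tries each location in order and returns the first dictionary that loads.
# `loader` is any callable that takes one location string.
def first_loadable_dictionary(locations, loader)
  errors = []
  locations.each do |location|
    begin
      return loader.call(location)
    rescue StandardError => e
      # Remember why this location failed and move on to the next one.
      errors << "#{location}: #{e.message}"
    end
  end
  raise ArgumentError, "no dictionary could be loaded (#{errors.join('; ')})"
end
```

With lindera installed, a call might look like `first_loadable_dictionary(['embedded://ipadic', '/path/to/ipadic'], ->(loc) { Lindera.load_dictionary(loc) })`. Passing the loader as a callable keeps the helper usable in tests without the native extension.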

Verifying the Installation

After installation, verify that lindera is available in Ruby:

require 'lindera'

puts Lindera.version
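
If the extension did not build or is not on the load path, the snippet above fails with a LoadError. A slightly more defensive check is sketched below; `lindera_status` is a hypothetical helper, and only `require 'lindera'` and `Lindera.version` are taken from the snippet above.

```ruby
# Sketch of a friendlier check; not part of lindera-ruby.

# Returns a one-line status string instead of letting LoadError propagate.
def lindera_status
  require 'lindera'
  "lindera #{Lindera.version} loaded"
rescue LoadError => e
  "lindera not loadable: #{e.message}"
end

puts lindera_status if __FILE__ == $PROGRAM_NAME
```

With a development build, you may need to run the check through `bundle exec` from the lindera-ruby directory so that the compiled extension is found.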