Architecture

Module Structure

lindera-crf/src/
├── lib.rs                # Public API re-exports
├── feature.rs            # FeatureSet, FeatureProvider
├── lattice.rs            # Edge, Node, Lattice
├── model.rs              # RawModel, MergedModel, Model trait
├── trainer.rs            # Trainer, Regularization enum
├── errors.rs             # Error types
├── forward_backward.rs   # Forward-backward algorithm
├── math.rs               # Mathematical utilities (logsumexp)
├── optimizers/
│   └── lbfgs.rs          # L-BFGS optimization
└── utils.rs              # Utility traits

Key Components

FeatureProvider / FeatureSet

Manage per-label feature sets. Each FeatureSet holds unigram features and left/right bigram features for a given label. FeatureProvider aggregates FeatureSet instances and maps feature IDs to weights.
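
A rough sketch of how these two types could be laid out; the field names and the unigram_score helper are illustrative assumptions, not the crate's actual API.

    // Illustrative layout only.
    #[derive(Debug, Default, Clone)]
    pub struct FeatureSet {
        pub unigram: Vec<u32>,       // unigram feature IDs for this label
        pub bigram_left: Vec<u32>,   // bigram feature IDs w.r.t. the label on the left
        pub bigram_right: Vec<u32>,  // bigram feature IDs w.r.t. the label on the right
    }

    #[derive(Debug, Default)]
    pub struct FeatureProvider {
        pub sets: Vec<FeatureSet>,   // one FeatureSet per label, indexed by label ID
        pub weights: Vec<f64>,       // weight for each feature ID
    }

    impl FeatureProvider {
        // Sum of the weights of a label's unigram features (hypothetical helper).
        pub fn unigram_score(&self, label: usize) -> f64 {
            self.sets[label]
                .unigram
                .iter()
                .map(|&id| self.weights[id as usize])
                .sum()
        }
    }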

Lattice / Edge / Node

Lattice structure with variable-length edges for sequence labeling. Edge represents a candidate span with a label, while Node aggregates edges at a given position. The Lattice is constructed from input data and used by the model to find the best path.
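
A minimal sketch of the three types under the assumption that edges are anchored at the position where they end; all names and fields are illustrative, not the crate's actual definitions.

    #[derive(Debug, Clone)]
    pub struct Edge {
        pub start: usize, // start of the span
        pub end: usize,   // end of the span (exclusive), so spans vary in length
        pub label: usize, // candidate label ID for the span
    }

    #[derive(Debug, Default)]
    pub struct Node {
        pub edges: Vec<Edge>, // candidate edges anchored at this position
    }

    #[derive(Debug)]
    pub struct Lattice {
        pub nodes: Vec<Node>, // one node per input position, plus the end of input
    }

    impl Lattice {
        // Build an empty lattice for an input of `len` positions.
        pub fn new(len: usize) -> Self {
            Lattice { nodes: (0..=len).map(|_| Node::default()).collect() }
        }

        // Register a candidate span at the position where it ends.
        pub fn add_edge(&mut self, edge: Edge) {
            self.nodes[edge.end].edges.push(edge);
        }
    }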

Trainer

Trains a CRF model using L-BFGS optimization with configurable regularization. The trainer accepts labeled lattice examples, computes gradients via the forward-backward algorithm, and iteratively updates model weights.
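
A sketch of the quantities behind one update, assuming an L2 penalty; the observed and expected feature counts would come from the training data and the forward-backward pass, and a plain gradient step stands in for L-BFGS to keep the example short. All names are illustrative.

    // Per-feature gradient of the regularized negative log-likelihood
    // (hypothetical helper; the crate's internals may differ).
    fn gradient_component(observed: f64, expected: f64, weight: f64, lambda: f64) -> f64 {
        // d/dw of [-log p(y|x) + lambda * w^2]
        -(observed - expected) + 2.0 * lambda * weight
    }

    // A plain gradient step, shown in place of L-BFGS for brevity.
    fn update_weights(weights: &mut [f64], gradient: &[f64], step_size: f64) {
        for (w, g) in weights.iter_mut().zip(gradient) {
            *w -= step_size * *g;
        }
    }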

Regularization

Configurable regularization strategies (a sketch of the enum follows the list):

  • L1: Sparse models via L1 penalty
  • L2: Smooth models via L2 penalty
  • ElasticNet: Combines L1 and L2 with a configurable l1_ratio
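
One way such an enum and its penalty term could look; the variant payloads and the penalty helper are assumptions for illustration, not the crate's definition.

    // Illustrative shape of the regularization options.
    pub enum Regularization {
        // L1 penalty lambda * sum(|w|): drives many weights to zero.
        L1 { lambda: f64 },
        // L2 penalty lambda * sum(w^2): keeps weights small and smooth.
        L2 { lambda: f64 },
        // Convex mix of L1 and L2 controlled by l1_ratio in [0, 1].
        ElasticNet { lambda: f64, l1_ratio: f64 },
    }

    impl Regularization {
        // Penalty added to the training objective for weight vector `w`.
        pub fn penalty(&self, w: &[f64]) -> f64 {
            let l1: f64 = w.iter().map(|x| x.abs()).sum();
            let l2: f64 = w.iter().map(|x| x * x).sum();
            match *self {
                Regularization::L1 { lambda } => lambda * l1,
                Regularization::L2 { lambda } => lambda * l2,
                Regularization::ElasticNet { lambda, l1_ratio } => {
                    lambda * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)
                }
            }
        }
    }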

Model (trait)

Interface for searching the best path through a lattice; a simplified search is sketched after the list below. Two implementations are provided:

  • RawModel: Stores weights in a flat vector indexed by feature ID
  • MergedModel: Optimized for inference; merges feature weights into a compact representation serializable with rkyv
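
A simplified best-path search in the spirit of the trait; the real interface works over a Lattice with variable-length edges, while this sketch scores labels over a fixed-step sequence, and all method names are hypothetical.

    // Hypothetical reduction of the Model trait to fixed-step scoring.
    pub trait Model {
        fn num_labels(&self) -> usize;
        // Log score of taking `label` at `pos` after `prev` at `pos - 1`
        // (pass prev == num_labels() for the start of the sequence).
        fn score(&self, pos: usize, prev: usize, label: usize) -> f64;
    }

    // Viterbi-style best-path search under the sketched trait.
    pub fn best_path<M: Model>(model: &M, len: usize) -> Vec<usize> {
        let n = model.num_labels();
        if len == 0 || n == 0 {
            return Vec::new();
        }
        // delta[pos][label]: best score of any path ending in `label` at `pos`.
        let mut delta = vec![vec![f64::NEG_INFINITY; n]; len];
        let mut back = vec![vec![0usize; n]; len];
        for label in 0..n {
            delta[0][label] = model.score(0, n, label);
        }
        for pos in 1..len {
            for label in 0..n {
                for prev in 0..n {
                    let s = delta[pos - 1][prev] + model.score(pos, prev, label);
                    if s > delta[pos][label] {
                        delta[pos][label] = s;
                        back[pos][label] = prev;
                    }
                }
            }
        }
        // Trace back from the highest-scoring final label.
        let mut label = (0..n)
            .max_by(|&a, &b| delta[len - 1][a].partial_cmp(&delta[len - 1][b]).unwrap())
            .unwrap();
        let mut path = vec![0usize; len];
        for pos in (0..len).rev() {
            path[pos] = label;
            if pos > 0 {
                label = back[pos][label];
            }
        }
        path
    }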

Forward-Backward Algorithm

Computes alpha (forward) and beta (backward) values over the lattice. Used during training to calculate expected feature counts and gradients.
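
A sketch of the log-domain forward recursion over a fixed-step grid of scores, built on a numerically stable logsumexp like the one provided by math.rs; the crate's version walks the edge lattice instead, and the backward (beta) pass mirrors the same recursion from the end of the input. The score layout here is an assumption for illustration.

    // Numerically stable log(sum(exp(xs))).
    fn logsumexp(xs: &[f64]) -> f64 {
        let max = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        if max.is_infinite() {
            return max; // empty input or all infinite: avoid NaN below
        }
        max + xs.iter().map(|&x| (x - max).exp()).sum::<f64>().ln()
    }

    // Forward pass: alpha[pos][label] = logsumexp over prev of
    // alpha[pos - 1][prev] + scores[pos][prev][label].
    // `init` holds the score of each label at position 0.
    fn forward(scores: &[Vec<Vec<f64>>], init: &[f64]) -> Vec<Vec<f64>> {
        let n = init.len();
        let mut alpha = vec![init.to_vec()];
        for pos in 1..scores.len() {
            let prev = alpha.last().unwrap().clone();
            let mut row = vec![0.0; n];
            for label in 0..n {
                let terms: Vec<f64> =
                    (0..n).map(|p| prev[p] + scores[pos][p][label]).collect();
                row[label] = logsumexp(&terms);
            }
            alpha.push(row);
        }
        alpha
    }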

Feature Flags

Feature  Description                                                 Default
alloc    Alloc support for no_std                                    No
std      Standard library support (implies alloc)                    No
train    Training functionality (L-BFGS, multi-threading, logging)   Yes