SauerkrautLM-GLiNER: Multilingual Zero-Shot Named Entity Recognition

SauerkrautLM-GLiNER is a multilingual GLiNER-style model for zero-shot named entity recognition (NER) based on the jhu-clsp/mmBERT-base backbone (a ModernBERT-style multilingual encoder).

Key Features

  • Multilingual Support: Trained jointly on English, German, French, Italian, and Spanish
  • Zero-Shot Entity Recognition: Identify arbitrary entity types without retraining by supplying a custom label list at inference time
  • 21k+ Entity Types: Trained on roughly 21k distinct entity types across multiple domains
  • Superior Performance: Scores 23.02 F1 points higher than gliner_multi-v2.1 on multilingual benchmarks (55.34 vs. 32.32 average F1)
  • General-Purpose: Works for broad-domain extraction, PII detection, and specialized taxonomies
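
As a rough illustration of the zero-shot workflow described above, the sketch below passes a custom label list at inference time. It assumes the checkpoint loads through the open-source gliner Python package (GLiNER.from_pretrained and predict_entities); this card does not confirm that. MODEL_ID is a hypothetical placeholder because the Hub repository id is not stated here, and extract_entities and group_by_label are illustrative helper names, not part of any library.

```python
from collections import defaultdict

# Hypothetical placeholder: replace with the actual Hub repository id,
# which is not given in this card.
MODEL_ID = "<hub-namespace>/SauerkrautLM-GLiNER"


def extract_entities(text, labels, threshold=0.5, model_id=MODEL_ID):
    """Run zero-shot NER with caller-supplied labels.

    Assumes `pip install gliner` and that the checkpoint is compatible
    with the standard GLiNER loader (an assumption, not confirmed here).
    """
    # Imported lazily so group_by_label stays usable without gliner installed.
    from gliner import GLiNER

    model = GLiNER.from_pretrained(model_id)
    # Each prediction is a dict with "text", "label", "score", "start", "end".
    return model.predict_entities(text, labels, threshold=threshold)


def group_by_label(entities, min_score=0.0):
    """Group predicted span texts by label, keeping scores >= min_score."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["label"]].append(ent["text"])
    return dict(grouped)
```

A call such as extract_entities("Angela Merkel besuchte Paris.", ["person", "city"]) would return span dictionaries that group_by_label can then bucket per label. Changing the label list changes what is extracted, with no retraining step in between.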

Performance Highlights

  • CrossNER + Multilingual Domains: 55.34 F1 average (vs. 32.32 for gliner_multi-v2.1)
  • PII Detection: 44.94 F1 average across 5 languages
  • Real-Time Performance: Fast inference suitable for production applications
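
For readers unfamiliar with how F1 figures like those above are typically produced, here is a minimal sketch of exact-match span F1 and a plain average over datasets. The evaluation protocol behind the reported numbers is not described in this card, so exact-match scoring and unweighted averaging are assumptions here, and span_f1 and macro_average are illustrative names.

```python
def span_f1(gold, predicted):
    """Exact-match precision, recall and F1 over (start, end, label) triples."""
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(scores):
    """Unweighted mean of per-dataset scores (assumed averaging scheme)."""
    return sum(scores) / len(scores)


# The headline gap follows directly from the two reported averages.
gap = round(55.34 - 32.32, 2)  # 23.02 F1 points
```

Under this scheme a predicted span counts as correct only if its boundaries and label both match a gold span, so partial overlaps contribute nothing to the score.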
