# SauerkrautLM-GLiNER: Multilingual Zero-Shot Named Entity Recognition
SauerkrautLM-GLiNER is a multilingual GLiNER-style model for zero-shot named entity recognition (NER) based on the jhu-clsp/mmBERT-base backbone (a ModernBERT-style multilingual encoder).
## Key Features
- Multilingual Support: Trained jointly on English, German, French, Italian, and Spanish
- Zero-Shot Entity Recognition: Identify any entity type without retraining; just provide a custom label list (see the usage sketch after this list)
- 21k+ Entity Types: Trained on roughly 21k distinct entity types across multiple domains
- Superior Performance: Achieves +23.02 F1 points over gliner_multi-v2.1 on multilingual benchmarks
- General-Purpose: Works for broad-domain extraction, PII detection, and specialized taxonomies
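As a concrete illustration of the zero-shot workflow, here is a minimal sketch using the open-source `gliner` Python package. The repository ID `VAGOsolutions/SauerkrautLM-GLiNER`, the example sentence, and the label set are illustrative assumptions; substitute the exact ID from the Model Page link below.

```python
# Minimal zero-shot NER sketch using the gliner package (pip install gliner).
# The repository ID below is an assumption; use the ID from the model page.
from gliner import GLiNER

model = GLiNER.from_pretrained("VAGOsolutions/SauerkrautLM-GLiNER")

text = "Angela Merkel besuchte 2019 das CERN in Genf."

# Any label set can be supplied at inference time; no retraining is needed.
labels = ["person", "organization", "location", "date"]

entities = model.predict_entities(text, labels, threshold=0.5)
for ent in entities:
    print(f'{ent["text"]:<20} {ent["label"]:<15} {ent["score"]:.2f}')
```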
## Performance Highlights
- CrossNER + Multilingual Domains: 55.34 F1 average (vs. 32.32 for gliner_multi-v2.1)
- PII Detection: 44.94 F1 average across 5 languages
- Real-Time Performance: Fast inference suitable for production applications
## Useful Links
- Model Page: SauerkrautLM-GLiNER on Hugging Face
- Demo Space: Live Demo
- Benchmark Dataset: gliner-benchmark-multilingual
## Examples
The demo space takes a text input, a list of entity labels, a confidence threshold, and a toggle for nested NER.
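A hedged sketch of the same inputs through the `gliner` package follows. The French PII-style sentence, the label set, and the use of `flat_ner=False` for nested (overlapping) spans are illustrative assumptions rather than settings documented on this card, and the exact keyword may vary by `gliner` version.

```python
# Sketch of a multilingual PII-style extraction with a custom label set.
from gliner import GLiNER

model = GLiNER.from_pretrained("VAGOsolutions/SauerkrautLM-GLiNER")  # assumed repo ID

text = (
    "Contactez Marie Dupont au +33 6 12 34 56 78 ou par e-mail "
    "à marie.dupont@example.fr avant le 3 mars."
)
labels = ["person name", "phone number", "email address", "date"]

# threshold filters low-confidence spans; flat_ner=False permits overlapping
# (nested) spans, mirroring the demo's nested-NER toggle.
entities = model.predict_entities(text, labels, threshold=0.3, flat_ner=False)
for ent in entities:
    print(ent["start"], ent["end"], ent["label"], ent["text"])
```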