Detecting Text Language with NLLanguageRecognizer in Swift


Greetings, traveler!

Language detection is one of those subtle features that can quietly improve user experience—especially in multilingual apps. Whether you’re building a translation tool, a reader app, or adaptive UI that reacts to language, Apple’s NaturalLanguage framework provides a simple and efficient way to determine the dominant language of any text.

In this article, we’ll look at how to implement a minimal and reusable LanguageDetector in Swift, understand how it works under the hood, and explore several practical improvements.

The Core Implementation

Here’s a clean, lightweight implementation that detects the dominant language of a given text:

import NaturalLanguage

struct LanguageDetector {
    static func languageCode(for text: String) -> String? {
        let recognizer = NLLanguageRecognizer()
        recognizer.processString(text)
        if let language = recognizer.dominantLanguage {
            return language.rawValue // e.g. "en", "ru", "de", "zh-Hans"
        }
        return nil
    }
}

The code is straightforward:

  1. NLLanguageRecognizer is initialized to perform the analysis.
  2. processString(_:) processes the text and computes language probabilities.
  3. dominantLanguage returns the most likely language as an NLLanguage instance, which we then convert to its raw string value: a language tag such as "en" or "zh-Hans" (not always a two-letter code).

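If the recognizer cannot determine a language at all (for example, the string is empty), dominantLanguage is nil. Very short or mixed-language input usually still produces a guess, just a less reliable one, so a non-nil result is not a guarantee of accuracy.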
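To make the nil case concrete, here is a minimal usage sketch. The helper name describeLanguage(of:) and the sample strings are made up for illustration. It uses the class method NLLanguageRecognizer.dominantLanguage(for:), a one-call shortcut for the same steps, and treats NLLanguage.undetermined as "unknown" as a defensive assumption.

```swift
import NaturalLanguage

// Hypothetical helper: returns a printable language tag,
// or "unknown" when the recognizer gives nothing usable back.
func describeLanguage(of text: String) -> String {
    // Shortcut for creating a recognizer, processing one string,
    // and reading dominantLanguage.
    guard let language = NLLanguageRecognizer.dominantLanguage(for: text),
          language != .undetermined else {
        return "unknown"
    }
    return language.rawValue
}

// Illustrative inputs: two full sentences and an empty string.
let samples = [
    "The weather is lovely today, so we decided to walk to the office.",
    "Das Wetter ist heute schön, also gehen wir zu Fuß ins Büro.",
    ""
]

for sample in samples {
    print("\(describeLanguage(of: sample)): \(sample)")
}
```

Falling back to "unknown" rather than to a default language keeps the caller honest about what was actually detected.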

Understanding How It Works

NLLanguageRecognizer is part of Apple’s NaturalLanguage framework. It uses statistical models trained on large text corpora to infer the most probable language for a given piece of text.

  • Short text limitation: The recognizer performs best on full sentences or paragraphs. Very short inputs (like a single word) can produce unreliable results.
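  • Multilingual text: When a string contains several languages, dominantLanguage returns the single language the model considers most probable for the text as a whole, which is typically the one that makes up most of the content.
  • Language codes: The returned values are language tags such as "en", "de", or "zh-Hans". Most are two-letter codes, but some carry a script suffix. They can be passed to Locale as identifiers for formatting or localization.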
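As a small illustration of handing a detected code to Locale, the sketch below turns a language tag into a human-readable name. The helper displayName(forLanguageTag:in:) is a hypothetical name, and the en_US display locale is only an assumption for the example.

```swift
import Foundation

// Hypothetical helper: converts a language tag such as "de" or "zh-Hans"
// into a readable name, localized for the given display locale.
func displayName(forLanguageTag tag: String,
                 in locale: Locale = Locale(identifier: "en_US")) -> String? {
    // localizedString(forIdentifier:) returns nil when the tag is not recognized.
    return locale.localizedString(forIdentifier: tag)
}

for tag in ["de", "ru", "zh-Hans"] {
    print("\(tag): \(displayName(forLanguageTag: tag) ?? "unknown")")
}
```

In an app you would more likely pass Locale.current so the name appears in the user's own language.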

Enhancing the Implementation

The minimal version works well for basic use cases, but there are a few ways to make it more informative and adaptable.

Returning Confidence Scores

To understand how confident the model is in its prediction, you can query the probability associated with the detected language:

// Add this method to LanguageDetector alongside languageCode(for:).
static func detectLanguage(for text: String) -> (code: String, confidence: Double)? {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(text)
    guard let language = recognizer.dominantLanguage else { return nil }
    // Ask for the single most probable hypothesis and read its probability (0...1).
    let hypotheses = recognizer.languageHypotheses(withMaximum: 1)
    let confidence = hypotheses[language] ?? 0
    return (language.rawValue, confidence)
}

This helps when you need to filter out uncertain detections—for instance, ignoring results with a confidence lower than 0.6.
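One way to apply such a cutoff is sketched below. The function name confidentLanguageCode(for:threshold:) is hypothetical, and the default of 0.6 simply mirrors the example value above; a sensible threshold depends on your own content.

```swift
import NaturalLanguage

// Hypothetical helper: returns a language tag only when the most probable
// hypothesis reaches `threshold`; otherwise returns nil.
func confidentLanguageCode(for text: String, threshold: Double = 0.6) -> String? {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(text)

    // Request a few candidates and keep the one with the highest probability.
    let hypotheses = recognizer.languageHypotheses(withMaximum: 3)
    guard let best = hypotheses.max(by: { $0.value < $1.value }),
          best.value >= threshold else {
        return nil
    }
    return best.key.rawValue
}

if let code = confidentLanguageCode(for: "Bonjour tout le monde, comment allez-vous aujourd'hui ?") {
    print("Detected: \(code)")
} else {
    print("Not confident enough to pick a language")
}
```

Returning nil for low-confidence results lets the caller fall back to something else, such as the user's preferred language.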

Using Language Hints

If your app targets a known set of languages, you can improve accuracy by guiding the recognizer with hints. Hints map each expected language to a prior probability that the recognizer takes into account:

recognizer.languageHints = [.english: 0.5]

Hints are particularly useful for apps where users frequently mix languages (like bilingual messaging or content readers).
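For a fuller picture, here is a sketch that combines hints with languageConstraints, which limits the candidate languages outright. The function name and the English/Spanish split are assumptions for illustration, and the weights are not tuned values.

```swift
import NaturalLanguage

// Hypothetical helper for an app that expects mostly English and Spanish text.
func detectAmongExpectedLanguages(_ text: String) -> NLLanguage? {
    let recognizer = NLLanguageRecognizer()

    // Only consider the languages the app actually supports.
    recognizer.languageConstraints = [.english, .spanish]

    // Illustrative prior weights: English is assumed to be more common here.
    recognizer.languageHints = [.english: 0.7, .spanish: 0.3]

    recognizer.processString(text)
    return recognizer.dominantLanguage
}

let language = detectAmongExpectedLanguages("Nos vemos mañana en la oficina a las nueve.")
print(language?.rawValue ?? "unknown")
```

Constraints suit a fixed list of supported languages, while hints suit cases where some languages are merely more likely than others.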

Conclusion

NLLanguageRecognizer is a small but powerful API that brings language intelligence directly to your app. It runs on-device, so there are no server calls, no network latency, and no third-party dependencies.

By wrapping it in a simple utility like LanguageDetector, you gain a clean and reusable component.