Levenshtein Distance Calculator

Stephanie Ben-Joseph headshot Stephanie Ben-Joseph

Introduction: What this Levenshtein Distance Calculator does

This calculator computes the Levenshtein distance between two strings. The distance is the minimum number of single-character edits needed to turn one string into the other, where the allowed edits are:

Enter two strings above and click Compute. The tool will return an integer distance. A result of 0 means the strings are identical; larger numbers mean more edits are required and therefore lower similarity.

Definition and formula

Formally, the Levenshtein distance between two strings A and B is defined as the minimum number of insertions, deletions, and substitutions needed to transform A into B. The standard algorithm uses dynamic programming and fills a matrix where each entry represents the distance between prefixes of the two strings.

Let i be the length of the prefix of A and j the length of the prefix of B. The value d(i, j) is the Levenshtein distance between the first i characters of A and the first j characters of B. The recurrence is:

d ( i , j ) = min ( d ( i - 1 , j ) + 1 d ( i , j - 1 ) + 1 d ( i - 1 , j - 1 ) + cost )

Here, cost = 0 if the i-th character of A equals the j-th character of B, and cost = 1 otherwise. The first row and first column of the matrix are initialized to represent distances involving an empty string:

  • d(0, j) = j (all insertions)
  • d(i, 0) = i (all deletions)

The final answer is d(|A|, |B|), where |·| denotes string length.

Worked example: "kitten" vs "sitting"

A classic example uses the words kitten and sitting. One optimal sequence of edits is:

  1. Substitute k with s: kitten → sitten
  2. Substitute e with i: sitten → sittin
  3. Insert g at the end: sittin → sitting

That sequence uses three single-character edits, so the Levenshtein distance between kitten and sitting is 3. The dynamic programming matrix guarantees that this is the minimum possible number of edits, even when there are many alternative paths.

How to interpret the result

The calculator returns a non-negative integer. You can interpret it as follows:

  • 0: the two strings are exactly the same.
  • 1–2: the strings are very similar, differing by only a few characters.
  • 3–5: noticeable differences, but they may still represent the same underlying item (for example, spelling variants or minor typos).
  • Higher values: the strings are progressively less similar and may represent different items.

Keep in mind that the raw distance depends on the string lengths. A distance of 3 is small for 20-character strings but large for 4-character strings. For some applications, you may want to convert the distance to a normalized similarity score, such as 1 − (distance / max_length), but this calculator reports the raw distance only.

Comparison with other string similarity metrics

Levenshtein distance is only one way to compare strings. Other metrics behave differently and are useful in different contexts. The table below contrasts a few common approaches.

Metric Key idea When it is a good fit
Levenshtein distance Counts insertions, deletions, and substitutions; all edits cost 1 in this calculator. General-purpose fuzzy matching where typos, missing characters, and substitutions all matter.
Hamming distance Counts substitutions only; strings must be the same length. Error-detecting codes, bit strings, or fixed-length IDs where insertions/deletions are impossible.
Damerau–Levenshtein Like Levenshtein but also treats adjacent transpositions (e.g., abba) as a single edit. Correcting common keyboard typos where characters are swapped.
Jaro / Jaro–Winkler Weighted comparison based on matching characters and their relative order. Short strings such as names, where prefix similarity is particularly important.

If your task needs one of these variants, this page can still provide a baseline by showing how far apart two strings are under the classic Levenshtein definition.

Assumptions and limitations of this calculator

To keep the tool predictable and fast for everyday use, the implementation makes several assumptions:

  • Equal edit costs: Insertions, deletions, and substitutions all cost 1. There is no weighting by character type, keyboard layout, or position in the string.
  • Exact character matching: Characters are compared exactly as entered. By default there is no case folding, accent/diacritic normalization, or Unicode canonicalization unless explicitly performed on your side before input.
  • No transposition operation: Swapping two adjacent characters (e.g., abba) counts as two edits (delete + insert or two substitutions), not one. This differs from Damerau–Levenshtein distance.
  • Performance on very long strings: The dynamic programming algorithm runs in time and memory roughly proportional to the product of the two string lengths. Extremely long inputs (for example, tens of thousands of characters each) may be slow or impractical in a browser environment.
  • Raw distance only: The calculator reports the distance, not a normalized similarity score or probability. If your workflow needs a 0–1 similarity value, you will need to transform the result accordingly.

These constraints are typical for a reference implementation of Levenshtein distance and are suitable for most text-oriented tasks such as spell checking, fuzzy search, and simple record matching.

Where Levenshtein distance is useful

Because it is simple and well-defined, Levenshtein distance appears in many practical scenarios:

  • Spell checking and autocorrect: Suggest words from a dictionary that are closest to a misspelling, such as mapping recieve to receive.
  • Fuzzy search: Match user queries that are slightly mistyped against product titles or article names.
  • Record linkage and deduplication: Identify near-duplicate customer names across databases (for example, Jon Smith vs John Smyth).
  • Data cleaning: Standardize free-text entries that contain minor variations, abbreviations, or missing characters.
  • Bioinformatics and sequences: Compare DNA, RNA, or protein sequences, treating each base or amino acid as a character (often with domain-specific extensions).

In each of these settings, a lower distance suggests higher similarity, and threshold rules like "accept if distance ≤ 2" are common.

How to use this calculator

  1. Enter String A using the unit or time period shown by the field.
  2. Enter String B using the unit or time period shown by the field.
  3. Run the calculation and compare the output with a second scenario before acting on it.
Enter two strings and click Compute.

Arcade Mini-Game: Levenshtein Distance Calculator Calibration Run

Use this quick arcade run to practice separating useful scenario inputs from common planning mistakes before you rely on the calculator output.

Score: 0 Timer: 30s Best: 0

Start the game, then use your pointer or arrow keys to catch useful inputs and avoid bad assumptions.