KMP Algorithm

Knuth-Morris-Pratt string matching

The problem:

given a (short) pattern and a (long) text, both strings, determine whether the pattern appears somewhere in the text.

Define function

void patternSearch(String text, String pattern)

KMP algorithm preprocess the pattern, when we are search for the pattern in the text, the pointer of text doesn't look back, we alternate the pointer of pattern to the position of longest prefix/suffix and continue the search.

suppose we have text: "abcdabcdcbd" and pattern "abcdabcf

when we mismatch 'f' and 'd' at:
        ptr1                          ptr1
        |                             |
abcdabc d cbd       ->        abcdabc d cbd
abcdabc f                         abc d abcf
        |                             |
        ptr2                          new ptr2


We move ptr2 to 'd' since the longest prefix/suffix in substring "abcdabc" is "abc"

we process in this way for all the mismatch in pattern.

The time complexity of search for one pattern is O(m+n)
(m is text's length, n is pattern's length)

Implementation

import java.util.Arrays;

public class kmp {
    public void patternSearch(String s, String pattern){
        System.out.println("Search String:"+s+"\t"+"Pattern:"+pattern);
        if(s==null || pattern==null || s.length()<pattern.length()) {
            System.out.println("No match found!");
            return;
        }
        int sLen=s.length(), pLen=pattern.length();
        int[] next=getNext(pattern);
        System.out.println(Arrays.toString(next));
        int i=0, j=0;
        while(i<sLen){
            if(j==-1 || s.charAt(i)==pattern.charAt(j)){
                i++;
                j++;
            }else{
                j=next[j];
            }
            if(j==pLen){
                System.out.println("Find at index "+(i-pLen));
                j=0;
            }
        }
    }

    public int[] getNext(String Pattern){
        int pLen=Pattern.length();
        int[] next=new int[pLen];
        //next[i] represents the longest proper prefix or suffix of pattern.substring(0,i)
        next[0]=-1;
        int i=0, j=-1;
        while(i<pLen-1){
            if(j==-1 || Pattern.charAt(i)==Pattern.charAt(j)) next[++i]=++j;
            else j=next[j];
        }
        return next;
    }

    public static void main(String[] args){
        kmp test=new kmp();
        test.patternSearch("ABABDABACDABABCABAB", "ABCABC");
    }
}

results matching ""

    No results matching ""