leetcode : Implement strStr() | StolasIn's Personal Page

runtime 0 ms, beats 100% of cpp submissions
O(n+m) solution with explanation

tags: string

🔗 link

📖 description

給定兩個字串 haystack 、 needle ，回傳 needle 在 haystack 中最早出現的位置，如果沒有出現，則回傳 -1

    haystack : "hello"
    needle   : "ll"

    出現的位置為 2
    
    0 1 2 3 4
    h e l l o

🧠 solution

可以使用暴力法 O(mn) 的解決問題，其中 m, n 分別為 haystack, needle 的長度。
但在字串搜尋領域，有個高效演算法 KMP algorithm ，主要想法是利用預先建立好的 prefix-surfix array 來避免重複的尋找。

⏳ time complexity

建表時間複雜度 O(n) ，尋找字串時間複雜度 O(m)
總時間複雜度 O(n+m)

📝 code

class Solution {
public:
    vector<int> pre(string needle){  // 建立 prefix array
        int n = needle.size();
        vector<int> lps(n, 0);
        int len = 0;        
        for (int i = 1; i < n;){
            if(needle[i] == needle[len]){     // 如果發現相同字元，紀錄 index
                lps[i++] = ++len;
            } 
            else if(len>0) {
                len = lps[len - 1];           // 如果之前有 match 到字元，必須從上一個位置開始
            } 
            else{
                lps[i++] = 0;                 // 否則當前值填入 0
            }
        }
        return lps;
    }
    int strStr(string haystack, string needle){
        int m = haystack.size(), n = needle.size();
        if (n==0) {
            return 0;
        }
        vector<int> lps = pre(needle);
        for(int i = 0, j = 0; i < m;){
            if (haystack[i] == needle[j]){
                i++, j++;
            }
            if (j == n) {       // 找到字串
                return i - j;
            }
            if (i < m && haystack[i] != needle[j]){
                if(j!=0){
                    j = lps[j - 1];     // 沒找到，從下一個 surfix 相同的點開始找起
                }
                else{
                    i++;
                }
            }
        }
        return -1;
    }
};