Thursday, May 15, 2014

LeetCode: Minimum Window Substring

Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).

For example,
S = "ADOBECODEBANC"
T = "ABC"

Minimum window is "BANC".

Note:
If there is no such window in S that covers all characters in T, return the emtpy string "".

If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.


The basic idea is this: we start looking in the source string, if until now we cannot find a char set of the target, keep looking. If we already find a char set, then we can get current window size and then continue looking for the next window. When looking for the next window, each time when we find a char in the target string, we can check if the window size can be decreased. Until doing this to the end of the source, we have the minimum size.

Let's see an example:

S: "ADOBECODEBANC" ; T: "ABC"

current char: "A" ; begin = 0, end = 0;

current char: "D" ; begin = 0, end = 1;

current char: "O" ; begin = 0, end = 2;

current char: "B" ; begin = 0, end = 3;

current char: "E" ; begin = 0, end = 4;

current char: "C" ; begin = 0, end = 5; minimum window size = 6, "ADOBEC"

current char: "O" ; begin = 0, end = 6;

current char: "D" ; begin = 0, end = 7;

current char: "E" ; begin = 0, end = 8;

current char: "B" ; begin = 0, end = 9; minimum window size = 6, "ADOBEC"

current char: "A" ; begin = 5, end = 10; minimum window size = 6, "ADOBEC"

current char: "N" ; begin = 5, end = 11;

current char: "C" ; begin = 9, end = 12; minimum window size = 4, "BANC"

C++ Code:

/*
 * func: min_window
 * goal: find the minimum window in S which will contain all characters in T
 * @param S: source string
 * @param T: target string
 * return: minimum window
 */
string min_window(string S, string T){

    int len_s = S.length();
    int len_t = T.length();
    if(len_s < len_t){
        return "";
    }
    
    int char_dict_need[256];
    int char_dict_found[256];
    
    memset(char_dict_found, 0, 256 * sizeof(int));
    memset(char_dict_need, 0, 256 * sizeof(int));
    
    //Collect char needs to be found
    for(const char &ch : T){
        char_dict_need[ch] += 1;
    }
    
    //variable for begin and end in S
    int begin = 0;
    int end = 0;
    //variable for begin and end in minimum windown of S
    int min_begin = 0;
    int min_end = len_s;
    //keep chars that have been found
    int count = 0;
    
    while(end < len_s){
        //If current ch is needed
        if(char_dict_need[S[end]] != 0){
            //record it in found dict
            ++char_dict_found[S[end]];
            
            //Update count
            if(char_dict_found[S[end]] <= char_dict_need[S[end]]){
                ++count;
            }
            
            //If one windown is found
            if(count == len_t){
                //To see if begin index can be changed
                while(char_dict_need[S[begin]] == 0 || char_dict_need[S[begin]] < char_dict_found[S[begin]]){
                    if(char_dict_found[S[begin]] > char_dict_need[S[begin]]){
                        --char_dict_found[S[begin]];
                    }
                    ++begin;
                }
                
                //Update minimum window
                if((end-begin + 1) < (min_end - min_begin + 1)){
                    min_begin = begin;
                    min_end = end;
                }
            }
        }
        ++end;
        
    }
    if(count == len_t){
        return S.substr(min_begin, min_end-min_begin+1);
    }else{
        return "";
    }
}

Python Code:

# func: find the minimum window in S which will contain all chars in T
# @param S: source string
# @param T: target string
# @return: minimum window
def min_window(S, T):
    start = 0
    end = 0
    min_start = 0
    min_end = len(S)
    count = 0

    char_found = [0] * 256
    char_need = [0] * 256

    #Create char_need
    for ch in T:
        char_need[ord(ch)] += 1

    while end < len(S):
        if char_need[ord(S[end])] > 0:
            #Update found dict
            char_found[ord(S[end])] += 1

            #Update count
            if char_found[ord(S[end])] <= char_need[ord(S[end])]:
                count += 1

            if count == len(T):
                #If the begin word is not what we need or it is in the dict
                #but we have more than what we need for now
                while char_need[ord(S[start])] == 0 or char_found[ord(S[start])] > char_need[ord(S[start])]:
                    if char_found[ord(S[start])] > char_need[ord(S[start])]:
                        char_found[ord(S[start])] -= 1
                    start += 1

                #Update minimum window
                if (end-start+1) < (min_end-min_start+1):
                    min_start = start
                    min_end = end

        end += 1

    if count == len(T):
        return S[min_start:min_end+1]
    else:
        return ''

Reference Link:

1. Finding the Minimum Window in S which Contains All Elements from T: http://leetcode.com/2010/11/finding-minimum-window-in-s-which.html

No comments:

Post a Comment