3. Longest Substring Without Repeating Characters,最长无重复字符子串

Last Updated: 2024-01-08 02:14:06 Monday

-- TOC --

题目分析

Given a string s, find the length of the longest substring without repeating characters.

输入一个字符串,找出最长无重复子串的长度!

Constraints:

示例:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring. 

pwke只是subsequence,不是substring。注意英语subsequence和substring的区别!

直觉One Pass

将输入字符串一个个拿出来,每拿出来一个,就跟前面已经拿出来的进行比较,此时可发现是否有重复的字符。当存在重复字符的时候,更新代码中的状态信息,然后从重复位置的下一个位置开始,继续拿出字符比较。此过程中,保存最长的长度信息,用于return。注意循环退出后的size可能是最大的。

Python

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        ss = []
        maxlen = 0
        for c in s:
            if c not in ss:
                ss.append(c)
                continue
            maxlen = max(maxlen, len(ss))
            ss = ss[ss.index(c)+1:]
            ss.append(c)
        return max(maxlen, len(ss))

Python的append更快:

# slow
a = [1,2,3] + [4]
# fast
a = [1,2,3]
a.append(4)

要在尾部添加的元素只有1个,没有必要做两个list的合并!

C++

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        string ss;
        size_t maxlen {};
        size_t p;
        for(auto c: s){
            if((p=ss.find(c)) == string::npos){
                ss += c;
                continue;
            }
            maxlen = max(maxlen, ss.size());
            ss = ss.substr(p+1);
            ss += c;
        }
        return (int)max(maxlen, ss.size());
    }
};

C

int lengthOfLongestSubstring(char *s){
    char ss[256] = {};
    unsigned char idx = 0; // also size
    char *p;
    int maxlen = 0;
    while(*s){
        if((p=memchr(ss,*s,idx)) == NULL){
            ss[idx++] = *s++;
            continue;
        }
        maxlen = maxlen<idx ? idx : maxlen;
        idx = idx - (p-ss+1);
        memmove(ss, p+1, idx);  // cannot use memcpy
        ss[idx++] = *s++;
    }
    return maxlen<idx ? idx : maxlen;
}

C语言中,type决定如何解释内存,请看如下示例:

#include <stdio.h>
#include <string.h>

int main(void) {
    char p[] = {0xAA};
    printf("%d\n", *p);
    printf("%d\n", *(unsigned char*)p);

    char a = 129;
    printf("%d\n", a);
    printf("%d\n", (unsigned char)a);
    return 0;
}

输出:

-86
170
-127
129

Sliding Window

使用Sliding Window,一般就是记录Window的两端的index,通过两个index的变量,sliding...

前面所用算法,与Sliding Window无本质区别,不同点在与,没有记录两端的index,而是直接记录下全部substring,以此来计算window size。但如果使用sliding window,需记录下substring中字符以及它们最后出现的index,计算上会稍微简单一点点,特别是对C语言,也许代码能够更快,无需更新substring信息。

另一个关键点是查找。每次取一个char都需要查找,如果用hash的方式保存sliding window中的char,查找速度就上去了。

Python

用dict对象来实现membership checking,我的经验是,只比set稍微慢一丢丢。

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        ss = {}
        maxlen = left = 0
        right = -1  # in case s is empty
        for right,c in enumerate(s):
            if c in ss and ss[c]>=left:
                maxlen = max(maxlen, right-left)
                left = ss[c] + 1
            ss[c] = right
        return max(maxlen, right-left+1)

C++

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        unordered_map<char,int> ump;
        ump.reserve(128);
        int lo {};
        int hi {};
        int maxlen {};
        for(; hi<static_cast<int>(s.size()); ++hi){
            char c = s[hi];
            if(ump.contains(c) && ump[c]>=lo){
                maxlen = max(maxlen, hi-lo);
                lo = ump[c] + 1;
            }
            ump[c] = hi;
        }
        return max(maxlen, hi-lo);
    }
};

C++只有unordered_map才能与Python的dict媲美...

C

int lengthOfLongestSubstring(char *s){
    int idxmap[256];
    memset(idxmap, 0xFF, sizeof(int)*256);
    int left, right, maxlen;
    left = right = maxlen = 0;
    while(*s){
        if(idxmap[*s] >= left){
            maxlen = maxlen<(right-left)?(right-left):maxlen;
            left = idxmap[*s] + 1;
        }
        idxmap[*s++] = right++;
    }
    return maxlen<(right-left)?(right-left):maxlen;
}

本文链接:https://cs.pynote.net/ag/leetcode/202211091/

-- EOF --

-- MORE --