471.Encode String with Shortest Length

Tags: [DP], [string], [encode], [encoding], [trick]

Com: {g}

Link: https://leetcode.com/problems/encode-string-with-shortest-length/#/description

Given a non-empty string, encode the string such that its encoded length is the shortest.

The encoding rule is:k[encoded_string], where theencoded_stringinside the square brackets is being repeated exactlyktimes.

Note:

  1. k will be a positive integer and encoded string will not be empty or have extra space.
  2. You may assume that the input string contains only lowercase English letters. The string's length is at most 160.
  3. If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them is fine.

Example 1:

Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.

Example 2:

Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.

Example 3:

Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".

Example 4:

Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".

Example 5:

Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can be "2[2[abbb]c]".

Solution: DP, Bottom To Top DP, Trick

class Solution(object):
    def encode(self, s):
        """
        :type s: str
        :rtype: str
        """
        if not s or len(s) <= 3:
            return s

        size = len(s)
        dp = [[None for _ in xrange(size)] for _ in xrange(size)]
        for i in xrange(size):
            dp[i][i] = s[i]

        for sub_len in xrange(1, size + 1):
            start = 0
            while start + sub_len <= size:
                end = start + sub_len - 1

                for i in xrange(start, end):
                    left = dp[start][i]
                    right = dp[i + 1][end]

                    if dp[start][end] is None or\
                       len(left) + len(right) < len(dp[start][end]):
                        dp[start][end] = left + right

                sub = s[start:end + 1]
                index = (sub + sub).find(sub, 1)
                if index < len(sub):
                    # there exists the repeating
                    encoded_sub = '{}[{}]'.format((len(sub) / index), dp[start][start + index - 1])

                    if dp[start][end] is None or\
                       len(encoded_sub) < len(dp[start][end]):
                        dp[start][end] = encoded_sub

                start += 1

        return dp[0][size - 1]

Revelation:

  • 在最内层循环做sub那部分是因为如果新的dp[start][end]是由left和right拼接而来的,那么我们要看这个拼接出来的sub还能不能再被encoded了.
  • dp[start][end] 表示 s[start:end + 1]的encode形式.
  • Trick: 把两个sub接起来,如果从 index=1 开始能找到sub,并且这个被找到的sub的起点的index < len(sub),那么就说明sub是几个重复的pattern组成的,例如sub = 'abcabc', sub+sub = 'abcabcabcabc', (sub+sub).find(sub, 1) = 3.
  • Be careful of that we are using the 'dp[start][start + index - 1])' to encode the sub.

Note:

  • Time complexity = O(n^3), n is the length of the given string.

Solution: DP, Top To Bottom DP, Trick

class Solution(object):
    def encode(self, s):
        """
        :type s: str
        :rtype: str
        """
        memo = {}
        return self.encode_helper(s, memo)

    def encode_helper(self, s, memo):
        # base case
        if not s or len(s) <= 3:
            return s

        if s in memo:
            return memo[s]

        result = s
        for left_len in xrange(1, len(s) + 1):
            left = s[:left_len]
            right = s[left_len:]

            tmp_result = self.solve(left, memo) + self.encode_helper(right, memo)
            if len(tmp_result) < len(result):
                result = tmp_result

        memo[s] = result
        return result

    def solve(self, s, memo):
        if not s or len(s) <= 3:
            return s

        size = len(s)
        result = s
        for sub_len in xrange(1, size / 2 + 1):
            if size % sub_len != 0 or s[:sub_len] * (size / sub_len) != s:
                continue

            tmp_result = '{}[{}]'.format((size / sub_len), self.encode_helper(s[:sub_len], memo))
            if len(tmp_result) < len(result):
                result = tmp_result

        return result

Note:

  • It's hard to see the time complexity.

results matching ""

    No results matching ""