471.Encode String with Shortest Length
Tags: [DP], [string], [encode], [encoding], [trick]
Com: {g}
Link: https://leetcode.com/problems/encode-string-with-shortest-length/#/description
Given a non-empty string, encode the string such that its encoded length is the shortest.
The encoding rule is:k[encoded_string]
, where theencoded_stringinside the square brackets is being repeated exactlyktimes.
Note:
- k will be a positive integer and encoded string will not be empty or have extra space.
- You may assume that the input string contains only lowercase English letters. The string's length is at most 160.
- If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them is fine.
Example 1:
Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.
Example 2:
Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.
Example 3:
Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".
Example 4:
Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".
Example 5:
Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can be "2[2[abbb]c]".
Solution: DP, Bottom To Top DP, Trick
class Solution(object):
def encode(self, s):
"""
:type s: str
:rtype: str
"""
if not s or len(s) <= 3:
return s
size = len(s)
dp = [[None for _ in xrange(size)] for _ in xrange(size)]
for i in xrange(size):
dp[i][i] = s[i]
for sub_len in xrange(1, size + 1):
start = 0
while start + sub_len <= size:
end = start + sub_len - 1
for i in xrange(start, end):
left = dp[start][i]
right = dp[i + 1][end]
if dp[start][end] is None or\
len(left) + len(right) < len(dp[start][end]):
dp[start][end] = left + right
sub = s[start:end + 1]
index = (sub + sub).find(sub, 1)
if index < len(sub):
# there exists the repeating
encoded_sub = '{}[{}]'.format((len(sub) / index), dp[start][start + index - 1])
if dp[start][end] is None or\
len(encoded_sub) < len(dp[start][end]):
dp[start][end] = encoded_sub
start += 1
return dp[0][size - 1]
Revelation:
- 在最内层循环做sub那部分是因为如果新的dp[start][end]是由left和right拼接而来的,那么我们要看这个拼接出来的sub还能不能再被encoded了.
- dp[start][end] 表示 s[start:end + 1]的encode形式.
- Trick: 把两个sub接起来,如果从 index=1 开始能找到sub,并且这个被找到的sub的起点的index < len(sub),那么就说明sub是几个重复的pattern组成的,例如sub = 'abcabc', sub+sub = 'abcabcabcabc', (sub+sub).find(sub, 1) = 3.
- Be careful of that we are using the 'dp[start][start + index - 1])' to encode the sub.
Note:
- Time complexity = O(n^3), n is the length of the given string.
Solution: DP, Top To Bottom DP, Trick
class Solution(object):
def encode(self, s):
"""
:type s: str
:rtype: str
"""
memo = {}
return self.encode_helper(s, memo)
def encode_helper(self, s, memo):
# base case
if not s or len(s) <= 3:
return s
if s in memo:
return memo[s]
result = s
for left_len in xrange(1, len(s) + 1):
left = s[:left_len]
right = s[left_len:]
tmp_result = self.solve(left, memo) + self.encode_helper(right, memo)
if len(tmp_result) < len(result):
result = tmp_result
memo[s] = result
return result
def solve(self, s, memo):
if not s or len(s) <= 3:
return s
size = len(s)
result = s
for sub_len in xrange(1, size / 2 + 1):
if size % sub_len != 0 or s[:sub_len] * (size / sub_len) != s:
continue
tmp_result = '{}[{}]'.format((size / sub_len), self.encode_helper(s[:sub_len], memo))
if len(tmp_result) < len(result):
result = tmp_result
return result
Note:
- It's hard to see the time complexity.