535.Encode and Decode TinyURL

Tags: [data_structure], [encode], [decode], [stateful], [map], [random], [url], [tiny_url]

Link: https://leetcode.com/problems/encode-and-decode-tinyurl/\#/description

Note: This is a companion problem to the

System Design problem: Design TinyURL.

TinyURL is a URL shortening service where you enter a URL such ashttps://leetcode.com/problems/design-tinyurland it returns a short URL such ashttp://tinyurl.com/4e9iAk.

Design the encodeand decodemethods for the TinyURL service. There is no restriction on how your encode/decode algorithm should work. You just need to ensure that a URL can be encoded to a tiny URL and the tiny URL can be decoded to the original URL.


Solution: Random, Map, Stateful

class Codec:

    def __init__(self):
        self.short_url_len = 6
        self.short_url_chars = [str(x) for x in xrange(10)] +\
                               [chr(x) for x in xrange(ord('a'), ord('z'))] +\
                               [chr(x) for x in xrange(ord('A'), ord('Z'))]

        self.long_url_to_short_url = {}
        self.short_url_to_long_url = {}
        self.short_url_prefix = 'http://tinyurl.com/'
        self.max_try_times = 10000

    def encode(self, longUrl):
        """Encodes a URL to a shortened URL.

        :type longUrl: str
        :rtype: str
        """
        if not longUrl:
            return ''
        if longUrl in self.long_url_to_short_url:
            return self.short_url_prefix + self.long_url_to_short_url[longUrl]

        buff = [None for _ in xrange(self.short_url_len)]
        try_times = 0
        while try_times <= self.max_try_times:
            for i in xrange(self.short_url_len):
                rand_char_index = random.randint(0, len(self.short_url_chars) - 1)
                buff[i] = self.short_url_chars[rand_char_index]

            new_short_url = ''.join(buff)
            if new_short_url not in self.short_url_to_long_url:
                self.long_url_to_short_url[longUrl] = new_short_url
                self.short_url_to_long_url[new_short_url] = longUrl
                return self.short_url_prefix + new_short_url

            try_times += 1

        raise Exception('Fail to encode the longUrl')

    def decode(self, shortUrl):
        """Decodes a shortened URL to its original URL.

        :type shortUrl: str
        :rtype: str
        """
        if not shortUrl:
            return ''

        short_url_prefix_index = shortUrl.find(self.short_url_prefix)
        if short_url_prefix_index < 0:
            raise Exception('Fail to decode the shortUrl')

        short_url = shortUrl[short_url_prefix_index + len(self.short_url_prefix):]

        if short_url in self.short_url_to_long_url:
            return self.short_url_to_long_url[short_url]
        else:
            raise Exception('Fail to decode the shortUrl')


# Your Codec object will be instantiated and called as such:
# codec = Codec()
# codec.decode(codec.encode(url))

Revelation:

  • Using 6-digit str (chars including '0'-'9', 'a'-'z', 'A'-'Z'), it can represent (10 + 26 + 26)^6 = more than 50 billions different 6-chars string, which is big enough here.
  • We can first think how do we implement it in one single machine, so we can make it stateful, then we can extend this implementation to distributed system. We can create a distributed system, when we do encoding, once one node retrieve the request, if it cannot find the mapping on its own machine, it can broadcast the query to all other machines, and aggregate all responses, and once this current node generate a new short url, it can forward this new short url to some other machine based on some hash algorithm or some other dispatch algorithm. When we do decoding, if the current node cannot find the mapping, it will broadcast the query to all other machines, then aggregate the responses together.

results matching ""

    No results matching ""