297.Serialize and Deserialize Binary Tree

Tags: [binary_tree], [traversal], [preorder], [inorder], [serialize], [deserialize], [id], [queue]

Com: {fb}

Link: https://leetcode.com/problems/serialize-and-deserialize-binary-tree/#/description

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

For example, you may serialize the following tree

    1
   / \
  2   3
     / \
    4   5

as

"[1,2,3,null,null,4,5]", just the same as how LeetCode OJ serializes a binary tree. You do not necessarily need to follow this format, so please be creative and come up with different approaches yourself.

Note:Do not use class member/global/static variables to store states. Your serialize and deserialize algorithms should be stateless.


More clean and better solution: Preorder traversal, Queue

# Definition for a binary tree node.
# class TreeNode(object):
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

from collections import deque

class Codec:

    def serialize(self, root):
        """Encodes a tree to a single string.

        :type root: TreeNode
        :rtype: str
        """
        if not root:
            return ''

        # preorder traversal
        result = []
        stack = [root]
        while stack:
            curr = stack.pop()
            if curr:
                result.append(curr.val)
                stack.append(curr.right)
                stack.append(curr.left)
            else:
                result.append(None)

        return ','.join([str(x) for x in result])

    def deserialize(self, data):
        """Decodes your encoded data to tree.

        :type data: str
        :rtype: TreeNode
        """
        if not data:
            return None

        queue = deque([x for x in data.split(',') if x])
        return self.build_binary_tree(queue)

    def build_binary_tree(self, queue):
        # base case.
        if not queue:
            return None

        curr = queue.popleft()
        if curr == 'None':
            return None

        root = TreeNode(int(curr))
        root.left = self.build_binary_tree(queue)
        root.right = self.build_binary_tree(queue)
        return root

# Your Codec object will be instantiated and called as such:
# codec = Codec()
# codec.deserialize(codec.serialize(root))

Revelation:

  • When we record preorder traversal, we record None. The preorder traversal with None can identify one tree.
  • The preorder traversal is: root, left, right.
  • So when we do deserialization, we know the first value is root, but it's hard to find where is the 'split' between left subtree, and right subtree, but if there exists some kind of container, which let us keep consuming the left nodes, then consuming the right nodes, we can use preorder traversal to reconstruct the tree. And the Queue is this kind of container. When we do recursion, we first do recursion on the left subtree, then right subtree, which can let the program consume the nodes belong to left subtree, then right subtree.

Note:

  • Time complexity of serialization = O(n), n is the number of nodes of the given tree.
  • Time complexity of deserialization = O(n), n is the number of nodes of the given tree.

Solution: Preorder Traversal, Inorder Traversal, Id

# Definition for a binary tree node.
# class TreeNode(object):
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

class Codec:

    def serialize(self, root):
        """Encodes a tree to a single string.

        :type root: TreeNode
        :rtype: str
        """
        preorder = self.preorder_traversal(root)
        inorder = self.inorder_traversal(root)

        print 'pre={}'.format(preorder)
        print 'in ={}'.format(inorder)

        return '{}:{}'.format(preorder, inorder)

    def preorder_traversal(self, root):
        if not root:
            return ''

        result = []
        stack = [root]
        while stack:
            curr = stack.pop()
            result.append('{}#{}'.format(id(curr), curr.val))

            # The reason here we push curr.right first, push curr.left second,
            # because later, we will pop curr.left first, curr.right second.
            if curr.right:
                stack.append(curr.right)
            if curr.left:
                stack.append(curr.left)

        return ','.join(result)

    def inorder_traversal(self, root):
        if not root:
            return ''

        result = []
        stack = []
        curr = root
        while curr or stack:
            if curr:
                stack.append(curr)
                curr = curr.left
            else:
                curr = stack.pop()
                result.append('{}#{}'.format(id(curr), curr.val))
                curr = curr.right

        return ','.join(result)

    def deserialize(self, data):
        """Decodes your encoded data to tree.

        :type data: str
        :rtype: TreeNode
        """
        if not data:
            return None
        if data.count(':') != 1:
            raise ValueError('the input data is invalid')

        preorder_str, inorder_str = data.split(':')
        preorder = [x for x in preorder_str.split(',') if x]
        inorder = [x for x in inorder_str.split(',') if x]
        return self.build_binary_tree(preorder, 0, len(preorder) - 1, 
                                      inorder, 0, len(inorder) - 1)

    def build_binary_tree(self, preorder, p_start, p_end, inorder, i_start, i_end):
        # base case
        if p_start > p_end or i_start > i_end:
            return None

        root_val = preorder[p_start]
        root_val_index_in_inorder = inorder.index(root_val, i_start, i_end + 1)
        # root_val_index_in_inorder = -1
        # for i in xrange(i_start, i_end + 1):
        #     if inorder[i] == root_val:
        #         root_val_index_in_inorder = i
        #         break

        # if root_val_index_in_inorder < 0:
        #     raise ValueError('The input is invalid')

        num_of_nodes_in_left = root_val_index_in_inorder - i_start
        num_of_nodes_in_right = i_end - root_val_index_in_inorder

        p_left_start = p_start + 1
        p_left_end = p_left_start + num_of_nodes_in_left - 1
        i_left_start = i_start
        i_left_end = root_val_index_in_inorder - 1

        p_right_start = p_left_end + 1
        p_right_end = p_end
        i_right_start = root_val_index_in_inorder + 1
        i_right_end = i_end

        root_id, val = root_val.split('#')
        root = TreeNode(int(val))
        root.left = self.build_binary_tree(preorder, p_left_start, p_left_end, inorder, i_left_start, i_left_end)
        root.right = self.build_binary_tree(preorder, p_right_start, p_right_end, inorder, i_right_start, i_right_end)

        return root

# Your Codec object will be instantiated and called as such:
# codec = Codec()
# codec.deserialize(codec.serialize(root))

Revelation:

  • Preorder traversal + inorder traversal can identify one tree. Only preorder traversal or postorder traversal cannot identify one tree, only inorder traversal can identify one tree, but we cannot only use inorder traversal to reconstruct the tree, because we cannot find the root.
  • But when we do preorder traversal, we also record the None, the preorder or postorder traversal can identify one tree.

  • The reason here we use Python's id() function is because we want to use an unique label to identify each node. We cannot just use node's val, because there may exist duplicate vals for nodes. And Python's id() can get the id number for each Python object, which is unique.

Note:

  • Time complexity of serialization = O(n), n is the number of nodes.
  • Time complexity of deserialization = O(n), n is the number of nodes.

results matching ""

    No results matching ""