{"id":27098,"date":"2018-01-03T21:25:09","date_gmt":"2018-01-03T15:55:09","guid":{"rendered":"https:\/\/www.wikitechy.com\/technology\/?p=27098"},"modified":"2018-01-03T21:25:09","modified_gmt":"2018-01-03T15:55:09","slug":"optimal-binary-search-tree","status":"publish","type":"post","link":"https:\/\/www.wikitechy.com\/technology\/optimal-binary-search-tree\/","title":{"rendered":"Optimal Binary Search Tree"},"content":{"rendered":"<p>Given a sorted array <em>keys[0.. n-1]<\/em> of search keys and an array <em>freq[0.. n-1]<\/em> of frequency counts, where <em>freq[i]<\/em> is the number of searches to <em>keys[i]<\/em>. Construct a binary search tree of all keys such that the total cost of all the searches is as small as possible.<span id=\"more-28316\"><\/span><\/p>\n<p>Let us first define the cost of a BST. The cost of a BST node is level of that node multiplied by its frequency. Level of root is 1.<\/p>\n<pre><strong>Example 1<\/strong>\r\nInput:  keys[] = {10, 12}, freq[] = {34, 50}\r\nThere can be following two possible BSTs \r\n        10                       12\r\n          \\                     \/ \r\n           12                 10\r\n          I                     II\r\nFrequency of searches of 10 and 12 are 34 and 50 respectively.\r\nThe cost of tree I is 34*1 + 50*2 = 134\r\nThe cost of tree II is 50*1 + 34*2 = 118 \r\n\r\n<strong>Example 2<\/strong>\r\nInput:  keys[] = {10, 12, 20}, freq[] = {34, 8, 50}\r\nThere can be following possible BSTs\r\n    10                12                 20         10              20\r\n      \\             \/    \\              \/             \\            \/\r\n      12          10     20           12               20         10  \r\n        \\                            \/                 \/           \\\r\n         20                        10                12             12  \r\n     I               II             III             IV             V\r\nAmong all possible BSTs, cost of the fifth BST is minimum.  
\r\nCost of the fifth BST is 1*50 + 2*34 + 3*8 = 142\r\n<\/pre>\n<ul>\n<li><strong>Optimal Substructure:<\/strong><\/li>\n<\/ul>\n<p>The optimal cost for freq[i..j] can be calculated recursively using the following formula.<\/p>\n<p><img decoding=\"async\" class=\"alignleft size-full wp-image-27115\" src=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-Binary-Search-Tree.png\" alt=\"Optimal Binary Search Tree\" width=\"541\" height=\"43\" srcset=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-Binary-Search-Tree.png 541w, https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-Binary-Search-Tree-300x24.png 300w\" sizes=\"(max-width: 541px) 100vw, 541px\" \/><br \/>\nIn text form: optCost(i, j) = (freq[i] + ... + freq[j]) + the minimum, over r from i to j, of optCost(i, r-1) + optCost(r+1, j), where the cost of an empty range is 0.<br \/>\nWe need to calculate <em><strong>optCost(0, n-1)<\/strong><\/em> to find the result.<\/p>\n<p>The idea behind the formula is simple: we try each node in turn as the root (r varies from i to j in the second term). When we make the <em>rth<\/em> node the root, we recursively calculate the optimal cost for i to r-1 and for r+1 to j.<br \/>\nWe add the sum of frequencies from i to j (the first term in the formula) because every search in this range passes through the root, so one comparison is made at the root for every search.<\/p>\n[ad type=\u201dbanner\u201d]\n<ul>\n<li><strong>Overlapping Subproblems<\/strong><\/li>\n<\/ul>\n<p>Following is a recursive implementation that simply follows the recursive structure mentioned above.<\/p>\n[pastacode lang=\u201dc\u201d 
manual=\u201d%2F%2F%20A%20naive%20recursive%20implementation%20of%20optimal%20binary%20search%20tree%20problem%0A%23include%20%3Cstdio.h%3E%0A%23include%20%3Climits.h%3E%0A%20%0A%2F%2F%20A%20utility%20function%20to%20get%20sum%20of%20array%20elements%20freq%5Bi%5D%20to%20freq%5Bj%5D%0Aint%20sum(int%20freq%5B%5D%2C%20int%20i%2C%20int%20j)%3B%0A%20%0A%2F%2F%20A%20recursive%20function%20to%20calculate%20cost%20of%20optimal%20binary%20search%20tree%0Aint%20optCost(int%20freq%5B%5D%2C%20int%20i%2C%20int%20j)%0A%7B%0A%20%20%20%2F%2F%20Base%20cases%0A%20%20%20if%20(j%20%3C%20i)%20%20%20%20%20%20%2F%2F%20If%20there%20are%20no%20elements%20in%20this%20subarray%0A%20%20%20%20%20return%200%3B%0A%20%20%20if%20(j%20%3D%3D%20i)%20%20%20%20%20%2F%2F%20If%20there%20is%20one%20element%20in%20this%20subarray%0A%20%20%20%20%20return%20freq%5Bi%5D%3B%0A%20%0A%20%20%20%2F%2F%20Get%20sum%20of%20freq%5Bi%5D%2C%20freq%5Bi%2B1%5D%2C%20\u2026%20freq%5Bj%5D%0A%20%20%20int%20fsum%20%3D%20sum(freq%2C%20i%2C%20j)%3B%0A%20%0A%20%20%20%2F%2F%20Initialize%20minimum%20value%0A%20%20%20int%20min%20%3D%20INT_MAX%3B%0A%20%0A%20%20%20%2F%2F%20One%20by%20one%20consider%20all%20elements%20as%20root%20and%20recursively%20find%20cost%0A%20%20%20%2F%2F%20of%20the%20BST%2C%20compare%20the%20cost%20with%20min%20and%20update%20min%20if%20needed%0A%20%20%20for%20(int%20r%20%3D%20i%3B%20r%20%3C%3D%20j%3B%20%2B%2Br)%0A%20%20%20%7B%0A%20%20%20%20%20%20%20int%20cost%20%3D%20optCost(freq%2C%20i%2C%20r-1)%20%2B%20optCost(freq%2C%20r%2B1%2C%20j)%3B%0A%20%20%20%20%20%20%20if%20(cost%20%3C%20min)%0A%20%20%20%20%20%20%20%20%20%20min%20%3D%20cost%3B%0A%20%20%20%7D%0A%20%0A%20%20%20%2F%2F%20Return%20minimum%20value%0A%20%20%20return%20min%20%2B%20fsum%3B%0A%7D%0A%20%0A%2F%2F%20The%20main%20function%20that%20calculates%20minimum%20cost%20of%20a%20Binary%20Search%20Tree.%0A%2F%2F%20It%20mainly%20uses%20optCost()%20to%20find%20the%20optimal%20cost.%0Aint%20optimalSearchTree(int%20keys%5B%5D%2C%20int%20freq%5B%5D%2C%20int%20n)%
0A%7B%0A%20%20%20%20%20%2F%2F%20Here%20array%20keys%5B%5D%20is%20assumed%20to%20be%20sorted%20in%20increasing%20order.%0A%20%20%20%20%20%2F%2F%20If%20keys%5B%5D%20is%20not%20sorted%2C%20then%20add%20code%20to%20sort%20keys%2C%20and%20rearrange%0A%20%20%20%20%20%2F%2F%20freq%5B%5D%20accordingly.%0A%20%20%20%20%20return%20optCost(freq%2C%200%2C%20n-1)%3B%0A%7D%0A%20%0A%2F%2F%20A%20utility%20function%20to%20get%20sum%20of%20array%20elements%20freq%5Bi%5D%20to%20freq%5Bj%5D%0Aint%20sum(int%20freq%5B%5D%2C%20int%20i%2C%20int%20j)%0A%7B%0A%20%20%20%20int%20s%20%3D%200%3B%0A%20%20%20%20for%20(int%20k%20%3D%20i%3B%20k%20%3C%3Dj%3B%20k%2B%2B)%0A%20%20%20%20%20%20%20s%20%2B%3D%20freq%5Bk%5D%3B%0A%20%20%20%20return%20s%3B%0A%7D%0A%20%0A%2F%2F%20Driver%20program%20to%20test%20above%20functions%0Aint%20main()%0A%7B%0A%20%20%20%20int%20keys%5B%5D%20%3D%20%7B10%2C%2012%2C%2020%7D%3B%0A%20%20%20%20int%20freq%5B%5D%20%3D%20%7B34%2C%208%2C%2050%7D%3B%0A%20%20%20%20int%20n%20%3D%20sizeof(keys)%2Fsizeof(keys%5B0%5D)%3B%0A%20%20%20%20printf(%22Cost%20of%20Optimal%20BST%20is%20%25d%20%22%2C%20optimalSearchTree(keys%2C%20freq%2C%20n))%3B%0A%20%20%20%20return%200%3B%0A%7D\u201d message=\u201dC\u201d highlight=\u201d\u201d provider=\u201dmanual\u201d\/]\n<p><strong>Output :<\/strong><\/p>\n<pre>Cost of Optimal BST is 142<\/pre>\n<p>Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems again and again. 
We can see many subproblems being repeated in the following recursion tree for freq[1..4].<\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"size-full wp-image-27124 aligncenter\" src=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-binary-tree.png\" alt=\"Optimal binary tree\" width=\"656\" height=\"265\" srcset=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-binary-tree.png 656w, https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/06\/Optimal-binary-tree-300x121.png 300w\" sizes=\"(max-width: 656px) 100vw, 656px\" \/><\/p>\n<p>Since the same subproblems are solved again and again, this problem has the Overlapping Subproblems property. So the optimal BST problem has both properties of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array cost[][] in a bottom-up manner.<\/p>\n[ad type=\u201dbanner\u201d]\n<p><strong>Dynamic Programming Solution<\/strong><br \/>\nFollowing is a C\/C++ implementation for the optimal BST problem using Dynamic Programming. We use an auxiliary array cost[n][n] to store the solutions of subproblems. cost[0][n-1] will hold the final result. The challenge in the implementation is that all diagonal values must be filled first, then the values which lie on the line just above the diagonal. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all cost[i][i+2] values. So how do we fill the 2D array in this order? The idea used in the implementation is the same as in the Matrix Chain Multiplication problem: we use a variable \u2018L\u2019 for chain length and increment \u2018L\u2019 one by one. 
We calculate column number \u2018j\u2019 using the values of \u2018i\u2019 and \u2018L\u2019.<\/p>\n[pastacode lang=\u201dc\u201d manual=\u201d%2F%2F%20Dynamic%20Programming%20code%20for%20Optimal%20Binary%20Search%20Tree%20Problem%0A%23include%20%3Cstdio.h%3E%0A%23include%20%3Climits.h%3E%0A%20%0A%2F%2F%20A%20utility%20function%20to%20get%20sum%20of%20array%20elements%20freq%5Bi%5D%20to%20freq%5Bj%5D%0Aint%20sum(int%20freq%5B%5D%2C%20int%20i%2C%20int%20j)%3B%0A%20%0A%2F*%20A%20Dynamic%20Programming%20based%20function%20that%20calculates%20minimum%20cost%20of%0A%20%20%20a%20Binary%20Search%20Tree.%20*%2F%0Aint%20optimalSearchTree(int%20keys%5B%5D%2C%20int%20freq%5B%5D%2C%20int%20n)%0A%7B%0A%20%20%20%20%2F*%20Create%20an%20auxiliary%202D%20matrix%20to%20store%20results%20of%20subproblems%20*%2F%0A%20%20%20%20int%20cost%5Bn%5D%5Bn%5D%3B%0A%20%0A%20%20%20%20%2F*%20cost%5Bi%5D%5Bj%5D%20%3D%20Optimal%20cost%20of%20binary%20search%20tree%20that%20can%20be%0A%20%20%20%20%20%20%20formed%20from%20keys%5Bi%5D%20to%20keys%5Bj%5D.%0A%20%20%20%20%20%20%20cost%5B0%5D%5Bn-1%5D%20will%20store%20the%20resultant%20cost%20*%2F%0A%20%0A%20%20%20%20%2F%2F%20For%20a%20single%20key%2C%20cost%20is%20equal%20to%20frequency%20of%20the%20key%0A%20%20%20%20for%20(int%20i%20%3D%200%3B%20i%20%3C%20n%3B%20i%2B%2B)%0A%20%20%20%20%20%20%20%20cost%5Bi%5D%5Bi%5D%20%3D%20freq%5Bi%5D%3B%0A%20%0A%20%20%20%20%2F%2F%20Now%20we%20need%20to%20consider%20chains%20of%20length%202%2C%203%2C%20\u2026%20.%0A%20%20%20%20%2F%2F%20L%20is%20chain%20length.%0A%20%20%20%20for%20(int%20L%3D2%3B%20L%3C%3Dn%3B%20L%2B%2B)%0A%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%2F%2F%20i%20is%20row%20number%20in%20cost%5B%5D%5B%5D%0A%20%20%20%20%20%20%20%20for%20(int%20i%3D0%3B%20i%3C%3Dn-L%3B%20i%2B%2B)%0A%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%2F%2F%20Get%20column%20number%20j%20from%20row%20number%20i%20and%20chain%20length%20L%0A%20%20%20%20%20%20%20%20%20%20%20%20int%20j%20%3D%20i%2BL-1%3B%0A%20%20
%20%20%20%20%20%20%20%20%20%20cost%5Bi%5D%5Bj%5D%20%3D%20INT_MAX%3B%0A%20%0A%20%20%20%20%20%20%20%20%20%20%20%20%2F%2F%20Try%20making%20all%20keys%20in%20interval%20keys%5Bi..j%5D%20as%20root%0A%20%20%20%20%20%20%20%20%20%20%20%20for%20(int%20r%3Di%3B%20r%3C%3Dj%3B%20r%2B%2B)%0A%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%2F%2F%20c%20%3D%20cost%20when%20keys%5Br%5D%20becomes%20root%20of%20this%20subtree%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20int%20c%20%3D%20((r%20%3E%20i)%3F%20cost%5Bi%5D%5Br-1%5D%3A0)%20%2B%20%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20((r%20%3C%20j)%3F%20cost%5Br%2B1%5D%5Bj%5D%3A0)%20%2B%20%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20sum(freq%2C%20i%2C%20j)%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20if%20(c%20%3C%20cost%5Bi%5D%5Bj%5D)%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20cost%5Bi%5D%5Bj%5D%20%3D%20c%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%20%20return%20cost%5B0%5D%5Bn-1%5D%3B%0A%7D%0A%20%0A%2F%2F%20A%20utility%20function%20to%20get%20sum%20of%20array%20elements%20freq%5Bi%5D%20to%20freq%5Bj%5D%0Aint%20sum(int%20freq%5B%5D%2C%20int%20i%2C%20int%20j)%0A%7B%0A%20%20%20%20int%20s%20%3D%200%3B%0A%20%20%20%20for%20(int%20k%20%3D%20i%3B%20k%20%3C%3Dj%3B%20k%2B%2B)%0A%20%20%20%20%20%20%20s%20%2B%3D%20freq%5Bk%5D%3B%0A%20%20%20%20return%20s%3B%0A%7D%0A%20%0A%2F%2F%20Driver%20program%20to%20test%20above%20functions%0Aint%20main()%0A%7B%0A%20%20%20%20int%20keys%5B%5D%20%3D%20%7B10%2C%2012%2C%2020%7D%3B%0A%20%20%20%20int%20freq%5B%5D%20%3D%20%7B34%2C%208%2C%2050%7D%3B%0A%20%20%20%20int%20n%20%3D%20sizeof(keys)%2Fsizeof(keys%5B0%5D)%3B%0A%20%20%20%20printf(%22Cost%20of%20Optimal%20BST%20is%20%25d%20%22%2C%20optimalSearchTree(keys%2C%20freq%2C%20n))%3B%0A%20%20%20%20return%200%3B%0A%7D\u201d message=\u201dC\u201d highlight=\u201d\u201d 
provider=\u201dmanual\u201d\/]\n<p><strong>Output :<\/strong><\/p>\n<pre>Cost of Optimal BST is 142<\/pre>\n<p><strong>Notes<\/strong><\/p>\n<ul>\n<li>The time complexity of the above solution is O(n^4). The time complexity can be easily reduced to O(n^3) by pre-calculating the sums of frequencies instead of calling sum() again and again.<\/li>\n<li>In the above solutions, we have computed the optimal cost only. The solutions can be easily modified to store the structure of the BST as well. We can create another auxiliary 2D array of size n x n to store the structure of the tree: all we need to do is store the chosen \u2018r\u2019 for each cost[i][j] in the innermost loop.<\/li>\n<\/ul>\n[ad type=\u201dbanner\u201d]\n","protected":false},"excerpt":{"rendered":"<p>Optimal Binary Search Tree &#8211; Dynamic Programming Given a sorted array keys[0.. n-1] of search keys and an array freq[0.. n-1] of frequency counts,<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[82509,70145],"tags":[80836,80831,80832,80835,80841,80828,80824,80838,80839,72852],"class_list":["post-27098","post","type-post","status-publish","format-standard","hentry","category-binary-search-tree-2","category-dynamic-programming","tag-optimal-binary-search-tree-algorithm-using-dynamic-programming","tag-optimal-binary-search-tree-tutorial-point","tag-optimal-binary-search-tree-using-dynamic-programming","tag-optimal-binary-search-tree-using-dynamic-programming-algorithm","tag-optimal-binary-search-tree-using-dynamic-programming-ppt","tag-optimal-binary-search-tree-youtube","tag-optimal-binary-search-trees","tag-optimal-binary-search-trees-using-dynamic-programming","tag-optimal-binary-tree-dynamic-programming","tag-problems-on-dynamic-programming"],"_links":{"self":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts\/27098","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/
www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/comments?post=27098"}],"version-history":[{"count":0,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts\/27098\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/media?parent=27098"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/categories?post=27098"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/tags?post=27098"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}