{"id":3389,"date":"2017-04-01T18:30:37","date_gmt":"2017-04-01T13:00:37","guid":{"rendered":"https:\/\/www.wikitechy.com\/technology\/?p=3389"},"modified":"2017-04-01T18:30:37","modified_gmt":"2017-04-01T13:00:37","slug":"list-big-o-php-functions","status":"publish","type":"post","link":"https:\/\/www.wikitechy.com\/technology\/list-big-o-php-functions\/","title":{"rendered":"[ Solved &#8211; 1 Answers ] PHP &#8211; List of Big-O for PHP functions"},"content":{"rendered":"<ul>\n<li>After using PHP for a while, we noticed that not all PHP built in functions as fast as expected.<\/li>\n<li>Consider the below two possible implementations of a function that finds if a number is prime using a cached array of primes.<\/li>\n<\/ul>\n<div class=\"code-embed-wrapper\"> <div class=\"code-embed-infos\"> <span class=\"code-embed-name\">Php Code<\/span> <\/div> <pre class=\"language-php code-embed-pre line-numbers\"  data-start=\"1\" data-line-offset=\"0\"><code class=\"language-php code-embed-code\">\/\/very slow for large $prime_array<br\/>$prime_array = array( 2, 3, 5, 7, 11, 13, .... 104729, ... );<br\/>$result_array = array();<br\/>foreach( $prime_array =&gt; $number ) {<br\/>    $result_array[$number] = in_array( $number, $large_prime_array );<br\/>}<br\/><br\/>\/\/speed is much less dependent on size of $prime_array, and runs much faster.<br\/>$prime_array =&gt; array( 2 =&gt; NULL, 3 =&gt; NULL, 5 =&gt; NULL, 7 =&gt; NULL,<br\/>                       11 =&gt; NULL, 13 =&gt; NULL, .... 104729 =&gt; NULL, ... );<br\/>foreach( $prime_array =&gt; $number ) {<br\/>    $result_array[$number] = array_key_exists( $number, $large_prime_array );<br\/>}<\/code><\/pre> <\/div>\n<ul>\n<li>This is because in_array is implemented with a linear search O(n) which will linearly slow down as $prime_array grows. Where the array_key_exists function is implemented with a hash lookup O(1) which will not slow down unless the hash table gets extremely populated (in which case it&#8217;s only O(n)).<\/li>\n<li>So far we had to discover the big-O&#8217;s via trial and error, and occasionally looking at the source code. Now for the question&#8230;<\/li>\n<\/ul>\n<p><strong>Is there was a list of the theoretical (or practical) big O times for all* the PHP built in functions?<\/strong><\/p>\n<p>*or at least the interesting ones<\/p>\n<p>For example find it very hard to predict what the big O of functions listed because the possible implementation depends on unknown core data structures of PHP: array_merge, array_merge_recursive, array_reverse, array_intersect, array_combine, str_replace (with array inputs), etc.<\/p>\n<p><label class=\"label label-info\">SOLUTION :1<\/label><\/p>\n<ul>\n<li>We gone though and either via benchmark or code-skimming to characterize the array_* functions.<\/li>\n<\/ul>\n<p><strong>Note:<\/strong><\/p>\n<ul>\n<li>All the Big-O where calculated assuming a hash lookup is O(1) even though it&#8217;s really O(n).<\/li>\n<li>The coefficient of the n is so low, the ram overhead of storing a large enough array would hurt you before the characteristics of lookup Big-O would start taking effect.<\/li>\n<li>For example the difference between a call to array_key_exists at N=1 and N=1,000,000 is ~50% time increase.<\/li>\n<\/ul>\n<h4 id=\"interesting-points\"><span style=\"color: #ff6600;\">Interesting Points:<\/span><\/h4>\n<ol>\n<li>isset\/array_key_exists\u00a0is much faster than\u00a0in_array\u00a0and\u00a0array_search<\/li>\n<li>+(union) is a bit faster than\u00a0array_merge\u00a0(and looks nicer). But it does work differently so keep that in mind.<\/li>\n<li>shuffle\u00a0is on the same Big-O tier as\u00a0array_rand<\/li>\n<li>array_pop\/array_push\u00a0is faster than\u00a0array_shift\/array_unshift\u00a0due to re-index penalty<\/li>\n<\/ol>\n[ad type=&#8221;banner&#8221;]\n<h4 id=\"lookups\">Lookups:<\/h4>\n<ul>\n<li>array_key_exists O(n) but really close to O(1) &#8211; this is because of linear polling in collisions, but because the chance of collisions is very small, the coefficient is also very small. I find you treat hash lookups as O(1) to give a more realistic big-O. For example the different between N=1000 and N=100000 is only about 50% slow down.<\/li>\n<li>isset( $array[$index] ) O(n) but really close to O(1) &#8211; it uses the same lookup as array_key_exists. Since it&#8217;s language construct, will cache the lookup if the key is hardcoded, resulting in speed up in cases where the same key is used repeatedly.<\/li>\n<li>in_array O(n) &#8211; this is because it does a linear search though the array until it finds the value.<\/li>\n<li>array_search O(n) &#8211; it uses the same core function as in_array but returns value.<\/li>\n<\/ul>\n[ad type=&#8221;banner&#8221;]\n<h4 id=\"queue-functions\"><span style=\"color: #800000;\">Queue functions:<\/span><\/h4>\n<ul>\n<li>array_push O(\u2211 var_i, for all i)<\/li>\n<li>array_pop O(1)<\/li>\n<li>array_shift O(n) &#8211; it has to reindex all the keys<\/li>\n<li>array_unshift O(n + \u2211 var_i, for all i) &#8211; it has to reindex all the keys<\/li>\n<\/ul>\n<h4 id=\"array-intersection-union-subtraction\"><span style=\"color: #800080;\"><strong>Array Intersection, Union, Subtraction:<\/strong><\/span><\/h4>\n<ul>\n<li>array_intersect_key if intersection 100% do O(Max(param_i_size)*\u2211param_i_count, for all i), if intersection 0% intersect O(\u2211param_i_size, for all i)<\/li>\n<li>array_intersect if intersection 100% do O(n^2*\u2211param_i_count, for all i), if intersection 0% intersect O(n^2)<\/li>\n<li>array_intersect_assoc if intersection 100% do O(Max(param_i_size)*\u2211param_i_count, for all i), if intersection 0% intersect O(\u2211param_i_size, for all i)<\/li>\n<li>array_diff O(\u03c0 param_i_size, for all i) &#8211; That&#8217;s product of all the param_sizes<\/li>\n<li>array_diff_key O(\u2211 param_i_size, for i != 1) &#8211; this is because we don&#8217;t need to iterate over the first array.<\/li>\n<li>array_merge O( \u2211 array_i, i != 1 ) &#8211; doesn&#8217;t need to iterate over the first array<\/li>\n<li>+ (union) O(n), where n is size of the 2nd array (ie array_first + array_second) &#8211; less overhead than array_merge since it doesn&#8217;t have to renumber<\/li>\n<li>array_replace O( \u2211 array_i, for all i )<\/li>\n<\/ul>\n<h4 id=\"random\">Random:<\/h4>\n<ul>\n<li>shuffle O(n)<\/li>\n<li>array_rand O(n) &#8211; Requires a linear poll.<\/li>\n<\/ul>\n<p><strong>Obvious Big-O:<\/strong><\/p>\n<ul>\n<li>array_fill O(n)<\/li>\n<li>array_fill_keys O(n)<\/li>\n<li>range O(n)<\/li>\n<li>array_splice O(offset + length)<\/li>\n<li>array_slice O(offset + length) or O(n) if length = NULL<\/li>\n<li>array_keys O(n)<\/li>\n<li>array_values O(n)<\/li>\n<li>array_reverse O(n)<\/li>\n<\/ul>\n<p><strong>EDIT:<\/strong><\/p>\n<ul>\n<li>For those who doubt that PHP array lookups are O(N), we written a benchmark to test that (they are still effectively O(1) for most realistic values).<\/li>\n<\/ul>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"wp-image-3399 size-full alignnone\" src=\"https:\/\/www.wikitechy.com\/technology\/wp-content\/uploads\/2017\/03\/Picture2-8.png\" alt=\"\" width=\"1059\" height=\"375\" \/><\/p>\n[ad type=&#8221;banner&#8221;]\n<div class=\"code-embed-wrapper\"> <div class=\"code-embed-infos\"> <span class=\"code-embed-name\">Php Code<\/span> <\/div> <pre class=\"language-php code-embed-pre line-numbers\"  data-start=\"1\" data-line-offset=\"0\"><code class=\"language-php code-embed-code\">$tests = 1000000;<br\/>$max = 5000001;<br\/>for( $i = 1; $i &lt;= $max; $i += 10000 ) {<br\/>    \/\/create lookup array<br\/>    $array = array_fill( 0, $i, NULL );<br\/><br\/>    \/\/build test indexes<br\/>    $test_indexes = array();<br\/>    for( $j = 0; $j &lt; $tests; $j++ ) {<br\/>        $test_indexes[] = rand( 0, $i-1 );<br\/>    }<br\/><br\/>    \/\/benchmark array lookups<br\/>    $start = microtime( TRUE );<br\/>    foreach( $test_indexes as $test_index ) {<br\/>        $value = $array[ $test_index ];<br\/>        unset( $value );<br\/>    }<br\/>    $stop = microtime( TRUE );<br\/>    unset( $array, $test_indexes, $test_index );<br\/>    printf( &quot;%d,%1.15f\\n&quot;, $i, $stop - $start ); \/\/time per 1mil lookups<br\/>    unset( $stop, $start );<br\/>}<\/code><\/pre> <\/div>\n","protected":false},"excerpt":{"rendered":"<p>List of Big-O for PHP functions After using PHP for a while now, we noticed that not all PHP built in functions as fast<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[6270,6276,6275,6272,6277,6271,6278,6274,4090,6273,2363],"class_list":["post-3389","post","type-post","status-publish","format-standard","hentry","category-php","tag-big-o","tag-big-o-of-two-functions","tag-big-o-analysis-with-functions-within-functions","tag-big-o-for-eight-year-olds","tag-big-o-of-php-arrays","tag-how-do-you-calculateapproximate-it","tag-php-array-performance-checking-key-existence-vs-checking-values-existence","tag-preferred-method-to-store-php-arrays-json_encode-vs-serialize","tag-startswith-and-endswith-functions-in-php","tag-what-is-a-plain-english-explanation-of-big-o-notation","tag-why-shouldnt-i-use-mysql_-functions-in-php"],"_links":{"self":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts\/3389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/comments?post=3389"}],"version-history":[{"count":0,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/posts\/3389\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/media?parent=3389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/categories?post=3389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wikitechy.com\/technology\/wp-json\/wp\/v2\/tags?post=3389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}