{"id":1360,"date":"2016-05-15T16:46:38","date_gmt":"2016-05-15T23:46:38","guid":{"rendered":"http:\/\/www.coastalvectors.com\/blog\/?p=1360"},"modified":"2017-09-07T14:13:27","modified_gmt":"2017-09-07T21:13:27","slug":"project-euler-10","status":"publish","type":"post","link":"https:\/\/www.coastalvectors.com\/blog\/2016\/05\/project-euler-10\/","title":{"rendered":"Project Euler: 10"},"content":{"rendered":"<p><a href=\"http:\/\/www.coastalvectors.com\/blog\/wp-content\/uploads\/2015\/06\/check.jpg\" rel=\"attachment wp-att-1184\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1184\" src=\"http:\/\/www.coastalvectors.com\/blog\/wp-content\/uploads\/2015\/06\/check.jpg\" alt=\"check\" width=\"150\" height=\"150\" \/><\/a><\/p>\n<p><a href=\"https:\/\/projecteuler.net\/problem=10\">Problem 10<\/a> of <a href=\"https:\/\/projecteuler.net\">Project Euler <\/a>is to find the sum of all primes under 2,000,000.<\/p>\n<p>This isn&#8217;t a hard problem in most languages. I have recently been going through and re-solving problems in various languages for the fun of it. Just for fun, I decided to solve some problems using Bash scripting. Bash has all the constructs of a real programming language, however, the main draw back for problems like this is that it&#8217;s slow. VERY slow. Well, lets see what we can do to speed this problem up.<\/p>\n<h2>Brute Force<\/h2>\n<p>The brute force approach is pretty simple. Loop through every number, n, less than 2,000,000, and try dividing it by every number from 2 to n-1. If it is divisible, break out of the loop and try the next number.<\/p>\n<pre><code>sum=2\r\nfor (( i=3; i&lt;2000000; i++ )); do\r\n\u00a0\u00a0\u00a0 prime=1\r\n\u00a0\u00a0\u00a0 for (( d=2; d&lt;$i; d++ )); do\r\n\u00a0\u00a0 \u00a0if [ $(($i % $d)) -eq 0 ]; then\r\n\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0 prime=0\r\n\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0 break;\r\n\u00a0\u00a0 \u00a0fi\r\n\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 if [ $prime -eq 1 ]; then\r\n\u00a0\u00a0 \u00a0sum=$(($sum + $i))\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\necho $sum\r\n<\/code><\/pre>\n<p>This solution has no optimization, but will still yield a result in a relatively small time-frame for most languages. This would take incredibly long to process primes smaller than 2,000,000. For a benchmark, I have run this through primes less than 2,000, and it took 6.430 seconds. We need to optimize.<\/p>\n<h2>Optimize<\/h2>\n<p>There are a lot of simple changes that we can do to get our compute times down. For instance, if we only loop through odd numbers, we have half as many to process.<\/p>\n<p>Another classic technique to optimize a prime test is to check for factors less than or equal to the square root of the number. The reason this works is that if a factor of our number, n, is larger than the square root of n, we will find the other factor has to be smaller than the square root of n, and will thus be detected.<\/p>\n<pre><code>sum=2\r\nfor (( i=3; i&lt;2000; i+=2 )); do\r\n\u00a0\u00a0\u00a0 prime=1\r\n\u00a0\u00a0\u00a0 for (( d=2; d*d&lt;=$i; d++ )); do\r\n\u00a0\u00a0 \u00a0if [ $(($i % $d)) -eq 0 ]; then\r\n\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0 prime=0\r\n\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0 break;\r\n\u00a0\u00a0 \u00a0fi\r\n\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 if [ $prime -eq 1 ]; then\r\n\u00a0\u00a0 \u00a0sum=$(($sum + $i))\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\necho $sum<\/code><\/pre>\n<p>These two changes alone bring our compute time down to .319 seconds for primes under 2,000! I still wouldn&#8217;t recommend trying computing all primes under 2,000,000. I tried, and killed it after 3 days without a result!<\/p>\n<h2>Sieve of Eratosthenes<\/h2>\n<p>Prime sieves are a super fast way of computing prime numbers without having to do any division. This method is attributed to Eratosthenes of Cyrene, a Greek mathematician living about 200 BC. (You can read more about the history on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Sieve_of_Eratosthenes\">Wikipedia<\/a>).<\/p>\n<p>Basically, you have an array of every number you want to check. Start by making each element of the array equal to 1, indicating it is a prime.\u00a0 Go through the array, starting at two. Whenever you hit a value that is 1, mark all the multiples of that value as 0&#8217;s. So, we look at element 2, which is prime because it&#8217;s value is 1. element whose index is a multiple of 2 should then get marked as 0, since that cannot be prime. You then advance to the next element, and repeat until you&#8217;ve visited every number in the array.<\/p>\n<pre><code>limit=2000\r\nprimes[$limit]=0\r\nprimeList[0]=2\r\nprimeIndex=0\r\nsum=0\r\nfor (( i=2; i&lt;$limit; i++ )); do\r\n\u00a0\u00a0\u00a0 if [[ ${primes[$i]} -eq \\0 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 sum=$(($sum+$i))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeList[$primeIndex]=$i\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeIndex=$(($primeIndex+1))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for (( n=$i; n&lt;$limit; n+=$i )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primes[$n]=1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\n\r\necho $sum<\/code><\/pre>\n<p>Implementing this method definitely improves results for our benchmark of primes less than 2,000; .186 seconds! However, this approach does not scale up to the desired 2,000,000 because Bash arrays are super slow as they get larger. Even though this approach worked great in our benchmark, it will still take upwards of 3 days to complete for all the primes under 2,000,000.<\/p>\n<h2>Combination of Methods<\/h2>\n<p>By using a sieve method combined with our optimized method, I was able to finally get the correct answer in 34 minutes using a Bash script.<\/p>\n<p>I used a sieve to generate the primes between 1 and sqrt(2000000). This was fast since the array wasn&#8217;t nearly as big. I then used the array of primes that I generated to check every odd number between 2 and 2,000,000.<\/p>\n<p>Another shortcut to reduce computation is to make a second array containing the squares of each prime. This means that we don&#8217;t need to perform multiplication in each loop when we are checking for primality.<\/p>\n<pre><code>limit=2000000\r\nsum=0\r\n\r\n# Generate primes up to sqrt of 2000000 using sieve\r\nprimeLimit=$(echo \"scale=0;sqrt($limit*2)\" | bc)\r\nprimes[$primeLimit]=0\r\nprimeList[0]=2\r\nprimeListSquared[0]=4\r\nprimeIndex=0\r\nfor (( i=2; i&lt;$primeLimit; i++ )); do\r\n\u00a0\u00a0\u00a0 if [[ ${primes[$i]} -eq \\0 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 sum=$(($sum+$i))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeList[$primeIndex]=$i\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeListSquared[$primeIndex]=$i*$i\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeIndex=$(($primeIndex+1))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for (( n=$i; n&lt;$primeLimit; n+=$i )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primes[$n]=1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\n\r\n# Make sure primeLimit is an odd number\r\n[[ $(($primeLimit % 2)) -eq 0 ]] &amp;&amp; primeLimit=$(($primeLimit + 1))\r\n\r\n# Use known primes from above to check each remaining number for primality\r\nfor (( i=$primeLimit; i&lt;$limit; i+=2 )); do\r\n\u00a0\u00a0\u00a0 prime=1\r\n\u00a0\u00a0\u00a0 for (( p=0; ${primeListSquared[$p]}&lt;=$i; p++ )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if [[ $(($i % ${primeList[$p]})) -eq 0 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 prime=0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 break\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fi\r\n\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 # Record!\r\n\u00a0\u00a0\u00a0 if [[ $prime -eq 1 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 sum=$(($sum+$i))\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\n\r\necho $sum<\/code><\/pre>\n<p>34 minutes. Finally, a &#8220;reasonable&#8221; result (by some definitions of reasonable).<\/p>\n<h2>Multi-Processing<\/h2>\n<p>The last step I took was by splitting up the task of finding primes into multiple processes. This allowed all 8 of my cores to work on the solution. I moved my prime search method into a function that took a start and stopping point as parameters, and then echoed the sum for that portion back to a while loop that added them all together.<\/p>\n<pre><code>limit=2000000\r\n\r\n# Generate primes up to sqrt of limit\r\nprimeLimit=$(echo \"scale=0;sqrt($limit*2)\" | bc)\r\nprimes[$primeLimit]=0\r\nprimeList[0]=2\r\nprimeListSquared[0]=4\r\nprimeIndex=0\r\nfor (( i=2; i&lt;$primeLimit; i++ )); do\r\n\u00a0\u00a0\u00a0 if [[ ${primes[$i]} -eq \\0 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeList[$primeIndex]=$i\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeListSquared[$primeIndex]=$i*$i\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primeIndex=$(($primeIndex+1))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for (( n=$i; n&lt;$primeLimit; n+=$i )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 primes[$n]=1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 fi\r\ndone\r\n\r\nfunction checkRange() {\r\n\u00a0\u00a0\u00a0 start=$1\r\n\u00a0\u00a0\u00a0 end=$2\r\n\u00a0\u00a0\u00a0 rangeSum=0\r\n\r\n\u00a0\u00a0\u00a0 # Make sure range starts at an odd number\r\n\u00a0\u00a0\u00a0 [[ $start -lt 2 ]] &amp;&amp; start=3 &amp;&amp; rangeSum=2\r\n\u00a0\u00a0\u00a0 [[ $(($start % 2)) -eq 0 ]] &amp;&amp; start=$(($start + 1))\r\n\r\n\u00a0\u00a0\u00a0 # Use known primes to check all remaining numbers for primality\r\n\u00a0\u00a0\u00a0 for (( i=$start; i&lt;$end; i+=2 )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 prime=1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for (( p=0; ${primeListSquared[$p]}&lt;=$i; p++ )); do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if [[ $(($i % ${primeList[$p]})) -eq 0 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 prime=0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 break\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fi\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # Record!\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if [[ $prime -eq 1 ]]; then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rangeSum=$(($rangeSum+$i))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fi\r\n\u00a0\u00a0\u00a0 done\r\n\u00a0\u00a0\u00a0 echo $rangeSum\r\n}\r\n\r\n# Split up task into chunks, and check with multiple processes\r\nsum=0\r\nrange=$(($limit \/ 50))\r\nfor (( start=0; start&lt;$limit; start+=$range )); do\r\n\u00a0\u00a0\u00a0 checkRange $start $(($start+$range)) &amp;\r\ndone | while read result; do\r\n\u00a0\u00a0\u00a0 sum=$(($sum+$result))\r\n\u00a0\u00a0\u00a0 echo $sum\r\ndone | tail -1<\/code><\/pre>\n<p>This method brought the compute time down to 3 minutes 22.075 seconds.<\/p>\n<h2>Conclusion<\/h2>\n<p>Project Euler gives the guideline that if your solution takes more than a minute, you&#8217;re probably doing it wrong. For this particular task though, I think 3 minutes is rather successful given the constraints of Bash. It was a fun project to try to optimize such a relatively simple algorithm within the constraints of this little scripting language.<\/p>\n<p>If you can find any way to speed up the results, please let me know!<\/p>\n<p>Visit <a href=\"https:\/\/github.com\/mryingster\/ProjectEuler\/commits\/master\/problem_010.sh\">the solution<\/a> on my <a href=\"https:\/\/github.com\/mryingster\">GitHub page<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Problem 10 of Project Euler is to find the sum of all primes under 2,000,000. This isn&#8217;t a hard problem in most languages. I have recently been going through and re-solving problems in various languages for the fun of it. Just for fun, I decided to solve some problems using Bash scripting. Bash has all &hellip; <a href=\"https:\/\/www.coastalvectors.com\/blog\/2016\/05\/project-euler-10\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Project Euler: 10<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-1360","post","type-post","status-publish","format-standard","hentry","category-programming"],"_links":{"self":[{"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/posts\/1360","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/comments?post=1360"}],"version-history":[{"count":11,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/posts\/1360\/revisions"}],"predecessor-version":[{"id":1377,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/posts\/1360\/revisions\/1377"}],"wp:attachment":[{"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/media?parent=1360"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/categories?post=1360"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.coastalvectors.com\/blog\/wp-json\/wp\/v2\/tags?post=1360"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}