{"id":4096,"date":"2012-12-03T10:38:27","date_gmt":"2012-12-03T09:38:27","guid":{"rendered":"https:\/\/blogs.mentor.com\/colinwalls\/?p=4096"},"modified":"2012-12-03T10:38:27","modified_gmt":"2012-12-03T09:38:27","slug":"less-puzzled","status":"publish","type":"post","link":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/2012\/12\/03\/less-puzzled\/","title":{"rendered":"Less puzzled"},"content":{"rendered":"<p>Some weeks ago, I made a <a href=\"https:\/\/blogs.mentor.com\/colinwalls\/blog\/2012\/10\/22\/a-puzzle\/\" target=\"_blank\" rel=\"noopener noreferrer\">posting<\/a> in which I presented some code and then considered how it might be optimized and asked for input from readers. I was confident that I would not have found every possibility. Indeed, there were some useful comments, which I really appreciated. [It is nice to know that there is someone out there!]<\/p>\n<p>Apart from the comments, I had an email from <a href=\"mailto:mlluber@gmx.de\" target=\"_blank\" rel=\"noopener noreferrer\">Michael Luber<\/a> in Germany, who looked at the problem in some detail. 
With his permission, I have reproduced his results here &#8230;<!--more--><\/p>\n<p>I also like puzzles very much and, therefore, I would like to comment on this post.<\/p>\n<p>I played around a bit and also found that much of what I could think of would probably be addressed by most compilers.<\/p>\n<p>However, in the end, I came up with this:<\/p>\n<pre>\/*******************************************************************\/<\/pre>\n<pre>#define ITERATIONS 50<\/pre>\n<pre>int mandelbrot(float a)<\/pre>\n<pre>{<\/pre>\n<pre style=\"padding-left: 30px\">int i;<\/pre>\n<pre style=\"padding-left: 30px\">float b = a;<\/pre>\n<pre style=\"padding-left: 30px\">for (i=0; i&lt;ITERATIONS; i++)<\/pre>\n<pre style=\"padding-left: 30px\">{<\/pre>\n<pre style=\"padding-left: 60px\">if (*((unsigned int*) &amp;b) &amp; (unsigned int)0x40000000)<\/pre>\n<pre style=\"padding-left: 60px\">{<\/pre>\n<pre style=\"padding-left: 90px\">return i;<\/pre>\n<pre style=\"padding-left: 60px\">}<\/pre>\n<pre style=\"padding-left: 60px\">b = b * b + a;<\/pre>\n<pre style=\"padding-left: 30px\">}<\/pre>\n<pre style=\"padding-left: 30px\">return i;<\/pre>\n<pre>}<\/pre>\n<pre>\/*******************************************************************\/<\/pre>\n<p>&nbsp;<\/p>\n<p>Here is what I considered:<\/p>\n<ul>\n<li>Some compilers might initialize all uninitialized <strong>auto<\/strong> variables with zero, so I preferred to combine the declaration of <strong>b<\/strong> with the assignment of <strong>a<\/strong> to it in a single statement (as Peter already did).<\/li>\n<li>I would rather let the variable <strong>i<\/strong> run from zero to (<strong>ITERATIONS &#8211; 1<\/strong>). 
This may save initializing it (see above), and saves the subtraction of 1 in the <strong>return<\/strong> statement.<\/li>\n<li>This also allows <strong>i<\/strong> to be returned instead of <strong>ITERATIONS<\/strong> if the <strong>for<\/strong>-loop runs to completion, which may save loading the constant <strong>ITERATIONS<\/strong> into a register (assuming <strong>i<\/strong> is already in a register).<\/li>\n<li>To check whether the absolute value of a <strong>float<\/strong> is greater than 2, I just test whether the second-highest bit (the top bit of the exponent) is set, which should work considering how IEEE 754 single-precision floats are represented in memory. Of course, this &#8220;hack&#8221; introduces a hardware dependency, which is generally not a good idea. However, if the requirement is for performance above all, I could consider that. On x86-gcc, this trick reduces the check to a total of four instructions, including those for loading the float from the stack as well as the conditional jump:<\/li>\n<\/ul>\n<pre>mov -0x8(%ebp),%eax<\/pre>\n<pre>and $0x40000000,%eax<\/pre>\n<pre>test %eax,%eax<\/pre>\n<pre>je 0x4010bd<\/pre>\n<p>&nbsp;<\/p>\n<p>Maybe I would put this in a macro and add some compile-time checks to be safe, in case the compiler changes, or I go to 64-bit or something like that.<\/p>\n<p>I&#8217;m also aware that my code checks for &gt;=2 instead of &gt;2, but when it comes to comparison of floating point numbers, this seems acceptable.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some weeks ago, I made a posting in which I presented some code and then considered how it might 
be&#8230;<\/p>\n","protected":false},"author":71677,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spanish_translation":"","french_translation":"","german_translation":"","italian_translation":"","polish_translation":"","japanese_translation":"","chinese_translation":"","footnotes":""},"categories":[1],"tags":[313,300,327,340],"industry":[],"product":[],"coauthors":[],"class_list":["post-4096","post","type-post","status-publish","format-standard","hentry","category-news","tag-c","tag-embedded-software","tag-optimization","tag-programming-languages"],"_links":{"self":[{"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/posts\/4096","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/users\/71677"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/comments?post=4096"}],"version-history":[{"count":0,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/posts\/4096\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/media?parent=4096"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/categories?post=4096"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/tags?post=4096"},{"taxonomy":"industry","embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/industry?post=4096"},{"taxonomy":"product","embeddable":true,"href":
"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/product?post=4096"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blogs.stage.sw.siemens.com\/embedded-software\/wp-json\/wp\/v2\/coauthors?post=4096"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}