Comments on Guy Ellis' Tech Blog: Speed of Contains() on string list and hash set in C#
Guy Ellis, http://www.blogger.com/profile/02574435376236977220
Feed updated 2023-04-23T00:05:10.829-07:00

Johann (2012-11-13T17:21:08.000-08:00):
@matsolof --> In a real application you would cache your instance of HashSet.
Instantiating it each time you call the function is what causes this difference in time. I didn't test it, but I'm pretty sure caching it would make a huge difference in time.
Kind regards!

matsolof (2012-04-25T15:14:47.000-07:00):
If all you want is to get a true or false value, I suggest using switch. Sample code:
public bool isNumber_hashSet(char c)
{
    bool b = false;
    HashSet<char> hs = new HashSet<char> { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', };
    if (hs.Contains(c))
    {
        b = true;
    }
    return b;
}
public bool isNumber_switch(char c)
{
    bool b = false;
    switch (c)
    {
        case '0': b = true; break;
        case '1': b = true; break;
        case '2': b = true; break;
        case '3': b = true; break;
        case '4': b = true; break;
        case '5': b = true; break;
        case '6': b = true; break;
        case '7': b = true; break;
        case '8': b = true; break;
        case '9': b = true; break;
    }
    return b;
}
I ran each of the methods 1000000 times. Result:
isNumber_hashSet: 1700 ms
isNumber_switch: 35 ms
In other words, switch is about 50(!) times faster than HashSet.
Happy coding!

guy ellis (2010-12-08T12:01:13.000-08:00):
Thanks for the comments Bill - I fixed the code as per your recommendations and bug finds and have rerun it and added the new results.

Bill Brown (2010-12-08T10:58:10.000-08:00):
Your stopwatch.Reset() should be stopwatch.Restart() since the former stops the stopwatch and resets it to zero whereas the latter does that and also starts it again.
Also, a more representative test would be to populate the list and hashset with 100MM different strings and then find one of them. Do 10K of those requests and then average the times and you'll get some representative numbers.
As it stands now, the list is hobbled because the value is always at the end. The hash of the search string could very well place it close to the beginning or to wherever the algorithm begins its search. Realistically, the search string could be located anywhere within either collection and that could affect the performance of either.
Quibbles aside, there is no question in my mind that Contains on a hash-based data structure will perform better than on any list-based one. Thanks for putting some numbers behind it!
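
Bill Brown's two suggestions (call stopwatch.Restart() rather than Reset(), and search for keys located anywhere in the collection rather than always at the end) could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the benchmark from the post: the names ContainsBenchmark and TimeLookups are hypothetical, and the sizes (100,000 strings, 1,000 lookups) are assumptions picked so it finishes quickly, far below the 100MM strings and 10K requests he proposes.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ContainsBenchmark
{
    // Hypothetical helper: times `lookups` Contains() calls, cycling through `keys`.
    public static TimeSpan TimeLookups(ICollection<string> collection,
                                       IReadOnlyList<string> keys,
                                       int lookups,
                                       Stopwatch stopwatch)
    {
        int found = 0;
        // Restart() resets the elapsed time to zero AND starts timing.
        // Reset() alone would leave the stopwatch stopped.
        stopwatch.Restart();
        for (int i = 0; i < lookups; i++)
        {
            if (collection.Contains(keys[i % keys.Count]))
            {
                found++;
            }
        }
        stopwatch.Stop();

        // Every key was taken from the collection, so every lookup should hit.
        if (found != lookups)
        {
            throw new InvalidOperationException("Expected every key to be found.");
        }
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        const int size = 100000;  // assumption: scaled down from the suggested 100MM
        const int lookups = 1000; // assumption: scaled down from the suggested 10K

        var list = new List<string>(size);
        for (int i = 0; i < size; i++)
        {
            list.Add("item" + i);
        }
        var set = new HashSet<string>(list);

        // Pick search keys from random positions so the list is not always
        // searched to its very end. The fixed seed keeps runs repeatable.
        var random = new Random(42);
        var keys = new string[lookups];
        for (int i = 0; i < lookups; i++)
        {
            keys[i] = list[random.Next(size)];
        }

        var stopwatch = new Stopwatch();
        TimeSpan listTime = TimeLookups(list, keys, lookups, stopwatch);
        TimeSpan setTime = TimeLookups(set, keys, lookups, stopwatch);

        Console.WriteLine("List<string>:    {0:F4} ms per lookup",
                          listTime.TotalMilliseconds / lookups);
        Console.WriteLine("HashSet<string>: {0:F4} ms per lookup",
                          setTime.TotalMilliseconds / lookups);
    }
}
```

No timings are quoted here because they depend on the machine and runtime. The same Stopwatch instance is reused for both measurements, which is exactly the case where Restart() matters.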
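
Johann's point about caching the HashSet, which he says he did not test, could be checked with something like the following. It is a hedged sketch rather than anyone's actual code from this thread: DigitCheck, TimeMs and the three method names are hypothetical, and the per-call variant simply mirrors the shape of matsolof's isNumber_hashSet.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DigitCheck
{
    // Built once and reused by every call (the caching Johann describes).
    private static readonly HashSet<char> Digits = new HashSet<char>
    {
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
    };

    public static bool IsNumberCachedHashSet(char c)
    {
        return Digits.Contains(c);
    }

    // Same shape as matsolof's isNumber_hashSet: a new set on every call.
    public static bool IsNumberPerCallHashSet(char c)
    {
        var hs = new HashSet<char> { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
        return hs.Contains(c);
    }

    public static bool IsNumberSwitch(char c)
    {
        switch (c)
        {
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
                return true;
            default:
                return false;
        }
    }

    // Hypothetical helper: runs `check` on digit characters `iterations` times
    // and returns the elapsed milliseconds.
    public static long TimeMs(Func<char, bool> check, int iterations)
    {
        int hits = 0;
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (check((char)('0' + i % 10)))
            {
                hits++;
            }
        }
        stopwatch.Stop();

        // Every input is a digit, so every call should return true.
        if (hits != iterations)
        {
            throw new InvalidOperationException("Expected every input to be a digit.");
        }
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000; // same count matsolof reports using
        Console.WriteLine("per-call HashSet: {0} ms", TimeMs(IsNumberPerCallHashSet, iterations));
        Console.WriteLine("cached HashSet:   {0} ms", TimeMs(IsNumberCachedHashSet, iterations));
        Console.WriteLine("switch:           {0} ms", TimeMs(IsNumberSwitch, iterations));
    }
}
```

If Johann is right, most of the gap matsolof measured should show up between the first two lines of output, since the only difference between them is where the set is constructed.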