Guy Ellis' Tech Blog: September 2009

Wednesday, September 30, 2009

Windows 7

I'm ready to move on to Windows 7. Most of the reports and experiences that I've read have been very positive. In fact I don't recall anything negative that I've heard about Windows 7 that would impact me.

My main desktop machine is running Windows Server 2008 Enterprise. I chose it as a desktop operating system because (1) I wanted to develop on the same OS that my apps were targeting and (2) I wanted to use Hyper-V. This was a mistake. IIS7 now comes with all modern Microsoft operating systems, and it turns out that Hyper-V works particularly badly on a desktop machine because it virtualizes the video card, which really hurts the performance of everything you're working on.

On top of this, Windows Server 2008 makes it painfully difficult to install run-of-the-mill applications that you would frequently use on a desktop, such as the Zune desktop software and Windows Media Player. Granted, Server 2008 was not designed to run these apps; that's what Vista and Windows 7 are for. I now understand my folly.

I now have a dilemma. Do I keep my powerful Server 2008 machine and turn it into a "remote into" machine for running and testing applications, or do I convert it to Windows 7? Using it as a remote-into-only machine means that I can re-enable Hyper-V; however, it also means that I'll have an expensive and powerful machine sitting on my home network that's hardly being used.

My biggest problem with all of this is the amount of time it takes to set up a new machine, but I think that while typing this I've come up with a solution. I'll buy a cheaper piece of hardware and restore my Windows Home Server's image of the Server 2008 machine to the new hardware. I'll then install Windows 7 on my main power machine. This will also give me a chance to test the recoverability of a system from WHS without risk.

OCR with Google Docs

At last, Optical Character Recognition (OCR) online with Google Docs.

Go to http://googlecodesamples.com/docs/php/ocr.php
Use the link to sign in to your Google account.
Click browse to find your .jpg, .gif, or .png file to be converted.
Click 'Start Import'.
Your image will now open as text in a Google Doc.

My testing showed a fairly accurate conversion with very few mistakes. Because I was doing this in Firefox a squiggly red line appeared under the mistakes and a right click quickly fixed those. (Not sure if it's Firefox or Google Docs that puts the squiggly red line under those misspelled words? Might be both...)

One feature that's not available yet, and which I'd love to see, is the ability to import PDFs that hold scanned images of documents. For some reason I seem to have a ton of those. At the moment, the only way that I can find to import them is to click on each page in the PDF, copy it to the clipboard, paste it into Paint.NET and save it as an image. I then import that image into the doc. This is only practical if your PDF has a few pages; with many pages it's just not workable.

Testing a site for nofollow links with jQuery

I'm working on a site at the moment, and a number of the links on its pages do not need to be followed or indexed by search engine (or other) spiders. I mark these links by setting the rel attribute on the anchor tag to nofollow.

Some of the pages have plenty of links and I'm never sure if I've marked all of them correctly, but I would know if I could color the ones that are already marked. This is easily achieved with jQuery. I open up the page in Firefox, open the Firebug Console panel and type:

$('[rel=nofollow]').attr('style', 'color:Fuchsia')

This will cause all of the links on the page with the rel attribute set to nofollow to be colored pink. You can then easily perform a visual check to see if you've caught them all.

(Strictly, it will color the text inside any tag with a rel attribute set to nofollow, not just links. The rel attribute is also valid on link and area tags, but nofollow is normally only set on anchor tags.)
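The check can also be turned around to list the links that are not yet marked, which is what I'm really hunting for. The sketch below is only an illustration: hasNoFollow and unmarkedLinks are names I made up, and it treats rel as a space-separated list so that a value such as "external nofollow" still counts as marked.

```javascript
// Returns true when a rel attribute value contains the "nofollow" token.
// rel is a space-separated list, so "external nofollow" should also count.
function hasNoFollow(rel) {
  return String(rel || '')
    .toLowerCase()
    .split(/\s+/)
    .includes('nofollow');
}

// Given link-like objects (anything with a `rel` property), return the ones
// that are NOT marked nofollow, i.e. the ones that still need a manual look.
function unmarkedLinks(links) {
  return Array.from(links).filter(function (link) {
    return !hasNoFollow(link.rel);
  });
}
```

In a browser console, unmarkedLinks(document.links) should return the anchors (and area tags) that still lack nofollow, and these could then be outlined or colored for the same kind of visual check.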

Saturday, September 26, 2009

Spam bots use strong passwords

I've been logging the details of the failed attempts to register on a forum. I described this in my accidental discovery of defeating spam bots with email names. I keep an eye on these logs to understand the nature of spam bots and to pick out markers that make an attempted registration more likely to be coming from a spam bot than from a human. Using the word spam in front of bot in this context may be redundant because I don't know of any other type of bot that would be used for registering on a forum.

I've noticed that all bots that failed to register used a strong password. Not super strong, but strong: the passwords were all between 8 and 12 characters long, had at least one lowercase letter, one uppercase letter and one number, and did not contain any known words. I would classify as super strong a password that meets those criteria and also has a punctuation mark in it.
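Those criteria could be expressed roughly as follows. This is a minimal sketch and not the scoring I actually run: classifyPassword and the 'other' label are hypothetical, and the "no known words" check is left out because it would need a word list.

```javascript
// Classifies a password using the criteria described above.
// Assumption: the dictionary ("known words") check is omitted here.
function classifyPassword(password) {
  var lengthOk = password.length >= 8 && password.length <= 12;
  var hasLower = /[a-z]/.test(password);
  var hasUpper = /[A-Z]/.test(password);
  var hasDigit = /[0-9]/.test(password);
  // Anything that is not a letter, digit or whitespace counts as punctuation.
  var hasPunct = /[^A-Za-z0-9\s]/.test(password);

  if (!(lengthOk && hasLower && hasUpper && hasDigit)) {
    return 'other'; // does not match the pattern seen in the bot passwords
  }
  return hasPunct ? 'super strong' : 'strong';
}
```

For example, classifyPassword('aB3dEf7hJ') returns 'strong' and classifyPassword('aB3dEf7h!') returns 'super strong'. The 12 character ceiling only mirrors what I saw in the logs, so a longer password lands in 'other' even though it is not weaker.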

One thing that I have not done yet is to rate the successful passwords that have been entered.

First of all, I don't like logging the passwords when there is a registration failure. I don't like storing passwords in plain text even if it's only me that can see them and I only store them for a very short period of time. It's not good practice. It is my intention to set up a table and record attempted registrations and in that table include a bit field for success of registration along with a strong password score. I will then be able to draw stats about the type of passwords the average user is using versus bot/failed passwords and work out if a strong password is a good marker to help score a bot registration.

The only problem that I have at the moment is that the same bots are hitting the site several times an hour with the same username, email and password, so I'd have to use those 3 fields as a unique identifier, which would mean hashing the password and storing that as well.
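One way this could work is to derive a single key from the three fields, so that repeat attempts collapse into one row and the plain-text password is never stored. The sketch below assumes Node.js and its built-in crypto module, uses a Map as a stand-in for the database table, and attemptKey, recordAttempt and the secret parameter are all hypothetical names for illustration.

```javascript
const crypto = require('crypto');

// Builds a keyed hash (HMAC-SHA256) over username, email and password.
// The same three values always give the same key, so repeats can be spotted
// without keeping the password itself. `secret` is a server-side value.
function attemptKey(username, email, password, secret) {
  const fields = [username.toLowerCase(), email.toLowerCase(), password];
  return crypto
    .createHmac('sha256', secret)
    .update(JSON.stringify(fields))
    .digest('hex');
}

// Records one registration attempt in `table` (a Map standing in for a
// database table) and returns the row. Repeats only bump the counter.
function recordAttempt(table, attempt, secret) {
  const key = attemptKey(attempt.username, attempt.email, attempt.password, secret);
  const row = table.get(key) || {
    username: attempt.username,
    email: attempt.email,
    success: attempt.success, // the bit field for success of registration
    score: attempt.score,     // the strong password score
    count: 0
  };
  row.count += 1;
  table.set(key, row);
  return row;
}
```

With this shape, a bot that retries with identical details several times an hour adds to count on one row instead of skewing the stats with duplicates.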

Wednesday, September 16, 2009

IE input button tag on localhost

Just encountered a very weird bug in Internet Explorer (IE) with the input/button tag that manifests itself when deployed but does not exist when accessing a site on localhost. Makes no sense to me whatsoever.

The tag I had looked like this:

<button name="submit" value="submit" onclick="JavaScript:Search()"> Submit</button>

When developing and testing in IE against localhost, this button did a POST back to the server. However, when deployed, the button refused to POST even though the JavaScript ran. What is strange is that the behavior is browser specific and shouldn't be affected by the server that you're POSTing to. This I do not understand.

When I changed the button to:

<input type="submit" name="submit" value="submit" onclick="JavaScript:Search()">

It started working in both environments. This mystery has yet to be solved...

Friday, September 11, 2009

What is an EM?

Although I've been using EMs in CSS and HTML to define font sizes for years, I've only just discovered the definition of an EM. In traditional typography it's the width of a capital M in a particular font. (In CSS, 1em is simply the element's font size.)
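In CSS the practical consequence is that em values on font-size compound: each one is relative to the parent's font size. A small sketch of that arithmetic, where resolveEm is a made-up helper and the 16 pixel root size is only an assumed browser default:

```javascript
// Resolves a chain of nested font-size values given in ems to pixels.
// rootPx is the root font size; emChain lists the em value set at each
// nesting level, outermost first.
function resolveEm(rootPx, emChain) {
  return emChain.reduce(function (px, em) {
    return px * em;
  }, rootPx);
}
```

For example, resolveEm(16, [1.25, 2]) gives 40: the outer element computes to 20 pixels and the inner one doubles that.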

An EM-Dash is a dash that is the width (length) of an EM, which is usually longer than a standard dash.