Guy Ellis' Tech Blog: 2014

Friday, November 28, 2014

How does a covering index on a database work?

Today I was having a conversation with a co-worker who lives in Italy. He said that Christmas in Italy starts on December 8 which is a holiday in Italy but he didn't know the English name for the holiday.

I went to Google and typed in "december 8 holiday italy" and got back this result:

I didn't need to click on the result of the first (or any) item that turned up in the search result because all the information that I needed was right there: The Feast of the Immaculate Conception.

This is a covering index. When the results from the index alone can satisfy the information that you seek then they are said to be "covered in the index" and no further lookup is required.

In my search analogy I only made one request to the Google index and never opened the page that it pointed to.

In a database I would pull back all the information that I needed from the table's index and not have to retrieve the record from the table to complete the information search.

Leaving the Google analogy behind, suppose you have a table with an index on userId. Finding that ID in the index allows the database to quickly locate the record in the table, from which it can then pull the name, for example.

If you had an index on both userId and userName then only the index would need to be searched if all you are after is userName and the second lookup in the table would not be needed.
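
To make the difference concrete, here is a minimal sketch in JavaScript that models the two cases with in-memory Maps. It is only an analogy for what a database engine does; the names (table, indexOnUserId, coveringIndex, tableReads) and the sample rows are made up for illustration.

```javascript
// Toy model of a table: row id -> full record. Every call to readRow
// stands in for fetching a record from the table itself.
const table = new Map([
  [0, { userId: 42, userName: 'alice', city: 'Phoenix' }],
  [1, { userId: 7, userName: 'bob', city: 'Tucson' }],
]);
let tableReads = 0;

function readRow(rowId) {
  tableReads += 1;
  return table.get(rowId);
}

// Plain index on userId: it only knows where the row lives.
const indexOnUserId = new Map([[42, 0], [7, 1]]);

// Index on (userId, userName): the name is stored in the index entry.
const coveringIndex = new Map([[42, 'alice'], [7, 'bob']]);

function nameViaPlainIndex(userId) {
  const rowId = indexOnUserId.get(userId); // first lookup: the index
  return readRow(rowId).userName;          // second lookup: the table
}

function nameViaCoveringIndex(userId) {
  return coveringIndex.get(userId);        // the index alone answers
}

console.log(nameViaPlainIndex(42), 'table reads so far:', tableReads);
console.log(nameViaCoveringIndex(42), 'table reads so far:', tableReads);
```

The second call returns the same name without increasing the table read counter, which is the "covered in the index" case.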

Thursday, October 23, 2014

Retro Forking with Git and GitHub

Problem: You cloned a repository that you don't have write access to and you've done some work on it. Now you want to commit that work and push it to GitHub but you never did a fork.

1. Fork the repository on GitHub. For my example I'm using https://github.com/rjrodger/seneca-mvp

2. In the console in the local repository folder do "git remote -v" to find out what the remote address is:

$ git remote -v
origin    git@github.com:rjrodger/seneca-mvp.git (fetch)
origin    git@github.com:rjrodger/seneca-mvp.git (push)

3. Add this repository as the upstream repository with "git remote add":

git remote add upstream https://github.com/rjrodger/seneca-mvp

4. "git remote -v" should now show:

$ git remote -v
origin    git@github.com:rjrodger/seneca-mvp.git (fetch)
origin    git@github.com:rjrodger/seneca-mvp.git (push)
upstream    https://github.com/rjrodger/seneca-mvp (fetch)
upstream    https://github.com/rjrodger/seneca-mvp (push)

5. Change the origin to your fork:

git remote set-url origin git@github.com:guyellis/seneca-mvp.git

6. Confirm the change:

$ git remote -v
origin    git@github.com:guyellis/seneca-mvp.git (fetch)
origin    git@github.com:guyellis/seneca-mvp.git (push)
upstream    https://github.com/rjrodger/seneca-mvp (fetch)
upstream    https://github.com/rjrodger/seneca-mvp (push)

Monday, October 6, 2014

Real World WebSockets

Slides are at: http://slides.com/guyellis/realws/live

Co-presented by:

Guy Ellis: @wildfiction

and

Justin Dragos: @EllisandePoet

Saturday, September 20, 2014

Node.js Workshop with HTTP Status Check

I've been using HTTP Status Check for Node.js Workshops to help coach developers wanting to learn or get better at Node.js and JavaScript development.

HTTP Status Check is a Node.js utility that takes a list of URLs and checks that their statuses and other properties are what you would expect. As web developers we usually have a number of domains, URLs and sites that we need to keep an eye on, and this utility will quickly check the health of all our sites and report it back to us.

The benefits of using this project for a workshop are:

Most Node.js development is around web development. As a web developer you need to know and understand the HTTP protocol. This utility is all about that.
It's small and easy to understand.
It's something that you can (and should) use each day to keep an eye on your web properties. This means that you'll directly benefit from any changes you make to it.
Which brings us to changes. It's easy and quick to make changes in your fork (branch) of the project if you need it to do more than what it currently does.
In addition to learning Node.js/JavaScript you get to learn how to use Git, GitHub, Npm, how to contribute to Open Source and how to build your reputation and resume on GitHub.

To get the most out of a workshop you need your laptop to be in a state that allows you to dive in and start learning Node.js and JavaScript. This means that you need some basic applications installed and accounts set up and configured. Before attending, do the following. If any of the steps are unclear then post a comment below and I'll clarify.

("repo" is short for repository or code-repository. The code that you run is in a repo and you will pull it down to your machine. More on that to come...)

Set up a GitHub account and log into it: https://github.com/join
Fork the HTTP Status Check repo. (Forking is the process of making a copy of the code-base in your GitHub account that you have complete control over to do with as you please.)

You must be signed into GitHub. Go to the repo: https://github.com/guyellis/http-status-check
If you're signed in then you will see some buttons towards the top right of the page: Watch, Star, Fork. Click on the Fork Button.
The repo will now be forked (copied) to your account and you will be redirected to it, something like: https://github.com/<your-account-name>/http-status-check
Congratulations, you've successfully forked your first GitHub project and taken your first step to contributing to open source.

Install Git on your computer. You now need to get that repo (the code-base) from your fork on GitHub to a directory on your computer. This is done through a process called cloning which as the name suggests creates a clone of your fork on your local machine. Before we can do this though we need to install Git:

Use the official Git download page to find and install the right Git Client for your OS: http://git-scm.com/downloads
There is also a link on that page that will take you to popular Git GUI clients.
Once that's done come back here.
Open a command window and type git and hit enter. You should be presented with a list of git commands. This confirms that git has been successfully installed.

Clone the repo.

Create a directory where you want to keep your source code. You don't need to create a directory for the HTTP Status Check project, just one that will hold your projects. For example /source/ or /code/ or /myrepos/ or something like that.
Now open a command window in that directory or open a command window and change to that directory.
Clone the repo by executing this command after you have replaced your-github-account-name with your GitHub account name. (This is where you forked the repo to in the Fork step above.):
git clone https://github.com/<your-github-account-name>/http-status-check.git

Some notes about this:
You can find the correct link to use on your GitHub page by looking at the forked repo and on the right hand side you'll see something that says HTTPS clone URL. Right below that is a text box that you can copy the link from.
If you want to use SSH then you can click the SSH link below that box to switch the link to the correct SSH link.

If this is successful then you'll see something that looks like this. The numbers that you see will be different:

Cloning into 'http-status-check'...
remote: Counting objects: 395, done.
remote: Total 395 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (395/395), 56.90 KiB | 0 bytes/s, done.
Resolving deltas: 100% (224/224), done.
Checking connectivity... done.
A directory called http-status-check will also have been created for you.

Code Editor. You need a code editor to edit the code that you've just cloned. If you don't already have one installed then try one of these which are available on almost all Operating Systems:

WebStorm: 30 day trial and then $49 to buy and $29/year renewal after that (for non-commercial use).
Sublime Text: Free evaluation and then a $70 single payment for continued use.

Install Node.js and Npm. I deliberately put this step at the end. I want you to be able to immediately run something after you've installed Node.js and see it in action, which is why you got the code set up first.

Go to the Node.js site and install it for your Operating System. This will also install Npm which you'll need later.
Once installed open a command window and change directory to the HTTP Status Check directory and run node.
You should see a > command prompt. This confirms it was installed. Hit Ctrl+C twice to exit.

Run Npm.

In the same command window in the http-status-check directory run npm install. (You can also run npm i which is a shorter version of this.)
You should now see all the dependencies being downloaded and added to the project. Here is an example of some of what you'll see in the console window. Some of the numbers might be different:

debug@<version> node_modules/debug
└── <dependency>@<version>

chai@<version> node_modules/chai
├── <dependency>@<version>
└── <dependency>@<version> (<dependency>@<version>)

lodash@<version> node_modules/lodash

Run HTTP Status Check.

Now type node index.js and hit enter and the HTTP Status Check utility will run. You should see something like this:

_ Google (http://google.com) testing disabled.
_ HTTP Status Check on Guy's Blog (http://www.guyellisrocks.com/2014/06/http-status-check.html) working as expected.
_ Missing URL example (http://www.guyellisrocks.com/2014/06/will-this-get-written.html) working as expected.
_ LinkSilk WWW (http://www.linksilk.com) working as expected.
_ LinkSilk (http://linksilk.com) working as expected.
A total of 5 URIs were tested.
Failure count: 0
Success count: 4
Disable count: 1

Done! You've successfully forked, cloned, installed and run your first Node.js application. Now you're ready to start making changes to it to customize how it works. We'll cover code changes in the workshop. In the meantime you can change the list of web sites that it checks so that it checks your own sites and immediately provides you with some value:

Copy the samplesites.js file (in the root of the project) to a file called checksites.js. (checksites.js will take precedence over samplesites.js if it is present.)
Edit the checksites.js file and replace the sample web sites with your own websites and rules. The file is heavily commented (lines starting with // are comments) to guide you to what changes you can and should make.
Now run node index.js again and you'll see your sites being checked.

Workshop and Questions. If you weren't able to get to this point then post comments to this blog post. You should also bring your questions to the workshop. If you're not able to edit your checksites.js file and run it against your sites before the workshop then you'll miss out on making changes to the code and learning about Node.js and JavaScript.

Bonus Tasks

Run the tests.

In a console window in the http-status-check directory run npm test.
All the tests should pass.

Run code coverage.

In a console window in the http-status-check directory run npm run istanbul.
The tests will run (as above) and code coverage will be calculated. The output should end with something that looks a little bit like this:

Statements   : 100% ( 170/170 )
Branches     : 88.73% ( 63/71 )
Functions    : 100% ( 23/23 )
Lines        : 100% ( 169/169 )
Now open the coverage/lcov-report/index.html file that was created as a result of this and look at how the tests cover the solution.

Learn more about how Git and GitHub work. Install Ungit and use its graphical interface in the browser to visually understand the repo's structure and work with the repo.
New features in HTTP Status Check.

Is there a bug that needs to be fixed or feature that you think should be added to HTTP Status Check? Then add it as an issue: HTTP Status Check Issues. (Even if you intend to work on this you should add it to the issues first and then assign it to yourself.)
Want to work on existing issues in HTTP Status Check? Then find them in the same place: HTTP Status Check Issues.

Wednesday, June 18, 2014

Pull Requests instead of Emailing Code

If you modify code from an open source repository hosted on a service such as GitHub or Bitbucket, here are the reasons why you should submit a pull request to get your code back into the main repo.

To avoid paying the stupid tax.

In short, your feature or fix will be available in future versions that you might want to use. If your code doesn't get into the master branch then it makes it difficult or impossible to keep up with future versions.

Make open source better.

If you've fixed a bug or added a feature then it's highly likely that others will need that.

Improve your resume.

More and more recruiting managers are looking at what you've done when they're hiring you. Having contributed to an open source project is a great way to show that you've done real work that people are using.

Get Credit

You've done the work. Now get your name on that repo and take credit for some of the work that you've done.

Use the Tools

Do you use Git? Have you ever submitted a pull request? These are questions you might be asked at an interview. Try it and practice it so you can demonstrate this skill. Even if I were hiring a junior right out of school I'd expect them to have at least done this.

Understand someone else's code-base

Making a contribution to someone else's code-base forces you to read their code and understand their style and way of architecting a project. This is an invaluable skill as 80% of the work you do is reading over code. Even if you wrote it, you'll have forgotten it to the extent that it becomes foreign in a few months' time.

This list came from a conversation that I was having with the owner of the ABot project on GitHub. ABot is a web crawler (spider) built for speed and flexibility.

He was telling me how other developers will add features to ABot and then email the code to him. If he decides to accept the code and integrate it into the project then it all gets committed by him under his name. He was telling me how he doesn't want the credit for this work. Not submitting your code through a pull request on GitHub makes it very difficult to give the credit to the person who did the work.

Tuesday, June 17, 2014

HTTP Status Check

I recently created a Node.js utility that will cycle through a list of URLs and check their HTTP statuses:

Npm: http-status-check

GitHub: http-status-check

This utility came out of a need to keep daily tabs on a number of URLs and the statuses that they were returning.

I plan on expanding it with other input and output adapters. Obvious input adapters would be databases. What else can you think of? Obvious output adapters would be email and any other type of messaging system.

Pull requests are welcome.

Update 4/July/2014

Added an excludedHeaders option. This is a list of headers that you want the check to fail on if they are present in the response from the server.

At first blush this may seem like a strange check to make. The common use case is the X-Powered-By header. This header allows the server to advertise the technology that is powering it. As a security concern, when possible, I'll remove this from sites that I publish. I feel that telling malicious attack bots what you're running on will help them exploit any vulnerabilities your stack might have.

Update 5/July/2014

Added an expectedText option. This is text that we expect to be present in the body of the response from the server. By default it is case insensitive but you can change that by supplying an object instead of a string.
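
As a rough illustration of how these two checks behave, here is a small sketch. It is not the utility's actual code: the function name checkResponse, the shape of the response object, and the { text, caseSensitive } object form are assumptions made for this example. Only the option names excludedHeaders and expectedText come from the updates above.

```javascript
// Returns a list of failure messages; an empty list means the checks passed.
// `response` is assumed to look like { headers: {...}, body: '...' },
// with header names in lower case as Node's http module reports them.
function checkResponse(response, options) {
  const failures = [];

  // excludedHeaders: fail if any listed header is present.
  for (const header of options.excludedHeaders || []) {
    if (header.toLowerCase() in response.headers) {
      failures.push('Unexpected header: ' + header);
    }
  }

  // expectedText: a string (case insensitive) or, hypothetically,
  // an object such as { text: 'Hello', caseSensitive: true }.
  if (options.expectedText !== undefined) {
    const expected = typeof options.expectedText === 'string'
      ? { text: options.expectedText, caseSensitive: false }
      : options.expectedText;
    const haystack = expected.caseSensitive ? response.body : response.body.toLowerCase();
    const needle = expected.caseSensitive ? expected.text : expected.text.toLowerCase();
    if (!haystack.includes(needle)) {
      failures.push('Expected text not found: ' + expected.text);
    }
  }

  return failures;
}

// A made-up response that advertises its stack in X-Powered-By.
const response = {
  headers: { 'content-type': 'text/html', 'x-powered-by': 'Express' },
  body: '<h1>HTTP Status Check</h1>',
};
console.log(checkResponse(response, {
  excludedHeaders: ['X-Powered-By'],
  expectedText: 'http status check',
}));
```

Here the text check passes because the comparison ignores case, while the header check reports a failure because X-Powered-By is present.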

Saturday, June 7, 2014

Using WebSockets when your Reverse Proxy doesn't allow it

As developers we don't always get to choose where our software runs. We often face economic or other restrictions based on infrastructure that already exists.

I recently moved a Node.js application from a Linux server to Windows Server 2008 R2. Crazy, right? It's working surprisingly well in the Windows environment. IIS 7.5 already owned port 80 so I had to set up a site on IIS, bind the domain to that site and use it as a reverse proxy to the Node app, which was running on an arbitrary port.

In this case it happened to be IIS; in any other case it might be Nginx, Apache or any other server or reverse proxy that sits between your Node.js application and the web. The problem that I faced is that this version of IIS does not support WebSockets, so it looked like I would have to let socket.io fall back to long polling for this application.

There is, however, a solution, and it's rather simple.

Your site's facade, let's call it mdomain.com, is running on port 80 on IIS which is configured to run as a reverse proxy passing all traffic through to port (say) 4444 where your Node application is running.

When a client (a browser) connects to your site you provide it with the usual payload of HTML, CSS and JavaScript and in that you also provide it with the port number or sufficient information for the WebSocket part of the client to make a direct connection to your Node.js server and bypass the reverse proxy completely.

Using this little trick our site can remain on the default port going through the reverse proxy and all our WebSocket traffic can run over the application specific port.
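
A minimal sketch of the client-side piece might look like the following. The function name directSocketUrl and its parameters are hypothetical, and the host and port are the example values used above; how the port reaches the client (inline script, config endpoint, and so on) is left open.

```javascript
// Builds the URL the browser should use for its WebSocket connection.
// `pageHostname` would come from window.location.hostname in the browser;
// `appPort` is the port the Node.js app listens on, delivered with the page.
function directSocketUrl(pageHostname, appPort, secure) {
  const scheme = secure ? 'wss' : 'ws';
  return scheme + '://' + pageHostname + ':' + appPort;
}

// The page itself was served by the reverse proxy on port 80,
// but the socket goes straight to the Node.js process on port 4444.
const socketUrl = directSocketUrl('mdomain.com', 4444, false);
console.log(socketUrl);

// In the browser this would be followed by something like:
//   const socket = new WebSocket(socketUrl);
```

Note that this only works if the application port is reachable from the client, so the server's firewall has to allow inbound connections on it.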

Tuesday, June 3, 2014

JavaScript Dependency Injection.

Using JavaScript Dependency Injection for Spies, stubs and mocks in unit testing.

I was in the process of picking a Dependency Injector for use in a JavaScript project. My default way of narrowing the field to a top three is to start with a Google search, use GitHub's star count as the filter, and then compare the features that I need.

After looking at SinonJS, sandboxed-module, rewire, and proxyquire I decided to be slightly more analytical and objective about how I compared them. I decided to include the Watch and Fork count in addition to the Star count. I created a spreadsheet that compared them:

Module	watch	star	fork	Avg
rewire	20	425	16	0.48
sinon	52	1471	233	2.08
sandboxed	10	190	24	0.29
proxyquire	2	195	12	0.15
Total	84	2281	285	3

The Avg column is each module's share of the watch, star and fork totals added together, which gives each of the attributes a one third weighting and makes the column sum to 3. Given that starring is 27 times more common than watching you might question this weighting. I justify it because starring is a less costly action (no update emails), so people are more likely to star than to watch.
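
The Avg column can be reproduced from the counts in the table. The sketch below assumes the score is the sum of a module's share of each column total, which matches the published figures when rounded to two decimals; the variable and function names are my own.

```javascript
// Watch, star and fork counts as recorded in the spreadsheet.
const modules = {
  rewire:     { watch: 20, star: 425,  fork: 16 },
  sinon:      { watch: 52, star: 1471, fork: 233 },
  sandboxed:  { watch: 10, star: 190,  fork: 24 },
  proxyquire: { watch: 2,  star: 195,  fork: 12 },
};
const attributes = ['watch', 'star', 'fork'];

// Column totals across all four modules.
const totals = {};
for (const attr of attributes) {
  totals[attr] = Object.values(modules).reduce((sum, m) => sum + m[attr], 0);
}

// A module's score is the sum of its share of each column, so every
// attribute carries the same weight and all scores add up to 3.
function score(name) {
  return attributes.reduce((sum, attr) => sum + modules[name][attr] / totals[attr], 0);
}

for (const name of Object.keys(modules)) {
  const s = score(name);
  // Print the raw score and the same value as a percentage of the whole.
  console.log(name, s.toFixed(2), (100 * s / 3).toFixed(0) + '%');
}
```

Dividing each score by 3 gives the slice of the pie that the module would occupy.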

Seeing this as a pie chart makes this information easier to absorb. The clear winner amongst these four modules is SinonJS with roughly 70% of the combined score (2.08 out of 3).

Thursday, May 8, 2014

Thinking outside the Twitter Bootstrap (monitor) box

I'm doing a whole bunch of development using Twitter's CSS Bootstrap framework and while resizing the width of the browser I'm never sure when the grids should change based on the row and col-xs-N, col-sm-N, col-md-N, col-lg-N classes.

Then I had an idea. I'll pin the left side of the browser to the extreme left of the monitor and physically mark off on the top of the monitor where each width size starts.

To find the points on your browser where the switch takes place, use Bootstrap's Grid Template page. After pinning the browser to the left of the monitor, drag the right edge of the browser and you'll see the grids jump as they change, which will allow you to mark the changes. You'll need three stickers for these as you're dividing up four areas. I borrowed stickers from my kids' sticker collection which read "Smart Work", "A Big Effort", and "Super." If you don't need encouragement like this then any sticker will work.

I then took four white adhesive labels and cut them up and labeled them XS, SM, MD and LG and stuck those in the appropriate places between and next to the dividers.

If I want to see the effect in each of the four bootstrap sizes I slide the right side of the browser under the XS, SM, MD and LG labels. If I want to see the "jump" as the size changes I size around the dividers.
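
The same dividers can be expressed in code. This sketch assumes Bootstrap 3's default breakpoints (768px, 992px and 1200px); a customized build may use different values, and the names breakpoints and gridTier are my own.

```javascript
// Bootstrap 3 default minimum widths (in pixels) for each grid tier,
// listed from widest to narrowest so the first match wins.
const breakpoints = [
  { name: 'LG', minWidth: 1200 },
  { name: 'MD', minWidth: 992 },
  { name: 'SM', minWidth: 768 },
  { name: 'XS', minWidth: 0 },
];

// Returns the label of the tier that applies at the given viewport width.
function gridTier(viewportWidth) {
  return breakpoints.find((bp) => viewportWidth >= bp.minWidth).name;
}

// Widths on either side of each divider.
for (const width of [767, 768, 991, 992, 1199, 1200]) {
  console.log(width + 'px -> ' + gridTier(width));
}
```

Each pair of neighbouring widths in the loop straddles one of the three stickers, so the label changes between them.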


Sized at SM


Sized at MD

Friday, April 11, 2014

IIS Reverse Proxy to multiple sites running under a single Node.js instance

Subtitled: How to set up Node.js to handle multiple sites and to run on Windows while IIS still occupies port 80

I did this on a Windows 7 machine which was running IIS 7.5.

Setting up your local box for testing

In your HOSTS file add some sites to test:
127.0.0.1     node1.com
127.0.0.1     node2.com
127.0.0.1     node3.com

IIS

Create a new web site in IIS:
Open IIS.
Right click on Sites.
"Add Web Site..."
Site Name: MyReverseProxy
Physical Path: C:\inetpub\wwwroot (can be anything as we're not going to use it)
Host name: node1.com
Now click on Bindings under Actions on RHS
Click Add.
Host name: node2.com
Click Add.
Host name: node3.com
Close.
You now have 3 host names running on port 80 on IIS and your HOSTS file will direct all local requests to this web site.

Click on MyReverseProxy on left panel. If the URL Rewrite widget is not under IIS then you might need to install it and might also need to add Application Request Routing (ARR). I can't remember if it came installed by default on IIS 7.5 or if I added it.

Open the URL Rewrite widget and click on Add Rule(s)...
Select a Reverse Proxy rule and click OK.
Server name: localhost:3000/ (or where the Node server is running.)
If you're using relative URLs in the Node.js code then you don't have to rewrite the domain names of the links in the HTTP responses. If you are using absolute names then you're in trouble if you're handling multiple domains as you won't know at this stage which one it's for. The only solution I have for this is to use relative URLs and DO NOT check the option to "Rewrite the domain names..."

Now start your Node.js application on port 3000 and try and access the sites node1.com, node2.com, and node3.com and they should all return the contents of your site running on port 3000.

We're not done yet. If, in Node.js/Express, you take a look at the req.headers.host value you will see that it reads "localhost:3000" and has not passed through the value you were expecting which was one of node1.com/node2.com/node3.com

To get this to work you need to go to this folder with an Administrator command prompt:
C:\Windows\System32\inetsrv
and run this command:
appcmd.exe set config -section:system.webServer/proxy /preserveHostHeader:"True" /commit:apphost

What this command will do is open this file:
C:\Windows\System32\inetsrv\config\applicationHost.config
and find the section called <system.webServer> and change this:
<proxy enabled="true" />
to this:
<proxy enabled="true" preserveHostHeader="true" />
Recycle the app pool on your reverse proxy site in IIS and try to access it again. The Host header will now be correctly set.

Now the final "thing" you want to do is to be able to handle multiple sites from your Node.js/Express application.

Here is some basic middleware in app/server.js that will switch between sites:

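// Runs for every request before the routes; app is the Express application.
app.use(function(req, res, next) {
    // With preserveHostHeader enabled this is the domain the client asked for.
    console.log('Domain is: ' + req.headers.host);
    switch(req.headers.host) {
        case 'node1.com':
            // node1.com-specific handling goes here
            break;
        case 'node2.com':
            // node2.com-specific handling goes here
            break;
        case 'node3.com':
            // node3.com-specific handling goes here
            break;
        default:
            console.log('Unknown domain: ' + req.headers.host);
            break;
    }
    // Hand the request on to the next middleware or route.
    next();
});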

You would then set up routes that handle the requests for each host that you are expecting. Where routes are ambiguous you would put a switch statement like the one above in the route handler to arbitrate among domain functionality.
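
One way to avoid repeating the switch in every ambiguous route is to resolve the site once and attach it to the request. The sketch below is only one possible approach: the sites table, the resolveSite function and the req.site property are hypothetical names, and the stand-in request object at the end is there so that the example runs without Express.

```javascript
// Hypothetical per-site settings, keyed by host name.
const sites = {
  'node1.com': { title: 'Site one' },
  'node2.com': { title: 'Site two' },
  'node3.com': { title: 'Site three' },
};

// Returns the settings for the requested host, or null if unknown.
// The Host header may carry a port ("node1.com:3000"), so strip it first.
function resolveSite(hostHeader) {
  const hostname = String(hostHeader || '').split(':')[0].toLowerCase();
  return Object.prototype.hasOwnProperty.call(sites, hostname) ? sites[hostname] : null;
}

// Express-style middleware: attaches the site to the request so that
// route handlers can read req.site instead of repeating a switch.
function siteMiddleware(req, res, next) {
  req.site = resolveSite(req.headers.host);
  next();
}

// Exercise the middleware with a stand-in request object.
const fakeReq = { headers: { host: 'node2.com' } };
siteMiddleware(fakeReq, {}, function () {
  console.log(fakeReq.site.title);
});
```

In a real application the middleware would be registered with app.use(siteMiddleware) before the routes, and a null req.site would be the place to return an error for unknown domains.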

You could also put a node-proxy in front of these sites to switch between different Node apps, but that defeats the purpose of IIS, which can already do that and do it more efficiently, so I can't see the need for a node-proxy.

Some of the answers from this Stackoverflow question might help readers of this topic.

Friday, April 4, 2014

ExpressJS and MongoDB End to End

This blog post accompanies the presentation called ExpressJS and MongoDB End to End at Desert Code Camp 2014.1.

If you want to follow along with this presentation then there are three items you need:

Once you've cloned the sample locally you should be able to run it using:
node app.js
If MongoDB is running on the default port then it should work.

Give me feedback when the presentation is over: Guy on Speaker Rater

Resources from presentation:

Official ExpressJS site: ExpressJS
Node: nodejs.org
MongoDB: mongodb.org
Mongoose: mongoosejs.com

Saturday, February 22, 2014

Fluent 2014 NodeJS Express Presentation

If you want to follow along with this presentation then there are three items you need:

Official ExpressJS site: ExpressJS
Node: nodejs.org
MongoDB: mongodb.org

O'Reilly Fluent 2014 presentation: Introduction to ExpressJS

Wednesday, February 5, 2014

Roof Bug Fixing

Great blog post by Anna Shipman on Roof Bug Fixing.

She doesn't have comments enabled so I'll comment here.

I bought a newly built house in 2007 and after the buyer's inspection and the small fixes they did to it I was happy until the monsoons arrived and we had horizontal driving rain. This exposed leaks in the sealing around the windows.

I called the builders back in, explained what had happened and they fixed the leak and let me know it was done. I walked into the front yard and took the hosepipe and with a nozzle on the front of it sprayed the windows ten times harder than any monsoon could ever deliver water against them. This was my Black Friday load testing.

Of course it leaked and the repairers were standing inside my sitting room watching the water come through the seals around the window. The next day after it had dried they got back to work fixing it again. I was working upstairs and I could hear them spraying the window before they called me back to subsequently show me that it was fixed.

When someone comes around to fix something in my house my standard question before they start work is "how will you know that it's fixed?" and "how will you show me that it's fixed?" Sometimes this is moot and doesn't need to be answered.