Saturday, August 10, 2013

Boto and CloudFormation

I've been working on a small side project using AWS. In this project, I create a slew of resources. In addition to the typical EC2 instances, I also use things like VPC, RDS and S3. In my project, I always keep the same 'infrastructure' of resources, but I may change out the EC2 instances.

So I came to the conclusion that I wanted a two-phase setup: one that sets up the infrastructure, and another that sets up my EC2 instances. I quickly discovered AWS's CloudFormation, and it looked like it fit the bill. It lets me bring up an infrastructure quickly, and even parameterize it. I was soon able to create the "Stack" for my infrastructure.

When I moved on to creating the EC2 instances, I tried to get that stack to query the infrastructure stack for its existing resources. This would allow me to automatically connect to the database and other resources. It turns out that I can't do this, so I needed to find a scripting solution.

I looked briefly at things like Chef and Puppet, but they seemed like overkill for what I needed, and they would be a whole new thing to learn that may or may not have value later on. I decided to try working with boto, a Python interface to various AWS services. It supports talking to CloudFormation, and I know Python fairly well, so it seemed an ideal fit.

Unfortunately, it seems that if you're doing anything beyond creating, deleting or updating stacks via boto, there's a lack of examples. For instance, I needed to query the stacks for things like their outputs and tags, and I could not find a single example. So after a couple of hours of hacking, here's a script I came up with to query the stacks.

#!/usr/bin/env python
# Query CloudFormation for stack details such as outputs, parameters and tags.

import argparse
import boto.cloudformation


class MyBaseException(Exception):
    msg = "MyBaseException"

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return "%s: %s" % (self.msg, self.value)


class MissingParamException(MyBaseException):
    msg = "Missing param"


class InvalidCommandException(MyBaseException):
    msg = "Invalid command"


class InvalidStackException(MyBaseException):
    msg = "Invalid stack"


def _create_cf_connection(args):
    # Connect to CloudFormation in the requested region.
    # Returns a CloudFormation connection.
    # Throws an exception if a required parameter is missing.
    if not args.access_key:
        raise MissingParamException("access_key")

    if not args.secret_key:
        raise MissingParamException("secret_key")

    if not args.region:
        raise MissingParamException("region")

    conn = boto.cloudformation.connect_to_region(args.region,
                                                 aws_access_key_id=args.access_key,
                                                 aws_secret_access_key=args.secret_key)
    return conn


def get_stacks(args):
    # Return summaries of all stacks in the region.
    conn = _create_cf_connection(args)
    return conn.list_stacks()


def get_stack(args, stack):
    # Return the full description of a single stack, by name or ID.
    conn = _create_cf_connection(args)

    stacks = conn.describe_stacks(stack)
    if not stacks:
        raise InvalidStackException(stack)

    return stacks[0]


def print_stack(stack):
    # Dump the interesting fields of a stack, including outputs and tags.
    print "---"
    print "Name:            %s" % stack.stack_name
    print "ID:              %s" % stack.stack_id
    print "Status:          %s" % stack.stack_status
    print "Creation Time:   %s" % stack.creation_time
    print "Outputs:         %s" % stack.outputs
    print "Parameters:      %s" % stack.parameters
    print "Tags:            %s" % stack.tags
    print "Capabilities:    %s" % stack.capabilities


def list_stacks(args):
    # Print the details of every stack in the region.
    stacks = get_stacks(args)
    for stackSumm in stacks:
        print_stack(get_stack(args, stackSumm.stack_id))


def list_regions(args):
    # Print the name of every region that supports CloudFormation.
    regions = boto.cloudformation.regions()
    for r in regions:
        print r.name


command_list = { 'list-regions' : list_regions,
                 'list-stacks'  : list_stacks,
               }


def parseArgs():
    # Parse the command line and dispatch to the requested command.
    parser = argparse.ArgumentParser()
    parser.add_argument("--region")
    parser.add_argument("--command")
    parser.add_argument("--access-key")
    parser.add_argument("--secret-key")
    args = parser.parse_args()

    if not args.command:
        raise MissingParamException("command")

    if args.command not in command_list:
        raise InvalidCommandException(args.command)

    command_list[args.command](args)


if __name__ == '__main__':
    try:
        parseArgs()
    except Exception as e:
        print e
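
If you save this as, say, cfquery.py (a name I'm making up here), listing your stacks looks like this, with placeholder credentials:

./cfquery.py --command list-stacks --region us-east-1 --access-key AKIA... --secret-key ...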

Friday, July 5, 2013

Is it time to break email?

The email transport protocol, SMTP (Simple Mail Transfer Protocol), has been around for over 30 years. It was standardized in 1982, in RFC (Request For Comments) 821. Everything is sent in the clear, with no encryption, which kept the protocol true to its name: simple.

Over the years, we've 'patched' SMTP with additional features, one of which is transport encryption: encryption when receiving or sending between SMTP clients and servers. The current RFC for this is 3207, which lays out how to implement TLS (Transport Layer Security) with SMTP. TLS is optional in all cases, never required for SMTP. The RFC does allow a server to require it, but not when it is a publicly-referenced server accepting mail for local delivery. Here's the section from the RFC:

A publicly-referenced SMTP server MUST NOT require use of the
STARTTLS extension in order to deliver mail locally.  This rule
prevents the STARTTLS extension from damaging the interoperability of
the Internet's SMTP infrastructure.  A publicly-referenced SMTP
server is an SMTP server which runs on port 25 of an Internet host
listed in the MX record (or A record if an MX record is not present)
for the domain name on the right hand side of an Internet mail
address.

What this is saying is that when an SMTP server is configured to receive email from the internet, it cannot require the use of TLS. This means that when you send an email, unless you encrypt it yourself (with PGP or the like), the email is not encrypted when it goes out over the internet. Anyone with a packet sniffer in the right place can read your email.

Here's a simplified example of what happens when you send an email:

  1. User at alice@gmail.com composes an email to bob@yahoo.com
  2. Email is sent from Alice's email client to Gmail's SMTP server
  3. Gmail's SMTP server sends the email to Yahoo's SMTP server
  4. Yahoo's SMTP server delivers the email into Bob's Inbox.
  5. Bob's email client downloads the email from his Inbox. This uses IMAP, POP or Exchange.

When you set up your email client, one of the options is TLS. This option turns on TLS between you and your SMTP server, encrypting the email being sent to your SMTP server, step 2 in the example.

When the email is sent from Gmail's SMTP server to Yahoo's, due to the requirement of the RFC, it might be sent in the clear. Yahoo's SMTP server is not allowed to force an encrypted session. In fact, some quick testing tells me that Yahoo's SMTP server does not support TLS at all. So the email in step 3 will not be encrypted.
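
For the curious, that quick test amounts to asking the receiving server whether it advertises the STARTTLS extension. Here's a rough sketch using Python's smtplib; the host name below is a placeholder, you'd look up the domain's real MX host yourself (e.g. with dig mx yahoo.com):

import smtplib

MX_HOST = "mx.example.com"   # placeholder: substitute the domain's actual MX host

server = smtplib.SMTP(MX_HOST, 25, timeout=10)
server.ehlo()                # EHLO makes the server list its ESMTP extensions
if server.has_extn('starttls'):
    print "%s supports STARTTLS" % MX_HOST
else:
    print "%s does NOT support STARTTLS" % MX_HOST
server.quit()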

I believe it is time to break RFC 3207 and REQUIRE TLS on publicly-referenced SMTP servers. This will help prevent entities from sniffing the traffic: not only governments, but also identity thieves, hackers and anyone else with a packet sniffer.

In this day and age, when we're starting to force HTTPS on all our web browsing, telnet has fallen out of vogue, and ftp has been replaced with sftp, isn't it time we start securing email?

Saturday, June 22, 2013

Python templating..

Recently, I was working on a home project where I needed to generate some configuration files, so I turned to a templating solution I've used before: Python and Cheetah.

The configuration files I'm generating happen to use $ quite a bit, which is also one of the keyword tokens for Cheetah, so I needed to override that token. Cheetah supports this: you can pass in a settings variable that lets you configure the token. It works, except for one minor thing. The configuration files also happen to have lines that contain only a #. For some reason, Cheetah would remove these lines from the output. Normally that character introduces Cheetah's compiler directives, and I overrode that too, but no matter what I did, it still removed those lines.

So I started investigating other Python templating engines. It seems most of them assume that you only need templating for HTML. *sigh*. Eventually, I found Jinja. Instead of using $ for everything, it uses {{ }}, which solves my dilemma.
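
A tiny example of why that works for my configuration files (the template content and variable name here are made up for illustration):

from jinja2 import Template

# To Jinja, $ and a bare # are ordinary text; only {{ }} is special.
template = Template(
    "# generated -- do not edit\n"
    "#\n"
    "PATH=$PATH:/opt/tools\n"
    "port={{ port }}\n")

print template.render(port=8080)

The bare # line and the $ both survive untouched in the output; only the {{ port }} placeholder is substituted.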

Friday, June 21, 2013

New bumper stickers!

I just created some new bumper stickers. 

[Images: two "I Break For Loops" bumper sticker designs]

You can order them here.

Thursday, June 20, 2013

In light of recent events..

In light of recent events, I've stopped my series of password articles. Mostly because ArsTechnica posted an article more or less going through the same things I wanted to cover. You can find the article here. It goes into not only how passwords are formed, but also how to crack them.

Also, we've had the NSA spying 'scandal', which many of us have known about, or highly suspected, for years. The general consensus is that what the NSA is doing isn't some magical top secret program. It's probably using off-the-shelf components, which anyone with enough money can buy, to collect data. I doubt they are storing everything that crosses the internet, though the media seems to imply that.

What they are doing, though, is storing who connects to whom. Not the data itself, but information about the data, called metadata. They don't really need the data itself. For example, let's say that Amy calls Bob over a cell phone. Right away, we know several things: Amy called Bob, establishing a relationship, and they used a cell phone. As we dig further, we know where Amy was when she made the phone call, and if she was moving. We know how long Amy and Bob conversed. We also know where Bob was, and if he was moving.

This is quite a bit of information that can be used to track movement, establish "social graphs", and perform other forms of data analysis. Even without knowing what the data is, they can infer what it may be.

There are ways you can protect yourself: things like VPNs, proxies, and Tor. But keep in mind, those types of things will slow your internet connection, turning your fast broadband connection into a very slow dialup modem. That may be the price of "freedom".

Saturday, May 18, 2013

Software assembly line Pt2.

In part 1 of this series, we gathered our dependencies, and placed them into a known directory. In this part, we'll discuss how to build the software, and set up for 'installing'.

Creating the project

The first thing you need to do when creating the project is to set up your build script, makefile, CMakeLists.txt, or whatever your build environment calls for. In your build script, you'll need to refer to the dependencies area. It's a good idea to retrieve your dependencies before you attempt to create the makefile, so you'll have the file structure to reference.

One good thing about putting all your dependencies in one spot is that you always know where they are, in every project, so configuring multiple projects is consistent. Too many times, I've seen one project think that the dependencies are in some random folder, and another project think they are somewhere else.

Building the project

Next up, build the project. Make sure that it links, and VALIDATE THAT IT LINKS AGAINST YOUR DEPENDENCIES, and not your system's libraries. This is very important if you want to have consistent builds. Once you've validated that you're using the correct dependencies, you need to install the project.
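
On Linux, one way to do that validation is to run ldd on the resulting binary and flag anything that resolved outside your dependencies area. A minimal sketch in Python (the binary path and directory name are placeholders for whatever your project uses):

import subprocess

BINARY = "./build/myapp"      # hypothetical build output
DEPS_DIR = "dependencies/"    # the known dependencies area

# ldd prints each shared library along with the path it resolved to.
output = subprocess.check_output(["ldd", BINARY])
for line in output.splitlines():
    if "=>" in line:
        lib, path = line.split("=>", 1)
        if DEPS_DIR not in path:
            print "WARNING: %s resolved outside dependencies: %s" % (lib.strip(), path.strip())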

Installing the project

When I say install the project, I do not mean install it for use. The install that I'm referring to is for packaging.  What I do is install into a directory in my project called "installed_files". Anything that goes in this directory is packaged up for use by the dependency manager. 

Submit to the dependency manager

Once you've installed, you'll want to submit those installed files to the dependency manager. This will give us the ability to use any of our projects as a dependency.

In part 3, I'll do a walk through of the entire process, tying everything together. I will also have a few other hints to help build a stable development process. 

Saturday, May 11, 2013

The now of passwords Pt 1.

Most everyone is familiar with passwords. You give a username or email address, and you enter your password. This forms the basis of virtually everything we do online. With today's technology, using passwords to protect websites and other services is nearly useless. Why is that? And why do some websites have certain restrictions, like only allowing certain characters, or why can't I use more than 8 characters at some sites?

How passwords are stored and checked

Passwords are one of the most basic security features. A user enters a password; it's received by the software processing the password (referred to as the "server" from now on), then analyzed to see if it's the expected password. Simple, huh? (Note: we're ignoring how the password is transferred from the user to the server.) I'll be going over how password storage and checking has evolved. This is a general overview, and will (hopefully) not be too technical. If you have questions or comments, please ask in the comment section below.

Simple password storage

In the early days of password protection, it was that simple. Passwords were stored, as-is, in a file or database. Anyone with access to those files or databases could read your password, and log in as you. Hackers had a field day, because all they needed was file or database access.

Encrypted passwords

Eventually, servers added encryption. When you set your password, it would be stored encrypted. When you entered your password, the server encrypted what you typed and compared the result against the already-encrypted password in the database.

Hackers started doing several things here. First, they would analyze the encryption method for weaknesses. Depending on the encryption used, the length of the resulting ciphertext would change with the size of the password, so just knowing the length of the password would narrow down the possibilities. With this knowledge, they would "brute-force" the passwords, trying every combination of characters. As time went on, they started creating "rainbow tables": precomputed tables that map passwords to their resulting encrypted values. This means that even with an encrypted password, it was relatively easy to find the password.
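
To make the idea concrete, here's a toy sketch of a precomputed lookup table. (Real rainbow tables use hash chains to save space, and this toy uses a hash for brevity, but the principle is the same for any deterministic, unsalted transformation: trade precomputation up front for instant reversal later. The candidate list is illustrative.)

import hashlib

# Precompute: transform every candidate password once, store the reverse mapping.
candidates = ["password", "123456", "letmein", "qwerty"]
table = dict((hashlib.md5(p).hexdigest(), p) for p in candidates)

# Crack: a stolen value is now a single dictionary lookup.
stolen = hashlib.md5("letmein").hexdigest()
print table.get(stolen, "not in table")    # prints: letmein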

Hashed passwords

With encrypted passwords, the length of the encrypted password would give clues that reduced the amount of work needed to crack it. Servers then started to implement "hashed" passwords. Hashing uses an algorithm to come up with a single large number to represent the password. The advantage of this over encryption is that every hashed value is exactly the same length as all the other hashed values. Assume that the hashed value is 32 digits long. Whether your password is a single character or 100 characters, the length of the hashed password is 32 digits.

This makes it so that hackers cannot learn the length of your password by looking at the hash. Now they have to generate every single combination before they can crack a password.
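
You can see the fixed-length property directly with Python's hashlib (SHA-256 here, whose hex digest is always 64 characters):

import hashlib

for password in ["a", "x" * 100]:
    digest = hashlib.sha256(password).hexdigest()
    # Both digests are 64 hex characters, regardless of input length.
    print len(digest), digest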

In part 2 of this article, I'll cover multi-hashed passwords and salting.

Monday, May 6, 2013

The software development assembly line Pt1

As a software developer of 15+ years, I'm surprised at the number of other experienced developers that do not understand how software is assembled. This series of posts will describe what I see as an assembly line. My main focus will be on C and C++ based development, but all of this could apply to any development environment.

This will be a multipart article. The first part will be about dealing with dependencies. The goal will be to build a reusable, predictable, sensible build system.

The most common issue I've run across is dependency management. Most places I've worked had none. You were expected to have all the libraries you needed installed, to use whatever compiler was installed on the box, and to hope for the best. This invariably leads to the release build machine having different compilers and libraries, which in turn leads to release bugs that no one can duplicate.

So the first step in the assembly line is to gather your dependencies, and place them into a known location. I personally prefer putting the dependencies into the same directory as my current project. This is as opposed to having a global area for dependencies. I do this for a few reasons:

  • I generally work on multiple projects with slightly different dependencies. This is much easier to manage.
  • Allow mobility on my local machine. I can move the project around as needed. I can copy it to another machine with a simple directory copy.
  • If you have multiple versions of the project you're working on, you know exactly which dependencies each is using, just by looking in its dependencies directory.
  • You can build your build scripts/projects/makefile to always point to the same area. No worries about trying to find where on the system the dependencies are.

I do realize that this will mean having duplicate copies of the same libraries on my machine, but I'm willing to pay that price.

The next decision that needs to be made is what your dependencies really are. Depending on your requirements, your dependencies could include an entire OS. You need to draw a line somewhere. For most desktop applications, you can safely assume a base set of libraries is available. I personally do not like having to install extra dependencies when I want to run some application. That may be OK for you; you'll have to make the decision based on your target audience. Also remember that your compiler is a dependency too.

You've now determined where and what your dependencies are. You now need the 'how'. I've used shell scripts, Python scripts, Ivy, and a few other things. I even wrote my own at one point, depvault. No matter what you choose, you need a way to describe what your dependencies are, and how to retrieve them. A few things to keep in mind when selecting a dependency manager:

  • I prefer dependency managers that use a separate, human-readable file to specify the dependencies. This makes it easier to track changes in your source control, as you'll be able to see the changes in what version of a dependency you are using. Using things like externals (subversion) isn't a good solution; not enough visibility. (A sketch of such a file, and a script to consume it, follows this list.)
  • You should also be able to specify a version of that dependency, and possibly additional information. The additional information could be anything from what platform you need, to debug or release mode, to developer versions (header files and .lib/.a) or distributables (executables or shared libraries).
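
Here's a sketch of what that human-readable file and a retrieval script might look like. The file format, repository URL, and package layout are all made up for illustration; real tools like Ivy have their own formats.

import json
import tarfile
import urllib

# dependencies.json -- checked into source control, e.g.:
# [{"name": "zlib", "version": "1.2.8", "platform": "linux64", "mode": "release"}]
REPO_URL = "http://deps.example.com"   # hypothetical package repository

with open("dependencies.json") as f:
    deps = json.load(f)

for dep in deps:
    # e.g. http://deps.example.com/zlib/1.2.8/zlib-linux64-release.tar.gz
    url = "%s/%s/%s/%s-%s-%s.tar.gz" % (REPO_URL, dep["name"], dep["version"],
                                        dep["name"], dep["platform"], dep["mode"])
    archive = "%s.tar.gz" % dep["name"]
    urllib.urlretrieve(url, archive)
    # Unpack into the known dependencies area next to the project.
    tarfile.open(archive).extractall("dependencies/%s" % dep["name"])
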
Once you've determined what you need to depend on and how you're managing them, you now need to use these dependencies to build your software. I'll cover that in part 2 of this series. 

 

Saturday, May 4, 2013

Good smoke! Review of the Traeger Junior

A few months ago, I bought a new smoker. It's a Traeger Junior, also known as the Traeger BBQ055. I happened to find it on Amazon for $100 less than anywhere else! The shipping wasn't free, but it was only $50, so it still saved me $50. Here's a link to the cheap one.

This is what's known as a pellet smoker. The pellets are compressed wood dust. There are *no* fillers or binders. Originally, when I was looking at these, I figured they had to have some sort of glue or something in there to hold them together. That turns out not to be the case. The pellets are fed into a 'fire box' where, during the startup phase, an ignitor lights them. After a period of time, the ignitor shuts off.

It arrived in a fairly large box. It comes with all the tools to assemble it, which takes about an hour or so. The smoker comes with a simple controller for controlling how fast the pellets are fed. Since I had saved some money, I decided to upgrade to a more advanced controller, the Traeger Digital Thermostat Kit BAC236. The simpler one has three settings: smoke, medium, and high. The more advanced one has several temperatures you can select from. It also has a digital readout of the current temperature of the grill, fed by a temperature sensor in the grill that it uses as feedback to determine how quickly to feed the pellets.

The advanced controller makes smoking food a no-brainer, especially if you use a remote temp sensor like this one: Maverick ET732 Long Range Wireless Meat Thermometer. You can set a temperature, and only need to check on it to make sure you don't run out of pellets. I've done several 14+ hour smokes without issue. Just monitor your sensor to know when you are done.

I love this smoker. It does have its flaws though. You need to clean it thoroughly after every smoke: not just the grate and the drip pan, you also need to vacuum the area below the drip pan. Not a big deal, but it does take some time. If you don't, there is a reasonable chance that the smoker won't ignite properly the next time you use it. Also, the pellets in the hopper don't always fall into the auger that feeds the fire. This means that even if you have pellets, they may not make it to the fire, and the flame/smoke goes out. The easy solution is to keep the hopper full.

I have a few modifications I would like to do in the future, such as adding firebricks to help it maintain a more consistent temperature. Right now, it has pretty wide swings of nearly 50 degrees; the firebricks should improve that. I'm also thinking about increasing the size of the pellet hopper, which will help with the auger going dry, as mentioned above.

All in all, I'm very happy with my purchase, and will be smoking everything and anything I can think of.

Sunday, April 28, 2013

Haskell TShirts

A coworker is teaching himself Haskell. He also likes wearing 'geek' TShirts, and was trying to come up with a Haskell one. After some pondering, I came up with a couple. Click the image to purchase!

[Image: Haskell T-shirt design]

The Future of Television

Over the past couple years, we've seen a shift in television. With the advent of devices that can stream directly from the internet and services like Netflix, Amazon Prime and Hulu, the landscape is changing. No longer are we tied to a television schedule, as our favorite shows will be available when we want to watch them.

The Netflix show House of Cards is yet another shift. The producers of the show weren't held to the restrictions that shows on 'normal' television are. They didn't have to worry about the length of the show; it could be 5 minutes longer than normal, or 2 minutes shorter. They didn't have to worry about breaking scenes up for commercial breaks. They also aren't restricted in content: they aren't forced to 'dumb down' or censor anything. And they can develop without fear of cancellation. HBO and other premium networks can do most of this, except for length; their shows have a maximum length they can be.

We're starting to see the age of a la carte television entertainment. More and more people are getting their entertainment on their own schedule, watching the shows they want. The exception to this is sports. The television sports networks have a stranglehold on the broadcasts. ESPN has rights to many sports. They do have a way you can stream events, but only if you are a cable subscriber on a participating cable provider. Not all providers have joined this streaming, which means that even though customers are paying for the same service, they are receiving less.

Until the stranglehold of the sports networks is broken, cord cutting will never fully happen.