Your Docker image might be broken without you knowing it

Posted 5 months back at Phusion Corporate Blog

Docker is an awesome new technology for the creation of lightweight containers. It has many purposes and serves as a good building block for PaaS, application deployments, continuous integration systems and more. No wonder it’s becoming more and more popular every day.

However, what a lot of people may not realize is that the operating system inside the container must be configured correctly, and that this is not easy to do. Unix has many strange corner cases that are hard to get right if you are not intimately familiar with the Unix system model, and getting them wrong can cause a lot of strange problems. In other words, your Docker image might be broken without you knowing it.

To raise awareness about this problem, we’ve published a website which explains it in detail: what exactly could be broken, why it’s broken that way, and what can be done to fix it.

And to make it easier for other Docker users to get things right, we’ve published a preconfigured image — baseimage-docker — which does get everything right. This potentially saves you a lot of time.

Learn the right way to build your Dockerfile.

One of the gripes we have with random Docker images on the Docker registry is that they’re often very poorly documented and do not provide the original Dockerfile from which they were created. With baseimage-docker, we’re breaking that tradition by providing excellent documentation and by making the entire build reproducible. The Dockerfile and all other sources are available on GitHub, for everyone to see and to modify.

Happy Docking!

Using Arel to Compose SQL Queries

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Rails gives us a great DSL for constructing most queries. With its knowledge of the relationships between our tables, it's able to construct join clauses with nothing more than the name of a table. It even aliases your tables automatically!

The method we most often reach for when querying the database is the where method. 80% of the time, your query will only be checking equality, which is what where handles. where is smart. It handles the obvious case:

where(foo: 'bar') # => WHERE foo = 'bar'

It also handles nils:

where(foo: nil) # => WHERE foo IS NULL

It handles arrays:

where(foo: ['bar', 'baz']) # => WHERE foo IN ('bar', 'baz')

It even handles arrays containing nil!

where(foo: ['bar', 'baz', nil]) # => WHERE (foo IN ('bar', 'baz') OR foo IS NULL)

With Rails 4, we can also query for inequality by using where.not. However, where has its limitations. It can only combine statements using AND. It doesn't provide a DSL for comparison operators other than = and <>.
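For example (the exact SQL varies slightly between Rails versions):

where.not(foo: 'bar') # => WHERE foo != 'bar'
where.not(foo: nil)   # => WHERE foo IS NOT NULL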

When faced with a query that requires an OR statement, or one that needs a numeric comparison such as <=, many Rails developers will reach for writing out a SQL string literal. However, there's a better way.

Arel

Arel is a library that was introduced in Rails 3 for use in constructing SQL queries. Every time you pass a hash to where, it goes through Arel eventually. Rails exposes this with a public API that we can hook into when we need to build a more complex query.

Let's look at an example:

class ProfileGroupMemberships < Struct.new(:user, :visitor)
  def groups
    @groups ||= user.groups.where("private = false OR id IN ?", visitor.group_ids)
  end
end

When we decide which groups to display on a user's profile, we have the following restriction: the visitor may only see a group listed if the group is public, or if both users are members of the group. Even for a minor query like this, there are several reasons we would want to avoid using SQL string literals here.

  • Abstraction/Reuse
    • If we wanted to reuse any piece of this query, we would end up, at best, with a leaky abstraction involving string interpolation.
  • Readability
    • As complex SQL queries grow, they can quickly become difficult to reason about. Since they're so difficult to break apart, the reader often has to understand the entire query to understand any individual part.
  • Reliability
    • If we join to another table, our query will immediately break due to the ambiguity of the id column. Even if we qualify the columns with the table name, this will break as well if Rails decides to alias the table name.
  • Repetition
    • Often, we end up rewriting code that we already have as a scope on the class, just to be able to use it with an OR statement.

Refactoring to use Arel

The method Rails provides to access the underlying Arel interface is called arel_table. If you're working with another class's table, the code may become more readable if you assign a local variable or create a method to access the table.

def table
  Group.arel_table
end

The Arel::Table object acts like a hash which contains each column on the table. The columns given by Arel are a type of Node, which means they have several methods available for constructing queries. You can find a list of most of the methods available on Nodes in the file predications.rb.
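A few of those predications in action, covering the operators where has no hash syntax for (the comments show roughly the SQL each node renders to; matches becomes ILIKE on PostgreSQL):

table[:cost].lteq(100)            # cost <= 100
table[:created_at].gt(1.week.ago) # created_at > '...'
table[:name].matches('%bot%')     # name LIKE '%bot%'
table[:id].not_eq(5)              # id != 5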

When breaking apart a query to use Arel, I find a good rule of thumb is to break out a method anywhere the word AND or OR is used, or when something is wrapped in parentheses. Keeping this rule in mind, we end up with the following:

class ProfileGroupMemberships < Struct.new(:user, :visitor)
  def groups
    @groups ||= user.groups.where(public.or(shared_membership))
  end

  private

  def public
    table[:private].eq(false)
  end

  def shared_membership
    table[:id].in(visitor.group_ids)
  end

  def table
    Group.arel_table
  end
end

The resulting code is slightly more verbose due to Arel's interface, but we've given intention-revealing names to the underlying pieces, and we're able to compose them in a satisfying fashion. The body of our groups method now describes the business logic we want, as opposed to how it is implemented.

With more complex queries, this can go a long way towards being able to easily reason about what a query is accomplishing, as well as debugging individual pieces. It also becomes possible to reuse pieces of scopes with OR clauses, or in the body of JOIN ON statements.
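As a sanity check, we can call to_sql on the composed relation (from inside the class, since the pieces are private methods). With PostgreSQL, the WHERE clause comes out along these lines (exact quoting varies by adapter):

user.groups.where(public.or(shared_membership)).to_sql
# => SELECT ... WHERE ("groups"."private" = 'f' OR "groups"."id" IN (1, 2, 3))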

What's next?

  • Learn more about composition over inheritance in Ruby with Ruby Science

Senior Ruby on Rails developer. £50k. Award-winning company, prestigious country location, Oxfordshire near Reading/High Wycombe

Posted 6 months back at Ruby on Rails, London - The Blog by Dynamic50

Great opportunity for an experienced developer specialising in Ruby on Rails.

Prestigious countryside town location, Oxfordshire near Reading/High Wycombe

Working in a close-knit team, directly with the CTO, on an award-winning web application.

Opportunity to really contribute to an active product roadmap in an award-winning small company. Really make a difference.

- 3+ years' experience developing commercial web applications using Ruby on Rails, Python, Java, .NET, PHP or similar

- Passion for programming a must, daily stand-ups, etc.

- Knowledge of relational database design, SQL.

- Extensive experience with JavaScript, AJAX, CSS and HTML

- Experience with Agile methodologies

- Working knowledge of version control systems, SVN, Git.

- Experience with test driven development

- Knowledge of Eclipse would be an advantage

- Knowledge of Linux would be an advantage

Salary £50k

Send your resume to contactus@dynamic50.com or give us a call on 02032862879

Entry Level Ruby on Rails developer. £25k+. Award-winning company, prestigious country location, Oxfordshire near Reading/High Wycombe

Posted 6 months back at Ruby on Rails, London - The Blog by Dynamic50

Great opportunity to develop your skills as a junior Ruby on Rails developer.

Prestigious countryside town location, Oxfordshire near Reading/High Wycombe

Working in a close-knit team, directly with the CTO, on an award-winning web application.

Opportunity to really contribute to an active product roadmap in an award-winning small company. Really supportive development environment.
  • 2 years' experience developing commercial web applications using Ruby on Rails, Python, Java, .NET, PHP or similar
  • Passion for programming a must, great opportunity to learn in a supportive environment.
  • Knowledge of relational databases, i.e. SQL, preferred
  • Extensive experience with JavaScript, AJAX, CSS and HTML
  • Experience with Agile methodologies
Salary £25k+ depending on experience

Send your resume to contactus@dynamic50.com or give us a call on 02032862879

Episode #440 – February 14th, 2014

Posted 6 months back at Ruby5

PostgreSQL! Such wow! Much gitsh! Ask Ruby, maybe not? R u an activity feed? Hakiri amaze on the Doge 5!

Listen to this episode on Ruby5

Sponsored by New Relic

New Relic is _the_ all-in-one web performance analytics product. It lets you manage and monitor web application performance, from the browser down to the line of code. With Real User Monitoring, New Relic users can see browser response times by geographical location of the user, or by browser type.
This episode is sponsored by New Relic

PostgreSQL Awesomeness

Dig a little deeper into that database you’re most likely running. Hubert Lepicki has a nice overview of what makes PostgreSQL so nice for Rails development.
PostgreSQL Awesomeness

gitsh

Thoughtbot brings you an interactive shell for git. Why did they do this? Why not! It’s a simple tool, but effective. Save some typing and get some nice features for interacting with git.
gitsh

Ask Ruby or maybe not?

Pat Shaughnessy has a great post up on being more functional with your Ruby code. Be sure to follow the link to Dave Thomas’ clarifying post as well.
Ask Ruby or maybe not?

Enumerate your activity feed

The GiveGab team gives a high-level description of how they implemented their activity feed. Follow the pointers for more details on this sticky problem.
Enumerate your activity feed

Hakiri Facets

Hakiri launched a free service this week that scans your Gemfile.lock and reports known CVE vulnerabilities.
Hakiri Facets

A Guide to Core Data Concurrency

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

The iOS ethos of instant, responsive UI elements means putting as much work as possible on background threads and as little as possible on the main thread. In most cases we are fine using an NSOperationQueue or GCD, but getting concurrency to work in Core Data sometimes feels more like black magic than science. This post intends to demystify concurrency and offer two ways to go about it.

Setup 1: Private queue context and main queue context off of a single persistent store coordinator

In this setup we will make two NSManagedObjectContext instances: one with concurrency type NSMainQueueConcurrencyType and the other with NSPrivateQueueConcurrencyType. We will observe NSManagedObjectContextDidSaveNotification on each context to propagate saves to the other.

We add two methods to our TBCoreDataStore.h file:

+ (NSManagedObjectContext *)mainQueueContext;
+ (NSManagedObjectContext *)privateQueueContext;

In our implementation file we add two private properties and lazy load them:

@interface TBCoreDataStore ()

@property (strong, nonatomic) NSPersistentStoreCoordinator *persistentStoreCoordinator;
@property (strong, nonatomic) NSManagedObjectModel *managedObjectModel;

@property (strong, nonatomic) NSManagedObjectContext *mainQueueContext;
@property (strong, nonatomic) NSManagedObjectContext *privateQueueContext;

@end

#pragma mark - Singleton Access

+ (NSManagedObjectContext *)mainQueueContext
{
    return [[self defaultStore] mainQueueContext];
}

+ (NSManagedObjectContext *)privateQueueContext
{
    return [[self defaultStore] privateQueueContext];
}

#pragma mark - Getters

- (NSManagedObjectContext *)mainQueueContext
{
    if (!_mainQueueContext) {
        _mainQueueContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
        _mainQueueContext.persistentStoreCoordinator = self.persistentStoreCoordinator;
    }

    return _mainQueueContext;
}

- (NSManagedObjectContext *)privateQueueContext
{
    if (!_privateQueueContext) {
        _privateQueueContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _privateQueueContext.persistentStoreCoordinator = self.persistentStoreCoordinator;
    }

    return _privateQueueContext;
}

Next we override the initializer to add our observing:

- (id)init
{
    self = [super init];
    if (self) {
        [[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(contextDidSavePrivateQueueContext:) name:NSManagedObjectContextDidSaveNotification object:[self privateQueueContext]];
        [[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(contextDidSaveMainQueueContext:) name:NSManagedObjectContextDidSaveNotification object:[self mainQueueContext]];
    }
    return self;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

- (void)contextDidSavePrivateQueueContext:(NSNotification *)notification
{
    @synchronized(self) {
        [self.mainQueueContext performBlock:^{
            [self.mainQueueContext mergeChangesFromContextDidSaveNotification:notification];
        }];
    }
}

- (void)contextDidSaveMainQueueContext:(NSNotification *)notification
{
    @synchronized(self) {
        [self.privateQueueContext performBlock:^{
            [self.privateQueueContext mergeChangesFromContextDidSaveNotification:notification];
        }];
    }
}

Now we have working private queue and main queue contexts, each of which will be updated whenever the other is saved. Here is an example usage:

[[TBCoreDataStore privateQueueContext] performBlock:^{
    NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:@"MyEntity"];
    NSArray *results = [[TBCoreDataStore privateQueueContext] executeFetchRequest:fetchRequest error:nil];
}];

One of the great advantages of this type of Core Data stack is that it allows us to make great use of NSFetchedResultsController. An example of this is parsing JSON from a web service into a Core Data object as a background operation, then using the fetched results controller to detect when said object has changed and updating the UI as a result.

Setup 2: The throwaway main queue context backed by a private queue context

In this setup we have only one NSManagedObjectContext that stays with us for the lifetime of the app: a private queue context from which we create child main queue contexts. This allows us to spend as much time as possible in the background, creating a new main queue context only when we need to do UI work.

Starting from our base Core Data setup, we add the following to TBCoreDataStore.h:

+ (NSManagedObjectContext *)newMainQueueContext;
+ (NSManagedObjectContext *)defaultPrivateQueueContext;

In our implementation file we add a single property and lazy load it:

@interface TBCoreDataStore ()

@property (strong, nonatomic) NSPersistentStoreCoordinator *persistentStoreCoordinator;
@property (strong, nonatomic) NSManagedObjectModel *managedObjectModel;

@property (strong, nonatomic) NSManagedObjectContext *defaultPrivateQueueContext;

@end

#pragma mark - Singleton Access

+ (NSManagedObjectContext *)newMainQueueContext
{
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
    context.parentContext = [self defaultPrivateQueueContext];
    
    return context;
}

+ (NSManagedObjectContext *)defaultPrivateQueueContext
{
    return [[self defaultStore] defaultPrivateQueueContext];
}

#pragma mark - Getters

- (NSManagedObjectContext *)defaultPrivateQueueContext
{
    if (!_defaultPrivateQueueContext) {
        _defaultPrivateQueueContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _defaultPrivateQueueContext.persistentStoreCoordinator = self.persistentStoreCoordinator;
    }

    return _defaultPrivateQueueContext;
}

Here we have no need to observe save notifications, as any saves on a created main queue context bubble up to its parent, the defaultPrivateQueueContext. This approach is very robust and spends the least possible time on the main queue. The downside is that we cannot use NSFetchedResultsController out of the box, though we could cobble together our own version using the various notifications Core Data sends out.

Let's say we have a really big database (20k+ objects) and we want to do a complex fetch. The best way to go about this is to first use the background queue to fetch the NSManagedObjectIDs, then hop onto the main queue and call -objectWithID: on the results. Object IDs are how we should always pass managed objects between threads.

[[TBCoreDataStore defaultPrivateQueueContext] performBlock:^{

    NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:@"MyEntity"];
    fetchRequest.resultType = NSManagedObjectIDResultType;

    NSArray *managedObjectIDs = [[TBCoreDataStore defaultPrivateQueueContext] executeFetchRequest:fetchRequest error:nil];

    NSManagedObjectContext *mainQueueContext = [TBCoreDataStore newMainQueueContext];
    [mainQueueContext performBlock:^{

        for (NSManagedObjectID *managedObjectID in managedObjectIDs) {
            MyEntity *myEntity = [mainQueueContext objectWithID:managedObjectID];
            // Update UI with myEntity
        }
    }];
}];

In this scenario we need to update our UI with a bunch of MyEntity managed objects. For efficiency's sake, we perform the costly fetch in the background and set the result type to NSManagedObjectIDResultType, which will return NSManagedObjectIDs. We then create a new mainQueueContext and get each managed object from the cache by using [mainQueueContext objectWithID:managedObjectID]. These objects are then safe to use on the main thread.

If the fetch is not too intensive, we can just perform it on the new main queue context. If you want to use this stack, I recommend making this snippet:

NSManagedObjectContext *mainQueueContext = [TBCoreDataStore newMainQueueContext];
[mainQueueContext performBlock:^{
    <#code#>
}];

Caveats

When performing extremely intensive fetch operations (10+ seconds) on the background context while simultaneously needing to perform operations on another context, we will run into blocking, since both contexts share a persistent store coordinator. To prevent this, we should perform the operation on an entirely new context linked to an entirely new persistent store coordinator. This ensures that the operation stays in the background.

Useful utility methods

An extremely useful little one-liner turns an NSManagedObjectID into a string, which we can use to store the ID in user defaults:

@implementation NSManagedObjectID (TBExtras)

- (NSString *)stringRepresentation
{
    return [[self URIRepresentation] absoluteString];
}

@end

The flip side of this is getting an NSManagedObjectID back out of such a string. Add this method to your CoreDataStore:

+ (NSManagedObjectID *)managedObjectIDFromString:(NSString *)managedObjectIDString
{
    return [[[self defaultStore] persistentStoreCoordinator] managedObjectIDForURIRepresentation:[NSURL URLWithString:managedObjectIDString]];
}

With these two methods we have an easy way to build a cache on disk by using a plist. This is useful for saving a list of managed objects which need to be updated or maybe deleted between app launches.

Creating a managed object is a pain, so here is a little method which will make your life better:

@implementation NSManagedObject (TBAdditions)

+ (instancetype)createManagedObjectInContext:(NSManagedObjectContext *)context
{
    NSEntityDescription *entity = [NSEntityDescription entityForName:NSStringFromClass([self class]) inManagedObjectContext:context];
    return [[[self class] alloc] initWithEntity:entity insertIntoManagedObjectContext:context];
}

@end

Finally, while Apple does provide a method to get an NSManagedObject from an NSManagedObjectID, we often want to convert a whole array of IDs into objects. To do this, we can use the following:

@implementation NSManagedObjectContext (TBAdditions)

- (NSArray *)objectsWithObjectIDs:(NSArray *)objectIDs
{
    if (!objectIDs || objectIDs.count == 0) {
        return nil;
    }
    __block NSMutableArray *objects = [[NSMutableArray alloc] initWithCapacity:objectIDs.count];

    [self performBlockAndWait:^{
        for (NSManagedObjectID *objectID in objectIDs) {
            if ([objectID isKindOfClass:[NSNull class]]) {
                continue;
            }

            [objects addObject:[self objectWithID:objectID]];
        }
    }];

    return objects.copy;
}

@end

What's next?

I've placed the two Core Data stacks on GitHub.

Mark and Gordon talked about this on Build Phase episode 18.

Custom Ember Computed Properties

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

EmberJS has a lot of features for helping you build a clean JavaScript interface. One of my favorites is the computed property: Ember can watch a property or set of properties and, when any of those change, recalculate a value that is currently displayed on the screen:

fullName: (->
  "#{@get('firstName)} #{@get('lastName')}"
).property('firstName', 'lastName')

Any time the firstName or lastName of the current object changes, the fullName property will also be updated.

In my last project I needed to calculate the sum of a few properties in an array. I started with a computed property:

sumOfCost: (->
  @reduce ((previousValue, element) ->
    currentValue = element.get('cost')
    if isNaN(currentValue)
      previousValue
    else
      previousValue + currentValue
  ), 0
).property('@each.cost')

This works fine but I need to use this same function for a number of different properties on this controller as well as others. As such, I extracted a helper function:

# math_helpers.js.coffee
class App.mathHelpers
  @sumArrayForProperty: (array, propertyName) ->
    array.reduce ((previousValue, element) ->
      currentValue = element.get(propertyName)
      if isNaN(currentValue)
        previousValue
      else
        previousValue + currentValue
    ), 0


# array_controller.js.coffee
sumOfCost: (->
  App.mathHelpers.sumArrayForProperty(@, 'cost')
).property('@each.cost')

This removes a lot of duplication, but I still have the cost property name in the helper method call as well as in the property declaration. I also have the 'decoration' of setting up a computed property in general.

What I need is something that works like Ember.computed.alias('name') but allows me to transform the object instead of just aliasing a property:

# computed_properties.js.coffee
App.computed = {}

App.computed.sumByProperty = (propertyName) ->
  Ember.computed "@each.#{propertyName}", ->
    App.mathHelpers.sumArrayForProperty(@, propertyName)

# array_controller.js.coffee
sumOfCost: App.computed.sumByProperty('cost')

This allows me to easily define a 'sum' for a property without a lot of duplication. In this application I have a lot of similar functions for computing information in arrays. Having one function for the calculation allowed me to unit test it and feel confident that it would work on any other object. It also makes the model or controller much simpler for anyone viewing the class for the first time.

Prevent Spoofing with Paperclip

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Egor Homakov recently brought to my attention a slight problem with how Paperclip handles some content type validations. Namely, if an attacker puts an entire HTML page into the EXIF tag of a completely valid JPEG and names the file "gotcha.html", they could potentially trick users into an XSS vulnerability.

Now, this is kind of a convoluted means of attacking. It involves:

  • A server that's running Paperclip configured to not validate content types or filenames
  • A front-end HTTP server that will serve the assets with a content type based on their file name
  • The attacker must get the user to load the crafted image directly (injecting it in an img tag is not enough)

Even with this list of requirements, it's possible, and so we need to take it seriously.

Content Type Spoof Detection

To combat this, we've released Paperclip 4.0 (and then quickly released 4.1), which has a few new restrictions in order to improve out-of-the-box security. The change that handles this problem directly is an automatic validation that checks uploaded files for content type spoofing. That is, if you upload a JPEG and name it .html, it's not going to get through. This happens automatically during the upload process, and uses the file command in order to determine the actual content type of the file. If you don't have file already (for example, because you're on Windows), you can install the file command separately.

Required Content Type or Filename Validations

Next, we're also turning on a new requirement: you must have a content type or filename validation, or you must explicitly opt out of it.

class ActiveRecord::Base
  has_attached_file :avatar

  # Validate content type
  validates_attachment_content_type :avatar, :content_type => /\Aimage/

  # Validate filename
  validates_attachment_file_name :avatar, :matches => [/png\Z/, /jpe?g\Z/]

  # Explicitly do not validate
  do_not_validate_attachment_file_type :avatar
end

Note that older versions of Paperclip are susceptible to this attack if you don't have a content type validation. If you do have one, then you are protected against people crafting images to perform this type of attack.

The filename validation is new with 4.0.0. We know that some people don't store content types on their models but still need a way to validate uploads. Using the file name can help ensure you're only getting the kinds of files you expect, and all Paperclip attachments have one. This will allow those users to upgrade without having to implement a possibly costly migration of content type data into their database.

Content Type Mapping

Immediately, some users reported problems with the spoof detection added in 4.0. In order to fix this, we released 4.1, which adds an option called :content_type_mappings that allows you to specify an extension that cannot otherwise be mapped. For example:

Paperclip.options[:content_type_mappings] = {
  :pem => "text/plain"
}

This will allow users to upload ".pem" files (public certificates for encryption), because file considers those files to be "text/plain". It tells Paperclip "I consider a .pem file that file calls 'text/plain' to be correct", and Paperclip will accept it.

Handling console.log errors

Posted 6 months back at Web On Rails

https://twitter.com/bansalakhil/status/433950675613921280

Announcing Taco Tuesdays: A Product Design Talk Series at thoughtbot SF

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

We're excited to introduce a new talk series focused on product design, hosted by thoughtbot in San Francisco.

On February 25th at 6:30pm, we will host the first Taco Tuesdays event at thoughtbot San Francisco (85 2nd St, Suite 700, 94105).

Our first two speakers are Adam Morse, product designer at Salesforce, and Wells Riley, product designer at KickSend and Hack Design. They will give talks on the topic: "What is a design problem you recently encountered, and how did you approach it?"

Food (in the form of tacos) and beverages will be provided.

RSVP at Eventbrite for free to join us. Hope to see you there!

Function Currying in CoffeeScript

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Have you ever had a function that takes two arguments, but you want to pass the second argument in later? Here's one possible example:

updateUsers = (db, users) ->
  _.map(users, (user) -> updateUser(db, user))

updateUser = (db, user) -> db.update("users", name: user.name)

This is messy and extremely hard to parse mentally. However, we can make it a bit cleaner using function currying. Named after the logician Haskell Curry, currying is the act of taking a function that takes multiple arguments and replacing it with a chain of functions that take a single argument each.

Curried functions take advantage of closures to emulate multiple arguments. In some functional languages, such as Haskell, and in the lambda calculus, curried functions are the only way to pass multiple arguments to functions.
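For comparison, the same idea is built into Ruby as Proc#curry. A quick sketch, assuming db and users are in scope:

update_user = ->(db, user) { db.update("users", name: user.name) }
update_for  = update_user.curry.(db) # a lambda awaiting its user argument
users.map(&update_for)               # updates every user against db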

If we curry our updateUser function, our code becomes much more readable.

updateUsers = (db, users) ->
  _.map(users, updateUser(db))

updateUser = (db) -> (user) -> db.update("users", name: user.name)

The resulting JavaScript for our updateUser function will look like this:

var updateUser = function(db) {
  return function(user) {
    return db.update("users", { name: user.name });
  };
};

It's a simple trick, but a prime example of how CoffeeScript's syntax can make certain tasks much cleaner!

Episode #439 - February 11th, 2014

Posted 6 months back at Ruby5

In this episode we cover structuring Sinatra apps with Trevi, REST clients with ActiveRestClient, supporting 12-Factor apps with ENV! (ENV_BANG), using Foreman to manage services, and a new DSL for creating objects with MooseX.

Listen to this episode on Ruby5

Sponsored by TopRubyJobs

If you're looking for a top Ruby job or for top Ruby talent, then you should check out Top Ruby Jobs. Top Ruby Jobs is a website dedicated to the best jobs available in the Ruby community.
This episode is sponsored by Top Ruby Jobs

Structuring Sinatra Apps with Trevi

Last week, Alex MacCaw posted an article on the Sourcing.io blog which focused on a very opinionated way to develop and structure Sinatra applications. He’s even released a companion gem called Trevi that bundles all of this knowledge up and helps you follow along.
Structuring Sinatra Apps with Trevi

ActiveRestClient

ActiveRestClient is a gem for accessing REST services in an ActiveRecord style. It aims to be a more flexible alternative to ActiveResource. It allows things like setting different endpoints for different REST actions and has additional features like built-in caching.
ActiveRestClient

ENV!

ENV! is a variant on dotenv for supporting 12-Factor apps, but one that provides a friendlier onboarding experience for a new application. Where dotenv just loads whatever is in your .env file into ENV, ENV! will fail loudly if required variables are undefined or missing, and it gives you the opportunity to provide helpful messages in that case.
ENV!

Using Foreman to Manage services

Maurício Linhares published an article last week detailing how to use Foreman to isolate and manage application development on OS X machines. He points out that while installing Postgres, for example, is a good thing, you don’t necessarily need it running all the time. The same is true for other application dependencies, like Redis.
Using Foreman to Manage services

MooseX

MooseX is a DSL that helps to make Object Oriented programming in Ruby easier, more consistent, and less tedious. The gem is maintained by Tiago Peczenyj and it's based on Perl's Moose and Moo, two very popular modules in the Perl community. With MooseX you can think more about what you want to do and less about the mechanics of OOP.
MooseX

RubyHeroes

The nominations are open for Ruby Heroes 2014. Head on over to rubyheroes.com, armed with the GitHub usernames of people who have made this past year in the Ruby community a pleasure for you.
RubyHeroes

Thank You for Listening to Ruby5

Ruby5 is released Tuesday and Friday mornings. To stay informed about and active with this podcast, follow the link below.

Thank You for Listening to Ruby5

brew leaves

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

No, it's not about tea. We're continuing our rundown of lesser-known Homebrew features with brew leaves. Let's check the brew man page:

leaves   Show installed formulae that are not dependencies of another installed formula.

Or, in more computer science-y terms, it shows you the leaves of the Homebrew dependency graph.

When to use it

brew leaves shows you programs that you can safely uninstall. If you want to clean house, just run brew leaves and happily uninstall:

$ brew leaves | wc -l
45
$ brew leaves
...
leiningen
...
pngcrush
...

We have 45 leaves. We haven't used leiningen in a while, and forgot pngcrush was even installed. Let's uninstall:

$ brew uninstall pngcrush leiningen
$ brew leaves | wc -l
43

We now have 2 fewer leaves. If pngcrush or leiningen were the only things that depended on a third package foo, then uninstalling those two packages would make foo a new leaf, since now nothing depends on foo.

Easily create a Brewfile

Brewfiles are an easy way to install frequently-used Homebrew packages on a new machine. We can easily create a Brewfile using brew leaves:

$ brew leaves | sed 's/^/install /' > Brewfile
$ wc -l Brewfile
42
$ head -3 Brewfile
install aspell
install bison
install colordiff

Now all 42 packages we depend on are neatly listed. One possible concern is that a package will be left out - for example, we use rbenv but it's not in the Brewfile. This is because we also have rbenv-gem-rehash installed, which depends on rbenv, making rbenv not a leaf. But since rbenv-gem-rehash depends on rbenv, installing it will also install rbenv. We're safe.

What's next?

You can learn how to start and stop background services in Homebrew. You can also take a deep dive into graph theory.

Announcing gitsh

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

gitsh is a new way to use Git: instead of running Git commands in a general purpose shell like zsh or bash, gitsh provides you with a dedicated shell just for your Git commands.


Many of the early Unix utilities, like dc, didn't take sub-commands the way Git and other modern programs do; instead, they launched a shell. For a program like Git, which has so many commands and options, interacting via a shell still makes a lot of sense, and so gitsh follows in this long Unix tradition.

Save yourself some typing

At its simplest, gitsh saves you from typing the word git over and over.

Git commands are very moreish: you almost never want just one. If you work with Git, this flurry of commands is probably very familiar to you:

$ git status
$ git add -p
$ git commit
$ git push

With gitsh this gets easier:

$ gitsh
gitsh@ status
gitsh@ add -p
gitsh@ commit
gitsh@ push
gitsh@ :exit
$

All of your Git aliases will work in gitsh too, so you can save yourself even more typing.

Deep integration

Now that we're in a dedicated Git shell, there's a lot more it can do than just save us a few keystrokes. gitsh is only concerned with Git, so it has all kinds of little ways to make using Git easier.

What's my status?

Of all the Git commands, I find myself using git status most often. If I'm about to commit, or push, or pull, it's a great way of quickly checking where things are up to.

In gitsh, if you hit return without entering a command, we assume you wanted a status, saving you even more typing and making it really easy to check the status after any command.

If you prefer the more taciturn output of git status -s, or find yourself using a completely different command with annoying regularity, you can always change gitsh's default command by setting the gitsh.defaultCommand variable using git config:

gitsh@ config --global gitsh.defaultCommand "status -s"

Tab completion and Git prompts

In gitsh you automatically get tab completion for commands, branch names, and paths, plus the name and status of the current branch in your prompt. For example, if everything is committed and your working directory is clean, the prompt is blue and ends with @, but if you have untracked files, the prompt is red and ends with !.

It is possible to set up parts of this in bash or zsh, but it can be fiddly to get working, easily broken, and can interact strangely with aliases and third-party Git commands.

Git environment variables

Like most general purpose Unix shells, gitsh also provides environment variables. You can set variables using the :set command and read them using a $ prefix:

gitsh@ :set message "A commit message"
gitsh@ commit -m $message

If the variable name contains a dot, it will temporarily override one of your git config settings, until the end of your gitsh session. This is useful when pair programming:

gitsh@ :set user.name "George Brocklehurst & Mike Burns"
gitsh@ :set user.email support+george+mburns@thoughtbot.com
gitsh@ commit -m "We are pair programming!"

Convinced?

If you're on Mac OS X, you can install gitsh via Homebrew:

brew tap thoughtbot/formulae
brew install gitsh

If you're on Linux, there are install instructions in the gitsh README.

Don't forget to check out the man page:

man gitsh

And if you do find a bug, please report it on the gitsh GitHub repo.

How to Evaluate Your Rails JSON API for Performance Improvements

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Let's say your company's product is a mobile app that gets its data from an internal JSON API. The API, built using Rails, is a few years old. Response objects are large, request latency is high, and your data indicates mobile users aren’t converting because of it.

It can be tempting to immediately dig into your code and look for N+1 queries to refactor. But if you have the time and bandwidth, try to view this as a great opportunity to take a step back and rethink the high-level requirements for your JSON API. Starting with a conversation about the desired functionality of each endpoint will help keep your team's efforts focused on delivering no more than is required by the client, as efficiently as possible.

Grab your team for a whiteboarding session and review your assumptions about the behavior of each API endpoint:

  • How is this endpoint currently being used by the client?
  • What information does the client require for display to the user?
  • What needs to be done on the server side before sending a response to the client?
  • How frequently does the response content change?
  • Why does the response content change?

With the big picture in mind, review your Rails code to identify opportunities for improving performance. In addition to those N+1 queries, keep an eye out for the following patterns:

The response object has properties the client doesn’t use

If you're using #as_json to serialize your ActiveRecord models, it's possible your application is returning more than the client needs. To address this, consider using ActiveModel Serializers instead of #as_json.
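A serializer makes the attribute whitelist explicit. Here's a minimal sketch, assuming a hypothetical User model with more columns than the client actually renders:

class UserSerializer < ActiveModel::Serializer
  # Only the attributes the client displays; everything else on the
  # users table stays out of the JSON payload.
  attributes :id, :name, :avatar_url
end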

The delivery of the response has unnecessary dependencies

Let's say your API has an endpoint the client uses for reporting analytics events. Your controller might look something like this:

class AnalyticsEventsController < ApiController
  def create
    job = AnalyticsEventJob.new(params[:analytics_event])

    if job.enqueue
      head 201
    else
      head 422
    end
  end
end

Something to consider here is whether the client really needs to know if enqueueing the job is successful. If not, a simple improvement which preserves the existing interface might look something like this:

class AnalyticsEventsController < ApiController
  before_filter :ensure_valid_params, only: [:create]

  def create
    job = AnalyticsEventJob.new(analytics_event_params)
    job.enqueue
    head 201
  end

  private

  def ensure_valid_params
    unless analytics_event_params.valid?
      head 422
    end
  end

  def analytics_event_params
    @analytics_event_params ||= AnalyticsParametersObject.new(
      params[:analytics_event]
    )
  end
end

With these changes, the server will respond with a 422 only when the request parameters are invalid.

Static responses aren't being cached effectively

It's possible your Rails application is handling more requests than necessary. Data which is requested frequently by the client but changes infrequently – the current user, for example – presents an opportunity for HTTP caching. Think about using a CDN like Fastly to provide a caching layer.
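Rails' conditional GET support pairs well with such a caching layer. Here's a minimal sketch, assuming a hypothetical show action serving the current user:

class UsersController < ApiController
  def show
    user = current_user

    # stale? writes the ETag and Last-Modified headers, and returns a
    # 304 Not Modified when the client's cached copy is still current,
    # so the JSON body is only rendered when something has changed.
    if stale?(etag: user, last_modified: user.updated_at)
      render json: user
    end
  end
end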

What's next?

The next step after implementing optimizations for performance is to measure performance gains. You can use tools like JMeter or services like BlazeMeter and Blitz.io to perform load tests in your staging environment.

It's good to keep in mind that through the process of evaluating and improving your Rails application, your team may discover your API is out of date with the needs of the client. You may also see opportunities to move processes currently handled by your Rails application (e.g. persisting and reporting on analytics events) into separate services.

If an API redesign is in order and the idea of non-RESTful routing doesn't make you too uncomfortable, you can explore the possibility of adding an orchestration layer to your API.