Great Gray Landing

Posted 5 months back at Mike Clark

Great Gray Landing

A wild great gray owl returns to its perch after hunting the fields of a ranch near Jackson, WY. Spending a few days photographing this magnificent creature up close is an experience I'll never forget.

FactoryGirl for seed data?

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Occasionally, somebody recommends or asks about using FactoryGirl to generate seed data for an application. We recommend against it for a few reasons:

  1. The default attributes of a factory may change over time. This means that, when you're using FactoryGirl to generate data in seeds.rb, you'll need to explicitly assign each attribute in order to ensure that the data is correct, defeating the purpose.
  2. Attributes may be added, renamed, or removed. This means your seeds file will always need to be up to date with your factories as well as your database schema. You likely won't be running rake db:seed every time you change a migration. So, your seeds file may become out of sync and it won't be immediately obvious. You'll likely notice a breakage when a new developer comes onto the project.
  3. Data will still need to be checked for presence before insertion. ActiveRecord gives you this with the "find or create" methods, which locate a record or create it if it can't be found. Those methods also effectively force you to define each attribute you want assigned, making FactoryGirl unnecessary (see the sketch below).
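
For illustration, here's a minimal seeds.rb sketch using Rails 4's find_or_create_by; the Plan model and its attributes are hypothetical:

# db/seeds.rb
# Idempotent: finds the plan by name, or creates it with the
# block-supplied attributes when no matching record exists.
Plan.find_or_create_by(name: "Basic") do |plan|
  plan.price_in_cents = 0
  plan.max_users      = 5
end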


Honey, I shrunk the internet! - Content Compression via Rack::Deflater

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Speed is key. The snappier the site, the better visitors like it. Speed is so important that Google factors it into site rankings and makes it the focus of its PageSpeed Tools.

The Rack::Deflater middleware compresses responses at runtime using deflate or trusty ol' gzip. Inserted correctly into your Rack app, it can drastically reduce the size of your HTML / JSON controller responses (numbers below!). On a default Heroku Rails 3 deployment, it can also be configured to compress assets delivered by your dynos.

There are other (possibly better) places to handle content compression, for example:

  • A frontend proxy / load balancer,
  • Your CDN,
  • By pre-compressing content and serving that from your web server.

We're going to talk about the simplest thing that'll work for most Heroku-hosted Rails apps.

Rails 3 and 4

Add it to config/application.rb thusly:

module YourApp
  class Application < Rails::Application
    config.middleware.use Rack::Deflater
  end
end

And your HTML, JSON and other Rails-generated responses will be compressed.

Rails 3 and Runtime Asset Compression

If you're running Rails 3 (which will serve static assets for you), you can use Rack::Deflater for runtime asset compression as well. Configure it thusly:

module YourApp
  class Application < Rails::Application
    config.middleware.insert_before ActionDispatch::Static, Rack::Deflater
  end
end

Inserting Rack::Deflater before ActionDispatch::Static means you'll get runtime compression of assets served from Heroku in addition to the HTML, JSON, XML and other content your app returns.

Rails 4 assumes you're serving assets from a CDN or via your webserver (and not your Rails processes) so ActionDispatch::Static middleware isn't enabled by default. If you try to insert Rack::Deflater before it, you'll get errors.

spec'ing

Controller specs skip Rack middleware, so you need to assert that content is or isn't compressed in a feature spec, which exercises the full Rails stack. We're using Capybara's RSpec support for our integration specs. Example:

# spec/integration/compression_spec.rb
require 'spec_helper'

feature 'Compression' do
  scenario "a visitor has a browser that supports compression" do
    ['deflate', 'gzip', 'deflate,gzip', 'gzip,deflate'].each do |compression_method|
      get root_path, {}, {'HTTP_ACCEPT_ENCODING' => compression_method }
      response.headers['Content-Encoding'].should be
    end
  end

  scenario "a visitor's browser does not support compression" do
    get root_path
    response.headers['Content-Encoding'].should_not be
  end
end

Compression Overhead - Dynamic Content

There is overhead to compressing content - let's see how significant it is. These tests were run against a fairly typical Rails 3.2 app.

We'll run siege against a dynamic content page running under thin for 30 seconds, simulating 10 concurrent users. The numbers below are typical results; I ran siege numerous times for each scenario.

siege -t30s -c 10 'http://127.0.0.1:3000/contact'
                   Before Rack::Deflater   After Rack::Deflater
Transactions       271 hits                265 hits
Data transferred   1.24 MB                 0.45 MB
Transaction rate   9.19 trans/sec          8.93 trans/sec

So we were a little slower, but the content is around a third of the size.

Compression Overhead - Static Content

Let's try against static asset content.

siege -t10s -c 10 'http://localhost:3000/assets/application.css'
                   Before Rack::Deflater   After Rack::Deflater
Transactions       22601 hits              4052 hits
Data transferred   658.26 MB               21.44 MB
Transaction rate   2374.05 trans/sec       443.33 trans/sec

So when we benchmark the raw speed of compressing static assets, yes, there's overhead: it's around 5 times slower. The benefit, though, is that application.css is nearly 5 times smaller - 6.2KB instead of 30.1KB.

Plus we're still serving 443 requests per second. That is far beyond the demands that'd be put on any one dyno, and at traffic levels like this you're probably already using a CDN.

Real world impact

Once you've enabled the Rack::Deflater middleware, you should see compression statistics in the Chrome web inspector for (at least) your Rails-generated content. For example:

Compression Results

Assets are also being compressed in this Rails 3.2 app via Rack::Deflater, hence the compression ratios for application.css and application.js.

                            Before Rack::Deflater   After Rack::Deflater   Compression Rate
Google PageSpeed analysis   79 of 100               93 of 100
application.css             30.1KB                  6.2KB                  79%
application.js              117.0KB                 40.8KB                 65%
page html                   4.3KB                   2.6KB                  40%
Total size                  151.4KB                 49.6KB                 67%

So with minimal effort we were able to cut our page sizes to a third of their original size and bump our PageSpeed analysis up 14 points.

Less Painful Heroku Deploys

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

When a Rails app boots, the ActiveRecord classes reflect on the database to determine which attributes and associations to add to the models. The config.cache_classes setting is true in production mode and false in development mode.

During development, we can write and run a migration and see the change take effect without restarting the web server, but in production we need to restart the server for the ActiveRecord classes to learn about the new information from the database. Otherwise, the database will have the column but the ActiveRecord classes will have been cached based on the old information.

An app running on Heroku must therefore be restarted post migration.
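
For reference, here's that sequence as shell commands, a hedged sketch assuming the Heroku toolbelt and a hypothetical app name:

$ heroku run rake db:migrate --app myapp-production
$ heroku restart --app myapp-production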

You may sometimes run into an issue I call "stuck dynos", where not every process seems to be aware of new columns. Restarting your Heroku app will fix this problem.

You can introspect your running Heroku application to see if this problem has occurred.

Here's a real example:

$ production ps
=== web: `bundle exec rails server thin start -p $PORT -e $RACK_ENV`
web.1: up 2013/01/25 16:33:07 (~ 18h ago)
web.2: up 2013/01/25 16:47:15 (~ 18h ago)

=== worker: `bundle exec rake jobs:work`
worker.1: up 2013/01/25 17:30:58 (~ 17h ago)

The ups tell me the processes are running. They will say crashed if there's a problem.

If I were to run production tail (from Parity), I might just be watching the stream looking for anything unusual, like 500 errors. Sometimes error reporting services are delayed so there is no faster way to know about a post-deploy issue than tailing the logs while running through some critical workflows in the app.

If something looks unusual, I might then move over to the logging service we have set up (typically Splunk Storm or Papertrail) to run some searches to see how often the problem is coming up or if it looks new post-deploy.

New Relic or Airbrake will likely have more backtrace information by this time and we can make a decision about whether to roll back the deploy, or work on a hot fix as the next action, or record the bug and place it lower in the backlog.


Don’t Talk to (Just) Me

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Having chatrooms for your company is great. They aid communication, especially across multiple locations; they provide an archive of past conversations for those who were not around; and they can serve to refresh the memory of those who were. As your team grows, you may need to create multiple rooms with more specific topics, but discourage one-on-one chats, as they do not provide the same benefits.

Our Setup

Campfire Lobby

At thoughtbot, we have Everyone, Code, Design, and a couple of Meeting rooms that can be used for one-off conversations that take longer and might not be interesting to everyone in the company. The entire company has access to these rooms. We also have rooms specific to projects to which only developers and designers working on those projects have access.

We have arrived at this layout after a lot of experimentation, and it is certainly a living thing—we are always interested in trying something new out in case it works better.

One thing that we have determined is not a good idea is one-on-one chats. These types of conversation tend to contain information which would be valuable to the entire team, such as “hey, do you remember why you made this decision?”, or which could benefit from the insight of others: “do you know why this test is failing?” From our Playbook:

When things are only said on the phone, in person, in emails that don’t include the whole group, or in one-on-one chats, information gets lost, forgotten, or misinterpreted. The problems expand when someone joins or leaves the project.

Benefits

Having these discussions in a more public location encourages others to participate in the conversation and allows them to make the decision themselves to contribute, read and possibly learn from, or ignore if it doesn't apply to them. Here is an example of where both happen:

Ben: So this is odd. When I moved to ruby 2.0 I had to add "host: localhost" to my database.yml or it couldn't find the server.

Caleb: I used to need that. It turned out to be a problem with the pg gem, and I had to reinstall postgres and pg to fix it.

Joe: Works best while listening to Tool and Radiohead simultaneously

Joel: thoughts on this as an alternative to Postgres view homebrew - http://postgresapp.com

Sean: I use it, and love it.

Joe: That's what Heroku recommends: https://devcenter.heroku.com/articles/heroku-postgresql#set-up-postgres-on-mac

Best Practices

Things like mentioning a person’s name and setting up notifications for things that interest you can help keep up with several chat rooms. Mine are: my first name, my GitHub and Twitter usernames, ‘coffee’, ‘alaska’, /game ?night/, and ‘griddler’.

It is also important to remember that when company wide transient communication is happening in shared rooms, you need not worry about keeping up with everything. The most important information should be shared in more permanent ways, such as through Basecamp or email.

Keep conversations public. It encourages participation, builds culture, and reduces the cognitive overhead of one-on-one conversations, which you feel more obligated to budget attention to. Many chat programs provide a searchable archive, which is useful when looking up specific conversations or topics which have been discussed. With good discipline in putting communications in the right room or in a more asynchronous place, communication across your organization can improve greatly.

Episode #418 – November 9th, 2013

Posted 5 months back at Ruby5

Live from RubyConf Miami Beach 2013

Listen to this episode on Ruby5

New Relic
New Relic is _the_ all-in-one web performance analytics product. It lets you manage and monitor web application performance, from the browser down to the line of code. With Real User Monitoring, New Relic users can see browser response times by geographical location of the user, or by browser type.

Terence Lee
Terence talks about the upcoming release for Bundler

Jim Gay
Jim tells us about his RubyConf talk and his thoughts about presenters and applications architecture

Mike Perham
Mike talks about Sidekiq and his open source projects.

Thank You for Listening to Ruby5
Ruby5 is released Tuesday and Friday mornings.

A Tour of Rails’ jQuery UJS

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

If you have a look at the default application.js file generated by Rails, you’ll see //= require jquery_ujs. You might know exactly what this require is for, but if you’re like me you know you need it but have only a vague idea of what it’s responsible for.

Maybe you’ve looked at that line and thought, “I should really figure out what that does someday.” Well, for me, today is that day. I thought you might want to join me.

Unobtrusive JavaScript

The UJS in jquery-ujs stands for unobtrusive JavaScript. This is a rather broad term that generally refers to using JavaScript to progressively enhance the user experience for capable browsers without negatively impacting clients that do not support or do not enable JavaScript.

jquery-ujs wires event handlers to eligible DOM elements to provide enhanced functionality. In most cases, the eligible DOM elements are identified by HTML5 data-* attributes.

Let's have a look at the progressive enhancements jquery-ujs provides.

POST, PUT, DELETE Links

<%= link_to 'Delete', item, method: :delete %>

Clicking a link will always result in an HTTP GET request. If your link represents an action on a resource it may be more semantically correct for it to be performed with a different HTTP verb; in the case above, we want to use DELETE.

jquery-ujs attaches a handler to links with the data-method attribute. When the link is clicked, the handler constructs an HTML form along with a hidden input that sets the _method parameter to the requested HTTP verb, and submits that form rather than following the link.
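
For reference, the link_to call above renders something like the following anchor (the path is illustrative); the data-method attribute is what jquery-ujs keys off:

<a href="/items/1" data-method="delete" rel="nofollow">Delete</a>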

Confirmation Dialogs

<%= form_for item, data: { confirm: 'Are you sure?' } %>

jquery-ujs attaches a handler to links or forms with the data-confirm attribute that displays a JavaScript confirmation dialog. The user can choose to proceed with or cancel the action.

Disabling Buttons

<%= form.submit data: { disable_with: 'Submitting...' } %>

Users double click links and buttons all the time. This causes duplicate requests but is easily avoided thanks to jquery-ujs. A click handler is added that updates the text of the button to that which was provided in the data-disable-with attribute and disables the button. If the action is performed via AJAX, the handler will re-enable the button and reset the text when the request completes.

AJAX Forms

<%= form_for item, remote: true %>

Adding remote: true to your form_for calls in Rails causes jquery-ujs to handle the form submission as an AJAX request. Your controller can handle the AJAX request and return JavaScript to be executed in the response. Thanks to jquery-ujs and Rails’ respond_with, setting remote: true is likely the quickest way to get your Rails application making AJAX requests. The unobtrusive nature of the implementation makes it simple to support both AJAX and standard requests at the same time.
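
For illustration, a minimal controller sketch for handling both the AJAX and standard submissions; the Item model, its routes, and the create.js.erb template are hypothetical:

# app/controllers/items_controller.rb
class ItemsController < ApplicationController
  def create
    @item = Item.create(item_params)

    respond_to do |format|
      format.html { redirect_to @item }  # standard form submission
      format.js   # AJAX: renders app/views/items/create.js.erb
    end
  end

  private

  # Hypothetical strong parameters for the hypothetical Item model.
  def item_params
    params.require(:item).permit(:name)
  end
end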

AJAX File Uploads

Browsers do not natively support AJAX file uploads. If you have an AJAX form that contains a populated file input, jquery-ujs will fire the ajax:aborted:file event. If this event is not stopped by an event handler, the AJAX submission will be aborted and the form will submit as a normal form.

Remotipart is one Rails gem that hooks into this event to enable AJAX file uploads.

Required Field Validation

HTML5 added the ability to mark an input as required. Browsers with full support for this feature will stop form submission and add browser-specific styling to the inputs that are required but not yet provided. jquery-ujs adds a polyfill that brings this behavior, minus the styling, to all JavaScript-enabled browsers. There’s no default styling provided by the polyfill, which means users of impacted browsers may be puzzled as to why the form will not submit. There’s an ongoing discussion about the appropriateness of this polyfill.

You can opt-out of this behavior by setting the “novalidate” attribute on your form. This will cause both the jquery-ujs polyfill and browsers with native support to skip HTML5 input validation. Given both the potential for confusion in browsers without native support and the fact that browsers with native support apply styles that may clash with your site design, handling validation is probably better left up to each developer.
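
For example, with the form builder (a minimal sketch):

<%= form_for item, html: { novalidate: true } do |form| %>
  <%= form.submit %>
<% end %>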

Cross-Site Request Forgery Protection

Cross-Site Request Forgery (CSRF) is an attack wherein the attacker tricks the user into submitting a request to an application the user is likely already authenticated to. The user may think he’s simply signing up for an email newsletter, but the attacker controlling that sign-up form is actually turning it into a request to post a status to some other website, using the user’s preexisting session.

Rails has built-in protection which requires a token, available only on your actual site, to accompany every POST, PUT, or DELETE request. In collaboration with <%= csrf_meta_tags %> in your application layout’s HEAD, jquery-ujs augments this protection by adding the CSRF token to a header on outgoing AJAX requests.
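
For reference, csrf_meta_tags renders something like the following (token value omitted); jquery-ujs reads these to populate the X-CSRF-Token header on AJAX requests:

<meta name="csrf-param" content="authenticity_token" />
<meta name="csrf-token" content="..." />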

jquery-ujs also updates the CSRF token on all non-AJAX forms on page load, since the token may be out-of-date if the form was rendered from a fragment cache.

Extensibility

jquery-ujs exposes its functions in the $.rails namespace and fires many events when submitting AJAX forms. jquery-ujs behavior can be customized by overriding these methods or handling the appropriate events. We’ve already seen Remotipart as an example of custom event handling. There are also several gems that override $.rails.allowAction to replace JavaScript confirmation dialogs with application-specific modal dialogs.

Docker-friendly Vagrant boxes

Posted 5 months back at Phusion Corporate Blog

Vagrant

We heavily utilize Vagrant in our development workflow. Vagrant is a tool for easily setting up virtual machines as development environments, making it easy to distribute development environments and making them reconstructible and resettable. It has proven to be an indispensable tool for development teams of more than one person, especially when not everybody uses the same operating system.

Lately we’ve been working with Docker, which is a cool new OS-level virtualization technology. Docker officially describes it as “iPhone apps for your server”, but being the hardcore system-level guys that we are, we dislike this description. Instead we’d like to describe Docker as “FreeBSD jails for Linux + an ecosystem to make it a joy to use”. Docker, while still young and not production-ready, is very promising and can make virtualization cheap and efficient.

Googling for Vagrant and Docker will yield plenty of information and tutorials.

Today, we are releasing Docker-friendly Vagrant boxes based on Ubuntu 12.04. Docker requires at least kernel 3.8, but all the Ubuntu 12.04 Vagrant boxes we’ve encountered so far come with kernel 3.2 or 3.5, so installing Docker on them requires a reboot. This makes provisioning a VM significantly more painful than it should be.

The Vagrant boxes that we’re releasing also come with a bigger virtual hard disk (40 GB) so that you don’t have to worry about running out of disk space inside your VM.

The Vagrant boxes can be found here:

https://oss-binaries.phusionpassenger.com/vagrant/boxes/

Please feel free to link to them from your Vagrantfile.
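
For example, a minimal Vagrantfile sketch; the box name and the exact .box filename under that URL are assumptions:

# Vagrantfile
Vagrant.configure("2") do |config|
  # The box name is arbitrary; the URL points at Phusion's box directory.
  config.vm.box     = "phusion-docker-ubuntu-12.04"
  config.vm.box_url = "https://oss-binaries.phusionpassenger.com/vagrant/boxes/ubuntu-12.04-amd64-vbox.box"
end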

These Vagrant boxes are built automatically from Veewee definitions so that you can rebuild them yourself. Our definitions can be found on GitHub: https://github.com/phusion/open-vagrant-boxes

Enjoy these Vagrant boxes!

You may also want to check out our other products, such as Phusion Passenger, an application server for Ruby, Python, Node.js and Meteor that makes deployment extremely simple.

Discuss this on Hacker News.

Announcing Bitters, a Dash of Sass Stylesheets for Bourbon and Neat

Posted 5 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

The designers here like to have simple stylesheets when starting a new project: styles that are better looking than browser defaults but that won’t dictate our visual design moving forward. These styles should also remove a lot of the duplicated work we do when getting a project off the ground so we can start solving harder problems faster.

Our long-time solution to this problem was the stylesheets in Flutie, but those had grown outdated; the defaults and reset were getting in the way. We were continually overriding the styles, which added unnecessary bloat to our CSS. These styles weren’t doing their job; they were slowing us down at the start of a project, not helping us speed up.

Instead of trying to refine the styles in Flutie, we decided to remove them and let Flutie stand alone as ActionView helpers. We didn’t want to duplicate the problems that we had with Flutie’s styles, so we started from scratch. We also wanted to integrate more fully with Bourbon and Neat.

Bitters aims to solve the same set of problems as the Flutie stylesheets did, as well as the problems that arose from Flutie’s inattention. Unlike Flutie, the Bitters files should be installed into your Sass directory and imported at the top of your main stylesheet. Once installed, the files should be edited to fit the style of the site or application. This way you won’t be overriding styles or adding unnecessary cruft to your stylesheets; instead you’ll be building on top of the foundation that Bitters provides.

Bitters gives you plain variables for type, sizes, and color; a simple grid using Neat; smart defaults for typography; and simply styled flashes for notifications or errors. Most importantly, it sets up a consistent language and structure for your Sass.

We are still working out some of the details and I appreciate any feedback you might have.

The Product Design Sprint

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

A Product Design Sprint is a 5-phase exercise which uses design thinking to reduce the inherent risks in successfully bringing products to market. We've done six product design sprints so far and have made them a default part of our consulting engagements.

Participating in a Design Sprint orients the entire team and aims their efforts at hitting clearly defined goals. Sprints are useful starting points when kicking off a new feature, workflow, product, or business, or when solving problems with an existing product.

Integrating design sprints and design thinking into our product development process keeps us aligned with our goals, and helps us invest our time and money wisely.

Design Thinking

Design Thinking combines empathy, creativity and rationality to solve human-centered problems. It is the foundation on which a Design Sprint is built.

Empathy

With Design Thinking, we use empathy to see the world through our customers’ eyes and understand their problems as they experience them. There may be many technological, financial, political, religious, human, social, and cultural forces involved. It is our job to develop a holistic understanding of these problems and forces and contextualize them in a greater world schema.

In addition to our own perspective, we aim to understand the perspectives of as many other people as possible to better diversify our understanding.

Empathy is the primary focus of Phase 1 (Understand) and a major part of Phase 5 (Test and Learn). We should aim to always maintain empathy when solving problems and building products for humans.

Creativity

Creativity is opportunity discovery. We use creativity to generate insights and solution concepts.

The most creative solutions are inspired by unique insights and intersecting perspectives. Empathy, as described above, empowers our ability to understand different perspectives and be more creative.

Collaboration inspires creativity. More perspectives, ideas, and insights lead to more opportunity.

Creativity is the focus of Phase 2 (Diverge), but is present in all phases (developing prototypes, testing/interviewing, researching/observing, creating experiments, etc.)

Rationality

We use rationality to fit solutions to the problem context through experimentation, testing and qualitative/quantitative measurements. This is the primary focus of Phase 3 (Converge) and Phase 5 (Test and Learn).

Design Thinking should pervade all of our processes outside the design sprint as well, from engineering to marketing to business development. In a complex business ecosystem design thinking can be used as a holistic approach to facilitating and maintaining a symbiotic relationship with your customers.

The Sprint Phases

A typical length for a project kick-off sprint is five days, with each day representing a different phase.

This timeframe is not rigid and should adapt to the specific needs of the problem. For example, some phases may need more than a full day and others may need less.

The aim is to develop a product or feature idea into a prototype that can be tested to help us fill our riskiest knowledge gaps, validate or invalidate our riskiest assumptions and guide future work.


Phase 1: Understand

Goal:

Develop a common understanding of the working context including the problem, the business, the customer, the value proposition, and how success will be determined.

By the end of this phase, we also aim to have identified some of our biggest risks and started to make plans for reducing them.

Why:

Common understanding will empower everyone’s decision making and contributions to the project.

Understanding our risks enables us to stay risk-averse and avoid investing time and money on things that rely on unknowns or assumptions.

Activities:

  • Define the Business Opportunity.
  • Define the Customer.
  • Define the Problem.
  • Define the Value Proposition (why will people pay you?).
  • Define context-specific terms (this will act as a dictionary).
  • Discuss short term and long term business goals (What’s the driving vision?).
  • Gather and analyze existing research.
  • Fill out the Business Model Canvas (this should be continually revisited).
  • Capture our analysis of competitive products.
  • Gather inspirational and informative examples of other people/products solving similar or analogous problems.
  • If there is an existing site/app, map out the screens.
  • As they come up in discussion, capture assumptions and unknowns on a wall or board with sticky notes. Later we can revisit this wall, group related items together and make plans to eliminate risky unknowns and address risky assumptions.

All of these definitions are expected to change as we move forward and learn more.

Deliverables:

  • Notes & documentation capturing the definitions and goals we discussed throughout the day. These notes should provide a solid reference and help with onboarding others later on.
  • A plan for initiating the next phase of the sprint.

Phase 2: Diverge

Goal:

Generate insights and potential solutions to our customers’ problems.

Explore as many ways of solving the problems as possible, regardless of how realistic, feasible, or viable they may or may not be.

Why:

The opportunity this phase generates enables us to evaluate and rationally eliminate options and identify potentially viable solutions to move forward with. This phase is also crucial to innovation and marketplace differentiation.

Activities:

  • Constantly ask, “How might we…”.
  • Generate, develop, and communicate new ideas.
  • Quick and iterative individual sketching.
  • Group sketching on whiteboards.
  • Mind Mapping individually and as a group.

Deliverables:

  • Critical path diagram: highlights the story most critical to the challenge at hand. Where does your customer start, where should they end up and what needs to happen along the way?


  • Prototype goals: What is it we want to learn more about? What assumptions do we need to address?

Phase 3: Converge

Goal:

Take all of the possibilities exposed during phases 1 and 2, eliminate the wild and currently unfeasible ideas, and home in on the ideas we feel best about.

These ideas will guide the implementation of a prototype in phase 4 that will be tested with existing or potential customers.

Why:

Not every idea is actionable or feasible and only some will fit the situation and problem context. Exploring many alternative solutions helps provide confidence that we are heading in the right direction.

Activities:

  • Identify the ideas that aim to solve the same problem in different ways.
  • Eliminate solutions that can’t be pursued currently.
  • Vote for good ideas.
  • Storyboard the core customer flow. This could be a work flow or the story (from the customer’s perspective) of how they engage with, learn about, and become motivated to purchase or use a product or service.

Deliverables:

  • The Prototype Storyboard: a comic book-style story of your customer moving through the previously-defined critical path. The storyboard is the blueprint for the prototype that will be created in phase 4.


  • Assumptions Table: A list of all assumptions inherent in our prototype, how we plan on testing them, and the expected outcomes which validate those assumptions.


Phase 4: Prototype

Goal:

Build a prototype that can be tested with existing or potential customers.

The prototype should be designed to learn about specific unknowns and assumptions. Its medium should be determined by time constraints and learning goals. Paper, Keynote, and simple HTML/CSS are all good prototyping media.

The prototype storyboard and the first three phases of the sprint should make prototype-building fairly straightforward. There shouldn’t be much uncertainty around what needs to be done.

Why:

A prototype is a very low cost way of gaining valuable insights about what the product needs to be. Once we know what works and what doesn’t we can confidently invest time and money on more permanent implementation.

Activities:

  • Prototype implementation.

Deliverables:

  • A testable prototype.
  • A plan for testing. If we are testing workflows, we should also have a list of outcomes we can ask our testers to achieve with our prototype.

Phase 5: Test & Learn

Goal:

Test the prototype with existing or potential customers.

It is important to test with existing or potential customers because they are the ones you want your product to work for and be valuable to. Their experience with the problem and their knowledge of the context shape their interaction with your product in ways that non-customers’ can’t.

Why:

Your customers will show you the product they need. Testing our ideas helps us learn more about things we previously knew little about and gives us a much clearer understanding of which direction we should move next. It can also help us course-correct and avoid building the wrong product.

Activities:

  • Observe and interview customers as they interact with your prototype.
  • Observe and interview customers as they interact with competitive products.

Deliverables:

  • Summary/report of our learnings from testing the prototype.
  • A plan for moving forward beyond the design sprint.

Closing

Our Product Design Sprint process has been heavily informed by IDEO’s Human Centered Design Toolkit and a series of blog posts by Google Ventures and we are grateful for the information they have shared.

We want the work we do to have a positive impact on the world. Our goal is not just to build “a” product, but to build the “right” product. A meaningful product that meets real people’s needs and can support a viable business.

We believe that Product Design Sprints and Design Thinking will help us bring more successful products and businesses to market.

If you'd like to hire us for a product design sprint, get in touch. Also, feel free to contact me if you're interested in talking and learning more about product design sprints.

Episode #417 – November 5th, 2013

Posted 6 months back at Ruby5

Rails 4.0.1, Keeping your yaml clean, Invoker, DevOps, Angular Rails, and Rumble Winners all on today's Ruby5!

Listen to this episode on Ruby5

This episode is sponsored by Top Ruby Jobs
If you're looking for a top Ruby job or for top Ruby talent, then you should check out Top Ruby Jobs. Top Ruby Jobs is a website dedicated to the best jobs available in the Ruby community.

Rails 4.0.1 Release

Cleaning your YAML
Last week our very own Carlos Souza wrote a blog post about keeping your yaml clean. This quick read shows you how to use anchors and aliases to dry up your configuration settings.
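
For a taste, here's a minimal sketch of anchors and aliases in a database.yml-style file (the keys are hypothetical):

defaults: &defaults
  adapter: postgresql
  pool: 5

development:
  <<: *defaults
  database: myapp_development

test:
  <<: *defaults
  database: myapp_test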

Invoker Gem
Invoker allows you to define an ini file with all of your related applications and the commands to start them. It even supports Pow-style .dev URLs and Procfiles.

DevOps for Developers
Zach Siri wrote to us about a video series he’s created, called DevOps for Developers covering web server basics, building static and dynamic virtual machines, serving multiple sites from them, and connecting virtual machines together.

Angular on Rails
Rocky Jaiswal wrote a blog post about building a project with Rails and Angular.js.

Rails Rumble Winners
The winners for this year’s Rails Rumble have been announced.

Catching Invalid JSON Parse Errors with Rack Middleware

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

There is a world where developers need never worry about poorly formatted JSON. This is not that world.

If a client submits invalid or poorly formatted JSON to a Rails 3.2 or 4 app, a cryptic and unhelpful error is thrown and they're left wondering why the request tanked.

Example errors:

The error thrown by the parameter-parsing middleware behaves differently depending on your version of Rails:

  • 3.2 throws a 500 error in HTML format (no matter what the client asked for in its Accept: header), and
  • 4.0 throws a 400 "Bad Request" error in the format the client specifies.

Here's the default Rails 3.2 error - not great.

> curl  -H "Accept: application/json" -H "Content-type: application/json" 'http://localhost:3000/posts' -d '{ i am broken'

<!DOCTYPE html>
<html>
  <!-- default 500 error page omitted for brevity -->
</html>

Here's the default Rails 4 error. Not bad, but it could be better.

> curl -H "Accept: application/json" -H "Content-type: application/json" 'http://localhost:3000/posts' -d '{ i am broken'

{"status":"400","error":"Bad Request"}%

Neither message tells the client directly about the actual problem - that invalid JSON was submitted.

Why?

The middleware that parses parameters (ActionDispatch::ParamsParser) runs long before your controller is on the scene, and throws exceptions when invalid JSON is encountered. You can't capture the parsing exception in your controller, as your controller is never involved in serving the failed request.

TDD All Day, Every Day.

Here's the test where we're looking for a more informative error message to be thrown. We're using curb in our JSON client integration tests to simulate a real-world client as closely as possible.

feature "A client submits JSON" do
  scenario "submitting invalid JSON", js: true do
    invalid_tokens = ', , '
    broken_json = %Q|{"notice":{"title":"A sweet title"#{invalid_tokens}}}|
   
    curb = post_broken_json_to_api('/notices', broken_json)
   
    expect(curb.response_code).to eq 400
    expect(curb.content_type).to match(/application\/json/)
    expect(curb.body_str).to match("There was a problem in the JSON you submitted:")
  end
   
  def post_broken_json_to_api(path, broken_json)
    Curl.post("http://#{host}:#{port}#{path}", broken_json) do |curl|
      set_default_headers(curl)
    end
  end
   
  def host
    Capybara.current_session.server.host
  end
   
  def port
    Capybara.current_session.server.port
  end
   
  def set_default_headers(curl)
    curl.headers['Accept'] = 'application/json'
    curl.headers['Content-Type'] = 'application/json'
  end
end

Middleware to the Rescue

Fortunately, it's easy to write custom middleware that rescues the errors thrown when JSON can't be parsed. To wit, the version for Rails 3.2:

# in app/middleware/catch_json_parse_errors.rb
class CatchJsonParseErrors
  def initialize(app)
    @app = app
  end

  def call(env)
    begin
      @app.call(env)
    rescue MultiJson::LoadError => error
      if env['HTTP_ACCEPT'] =~ /application\/json/
        error_output = "There was a problem in the JSON you submitted: #{error}"
        return [
          400, { "Content-Type" => "application/json" },
          [ { status: 400, error: error_output }.to_json ]
        ]
      else
        raise error
      end
    end
  end
end

And the Rails 4.0 version:

# in app/middleware/catch_json_parse_errors.rb
class CatchJsonParseErrors
  def initialize(app)
    @app = app
  end

  def call(env)
    begin
      @app.call(env)
    rescue ActionDispatch::ParamsParser::ParseError => error
      if env['HTTP_ACCEPT'] =~ /application\/json/
        error_output = "There was a problem in the JSON you submitted: #{error}"
        return [
          400, { "Content-Type" => "application/json" },
          [ { status: 400, error: error_output }.to_json ]
        ]
      else
        raise error
      end
    end
  end
end

The only difference is which error we rescue - MultiJson::LoadError under Rails 3.2, and the more generic ActionDispatch::ParamsParser::ParseError under 4.0.

What this does is:

  • Rescue the relevant parser error,
  • Look to see if the client wanted JSON by inspecting its HTTP_ACCEPT header,
  • If it wants JSON, give back a friendly JSON response with info about where parsing failed, and
  • If it wants something OTHER than JSON, re-raise the error so the default behavior takes over.

You need to insert the middleware before ActionDispatch::ParamsParser, thusly:

# in config/application.rb
module YourApp
  class Application < Rails::Application
    # ...
    config.middleware.insert_before ActionDispatch::ParamsParser, "CatchJsonParseErrors"
    # ...
  end
end

The results

Now when a JSON client submits invalid JSON, they get back something like:

> curl  -H "Accept: application/json" -H "Content-type: application/json" 'http://localhost:3000/posts' -d '{ i am broken'

{"status":400,"error":"There was a problem in the JSON you submitted: 795: unexpected token at '{ i am broken'"}

Excellent. Now we're telling our clients why we rejected their request and where their JSON went wrong, instead of leaving them to wonder.

Key/value logs in Go

Posted 6 months back at techno weenie - Home

I shipped GitHub's first user-facing Go app a month ago: the Releases API upload endpoint. It's a really simple, low-traffic service to dip our toes in the Go waters. Before I could even think about shipping it, though, I had to answer these questions:

  • How can I deploy a Go app?
  • Will it be fast enough?
  • Will I have any visibility into it?

The first two questions are simple enough. I worked with some Ops people on getting Go support in our Boxen and Puppet recipes. Considering how much time this app would spend in network requests, I knew that raw execution speed wasn't going to be a factor. To help answer question 3, I wrote grohl, a combination logging, error reporting, and metrics library.

import "github.com/technoweenie/grohl"

A few months ago, we started using the scrolls Ruby gem for logging on GitHub.com. It's a simple logger that writes out key/value logs:

app=myapp deploy=production fn=trap signal=TERM at=exit status=0
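
For illustration, a hedged sketch of producing a line like that with scrolls (the global_context call and the keys are assumptions based on the scrolls README):

require "scrolls"

# Global context is prepended to every line as key=value pairs.
Scrolls.global_context(app: "myapp", deploy: "production")
Scrolls.log(fn: "trap", signal: "TERM", at: "exit", status: 0)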

Logs are then indexed, giving us the ability to search logs for the first time. The next thing we did was add a unique X-GitHub-Request-Id header to every API request. This same request ID is sent down to internal systems, exception reporters, and auditors. We can use it to trace user problems across the entire system.

I knew my Go app had to be tied into the same systems to give me visibility: our exception tracker, statsd to record metrics into Graphite, and our log index. I wrote grohl to be the single source of truth for the app. Its default behavior is to just log everything, with the expectation that something would process them. Relevant lines are indexed, metrics are graphed, and exceptions are reported.

At GitHub, we're not quite there yet. So, grohl exposes both an error reporting interface, and a statter interface (designed to work with g2s). Maybe you want to push metrics directly to statsd, or you want to push errors to a custom HTTP endpoint. It's also nice that I can double check my app's metrics and error reporting without having to spin up external services. They just show up in the development log like anything else.

Episode #416 – November 1st, 2013

Posted 6 months back at Ruby5

Tear down the Rumble, fear Backbone and Angular, a dark statesman will arise and table_print pronto! It's a darker edition of the Ruby 5.

Listen to this episode on Ruby5

This episode is sponsored by New Relic
New Relic is _the_ all-in-one web performance analytics product. It lets you manage and monitor web application performance, from the browser down to the line of code. With Real User Monitoring, New Relic users can see browser response times by geographical location of the user, or by browser type.

Rails Rumble Gem Teardown
Adam from DwellableTrends once again collected the Gemfiles from this year's Rails Rumble participants and whipped them into a savory spread of statistics.

Contrasting Angular and Backbone
Victor Savkin compares Angular and Backbone in a new series of blog posts.

Statesman gem
Yet another state machine library! This one is slightly different and worth a look at the blog post.

table_print gem
The table_print gem from Chris Doyle allows you to view structured data in IRB in a nice tabular format.

Pronto automated code checks
Quick automated code review of your changes. Created to be used on pull requests, but suited for other scenarios as well. Perfect if you want to find out quickly whether a branch introduces changes that conform to your style guide, are DRY, don't introduce security holes, and more.

How To Efficiently Display Large Amounts of Data on iOS Maps

Posted 6 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

This tutorial will demonstrate how to handle and display thousands of points of data on an iOS map in a way people understand and enjoy.

We are going to make an iOS app which ships with 87,000 hotels, each with a coordinate, a name and a phone number. This app will never ask the user to "redo search in area"; it will update the map as the user pans and zooms, allowing the user to freely explore the data.

This will require us to come up with an ultra-quick data structure, and we will need to build it in C for it to be performant. Once we have constructed our data structure, we will come up with a clustering system so as not to overwhelm the user. Finally, we will give it the professional polish that is required for apps to compete in today's market.

By the end of this tutorial you will have all the components needed to make this app:

final app

The data structure

To come up with a good data structure for this problem, we first need to look at our data and determine what we want to do with it. Our data is represented by a set of coordinates (latitude, longitude) that we need to plot onto a map. The map is able to zoom and pan, meaning that we won't need to have all the points on screen at once. At its core, this is a search problem: we need to find all coordinates which lie in a certain range, i.e. within a rectangle.

A simple solution is to iterate through all our points and ask: (xMin <= x <= xMax and yMin <= y <= yMax). Unfortunately, this is an O(N) solution and won't work for our scenario.

A good approach to search problems is to look for exploitable symmetries which reduce the search space as we search. So how can we reduce our search space after every iteration of a search? If instead of indexing by points we index by region, we reduce our search space. This type of spatial indexing is achieved by a quad tree, which answers a query in O(H) (where H is the height of the tree at that point).

Quad Tree

A quad tree is a data structure comprising nodes which store a bucket of points and a bounding box. Any point which is contained within a node's bounding box is added to its bucket. Once the bucket fills up, the node splits itself into four nodes, each with a bounding box corresponding to a quadrant of its parent's bounding box. All points which would have gone into the parent's bucket now go into one of its children's buckets.

So how do we code this quad tree?

Let's define the base structures:

typedef struct TBQuadTreeNodeData {
    double x;
    double y;
    void* data;
} TBQuadTreeNodeData;
TBQuadTreeNodeData TBQuadTreeNodeDataMake(double x, double y, void* data);

typedef struct TBBoundingBox {
    double x0; double y0;
    double xf; double yf;
} TBBoundingBox;
TBBoundingBox TBBoundingBoxMake(double x0, double y0, double xf, double yf);

typedef struct quadTreeNode {
    struct quadTreeNode* northWest;
    struct quadTreeNode* northEast;
    struct quadTreeNode* southWest;
    struct quadTreeNode* southEast;
    TBBoundingBox boundingBox;
    int bucketCapacity;
    TBQuadTreeNodeData *points;
    int count;
} TBQuadTreeNode;
TBQuadTreeNode* TBQuadTreeNodeMake(TBBoundingBox boundary, int bucketCapacity);

The TBQuadTreeNodeData struct holds our coordinate (latitude and longitude).

void* data is a generic pointer, used to store whatever extra info we may need, e.g. the hotel name and phone number.

The TBBoundingBox struct represents a rectangle defined like a range query, i.e. (xMin <= x <= xMax && yMin <= y <= yMax), where the top left corner is (xMin, yMin) and the bottom right is (xMax, yMax).

Finally, we see TBQuadTreeNode, which holds four pointers – one for each of its quadrants. It has a boundary and an array (bucket).

The spatial indexing happens as we build the quad tree. Here is an animation of a quad tree being built:

quad tree build

The following code is exactly what happens in the above animation:

void TBQuadTreeNodeSubdivide(TBQuadTreeNode* node)
{
    TBBoundingBox box = node->boundingBox;

    double xMid = (box.xf + box.x0) / 2.0;
    double yMid = (box.yf + box.y0) / 2.0;

    TBBoundingBox northWest = TBBoundingBoxMake(box.x0, box.y0, xMid, yMid);
    node->northWest = TBQuadTreeNodeMake(northWest, node->bucketCapacity);

    TBBoundingBox northEast = TBBoundingBoxMake(xMid, box.y0, box.xf, yMid);
    node->northEast = TBQuadTreeNodeMake(northEast, node->bucketCapacity);

    TBBoundingBox southWest = TBBoundingBoxMake(box.x0, yMid, xMid, box.yf);
    node->southWest = TBQuadTreeNodeMake(southWest, node->bucketCapacity);

    TBBoundingBox southEast = TBBoundingBoxMake(xMid, yMid, box.xf, box.yf);
    node->southEast = TBQuadTreeNodeMake(southEast, node->bucketCapacity);
}

bool TBQuadTreeNodeInsertData(TBQuadTreeNode* node, TBQuadTreeNodeData data)
{
    // Bail if our coordinate is not in the boundingBox
    if (!TBBoundingBoxContainsData(node->boundingBox, data)) {
        return false;
    }

    // Add the coordinate to the points array
    if (node->count < node->bucketCapacity) {
        node->points[node->count++] = data;
        return true;
    }

    // Check to see if the current node is a leaf, if it is, split
    if (node->northWest == NULL) {
        TBQuadTreeNodeSubdivide(node);
    }

    // Traverse the tree
    if (TBQuadTreeNodeInsertData(node->northWest, data)) return true;
    if (TBQuadTreeNodeInsertData(node->northEast, data)) return true;
    if (TBQuadTreeNodeInsertData(node->southWest, data)) return true;
    if (TBQuadTreeNodeInsertData(node->southEast, data)) return true;

    return false;
}

Now that we have a quad tree built we need to be able to perform bounding box queries on it and have it return our TBQuadTreeNodeData structs. Here is an animation of a bounding box query for all annotations in the light blue region. When an annotation is found it turns green.

quad tree query

And here is the query code:

typedef void(^TBDataReturnBlock)(TBQuadTreeNodeData data);

void TBQuadTreeGatherDataInRange(TBQuadTreeNode* node, TBBoundingBox range, TBDataReturnBlock block)
{
    // If range is not contained in the node's boundingBox then bail
    if (!TBBoundingBoxIntersectsBoundingBox(node->boundingBox, range)) {
        return;
    }

    for (int i = 0; i < node->count; i++) {
        // Gather points contained in range
        if (TBBoundingBoxContainsData(range, node->points[i])) {
            block(node->points[i]);
        }
    }
    
    // Bail if node is leaf
    if (node->northWest == NULL) {
        return;
    }

    // Otherwise traverse down the tree
    TBQuadTreeGatherDataInRange(node->northWest, range, block);
    TBQuadTreeGatherDataInRange(node->northEast, range, block);
    TBQuadTreeGatherDataInRange(node->southWest, range, block);
    TBQuadTreeGatherDataInRange(node->southEast, range, block);
}

This data structure can perform bounding box queries at lightning speed. It can easily handle hundreds of queries, in a database containing hundreds of thousands of points, at 60 fps.

Creating the quad tree using the hotel data

The hotel data comes from POIplaza, formatted as CSV. We are going to read it from disk, parse it, and build a quad tree from it.

The code to build the quad tree will reside in TBCoordinateQuadTreeController:

typedef struct TBHotelInfo {
    char* hotelName;
    char* hotelPhoneNumber;
} TBHotelInfo;

TBQuadTreeNodeData TBDataFromLine(NSString *line)
{
    // Example line:
    // -80.26262, 25.81015, Everglades Motel, USA-United States, +1 305-888-8797
    
    NSArray *components = [line componentsSeparatedByString:@","];
    double latitude = [components[1] doubleValue];
    double longitude = [components[0] doubleValue];

    TBHotelInfo* hotelInfo = malloc(sizeof(TBHotelInfo));

    NSString *hotelName = [components[2] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    hotelInfo->hotelName = malloc(sizeof(char) * hotelName.length + 1);
    strncpy(hotelInfo->hotelName, [hotelName UTF8String], hotelName.length + 1);

    NSString *hotelPhoneNumber = [[components lastObject] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    hotelInfo->hotelPhoneNumber = malloc(sizeof(char) * hotelPhoneNumber.length + 1);
    strncpy(hotelInfo->hotelPhoneNumber, [hotelPhoneNumber UTF8String], hotelPhoneNumber.length + 1);

    return TBQuadTreeNodeDataMake(latitude, longitude, hotelInfo);
}

- (void)buildTree
{
    NSString *data = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"USA-HotelMotel" ofType:@"csv"] encoding:NSASCIIStringEncoding error:nil];
    NSArray *lines = [data componentsSeparatedByString:@"\n"];

    NSInteger count = lines.count - 1;

    TBQuadTreeNodeData *dataArray = malloc(sizeof(TBQuadTreeNodeData) * count);
    for (NSInteger i = 0; i < count; i++) {
        dataArray[i] = TBDataFromLine(lines[i]);
    }

    TBBoundingBox world = TBBoundingBoxMake(19, -166, 72, -53);
    _root = TBQuadTreeBuildWithData(dataArray, count, world, 4);
}

With this we now have a full quad tree built using data preloaded on the iPhone. This will allow us to tackle the next part of our app: clustering.

Clustering

Now that we have a quad tree loaded up with our hotel data, we can solve the clustering problem. First, let's explore the reasoning behind clustering. We cluster because we want to avoid overwhelming the user with data. There are many ways to do this; Google Maps does it by just showing a portion of their overall data depending on your zoom level. The more you zoom in, the finer the detail gets until you see all the available annotations. We are going to take the clustering approach because it allows us to show off the amount of data we have while reducing the number of points on the map.

The final annotations will be represented as a circle with the number of hotels in the middle. Our approach will be the same as a simplistic image downsizing operation. We will draw a grid over the map. Then for each cell in the grid we will derive a clustered annotation from the hotels contained within that cell. We determine the coordinate of the cluster by taking the average over the coordinates of each hotel.

Here is an animation of the process:

clustering

Now for some code. Add a method to TBCoordinateQuadTreeController:

- (NSArray *)clusteredAnnotationsWithinMapRect:(MKMapRect)rect withZoomScale:(double)zoomScale
{
    double TBCellSize = TBCellSizeForZoomScale(zoomScale);
    double scaleFactor = zoomScale / TBCellSize;

    NSInteger minX = floor(MKMapRectGetMinX(rect) * scaleFactor);
    NSInteger maxX = floor(MKMapRectGetMaxX(rect) * scaleFactor);
    NSInteger minY = floor(MKMapRectGetMinY(rect) * scaleFactor);
    NSInteger maxY = floor(MKMapRectGetMaxY(rect) * scaleFactor);

    NSMutableArray *clusteredAnnotations = [[NSMutableArray alloc] init];

    for (NSInteger x = minX; x <= maxX; x++) {
        for (NSInteger y = minY; y <= maxY; y++) {

            MKMapRect mapRect = MKMapRectMake(x / scaleFactor, y / scaleFactor, 1.0 / scaleFactor, 1.0 / scaleFactor);
            
            __block double totalX = 0;
            __block double totalY = 0;
            __block int count = 0;

            TBQuadTreeGatherDataInRange(self.root, TBBoundingBoxForMapRect(mapRect), ^(TBQuadTreeNodeData data) {
                totalX += data.x;
                totalY += data.y;
                count++;
            });

            if (count >= 1) {
                CLLocationCoordinate2D coordinate = CLLocationCoordinate2DMake(totalX / count, totalY / count);
                TBClusterAnnotation *annotation = [[TBClusterAnnotation alloc] initWithCoordinate:coordinate count:count];
                [clusteredAnnotations addObject:annotation];
            }
        }
    }

    return [NSArray arrayWithArray:clusteredAnnotations];
}

The above method will make clustered annotations for us for a defined cell size. Now all we need to do is add these to an MKMapView. To do so we make a UIViewController subclass with an MKMapView as its view. We want the annotations to update whenever the visible region changes so we implement the mapView:regionDidChangeAnimated: protocol method.

- (void)mapView:(MKMapView *)mapView regionDidChangeAnimated:(BOOL)animated
{
    [[NSOperationQueue new] addOperationWithBlock:^{
        double zoomScale = self.mapView.bounds.size.width / self.mapView.visibleMapRect.size.width;
        NSArray *annotations = [self.coordinateQuadTree clusteredAnnotationsWithinMapRect:mapView.visibleMapRect withZoomScale:zoomScale];

        [self updateMapViewAnnotationsWithAnnotations:annotations];
    }];
}

Adding only the annotations we need

We want to spend as little time operating on the main thread as possible, which means we place everything inside a background operation queue. To reduce the amount of time spent on the main thread we need to make sure to only draw what is necessary. This will avoid unnecessary blocks when scrolling and ensure a smooth, uninterrupted experience.

To start, let's take a look at the following image:

adding annotations

The leftmost screenshot is a snapshot of the map immediately before the region has been changed. The annotations in this snapshot are exactly the mapView's current annotations. We will call this the before set.

The rightmost screenshot is a snapshot of the map immediately after the region changed. This snapshot is exactly what we get back from clusteredAnnotationsWithinMapRect:withZoomScale:. We will call this the after set.

We want to keep the annotations which both sets have in common (the set intersection), remove the ones which appear only in the before set, and add the ones which appear only in the after set.

- (void)updateMapViewAnnotationsWithAnnotations:(NSArray *)annotations
{
    NSMutableSet *before = [NSMutableSet setWithArray:self.mapView.annotations];
    NSSet *after = [NSSet setWithArray:annotations];

    // Annotations circled in blue shared by both sets
    NSMutableSet *toKeep = [NSMutableSet setWithSet:before];
    [toKeep intersectSet:after];

    // Annotations circled in green
    NSMutableSet *toAdd = [NSMutableSet setWithSet:after];
    [toAdd minusSet:toKeep];

    // Annotations circled in red
    NSMutableSet *toRemove = [NSMutableSet setWithSet:before];
    [toRemove minusSet:after];

    // These two methods must be called on the main thread
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        [self.mapView addAnnotations:[toAdd allObjects]];
        [self.mapView removeAnnotations:[toRemove allObjects]];
    }];
}

With this we are sure to be doing as little work as possible on the main thread, which will improve scrolling performance.

Next we are going to look at how to draw the annotations to make them reflect their different counts. Then we will add the final touches which will allow the app to stand toe-to-toe with the best out there.

Drawing the annotations

Because we are reducing the number of data points a user can see, we need to make each of those points reflect how much data we really have.

We are going to make a circular annotation with the cluster count in the middle. The circle's size will reflect the number of hotels in that cluster.

To do this we need to find an equation which will allow us to scale values on the order of 1 to 500+. We will use this equation:

scaledValue(x) = 1 / (1 + e^(-alpha * x^beta)), with alpha = 0.3 and beta = 0.4

Here lower values of x grow quickly, then tail off as x gets larger. Beta controls how fast our values approach 1, while alpha affects the lower bound (in our case we want the smallest cluster, a count of 1, to be 60% of our largest size).

static CGFloat const TBScaleFactorAlpha = 0.3;
static CGFloat const TBScaleFactorBeta = 0.4;

CGFloat TBScaledValueForValue(CGFloat value)
{
    return 1.0 / (1.0 + expf(-1 * TBScaleFactorAlpha * powf(value, TBScaleFactorBeta)));
}

- (void)setCount:(NSUInteger)count
{
    _count = count;

    // Our max size is (44,44)
    CGRect newBounds = CGRectMake(0, 0, roundf(44 * TBScaledValueForValue(count)), roundf(44 * TBScaledValueForValue(count)));
    self.frame = TBCenterRect(newBounds, self.center);

    CGRect newLabelBounds = CGRectMake(0, 0, newBounds.size.width / 1.3, newBounds.size.height / 1.3);
    self.countLabel.frame = TBCenterRect(newLabelBounds, TBRectCenter(newBounds));
    self.countLabel.text = [@(_count) stringValue];

    [self setNeedsDisplay];
}

This takes care of scaling our annotations; now let's make them look nice.

- (void)setupLabel
{
    _countLabel = [[UILabel alloc] initWithFrame:self.frame];
    _countLabel.backgroundColor = [UIColor clearColor];
    _countLabel.textColor = [UIColor whiteColor];
    _countLabel.textAlignment = NSTextAlignmentCenter;
    _countLabel.shadowColor = [UIColor colorWithWhite:0.0 alpha:0.75];
    _countLabel.shadowOffset = CGSizeMake(0, -1);
    _countLabel.adjustsFontSizeToFitWidth = YES;
    _countLabel.numberOfLines = 1;
    _countLabel.font = [UIFont boldSystemFontOfSize:12];
    _countLabel.baselineAdjustment = UIBaselineAdjustmentAlignCenters;
    [self addSubview:_countLabel];
}

- (void)drawRect:(CGRect)rect
{
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetAllowsAntialiasing(context, true);

    UIColor *outerCircleStrokeColor = [UIColor colorWithWhite:0 alpha:0.25];
    UIColor *innerCircleStrokeColor = [UIColor whiteColor];
    UIColor *innerCircleFillColor = [UIColor colorWithRed:(255.0 / 255.0) green:(95 / 255.0) blue:(42 / 255.0) alpha:1.0];

    CGRect circleFrame = CGRectInset(rect, 4, 4);

    [outerCircleStrokeColor setStroke];
    CGContextSetLineWidth(context, 5.0);
    CGContextStrokeEllipseInRect(context, circleFrame);

    [innerCircleStrokeColor setStroke];
    CGContextSetLineWidth(context, 4);
    CGContextStrokeEllipseInRect(context, circleFrame);

    [innerCircleFillColor setFill];
    CGContextFillEllipseInRect(context, circleFrame);
}

Final touches

Now that we have annotations which do a good job at representing our data, let's add a few final touches to make the app fun to use.

First we need to make an animation for the addition of new annotations. Without an animation, new annotations just appear out of nowhere and the whole experience is a bit jarring.

- (void)addBounceAnnimationToView:(UIView *)view
{
    CAKeyframeAnimation *bounceAnimation = [CAKeyframeAnimation animationWithKeyPath:@"transform.scale"];

    bounceAnimation.values = @[@(0.05), @(1.1), @(0.9), @(1)];

    bounceAnimation.duration = 0.6;
    NSMutableArray *timingFunctions = [[NSMutableArray alloc] init];
    for (NSInteger i = 0; i < 4; i++) {
        [timingFunctions addObject:[CAMediaTimingFunction functionWithName:kCAMediaTimingFunctionEaseInEaseOut]];
    }
    [bounceAnimation setTimingFunctions:timingFunctions.copy];
    bounceAnimation.removedOnCompletion = NO;

    [view.layer addAnimation:bounceAnimation forKey:@"bounce"];
}

- (void)mapView:(MKMapView *)mapView didAddAnnotationViews:(NSArray *)views
{
    for (UIView *view in views) {
        [self addBounceAnnimationToView:view];
    }
}

Next, we want to change our clustering grid based on how zoomed in we are on the map. We want to cluster smaller cells the closer we are. To do this we define the current zoom level of our map, where scale = mapView.bounds.size.width / mapView.visibleMapRect.size.width:

NSInteger TBZoomScaleToZoomLevel(MKZoomScale scale)
{
    double totalTilesAtMaxZoom = MKMapSizeWorld.width / 256.0;
    NSInteger zoomLevelAtMaxZoom = log2(totalTilesAtMaxZoom);
    NSInteger zoomLevel = MAX(0, zoomLevelAtMaxZoom + floor(log2f(scale) + 0.5));

    return zoomLevel;
}

Now that we have an integer value for each zoom level, we can map it to a clustering cell size:

float TBCellSizeForZoomScale(MKZoomScale zoomScale)
{
    NSInteger zoomLevel = TBZoomScaleToZoomLevel(zoomScale);

    switch (zoomLevel) {
        case 13:
        case 14:
        case 15:
            return 64;
        case 16:
        case 17:
        case 18:
            return 32;
        case 19:
            return 16;

        default:
            return 88;
    }
}

Now as you start zooming in, you should see more, smaller clusters, until finally you see every hotel as a single-count cluster.

Check out the complete working version of the app on GitHub.