docs/modules/ROOT/pages/development.adoc
= Development
This section of the documentation will teach you how to develop new cops. We'll
start with generating a cop template and then we'll address the various aspects
of its implementation (interacting with the AST, autocorrect, configuration)
and testing.
== Create a new cop
NOTE: Clone the repository and run `bundle install` if not done yet.
The following rake task can only be run inside the rubocop project directory itself.
Use the bundled rake task `new_cop` to generate a cop template:
[source,sh]
----
$ bundle exec rake 'new_cop[Department/Name]'
[create] lib/rubocop/cop/department/name.rb
[create] spec/rubocop/cop/department/name_spec.rb
[modify] lib/rubocop.rb - `require_relative 'rubocop/cop/department/name'` was injected.
[modify] A configuration for the cop is added into config/default.yml.
Do 4 steps:
1. Modify the description of Department/Name in config/default.yml
2. Implement your new cop in the generated file!
3. Commit your new cop with a message such as
e.g. "Add new `Department/Name` cop"
4. Run `bundle exec rake changelog:new` to generate a changelog entry
for your new cop.
----
== Basics
RuboCop uses the https://github.com/whitequark/parser[parser] library to create the
Abstract Syntax Tree (AST) representation of the code.
You can install `parser` gem and use `ruby-parse` command line utility to check
what the AST looks like in the output.
[source,sh]
----
$ gem install parser
----
And then try to parse a simple integer representation with `ruby-parse`:
[source,sh]
----
$ ruby-parse -e '1'
(int 1)
----
Each expression surrounded by parentheses represents a node in the AST. The first
element is the node type and the tail contains the children with all
information needed to represent the code.
Here's another example - a local variable `name` being assigned the
string value "John":
[source,sh]
----
$ ruby-parse -e 'name = "John"'
(lvasgn :name
(str "John"))
----
=== Inspecting the AST representation
Let's imagine we want to simplify statements from `!array.empty?` to
`array.any?`:
First, check what the bad code returns in the Abstract Syntax Tree
representation.
[source,sh]
----
$ ruby-parse -e '!array.empty?'
(send
(send
(send nil :array) :empty?) :!)
----
Now, it's time to debug our expression using the REPL from RuboCop:
[source,sh]
----
$ bin/console
----
First we need to declare the code that we want to match, and use the
https://www.rubydoc.info/gems/rubocop-ast/RuboCop/AST/ProcessedSource[ProcessedSource]
that is a simple wrap to make the parser interpret the code and build the AST:
[source,ruby]
----
code = '!something.empty?'
source = RuboCop::ProcessedSource.new(code, RUBY_VERSION.to_f)
node = source.ast
# => s(:send, s(:send, s(:send, nil, :something), :empty?), :!)
----
The node has a few attributes that can be useful in the journey:
[source,ruby]
----
node.type # => :send
node.children # => [s(:send, s(:send, nil, :something), :empty?), :!]
node.source # => "!something.empty?"
----
== Implementation
=== Writing Node Pattern Rules
NOTE: You can write cops without using `NodePattern` (and many older cops don't use it), but it
generally simplifies a lot the code, as manual node matching and destructuring can be
quite verbose.
Now that you're familiar with AST, you can learn a bit about the
https://www.rubydoc.info/gems/rubocop-ast/RuboCop/AST/NodePattern[node pattern]
and use patterns to match with specific nodes that you want to match.
You can learn more about Node Pattern https://github.com/rubocop/rubocop-ast/blob/master/docs/modules/ROOT/pages/node_pattern.adoc[here].
Node pattern matches something very similar to the current output from AST
representation, then let's start with something very generic:
[source,ruby]
----
NodePattern.new('send').match(node) # => true
----
It matches because the root is a `send` type. Now lets match it deeply using
parentheses to define details for sub-nodes. If you don't care about what an internal
node is, you can use `+...+` to skip it and just consider " a node".
[source,ruby]
----
NodePattern.new('(send ...)').match(node) # => true
NodePattern.new('(send (send ...) :!)').match(node) # => true
NodePattern.new('(send (send (send ...) :empty?) :!)').match(node) # => true
----
Sometimes it's hard to comprehend complex expressions you're building with the
pattern, then, if you got lost with the node pattern parens surrounding deeply,
try to use the `$` to capture the internal expression and check exactly each
piece of the expression:
[source,ruby]
----
NodePattern.new('(send (send (send $...) :empty?) :!)').match(node) # => [nil, :something]
----
It's not needed to strictly receive a send in the internal node because maybe
it can also be a literal array like:
[source,ruby]
----
![].empty?
----
The code above has the following representation:
[source,ruby]
----
=> s(:send, s(:send, s(:array), :empty?), :!)
----
It's possible to skip the internal node with `+...+` to make sure that it's just
another internal node:
[source,ruby]
----
NodePattern.new('(send (send (...) :empty?) :!)').match(node) # => true
----
In other words, it says: "Match code calling ``!<expression>.empty?``".
Great! Now, lets implement our cop to simplify such statements:
[source,sh]
----
$ rake 'new_cop[Style/SimplifyNotEmptyWithAny]'
----
After the cop scaffold is generated, change the node matcher to match with
the expression achieved previously:
[source,ruby]
----
def_node_matcher :not_empty_call?, <<~PATTERN
(send (send $(...) :empty?) :!)
PATTERN
----
Note that we added a `$` sign to capture the "expression" in `!<expression>.empty?`,
it will become useful later.
Get yourself familiar with the AST node hooks that
https://www.rubydoc.info/gems/parser/Parser/AST/Processor[`parser`]
and https://www.rubydoc.info/gems/rubocop-ast/RuboCop/AST/Traversal[`rubocop-ast`]
provide.
As it starts with a `send` type, it's needed to implement the `on_send` method, as the
cop scaffold already suggested:
[source,ruby]
----
def on_send(node)
return unless not_empty_call?(node)
add_offense(node)
end
----
The `on_send` callback is the most used and can be optimized by restricting the acceptable
method names with a constant `RESTRICT_ON_SEND`.
And the final cop code will look like something like this:
[source,ruby]
----
module RuboCop
module Cop
module Style
# `array.any?` is a simplified way to say `!array.empty?`
#
# @example
# # bad
# !array.empty?
#
# # good
# array.any?
#
class SimplifyNotEmptyWithAny < Base
MSG = 'Use `.any?` and remove the negation part.'.freeze
RESTRICT_ON_SEND = [:!].freeze # optimization: don't call `on_send` unless
# the method name is in this list
def_node_matcher :not_empty_call?, <<~PATTERN
(send (send $(...) :empty?) :!)
PATTERN
def on_send(node)
return unless not_empty_call?(node)
add_offense(node)
end
end
end
end
end
----
Note that `on_send` will be called on a given `node` before the callbacks `on_<some type>` for its children are called. There's also a callback `after_send` that is called after the children are processed. There's a similar `after_<some type>` callback for all types, except those that never have children.
Update the spec to cover the expected syntax:
[source,ruby]
----
describe RuboCop::Cop::Style::SimplifyNotEmptyWithAny, :config do
it 'registers an offense when using `!a.empty?`' do
expect_offense(<<~RUBY)
!array.empty?
^^^^^^^^^^^^^ Use `.any?` and remove the negation part.
RUBY
end
it 'does not register an offense when using `.any?` or `.empty?`' do
expect_no_offenses(<<~RUBY)
array.any?
array.empty?
RUBY
end
end
----
If your code has variables of different lengths, you can use `%{foo}`,
`^{foo}`, and `_{foo}` to format your template; you can also abbreviate
offense messages with `[...]`:
[source,ruby]
----
%w[raise fail].each do |keyword|
expect_offense(<<~RUBY, keyword: keyword)
%{keyword}(RuntimeError, msg)
^{keyword}^^^^^^^^^^^^^^^^^^^ Redundant `RuntimeError` argument [...]
RUBY
%w[has_one has_many].each do |type|
expect_offense(<<~RUBY, type: type)
class Book
%{type} :chapter, foreign_key: 'book_id'
_{type} ^^^^^^^^^^^^^^^^^^^^^^ Specifying the default [...]
end
RUBY
end
----
=== Autocorrect
The autocorrect can help humans automatically fix offenses that have been detected.
It's necessary to `extend AutoCorrector`.
The method `add_offense` yields a corrector object that is a thin wrapper on
https://www.rubydoc.info/gems/parser/Parser/Source/TreeRewriter[parser's TreeRewriter]
to which you can give instructions about what to do with the
offensive node.
Let's start with a simple spec to cover it:
[source,ruby]
----
it 'corrects `!a.empty?`' do
expect_offense(<<~RUBY)
!array.empty?
^^^^^^^^^^^^^ Use `.any?` and remove the negation part.
RUBY
expect_correction(<<~RUBY)
array.any?
RUBY
end
----
And then add the autocorrecting block on the cop side:
[source,ruby]
----
extend AutoCorrector
def on_send(node)
expression = not_empty_call?(node)
return unless expression
add_offense(node) do |corrector|
corrector.replace(node, "#{expression.source}.any?")
end
end
----
The corrector allows you to `insert_after`, `insert_before`, `wrap` or
`replace` a specific node or in any specific range of the code.
Range can be determined on `node.location` where it brings specific
ranges for expression or other internal information that the node holds.
==== Preventing Clobbering
The corrector detects and prevents correcting overlapping nodes, to prevent one correction from clobbering another.
Supporting nested corrections is done by taking multiple passes, and skipping corrections for nested nodes.
This can be implemented using the `IgnoredNode` module:
[source,diff]
----
extend AutoCorrector
+include IgnoredNode
def on_send(node)
return unless some_condition?(node)
add_offense(node) do |corrector|
+ next if part_of_ignored_node?(node)
+
corrector.replace(node, "...")
end
+
+ ignore_node(node)
end
----
This works because the correcting a file is implemented by repeating investigation and correction until the file no longer requires correction, meaning all nested nodes will eventually be processed.
Note that `expect_correction` in `Cop` specs only asserts the result after one pass.
=== Limit by Ruby or Gem versions
Some cops apply changes that only apply in particular contexts, such as if the user has a minimum Ruby version. There are helpers that let you constrain your cops automatically, to only run where applicable.
==== Requiring a minimum Ruby version
If your cop uses new Ruby syntax or standard library APIs, it should only autocorrect if the user has the target Ruby version, which you set with https://www.rubydoc.info/gems/rubocop/RuboCop/Cop/TargetRubyVersion#minimum_target_ruby_version-instance_metho[`TargetRubyVersion#minimum_target_ruby_version`].
For example, the `Performance/SelectMap` cop requires Ruby 2.7, which introduced `Enumerable#filter_map`:
```ruby
module RuboCop::Cop::Performance::SelectMap < Base
extend TargetRubyVersion
minimum_target_ruby_version 2.7
# ...
end
```
This cop won't autocorrect on Ruby 2.6 or older, and it won't even report an offense (since there's no better alternative to recommend instead).
==== Requiring a gem
If your cop depends on the presence of a gem, you can declare that with https://www.rubydoc.info/gems/rubocop/RuboCop/Cop/Base#requires_gem-class_method[`RuboCop::Cop::Base.requires_gem`].
For example, to declare that `MyCop` should only apply if the bundle is using `my-gem` with a version between `1.2.3` and `4.5.6`:
```Ruby
class MyCop < Base
requires_gem "my-gem", ">= 1.2.3", "< 4.5.6"
# ...
end
```
You can specify any gem requirement using https://guides.rubygems.org/patterns/#declaring-dependencies[the same syntax as your `Gemfile`].
==== Special case: Rails
Historically, many cops in `rubocop-rails` aren't actually specific to Rails itself, but some of its components (e.g. `ActiveSupport`). These dependencies are declared with https://www.rubydoc.info/gems/rubocop-rails/RuboCop/Cop/TargetRailsVersion#minimum_target_rails_version-instance_method[`TargetRailsVersion.minimum_target_rails_version`].
For example, the `Rails/Pluck` cop requires ActiveSupport 6.0, which introduces `Enumerable#pluck`:
```ruby
module RuboCop::Cop::Rails::Pluck < Base
extend TargetRailsVersion
minimum_target_rails_version 6.0
#...
end
```
=== Run tests
RuboCop supports two parser engines: the Parser gem and Prism. By default, tests are executed with the Parser:
```console
$ bundle exec rake spec
```
To run all tests with the experimental support for Prism, use `bundle exec prism_spec`, and to execute tests for individual files,
specify the environment variable `PARSER_ENGINE=parser_prism`.
e.g., `PARSER_ENGINE=parser_prism spec/rubocop/cop/style/hash_syntax_spec.rb`
`bundle exec rake` runs tests for both Parser gem and Prism parsers.
But `ascii_spec` rake task does not run by default. Because `ascii_spec` task has not been failing for a while.
However, `ascii_spec` task will continue to be checked in CI.
In CI, all tests required for merging are executed. Please investigate if anything fails.
=== Configuration
Each cop can hold a configuration and you can refer to `cop_config` in the
instance and it will bring a hash with options declared in the `.rubocop.yml`
file.
For example, lets imagine we want to make configurable to make the replacement
works with other method than `.any?`:
[source,yml]
----
Style/SimplifyNotEmptyWithAny:
Enabled: true
ReplaceAnyWith: "size > 0"
----
And then on the autocorrect method, you just need to use the `cop_config` it:
[source,ruby]
----
def on_send(node)
expression = not_empty_call?(node)
return unless expression
add_offense(node) do |corrector|
replacement = cop_config['ReplaceAnyWith'] || 'any?'
corrector.replace(node, "#{expression.source}.#{replacement}")
end
end
----
== Documentation
Every new cop requires explanation and examples to make it easy for the community
to understand its purpose. This documentation is generated by `yard` and is added
directly into the `cop.rb` file. For every `SupportedStyle` and unique
configuration you have included in the cop, there needs to be examples. Examples must
have valid Ruby syntax. Do not use upticks.
[source,ruby]
----
module Department
# Description of your cop. Include description of ALL config options. Particularly
# ones that take booleans and arrays, because we generally do not show examples for
# configs with these value types.
#
# @example EnforcedStyle: bar
# # Description about this particular option
#
# # bad
# bad_example1
# bad_example2
#
# # good
# good_example1
# good_example2
#
# @example EnforcedStyle: foo (default)
# # Description about this particular option
#
# # bad
# bad_example1
# bad_example2
#
# # good
# good_example1
# good_example2
#
# @example AnyUniqueConfigKeyThatIsAString: qux (default)
# # Description about this particular option
#
# # bad
# bad_example1
# bad_example2
#
# # good
# good_example1
# good_example2
#
# @example AnyUniqueConfigKeyThatIsAString: thud
# # Description about this particular option
#
# # bad
# bad_example1
# bad_example2
#
# # good
# good_example1
# good_example2
#
class YourCop
# ...
----
Take note of the placement and spacing of all the documentation pieces. Such as config
keys being in alphabetical order, the `(default)` being specified, and one empty line
before `class YourCop`. While not all examples in the codebase follow this exact format,
we strive to make this consistent. PRs improving RuboCop documentation are very welcome.
== Testing your cop in a real codebase
It's generally good practice to check if your cop is working properly over a
significant codebase (e.g. Rails or some big project you're working on) to
guarantee it's working in a range of different syntaxes.
There are several ways to do this. Two common approaches:
. From within your local `rubocop` repo, run `exe/rubocop ~/your/other/codebase`.
. From within the other codebase's `Gemfile`, set a path to your local repo like this: `gem 'rubocop', path: '/full/path/to/rubocop'`. Then run `rubocop` within your codebase.
With approach #2, you can use local versions of RuboCop extension repos such as `rubocop-rspec` as well.
To make it fast and do not get confused with other cops in action, you can use
`--only` parameter in the command line to filter by your cop name:
[source,sh]
----
$ rubocop --only Style/SimplifyNotEmptyWithAny
----