# CsvRowModel [![Build Status](https://travis-ci.org/FinalCAD/csv_row_model.svg?branch=master)](https://travis-ci.org/FinalCAD/csv_row_model) [![Code Climate](https://codeclimate.com/github/FinalCAD/csv_row_model/badges/gpa.svg)](https://codeclimate.com/github/FinalCAD/csv_row_model) [![Test Coverage](https://codeclimate.com/github/FinalCAD/csv_row_model/badges/coverage.svg)](https://codeclimate.com/github/FinalCAD/csv_row_model/coverage)

Import and export your custom CSVs with an intuitive shared Ruby interface.

First define your schema:

```ruby
class ProjectRowModel
  include CsvRowModel::Model

  column :id, options
  column :name
  
  merge_options :id, more_options # optional
end
```

To export, define your export model like [`ActiveModel::Serializer`](https://github.com/rails-api/active_model_serializers)
and generate the file:

```ruby
class ProjectExportRowModel < ProjectRowModel
  include CsvRowModel::Export

  # this is an override with the default implementation
  def id
    source_model.id
  end
end

export_file = CsvRowModel::Export::File.new(ProjectExportRowModel)
export_file.generate { |csv| csv << project } # `project` is the `source_model` in `ProjectExportRowModel`
export_file.file # => <Tempfile>
export_file.to_s # => export_file.file.read
```

To import, define your import model, which works like [`ActiveRecord`](http://guides.rubyonrails.org/active_record_querying.html),
and iterate through a file:

```ruby
class ProjectImportRowModel < ProjectRowModel
  include CsvRowModel::Import

  # this is an override with the default implementation
  def id
    original_attribute(:id)
  end
end

import_file = CsvRowModel::Import::File.new(file_path, ProjectImportRowModel)
row_model = import_file.next

row_model.headers # => ["id", "name"]

row_model.source_row # => ["1", "Some Project Name"]
row_model.source_attributes # => { id: "1", name: "Some Project Name" }, this is `source_row` mapped to `column_names`
row_model.attributes # => { id: "1", name: "Some Project Name" }, this is final attribute values mapped to `column_names`

row_model.id # => 1
row_model.name # => "Some Project Name"

row_model.previous # => <ProjectImportRowModel instance>
row_model.previous.previous # => nil, save memory by avoiding a linked list
```
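
The `previous` behaviour in the example above can be pictured with a small plain-Ruby sketch. `TinyRowModel` and `TinyReader` are hypothetical names used only for illustration and are not part of the gem; the sketch assumes that the reader drops the older link each time it advances, so only two rows are reachable at once.

```ruby
# Illustrative stand-ins, not the gem's classes.
TinyRowModel = Struct.new(:source_row, :previous)

class TinyReader
  attr_reader :current

  def initialize(rows)
    @rows = rows.each # external enumerator over the raw rows
    @current = nil
  end

  # Returns the next row model, or nil once the input is exhausted.
  def next
    source_row = @rows.next
    # Cut the older link so rows before `previous` can be garbage collected.
    @current.previous = nil if @current
    @current = TinyRowModel.new(source_row, @current)
  rescue StopIteration
    @current = nil
  end
end

reader = TinyReader.new([%w[1 a], %w[2 b], %w[3 c]])
3.times { reader.next }
p reader.current.source_row          # => ["3", "c"]
p reader.current.previous.source_row # => ["2", "b"]
p reader.current.previous.previous   # => nil
```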

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'csv_row_model'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install csv_row_model

## Export

### Header Value
To generate a header value, the following pseudocode is executed:
```ruby
def header(column_name)
  # 1. Header Option
  header = options_for(column_name)[:header]

  # 2. format_header
  header || format_header(column_name, context)
end
```

#### Header Option
Specify the header manually:
```ruby
class ProjectRowModel
  include CsvRowModel::Model
  column :name, header: "NAME"
end
```

#### Format Header
Override the `format_header` method to format column header names:
```ruby
class ProjectExportRowModel < ProjectRowModel
  include CsvRowModel::Export
  class << self
    def format_header(column_name, context)
      column_name.to_s.titleize
    end
  end
end
```
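
To see the precedence between the two steps without loading the gem, here is a minimal plain-Ruby sketch. `COLUMN_OPTIONS` and the formatter body are assumptions made for the example; only the order (the `:header` option first, then `format_header`) comes from the pseudocode above.

```ruby
# Hypothetical stand-in for the column definitions; not the gem's internals.
COLUMN_OPTIONS = {
  id:           {},
  project_name: {},
  owner:        { header: "OWNER NAME" }
}.freeze

# Fallback formatter, playing the role of `format_header`.
def format_header(column_name)
  column_name.to_s.split("_").map(&:capitalize).join(" ")
end

# 1. use the :header option when present, 2. otherwise format the column name
def header(column_name)
  COLUMN_OPTIONS.fetch(column_name)[:header] || format_header(column_name)
end

HEADERS = COLUMN_OPTIONS.keys.map { |name| header(name) }
p HEADERS # => ["Id", "Project Name", "OWNER NAME"]
```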

## Import

### Attribute Values
To generate an attribute value, the following pseudocode is executed:

```ruby
def original_attribute(column_name)
  # 1. Get the raw CSV string value for the column
  value = source_attributes[column_name]

  # 2. Clean or format each cell
  value = self.class.format_cell(value, column_name, context)

  if value.present?
    # 3a. Parse the cell value (which does nothing if no parsing is specified)
    parse(value)
  elsif default_exists?
    # 3b. Set the default
    default_for_column(column_name)
  end
end

def original_attributes; { id: original_attribute(:id) } end
def id; original_attribute(:id) end
```
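
The same three steps can be tried in isolation with plain Ruby. This is only a sketch: `Column`, `resolve`, and this `format_cell` are illustrative names and bodies, and the real gem reads its parse and default settings from the `column` options described below.

```ruby
# Plain-Ruby sketch of the three steps; `Column` and `resolve` are illustrative names.
Column = Struct.new(:name, :parse, :default)

# Step 2: strip whitespace and turn empty cells into nil.
def format_cell(cell)
  stripped = cell.to_s.strip
  stripped.empty? ? nil : stripped
end

def resolve(column, raw)
  value = format_cell(raw)                  # 2. clean the raw string
  return column.parse.call(value) if value  # 3a. parse when something is left
  column.default                            # 3b. otherwise fall back (nil if no default)
end

ID_COLUMN   = Column.new(:id, ->(s) { Integer(s) }, 1)
NAME_COLUMN = Column.new(:name, ->(s) { s }, nil)

p resolve(ID_COLUMN, " 42 ") # => 42
p resolve(ID_COLUMN, "   ")  # => 1
p resolve(NAME_COLUMN, "")   # => nil
```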

#### Format Cell
Override the `format_cell` method to clean/format every cell:
```ruby
class ProjectImportRowModel < ProjectRowModel
  include CsvRowModel::Import
  class << self
    def format_cell(cell, column_name, context)
      cell = cell.strip
      cell.blank? ? nil : cell
    end
  end
end
```

#### Type
Automatic type parsing.

```ruby
class ProjectImportRowModel
  include CsvRowModel::Import

  column :id, type: Integer
  column :name, parse: ->(original_string) { parse(original_string) }

  def parse(original_string)
    "#{id} - #{original_string}"
  end
end
```

There are validators for available types: `Boolean`, `Date`, `DateTime`, `Float`, `Integer`. See [Type Format](#type-format) for more. You can also customize and create new types via an override:

```ruby
class ProjectImportRowModel
  # GOTCHA: this should be defined before `::column` is called,
  # as `::column` uses this to check passed `:type` option (and return ArgumentError)
  def self.class_to_parse_lambda
    super.merge(
      Hash => ->(s) { JSON.parse(s) },
      'CommaList' => ->(s) { s.split(",").map(&:strip) }
    )
  end
end
```
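
The `super.merge(...)` pattern above amounts to extending a lookup table from a type to a parsing lambda. A rough standalone sketch follows; `BASE_PARSERS`, `CUSTOM_PARSERS`, and `parse_as` are hypothetical names, and the base entries are assumptions rather than the gem's actual defaults.

```ruby
require "json"
require "date"

# Hypothetical base map from a type to the lambda that parses a string into it.
BASE_PARSERS = {
  Integer => ->(s) { Integer(s) },
  Float   => ->(s) { Float(s) },
  Date    => ->(s) { Date.parse(s) }
}.freeze

# Mirrors the `super.merge(...)` pattern: extend without mutating the base map.
CUSTOM_PARSERS = BASE_PARSERS.merge(
  Hash        => ->(s) { JSON.parse(s) },
  "CommaList" => ->(s) { s.split(",").map(&:strip) }
).freeze

# Rejects unknown types up front, much like `::column` raising ArgumentError.
def parse_as(type, string)
  parser = CUSTOM_PARSERS.fetch(type) { raise ArgumentError, "unknown type: #{type.inspect}" }
  parser.call(string)
end

p parse_as(Integer, "7")              # => 7
p parse_as(Hash, '{"a": 1}')["a"]     # => 1
p parse_as("CommaList", "ruby, csv ") # => ["ruby", "csv"]
```

Keys do not have to be classes; any object that works as a hash key (such as the `'CommaList'` string) can name a type.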

#### Default
Sets the default value of the cell:
```ruby
class ProjectImportRowModel
  include CsvRowModel::Import

  column :id, default: 1
  column :name, default: -> { get_name }

  def get_name; "John Doe" end
end
row_model = ProjectImportRowModel.new(["", ""])
row_model.id # => 1
row_model.name # => "John Doe"
row_model.default_changes # => { id: ["", 1], name: ["", "John Doe"] }
```

`DefaultChangeValidator` is provided so that you can add warnings when defaults are set. See [Default Changes](#default-changes) for more.

### Validations

[`ActiveModel::Validations`](http://api.rubyonrails.org/classes/ActiveModel/Validations.html) and [`ActiveWarnings`](https://github.com/s12chung/active_warnings)
are included for errors and warnings.

There are layers to validations.

```ruby
class ProjectImportRowModel
  include CsvRowModel::Import
  
  # Errors - by default, an Error will make the row skip
  validates :id, numericality: { greater_than: 0 } # ActiveModel::Validations
  
  # Warnings - a message you want the user to see, but will not make the row skip
  warnings do # ActiveWarnings, see: https://github.com/s12chung/active_warnings
    validates :some_custom_string, presence: true
  end
  
  # This is for validation of the strings before parsing. See: https://github.com/FinalCAD/csv_row_model#parsedmodel
  parsed_model do
    validates :id, presence: true
    # can do warnings too
  end
end
```

#### Type Format
Notice that there are validators given for different types: `Boolean`, `Date`, `DateTime`, `Float`, `Integer`:

```ruby
class ProjectImportRowModel
  include CsvRowModel::Import

  column :id, type: Integer, validate_type: true

  # the :validate_type option is the same as:
  # parsed_model do
  #   validates :id, integer_format: true, allow_blank: true
  # end
end

row_model = ProjectImportRowModel.new(["not_a_number"])
row_model.valid? # => false
row_model.errors.full_messages # => ["Id is not a Integer format"]
```

The above uses `IntegerFormatValidator` internally; you may customize this class or create new validators for custom types.
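
As a rough idea of what a format check on the raw string involves, here is a plain-Ruby sketch. The regular expression and the `integer_format_error` helper are assumptions for illustration; the gem's `IntegerFormatValidator` may accept a different set of strings.

```ruby
# Accepts an optional sign followed by digits only; this pattern is an assumption.
INTEGER_FORMAT = /\A[+-]?\d+\z/

# Returns an error message for a raw CSV string, or nil when it is acceptable.
def integer_format_error(column_name, raw)
  text = raw.to_s.strip
  return nil if text.empty?                 # mirrors `allow_blank: true`
  return nil if text.match?(INTEGER_FORMAT)

  "#{column_name.to_s.capitalize} is not a Integer format"
end

p integer_format_error(:id, "42")           # => nil
p integer_format_error(:id, "")             # => nil
p integer_format_error(:id, "not_a_number") # => "Id is not a Integer format"
```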

#### Default Changes
A custom validator for [Default Changes](#default).

```ruby
class ProjectImportRowModel
  include CsvRowModel::Import

  column :id, default: 1
  validates :id, default_change: true
end

row_model = ProjectImportRowModel.new([""])

row_model.valid? # => false
row_model.errors.full_messages # => ["Id changed by default"]
row_model.default_changes # => { id: ["", 1] }
```

### Skip and Abort
You can iterate through a file with the `#each` method, which calls `#next` internally.
`#next` will always return the next `RowModel` in the file. However, you can implement skips and
abort logic:

```ruby
class ProjectImportRowModel
  # always skip
  def skip?
    true # original implementation: !valid?
  end
end

import_file = CsvRowModel::Import::File.new(file_path, ProjectImportRowModel)
import_file.each { |project_import_model| puts "does not yield here" }
import_file.next # does not skip or abort
```
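
The relationship between `#each` and `#next` can be sketched in plain Ruby. `Row`, `TinyImportFile`, and `collect_values` are illustrative names, and the sketch assumes an `abort?` counterpart to `skip?` that stops the iteration.

```ruby
# Minimal stand-in for a row model; names and rules are illustrative.
Row = Struct.new(:value) do
  def skip?;  value.nil?;     end # e.g. an invalid row
  def abort?; value == :stop; end # e.g. an unrecoverable row
end

class TinyImportFile
  def initialize(values)
    @rows = values.map { |v| Row.new(v) }
    @index = -1
  end

  # Always returns the next row (or nil at the end), ignoring skip?/abort?.
  def next
    @index += 1
    @rows[@index]
  end

  # Built on #next: stops on abort?, silently passes over skip?.
  def each
    while (row = self.next)
      break if row.abort?
      next if row.skip?
      yield row
    end
  end
end

# Collects the values that #each actually yields.
def collect_values(values)
  yielded = []
  TinyImportFile.new(values).each { |row| yielded << row.value }
  yielded
end

p collect_values([1, nil, 2, :stop, 3]) # => [1, 2]
```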

### File Validations
You can also have file validations, which will make the entire import process abort. Currently, there is one provided validation.

```ruby
class ImportFile < CsvRowModel::Import::File
  validate :headers_invalid_row # checks if header is valid CSV syntax
  validate :headers_count # calls #headers_invalid_row, then checks the count. will ignore trailing empty headers
end
```

These validations can't be used for [File Model](#file-model) schemas.

### Import Callbacks
`CsvRowModel::Import::File` can be subclassed to access
[`ActiveModel::Callbacks`](http://api.rubyonrails.org/classes/ActiveModel/Callbacks.html).

* each_iteration - `before`, `around`, or `after` an iteration on `#each`.
Use this to handle exceptions. `return` and `break` may be called within the callback for
skips and aborts.
* next - `before`, `around`, or `after` each change in `current_row_model`
* skip - `before`
* abort - `before`

and implement the callbacks:
```ruby
class ImportFile < CsvRowModel::Import::File
  around_each_iteration :logger_track
  before_skip :track_skip

  def logger_track(&block)
    ...
  end

  def track_skip
    ...
  end
end
```

## Advanced Import

### ParsedModel
The `ParsedModel` represents a row BEFORE parsing, so that validations can be added on the raw string values.

```ruby
class ProjectImportRowModel
  include CsvRowModel::Import

  # Note the type definition here for parsing
  column :id, type: Integer

  # this is applied to the parsed CSV on the model
  validates :id, numericality: { greater_than: 0 }

  parsed_model do
    # define your parsed_model here

    # this is applied BEFORE the parsed CSV on parsed_model
    validates :id, presence: true

    def random_method; "Hihi" end
  end
end

# Applied to the String
row_model = ProjectImportRowModel.new([""])
parsed_model = row_model.parsed_model
parsed_model.random_method # => "Hihi"
parsed_model.valid? # => false
parsed_model.errors.full_messages # => ["Id can't be blank"]

# Errors are propagated for simplicity
row_model.valid? # => false
row_model.errors.full_messages # => ["Id can't be blank"]

# Applied to the parsed Integer
row_model = ProjectImportRowModel.new(["-1"])
row_model.valid? # => false
row_model.errors.full_messages # => ["Id must be greater than 0"]
```

Note that `ParsedModel` validations are calculated after [Format Cell](#format-cell). Custom validators can't be autoloaded, because [non-reloadable classes can't access reloadable ones](http://stackoverflow.com/questions/29636334/a-copy-of-xxx-has-been-removed-from-the-module-tree-but-is-still-active).

### Represents
A CSV is often a representation of database model(s), much like how JSON parameters represent models in requests.
However, CSV schemas are **flat** and **static**, while JSON parameters are **tree structured** and **dynamic** (but often static).
Because CSVs are flat, `RowModel`s are also flat, but they can represent various models. The `represents` interface attempts to simplify this for importing.

```ruby
class ProjectImportRowModel < ProjectRowModel
  include CsvRowModel::Import

  # this is shorthand for the pseudocode:
  # def project
  #   return if id.blank? || name.blank?
  #
  #   # turn off memoization with `memoize: false` option
  #   @project ||= __the_code_inside_the_block__
  # end
  #
  # and the pseudocode:
  # def valid?
  #   super # calls ActiveModel::Errors code
  #   errors.delete(:project) if id.invalid? || name.invalid?
  #   errors.empty?
  # end
  represents_one :project, dependencies: [:id, :name] do
    project = Project.where(id: id).first

    # project not found, invalid.
    return unless project

    project.name = name
    project
  end

  # same as above, but: returns [] if name.blank?
  represents_many :projects, dependencies: [:name] do
    Project.where(name: name)
  end
end

# Importing is the same
import_file = CsvRowModel::Import::File.new(file_path, ProjectImportRowModel)
row_model = import_file.next
row_model.project.name # => "Some Project Name"
```

The `represents_one` method defines a dynamic `#project` method that:

1. Memoizes by default; turn this off with the `memoize: false` option.
2. Handles dependencies:
  - When any of the dependencies are `blank?`, the attribute block is not called and the representation returns `nil`.
  - When any of the dependencies are `invalid?`, the representation's own entry in `row_model.errors` is removed. For the example above, if `id` or `name` is `invalid?`, then the `:project` key is removed from the errors, so: `row_model.errors.keys # => [:id, :name]` (this applies to warnings as well).

`represents_many` is also available, except it returns `[]` when any of the dependencies are `blank?`.
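
The memoization and blank-dependency rules can be reproduced in a few lines of plain Ruby. This is a simplified re-implementation for illustration only: `TinyRepresents` and `TinyRow` are hypothetical, the error handling for `invalid?` dependencies is left out, and a hash stands in for the database record.

```ruby
# Simplified re-implementation for illustration; not the gem's code.
module TinyRepresents
  def represents_one(name, dependencies:, &block)
    define_method(name) do
      # Any blank dependency short-circuits to nil without running the block.
      blank = dependencies.any? { |dep| public_send(dep).to_s.strip.empty? }
      return nil if blank

      @representations ||= {}
      return @representations[name] if @representations.key?(name)

      @representations[name] = instance_exec(&block)
    end
  end
end

class TinyRow
  extend TinyRepresents
  attr_reader :id, :name, :lookups

  def initialize(id, name)
    @id = id
    @name = name
    @lookups = 0 # counts how often the block actually runs
  end

  represents_one :project, dependencies: [:id, :name] do
    @lookups += 1
    { id: id, name: name } # stands in for a database record
  end
end

row = TinyRow.new("1", "Some Project Name")
row.project
row.project
p row.lookups                  # => 1 (the block ran once, then was memoized)
p TinyRow.new("", "x").project # => nil
```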

### Children

Child `RowModel` relationships can also be defined:

```ruby
class UserImportRowModel
  include CsvRowModel::Import

  column :id, type: Integer
  column :name
  column :email

  # uses ProjectImportRowModel#valid? to detect the child row
  has_many :projects, ProjectImportRowModel
end

import_file = CsvRowModel::Import::File.new(file_path, UserImportRowModel)
row_model = import_file.next
row_model.projects # => [<ProjectImportRowModel>, ...]
```

## Dynamic Columns
Dynamic columns are columns that can expand to many columns. Currently, only one dynamic column is supported, and it must come after all standard columns.
The following:

```ruby
class DynamicColumnModel
  include CsvRowModel::Model

  column :first_name
  column :last_name
  # header is optional, below is the default_implementation
  dynamic_column :skills, header: ->(skill_name) { skill_name }, header_models_context_key: :skills
end
```

represents this table:

| first_name | last_name  | skill1 | skill2 |
| ---------- |----------- | ------ | ------ |
| John       | Doe        |   No   |   Yes  |
| Mario      | Super      |   Yes  |   No   |
| Mike       | Jackson    |   Yes  |   Yes  |


The `format_dynamic_column_header(header_model, column_name, context)` method can
be overridden just like `format_header`. It is defined in both import and export because headers are used for both.
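
To make the expansion concrete, the following plain-Ruby sketch builds the header row and one data row of the table above from a context hash. `STATIC_COLUMNS`, `dynamic_headers`, and `dynamic_row` are hypothetical helpers, and users are plain hashes here rather than models.

```ruby
# Sketch: static headers followed by one header per item in the context.
STATIC_COLUMNS = %i[first_name last_name].freeze

# The default `header` lambda returns the item itself, as in the schema above.
def dynamic_headers(context, header: ->(skill_name) { skill_name })
  STATIC_COLUMNS.map(&:to_s) + context.fetch(:skills).map { |skill| header.call(skill) }
end

# One cell per context item, in the same order as the headers.
def dynamic_row(user, context)
  cells = context.fetch(:skills).map { |skill| user[:skills].include?(skill) ? "Yes" : "No" }
  [user[:first_name], user[:last_name], *cells]
end

context = { skills: %w[skill1 skill2] }
john = { first_name: "John", last_name: "Doe", skills: ["skill2"] }

p dynamic_headers(context)   # => ["first_name", "last_name", "skill1", "skill2"]
p dynamic_row(john, context) # => ["John", "Doe", "No", "Yes"]
```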

### Export
Dynamic column attributes are arrays, but each item in the array is defined via a singular attribute method, like
normal columns:

```ruby
class DynamicColumnExportModel < DynamicColumnModel
  include CsvRowModel::Export

  def skill(skill_name)
    # below is an override, this is the default implementation: skill_name # => "skill1", then "skill2"
    source_model.skills.include?(skill_name) ? "Yes" : "No"
  end
end

# `skills` in the context is used as the header, which is used in `def skill(skill_name)` above
# to change this context key, use the :header_models_context_key option
export_file = CsvRowModel::Export::File.new(DynamicColumnExportModel, { skills: Skill.all  })
export_file.generate do |csv|
  User.all.each { |user| csv << user }
end
```

### Import
Like Export above, each item of the array is defined via a singular attribute method, like
normal columns:

```ruby
class DynamicColumnImportModel < DynamicColumnModel
  include CsvRowModel::Import

  # this is an override with the default implementation (override highly recommended)
  def skill(value, skill_name)
    value
  end

  class << self
    # Clean/format every dynamic_column attribute array
    #
    # this is an override with the default implementation
    def format_dynamic_column_cells(cells, column_name, context)
      cells
    end
  end
end
row_model = CsvRowModel::Import::File.new(file_path, DynamicColumnImportModel).next
row_model.attributes # => { first_name: "John", last_name: "Doe", skills: ['No', 'Yes'] }
row_model.skills # => ['No', 'Yes']
```

## File Model

A File Model is a RowModel where the row represents the entire file. It looks like this:

| id   |  1   |
|------|------|
| name | abc  |

```ruby
class FileRowModel
  include CsvRowModel::Model
  include CsvRowModel::Model::FileModel

  row :id
  row :name
end
```

The `:header` option is not available. It is an unfinished/unpolished API, so things may change.

### Import

For File Model Import, the headers are matched via regex and the value is the cell to the right of the header.
When defining the schema, the order of the `row` calls does not matter.

```ruby
class FileImportModel < FileRowModel
  include CsvRowModel::Import
  include CsvRowModel::Import::FileModel
end
```
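
The lookup described above can be approximated in plain Ruby as follows. `read_file_model` is a hypothetical helper, and the case-insensitive pattern is an assumption; the gem's actual regex may differ.

```ruby
# Sketch: find each header cell by regex and take the cell to its right.
def read_file_model(rows, row_names)
  row_names.each_with_object({}) do |name, result|
    pattern = /\A\s*#{Regexp.escape(name.to_s)}\s*\z/i
    rows.each do |cells|
      index = cells.index { |cell| cell.to_s.match?(pattern) }
      next unless index

      result[name] = cells[index + 1] # the value sits to the right of the header
      break                           # first match wins; move on to the next name
    end
  end
end

rows = [
  ["id", "1"],
  ["", "name", "abc"]
]

# The order of the names does not matter, mirroring the `row` calls.
values = read_file_model(rows, %i[name id])
p values[:id]   # => "1"
p values[:name] # => "abc"
```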

### Export

For File Model Export, you have to define a template, where you fill in the values of each cell. Symbol values will match the row's header.

```ruby
class FileExportModel < FileRowModel
  include CsvRowModel::Export
  include CsvRowModel::Export::FileModel

  def rows_template
    @rows_template ||= begin
      [
        [:id, id],
        ['', :name, name]
      ]
    end
  end
  
  def name
    source_model.name.upcase
  end
end
```