README.md

Summary

Maintainability
Test Coverage
# Roo

[![Build Status](https://img.shields.io/travis/roo-rb/roo.svg?style=flat-square)](https://travis-ci.org/roo-rb/roo) [![Maintainability](https://api.codeclimate.com/v1/badges/be8d7bf34e2aeaf67c62/maintainability)](https://codeclimate.com/github/roo-rb/roo/maintainability) [![Coverage Status](https://img.shields.io/coveralls/roo-rb/roo.svg?style=flat-square)](https://coveralls.io/r/roo-rb/roo) [![Gem Version](https://img.shields.io/gem/v/roo.svg?style=flat-square)](https://rubygems.org/gems/roo)

Roo implements read access for all common spreadsheet types. It can handle:
* Excel 2007 - 2013 formats (xlsx, xlsm)
* LibreOffice / OpenOffice.org formats (ods)
* CSV
* Excel 97, Excel 2002 XML, and Excel 2003 XML formats when using the [roo-xls](https://github.com/roo-rb/roo-xls) gem (xls, xml)
* Google spreadsheets with read/write access when using [roo-google](https://github.com/roo-rb/roo-google)

## Installation

Install as a gem

    $ gem install roo

Or add it to your Gemfile

```ruby
gem "roo", "~> 2.10.0"
```
## Usage

### Opening a spreadsheet

You can use the `Roo::Spreadsheet` class so `roo` automatically detects which [parser class](https://github.com/roo-rb/roo/blob/master/lib/roo.rb#L17) to use for you.
```ruby
require 'roo'

file_name = './new_prices.xlsx'
xlsx = Roo::Spreadsheet.open(file_name)
xlsx.info
# => Returns basic info about the spreadsheet file
```

``Roo::Spreadsheet.open`` can accept both string paths and ``File`` instances. Also, you can provide the extension of the file as an option: 

```ruby
require 'roo'

file_name = './rails_temp_upload'
xlsx = Roo::Spreadsheet.open(file_name, extension: :xlsx)
xlsx.info
# => Returns basic info about the spreadsheet file
```

On the other hand, if you know what the file extension is, you can use the specific parser class instead:
```ruby
require 'roo'

xlsx = Roo::Excelx.new("./new_prices.xlsx")
xlsx.info
# => Returns basic info about the spreadsheet file
```

### Working with sheets

```ruby
ods.sheets
# => ['Info', 'Sheet 2', 'Sheet 3']   # an Array of sheet names in the workbook

ods.sheet('Info').row(1)
ods.sheet(0).row(1)

# Set the last sheet as the default sheet.
ods.default_sheet = ods.sheets.last
ods.default_sheet = ods.sheets[2]
ods.default_sheet = 'Sheet 3'

# Iterate through each sheet
ods.each_with_pagename do |name, sheet|
  p sheet.row(1)
end
```

### Accessing rows and columns

Roo uses Excel's numbering for rows, columns and cells, so `1` is the first index, not `0` as it is in an ``Array``

```ruby
sheet.row(1)
# returns the first row of the spreadsheet.

sheet.column(1)
# returns the first column of the spreadsheet.
```

Almost all methods have an optional argument `sheet`. If this parameter is omitted, the default_sheet will be used.

```ruby
sheet.first_row(sheet.sheets[0])
# => 1             # the number of the first row
sheet.last_row
# => 42            # the number of the last row
sheet.first_column
# => 1             # the number of the first column
sheet.last_column
# => 10            # the number of the last column
```

#### Accessing cells

You can access the top-left cell in the following ways

```ruby
sheet.cell(1,1)
sheet.cell('A',1)
sheet.cell(1,'A')
sheet.a1

# Access the second sheet's top-left cell.
sheet.cell(1,'A',sheet.sheets[1])
```

#### Querying a spreadsheet
Use ``each`` to iterate over each row.

If each is given a hash with the names of some columns, then each will generate a hash with the columns supplied for each row.

```ruby
sheet.each(id: 'ID', name: 'FULL_NAME') do |hash|
  puts hash.inspect
  # => { id: 1, name: 'John Smith' }
end
```

Use ``sheet.parse`` to return an array of rows. Column names can be a ``String`` or a ``Regexp``.

```ruby
sheet.parse(id: /UPC|SKU/, qty: /ATS*\sATP\s*QTY\z/)
# => [{:id => 727880013358, :qty => 12}, ...]
```

Use the ``:headers`` option to include the header row in the parsed content.

```ruby
sheet.parse(headers: true)
```

Use the ``:header_search`` option to locate the header row and assign the header names.

```ruby
sheet.parse(header_search: [/UPC*SKU/,/ATS*\sATP\s*QTY\z/])
```

Use the ``:clean`` option to strip out control characters and surrounding white space.

```ruby
sheet.parse(clean: true)
```

#### Options

When opening the file you can add a hash of options.

##### expand_merged_ranges
If you open a document with merged cells and do not want to end up with nil values for the rows after the first one.
```ruby
xlsx = Roo::Excelx.new('./roo_error.xlsx', {:expand_merged_ranges => true})
```

### Exporting spreadsheets
Roo has the ability to export sheets using the following formats. It
will only export the ``default_sheet``.

```ruby
sheet.to_csv
sheet.to_matrix
sheet.to_xml
sheet.to_yaml
```

Specify the file as default argument for `#to_csv`:

```ruby
sheet.to_csv(File.new("/dev/null"))
```

specify the custom separator:

```ruby
sheet.to_csv(separator: ":") # "," using by default
```

### Excel (xlsx and xlsm) Support

Stream rows from an Excelx spreadsheet.

```ruby
xlsx = Roo::Excelx.new("./test_data/test_small.xlsx")
xlsx.each_row_streaming do |row|
  puts row.inspect # Array of Excelx::Cell objects
end
```

By default blank cells will be excluded from the array. To keep them, use the option pad_cells = true. (They will be set to nil in the array)
```ruby
xlsx.each_row_streaming(pad_cells: true) do |row|
  puts row.inspect # Array of Excelx::Cell objects
end
```

To stream only some of the rows, you can use the ```max_rows``` and ```offset```options.
```ruby
xlsx.each_row_streaming(offset: 1) do |row| # Will exclude first (inevitably header) row
  puts row.inspect # Array of Excelx::Cell objects
end
```

```ruby
xlsx.each_row_streaming(max_rows: 3) do |row| # Will yield 4 rows (it's automatically incremented by 1) after the supplied offset.
  puts row.inspect # Array of Excelx::Cell objects
end
```

Iterate over each row

```ruby
xlsx.each_row do |row|
  ...
end
```

``Roo::Excelx`` also provides these helpful methods.

```ruby
xlsx.excelx_type(3, 'C')
# => :numeric_or_formula

xlsx.cell(3, 'C')
# => 600000383.0

xlsx.excelx_value(row,col)
# => '600000383'

xlsx.formatted_value(row,col)
# => '0600000383'
```

``Roo::Excelx`` can access celltype, comments, font information, formulas, hyperlinks and labels.

```ruby
xlsx.comment(1,1, ods.sheets[-1])
xlsx.font(1,1).bold?
xlsx.formula('A', 2)
```

### OpenOffice / LibreOffice Support

Roo::OpenOffice has support for encrypted OpenOffice spreadsheets.

```ruby
# Load an encrypted OpenOffice Spreadsheet
ods = Roo::OpenOffice.new("myspreadsheet.ods", password: "password")
```

``Roo::OpenOffice`` can access celltype, comments, font information, formulas and labels.

```ruby
ods.celltype
# => :percentage

ods.comment(1,1, ods.sheets[-1])

ods.font(1,1).italic?
# => false

ods.formula('A', 2)
```

### CSV Support

```ruby
# Load a CSV file
csv = Roo::CSV.new("mycsv.csv")
```

Because Roo uses the standard CSV library, you can use options available to that library to parse csv files. You can pass options using the ``csv_options`` key.

For instance, you can load tab-delimited files (``.tsv``), and you can use a particular encoding when opening the file.


```ruby
# Load a tab-delimited csv
csv = Roo::CSV.new("mytsv.tsv", csv_options: {col_sep: "\t"})

# Load a csv with an explicit encoding
csv = Roo::CSV.new("mycsv.csv", csv_options: {encoding: Encoding::ISO_8859_1})
```

You can also open csv files through the Roo::Spreadsheet class (useful if you accept both CSV and Excel types from a user file upload, for example).

```ruby
# Load a spreadsheet from a file path
# Roo figures out the right parser based on file extension
spreadsheet = Roo::Spreadsheet.open(csv_or_xlsx_file)

# Load a csv and auto-strip the BOM (byte order mark)
# csv files saved from MS Excel typically have the BOM marker at the beginning of the file
spreadsheet = Roo::Spreadsheet.open("mycsv.csv", { csv_options: { encoding: 'bom|utf-8' } })
```

## Upgrading from Roo 1.13.x
If you use ``.xls`` or Google spreadsheets, you will need to install ``roo-xls`` or ``roo-google`` to continue using that functionality.

Roo's public methods have stayed relatively consistent between 1.13.x and 2.0.0, but please check the [Changelog](https://github.com/roo-rb/roo/blob/master/CHANGELOG.md) to better understand the changes made since 1.13.x.



## Contributing
### Features
1. Fork it ( https://github.com/roo-rb/roo/fork )
2. Install it (`bundle install --with local_development`)
3. Create your feature branch (`git checkout -b my-new-feature`)
4. Commit your changes (`git commit -am 'My new feature'`)
5. Push to the branch (`git push origin my-new-feature`)
6. Create a new Pull Request

### Testing
Roo uses Minitest and RSpec. The best of both worlds! Run `bundle exec rake` to
run the tests/examples.

You can run the tests/examples with Rspec like reporters by running
`USE_REPORTERS=true bundle exec rake`

Roo also has a few tests that take a long time (5+ seconds). To run these, use
`LONG_RUN=true bundle exec rake`

### Issues

If you find an issue, please create a gist and refer to it in an issue ([sample gist](https://gist.github.com/stevendaniels/98a05849036e99bb8b3c)). Here are some instructions for creating such a gist.

1. [Create a gist](https://gist.github.com) with code that creates the error.
2. Clone the gist repo locally, add a stripped down version of the offending spreadsheet to the gist repo, and push the gist's changes master.
3. Paste the gist url here.


## License
[Roo uses an MIT License](https://github.com/roo-rb/roo/blob/master/LICENSE)