CHANGELOG.md
## [0.5.2] - 2023-09-01
Support Apache Arrow 13.0.0.
This version is compatible with Arrow 12.0.0.
- Breaking change
- Bug fixes
- Fix bundle install issue by installing libyaml-devel (#280)
- Fix ownership in devcontainer ci (#280)
- New features and improvements
- Support Arrow 13.0.0 (#280)
- Documentation and Example
- Add dataframe_comparison_ja (#281)
## [0.5.1] - 2023-08-18
Docker environment is replaced by Dev Container,
and Jupyter Notebooks will be created from qmd files.
- Breaking change
- Bug fixes
- Fix timestamp test to set TZ locally (#249)
- Fix regexp for beginning of String (#251)
- Fix loading bin/Gemfile locally in bin/jupyter script (#261)
- New features and improvements
- Support sort and null_placement options in Vector#rank (#265)
- Add Vector#find_substring method (#270)
- Add Group#one method (#274)
- Add Group#all and #any method (#274)
- Add Group#median method (#274)
- Add Group#count_uniq method (#274)
- Introduce Dev Container environment
- Introduce Devcontainer environment (#253)
- Change lifecycle script from postCreate to onCreate (#253)
- Move example to bin (#253)
- Fix Python and Ruby versions in Dev Container (#254)
- Add locale and timezone settings (#256)
- Add quarto from devcontainer feature (#259)
- Install HaranoAjiFonts as default Tex font (#259)
- Refactoring
- Rename boolean methods in VectorStringFunction (#263)
- Refine Vector#inspect to show whether chunked or not (#267)
- Add an alias Group#count_all for #group_count (#274)
- Improve in tests/CI
- Create rake commands for Notebook convert/test (#269)
- Fix rubocop warning of forwarding arguments in assign_update (#269)
- Use rake to start example script (#269)
- Add test in Vector#rank to cover illegal rank option error (#271)
- Add bundle install to Rakefile (#276)
- Use Dockerfile to create dev container (#276)
- Save image to ghcr in ci (#276)
- Documentation and Example
- YARD
- Update Docker Environment (#245)
- Refine jupyter notebook environment (#253)
- Refine yard in Group aggregations (#274)
- Fix yard of Vector#rank (#269)
- Fix yard of Group (#269)
- Notebook
- Start source management for jupyter notebook by qmd (#259)
- Don't create ipynb if it exists (#261)
- Add Group methods (125 in total) (#269)
- Add ArrowFunction (126 in total) (#269)
- Add DataFrame#auto_cast (127 in total) (#269)
- Update required version in examples notebook (#269)
- Update examples_of_red_amber (#269)
- Update red-amber.qmd (#269)
- GitHub site
- Fix broken link in README/README.ja by Viktorius Suwandi (#262)
- Change description in gemspec (#254)
- Add documents for Dev Container (#254)
- Thanks
- Viktorius Suwandi
## [0.5.0] - 2023-05-24
- Breaking change
- Use non keyword argument in #sub_by_value (#219)
- Upgrade dependency to Arrow 12.0.0 (#238)
- right_join will output columns in the same order as Red Arrow.
- DataFrame#join will not force the ordering of the original columns by default
- Joins with a type, such as full_join, sort after the join by default
- Bug fixes
- Use truncate in Vector#sample(float) (#229)
- Support options in DataFrame#tdra (#231)
- Fix printing table with non-ascii strings (#233)
- Fix join for Arrow 12.0.0
- New features and improvements
- Add a singleton method Vector.[] (#218)
- Add an alias #sub_group (#219)
- Accept Group#summarize{Hash} to rename aggregated columns (#219)
- Add Group#group_frame (#219)
- Add Vector#cast (#224)
- Add Vector#fill_nil(value) (#226)
- Add Vector#one (#227)
- Add Vector#mode (#228)
- Add DataFrame#propagate (#235)
- Add DataFrame#sample (#237)
- Add DataFrame#shuffle (#237)
- Support RankOptions in Vector#rank (#239)
- Introduce MatchSubstringOptions family in Vector (#241)
- Introduce Vector#match_substring?
- Add Vector#end_with?, #start_with? method
- Add Vector#match_like?
- Add Vector#count_substring method
- Refactoring
- Refine Group and SubFrames function (#219)
- Refine Group#group_count
- Use Acero in Group#filters
- Refine Group#filters, not using Acero
- Refine Group#summarize(array)
- Use Acero for renaming columns in join (#238)
- Use index kernel with IndexOptions introduced in 12.0.0 (#240)
- Improve in tests/CI
- Use Fedora 39 Rawhide in CI (#238)
- Documentation and Example
- Add missing yard documents for SubFrames::Selectors (#219)
- Update docker/example (#219)
- Update Gemfile in docker (#219)
- Add README.ja.md (#242)
- GitHub site
- Update link of Red Data Tools Chat to matrix (#242)
- Thanks
## [0.4.2] - 2023-04-02
- Breaking change
- Bug fixes
- Fix Vector#modulo, #fdiv, #remainder (#203)
- New features and improvements
- Update SubFrames#take to return SubFrames (#212)
- Refactoring
- Refine SubFrames to support partial retrieval (#207)
- Upgrade SubFrames#frames and promote to public (#207)
- Use faster count in Group#inspect (#207)
- Improve in tests/CI
- Documentation and Example
- Introduce minimum docker environment (#205)
- Move example REPL to docker (#205)
- Add readme.md in docker (#205)
- Add example_of_red_amber.ipynb (#205)
- Use smaller dataset in irb example
- Fix docker/example
- Updated link to red-data-tools (#213)
- Thanks to Soumya Kushwaha
- GitHub site
- Migrated to [Red Data Tools](https://github.com/red-data-tools)
- Thanks to Sutou Kouhei
- Thanks
- Sutou Kouhei
- Soumya Kushwaha
## [0.4.1] - 2023-03-11
- Breaking change
- Remove Vector.aggregate? method (#200)
- Bug fixes
- Return self in DataFrame#drop when dropper is empty (reverts 746ac263) (#193)
- Return self in DataFrame#rename when renaming to same name (#193)
- Return self in DataFrame#pick when pick itself (#199)
- Fix column width for non-ascii elements in DataFrame#to_s (#193)
- This change uses String#width.
- Fix DataFrame#to_iruby when data is date32 type (#193)
- Fix DataFrame#shorthand to show temporal type data simply (#193)
- Fix Vector#rank when data is ChunkedArray (#198)
- Fix Vector element-wise functions with nil as scalar (#198)
- Support :force_order for all methods of join family (#199)
- Supports the :force_order option to force sorting after join for the whole #join family.
- This will be valuable in some cases, such as large dataframes.
- Ensure baseframe's schema for SubFrames (#200)
- New features and improvements
- Add Vector#first, #last method (#198)
- This method will be used in SubFrames feature.
- Add Vector#modulo method (#198)
- The divmod function in Arrow C++ is still in draft state.
This method was created by combining existing functions
- Add Vector#quotient method (#198)
- Add aliases #div, #mod, #mul, #pow, #quo and #sub for Vector (#198)
- Add Vector#*_checked functions (#198)
- These functions will check numeric range overflow.
- Add 'tdra' and 'plain' in display mode (#193)
- The plain mode and default inspect will show up to 128 rows and 128 columns.
- Add String#width method in refinements (#193)
- This will be used to update DataFrame#to_s.
- Introduce pre-loaded REPL environment (#199)
- This commit adds bin/example, which starts an irb environment
with commonly used datasets such as penguins, diamonds, etc. preloaded.
- Upgrade SubFrames#aggregate to accept block (#200)
- Refactoring
- Use symbolized keys in refinements of Table#keys, #key? (#193)
- This allows Tables and DataFrames to be treated in the same manner.
- Use key_name.succ in suffix of DataFrame#join (#193)
- This makes it simple to get a name candidate.
- Use ||= to memorize instance variables (#193)
- Refine vector projection to use #variables (#193)
- #variables is fastest when picking Vectors.
- Refine Vector#is_in to avoid #pack (#198)
- Refine Vector#index (#198)
- Improve in tests/CI
- Tests
- Update benchmarks to test from older version (#193)
- Refine test of Vector function with scalar (#198)
- Refine test subframes and test_vector_selectable (#200)
- Cops
- CI
- Documentation
- Update documents(small fix) (#201)
- GitHub site
- Thanks
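
As an illustration of the `Vector#modulo` entry in 0.4.1 ("created by combining existing functions"), a floored modulo can be built from division, floor, multiplication and subtraction. The sketch below is plain Ruby on Arrays; `modulo_by_parts` is a hypothetical name, and this is not necessarily how RedAmber composes the Arrow functions.

```ruby
# Hypothetical helper: element-wise floored modulo built from simpler steps.
# Plain Ruby Arrays stand in for Vectors here.
def modulo_by_parts(xs, ys)
  xs.zip(ys).map do |x, y|
    quotient = x.fdiv(y).floor # divide, then floor toward negative infinity
    x - y * quotient           # subtract the floored multiple of the divisor
  end
end

modulo_by_parts([7, -7, 7, -7], [3, 3, -3, -3]) # => [1, 2, -2, -1]
```

The result has the sign of the divisor, which matches Ruby's own `Integer#%`.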
## [0.4.0] - 2023-02-25
- Breaking change
- Upgrade dependency to Arrow 11.0.0 (#188)
- Bug fixes
- Add :force_order option for DataFrame#join (#174)
- Return error for empty DataFrame in DataFrame#filter (#172)
- Accept ChunkedArray in DataFrame#filter (#172)
- Fix Vector#replace to accept Arrow::Array as a replacer (#179)
- Fix Vector#round_to_multiple to accept Float or Integer (#180)
- Change Vector atan2 to a class method (#180)
- Fix Vector#shift when boolean Vector (#184)
- Fix processing empty SubFrames (#183)
- Do not check object id in DataFrame#rename, #drop for self (#188)
- New features and improvements
- Accept a block in DataFrame#filter (#172)
- Add Vector.aggregate? method (#175)
- Introduce Vector#propagate method (#175)
- Add Vector#rank methods (#176)
- Add Vector#sample method (#176)
- Add Vector#sort method (#176)
- Promote DataFrame#shape_str to public (#184)
- Introduce Vector#concatenate (#184)
- Add #numeric? in refinements of Array (#184)
- Add Vector#cumulative_sum_checked and #cumsum (#184)
- Add Vector#resolve method (#184)
- Add DataFrame#tdra method (#184)
- Add #expand as an alias for Vector#propagate (#184)
- Add #glimpse as an alias for DataFrame#tdr (#184)
- New class SubFrames (#183)
- Introduce class SubFrames
- Memorize dataframes in SubFrames
- Add @frames to memorize sub DataFrames
- Accept filters in SubFrames.new
- Accept block in SubFrames.new
- Add SubFrames.by_filter
- Introduce methods creating SubFrames from DataFrame
- Introduce SubFrames#each method
- Add SubFrames#to_s method
- Add SubFrames#concatenate method
- Add SubFrames#offset_indices method
- SubFrames#aggregate method
- Redefine SubFrames#map to return SubFrames
- Define SubFrame#map dynamically
- Add SubFrames#assign method
- Redefine SubFrames#select to return SubFrames
- Add SubFrames#reject method
- Add SubFrames#filter_map method
- Refine DataFrame#indices memorizing @indices
- Rename SubFrames#universal_frame as #baseframe
- Set Group iteration feature to @api private
- Refactoring
- Generate Vector functions in class method (#177)
- Set Constant visibility to private (#179)
- Separate test_vector_function (#179)
- Relocate methods in DataFrameIndexable (#179)
- Rename Array refinements to the same name as Vector (#184)
- Improve in tests/CI
- Tests
- Update benchmarks to set 0.3.0 as a reference (#167)
- Move test of Vector#logb to proper location (#180)
- Cops
- Update .rubocop.yml to align with latest cops (#174)
- Unify style of MethodCallIndentation as relative to receiver (#184)
- CI
- Fix setting up Arrow by homebrew in CI (#167)
- Fix CI error on homebrew deleting python link (#167)
- Set cache-version to get new C extensions in CI (#173)
- Thanks to @kou for suggestion.
- Documentation
- Update DataFrame.md about loading csv without headers (#165)
- Thanks to kojix2
- Update YARD in DataFrame combinable (#168)
- Update comment for Ruby 2.7 support in README.md
- Update license year
- Update README (#172)
- Update Vector.md and yardoc in #propagate (#175)
- Use customized style sheet for YARD (#179)
- Add examples for the doc of #pick and #drop (#179)
- Add examples to YARD in DataFrame reshaping methods (#179)
- Update documents in DataFrameDisplayable (#179)
- Update documents in DataFrameVariableOperation (#179)
- Update document for dynamically generated methods (#179)
- Unify style in document (#179)
- Update documents in DataFrameSelectable (#179)
- Update documents of basic Vector methods (#179)
- Update document in VectorUpdatable (#179)
- Update document of Group (#179)
- Update document of DataFrameLoadSave (#180)
- Add examples for document of ArrowFunction (#180)
- Update document of Vector_unary_aggregation (#180)
- Update document of Vector_unary_element_wise (#180)
- Update document of Vector_binary_element_wise (#180)
- Add documentation to give comparison of dataframes (#169)
- Thanks to Benson Muite
- Update documents for consistency of method indentation (#189)
- Update CHANGELOG (#189)
- Update README for 0.4.0 (#189)
- GitHub site
- Thanks
- kojix2
- Benson Muite
## [0.3.0] - 2022-12-18
- Breaking change
- Supported Ruby version has changed from 2.7 to 3.0
- Upgrade minimum supported/required version of Ruby from 2.7 to 3.0 (#159, #160)
- Bug fixes
- Add check with #key? in DataFrame#method_missing (#140)
- Delete unnecessary backslash to suppress warning in unary functions (#140)
- Fix syntax in code_climate.yml (#144)
- Temporary disable simplecov test report (#149)
- Change Vector#[] to return Array or scalar (#148)
- Add missing simplecov HTML formatter (#148)
- Change return value of DataFrame#save to self (#160)
- Originally reported by kojix2.
- New features and improvements
- Update Vector#take to accept block (#148)
- Add properties of list Vectors (#148)
- Add Vector#split, #split_to_column, #split_to_row (#148)
- Add Vector#merge (#148)
- Refactoring
- Refactor code (#140)
- Add DataFrame.create as a faster constructor
- Refactor DataFrame.new using refinements and duck typing
- Refactor Vector.new using refinements and duck typing
- Add Vector.create as a faster constructor
- Refactor Group
- Refactor DataFrame#pick/#drop by refining Array
- Refactor DataFrame#pick/#drop
- Refactor nil treatment in pick/drop
- Refactor DataFrame#pick/#drop using new parser
- Refactor DataFrame#[]
- Refactor Vector#[], #take, #filter by updating parser
- Add for_keys option to parse_args
- Refactor Vector properties by refinements for Arrow::Array
- Refactor DataFrame selectable using Arrow::Array refinements instead of Vector methods
- Refactor DataFrame#assign
- Refine error message in DataFrame#to_long/to_wide (#143)
- Refactor Vector#take/filter returns arrow array (#148)
- Change LineLength in cop from 120 to 90 (#152)
- Refine DataFrame combinable (join) operations (#159)
- Refine DataFrame#join effectively using outputs options
- Simplify DataFrame set operations
- Improve in tests/CI
- Tests
- Update benchmark using 0.2.3 (#138)
- Update benchmark basic#02/pick by [] (#140)
- Update benchmark contexts and loop_count (#140)
- Add benchmark for vector (#140)
- Add tests for refinements (#140)
- Add benchmark for the series of DataFrame operations (#140)
- Add missing test for tdr and dictionary (#140)
- Add missing test for group#method with foreign key (#152)
- Add missing test for set operations and natural join (#152)
- Add missing test for DataFrame#[] with selecting by Array of illegal type (#152)
- Add missing test for DataFrame#assign when assigner size is mismatch (#152)
- Accept Hash as join keys in DataFrame join methods (#159)
- Cops
- Refactor/clean rubocop.yml (#138)
- CI
- Support Ruby 3.2 in CI test (#141)
- Send test coverage report to Code Climate (#144)
- Add test on Fedora (#151)
- Thanks to Benson Muite.
- Add workflow to generate document (#153)
- Thanks to kojix2.
- Support Code Climate test coverage report in CI (#155)
- Documentation
- Add YARD in data_frame.rb (#140)
- Fix YARD document in the code (#140)
- Add Code Climate badges of maintainability and coverage (#144)
- Add installation for Fedora in README (#147)
- Thanks to Benson Muite.
- Add Vector#split/merge in Vector.md (#148)
- Fix codeclimate badges in README (#155)
- Update YARD in DataFrame join methods (#159)
- Update jupyter notebook '89 examples of Redamber' (#160)
- Thanks
- Benson Muite
- kojix2
## [0.2.3] - 2022-11-16
- Bug fixes
- Fix DataFrame#to_s when DataFrame.size == 0 (#125)
- Remove unused lines in funcs (#128)
- Remove unused methods in helper (#128)
- Add test for invalid arg in DataFrame.new (#128)
- Add test for Vector#shift(0) (#128)
- Fix bugs for DataFrame#[], #pick and #drop with Range of Symbols and Symbol (#135)
- New features and improvements
- Upgrade dependency to Arrow 10.0.0 (#132)
Since 0.2.3, objects that respond to `to_arrow` can be used for initialization.
Arrays in Numo::NArray respond to `to_arrow` with Red Arrow Numo::NArray 0.0.6.
This feature was proposed by the Red Data Tools member @kojix2 and implemented by @kou.
I also made Vector respond to `to_arrow` and `to_arrow_array`.
It becomes a member of ducks ('quack quack'). Thanks!
- Change dev dependency to red-dataset-arrow (#117)
- Add dev dependency for red-arrow-numo-narray (#132)
- Support Numo::NArray in Vector.new (#132)
- Support Vector#to_arrow_array (#132)
- Update group (#118)
- Introduce new DataFrame group support (experimental)
This additional API will treat a grouped DataFrame as a list of DataFrames.
I think this API has pros such as:
- API is easy to understand and flexible.
- It has good compatibility with Ruby's primitive Enumerables.
- We can only use non hash-ed aggregation functions.
- Do not need grouped DataFrame state, nor `#ungroup` method.
- May be useful for concurrent operations.
This feature is implemented by Ruby, so it is pretty slow and experimental.
Use original Group API for practical purpose.
- `include Enumerable` to Group (experimental)
- Add Group#each, #inspect
- Refactor Group to align with Arrow
- Introduce DataFrame combining methods (#125)
- Introduce DataFrame#concatenate method
- Add DataFrame#merge method
- Add DataFrame#inner_join method
- Add DataFrame#full_join method
- Add DataFrame#left_join method
- Add DataFrame#right_join method
- Add DataFrame#semi_join method
- Add DataFrame#anti_join method
- Add DataFrame#intersect method
- Add DataFrame#union method
- Add DataFrame#setdiff method
- Rename #setdiff to #difference
- Support natural join in DataFrame#join
- Support partial join_key and renaming
- Fix DataFrame#join to merge key columns
- Add DataFrame#set_operable? method
- Add join/set/bind image to DataFrame.md
- Fix DataFrame#join, #right_semi, #right_anti (#128)
- Miscellaneous
- Return Vector in DataFrame#indices (#118)
- Improve tests/ci
- Improve CI
- Add CI test on macOS (#133)
- Enable bundler-cache on macOS (#128)
- Add install gobject introspection prior to glib in CI (#133)
This will stabilize CI system installation especially with cache.
- Rename workflows/test.yml to ci.yml (#133)
- Fix link in CI badge of README.md (#118)
- Add github action for coverage (#128)
- Add benchmark
- Add benchmarks with Rover (#118)
- Introduce benchmark suite (#134)
- Add benchmark for combining operations (#134)
- Measuring test coverage
- Add test coverage measurement (#128)
- Refactoring
- Remove redundant string escape in `test_vector_function` (#132)
- Refine tests to use `assert_equal_array` (#128)
- Rewrite Vector#replace (#128)
- Documentation
- Update README.md for installation (#126)
- Add clause that keys must be unique in doc. (#126)
- Rows should be called as 'records' (#126)
- Update Jupyter Notebook `83 examples of RedAmber` (#135)
- GitHub site
- Update Jupyter notebooks in Binder
- Change default branch name from 'master' to 'main' (#127)
- Thanks
Ruby Association Grant committee
It is a great honor that RedAmber was selected as a project of Ruby Association Grant 2022.
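
The experimental group API introduced in 0.2.3 treats a grouped DataFrame as a list of DataFrames and mixes in `Enumerable`. The sketch below shows that idea in plain Ruby, assuming rows are Hashes; `TinyGroup` is a hypothetical stand-in and not RedAmber's `Group`, and the sample rows are made-up values.

```ruby
# Hypothetical stand-in for a group that behaves like a list of sub-tables.
class TinyGroup
  include Enumerable

  def initialize(rows, key)
    @rows = rows
    @key = key
  end

  # Yield one sub-table (an Array of row Hashes) per distinct key value.
  def each(&block)
    return enum_for(:each) unless block

    @rows.group_by { |row| row[@key] }.each_value(&block)
    self
  end
end

# Made-up sample data for illustration only.
rows = [
  { species: 'Adelie', mass: 3700 },
  { species: 'Gentoo', mass: 5000 },
  { species: 'Adelie', mass: 3800 }
]
group = TinyGroup.new(rows, :species)

# Any Enumerable method works on the sub-tables; no grouped state or ungroup step.
mean_mass = group.map { |sub| sub.sum { |r| r[:mass] }.fdiv(sub.size) }
# => [3750.0, 5000.0]
```

Because only `each` is defined, everything else (`map`, `select`, `count`, and so on) comes from `Enumerable`, which is the compatibility the entry refers to.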
## [0.2.2] - 2022-10-04
- Bug fixes
- Return self when no replacement happens in Vector#replace. (#92)
- Limit n-digits in to_iruby. (#111)
- Fix displaying space in to_iruby. (#111)
- Raise error if key is duplicated. (#113)
- Fix DataFrame#pick/#drop with endless Range. (#113)
- Change type from dictionary to string in DataFrame reshaping methods. (#113)
- Fix arguments parser to accept Enumerator. (#114)
- New features and improvements
- Support to make a data frame from a to_arrow-responsible object. (#106) [Patch by Kenta Murata]
- Introduce DataFrame#auto_cast (experimental feature) (#105)
- Change default name in DataFrame#transpose, #to_long, #to_wide. (#110)
- Add Vector#dictionary? method. (#113)
- Add display mode 'Plain' and 'Minimum'. (#113)
- Refactor code
- Refine test_vector_selectable. (#92)
- Refine test_vector_updatable. (#92)
- Refine Vector.new. (#113)
- Refine DataFrame#pick, #drop. (#113)
- Documents
- Update images. (#90, #105, #113)
- Update README to use simpler examples. (#112)
- Update README with a new screenshot example. (#113)
- GitHub site
- Update Jupyter notebooks in Binder (#88, #115)
- Move binder support to heronshoes/docker-stacks repository.
- Update README notebook on binder.
- Add examples_of_RedAmber notebook on binder.
- Start to use discussions.
- Thanks
- Kenta Murata
## [0.2.1] - 2022-09-07
- Bug fixes
- Fix `Vector#each` with block (#66)
`Vector#each` will return value of each element with block.
- Fix table format at size == 9 (#67)
- Fix to support Vector in `DataFrame#assign` (#77)
- Add `assert_delta` functionality for `assert_with_NaN` (#78)
- Fix Vector#is_in when self is chunked (#79)
- Fix Array type error (uint/int) (#79)
- New features and improvements
- Refine `DataFrame#indices` method (#67)
- Update DataFrame reshaping methods (#73)
- Change default option value of DataFrame reshaping
- Change the order of import_cars example
- Add `DataFrame#method_missing` to get column vector by method (#75)
- Add `DataFrame#method_missing` to get column (#75)
- Accept both args and block in `DataFrame#assign` (#75)
- Accept indices in `DataFrame#pick` and `DataFrame#drop` (#76)
- Add `DataFrame#slice_by` method (#77)
- Add new Vector functions (#78)
- Add inverse trigonometric function for Vector
- `acos`
- `asin`
- Add logarithmic function for Vector
- `ln`
- `log10`
- `log1p`
- `log2`
- Add binary function `Vector#logb`
- Docker image and Jupyter Notebook [Thanks to Kenta Murata]
- Add link to RubyData in README
- Add link to interactive README by Binder
- Update Jupyter Notebook `71 examples of RedAmber`
- Thanks
- Kenta Murata
## [0.2.0] - 2022-08-15
- Bump version up to 0.2.0
- Bug fixes
- Fix order of multiple group keys (#55)
Only 1 group key comes to left. Other keys remain in right.
- Remove optional `require` for rover (#55)
Fix DataFrame.new for argument with Rover::DataFrame.
- Fix occasional failure in CI (#59)
Sometimes the CI test fails. I added -dev dependency
in Arrow install by apt, not doing in bundler.
- Fix calling :take in V#[] (#56)
Fixed to call the Arrow function :take instead of :array_take in Vector#take_by_vector. This prevents an error
when called with Arrow::ChunkedArray.
- Raise error renaming non existing key (#61)
Add error when the specified key does not exist.
- Fix DataFrame#rename #assign by array (#65)
- New features and improvements
- Support Arrow 9.0.0
- Upgrade to Arrow 9.0.0 (#59)
- Add Vector#quantile method (#59)
Arrow::QuantileOptions has been supported since Arrow GLib 9.0.0 (ARROW-16623, Thanks!)
- Add Vector#quantiles (#62)
- Add DataFrame#each_row (#56)
- Returns Enumerator if block is not given.
- Change DataFrame#each_row to return a Hash {key => row} (#63)
- Refactor to use pattern match in overloaded parameter parsing (#61)
- Refine DataFrame.new to use pattern match
- Use pattern match in DataFrame#assign
- Use pattern match in DataFrame#rename
- Accept Array for renamer/assigner in #rename/#assign (#61)
- Accept assigner by Arrays in DataFrame#assign
- Accept renamer pairs by Arrays in DataFrame#rename
- Add DataFrame#assign_left method
- Add summary/describe (#62)
- Introduce DataFrame#summary(#describe)
- Introduce reshaping methods for DataFrame (#64)
- Introduce DataFrame#transpose method
- Introduce DataFrame#to_long method
- Introduce DataFrame#to_wide method
- Others
- Add alias sort_index for array_sort_indices (#59)
- Enable :width option in DataFrame#to_s (#62)
- Add options to DataFrame#format_table (#62)
- Update Documents
- Add Yard doc for some methods
- Update Jupyter notebook '61 Examples of Red Amber' (#65)
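
The `DataFrame#each_row` entries in 0.2.0 describe two behaviors: an Enumerator is returned when no block is given, and each row is a Hash of `{key => value}`. A minimal plain-Ruby sketch of that contract follows, assuming column-oriented storage; `TinyFrame` is a hypothetical class, not RedAmber's `DataFrame`.

```ruby
# Hypothetical column-oriented frame used only to illustrate each_row.
class TinyFrame
  def initialize(columns)
    @columns = columns # e.g. { x: [1, 2], y: ['a', 'b'] }
  end

  def each_row
    # Without a block, hand back an Enumerator over the same method.
    return enum_for(:each_row) unless block_given?

    keys = @columns.keys
    # Transpose columns into rows, then pair each value with its key.
    @columns.values.transpose.each { |row| yield keys.zip(row).to_h }
    self
  end
end

frame = TinyFrame.new({ x: [1, 2], y: %w[a b] })
rows = frame.each_row.to_a
# => [{ x: 1, y: "a" }, { x: 2, y: "b" }]
```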
## [0.1.8] - 2022-08-04 (experimental)
- Bug fixes
- Fix unnamed column in table formatter (#52)
- Fix DataFrame#key?, DataFrame#key_index when @keys.nil? (#52)
- Align order of replacer in Vector#replace (#53, resolved #38)
- New features and improvements
- Refine DataFrame.new for empty arguments (#50)
- Delete .rubocop_todo.yml for not to use yoda condition (#50)
- Refine Group (#52, resolved #28)
- Refine Group methods creation
- Make group key at first(left)
- Show only one group count when same counts
- Add block acceptability for group
- Rename empty key to :unnamed in DataFrame.new
- Rename Group#aggregated_by to #summarize (#54)
- Add Vector#shift (#51)
- Vector#[] accepts Range as an argument (#51)
- Update documents
- Add support for yard (#54)
- Renew jupyter notebook '53 examples' (#54)
- Add more examples and images in README (#52)
- Add document of group manipulations in README (#52)
- Renew DF#group document in DataFrame.md (#52)
## [0.1.7] - 2022-07-15 (experimental)
- Bug fixes
- Remove development dependency for red-dataset-arrow (#47)
- To avoid irregular fails in CI test
- Add red-datasets to development dependency instead (#49)
- Suppress useless log in tests (#46)
Suppress log of Webrick and iruby.
- New features and improvements
- Use Table mode as default preview mode in `inspect`/`to_s` (#40)
- Show examples in documents in Table
- Use the word rows/columns
- Update images of data processing in Table style
- Introduce a new Table formatter (#47)
- Migrate from the Arrow's formatter
- Do not use TAB, format by spaces only.
- Align column width with head rows and tail rows.
- Show nils.
- Show data types.
- Refine documents to use new formatter output
- Simplify options of Vector functions (#46)
In previous code, Vector functions with options used an optional argument `opt`.
- Add `#float?`, `#integer?` to Vector (#46)
- Add `#each` to Vector (#47)
- Introduce class `Group` (#48)
- Refine `DataFrame#group` to use class Group
- Add methods to Group
- Move parquet and rover to development dependency (#49)
- Refine text in `DataFrame#to_iruby` (#40)
- Add badges in Github site
- Gitter badge for Red Data Tools (#42)
- Gem version and CI status badge (#45)
- Exchange containers in red-amber.rb and red_amber.rb (#47)
- Mainly use red_amber by consistency with the folder name
- Add Jupyter notebook '47 Examples of Red Amber' (#49)
## [0.1.6] - 2022-06-26 (experimental)
- Bug fixes
- Fix mime-type of empty DataFrame in `#to_iruby` (#31)
- Fix mime setting in `DataFrame#to_iruby` (#36)
- Fix unmatched return val in Selectable (#34)
- Fix to return same error as `#[]` in `DataFrame#slice` (#34)
- New features and improvements
- Introduce Jupyter support (#29, #30, #31, #32)
- Add `DataFrame#to_html` (changed to use `#to_iruby`)
- Add feature to show nil in to_iruby
- nil is expressed as (nil)
- empty string('') is ""
- blank spaces are " "
- Enable to change DataFrame display mode by ENV (#36)
- Support ENV['RED_AMBER_OUTPUT_STYLE'] to change display mode in `#inspect` and `#to_iruby`
- ENV['RED_AMBER_OUTPUT_STYLE'] = 'table' # => Table mode
- ENV['RED_AMBER_OUTPUT_STYLE'] = nil or other than 'table' # => TDR mode
- Support `require 'red-amber'`, as well (#34)
- Refine Vector slicing methods (#31)
- Introduce `Vector#take` method
- Introduce `Vector#filter` method
- Improve `Vector#[]` to overload take and filter
- Introduce `Vector#drop_nil` method
- Introduce `Vector#if_else` method
- Introduce `Vector#is_in` method
- Add alias `Vector#all?`, `#any?` methods (#32)
- Add `Vector#has_nil?` method(#32)
- Add `Vector#empty?` method
- Add `Vector#primitive_invert` method
- Refactor `Vector#take`, `#filter`
- Move `Vector#if_else` from function to Updatable
- Move if_else test to updatable
- Rename updatable in test
- Remove method `Vector#take_out_element_wise`
- Rename inner method name
- Refine DataFrame slicing methods (#31)
- Introduce `DataFrame#take` method
- #take is implemented as vector calculation by #if_else
- Introduce `DataFrame#filter` method
- Change `DataFrame#[]` to use take and filter
- Float indices are acceptable (#10)
- Negative index (like Array) is also acceptable
- Further refinement in DataFrame slicing methods (#34)
- Improve `DataFrame#[]`, `#slice`, `#remove` by a new engine
- It parses arguments to Vector internally.
- Used Kernel#Array to simplify code (#16) .
- Move `DataFrame#slice`, `#remove` to Selectable
- Refine `DataFrame#take`, `#filter` (undocumented)
- Introduce coerce in Vector (#35)
- Introduce `Vector#coerce`
- Now we can `-1 * Vector.new([1, 2, 3])`
- Add `Vector#to_ary` method
- Now we can `[1, 2] + Vector.new([3, 4, 5])`
- Other new feature or refinements
- Common
- Refactor helper as common for DataFrame and Vector (#35)
- Change name row/col to obs/var (#34)
- Rename internal function name (#34)
- Delete unused methods (#34)
- DataFrame
- Change to return instance variable in `#to_arrow`, `#keys` and `#key_index` (#34)
- Change to return an Array in `DataFrame#indices` (#35)
- Vector
- Introduce `Vector#replace` method
- Accept Range and expanded Array in `Vector#new`
- Add `Vector#indices` method (#35)
- Add `Vector#index` method (#35)
- Rename VectorCompensable to *Updatable (#33)
- Documentation
- Fix typo in DataFrame.md
- Github site
- Add gem and status badges in README. (#42) [Patch by kojix2]
- Thanks
- kojix2
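
The 0.1.6 entries on `Vector#coerce` and `Vector#to_ary` rely on two standard Ruby protocols. The sketch below shows how those protocols make `-1 * vector` and `[1, 2] + vector` work, using a hypothetical `TinyVector` class in plain Ruby; RedAmber's actual `Vector` wraps Arrow arrays and its implementation may differ.

```ruby
# Hypothetical stand-in, not RedAmber::Vector.
class TinyVector
  attr_reader :values

  def initialize(values)
    @values = values.to_a
  end

  # Element-wise product with a scalar or another TinyVector.
  def *(other)
    rhs = other.is_a?(TinyVector) ? other.values : Array.new(@values.size, other)
    TinyVector.new(@values.zip(rhs).map { |a, b| a * b })
  end

  # Integer#* calls this when its right operand is not Numeric.
  # Returning [converted_left, self] lets Ruby retry as converted_left * self.
  def coerce(other)
    [TinyVector.new(Array.new(@values.size, other)), self]
  end

  # Implicit Array conversion; Array#+ uses it to accept a TinyVector.
  def to_ary
    @values
  end
end

scaled = -1 * TinyVector.new([1, 2, 3])
scaled.values # => [-1, -2, -3]

joined = [1, 2] + TinyVector.new([3, 4, 5])
# => [1, 2, 3, 4, 5]
```

Broadcasting the scalar inside `coerce` is one design choice among several; wrapping the scalar in a dedicated type would work as well.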
## [0.1.5] - 2022-06-12 (experimental)
- Bug fixes
- Fix DataFrame#tdr to display timestamp type (#19)
- Add TZ setting in CI test to pass temporal tests (#19)
- Fix example in document of #load(csv_from_URI) (#23)
- New features and improvements
- Improve usability of DataFrame manipulating block (#19)
- Add `DataFrame#v` to select a Vector
- Add `DataFrame#variables` method
- Add `DataFrame#to_arrow`
- Add instance variables in DataFrame with lazy initialization
- Add `Vector#key` to get key name
- Add `Vector#temporal?` to check if temporal type
- Refine around DataFrame#variables
- Refine init of instance variables
- Refine DataFrame#type_classes, Vector#type_class
- Refine DataFrame#tdr to shorten temporal data
- Add supports to make up for missing values (#20)
- Add VectorArgumentError
- Add `Vector#replace_with`
- Add helper function to assert with NaN
- To assert NaN == NaN
- Add `Vector#fill_nil_backward`, `Vector#fill_nil_forward`
- Add `DataFrame#remove_nil` method
- Change to accept nil as replacement in Vector#replace_with
- Introduce index related methods (#22)
- Add `Vector#sort_indexes` method
- Add `Vector#uniq` method
- Add `Vector#tally` and `Vector#value_counts` methods
- Add `DataFrame#sort` method
- Add `DataFrame#group` method
- Change to use DataFrame#map_indices in #[]
- Add rounding functions with opts (#21)
- With options :mode and :n_digits
- :n_digits also can be specified with :multiple option in `Vector#round_to_multiple`
- `Vector#round`
- `Vector#ceil`
- `Vector#floor`
- `Vector#trunc`
- Documentation
- Update TDR, TDR_ja documents to latest (#18)
- Refinement and small fix in DataFrame.md (#18)
- Update README to use more effective example (#18)
- Delete expired TDR_operations.pdf (#23)
- Update README and dataframe_model image (#23)
- Update description about rover-df in README (#23)
- Add installation of Arrow in README (#23)
- Others
- Tried but cannot use bundler cache in ci test (#17)
- Bump up requirements to Arrow 8.0.0 (#25)
- Arrow 7.0.0 with Ubuntu 21.04 causes a fatal error in replace_with_mask function.
- Update the description of gem (#23)
- Add benchmark tests (#26)
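
Among the missing-value helpers added in 0.1.5 (#20), forward and backward nil filling are the easiest to picture on a plain Array. The sketch below only illustrates the expected behavior; the function names are hypothetical free functions, whereas RedAmber implements these as Vector methods on top of Arrow.

```ruby
# Replace each nil with the most recent non-nil value seen so far.
# Leading nils stay nil because nothing precedes them.
def fill_nil_forward(values)
  last = nil
  values.map { |v| v.nil? ? last : (last = v) }
end

# Backward fill is a forward fill over the reversed sequence.
def fill_nil_backward(values)
  fill_nil_forward(values.reverse).reverse
end

data = [nil, 1, nil, nil, 2, nil]
fill_nil_forward(data)  # => [nil, 1, 1, 1, 2, 2]
fill_nil_backward(data) # => [1, 1, 2, 2, 2, nil]
```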
## [0.1.4] - 2022-05-29 (experimental)
- Bug fixes
- Fix missing support for scalar argument (#1)
- Fix type name of boolean in DataFrame#types to be same as Vector#type (#6, #7)
- Fix zero picking to return empty DataFrame (#8)
- Fix code at both args and a block given (#8)
- New features and improvements
- `DataFrame`
- Refine module name `Displayable`
- Rename nrow/ncol methods to `size`/`n_keys` to align with TDR concept (#4)
- Remain `n_row`/`n_col` for compatibility
- Rename `ls` method to `tdr` (#4)
- Add limit option to `tdr`
- Shorten option name (#11)
- Introduce `pick` method to create sub DataFrame (#8)
- Add boolean support (#8)
- Refactor `pick` (#9)
- Introduce `drop` method to create sub DataFrame (#8)
- Add boolean support (#8)
- Refactor `drop` (#9)
- Add boolean array support for `[]` (#9)
- Add `indexes`/`indices` to use with selecting observations (#9)
- Introduce `slice` method to create sub DataFrame (#8)
- Refactor `slice` (#9)
- Introduce `remove` method to create sub DataFrame (#9)
- Introduce `rename` method to create sub DataFrame (#14)
- Introduce `assign` method to create sub DataFrame (#14)
- Improve to call block by instance_eval (#13)
- `Vector`
- Refine `find(function)`
- Add `min_max` method (#2)
- Add `std`/`sd` method (ddof=0 version: `stddev`) (#2)
- Add `var` method (ddof=0 version: `variance`) (#2)
- Add `VectorFunctions.arrow_doc(func_name)` (temporarily)
- Documentation
- Show code in README
- Change row/column names for **TDR** concept (#4)
- Add documents about **TDR** concept (#4)
- Add example about TDR (#4)
- Separate README to create DataFrame and Vector documents (#12)
- Add DataFrame model concept image to README (#12)
- GitHub site
- Switched to use merge on GitHub (not to push merged master) (#1)
- Create lifetime issue #3 to show the goal of this project (#3)
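
The 0.1.4 `Vector` entries distinguish `std`/`var` from their "ddof=0 version" `stddev`/`variance`. The sketch below shows what the delta degrees of freedom (`ddof`) changes, assuming, as those entries imply, that `std`/`var` use ddof=1. The function names are hypothetical and operate on plain Arrays.

```ruby
# Variance with a configurable delta degrees of freedom:
# ddof: 0 divides by n (population), ddof: 1 divides by n - 1 (unbiased sample).
def variance_with_ddof(values, ddof: 0)
  mean = values.sum.fdiv(values.size)
  values.sum { |v| (v - mean)**2 } / (values.size - ddof)
end

# Standard deviation is the square root of the variance.
def std_with_ddof(values, ddof: 0)
  Math.sqrt(variance_with_ddof(values, ddof: ddof))
end

data = [1, 2, 3, 4]
variance_with_ddof(data)          # => 1.25
variance_with_ddof(data, ddof: 1) # 5.0 / 3, roughly 1.667
```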
## [0.1.3] - 2022-05-15 (experimental)
- Bug fixes
- Fix boolean functions in `Vector` to align with Ruby's behavior
- `&` == `and_kleene`
- `|` == `or_kleene`
- Quote strings of data-preview in `DataFrame#inspect`
- Quote empty and blank keys in `DataFrame#inspect`
- Respond to error for a wrong key in `DataFrame#[]`
- New features and improvements
- `DataFrame`
- Display nil elements in `inspect`
- Show NaN and nil counts in `inspect`
- Refactor `inspect`
- Add method `key` and `key_index`
- Add how to load/save Parquet to README
- `Vector`
- Add categorization functions
This is an important step to support `slice` method and NA treatment features.
- `is_finite`
- `is_inf`
- `is_na` (RedAmber original)
- `is_nan`
- `is_nil`, `is_null`
- `is_valid`
- Show in a reduced representation for long array in `inspect`
- Support options in aggregation functions
- Return values in non-arrow object for scalar aggregation functions
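
The 0.1.3 bug fix maps `&` and `|` to Arrow's `and_kleene` and `or_kleene`. For reference, Kleene logic treats nil as "unknown": a result is only nil when the known operand cannot decide it. The sketch below spells out that truth table for scalars in plain Ruby; the function names mirror the Arrow ones but are local, hypothetical definitions.

```ruby
# Kleene AND: false wins over unknown; otherwise unknown propagates.
def and_kleene(a, b)
  return false if a == false || b == false
  return nil if a.nil? || b.nil?

  true
end

# Kleene OR: true wins over unknown; otherwise unknown propagates.
def or_kleene(a, b)
  return true if a == true || b == true
  return nil if a.nil? || b.nil?

  false
end

and_kleene(false, nil) # => false
and_kleene(true, nil)  # => nil
or_kleene(true, nil)   # => true
or_kleene(false, nil)  # => nil
```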
## [0.1.2] - 2022-05-08 (experimental)
- Bug fixes:
- `DataFrame`
- Fix bug in `#[]` with endless Range
- New features and improvements
- Add support for Arrow 8.0.0
- `DataFrame`
- `types` and `data_types`
- Range is usable to specify columns in `#[]`
- `Vector`
- `type` and `data_type`
## [0.1.1] - 2022-05-06 (experimental)
- Release on rubygems.org
- Introduce class `DataFrame`
- New from Hash, schema/rows, `Arrow::Table`, `Rover::DataFrame`
- Load from file, string, URI
- Save to file, string, URI
- Methods for basic properties
- Rich inspect method
- Basic selecting by `#[]`
- Introduce class `Vector`
- New from a column in a `DataFrame`
- New from `Arrow::Array`, `Arrow::ChunkedArray`, `Array`
- Methods for basic properties
- Function support
- Unary aggregations
- Unary element-wises
- Binary element-wises
- Some operators defined
## [0.1.0] - 2022-04-15 (unreleased)
- Initial version