forked from apache/arrow
Windows compatibility + add flatbuffers as submodule #2
Open
mehrdadn wants to merge 3 commits into static from ray
pcmoritz pushed a commit that referenced this pull request Mar 15, 2017
Runtimes of the `builder-benchmark`:
```
BM_BuildPrimitiveArrayNoNulls/repeats:3          901 ms  889 ms  1  576.196MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3          833 ms  829 ms  1  617.6MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3          825 ms  821 ms  1  623.855MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3_mean     853 ms  846 ms  1  605.884MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3_stddev    34 ms   30 ms  0  21.147MB/s
BM_BuildVectorNoNulls/repeats:3                  712 ms  701 ms  1  729.866MB/s
BM_BuildVectorNoNulls/repeats:3                  671 ms  670 ms  1  764.464MB/s
BM_BuildVectorNoNulls/repeats:3                  688 ms  681 ms  1  751.285MB/s
BM_BuildVectorNoNulls/repeats:3_mean             690 ms  684 ms  1  748.538MB/s
BM_BuildVectorNoNulls/repeats:3_stddev            17 ms   13 ms  0  14.2578MB/s
```
With an aligned `Reallocate`, the jemalloc version is 50% faster and even outperforms `std::vector`:
```
BM_BuildPrimitiveArrayNoNulls/repeats:3          565 ms  559 ms  1  916.516MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3          540 ms  537 ms  1  952.727MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3          544 ms  543 ms  1  942.948MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3_mean     550 ms  546 ms  1  937.397MB/s
BM_BuildPrimitiveArrayNoNulls/repeats:3_stddev    11 ms    9 ms  0  15.2949MB/s
```
Author: Uwe L. Korn <[email protected]>
Closes apache#270 from xhochy/ARROW-456 and squashes the following commits:
d3ce3bf [Uwe L. Korn] Zero arrays for now
831399d [Uwe L. Korn] cpplint #2
e6e251b [Uwe L. Korn] cpplint
52b3c76 [Uwe L. Korn] Add Reallocate implementation to PyArrowMemoryPool
113e650 [Uwe L. Korn] Add missing file
d331cd9 [Uwe L. Korn] Add tests for Reallocate
c2be086 [Uwe L. Korn] Add JEMALLOC_HOME to the Readme
bd47f51 [Uwe L. Korn] Add missing return value
5142ac3 [Uwe L. Korn] Don't use deprecated GBenchmark interfaces
b6bff98 [Uwe L. Korn] Add missing (win) include
6f08e19 [Uwe L. Korn] Don't build jemalloc on AppVeyor
834c3b2 [Uwe L. Korn] Add jemalloc to Travis builds
10c6839 [Uwe L. Korn] Implement Reallocate function
a17b313 [Uwe L. Korn] ARROW-456: C++: Add jemalloc based MemoryPool
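The speedup quoted in this commit message can be checked against the `_mean` rows of the two benchmark runs. The sketch below only redoes that arithmetic; the object keys and the `gainPercent` helper are illustrative names, and it assumes the first run is the jemalloc pool before the aligned `Reallocate` (the start of the message is truncated, so that is an inference).

```js
// Mean throughput in MB/s, copied from the "_mean" rows of the benchmark output.
const meanThroughput = {
  firstRun: 605.884,   // BM_BuildPrimitiveArrayNoNulls, first run (assumed: before aligned Reallocate)
  stdVector: 748.538,  // BM_BuildVectorNoNulls
  secondRun: 937.397,  // BM_BuildPrimitiveArrayNoNulls, with aligned Reallocate
};

// Relative throughput gain of `after` over `before`, in percent.
function gainPercent(before, after) {
  return (after / before - 1) * 100;
}

const vsFirstRun = gainPercent(meanThroughput.firstRun, meanThroughput.secondRun);
const vsVector = gainPercent(meanThroughput.stdVector, meanThroughput.secondRun);

console.log(`vs. first run:   +${vsFirstRun.toFixed(1)}%`);
console.log(`vs. std::vector: +${vsVector.toFixed(1)}%`);
```

Measured by throughput, the gain over the first run comes out slightly above the quoted 50%, and the second run is also ahead of the `std::vector` baseline.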
pcmoritz pushed a commit that referenced this pull request Jan 26, 2018
…lue data
Modified BinaryBuilder::Resize(int64_t) so that when building BinaryArrays with a known size, space is also reserved for value_data_builder_ to prevent internal reallocation.
Author: Panchen Xue <[email protected]>
Closes apache#1481 from xuepanchen/master and squashes the following commits:
707b67b [Panchen Xue] ARROW-1712: [C++] Fix lint errors
360e601 [Panchen Xue] Merge branch 'master' of https://github.com/xuepanchen/arrow
d4bbd15 [Panchen Xue] ARROW-1712: [C++] Modify test case for BinaryBuilder::ReserveData() and change arguments for offsets_builder_.Resize()
77f8f3c [Panchen Xue] Merge pull request #5 from apache/master
bc5db7d [Panchen Xue] ARROW-1712: [C++] Remove unneeded data member in BinaryBuilder and modify test case
5a5b70e [Panchen Xue] Merge pull request #4 from apache/master
8e4c892 [Panchen Xue] Merge pull request #3 from xuepanchen/xuepanchen-arrow-1712
d3c8202 [Panchen Xue] ARROW-1945: [C++] Fix a small typo
0b07895 [Panchen Xue] ARROW-1945: [C++] Add data_capacity_ to track capacity of value data
18f90fb [Panchen Xue] ARROW-1945: [C++] Add data_capacity_ to track capacity of value data
bbc6527 [Panchen Xue] ARROW-1945: [C++] Update test case for BinaryBuild data value space reservation
15e045c [Panchen Xue] Add test case for array-test.cc
5a5593e [Panchen Xue] Update again ReserveData(int64_t) method for BinaryBuilder
9b5e805 [Panchen Xue] Update ReserveData(int64_t) method signature for BinaryBuilder
8dd5eaa [Panchen Xue] Update builder.cc
b002e0b [Panchen Xue] Remove override keyword from ReserveData(int64_t) method for BinaryBuilder
de318f4 [Panchen Xue] Implement ReserveData(int64_t) method for BinaryBuilder
e0434e6 [Panchen Xue] Add ReserveData(int64_t) and value_data_capacity() for methods for BinaryBuilder
5ebfb32 [Panchen Xue] Add capacity() method for TypedBufferBuilder
5b73c1c [Panchen Xue] Update again BinaryBuilder::Resize(int64_t capacity) in builder.cc
d021c54 [Panchen Xue] Merge pull request #2 from xuepanchen/xuepanchen-arrow-1712
232024e [Panchen Xue] Update BinaryBuilder::Resize(int64_t capacity) in builder.cc
c2f8dc4 [Panchen Xue] Merge pull request #1 from apache/master
pcmoritz pushed a commit that referenced this pull request Feb 26, 2018
This PR moves the `Table` class out of the Vector hierarchy and adds optimized dataframe operations to it. Currently implements an optimized `scan()` method, `filter(predicate)`, `count()`, and `countBy(column_name)` (only works on dictionary-encoded columns).
Some usage examples, based on the file generated by `js/test/data/tables/generate.py`:
``` js
> let table = Table.from(...);
> table.count()
1000000
> table.filter(col('lat').gteq(0)).count()
499718
> table.countBy('origin').toJSON()
{ Charlottesville: 166839,
'New York': 166251,
'San Francisco': 166642,
Seattle: 166659,
'Terre Haute': 166756,
'Washington, DC': 166853 }
> table.filter(col('lng').gteq(0)).countBy('origin').toJSON()
{ Charlottesville: 83109,
'New York': 83221,
'San Francisco': 83515,
Seattle: 83362,
'Terre Haute': 83314,
'Washington, DC': 83479 }
```
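To illustrate why `countBy` is cheap on dictionary-encoded columns, here is a minimal sketch that does not use the Arrow JS API at all. The column layout (`dictionary` plus `indices`), the `gteq` helper, and the sample data are all hypothetical stand-ins; the point is only that counting reduces to incrementing one counter per dictionary index, with an optional row predicate applied first.

```js
// Hypothetical dictionary-encoded column: each row stores a small integer
// index into `dictionary` instead of the string value itself.
const origin = {
  dictionary: ['Charlottesville', 'New York', 'Seattle'],
  indices: Uint8Array.from([0, 2, 1, 0, 0, 2]),
};
// Hypothetical numeric column with one value per row.
const lat = Float32Array.from([38.0, 47.6, -1.0, 38.1, -2.0, 47.5]);

// Hypothetical predicate helper, loosely modelled on col('lat').gteq(0):
// returns a function that tests a single row number.
const gteq = (values, threshold) => (row) => values[row] >= threshold;

// Count rows per dictionary entry, optionally restricted by a row predicate.
function countBy(column, predicate = () => true) {
  const counts = new Uint32Array(column.dictionary.length);
  for (let row = 0; row < column.indices.length; row++) {
    if (predicate(row)) counts[column.indices[row]]++;
  }
  // Translate dictionary indices back to their keys only once, at the end.
  const result = {};
  column.dictionary.forEach((key, i) => { result[key] = counts[i]; });
  return result;
}

const allCounts = countBy(origin);
const northernCounts = countBy(origin, gteq(lat, 0));
console.log(allCounts);      // { Charlottesville: 3, 'New York': 1, Seattle: 2 }
console.log(northernCounts); // { Charlottesville: 2, 'New York': 0, Seattle: 2 }
```

No strings are compared or hashed inside the loop, which is presumably why the PR restricts `countBy` to dictionary-encoded columns.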
There are performance tests for the dataframe operations; to run them, you must first generate the test data by running `npm run create:perfdata`.
The PR also includes @trxcllnt's refactor of the JS implementation to make it more closely resemble the C++ implementation. This refactor resolves multiple JIRAs: ARROW-1903, ARROW-1898, ARROW-1502, ARROW-1952 (partially), and ARROW-1985
Author: Paul Taylor <[email protected]>
Author: Brian Hulette <[email protected]>
Author: Brian Hulette <[email protected]>
Closes apache#1482 from TheNeuralBit/table-scan-perf and squashes the following commits:
52f1e0e [Brian Hulette] <, > are not commutative, misc cleanup
04b1838 [Brian Hulette] even more table tests
16b9ccb [Brian Hulette] Merge pull request #4 from trxcllnt/js-cpp-refactor
fe300df [Paul Taylor] fix closure es5/umd toString() iterator
3d5240a [Paul Taylor] fix more externs
10c48ad [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor
dbe7f81 [Brian Hulette] Add more Table unit tests
1910962 [Brian Hulette] Add optional bind callback to scan
5bdf17f [Brian Hulette] Fix perf
8cf2473 [Brian Hulette] Merge remote-tracking branch 'origin/master' into table-scan-perf
4a41b18 [Paul Taylor] add src/predicate to the list of exports we should save from uglify
5a91fab [Paul Taylor] add more view, predicate externs
f6adfb3 [Brian Hulette] Create predicate namespace
f7bb0ed [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor
e148ee4 [Paul Taylor] Merge branch 'extern-woes' into js-cpp-refactor
25cdc4a [Paul Taylor] add src/predicate to the list of exports we should save from uglify
dc7c728 [Paul Taylor] add more view, predicate externs
25e6af7 [Brian Hulette] Create predicate namespace
579ab1f [Brian Hulette] Merge pull request #2 from trxcllnt/js-cpp-refactor
f3cde1a [Paul Taylor] fix lint
9769773 [Paul Taylor] fix vector perf tests
016ba78 [Brian Hulette] Merge pull request #1 from trxcllnt/js-cpp-refactor
272d293 [Paul Taylor] Merge pull request #4 from ccri/empty-table
7bc7363 [Brian Hulette] Fix exception for empty Table
8ddce0a [Paul Taylor] check bounds in getChildAt(i) to avoid NPEs
f1dead0 [Paul Taylor] compute chunked nested childData list correctly
18807c6 [Paul Taylor] rename ChunkData's fields so it's more clear they're not semantically similar to other similarly named fields
7e43b78 [Paul Taylor] add test:integration npm script
a5f200f [Paul Taylor] Merge pull request #3 from ccri/table-from-struct
c8cd286 [Brian Hulette] Add Table.fromStruct
a00415e [Brian Hulette] Fix perf
54d4f5b [Paul Taylor] lazily allocate table and recordbatch columns, support NestedView's getChildAt(i) method in ChunkedView
40b3638 [Paul Taylor] run integration tests with local data for coverage stats
fe31ee0 [Paul Taylor] slice the flat data values before returning an iterator of them
e537789 [Paul Taylor] make it easier to run all integration tests from local data
c0fd2f9 [Paul Taylor] use the dictionary of the last chunked vector list for chunked dictionary vectors
e33c068 [Paul Taylor] Merge pull request #2 from ccri/fixed-size-list
5bb63af [Brian Hulette] Don't read OFFSET vector for FixedSizeList
614b688 [Paul Taylor] add asEpochMs to date and timestamp vectors
87334a5 [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor
b7f5bfb [Paul Taylor] rename numRows to length, add table.getColumn()
e81082f [Paul Taylor] export vector views, allow cloning data as another type
700a47c [Paul Taylor] export visitors
e859e13 [Paul Taylor] fix package.json bin entry
0620cfd [Brian Hulette] use Math.fround
0126dc4 [Brian Hulette] Don't recompute total length
e761eee [Brian Hulette] Rename asJSON to toJSON
6c91ed4 [Paul Taylor] Merge branch 'master' of github.com:apache/arrow into js-cpp-refactor-merge_with-table-scan-perf
d2b18d5 [Paul Taylor] Merge remote-tracking branch 'ccri/table-scan-perf' into js-cpp-refactor-merge_with-table-scan-perf
f3f3b86 [Paul Taylor] rename table.ts to recordbatch.ts in preparation for merging latest
e3f629d [Paul Taylor] fix rest of the mangling issues
fa7c17a [Paul Taylor] passing all tests except es5 umd mangler ones
e20decd [Brian Hulette] Add license headers
edcbdbe [Brian Hulette] cleanup
20717d5 [Brian Hulette] Fixed countBy(string)
7244887 [Brian Hulette] Add table unit tests...
6719147 [Brian Hulette] Add DataFrame.countBy operation
2f4a349 [Brian Hulette] Minor tweaks
2e118ab [Brian Hulette] linter
a788db3 [Brian Hulette] Cleanup
a9fff89 [Brian Hulette] Move Table out of the Vector hierarchy
1d60aa1 [Brian Hulette] Moved DataFrame ops to Table. DataFrame is now an interface
e8979ba [Brian Hulette] Refactor DataFrame to extend Vector<StructRow>
6a41d68 [Brian Hulette] clean up table benchmarks
2744c63 [Brian Hulette] Remove Chunked/Simple DataFrame distinction
aa999f8 [Brian Hulette] Add DictionaryVector optimization for equals predicate
4d9e8c0 [Brian Hulette] Add concept of predicates for filtering dataframes
796f45d [Brian Hulette] add DataFrame filter and count ops
30f0330 [Brian Hulette] Add basic DataFrame impl ...
a1edac2 [Brian Hulette] Add perf tests for table scans
d18d915 [Paul Taylor] fix struct and map rows
61dc699 [Paul Taylor] WIP -- refactor types to closer match arrow-cpp
62db338 [Paul Taylor] update dependencies and add es6+ umd targets to jest transform ignore patterns to fix ci
6ff18e9 [Paul Taylor] ship es2015 commonJS in main package to avoid confusion
74e828a [Paul Taylor] fix typings issues (ARROW-1903)
pcmoritz pushed a commit that referenced this pull request Jan 12, 2019
I am contributing to [Arrow 3731](https://issues.apache.org/jira/browse/ARROW-3731). This PR has the minimum functionality to read parquet files into an arrow::Table, which can then be converted to a tibble. Multiple parquet files can be read inside `lapply`, and then concatenated at the end.
Steps to compile
1) Build arrow and parquet c++ projects
2) In R run `devtools::load_all()`
What I could use help with:
The biggest challenge for me is my lack of experience with pkg-config. The R library has a `configure` file which uses pkg-config to figure out what c++ libraries to link to. Currently, `configure` looks up the Arrow project and links to -larrow only. We need it to also link to -lparquet. I do not know how to modify pkg-config's metadata to let it know to link to both -larrow and -lparquet.
Author: Jeffrey Wong <[email protected]>
Author: Romain Francois <[email protected]>
Author: jeffwong-nflx <[email protected]>
Closes apache#3230 from jeffwong-nflx/master and squashes the following commits:
c67fa3d <jeffwong-nflx> Merge pull request #3 from jeffwong-nflx/cleanup
1df3026 <Jeffrey Wong> don't hard code -larrow and -lparquet
8ccaa51 <Jeffrey Wong> cleanup
75ba5c9 <Jeffrey Wong> add contributor
56adad2 <jeffwong-nflx> Merge pull request #2 from romainfrancois/3731/parquet-2
7d6e64d <Romain Francois> read_parquet() only reading one parquet file, and gains a `as_tibble` argument
e936b44 <Romain Francois> need parquet on travis too
ff260c5 <Romain Francois> header was too commented, renamed to parquet.cpp
9e1897f <Romain Francois> styling etc ...
456c5d2 <Jeffrey Wong> read parquet files
22d89dd <Jeffrey Wong> hardcode -larrow and -lparquet
pcmoritz pushed a commit that referenced this pull request Feb 1, 2019
https://issues.apache.org/jira/browse/ARROW-3965
This creates an object which configures the BaseAllocator and Calendar used to configure the translation from a JDBC ResultSet to an Arrow vector.
Author: Mike Pigott <[email protected]>
Author: Michael Pigott <[email protected]>
Closes apache#3133 from mikepigott/jdbc-to-arrow-config and squashes the following commits:
be95426 <Mike Pigott> ARROW-3965: JDBC-To-Arrow Config Builder javadocs.
d6c64a7 <Mike Pigott> ARROW-3965: JdbcToArrowConfigBuilder
d7ca982 <Mike Pigott> Merge branch 'master' into jdbc-to-arrow-config
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
5b1b364 <Mike Pigott> Merge branch 'master' into jdbc-to-arrow-config
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions.
68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects.
8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config)
4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config)
df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions.
b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig.
da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter.
pcmoritz pushed a commit that referenced this pull request Feb 6, 2019
https://issues.apache.org/jira/browse/ARROW-3923
Hello! I was reading through the JDBC source code and I noticed that a java.util.Calendar was required for creating an Arrow Schema and Arrow Vectors from a JDBC ResultSet, when none is required. This change makes the Calendar optional.
Unit Tests: The existing SureFire plugin configuration uses a UTC calendar for the database, which is the default Calendar in the existing code. Likewise, no changes to the unit tests are required to provide adequate coverage for the change.
Author: Michael Pigott <[email protected]>
Author: Mike Pigott <[email protected]>
Closes apache#3066 from mikepigott/jdbc-timestamp-no-calendar and squashes the following commits:
4d95da0 <Mike Pigott> ARROW-3923: Supporting a null Calendar in the config, and reverting the breaking change.
cd9a230 <Mike Pigott> Merge branch 'master' into jdbc-timestamp-no-calendar
509a1cc <Michael Pigott> Merge pull request #5 from apache/master
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
089cff4 <Mike Pigott> Format fixes
a58a4a5 <Mike Pigott> Fixing calendar usage.
e12832a <Mike Pigott> Allowing for timestamps without a time zone.
xhochy pushed a commit that referenced this pull request Feb 8, 2019
https://issues.apache.org/jira/browse/ARROW-3923
Hello! I was reading through the JDBC source code and I noticed that a java.util.Calendar was required for creating an Arrow Schema and Arrow Vectors from a JDBC ResultSet, when none is required. This change makes the Calendar optional.
Unit Tests: The existing SureFire plugin configuration uses a UTC calendar for the database, which is the default Calendar in the existing code. Likewise, no changes to the unit tests are required to provide adequate coverage for the change.
Author: Michael Pigott <[email protected]>
Author: Mike Pigott <[email protected]>
Closes apache#3066 from mikepigott/jdbc-timestamp-no-calendar and squashes the following commits:
4d95da0 <Mike Pigott> ARROW-3923: Supporting a null Calendar in the config, and reverting the breaking change.
cd9a230 <Mike Pigott> Merge branch 'master' into jdbc-timestamp-no-calendar
509a1cc <Michael Pigott> Merge pull request #5 from apache/master
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
089cff4 <Mike Pigott> Format fixes
a58a4a5 <Mike Pigott> Fixing calendar usage.
e12832a <Mike Pigott> Allowing for timestamps without a time zone.
xhochy pushed a commit that referenced this pull request Feb 8, 2019
https://issues.apache.org/jira/browse/ARROW-3966
This change includes apache#3133, and supports a new configuration item called "Include Metadata." If true, metadata from the JDBC ResultSetMetaData object is pulled along to the Schema Field Metadata. For now, this includes:
* Catalog Name
* Table Name
* Column Name
* Column Type Name
Author: Mike Pigott <[email protected]>
Author: Michael Pigott <[email protected]>
Closes apache#3134 from mikepigott/jdbc-column-metadata and squashes the following commits:
02f2f34 <Mike Pigott> ARROW-3966: Picking up lost change to support null calendars.
7049c36 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
e9a9b2b <Michael Pigott> Merge pull request apache#6 from apache/master
65741a9 <Mike Pigott> ARROW-3966: Code review feedback
cc6cc88 <Mike Pigott> ARROW-3966: Using a 1:N loop instead of a 0:N-1 loop for fewer index offsets in code.
cfb2ba6 <Mike Pigott> ARROW-3966: Using a helper method for building a UTC calendar with root locale.
2928513 <Mike Pigott> ARROW-3966: Moving the metadata flag assignment into the builder.
69022c2 <Mike Pigott> ARROW-3966: Fixing merge.
4a6de86 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
509a1cc <Michael Pigott> Merge pull request #5 from apache/master
789c8c8 <Michael Pigott> Merge pull request #4 from apache/master
e5b19ee <Michael Pigott> Merge pull request #3 from apache/master
3b17c29 <Michael Pigott> Merge pull request #2 from apache/master
d847ebc <Mike Pigott> Fixing file location
1ceac9e <Mike Pigott> Merge branch 'master' into jdbc-column-metadata
881c6c8 <Michael Pigott> Merge pull request #1 from apache/master
03091a8 <Mike Pigott> Unit tests for including result set metadata.
72d64cc <Mike Pigott> Affirming the field metadata is empty when the configuration excludes field metadata.
7b4527c <Mike Pigott> Test for the include-metadata flag in the configuration.
7e9ce37 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions.
a6fb1be <Mike Pigott> Fixing function call
5bfd6a2 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects.
b5b0cb1 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config)
4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config)
e34a9e7 <Mike Pigott> Fixing formatting.
fe097c8 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata
df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions.
b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig.
da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter.
a78c770 <Mike Pigott> Updating Javadocs.
523387f <Mike Pigott> Updating the API to support an optional 'includeMetadata' field.
5af1b5b <Mike Pigott> Separating out the field-type creation from the field creation.
pcmoritz pushed a commit that referenced this pull request Mar 16, 2019
I'm sure I'll need some guidance on this one from @sunchao or @liurenjie1024 but I am keen to get parquet support added for primitive types so that I can actually use DataFusion and Arrow in production at some point.
Author: Andy Grove <[email protected]>
Author: Neville Dipale <[email protected]>
Author: Andy Grove <[email protected]>
Closes apache#3851 from andygrove/ARROW-4466 and squashes the following commits:
3158529 <Andy Grove> add test for reading small batches
549c829 <Andy Grove> Remove hard-coded batch size, fix nits
8d2df06 <Andy Grove> move schema projection function from arrow into datafusion
204db83 <Andy Grove> fix timestamp nano issue
73aa934 <Andy Grove> Remove println from test
25d34ac <Andy Grove> Make INT32/64/96 handling consistent with C++ implementation
9b1308f <Andy Grove> clean up handling of INT96 and DATE/TIME/TIMESTAMP types in schema converter
1ec815b <Andy Grove> Clean up imports
023dc25 <Andy Grove> Merge pull request #2 from nevi-me/ARROW-4466
02b2ed3 <Neville Dipale> fix int96 conversion to read timestamps correctly
2aeea24 <Andy Grove> remove println from tests
9d3047a <Andy Grove> code cleanup
639e13e <Andy Grove> null handling for int96
1503855 <Andy Grove> handle nulls for binary data
80cf303 <Andy Grove> add date support
5a3368c <Andy Grove> Remove unnecessary slice, fix null handling
306d07a <Neville Dipale> fmt
3c711a5 <Neville Dipale> immediately allocate vec
e6cbbaa <Neville Dipale> replace read_column! macro with generic
607a29f <Andy Grove> return result if there are null values
e8aa784 <Andy Grove> revert temp debug change to error messages
6457c36 <Andy Grove> use parquet::reader::schema::parquet_to_arrow_schema
c56510e <Andy Grove> projection takes slice instead of vec
7e1a98f <Andy Grove> remove println and unwrap
dddb7d7 <Andy Grove> update to use partition-aware changes from master
157512e <Andy Grove> Remove invalid TODO comment
debb2fb <Andy Grove> code cleanup
6c3b7e2 <Andy Grove> add support for all primitive parquet types
b4981ed <Andy Grove> implement more parquet column types and tests
5ce3086 <Andy Grove> revert to columnar reads
c3f71d7 <Andy Grove> add integration test
aea9f8a <Andy Grove> convert to use row iter
f46e6f7 <Andy Grove> save
eaddafb <Andy Grove> save
322fc87 <Andy Grove> add test for reading strings from parquet
3a412b1 <Andy Grove> first parquet test passes
ff3e5b7 <Andy Grove> test
10710a2 <Andy Grove> Parquet datasource
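Several commits in this list concern reading Parquet INT96 values as timestamps. As background, here is a minimal sketch of the conventional INT96 timestamp layout: 8 little-endian bytes of nanoseconds within the day, followed by a 4-byte little-endian Julian day number. This is not the Rust code from the PR; the function and constant names are made up for illustration, and it assumes the file follows that legacy convention.

```js
// Julian day number of 1970-01-01, the Unix epoch.
const JULIAN_DAY_OF_UNIX_EPOCH = 2440588n;
const NANOS_PER_DAY = 86400n * 1000000000n;

// Decode a 12-byte INT96 timestamp (assumed layout: nanoseconds of day in
// bytes 0-7, Julian day in bytes 8-11, both little-endian) into nanoseconds
// since the Unix epoch, returned as a BigInt.
function int96ToEpochNanos(bytes) {
  if (bytes.length !== 12) throw new RangeError('INT96 values are 12 bytes');
  const view = new DataView(bytes.buffer, bytes.byteOffset, 12);
  const nanosOfDay = view.getBigUint64(0, true);
  const julianDay = BigInt(view.getInt32(8, true));
  return (julianDay - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanosOfDay;
}

// Usage: encode "one second into the day after the epoch" and decode it again.
const sample = new Uint8Array(12);
const sampleView = new DataView(sample.buffer);
sampleView.setBigUint64(0, 1000000000n, true); // 1 s expressed in nanoseconds
sampleView.setInt32(8, 2440589, true);         // Julian day of 1970-01-02

const epochNanos = int96ToEpochNanos(sample);
const iso = new Date(Number(epochNanos / 1000000n)).toISOString();
console.log(epochNanos, iso); // 86401000000000n 1970-01-02T00:00:01.000Z
```

The result is kept as a BigInt because nanosecond timestamps for present-day dates exceed `Number.MAX_SAFE_INTEGER`.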
pcmoritz pushed a commit that referenced this pull request Oct 27, 2019
This updates the language in `install_arrow()` to follow the README revision that will land in https://github.com/apache/arrow/pull/4948/files#diff-563b2cb2c8c2d51b2ff6b177e2d84286R33. The [Jira ticket](https://issues.apache.org/jira/browse/ARROW-6142) requested three things; this is `#2` in the list. On `#1`, I defer to the C++ installation docs, which are already included in the install_arrow message, rather than duplicating content here. `#3` is out of scope.
Closes apache#5027 from nealrichardson/no-ppa and squashes the following commits:
80b142e <Neal Richardson> s/arrow/Arrow/
44c9659 <Neal Richardson> Tweak language again
36cfe28 <Neal Richardson> Further linux install revisions
79bd7e0 <Neal Richardson> One more PPurge
63f75bd <Neal Richardson> Revise install_arrow instructions for Linux
Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
No description provided.