| ------------------------------------------------------------------- |
| Sat Dec 7 19:08:29 UTC 2019 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 2.9.0: |
| * Array |
| + Fix da.std to work with NumPy arrays (:pr:`5681`) James Bourbeau |
| * Core |
| + Register sizeof functions for Numba and RMM (:pr:`5668`) John A |
| Kirkham |
| + Update meeting time (:pr:`5682`) Tom Augspurger |
| * DataFrame |
| + Modify dd.DataFrame.drop to use shallow copy (:pr:`5675`) |
| Richard J Zamora |
| + Fix bug in _get_md_row_groups (:pr:`5673`) Richard J Zamora |
| + Close sqlalchemy engine after querying DB (:pr:`5629`) Krishan |
| Bhasin |
| + Allow dd.map_partitions to not enforce meta (:pr:`5660`) Matthew |
| Rocklin |
| + Generalize concat_unindexed_dataframes to support cudf-backend |
| (:pr:`5659`) Richard J Zamora |
| + Add dataframe resample methods (:pr:`5636`) Ben Zaitlen |
| + Compute length of dataframe as length of first column |
| (:pr:`5635`) Matthew Rocklin |
| * Documentation |
| + Doc fixup (:pr:`5665`) James Bourbeau |
| + Update doc build instructions (:pr:`5640`) James Bourbeau |
| + Fix ADL link (:pr:`5639`) Ray Bell |
| + Add documentation build (:pr:`5617`) James Bourbeau |
| |
| ------------------------------------------------------------------- |
| Sun Nov 24 17:35:04 UTC 2019 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 2.8.1: |
| * Array |
| + Use auto rechunking in da.rechunk if no value given (:pr:`5605`) |
| Matthew Rocklin |
| * Core |
| + Add simple action to activate GH actions (:pr:`5619`) James |
| Bourbeau |
| * DataFrame |
| + Fix "file_path_0" bug in aggregate_row_groups (:pr:`5627`) |
| Richard J Zamora |
| + Add chunksize argument to read_parquet (:pr:`5607`) Richard J |
| Zamora |
| + Change test_repartition_npartitions to support aarch64 |
| architecture (:pr:`5620`) ossdev07 |
| + Categories lost after groupby + agg (:pr:`5423`) Oliver Hofkens |
| + Fixed relative path issue with parquet metadata file |
| (:pr:`5608`) Nuno Gomes Silva |
| + Enable gpu-backed covariance/correlation in dataframes |
| (:pr:`5597`) Richard J Zamora |
| * Documentation |
| + Fix institutional faq and unknown doc warnings (:pr:`5616`) |
| James Bourbeau |
| + Add doc for some utils (:pr:`5609`) Tom Augspurger |
| + Removes html_extra_path (:pr:`5614`) James Bourbeau |
| + Fixed See Also reference (:pr:`5612`) Tom Augspurger |
| |
| ------------------------------------------------------------------- |
| Sat Nov 16 17:53:12 UTC 2019 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 2.8.0: |
| * Array |
| + Implement complete dask.array.tile function (:pr:`5574`) Bouwe |
| Andela |
| + Add median along an axis with automatic rechunking (:pr:`5575`) |
| Matthew Rocklin |
| + Allow da.asarray to chunk inputs (:pr:`5586`) Matthew Rocklin |
| * Bag |
| + Use key_split in Bag name (:pr:`5571`) Matthew Rocklin |
| * Core |
| + Switch Doctests to Py3.7 (:pr:`5573`) Ryan Nazareth |
| + Relax get_colors test to adapt to new Bokeh release (:pr:`5576`) |
| Matthew Rocklin |
| + Add dask.blockwise.fuse_roots optimization (:pr:`5451`) Matthew |
| Rocklin |
| + Add sizeof implementation for small dicts (:pr:`5578`) Matthew |
| Rocklin |
| + Update fsspec, gcsfs, s3fs (:pr:`5588`) Tom Augspurger |
| * DataFrame |
| + Add dropna argument to groupby (:pr:`5579`) Richard J Zamora |
| + Revert "Remove import of dask_cudf, which is now a part of cudf |
| (:pr:`5568`)" (:pr:`5590`) Matthew Rocklin |
| * Documentation |
| + Add best practice for dask.compute function (:pr:`5583`) Matthew |
| Rocklin |
| + Create FUNDING.yml (:pr:`5587`) Gina Helfrich |
| + Add screencast for coordination primitives (:pr:`5593`) Matthew |
| Rocklin |
| + Move funding to .github repo (:pr:`5589`) Tom Augspurger |
| + Update calendar link (:pr:`5569`) Tom Augspurger |
| |
| ------------------------------------------------------------------- |
| Mon Nov 11 18:24:07 UTC 2019 - Todd R <toddrme2178@gmail.com> |
| |
| - Update to 2.7.0 |
| + Array |
| * Reuse code for assert_eq util method |
| * Update da.array to always return a dask array |
| * Skip transpose on trivial inputs |
| * Avoid NumPy scalar string representation in tokenize |
| * Remove unnecessary tiledb shape constraint |
| * Removes bytes from sparse array HTML repr |
| + Core |
| * Drop Python 3.5 |
| * Update the use of fixtures in distributed tests |
| * Changed deprecated bokeh-port to dashboard-address |
| * Avoid updating with identical dicts in ensure_dict |
| * Test Upstream |
| * Accelerate reverse_dict |
| * Update test_imports.sh |
| * Support cgroups limits on cpu count in multiprocess and threaded schedulers |
| * Update minimum pyarrow version on CI |
| * Make cloudpickle optional |
| + DataFrame |
| * Add an example of index_col usage |
| * Explicitly use iloc for row indexing |
| * Accept dask arrays on columns assignment |
| * Implement unique and value_counts for SeriesGroupBy |
| * Add sizeof definition for pyarrow tables and columns |
| * Enable row-group task partitioning in pyarrow-based read_parquet |
| * Removes npartitions='auto' from dd.merge docstring |
| * Apply enforce error message shows non-overlapping columns. |
| * Optimize meta_nonempty for repetitive dtypes |
| * Remove import of dask_cudf, which is now a part of cudf |
| + Documentation |
| * Make capitalization more consistent in FAQ docs |
| * Add CONTRIBUTING.md |
| * Document optional dependencies |
| * Update helm chart docs to reflect new chart repo |
| * Add Resampler to API docs |
| * Fix typo in read_sql_table |
| * Add adaptive deployments screencast |
| - Update to 2.6.0 |
| + Core |
| * Call ``ensure_dict`` on graphs before entering ``toolz.merge`` |
| * Consolidating hash dispatch functions |
| + DataFrame |
| * Support Python 3.5 in Parquet code |
| * Avoid identity check in ``warn_dtype_mismatch`` |
| * Enable unused groupby tests |
| * Remove old parquet and bcolz dataframe optimizations |
| * Add getitem optimization for ``read_parquet`` |
| * Use ``_constructor_sliced`` method to determine Series type |
| * Fix map(series) for unsorted base series index |
| * Fix ``KeyError`` with Groupby label |
| + Documentation |
| * Use Zoom meeting instead of appear.in |
| * Added curated list of resources |
| * Update SSH docs to include ``SSHCluster`` |
| * Update "Why Dask?" page |
| * Fix typos in docstrings |
| - Update to 2.5.2 |
| + Array |
| * Correct chunk size logic for asymmetric overlaps |
| * Make da.unify_chunks public API |
| + DataFrame |
| * Fix dask.dataframe.fillna handling of Scalar object |
| + Documentation |
| * Remove boxes in Spark comparison page |
| * Add latest presentations |
| * Update cloud documentation |
| - Update to 2.5.0 |
| + Core |
| * Add sentinel no_default to get_dependencies task |
| * Update fsspec version |
| * Remove PY2 checks |
| + DataFrame |
| * Add option to not check meta in dd.from_delayed |
| * Fix test_timeseries_nulls_in_schema failures with pyarrow master |
| * Reduce read_metadata output size in pyarrow/parquet |
| * Test numeric edge case for repartition with npartitions. |
| * Unxfail pandas-datareader test |
| * Add DataFrame.pop implementation |
| * Enable merge/set_index for cudf-based dataframes with cupy ``values`` |
| * drop_duplicates support for positional subset parameter |
| + Documentation |
| * Add screencasts to array, bag, dataframe, delayed, futures and setup |
| * Fix delimiter parsing documentation |
| * Update overview image |
| - Update to 2.4.0 |
| + Array |
| * Adds explicit ``h5py.File`` mode |
| * Provides method to compute unknown array chunks sizes |
| * Ignore runtime warning in Array ``compute_meta`` |
| * Add ``_meta`` to ``Array.__dask_postpersist__`` |
| * Fixup ``da.asarray`` and ``da.asanyarray`` for datetime64 dtype and xarray objects |
| * Add shape implementation |
| * Add chunktype to array text repr |
| * Array.random.choice: handle array-like non-arrays |
| + Core |
| * Remove deprecated code |
| * Fix ``funcname`` when vectorized func has no ``__name__`` |
| * Truncate ``funcname`` to avoid long key names |
| * Add support for ``numpy.vectorize`` in ``funcname`` |
| * Fixed HDFS upstream test |
| * Support numbers and None in ``parse_bytes``/``timedelta`` |
| * Fix tokenizing of subindexes on memmapped numpy arrays |
| * Upstream fixups |
| + DataFrame |
| * Allow pandas to cast type of statistics |
| * Preserve index dtype after applying ``dd.pivot_table`` |
| * Implement explode for Series and DataFrame |
| * ``set_index`` on categorical fails with less categories than partitions |
| * Support output to a single CSV file |
| * Add ``groupby().transform()`` |
| * Adding filter kwarg to pyarrow dataset call |
| * Implement and check compression defaults for parquet |
| * Pass sqlalchemy params to delayed objects |
| * Fixing schema handling in arrow-parquet |
| * Add support for DF and Series ``groupby().idxmin/max()`` |
| * Add correlation calculation and add test |
| + Documentation |
| * Numpy docstring standard has moved |
| * Reference correct NumPy array name |
| * Minor edits to Array chunk documentation |
| * Add methods to API docs |
| * Add namespacing to configuration example |
| * Add get_task_stream and profile to the diagnostics page |
| * Add best practice to load data with Dask |
| * Update ``institutional-faq.rst`` |
| * Add threads and processes note to the best practices |
| * Update cuDF links |
| * Fixed small typo with parentheses placement |
| * Update link in reshape docstring |
| - Update to 2.3.0 |
| + Array |
| * Raise exception when ``from_array`` is given a dask array |
| * Avoid adjusting gufunc's meta dtype twice |
| * Add ``meta=`` keyword to map_blocks and add test with sparse |
| * Add rollaxis and moveaxis |
| * Always increment old chunk index |
| * Shuffle dask array |
| * Fix ordering when indexing a dask array with a bool dask array |
| + Bag |
| * Add workaround for memory leaks in bag generators |
| + Core |
| * Set strict xfail option |
| * test-upstream |
| * Fixed HDFS CI failure |
| * Error nicely if no file size inferred |
| * A few changes to ``config.set`` |
| * Fixup black string normalization |
| * Pin NumPy in windows tests |
| * Ensure parquet tests are skipped if fastparquet and pyarrow not installed |
| * Add fsspec to readthedocs |
| * Bump NumPy and Pandas to 1.17 and 0.25 in CI test |
| + DataFrame |
| * Fix ``DataFrame.query`` docstring (incorrect numexpr API) |
| * Parquet metadata-handling improvements |
| * Improve messaging around sorted parquet columns for index |
| * Add ``rearrange_by_divisions`` and ``set_index`` support for cudf |
| * Fix ``groupby.std()`` with integer column names |
| * Add ``Series.__iter__`` |
| * Generalize ``hash_pandas_object`` to work for non-pandas backends |
| * Add rolling cov |
| * Add columns argument in drop function |
| + Documentation |
| * Update institutional FAQ doc |
| * Add draft of institutional FAQ |
| * Make boxes for dask-spark page |
| * Add motivation for shuffle docs |
| * Fix links and API entries for best-practices |
| * Remove "bytes" (internal data ingestion) doc page |
| * Redirect from our local distributed page to distributed.dask.org |
| * Cleanup API page |
| * Remove excess endlines from install docs |
| * Remove item list in phases of computation doc |
| * Remove custom graphs from the TOC sidebar |
| * Remove experimental status of custom collections |
| * Adds table of contents to Why Dask? |
| * Moves bag overview to top-level bag page |
| * Remove use-cases in favor of stories.dask.org |
| * Removes redundant TOC information in index.rst |
| * Elevate dashboard in distributed diagnostics documentation |
| * Updates "add" layer in HLG docs example |
| * Update GUFunc documentation |
| - Update to 2.2.0 |
| + Array |
| * Use da.from_array(..., asarray=False) if input follows NEP-18 |
| * Add missing attributes to from_array documentation |
| * Fix meta computation for some reduction functions |
| * Raise informative error in to_zarr if unknown chunks |
| * Remove invalid pad tests |
| * Ignore NumPy warnings in compute_meta |
| * Fix kurtosis calc for single dimension input array |
| * Support Numpy 1.17 in tests |
| + Bag |
| * Supply pool to bag test to resolve intermittent failure |
| + Core |
| * Base dask on fsspec |
| * Various upstream compatibility fixes |
| * Make distributed tests optional again. |
| * Fix HDFS in dask |
| * Ignore some more invalid value warnings. |
| + DataFrame |
| * Fix pd.MultiIndex size estimate |
| * Generalizing has_known_categories |
| * Refactor Parquet engine |
| * Add divide method to series and dataframe |
| * fix flaky partd test |
| * Adjust is_dataframe_like to adjust for value_counts change |
| * Generalize rolling windows to support non-Pandas dataframes |
| * Avoid unnecessary aggregation in pivot_table |
| * Add column names to apply_and_enforce error message |
| * Add schema keyword argument to to_parquet |
| * Remove recursion error in accessors |
| * Allow fastparquet to handle gather_statistics=False for file lists |
| + Documentation |
| * Adds NumFOCUS badge to the README |
| * Update developer docs |
| * Document DataFrame.set_index computation behavior |
| * Use pip install . instead of calling setup.py |
| * Close user survey |
| * Fix Google Calendar meeting link |
| * Add docker image customization example |
| * Update remote-data-services after fsspec |
| * Fix typo in spark.rst |
| * Update setup/python docs for async/await API |
| * Update Local Storage HPC documentation |
| |
| ------------------------------------------------------------------- |
| Tue Jul 23 00:23:55 UTC 2019 - Todd R <toddrme2178@gmail.com> |
| |
| - Update to 2.1.0 |
| + Array |
| * Add ``recompute=`` keyword to ``svd_compressed`` for lower-memory use |
| * Change ``__array_function__`` implementation for backwards compatibility |
| * Added ``dtype`` and ``shape`` kwargs to ``apply_along_axis`` |
| * Fix reduction with empty tuple axis |
| * Drop size 0 arrays in ``stack`` |
| + Core |
| * Removes index keyword from pandas ``to_parquet`` call |
| * Fixes upstream dev CI build installation |
| * Ensure scalar arrays are not rendered to SVG |
| * Environment creation overhaul |
| * s3fs, moto compatibility |
| * pytest 5.0 compat |
| + DataFrame |
| * Fix ``compute_meta`` recursion in blockwise |
| * Remove hard dependency on pandas in ``get_dummies`` |
| * Check dtypes unchanged when using ``DataFrame.assign`` |
| * Fix cumulative functions on tables with more than 1 partition |
| * Handle non-divisible sizes in repartition |
| * Handles timestamp and ``preserve_index`` changes in pyarrow |
| * Fix undefined ``meta`` for ``str.split(expand=False)`` |
| * Removed checks used for debugging ``merge_asof`` |
| * Don't use type when getting accessor in dataframes |
| * Add ``melt`` as a method of Dask DataFrame |
| * Adds path-like support to ``to_hdf`` |
| + Documentation |
| * Point to latest K8s setup article in JupyterHub docs |
| * Changes vizualize to visualize |
| * Fix ``from_sequence`` typo in delayed best practices |
| * Add user survey link to docs |
| * Fixes typo in optimization docs |
| * Update community meeting information |
| - Update to 2.0.0 |
| + Array |
| * Support automatic chunking in da.indices |
| * Err if there are no arrays to stack |
| * Asymmetrical Array Overlap |
| * Dispatch concatenate where possible within dask array |
| * Fix tokenization of memmapped numpy arrays on different part of same file |
| * Preserve NumPy condition in da.asarray to preserve output shape |
| * Expand foo_like_safe usage |
| * Defer order/casting einsum parameters to NumPy implementation |
| * Remove numpy warning in moment calculation |
| * Fix meta_from_array to support Xarray test suite |
| * Cache chunk boundaries for integer slicing |
| * Drop size 0 arrays in concatenate |
| * Raise ValueError if concatenate is given no arrays |
| * Promote types in `concatenate` using `_meta` |
| * Add chunk type to html repr in Dask array |
| * Add Dask Array._meta attribute |
| > Fix _meta slicing of flexible types |
| > Minor meta construction cleanup in concatenate |
| > Further relax Array meta checks for Xarray |
| > Support meta= keyword in da.from_delayed |
| > Concatenate meta along axis |
| > Use meta in stack |
| > Move blockwise_meta to more general compute_meta function |
| * Alias .partitions to .blocks attribute of dask arrays |
| * Drop outdated `numpy_compat` functions |
| * Allow da.eye to support arbitrary chunking sizes with chunks='auto' |
| * Fix CI warnings in dask.array tests |
| * Make map_blocks work with drop_axis + block_info |
| * Add SVG image and table in Array._repr_html_ |
| * ufunc: avoid __array_wrap__ in favor of __array_function__ |
| * Ensure trivial padding returns the original array |
| * Test ``da.block`` with 0-size arrays |
| + Core |
| * **Drop Python 2.7** |
| * Quiet dependency installs in CI |
| * Raise on warnings in tests |
| * Add a diagnostics extra to setup.py (includes bokeh) |
| * Add newline delimiter keyword to OpenFile |
| * Overload HighLevelGraphs values method |
| * Add __await__ method to Dask collections |
| * Also ignore AttributeErrors which may occur if snappy (not python-snappy) is installed |
| * Canonicalize key names in config.rename |
| * Bump minimum partd to 0.3.10 |
| * Catch async def SyntaxError |
| * catch IOError in ensure_file |
| * Cleanup CI warnings |
| * Move distributed's parse and format functions to dask.utils |
| * Apply black formatting |
| * Package license file in wheels |
| + DataFrame |
| * Add an optional partition_size parameter to repartition |
| * merge_asof and prefix_reduction |
| * Allow dataframes to be indexed by dask arrays |
| * Avoid deprecated message parameter in pytest.raises |
| * Update test_to_records to test with lengths argument (:pr:`4515`) `asmith26`_ |
| * Remove pandas pinning in Dataframe accessors |
| * Fix correlation of series with same names |
| * Map Dask Series to Dask Series |
| * Warn in dd.merge on dtype warning |
| * Add groupby Covariance/Correlation |
| * keep index name with to_datetime |
| * Add Parallel variance computation for dataframes |
| * Add divmod implementation to arrays and dataframes |
| * Add documentation for dataframe reshape methods |
| * Avoid use of pandas.compat |
| * Added accessor registration for Series, DataFrame, and Index |
| * Add read_function keyword to read_json |
| * Provide full type name in check_meta |
| * Correctly estimate bytes per row in read_sql_table |
| * Adding support of non-numeric data to describe() |
| * Scalars for extension dtypes. |
| * Call head before compute in dd.from_delayed |
| * Add support for rolling operations with larger window than partition size in DataFrames with Time-based index |
| * Update groupby-apply doc with warning |
| * Change groupby-ness tests in `_maybe_slice` |
| * Add master best practices document |
| * Add document for how Dask works with GPUs |
| * Add cli API docs |
| * Ensure concat output has coherent dtypes |
| * Fixes pandas_datareader dependencies installation |
| * Accept pathlib.Path as pattern in read_hdf |
| + Documentation |
| * Move CLI API docs to relevant pages |
| * Add to_datetime function to dataframe API docs `Matthew Rocklin`_ |
| * Add documentation entry for dask.array.ma.average |
| * Add bag.read_avro to bag API docs |
| * Fix typo |
| * Docs: Drop support for Python 2.7 |
| * Remove requirement to modify changelog |
| * Add documentation about meta column order |
| * Add documentation note in DataFrame.shift |
| * Docs: Fix typo |
| * Put do/don't into boxes for delayed best practice docs |
| * Doc fixups |
| * Add quansight to paid support doc section |
| * Add document for custom startup |
| * Allow `utils.derive_from` to accept functions, apply across array |
| * Add "Avoid Large Partitions" section to best practices |
| * Update URL for joblib to new website hosting their doc (:pr:`4816`) `Christian Hudon`_ |
| |
| ------------------------------------------------------------------- |
| Tue May 21 11:48:23 UTC 2019 - pgajdos@suse.com |
| |
| - version update to 1.2.2 |
| + Array |
| * Clarify regions kwarg to array.store (:pr:`4759`) `Martin Durant`_ |
| * Add dtype= parameter to da.random.randint (:pr:`4753`) `Matthew Rocklin`_ |
| * Use "row major" rather than "C order" in docstring (:pr:`4452`) `@asmith26`_ |
| * Normalize Xarray datasets to Dask arrays (:pr:`4756`) `Matthew Rocklin`_ |
| * Remove normed keyword in da.histogram (:pr:`4755`) `Matthew Rocklin`_ |
| + Bag |
| * Add key argument to Bag.distinct (:pr:`4423`) `Daniel Severo`_ |
| + Core |
| * Add core dask config file (:pr:`4774`) `Matthew Rocklin`_ |
| * Add core dask config file to MANIFEST.in (:pr:`4780`) `James Bourbeau`_ |
| * Enabling glob with HTTP file-system (:pr:`3926`) `Martin Durant`_ |
| * HTTPFile.seek with whence=1 (:pr:`4751`) `Martin Durant`_ |
| * Remove config key normalization (:pr:`4742`) `Jim Crist`_ |
| + DataFrame |
| * Remove explicit references to Pandas in dask.dataframe.groupby (:pr:`4778`) `Matthew Rocklin`_ |
| * Add support for group_keys kwarg in DataFrame.groupby() (:pr:`4771`) `Brian Chu`_ |
| * Describe doc (:pr:`4762`) `Martin Durant`_ |
| * Remove explicit pandas check in cumulative aggregations (:pr:`4765`) `Nick Becker`_ |
| * Added meta for read_json and test (:pr:`4588`) `Abhinav Ralhan`_ |
| * Add test for dtype casting (:pr:`4760`) `Martin Durant`_ |
| * Document alignment in map_partitions (:pr:`4757`) `Jim Crist`_ |
| * Implement Series.str.split(expand=True) (:pr:`4744`) `Matthew Rocklin`_ |
| + Documentation |
| * Tweaks to develop.rst from trying to run tests (:pr:`4772`) `Christian Hudon`_ |
| * Add document describing phases of computation (:pr:`4766`) `Matthew Rocklin`_ |
| * Point users to Dask-Yarn from spark documentation (:pr:`4770`) `Matthew Rocklin`_ |
| * Update images in delayed doc to remove labels (:pr:`4768`) `Martin Durant`_ |
| * Explain intermediate storage for dask arrays (:pr:`4025`) `John A Kirkham`_ |
| * Specify bash code-block in array best practices (:pr:`4764`) `James Bourbeau`_ |
| * Add array best practices doc (:pr:`4705`) `Matthew Rocklin`_ |
| * Update optimization docs now that cull is not automatic (:pr:`4752`) `Matthew Rocklin`_ |
| - version update to 1.2.1 |
| + Array |
| * Fix map_blocks with block_info and broadcasting (:pr:`4737`) `Bruce Merry`_ |
| * Make 'minlength' keyword argument optional in da.bincount (:pr:`4684`) `Genevieve Buckley`_ |
| * Add support for map_blocks with no array arguments (:pr:`4713`) `Bruce Merry`_ |
| * Add dask.array.trace (:pr:`4717`) `Danilo Horta`_ |
| * Add sizeof support for cupy.ndarray (:pr:`4715`) `Peter Andreas Entschev`_ |
| * Add name kwarg to from_zarr (:pr:`4663`) `Michael Eaton`_ |
| * Add chunks='auto' to from_array (:pr:`4704`) `Matthew Rocklin`_ |
| * Raise TypeError if dask array is given as shape for da.ones, zeros, empty or full (:pr:`4707`) `Genevieve Buckley`_ |
| * Add TileDB backend (:pr:`4679`) `Isaiah Norton`_ |
| + Core |
| * Delay long list arguments (:pr:`4735`) `Matthew Rocklin`_ |
| * Bump to numpy >= 1.13, pandas >= 0.21.0 (:pr:`4720`) `Jim Crist`_ |
| * Remove file "test" (:pr:`4710`) `James Bourbeau`_ |
| * Reenable development build, uses upstream libraries (:pr:`4696`) `Peter Andreas Entschev`_ |
| * Remove assertion in HighLevelGraph constructor (:pr:`4699`) `Matthew Rocklin`_ |
| + DataFrame |
| * Change cum-aggregation last-nonnull-value algorithm (:pr:`4736`) `Nick Becker`_ |
| * Fixup series-groupby-apply (:pr:`4738`) `Jim Crist`_ |
| * Refactor array.percentile and dataframe.quantile to use t-digest (:pr:`4677`) `Janne Vuorela`_ |
| * Allow naive concatenation of sorted dataframes (:pr:`4725`) `Matthew Rocklin`_ |
| * Fix perf issue in dd.Series.isin (:pr:`4727`) `Jim Crist`_ |
| * Remove hard pandas dependency for melt by using methodcaller (:pr:`4719`) `Nick Becker`_ |
| * A few dataframe metadata fixes (:pr:`4695`) `Jim Crist`_ |
| * Add Dataframe.replace (:pr:`4714`) `Matthew Rocklin`_ |
| * Add 'threshold' parameter to pd.DataFrame.dropna (:pr:`4625`) `Nathan Matare`_ |
| + Documentation |
| * Add warning about derived docstrings early in the docstring (:pr:`4716`) `Matthew Rocklin`_ |
| * Create dataframe best practices doc (:pr:`4703`) `Matthew Rocklin`_ |
| * Uncomment dask_sphinx_theme (:pr:`4728`) `James Bourbeau`_ |
| * Fix minor typo fix in a Queue/fire_and_forget example (:pr:`4709`) `Matthew Rocklin`_ |
| * Update from_pandas docstring to match signature (:pr:`4698`) `James Bourbeau`_ |
| |
| ------------------------------------------------------------------- |
| Mon Apr 22 19:32:28 UTC 2019 - Todd R <toddrme2178@gmail.com> |
| |
| - Update to version 1.2.0 |
| + Array |
| * Fixed mean() and moment() on sparse arrays |
| * Add test for NEP-18. |
| * Allow None to say "no chunking" in normalize_chunks |
| * Fix limit value in auto_chunks |
| + Core |
| * Updated diagnostic bokeh test for compatibility with bokeh>=1.1.0 |
| * Adjusts codecov's target/threshold, disable patch |
| * Always start with empty http buffer, not None |
| + DataFrame |
| * Propagate index dtype and name when create dask dataframe from array |
| * Fix ordering of quantiles in describe |
| * Clean up and document rearrange_column_by_tasks |
| * Mark some parquet tests xfail |
| * Fix parquet breakages with arrow 0.13.0 |
| * Allow sample to be False when reading CSV from a remote URL |
| * Fix timezone metadata inference on parquet load |
| * Use is_dataframe/index_like in dd.utils |
| * Add min_count parameter to groupby sum method |
| * Correct quantile to handle unsorted quantiles |
| + Documentation |
| * Add delayed extra dependencies to install docs |
| - Update to version 1.1.5 |
| + Array |
| * Ensure that we use the dtype keyword in normalize_chunks |
| + Core |
| * Use recursive glob in LocalFileSystem |
| * Avoid YAML deprecation |
| * Fix CI and add set -e |
| * Support builtin sequence types in dask.visualize |
| * unpack/repack orderedDict |
| * Add da.random.randint to API docs |
| * Add zarr to CI environment |
| * Enable codecov |
| + DataFrame |
| * Support setting the index |
| * DataFrame.itertuples accepts index, name kwargs |
| * Support non-Pandas series in dd.Series.unique |
| * Replace use of explicit type check with ._is_partition_type predicate |
| * Remove additional pandas warnings in tests |
| * Check object for name/dtype attributes rather than type |
| * Fix comparison against pd.Series |
| * Fixing warning from setting categorical codes to floats |
| * Fix renaming on index to_frame method |
| * Fix divisions when joining two single-partition dataframes |
| * Warn if partitions overlap in compute_divisions |
| * Give informative meta= warning |
| * Add informative error message to Series.__getitem__ |
| * Add clear exception message when using index or index_col in read_csv |
| + Documentation |
| * Add documentation for custom groupby aggregations |
| * Docs dataframe joins |
| * Specify fork-based contributions |
| * correct to_parquet example in docs |
| * Update and secure several references |
| |
| ------------------------------------------------------------------- |
| Tue Apr 9 10:06:13 UTC 2019 - pgajdos@suse.com |
| |
| - do not require optional python2-sparse for testing, python-sparse |
| is going to be python3-only |
| |
| ------------------------------------------------------------------- |
| Mon Mar 11 12:30:53 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com> |
| |
| - Update to 1.1.4: |
| * Various bugfixes in 1.1 branch |
| |
| ------------------------------------------------------------------- |
| Wed Feb 20 11:19:16 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com> |
| |
| - Enable tests and switch to multibuild |
| |
| ------------------------------------------------------------------- |
| Sat Feb 2 17:09:28 UTC 2019 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 1.1.1: |
| * Array |
| + Add support for cupy.einsum (:pr:`4402`) Johnnie Gray |
| + Provide byte size in chunks keyword (:pr:`4434`) Adam Beberg |
| + Raise more informative error for histogram bins and range |
| (:pr:`4430`) James Bourbeau |
| * DataFrame |
| + Lazily register more cudf functions and move to backends file |
| (:pr:`4396`) Matthew Rocklin |
| + Fix ORC tests for pyarrow 0.12.0 (:pr:`4413`) Jim Crist |
| + rearrange_by_column: ensure that shuffle arg defaults to 'disk' |
| if it's None in dask.config (:pr:`4414`) George Sakkis |
| + Implement filters for _read_pyarrow (:pr:`4415`) George Sakkis |
| + Avoid checking against types in is_dataframe_like (:pr:`4418`) |
| Matthew Rocklin |
| + Pass username as 'user' when using pyarrow (:pr:`4438`) Roma |
| Sokolov |
| * Delayed |
| + Fix DelayedAttr return value (:pr:`4440`) Matthew Rocklin |
| * Documentation |
| + Use SVG for pipeline graphic (:pr:`4406`) John A Kirkham |
| + Add doctest-modules to py.test documentation (:pr:`4427`) Daniel |
| Severo |
| * Core |
| + Work around psutil 5.5.0 not allowing pickling Process objects |
| Dimplexion |
| |
| ------------------------------------------------------------------- |
| Sun Jan 20 04:50:39 UTC 2019 - Arun Persaud <arun@gmx.de> |
| |
| - specfile: |
| * update copyright year |
| |
| - update to version 1.1.0: |
| * Array |
| + Fix the average function when there is a masked array |
| (:pr:`4236`) Damien Garaud |
| + Add allow_unknown_chunksizes to hstack and vstack (:pr:`4287`) |
| Paul Vecchio |
| + Fix tensordot for 27+ dimensions (:pr:`4304`) Johnnie Gray |
| + Fixed block_info with axes. (:pr:`4301`) Tom Augspurger |
| + Use safe_wraps for matmul (:pr:`4346`) Mark Harfouche |
| + Use chunks="auto" in array creation routines (:pr:`4354`) |
| Matthew Rocklin |
| + Fix np.matmul in dask.array.Array.__array_ufunc__ (:pr:`4363`) |
| Stephan Hoyer |
| + COMPAT: Re-enable multifield copy->view change (:pr:`4357`) |
| Diane Trout |
| + Calling np.dtype on a delayed object works (:pr:`4387`) Jim |
| Crist |
| + Rework normalize_array for numpy data (:pr:`4312`) Marco Neumann |
| * DataFrame |
| + Add fill_value support for series comparisons (:pr:`4250`) James |
| Bourbeau |
| + Add schema name in read_sql_table for empty tables (:pr:`4268`) |
| Mina Farid |
| + Adjust check for bad chunks in map_blocks (:pr:`4308`) Tom |
| Augspurger |
| + Add dask.dataframe.read_fwf (:pr:`4316`) @slnguyen |
| + Use atop fusion in dask dataframe (:pr:`4229`) Matthew Rocklin |
| + Use parallel_types() in from_pandas (:pr:`4331`) Matthew |
| Rocklin |
| + Change DataFrame._repr_data to method (:pr:`4330`) Matthew |
| Rocklin |
| + Install pyarrow fastparquet for Appveyor (:pr:`4338`) Gábor |
| Lipták |
| + Remove explicit pandas checks and provide cudf lazy registration |
| (:pr:`4359`) Matthew Rocklin |
| + Replace isinstance(..., pandas) with is_dataframe_like |
| (:pr:`4375`) Matthew Rocklin |
| + ENH: Support 3rd-party ExtensionArrays (:pr:`4379`) Tom |
| Augspurger |
| + Pandas 0.24.0 compat (:pr:`4374`) Tom Augspurger |
| * Documentation |
| + Fix link to 'map_blocks' function in array api docs (:pr:`4258`) |
| David Hoese |
| + Add a paragraph on Dask-Yarn in the cloud docs (:pr:`4260`) Jim |
| Crist |
| + Copy edit documentation (:pr:`4267`), (:pr:`4263`), (:pr:`4262`), |
| (:pr:`4277`), (:pr:`4271`), (:pr:`4279`), (:pr:`4265`), |
| (:pr:`4295`), (:pr:`4293`), (:pr:`4296`), (:pr:`4302`), |
| (:pr:`4306`), (:pr:`4318`), (:pr:`4314`), (:pr:`4309`), |
| (:pr:`4317`), (:pr:`4326`), (:pr:`4325`), (:pr:`4322`), |
| (:pr:`4332`), (:pr:`4333`) Miguel Farrajota |
| + Fix typo in code example (:pr:`4272`) Daniel Li |
| + Doc: Update array-api.rst (:pr:`4259`) (:pr:`4282`) Prabakaran |
| Kumaresshan |
| + Update hpc doc (:pr:`4266`) Guillaume Eynard-Bontemps |
| + Doc: Replace from_avro with read_avro in documents (:pr:`4313`) |
| Prabakaran Kumaresshan |
| + Remove reference to "get" scheduler functions in docs |
| (:pr:`4350`) Matthew Rocklin |
| + Fix typo in docstring (:pr:`4376`) Daniel Saxton |
| + Added documentation for dask.dataframe.merge (:pr:`4382`) |
| Jendrik Jördening |
| * Core |
| + Avoid recursion in dask.core.get (:pr:`4219`) Matthew Rocklin |
| + Remove verbose flag from pytest setup.cfg (:pr:`4281`) Matthew |
| Rocklin |
| + Support Pytest 4.0 by specifying marks explicitly (:pr:`4280`) |
| Takahiro Kojima |
| + Add High Level Graphs (:pr:`4092`) Matthew Rocklin |
| + Fix SerializableLock locked and acquire methods (:pr:`4294`) |
| Stephan Hoyer |
| + Pin boto3 to earlier version in tests to avoid moto conflict |
| (:pr:`4276`) Martin Durant |
| + Treat None as missing in config when updating (:pr:`4324`) |
| Matthew Rocklin |
| + Update Appveyor to Python 3.6 (:pr:`4337`) Gábor Lipták |
| + Use parse_bytes more liberally in dask.dataframe/bytes/bag |
| (:pr:`4339`) Matthew Rocklin |
| + Add a better error message when cloudpickle is missing |
| (:pr:`4342`) Mark Harfouche |
| + Support pool= keyword argument in threaded/multiprocessing get |
| functions (:pr:`4351`) Matthew Rocklin |
| + Allow updates from arbitrary Mappings in config.update, not only |
| dicts. (:pr:`4356`) Stuart Berg |
| + Move dask/array/top.py code to dask/blockwise.py (:pr:`4348`) |
| Matthew Rocklin |
| + Add has_parallel_type (:pr:`4395`) Matthew Rocklin |
| + CI: Update Appveyor (:pr:`4381`) Tom Augspurger |
| + Ignore non-readable config files (:pr:`4388`) Jim Crist |
| |
| ------------------------------------------------------------------- |
| Sat Dec 1 18:36:31 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 1.0.0: |
| * Array |
| + Add nancumsum/nancumprod unit tests (:pr:`4215`) Guido Imperiale |
| * DataFrame |
| + Add index to to_dask_dataframe docstring (:pr:`4232`) James |
| Bourbeau |
| + Test and fix when appending categoricals with fastparquet |
| (:pr:`4245`) Martin Durant |
| + Don't reread metadata when passing ParquetFile to read_parquet |
| (:pr:`4247`) Martin Durant |
| * Documentation |
| + Copy edit documentation (:pr:`4222`) (:pr:`4224`) (:pr:`4228`) |
| (:pr:`4231`) (:pr:`4230`) (:pr:`4234`) (:pr:`4235`) (:pr:`4254`) |
| Miguel Farrajota |
| + Updated doc for the new scheduler keyword (:pr:`4251`) @milesial |
| * Core |
| + Avoid a few warnings (:pr:`4223`) Matthew Rocklin |
| + Remove dask.store module (:pr:`4221`) Matthew Rocklin |
| + Remove AUTHORS.md Jim Crist |
| |
| ------------------------------------------------------------------- |
| Thu Nov 22 22:46:17 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.20.2: |
| * Array |
| + Avoid fusing dependencies of atop reductions (:pr:`4207`) |
| Matthew Rocklin |
| * Dataframe |
| + Improve memory footprint for dataframe correlation (:pr:`4193`) |
| Damien Garaud |
| + Add empty DataFrame check to boundary_slice (:pr:`4212`) James |
| Bourbeau |
| * Documentation |
| + Copy edit documentation (:pr:`4197`) (:pr:`4204`) (:pr:`4198`) |
| (:pr:`4199`) (:pr:`4200`) (:pr:`4202`) (:pr:`4209`) Miguel |
| Farrajota |
| + Add stats module namespace (:pr:`4206`) James Bourbeau |
| + Fix link in dataframe documentation (:pr:`4208`) James Bourbeau |
| |
| ------------------------------------------------------------------- |
| Mon Nov 12 05:54:54 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.20.1: |
| * Array |
| + Only allocate the result space in wrapped_pad_func (:pr:`4153`) |
| John A Kirkham |
| + Generalize expand_pad_width to expand_pad_value (:pr:`4150`) |
| John A Kirkham |
| + Test da.pad with 2D linear_ramp case (:pr:`4162`) John A Kirkham |
| + Fix import for broadcast_to. (:pr:`4168`) samc0de |
| + Rewrite Dask Array's pad to add only new chunks (:pr:`4152`) |
| John A Kirkham |
| + Validate index inputs to atop (:pr:`4182`) Matthew Rocklin |
| * Core |
| + Dask.config set and get normalize underscores and hyphens |
| (:pr:`4143`) James Bourbeau |
| + Only subs on core collections, not subclasses (:pr:`4159`) |
| Matthew Rocklin |
| + Add block_size=0 option to HTTPFileSystem. (:pr:`4171`) Martin |
| Durant |
| + Add traverse support for dataclasses (:pr:`4165`) Armin Berres |
| + Avoid optimization on sharedicts without dependencies |
| (:pr:`4181`) Matthew Rocklin |
| + Update the pytest version for TravisCI (:pr:`4189`) Damien |
| Garaud |
| + Use key_split rather than funcname in visualize names |
| (:pr:`4160`) Matthew Rocklin |
| * Dataframe |
| + Add fix for DataFrame.__setitem__ for index (:pr:`4151`) |
| Anderson Banihirwe |
| + Fix column choice when passing list of files to fastparquet |
| (:pr:`4174`) Martin Durant |
| + Pass engine_kwargs from read_sql_table to sqlalchemy |
| (:pr:`4187`) Damien Garaud |
| * Documentation |
| + Fix documentation in Delayed best practices example that |
| returned an empty list (:pr:`4147`) Jonathan Fraine |
| + Copy edit documentation (:pr:`4164`) (:pr:`4175`) (:pr:`4185`) |
| (:pr:`4192`) (:pr:`4191`) (:pr:`4190`) (:pr:`4180`) Miguel |
| Farrajota |
| + Fix typo in docstring (:pr:`4183`) Carlos Valiente |
| |
| ------------------------------------------------------------------- |
| Tue Oct 30 03:04:38 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.20.0: |
| * Array |
| + Fuse Atop operations (:pr:`3998`), (:pr:`4081`) Matthew Rocklin |
| + Support da.asanyarray on dask dataframes (:pr:`4080`) Matthew |
| Rocklin |
| + Remove unnecessary endianness check in datetime test |
| (:pr:`4113`) Elliott Sales de Andrade |
| + Set name=False in array foo_like functions (:pr:`4116`) Matthew |
| Rocklin |
| + Remove dask.array.ghost module (:pr:`4121`) Matthew Rocklin |
| + Fix use of getargspec in dask array (:pr:`4125`) Stephan Hoyer |
| + Adds dask.array.invert (:pr:`4127`), (:pr:`4131`) Anderson |
| Banihirwe |
| + Raise informative error on arg-reduction on unknown chunksize |
| (:pr:`4128`), (:pr:`4135`) Matthew Rocklin |
| + Normalize reversed slices in dask array (:pr:`4126`) Matthew |
| Rocklin |
| * Bag |
| + Add bag.to_avro (:pr:`4076`) Martin Durant |
| * Core |
| + Pull num_workers from config.get (:pr:`4086`), (:pr:`4093`) |
| James Bourbeau |
| + Fix invalid escape sequences with raw strings (:pr:`4112`) |
| Elliott Sales de Andrade |
| + Raise an error on the use of the get= keyword and set_options |
| (:pr:`4077`) Matthew Rocklin |
| + Add import for Azure DataLake storage, and add docs (:pr:`4132`) |
| Martin Durant |
| + Avoid collections.Mapping/Sequence (:pr:`4138`) Matthew Rocklin |
| * Dataframe |
| + Include index keyword in to_dask_dataframe (:pr:`4071`) Matthew |
| Rocklin |
| + add support for duplicate column names (:pr:`4087`) Jan Koch |
| + Implement min_count for the DataFrame methods sum and prod |
| (:pr:`4090`) Bart Broere |
| + Remove pandas warnings in concat (:pr:`4095`) Matthew Rocklin |
| + DataFrame.to_csv header option to only output headers in the |
| first chunk (:pr:`3909`) Rahul Vaidya |
| + Remove Series.to_parquet (:pr:`4104`) Justin Dennison |
| + Avoid warnings and deprecated pandas methods (:pr:`4115`) |
| Matthew Rocklin |
| + Swap 'old' and 'previous' when reporting append error |
| (:pr:`4130`) Martin Durant |
| * Documentation |
| + Copy edit documentation (:pr:`4073`), (:pr:`4074`), |
| (:pr:`4094`), (:pr:`4097`), (:pr:`4107`), (:pr:`4124`), |
| (:pr:`4133`), (:pr:`4139`) Miguel Farrajota |
| + Fix typo in code example (:pr:`4089`) Antonino Ingargiola |
| + Add pycon 2018 presentation (:pr:`4102`) Javad |
| + Quick description for gcsfs (:pr:`4109`) Martin Durant |
| + Fixed typo in docstrings of read_sql_table method (:pr:`4114`) |
| TakaakiFuruse |
| + Make target directories in redirects if they don't exist |
| (:pr:`4136`) Matthew Rocklin |
| |
| ------------------------------------------------------------------- |
| Wed Oct 10 01:49:52 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.19.4: |
| * Array |
| + Implement apply_gufunc(..., axes=..., keepdims=...) (:pr:`3985`) |
| Markus Gonser |
| * Bag |
| + Fix typo in datasets.make_people (:pr:`4069`) Matthew Rocklin |
| * Dataframe |
| + Added percentiles options for dask.dataframe.describe method |
| (:pr:`4067`) Zhenqing Li |
| + Add DataFrame.partitions accessor similar to Array.blocks |
| (:pr:`4066`) Matthew Rocklin |
| * Core |
| + Pass get functions and Clients through scheduler keyword |
| (:pr:`4062`) Matthew Rocklin |
| * Documentation |
| + Fix Typo on hpc example. (missing = in kwarg). (:pr:`4068`) |
| Matthias Bussonier |
| + Extensive copy-editing: (:pr:`4065`), (:pr:`4064`), (:pr:`4063`) |
| Miguel Farrajota |
| |
| ------------------------------------------------------------------- |
| Mon Oct 8 15:01:22 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.19.3: |
| * Array |
| + Make da.RandomState extensible to other modules (:pr:`4041`) |
| Matthew Rocklin |
| + Support unknown dims in ravel no-op case (:pr:`4055`) Jim Crist |
| + Add basic infrastructure for cupy (:pr:`4019`) Matthew Rocklin |
| + Avoid asarray and lock arguments for from_array(getitem) |
| (:pr:`4044`) Matthew Rocklin |
| + Move local imports in corrcoef to global imports (:pr:`4030`) |
| John A Kirkham |
| + Move local indices import to global import (:pr:`4029`) John A |
| Kirkham |
| + Fix-up Dask Array's fromfunction w.r.t. dtype and kwargs |
| (:pr:`4028`) John A Kirkham |
| + Don't use dummy expansion for trim_internal in overlapped |
| (:pr:`3964`) Mark Harfouche |
| + Add unravel_index (:pr:`3958`) John A Kirkham |
| * Bag |
| + Sort result in Bag.frequencies (:pr:`4033`) Matthew Rocklin |
| + Add support for npartitions=1 edge case in groupby (:pr:`4050`) |
| James Bourbeau |
| + Add new random dataset for people (:pr:`4018`) Matthew Rocklin |
| + Improve performance of bag.read_text on small files (:pr:`4013`) |
| Eric Wolak |
| + Add bag.read_avro (:pr:`4000`) (:pr:`4007`) Martin Durant |
| * Dataframe |
| + Added an index parameter to |
| :meth:`dask.dataframe.from_dask_array` for creating a dask |
| DataFrame from a dask Array with a given index. (:pr:`3991`) Tom |
| Augspurger |
| + Improve sub-classability of dask dataframe (:pr:`4015`) Matthew |
| Rocklin |
| + Fix failing hdfs test [test-hdfs] (:pr:`4046`) Jim Crist |
| + fuse_subgraphs works without normal fuse (:pr:`4042`) Jim Crist |
| + Make path for reading many parquet files without prescan |
| (:pr:`3978`) Martin Durant |
| + Index in dd.from_dask_array (:pr:`3991`) Tom Augspurger |
| + Making skiprows accept lists (:pr:`3975`) Julia Signell |
| + Fail early in fastparquet read for nonexistent column |
| (:pr:`3989`) Martin Durant |
| * Core |
| + Add support for npartitions=1 edge case in groupby (:pr:`4050`) |
| James Bourbeau |
| + Automatically wrap large arguments with dask.delayed in |
| map_blocks/partitions (:pr:`4002`) Matthew Rocklin |
| + Fuse linear chains of subgraphs (:pr:`3979`) Jim Crist |
| + Make multiprocessing context configurable (:pr:`3763`) Itamar |
| Turner-Trauring |
| * Documentation |
| + Extensive copy-editing (:pr:`4049`), (:pr:`4034`), (:pr:`4031`), |
| (:pr:`4020`), (:pr:`4021`), (:pr:`4022`), (:pr:`4023`), |
| (:pr:`4016`), (:pr:`4017`), (:pr:`4010`), (:pr:`3997`), |
| (:pr:`3996`) Miguel Farrajota |
| + Update shuffle method selection docs [skip ci] (:pr:`4048`) |
| James Bourbeau |
| + Remove docs/source/examples, point to examples.dask.org |
| (:pr:`4014`) Matthew Rocklin |
| + Replace readthedocs links with dask.org (:pr:`4008`) Matthew |
| Rocklin |
| + Updates DataFrame.to_hdf docstring for returned values [skip ci] |
| (:pr:`3992`) James Bourbeau |
| |
| ------------------------------------------------------------------- |
| Mon Sep 17 14:54:42 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.19.2: |
| * Array |
| + apply_gufunc implements automatic inference of function output |
| dtypes (:pr:`3936`) Markus Gonser |
| + Fix array histogram range error when array has nans (#3980) |
| James Bourbeau |
| + Issue 3937 follow up, int type checks. (#3956) Yu Feng |
| + from_array: add @martindurant's explanation of how hashing is |
| done for an array. ( |
| + Support gradient with coordinate ( |
| * Core |
| + Fix use of has_keyword with partial in Python 2.7 ( |
| Harfouche |
| + Set pyarrow as default for HDFS ( |
| * Documentation |
| + Use dask_sphinx_theme ( |
| + Use JupyterLab in Binder links from main page Matthew Rocklin |
| + DOC: fixed sphinx syntax ( |
| |
| ------------------------------------------------------------------- |
| Sat Sep 8 04:33:17 UTC 2018 - Arun Persaud <arun@gmx.de> |
| |
| - update to version 0.19.1: |
| * Array |
| + Don't enforce dtype if result has no dtype (:pr:`3928`) Matthew |
| Rocklin |
| + Fix NumPy issubtype deprecation warning (:pr:`3939`) Bruce Merry |
| + Fix arg reduction tokens to be unique with different arguments |
| (:pr:`3955`) Tobias de Jong |
| + Coerce numpy integers to ints in slicing code (:pr:`3944`) Yu |
| Feng |
| + Linalg.norm ndim along axis partial fix (:pr:`3933`) Tobias de |
| Jong |
| * Dataframe |
| + Deterministic DataFrame.set_index (:pr:`3867`) George Sakkis |
| + Fix divisions in read_parquet when dealing with filters #3831 |
| #3930 (:pr:`3923`) (:pr:`3931`) @andrethrill |
| + Fixing returning type in categorical.as_known (:pr:`3888`) |
| Sriharsha Hatwar |
| + Fix DataFrame.assign for callables (:pr:`3919`) Tom Augspurger |
| + Include partitions with no width in repartition (:pr:`3941`) |
| Matthew Rocklin |
| + Don't constrict stage/k dtype in dataframe shuffle (:pr:`3942`) |
| Matthew Rocklin |
| * Documentation |
| + DOC: Add hint on how to render task graphs horizontally |
| (:pr:`3922`) Uwe Korn |
| + Add try-now button to main landing page (:pr:`3924`) Matthew |
| Rocklin |
| |
| ------------------------------------------------------------------- |
| Sun Sep 2 17:00:59 UTC 2018 - arun@gmx.de |
| |
| - specfile: |
| * remove devel from noarch |
| |
| - update to version 0.19.0: |
| * Array |
| + Fix argtopk split_every bug (:pr:`3810`) Guido Imperiale |
| + Ensure result computing dask.array.isnull() always gives a |
| numpy array (:pr:`3825`) Stephan Hoyer |
| + Support concatenate for scipy.sparse in dask array (:pr:`3836`) |
| Matthew Rocklin |
| + Fix argtopk on 32-bit systems. (:pr:`3823`) Elliott Sales de |
| Andrade |
| + Normalize keys in rechunk (:pr:`3820`) Matthew Rocklin |
| + Allow shape of dask.array to be a numpy array (:pr:`3844`) Mark |
| Harfouche |
| + Fix numpy deprecation warning on tuple indexing (:pr:`3851`) |
| Tobias de Jong |
| + Rename ghost module to overlap (:pr:`3830`) Robert Sare |
| + Re-add the ghost import to da __init__ (:pr:`3861`) Jim Crist |
| + Ensure copy preserves masked arrays (:pr:`3852`) Tobias de Jong |
| * DataFrame |
| + Added dtype and sparse keywords to |
| :func:`dask.dataframe.get_dummies` (:pr:`3792`) Tom Augspurger |
| + Added :meth:`dask.dataframe.to_dask_array` for converting a Dask |
| Series or DataFrame to a Dask Array, possibly with known chunk |
| sizes (:pr:`3884`) Tom Augspurger |
| + Changed the behavior for :meth:`dask.array.asarray` for dask |
| dataframe and series inputs. Previously, the series was eagerly |
| converted to an in-memory NumPy array before creating a dask |
| array with known chunk sizes. This caused unexpectedly high |
| memory usage. Now, no intermediate NumPy array is created, and a |
| Dask array with unknown chunk sizes is returned (:pr:`3884`) Tom |
| Augspurger |
| + DataFrame.iloc (:pr:`3805`) Tom Augspurger |
| + When reading multiple paths, expand globs. (:pr:`3828`) Irina |
| Truong |
| + Added index column name after resample (:pr:`3833`) Eric |
| Bonfadini |
| + Add (lazy) shape property to dataframe and series (:pr:`3212`) |
| Henrique Ribeiro |
| + Fix failing hdfs test [test-hdfs] (:pr:`3858`) Jim Crist |
| + Fixes for pyarrow 0.10.0 release (:pr:`3860`) Jim Crist |
| + Rename to_csv keys for diagnostics (:pr:`3890`) Matthew Rocklin |
| + Match pandas warnings for concat sort (:pr:`3897`) Tom |
| Augspurger |
| + Include filename in read_csv (:pr:`3908`) Julia Signell |
| * Core |
| + Better error message on import when missing common dependencies |
| (:pr:`3771`) Danilo Horta |
| + Drop Python 3.4 support (:pr:`3840`) Jim Crist |
| + Remove expired deprecation warnings (:pr:`3841`) Jim Crist |
| + Add DASK_ROOT_CONFIG environment variable (:pr:`3849`) Joe |
| Hamman |
| + Don't cull in local scheduler, do cull in delayed (:pr:`3856`) |
| Jim Crist |
| + Increase conda download retries (:pr:`3857`) Jim Crist |
| + Add python_requires and Trove classifiers (:pr:`3855`) @hugovk |
| + Fix collections.abc deprecation warnings in Python 3.7.0 |
| (:pr:`3876`) Jan Margeta |
| + Allow dot jpeg to xfail in visualize tests (:pr:`3896`) Matthew |
| Rocklin |
| + Add Python 3.7 to travis.yml (:pr:`3894`) Matthew Rocklin |
| + Add expand_environment_variables to dask.config (:pr:`3893`) |
| Joe Hamman |
| * Docs |
| + Fix typo in import statement of diagnostics (:pr:`3826`) John |
| Mrziglod |
| + Add link to YARN docs (:pr:`3838`) Jim Crist |
| + fix of minor typos in landing page index.html (:pr:`3746`) |
| Christoph Moehl |
| + Update delayed-custom.rst (:pr:`3850`) Anderson Banihirwe |
| + DOC: clarify delayed docstring (:pr:`3709`) Scott Sievert |
| + Add new presentations (:pr:`3880`) @javad94 |
| + Add dask array normalize_chunks to documentation (:pr:`3878`) |
| Daniel Rothenberg |
| + Docs: Fix link to snakeviz (:pr:`3900`) Hans Moritz Günther |
| + Add missing ` to docstring (:pr:`3915`) @rtobar |
| |
| - changes from version 0.18.2: |
| * Array |
| + Reimplemented argtopk to make it release the GIL (:pr:`3610`) |
| Guido Imperiale |
| + Don't overlap on non-overlapped dimensions in map_overlap |
| (:pr:`3653`) Matthew Rocklin |
| + Fix linalg.tsqr for dimensions of uncertain length (:pr:`3662`) |
| Jeremy Chen |
| + Break apart uneven array-of-int slicing to separate chunks |
| (:pr:`3648`) Matthew Rocklin |
| + Align auto chunks to provided chunks, rather than shape |
| (:pr:`3679`) Matthew Rocklin |
| + Adds endpoint and retstep support for linspace (:pr:`3675`) |
| James Bourbeau |
| + Implement .blocks accessor (:pr:`3689`) Matthew Rocklin |
| + Add block_info keyword to map_blocks functions (:pr:`3686`) |
| Matthew Rocklin |
| + Slice by dask array of ints (:pr:`3407`) Guido Imperiale |
| + Support dtype in arange (:pr:`3722`) Guido Imperiale |
| + Fix argtopk with uneven chunks (:pr:`3720`) Guido Imperiale |
| + Raise error when replace=False in da.choice (:pr:`3765`) James |
| Bourbeau |
| + Update chunks in Array.__setitem__ (:pr:`3767`) Itamar |
| Turner-Trauring |
| + Add a chunksize convenience property (:pr:`3777`) Jacob |
| Tomlinson |
| + Fix and simplify array slicing behavior when step < 0 |
| (:pr:`3702`) Ziyao Wei |
| + Ensure to_zarr with return_stored True returns a Dask Array |
| (:pr:`3786`) John A Kirkham |
| * Bag |
| + Add last_endline optional parameter in to_textfiles (:pr:`3745`) |
| George Sakkis |
| * Dataframe |
| + Add aggregate function for rolling objects (:pr:`3772`) Gerome |
| Pistre |
| + Properly tokenize cumulative groupby aggregations (:pr:`3799`) |
| Cloves Almeida |
| * Delayed |
| + Add the @ operator to the delayed objects (:pr:`3691`) Mark |
| Harfouche |
| + Add delayed best practices to documentation (:pr:`3737`) Matthew |
| Rocklin |
| + Fix @delayed decorator for methods and add tests (:pr:`3757`) |
| Ziyao Wei |
| * Core |
| + Fix extra progressbar (:pr:`3669`) Mike Neish |
| + Allow tasks back onto ordering stack if they have one dependency |
| (:pr:`3652`) Matthew Rocklin |
| + Prefer end-tasks with low numbers of dependencies when ordering |
| (:pr:`3588`) Tom Augspurger |
| + Add assert_eq to top-level modules (:pr:`3726`) Matthew Rocklin |
| + Test that dask collections can hold scipy.sparse arrays |
| (:pr:`3738`) Matthew Rocklin |
| + Fix setup of lz4 decompression functions (:pr:`3782`) Elliott |
| Sales de Andrade |
| + Add datasets module (:pr:`3780`) Matthew Rocklin |
| |
| ------------------------------------------------------------------- |
| Sun Jun 24 01:07:09 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.18.1: |
| * Array |
| + from_array now supports scalar types and nested lists/tuples in |
| input, just like all numpy functions do. It also produces a |
| simpler graph when the input is a plain ndarray (:pr:`3556`) |
| Guido Imperiale |
| + Fix slicing of big arrays due to cumsum dtype bug (:pr:`3620`) |
| Marco Rossi |
| + Add Dask Array implementation of pad (:pr:`3578`) John A Kirkham |
| + Fix array random API examples (:pr:`3625`) James Bourbeau |
| + Add average function to dask array (:pr:`3640`) James Bourbeau |
| + Tokenize ghost_internal with axes (:pr:`3643`) Matthew Rocklin |
| + from_array: special handling for ndarray, list, and scalar types |
| (:pr:`3568`) Guido Imperiale |
| + Add outer for Dask Arrays (:pr:`3658`) John A Kirkham |
| * DataFrame |
| + Add Index.to_series method (:pr:`3613`) Henrique Ribeiro |
| + Fix missing partition columns in pyarrow-parquet (:pr:`3636`) |
| Martin Durant |
| * Core |
| + Minor tweaks to CI (:pr:`3629`) Guido Imperiale |
| + Add back dask.utils.effective_get (:pr:`3642`) Matthew Rocklin |
| + DASK_CONFIG dictates config write location (:pr:`3621`) Jim |
| Crist |
| + Replace 'collections' key in unpack_collections with unique key |
| (:pr:`3632`) Yu Feng |
| + Avoid deepcopy in dask.config.set (:pr:`3649`) Matthew Rocklin |
| |
| - changes from version 0.18.0: |
| * Array |
| + Add to/read_zarr for Zarr-format datasets and arrays |
| (:pr:`3460`) Martin Durant |
| + Experimental addition of generalized ufunc support, |
| apply_gufunc, gufunc, and as_gufunc (:pr:`3109`) (:pr:`3526`) |
| (:pr:`3539`) Markus Gonser |
| + Avoid unnecessary rechunking tasks (:pr:`3529`) Matthew Rocklin |
| + Compute dtypes at runtime for fft (:pr:`3511`) Matthew Rocklin |
| + Generate UUIDs for all da.store operations (:pr:`3540`) Martin |
| Durant |
| + Correct internal dimension of Dask's SVD (:pr:`3517`) John A |
| Kirkham |
| + BUG: do not raise IndexError for identity slice in array.vindex |
| (:pr:`3559`) Scott Sievert |
| + Adds isneginf and isposinf (:pr:`3581`) John A Kirkham |
| + Drop Dask Array's learn module (:pr:`3580`) John A Kirkham |
| + added sfqr (short-and-fat) as a counterpart to tsqr… |
| (:pr:`3575`) Jeremy Chen |
| + Allow 0-width chunks in dask.array.rechunk (:pr:`3591`) Marc |
| Pfister |
| + Document Dask Array's nan_to_num in public API (:pr:`3599`) John |
| A Kirkham |
| + Show block example (:pr:`3601`) John A Kirkham |
| + Replace token= keyword with name= in map_blocks (:pr:`3597`) |
| Matthew Rocklin |
| + Disable locking in to_zarr (needed for using to_zarr in a |
| distributed context) (:pr:`3607`) John A Kirkham |
| + Support Zarr Arrays in to_zarr/from_zarr (:pr:`3561`) John A |
| Kirkham |
| + Added recursion to array/linalg/tsqr to better manage the single |
| core bottleneck (:pr:`3586`) Jeremy Chen |
| * Dataframe |
| + Add to/read_json (:pr:`3494`) Martin Durant |
| + Adds index to unsupported arguments for DataFrame.rename method |
| (:pr:`3522`) James Bourbeau |
| + Adds support to subset Dask DataFrame columns using |
| numpy.ndarray, pandas.Series, and pandas.Index objects |
| (:pr:`3536`) James Bourbeau |
| + Raise error if meta columns do not match dataframe (:pr:`3485`) |
| Christopher Ren |
| + Add index to unsupported argument for DataFrame.rename |
| (:pr:`3522`) James Bourbeau |
| + Adds support for subsetting DataFrames with pandas Index/Series |
| and numpy ndarrays (:pr:`3536`) James Bourbeau |
| + Dataframe sample method docstring fix (:pr:`3566`) James |
| Bourbeau |
| + fixes dd.read_json to infer file compression (:pr:`3594`) Matt |
| Lee |
| + Adds n to sample method (:pr:`3606`) James Bourbeau |
| + Add fastparquet ParquetFile object support (:pr:`3573`) |
| @andrethrill |
| * Bag |
| + Rename method= keyword to shuffle= in bag.groupby (:pr:`3470`) |
| Matthew Rocklin |
| * Core |
| + Replace get= keyword with scheduler= keyword (:pr:`3448`) |
| Matthew Rocklin |
| + Add centralized dask.config module to handle configuration for |
| all Dask subprojects (:pr:`3432`) (:pr:`3513`) (:pr:`3520`) |
| Matthew Rocklin |
| + Add dask-ssh CLI Options and Description. (:pr:`3476`) @beomi |
| + Read whole files fix regardless of header for HTTP (:pr:`3496`) |
| Martin Durant |
| + Adds synchronous scheduler syntax to debugging docs (:pr:`3509`) |
| James Bourbeau |
| + Replace dask.set_options with dask.config.set (:pr:`3502`) |
| Matthew Rocklin |
| + Update sphinx readthedocs-theme (:pr:`3516`) Matthew Rocklin |
| + Introduce "auto" value for normalize_chunks (:pr:`3507`) Matthew |
| Rocklin |
| + Fix check in configuration with env=None (:pr:`3562`) Simon |
| Perkins |
| + Update sizeof definitions (:pr:`3582`) Matthew Rocklin |
| + Remove --verbose flag from travis-ci (:pr:`3477`) Matthew |
| Rocklin |
| + Remove "da.random" from random array keys (:pr:`3604`) Matthew |
| Rocklin |
| |
| ------------------------------------------------------------------- |
| Mon May 21 03:57:53 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.17.5: |
| * Compatibility with pandas 0.23.0 (:pr:`3499`) Tom Augspurger |
| |
| ------------------------------------------------------------------- |
| Sun May 6 05:33:50 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.17.4: |
| * Dataframe |
| + Add support for indexing Dask DataFrames with string subclasses |
| (:pr:`3461`) James Bourbeau |
| + Allow using both sorted_index and chunksize in read_hdf |
| (:pr:`3463`) Pierre Bartet |
| + Pass filesystem to arrow piece reader (:pr:`3466`) Martin Durant |
| + Switches to using dask.compat string_types ( |
| Bourbeau |
| |
| - changes from version 0.17.3: |
| * Array |
| + Add einsum for Dask Arrays (:pr:`3412`) Simon Perkins |
| + Add piecewise for Dask Arrays (:pr:`3350`) John A Kirkham |
| + Fix handling of nan in broadcast_shapes (:pr:`3356`) John A |
| Kirkham |
| + Add isin for dask arrays (:pr:`3363`) Stephan Hoyer |
| + Overhauled topk for Dask Arrays: faster algorithm, particularly |
| for large k's; added support for multiple axes, recursive |
| aggregation, and an option to pick the bottom k elements |
| instead. (:pr:`3395`) Guido Imperiale |
| + The topk API has changed from topk(k, array) to the more |
| conventional topk(array, k). The legacy API still works but is |
| now deprecated. (:pr:`2965`) Guido Imperiale |
| + New function argtopk for Dask Arrays (:pr:`3396`) Guido |
| Imperiale |
| + Fix handling partial depth and boundary in map_overlap |
| (:pr:`3445`) John A Kirkham |
| + Add gradient for Dask Arrays (:pr:`3434`) John A Kirkham |
| * DataFrame |
| + Allow t as shorthand for table in to_hdf for pandas |
| compatibility (:pr:`3330`) Jörg Dietrich |
| + Added top level isna method for Dask DataFrames (:pr:`3294`) |
| Christopher Ren |
| + Fix selection on partition column on read_parquet for |
| engine="pyarrow" (:pr:`3207`) Uwe Korn |
| + Added DataFrame.squeeze method (:pr:`3366`) Christopher Ren |
| + Added infer_divisions option to read_parquet to specify whether |
| read engines should compute divisions (:pr:`3387`) Jon Mease |
| + Added support for inferring division for engine="pyarrow" |
| (:pr:`3387`) Jon Mease |
| + Provide more informative error message for meta= errors |
| (:pr:`3343`) Matthew Rocklin |
| + add orc reader (:pr:`3284`) Martin Durant |
| + Default compression for parquet now always Snappy, in line with |
| pandas (:pr:`3373`) Martin Durant |
| + Fixed bug in Dask DataFrame and Series comparisons with NumPy |
| scalars (:pr:`3436`) James Bourbeau |
| + Remove outdated requirement from repartition docstring |
| (:pr:`3440`) Jörg Dietrich |
| + Fixed bug in aggregation when only a Series is selected |
| (:pr:`3446`) Jörg Dietrich |
| + Add default values to make_timeseries (:pr:`3421`) Matthew |
| Rocklin |
| * Core |
| + Support traversing collections in persist, visualize, and |
| optimize (:pr:`3410`) Jim Crist |
| + Add scheduler= keyword to compute and persist. This replaces |
| common use of the get= keyword (:pr:`3448`) Matthew Rocklin |
| |
| ------------------------------------------------------------------- |
| Sat Mar 24 18:48:24 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.17.2: |
| * Array |
| + Add broadcast_arrays for Dask Arrays (:pr:`3217`) John A Kirkham |
| + Add bitwise_* ufuncs (:pr:`3219`) John A Kirkham |
| + Add optional axis argument to squeeze (:pr:`3261`) John A |
| Kirkham |
| + Validate inputs to atop (:pr:`3307`) Matthew Rocklin |
| + Avoid calls to astype in concatenate if all parts have the same |
| dtype (:pr:`3301`) Martin Durant |
| * DataFrame |
| + Fixed bug in shuffle due to aggressive truncation (:pr:`3201`) |
| Matthew Rocklin |
| + Support specifying categorical columns on read_parquet with |
| categories=[…] for engine="pyarrow" (:pr:`3177`) Uwe Korn |
| + Add dd.tseries.Resampler.agg (:pr:`3202`) Richard Postelnik |
| + Support operations that mix dataframes and arrays (:pr:`3230`) |
| Matthew Rocklin |
| + Support extra Scalar and Delayed args in |
| dd.groupby._Groupby.apply (:pr:`3256`) Gabriele Lanaro |
| * Bag |
| + Support joining against single-partitioned bags and delayed |
| objects (:pr:`3254`) Matthew Rocklin |
| * Core |
| + Fixed bug when using unexpected but hashable types for keys |
| (:pr:`3238`) Daniel Collins |
| + Fix bug in task ordering so that we break ties consistently with |
| the key name (:pr:`3271`) Matthew Rocklin |
| + Avoid sorting tasks in order when the number of tasks is very |
| large (:pr:`3298`) Matthew Rocklin |
| |
| ------------------------------------------------------------------- |
| Fri Mar 2 19:52:06 UTC 2018 - sebix+novell.com@sebix.at |
| |
| - correctly package bytecode |
| - use %license macro |
| |
| ------------------------------------------------------------------- |
| Fri Feb 23 03:52:52 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.17.1: |
| * Array |
| + Corrected dimension chunking in indices (:issue:`3166`, |
| :pr:`3167`) Simon Perkins |
| + Inline store_chunk calls for store's return_stored option |
| (:pr:`3153`) John A Kirkham |
| + Compatibility with struct dtypes for NumPy 1.14.1 release |
| (:pr:`3187`) Matthew Rocklin |
| * DataFrame |
| + Bugfix to allow column assignment of pandas |
| datetimes (:pr:`3164`) Max Epstein |
| * Core |
| + New file-system for HTTP(S), allowing direct loading from |
| specific URLs (:pr:`3160`) Martin Durant |
| + Fix bug when tokenizing partials with no keywords (:pr:`3191`) |
| Matthew Rocklin |
| + Use more recent LZ4 API (:pr:`3157`) Thrasibule |
| + Introduce output stream parameter for progress bar (:pr:`3185`) |
| Dieter Weber |
| |
| ------------------------------------------------------------------- |
| Sat Feb 10 17:26:43 UTC 2018 - arun@gmx.de |
| |
| - update to version 0.17.0: |
| * Array |
| + Added support for object-type arrays in nansum, nanmin, and |
| nanmax (:issue:`3133`) Keisuke Fujii |
| + Update error handling when len is called with empty chunks |
| (:issue:`3058`) Xander Johnson |
| + Fixes a metadata bug with store's return_stored option |
| (:pr:`3064`) John A Kirkham |
| + Fix a bug in optimization.fuse_slice to properly handle when |
| first input is None (:pr:`3076`) James Bourbeau |
| + Support arrays with unknown chunk sizes in percentile |
| (:pr:`3107`) Matthew Rocklin |
| + Tokenize scipy.sparse arrays and np.matrix (:pr:`3060`) Roman |
| Yurchak |
| * DataFrame |
| + Support month timedeltas in repartition(freq=...) (:pr:`3110`) |
| Matthew Rocklin |
| + Avoid mutation in dataframe groupby tests (:pr:`3118`) Matthew |
| Rocklin |
| + read_csv, read_table, and read_parquet accept iterables of paths |
| (:pr:`3124`) Jim Crist |
| + Deprecates the dd.to_delayed function in favor of the existing |
| method (:pr:`3126`) Jim Crist |
| + Return dask.arrays from df.map_partitions calls when the UDF |
| returns a numpy array (:pr:`3147`) Matthew Rocklin |
| + Change handling of columns and index in dd.read_parquet to be |
| more consistent, especially in handling of multi-indices |
| (:pr:`3149`) Jim Crist |
| + fastparquet append=True allowed to create new dataset |
| (:pr:`3097`) `Martin Durant`_ |
| + dtype rationalization for sql queries (:pr:`3100`) `Martin |
| Durant`_ |
| * Bag |
| + Document that bag.map_partitions may receive either a list or |
| generator. (:pr:`3150`) Nir |
| * Core |
| + Change default task ordering to prefer nodes with few dependents |
| and then many downstream dependencies (:pr:`3056`) Matthew |
| Rocklin |
| + Add color= option to visualize to color by task order |
| (:pr:`3057`) (:pr:`3122`) Matthew Rocklin |
| + Deprecate dask.bytes.open_text_files (:pr:`3077`) Jim Crist |
| + Remove short-circuit hdfs reads handling due to maintenance |
| costs. May be re-added in a more robust manner later |
| (:pr:`3079`) Jim Crist |
| + Add dask.base.optimize for optimizing multiple collections |
| without computing. (:pr:`3071`) Jim Crist |
| + Rename dask.optimize module to dask.optimization (:pr:`3071`) |
| Jim Crist |
| + Change task ordering to do a full traversal (:pr:`3066`) Matthew |
| Rocklin |
| + Adds an optimize_graph keyword to all to_delayed methods to |
| allow controlling whether optimizations occur on |
| conversion. (:pr:`3126`) Jim Crist |
| + Support using pyarrow for hdfs integration (:pr:`3123`) Jim |
| Crist |
| + Move HDFS integration and tests into dask repo (:pr:`3083`) Jim |
| Crist |
| + Remove write_bytes (:pr:`3116`) Jim Crist |
| |
| ------------------------------------------------------------------- |
| Thu Jan 11 23:56:36 UTC 2018 - arun@gmx.de |
| |
| - specfile: |
| * update copyright year |
| |
| - update to version 0.16.1: |
| * Array |
| + Fix handling of scalar percentile values in "percentile" |
| (:pr:`3021`) `James Bourbeau`_ |
| + Prevent "bool()" coercion from calling compute (:pr:`2958`) |
| `Albert DeFusco`_ |
| + Add "matmul" (:pr:`2904`) `John A Kirkham`_ |
| + Support N-D arrays with "matmul" (:pr:`2909`) `John A Kirkham`_ |
| + Add "vdot" (:pr:`2910`) `John A Kirkham`_ |
| + Explicit "chunks" argument for "broadcast_to" (:pr:`2943`) |
| `Stephan Hoyer`_ |
| + Add "meshgrid" (:pr:`2938`) `John A Kirkham`_ and (:pr:`3001`) |
| `Markus Gonser`_ |
| + Preserve singleton chunks in "fftshift"/"ifftshift" (:pr:`2733`) |
| `John A Kirkham`_ |
| + Fix handling of negative indexes in "vindex" and raise errors |
| for out of bounds indexes (:pr:`2967`) `Stephan Hoyer`_ |
| + Add "flip", "flipud", "fliplr" (:pr:`2954`) `John A Kirkham`_ |
| + Add "float_power" ufunc (:pr:`2962`) (:pr:`2969`) `John A |
| Kirkham`_ |
| + Compatibility for changes to structured arrays in the upcoming |
| NumPy 1.14 release (:pr:`2964`) `Tom Augspurger`_ |
| + Add "block" (:pr:`2650`) `John A Kirkham`_ |
| + Add "frompyfunc" (:pr:`3030`) `Jim Crist`_ |
| * DataFrame |
| + Fixed naming bug in cumulative aggregations (:issue:`3037`) |
| `Martijn Arts`_ |
| + Fixed "dd.read_csv" when "names" is given but "header" is not |
| set to "None" (:issue:`2976`) `Martijn Arts`_ |
| + Fixed "dd.read_csv" so that passing instances of |
| "CategoricalDtype" in "dtype" will result in known categoricals |
| (:pr:`2997`) `Tom Augspurger`_ |
| + Prevent "bool()" coercion from calling compute (:pr:`2958`) |
| `Albert DeFusco`_ |
| + "DataFrame.read_sql()" (:pr:`2928`) on an empty database table |
| returns an empty dask dataframe `Apostolos Vlachopoulos`_ |
| + Compatibility for reading Parquet files written by PyArrow 0.8.0 |
| (:pr:`2973`) `Tom Augspurger`_ |
| + Correctly handle the column name (`df.columns.name`) when |
| reading in "dd.read_parquet" (:pr:`2973`) `Tom Augspurger`_ |
| + Fixed "dd.concat" losing the index dtype when the data contained |
| a categorical (:issue:`2932`) `Tom Augspurger`_ |
| + Add "dd.Series.rename" (:pr:`3027`) `Jim Crist`_ |
| + "DataFrame.merge()" (:pr:`2960`) now supports merging on a |
| combination of columns and the index `Jon Mease`_ |
| + Removed the deprecated "dd.rolling*" methods, in preparation for |
| their removal in the next pandas release (:pr:`2995`) `Tom |
| Augspurger`_ |
| + Fix metadata inference bug in which single-partition series were |
| mistakenly special cased (:pr:`3035`) `Jim Crist`_ |
| + Add support for "Series.str.cat" (:pr:`3028`) `Jim Crist`_ |
| * Core |
| + Improve 32-bit compatibility (:pr:`2937`) `Matthew Rocklin`_ |
| + Change task prioritization to avoid upwards branching |
| (:pr:`3017`) `Matthew Rocklin`_ |
| |
| ------------------------------------------------------------------- |
| Sun Nov 19 05:11:59 UTC 2017 - arun@gmx.de |
| |
| - update to version 0.16.0: |
| * Fix install of fastparquet on travis (#2897) |
| * Fix port for bokeh dashboard (#2889) |
| * fix hdfs3 version |
| * Modify hdfs import to point to hdfs3 (#2894) |
| * Explicitly pass in pyarrow filesystem for parquet (#2881) |
| * COMPAT: Ensure lists for multiple groupby keys (#2892) |
| * Avoid list index error in repartition_freq (#2873) |
| * Finish moving `infer_storage_options` (#2886) |
| * Support arrow in `to_parquet`. Several other parquet |
| cleanups. (#2868) |
| * Bugfix: Filesystem object not passed to pyarrow reader (#2527) |
| * Fix py34 build |
| * Fixup s3 tests (#2875) |
| * Close resource profiler process on __exit__ (#2871) |
| * Add changelog for to_parquet changes. [ci skip] |
| * A few parquet cleanups (#2867) |
| * Fixed fillna with Series (#2810) |
| * Error nicely on parse dates failure in read_csv (#2863) |
| * Fix empty dataframe partitioning for numpy 1.10.4 (#2862) |
| * Test `unique`'s inverse mapping's shape (#2857) |
| * Move `thread_state` out of the top namespace (#2858) |
| * Explain unique's steps ( |
| * fix and test for issue |
| * Minor tweaks to `_unique_internal` optional result handling |
| ( |
| * Update dask interface during XArray integration ( |
| * Remove unnecessary map_partitions in aggregate ( |
| * Simplify `_unique_internal` ( |
| * Add more tests for read_parquet(engine='pyarrow') ( |
| * Do not raise exception when calling set_index on empty dataframe |
| |
| * Test unique on more data ( |
| * Do not except on set_index on text column with empty partitions |
| |
| * Compat for bokeh 0.12.10 ( |
| * Support `return_*` arguments with `unique` ( |
| * Fix installing of pandas dev ( |
| * Squash a few warnings in dask.array ( |
| * Array optimizations don't elide some getter calls (#2826) |
| * test against pandas rc (#2814) |
| * df.astype(categorical_dtype) -> known categoricals (#2835) |
| * Fix cloudpickle test (#2836) |
| * BUG: Quantile with missing data (#2791) |
| * API: remove dask.async (#2828) |
| * Adds comma to flake8 section in setup.cfg (#2817) |
| * Adds asarray and asanyarray to the dask.array public API (#2787) |
| * flake8 now checks bare excepts (#2816) |
| * CI: Update for new flake8 / pycodestyle (#2808) |
| * Fix concat series bug (#2800) |
| * Typo in the docstring of read_parquet's filters param ( |
| * Docs update ( |
| * minor doc changes in bag.core ( |
| * da.random.choice works with array args ( |
| * Support broadcasting 0-length dimensions ( |
| * ResourceProfiler plot works with single point ( |
| * Implement Dask Array's unique to be lazy (#2775) |
| * Dask Collection Interface |
| * Reduce test memory usage (#2782) |
| * Deprecate vnorm (#2773) |
| * add auto-import of gcsfs (#2776) |
| * Add allclose (#2771) |
| * Remove `random.different_seeds` from API docs (#2772) |
| * Follow-up for atleast_nd (#2765) |
| * Use get_worker().client.get if available (#2762) |
| * Link PR for "Allow tuples as sharedict keys" (#2766) |
| * Allow tuples as sharedict keys (#2763) |
| * update docs to use flatten vs concat (#2764) |
| * Add atleast_nd functions (#2760) |
| * Consolidate changelog for 0.15.4 (#2759) |
| * Add changelog template for future date (#2758) |
| |
| ------------------------------------------------------------------- |
| Mon Oct 30 06:16:22 UTC 2017 - arun@gmx.de |
| |
| - update to version 0.15.4: |
| * Drop s3fs requirement (#2750) |
| * Support -1 as an alias for dimension size in chunks (#2749) |
| * Handle zero dimension when rechunking (#2747) |
| * Pandas 0.21 compatibility (#2737) |
| * API: Add `.str` accessor for Categorical with object dtype (#2743) |
| * Fix install failures |
| * Reduce memory usage |
| * A few test cleanups |
| * Fix #2720 (#2729) |
| * Pass on file_scheme to fastparquet (#2714) |
| * Support indexing with np.int (#2719) |
| * Tree reduction support for dask.bag.Bag.foldby (#2710) |
| * Update link to IPython parallel docs (#2715) |
| * Call mkdir from correct namespace in array.to_npy_stack. (#2709) |
| * add int96 times to parquet writer (#2711) |
| |
| ------------------------------------------------------------------- |
| Sun Sep 24 21:28:49 UTC 2017 - arun@gmx.de |
| |
| - update to version 0.15.3: |
| * add .github/PULL_REQUEST_TEMPLATE.md file |
| * Make `y` optional in dask.array.learn (#2701) |
| * Add apply_over_axes (#2702) |
| * Use apply_along_axis name in Dask (#2704) |
| * Tweak apply_along_axis's pre-NumPy 1.13.0 error ( |
| * Add apply_along_axis ( |
| * Use travis conditional builds ( |
| * Skip days in daily_stock that have nan values ( |
| * TST: Have array assert_eq check scalars ( |
| * Add schema keyword to read_sql ( |
| * Only install pytest-runner if needed ( |
| * Remove resize tool from bokeh plots ( |
| * Add ptp ( |
| * Catch warning from numpy in subs ( |
| * Publish Series methods in dataframe api ( |
| * Fix norm keepdims ( |
| * Dask array slicing with boolean arrays ( |
| * repartition works with mixed categoricals ( |
| * Merge pull request |
| * Fix for parquet file schemes |
| * Optional axis argument for cumulative functions ( |
| * Remove partial_by_order |
| * Support literals in atop |
| * [ci skip] Add flake8 note in developer doc page ( |
| * Add filenames return for ddf.to_csv and bag.to_textfiles as they |
| both… ( |
| * CLN: Remove redundant code, fix typos ( |
| * [docs] company name change from Continuum to Anaconda ( |
| * Fix what happens when combining partition_on and append in |
| to_parquet ( |
| * WIP: Add user defined aggregations ( |
| * [docs] new cheatsheet ( |
| * Masked arrays ( |
| * Indexing with an unsigned integer array ( |
| * ENH: Allow the groupby by param to handle columns and index levels |
| ( |
| * update copyright date ( |
| * python setup.py test runs py.test ( |
| * Avoid using operator.itemgetter in dask.dataframe ( |
| * Add `*_like` array creation functions ( |
| * Consistent slicing names ( |
| * Replace Continuum Analytics with Anaconda Inc. ( |
| * Implement Series.str[index] ( |
| * Support complex data with vnorm ( |
| |
| - changes from version 0.15.2: |
| * BUG: setitem should update divisions ( |
| * Allow dataframe.loc with numpy array ( |
| * Add link to Stack Overflow's mcve docpage to support docs (#2612) |
| * Improve dtype inference and reflection (#2571) |
| * Add ediff1d (#2609) |
| * Optimize concatenate on singleton sequences (#2610) |
| * Add diff (#2607) |
| * Document norm in Dask Array API (#2605) |
| * Add norm (#2597) |
| * Don't check for memory leaks in distributed tests ( |
| * Include computed collection within sharedict in delayed ( |
| * Reorg array ( |
| * Remove `expand` parameter from df.str.split ( |
| * Normalize `meta` on call to `dd.from_delayed` ( |
| * Remove bare `except:` blocks and test that none exist. ( |
| * Adds choose method to dask.array.Array ( |
| * Generalize vindex in dask.array ( |
| * Clear `_cached_keys` on name change in dask.array ( |
| * Don't render None for unknown divisions (#2570) |
| * Add missing initialization to CacheProfiler (#2550) |
| * Add argwhere, *nonzero, where (cond) (#2539) |
| * Fix indices error message (#2565) |
| * Fix and secure some references (#2563) |
| * Allows for read_hdf to accept an iterable of files (#2547) |
| * Allow split on rechunk on first pass (#2560) |
| * Improvements to dask.array.where (#2549) |
| * Adds isin method to dask.dataframe.DataFrame (#2558) |
| * Support dask array conditional in compress (#2555) |
| * Clarify ResourceProfiler docstring [ci skip] (#2553) |
| * In compress, use Dask to expand condition array (#2545) |
| * Support compress with axis as None (#2541) |
| * df.idxmax/df.idxmin work with empty partitions (#2542) |
| * FIX typo in accumulate docstring (#2552) |
| * da.where works with non-bool condition (#2543) |
| * da.repeat works with negative axis (#2544) |
| * Check metadata in `dd.from_delayed` (#2534) |
| * TST: clean up test directories in shuffle (#2535) |
| * Do not attempt to compute divisions on empty dataframe. (#2529) |
| * Remove deprecated bag behavior (#2525) |
| * Updates read_hdf docstring (#2518) |
| * Add dd.to_timedelta (#2523) |
| * Better error message for read_csv (#2522) |
| * Remove spurious keys from map_overlap graph (#2520) |
| * Do not compare x.dim with None in array. (#1847) |
| * Support concat for categorical MultiIndex (#2514) |
| * Support for callables in df.assign (#2513) |
| |
| ------------------------------------------------------------------- |
| Thu May 4 22:24:37 UTC 2017 - toddrme2178@gmail.com |
| |
| - Implement single-spec version |
| - Update source URL. |
| - Split classes into own subpackages to lighten base dependencies. |
| - Update to version 0.15.1 |
| * Add storage_options to to_textfiles and to_csv (:pr:`2466`) |
| * Rechunk and simplify rfftfreq (:pr:`2473`), (:pr:`2475`) |
| * Better support ndarray subclasses (:pr:`2486`) |
| * Import star in dask.distributed (:pr:`2503`) |
| * Threadsafe cache handling with tokenization (:pr:`2511`) |
| - Update to version 0.15.0 |
| + Array |
| * Add dask.array.stats submodule (:pr:`2269`) |
| * Support ``ufunc.outer`` (:pr:`2345`) |
| * Optimize fancy indexing by reducing graph overhead (:pr:`2333`) (:pr:`2394`) |
| * Faster array tokenization using alternative hashes (:pr:`2377`) |
| * Added the matmul ``@`` operator (:pr:`2349`) |
| * Improved coverage of the ``numpy.fft`` module (:pr:`2320`) (:pr:`2322`) (:pr:`2327`) (:pr:`2323`) |
| * Support NumPy's ``__array_ufunc__`` protocol (:pr:`2438`) |
| + Bag |
| * Fix bug where reductions on bags with no partitions would fail (:pr:`2324`) |
| * Add broadcasting and variadic ``db.map`` top-level function. Also remove |
| auto-expansion of tuples as map arguments (:pr:`2339`) |
| * Rename ``Bag.concat`` to ``Bag.flatten`` (:pr:`2402`) |
| + DataFrame |
| * Parquet improvements (:pr:`2277`) (:pr:`2422`) |
| + Core |
| * Move dask.async module to dask.local (:pr:`2318`) |
| * Support callbacks with nested scheduler calls (:pr:`2397`) |
| * Support pathlib.Path objects as uris (:pr:`2310`) |
| - Update to version 0.14.3 |
| + DataFrame |
| * Pandas 0.20.0 support |
| - Update to version 0.14.2 |
| + Array |
| * Add da.indices (:pr:`2268`), da.tile (:pr:`2153`), da.roll (:pr:`2135`) |
| * Simultaneously support drop_axis and new_axis in da.map_blocks (:pr:`2264`) |
| * Rechunk and concatenate work with unknown chunksizes (:pr:`2235`) and (:pr:`2251`) |
| * Support non-numpy container arrays, notably sparse arrays (:pr:`2234`) |
| * Tensordot contracts over multiple axes (:pr:`2186`) |
| * Allow delayed targets in da.store (:pr:`2181`) |
| * Support interactions against lists and tuples (:pr:`2148`) |
| * Constructor plugins for debugging (:pr:`2142`) |
| * Multi-dimensional FFTs (single chunk) (:pr:`2116`) |
| + Bag |
| * to_dataframe enforces consistent types (:pr:`2199`) |
| + DataFrame |
| * Set_index always fully sorts the index (:pr:`2290`) |
| * Support compatibility with pandas 0.20.0 (:pr:`2249`), (:pr:`2248`), and (:pr:`2246`) |
| * Support Arrow Parquet reader (:pr:`2223`) |
| * Time-based rolling windows (:pr:`2198`) |
| * Repartition can now create more partitions, not just fewer (:pr:`2168`) |
| + Core |
| * Always use absolute paths when on POSIX file system (:pr:`2263`) |
| * Support user provided graph optimizations (:pr:`2219`) |
| * Refactor path handling (:pr:`2207`) |
| * Improve fusion performance (:pr:`2129`), (:pr:`2131`), and (:pr:`2112`) |
| - Update to version 0.14.1 |
| + Array |
| * Micro-optimize optimizations (:pr:`2058`) |
| * Change slicing optimizations to avoid fusing raw numpy arrays (:pr:`2075`) |
| (:pr:`2080`) |
| * Dask.array operations now work on numpy arrays (:pr:`2079`) |
| * Reshape now works in a much broader set of cases (:pr:`2089`) |
| * Support deepcopy python protocol (:pr:`2090`) |
| * Allow user-provided FFT implementations in ``da.fft`` (:pr:`2093`) |
| + Bag |
| + DataFrame |
| * Fix to_parquet with empty partitions (:pr:`2020`) |
| * Optional ``npartitions='auto'`` mode in ``set_index`` (:pr:`2025`) |
| * Optimize shuffle performance (:pr:`2032`) |
| * Support efficient repartitioning along time windows like |
| ``repartition(freq='12h')`` (:pr:`2059`) |
| * Improve speed of categorize (:pr:`2010`) |
| * Support single-row dataframe arithmetic (:pr:`2085`) |
| * Automatically avoid shuffle when setting index with a sorted column |
| (:pr:`2091`) |
| * Improve integer-na handling in read_csv (:pr:`2098`) |
| + Delayed |
| * Repeated attribute access on delayed objects uses the same key (:pr:`2084`) |
| + Core |
| * Improve naming of nodes in dot visuals to avoid generic ``apply`` |
| (:pr:`2070`) |
| * Ensure that worker processes have different random seeds (:pr:`2094`) |
| - Update to version 0.14.0 |
| + Array |
| * Fix corner cases with zero shape and misaligned values in ``arange`` |
| * Improve concatenation efficiency (:pr:`1923`) |
| * Avoid hashing in ``from_array`` if name is provided (:pr:`1972`) |
| + Bag |
| * Repartition can now increase number of partitions (:pr:`1934`) |
| * Fix bugs in some reductions with empty partitions (:pr:`1939`), (:pr:`1950`), |
| (:pr:`1953`) |
| + DataFrame |
| * Support non-uniform categoricals (:pr:`1877`), (:pr:`1930`) |
| * Groupby cumulative reductions (:pr:`1909`) |
| * DataFrame.loc indexing now supports lists (:pr:`1913`) |
| * Improve multi-level groupbys (:pr:`1914`) |
| * Improved HTML and string repr for DataFrames (:pr:`1637`) |
| * Parquet append (:pr:`1940`) |
| * Add ``dd.demo.daily_stock`` function for teaching (:pr:`1992`) |
| + Delayed |
| * Add ``traverse=`` keyword to delayed to optionally avoid traversing nested |
| data structures (:pr:`1899`) |
| * Support Futures in from_delayed functions (:pr:`1961`) |
| * Improve serialization of decorated delayed functions (:pr:`1969`) |
| + Core |
| * Improve windows path parsing in corner cases (:pr:`1910`) |
| * Rename tasks when fusing (:pr:`1919`) |
| * Add top level ``persist`` function (:pr:`1927`) |
| * Propagate ``errors=`` keyword in byte handling (:pr:`1954`) |
| * Dask.compute traverses Python collections (:pr:`1975`) |
| * Structural sharing between graphs in dask.array and dask.delayed (:pr:`1985`) |
| - Update to version 0.13.0 |
| + Array |
| * Mandatory dtypes on dask.array. All operations maintain dtype information |
| and UDF functions like map_blocks now require a dtype= keyword if it can not |
| be inferred. (:pr:`1755`) |
| * Support arrays without known shapes, such as arise when slicing arrays with |
| arrays or converting dataframes to arrays (:pr:`1838`) |
| * Support mutation by setting one array with another (:pr:`1840`) |
| * Tree reductions for covariance and correlations. (:pr:`1758`) |
| * Add SerializableLock for better use with distributed scheduling (:pr:`1766`) |
| * Improved atop support (:pr:`1800`) |
| * Rechunk optimization (:pr:`1737`), (:pr:`1827`) |
| + Bag |
| * Avoid wrong results when recomputing the same groupby twice (:pr:`1867`) |
| + DataFrame |
| * Add ``map_overlap`` for custom rolling operations (:pr:`1769`) |
| * Add ``shift`` (:pr:`1773`) |
| * Add Parquet support (:pr:`1782`) (:pr:`1792`) (:pr:`1810`), (:pr:`1843`), |
| (:pr:`1859`), (:pr:`1863`) |
| * Add missing methods combine, abs, autocorr, sem, nsmallest, first, last, |
| prod, (:pr:`1787`) |
| * Approximate nunique (:pr:`1807`), (:pr:`1824`) |
| * Reductions with multiple output partitions (for operations like |
| drop_duplicates) (:pr:`1808`), (:pr:`1823`) (:pr:`1828`) |
| * Add delitem and copy to DataFrames, increasing mutation support (:pr:`1858`) |
| + Delayed |
| * Changed behaviour for ``delayed(nout=0)`` and ``delayed(nout=1)``: |
| ``delayed(nout=1)`` does not default to ``out=None`` anymore, and |
| ``delayed(nout=0)`` is also enabled, i.e. functions that return |
| tuples of length 1 or 0 are handled correctly. This is especially |
| handy when functions with a variable number of outputs are wrapped by |
| ``delayed``. A trivial example: |
| ``delayed(lambda *args: args, nout=len(vals))(*vals)`` |
| + Core |
| * Refactor core byte ingest (:pr:`1768`), (:pr:`1774`) |
| * Improve import time (:pr:`1833`) |
| - update to version 0.12.0: |
| * update changelog ( |
| * Avoids spurious warning message in concatenate ( |
| * CLN: cleanup dd.multi ( |
| * ENH: da.ufuncs now supports DataFrame/Series ( |
| * Faster array slicing ( |
| * Avoid calling list on partitions ( |
| * Fix slicing error with None and ints ( |
| * Add da.repeat ( |
| * ENH: add dd.DataFrame.resample ( |
| * Unify column names in dd.read_csv ( |
| * replace empty with random in test to avoid nans |
| * Update diagnostics plots ( |
| * Allow atop to change chunk shape ( |
| * ENH: DataFrame.loc now supports 2d indexing ( |
| * Correct shape when indexing with Ellipsis and None |
| * ENH: Add DataFrame.pivot_table ( |
| * CLN: cleanup DataFrame class handling (#1727) |
| * ENH: Add DataFrame.combine_first ( |
| * ENH: Add DataFrame all/any ( |
| * micro-optimize _deps ( |
| * A few small tweaks to da.Array.astype ( |
| * BUG: Fixed metadata lookup failure in Accessor ( |
| * Support auto-rechunking in stack and concatenate ( |
| * Forward `get` kwarg in df.to_csv ( |
| * Add rename support for multi-level columns ( |
| * Update paid support section |
| * Add `drop` to reset_index ( |
| * Cull dask.arrays on slicing ( |
| * Update dd.read_* functions in docs |
| * WIP: Feature/dataframe aggregate (implements |
| * Add da.round ( |
| * Executor -> Client |
| * Add support of getitem for multilevel columns ( |
| * Prepend optimization keywords with name of optimization ( |
| * Add dd.read_table ( |
| * Fix dd.pivot_table dtype to be deterministic ( |
| * da.random with state is consistent across sizes ( |
| * Remove `raises`, use pytest.raises instead ( |
| * Remove unnecessary calls to list ( |
| * Dataframe tree reductions ( |
| * Add global optimizations to compute ( |
| * TST: rename dataframe eq to assert_eq ( |
| * ENH: Add DataFrame/Series.align ( |
| * CLN: dataframe.io ( |
| * ENH: Add DataFrame/Series clip_xxx ( |
| * Clear divisions on single_partitions_merge ( |
| * ENH: add dd.pivot_table ( |
| * Typo in `use-cases`? ( |
| * add distributed follow link doc page |
| * Dataframe elemwise ( |
| * Windows file and endline test handling ( |
| * remove old badges |
| * Fix |
| * Remove use of multiprocessing.Manager ( |
| * A few fixes for `map_blocks` ( |
| * Automatically expand chunking in atop ( |
| * Add AppVeyor configuration ( |
| * TST: move flake8 to travis script ( |
| * CLN: Remove unused funcs ( |
| * Implementing .size and groupby size method ( |
| * Use strides, shape, and offset in memmap tokenize ( |
| * Validate scalar metadata is scalar ( |
| * Convert readthedocs links for their .org -> .io migration for |
| hosted projects ( |
| * CLN: little cleanup of dd.categorical ( |
| * Signature of Array.transpose matches numpy ( |
| * Error nicely when indexing Array with Array ( |
| * ENH: add DataFrame.get_xtype_counts ( |
| * PEP8: some fixes ( |
| - changes from version 0.11.1: |
| * support uniform index partitions in set_index(sorted) ( |
| * Groupby works with multiprocessing ( |
| * Use a nonempty index in _maybe_partial_time_string |
| * Fix segfault in groupby-var |
| * Support Pandas 0.19.0 |
| * Deprecations ( |
| * work-around for ddf.info() failing because of |
| https://github.com/pydata/pandas/issues/14368 ( |
| * .str accessor needs to pass thru both args & kwargs ( |
| * Ensure dtype is provided in additional tests ( |
| * coerce rounded numbers to int in dask.array.ghost ( |
| * Use assert_eq everywhere in dask.array tests ( |
| * Update documentation ( |
| * Support new_axes= keyword in atop ( |
| * pass through node_attr and edge_attr in dot_graph ( |
| * Add swapaxes to dask array ( |
| * add clip to Array ( |
| * Add atop(concatenate=False) keyword argument ( |
| * Better error message on metadata inference failure ( |
| * ENH/API: Enhanced Categorical Accessor ( |
| * PEP8: dataframe fix except E127,E402,E501,E731 ( |
| * ENH: dd.get_dummies for categorical Series ( |
| * PEP8: some fixes ( |
| * Fix da.learn tests for scikit-learn release ( |
| * Suppress warnings in psutil ( |
| * avoid more timeseries warnings ( |
| * Support inplace operators in dataframe ( |
| * Squash warnings in resample ( |
| * expand imports for dask.distributed ( |
| * Add indicator keyword to dd.merge ( |
| * Error loudly if `nrows` used in read_csv ( |
| * Add versioneer ( |
| * Strengthen statement about gitter for developers in docs |
| * Raise IndexError on out of bounds slice. ( |
| * ENH: Support Series in read_hdf ( |
| * COMPAT/API: DataFrame.categorize missing values ( |
| * Add `pipe` method to dask.dataframe ( |
| * Sample from `read_bytes` ends on a delimiter ( |
| * Remove mention of bag join in docs ( |
| * Tokenize mmap works without filename ( |
| * String accessor works with indexes ( |
| * corrected links to documentation from Examples ( |
| * Use conda-forge channel in travis ( |
| * add s3fs to travis.yml ( |
| * ENH: DataFrame.select_dtypes ( |
| * Improve slicing performance ( |
| * Check meta in `__init__` of _Frame |
| * Fix metadata in Series.getitem |
| * A few changes to `dask.delayed` ( |
| * Fixed read_hdf example ( |
| * add section on distributed computing with link to toc |
| * Fix spelling ( |
| * Only fuse simple indexing with getarray backends ( |
| * Deemphasize graphs in docs ( |
| * Avoid pickle when tokenizing __main__ functions ( |
| * Add changelog doc going up to dask 0.6.1 (2015-07-23). ( |
| * update dataframe docs |
| * update index |
| * Update to highlight the use of glob based file naming option for |
| df exports ( |
| * Add custom docstring to dd.to_csv, mentioning that one file per |
| partition is written ( |
| * Run slow tests in Travis for all Python versions, even if coverage |
| check is disabled. ( |
| * Unify example doc pages into one ( |
| * Remove lambda/inner functions in dask.dataframe ( |
| * Add documentation for dataframe metadata ( |
| * "dd.map_partitions" works with scalar outputs ( |
| * meta_nonempty returns types of correct size ( |
| * add memory use note to tsqr docstring |
| * Fix slow consistent keyname test ( |
| * Chunks check ( |
| * Fix last 'line' in sample; prevents open quotes. ( |
| * Create new threadpool when operating from thread ( |
| * Add finalize- prefix to dask.delayed collections |
| * Move key-split from distributed to dask |
| * State that delayed values should be lists in bag.from_delayed |
| ( |
| * Use lists in db.from_sequence ( |
| * Implement user defined aggregations ( |
| * Field access works with non-scalar fields ( |
| - Update to 0.11.0 |
| * DataFrames now enforce knowing full metadata (columns, dtypes) |
| everywhere. Previously we would operate in an ambiguous state |
| when functions lost dtype information (such as apply). Now all |
| dataframes always know their dtypes and raise errors asking for |
| information if they are unable to infer (which they usually |
| can). Some internal attributes like _pd and _pd_nonempty have |
| been moved. |
| * The internals of the distributed scheduler have been refactored |
| to transition tasks between explicit states. This improves |
| resilience, reasoning about scheduling, plugin operation, and |
| logging. It also makes the scheduler code easier to understand |
| for newcomers. |
| * Breaking Changes |
| + The distributed.s3 and distributed.hdfs namespaces are gone. |
| Use protocols in normal methods like read_text('s3://...') |
| instead. |
| + Dask.array.reshape now errs in some cases where previously |
| it would have created a very large number of tasks |
| - update to version 0.10.2: |
| * raise informative error on merge(on=frame) |
| * Fix crash with -OO Python command line ( |
| * [WIP] Read hdf partitioned ( |
| * Add dask.array.digitize. ( |
| * Adding documentation to create dask DataFrame from HDF5 ( |
| * Unify shuffle algorithms ( |
| * dd.read_hdf: clear errors on exceeding row numbers ( |
| * Rename `get_division` to `get_partition` |
| * Add nice error messages on import failures |
| * Use task-based shuffle in hash_joins ( |
| * Fixed |
| it doesn't require indexing and just coalesce existing partitions |
| without shuffling/balancing (#1396) |
| * Import visualize from dask.diagnostics in docs |
| * Backport `equal_nans` to older version of numpy |
| * Improve checks for dtype and shape in dask.array |
| * Progress bar process should be daemon |
| * LZMA may not be available in python 3 (#1391) |
| * dd.to_hdf: multiple files multiprocessing avoid locks (#1384) |
| * dir works with numeric column names |
| * Dataframe groupby works with numeric column names |
| * Use fsync when appending to partd |
| * Fix pickling issue in dataframe to_bag |
| * Add documentation for dask.dataframe.to_hdf |
| * Fixed a copy-paste typo in DataFrame.map_partitions docstring |
| * Fix 'visualize' import location in diagnostics documentation |
| (#1376) |
| * update cheat sheet (#1371) |
| - update to version 0.10.1: |
| * `inline` no longer removes keys (#1356) |
| * avoid c: in infer_storage_options (#1369) |
| * Protect reductions against empty partitions (#1361) |
| * Add doc examples for dask.array.histogram. (#1363) |
| * Fix typo in pip install requirements path (#1364) |
| * avoid unnecessary dependencies between save tasks in |
| dataframe.to_hdf (#1293) |
| * remove xfail mark for blosc missing const |
| * Add `anon=True` for read from s3 test |
| * `subs` doesn't needlessly compare keys and values |
| * Use pytest.importorskip instead of try/except/return pattern |
| * Fixes for bokeh 0.12.0 |
| * Multiprocess scheduler handles unpickling errors |
| * array.random with array-like parameters ( |
| * Fixes issue |
| * Remove dask runtime dependence on mock 2.7 backport. |
| * Load known but external protocols automatically ( |
| * Add center argument to Series/DataFrame.rolling ( |
| * Add Bag.random_sample method. ( |
| * Correct docs install command and add missing required packages |
| ( |
| * Mark the 4 slowest tests as slow to get a faster suite by |
| default. ( |
| * Travis: Install mock package in Python 2.7. |
| * Automatic blocksize for read_csv based on available memory and |
| number of cores. |
| * Replace "Matthew Rocklin" with "Dask Development Team" ( |
| * Support column assignment in DataFrame ( |
| * Few travis fixes, pandas version >= 0.18.0 ( |
| * Don't run hdf test if pytables package is not present. (#1323) |
| * Add delayed.compute to api docs. |
| * Support datetimes in DataFrame._build_pd (#1319) |
| * Test setting the index with datetime with timezones, which is a |
| pandas-defined dtype (#1315) |
| * Add s3fs to requirements (#1316) |
| * Pass dtype information through in Series.astype (#1320) |
| * Add draft of development guidelines (#1305) |
| * Skip tests needing optional package when it's not present. ( |
| * DOC: Document DataFrame.categorize |
| * make dd.to_csv support writing to multiple csv files ( |
| * quantiles for repartitioning ( |
| * DOC: Minimal doc for get_sync ( |
| * Pass through storage_options in db.read_text ( |
| * Fixes |
| APIs and use urlsplit to automatically get remote connection |
| settings ( |
| * TST: Travis build matrix to specify numpy/pandas ver ( |
| * amend doc string to Bag.to_textfiles |
| * Return dask.Delayed when saving files with compute=False ( |
| * Support empty or small dataframes in from_pandas ( |
| * Add validation and tests for order breaking name_function ( |
| * ENH: dataframe now supports partial string selection ( |
| * Fix typo in spark-dask docs |
| * added note and verbose exception about CSV parsing errors ( |
| - update to version 0.10.0: |
| * Add parametrization to merge tests |
| * Add more challenging types to nonempty_sample_df test |
| * Windows fixes |
| * TST: Fix coveralls badge ( |
| * Sort index on shuffle ( |
| * Update specification docs to reflect new spec. |
| * Add groupby docs ( |
| * Update spark docs |
| * Rolling class receives normal arguments (unchecked other than |
| pandas call), stores at |
| * Reduce communication in rolling operations |
| * Fix Shuffle ( |
| * Work on earlier versions of Pandas |
| * Handle additional Pandas types |
| * Use non-empty fake dataframe in merge operations |
| * Add failing test for merge case |
| * Add utility function to create sample dataframe |
| * update release procedure |
| * amend doc string to Bag.to_textfiles ( |
| * Drop Python 2.6 support ( |
| * Clean DataFrame naming conventions ( |
| * Fix some bugs in the rolling implementation. |
| * Fix core.get to use new spec |
| * Make graph definition recursive |
| * Handle empty partitions in dask.bag.to_textfiles |
| * test index.min/max |
| * Add regression test for non-ndarray slicing |
| * Standardize dataframe keynames |
| * bump csv sample size to 256k ( |
| * Switch tests to utils.tmpdir ( |
| * Fix dot_graph filename split bug |
| * Correct documentation to reflect argument existing now. |
| * Allow non-zero axis for .rolling (for application over columns) |
| * Fix scheduler behavior for top-level lists |
| * Various spelling mistakes in docstrings, comments, exception |
| messages, and a filename |
| * Fix typo. ( |
| * Fix tokenize in dask.delayed |
| * Remove unused imports, pep8 fixes |
| * Fix bug in slicing optimization |
| * Add Task Shuffle ( |
| * Add bytes API ( |
| * Add dask_key_name to docs, fix bug in methods |
| * Allow formatting in dask.dataframe.to_hdf path and key parameters |
| * Match pandas' exceptions a bit closer in the rolling API. Also, |
| correct computation f |
| * Add tests to package (#1231) |
| * Document visualize method (#1234) |
| * Skip new rolling API's tests if the pandas we have is too old. |
| * Improve df_or_series.rolling(...) implementation. |
| * Remove `iloc` property on `dask.dataframe` |
| * Support for the new pandas rolling API. |
| * test delayed names are different under kwargs |
| * Add Hussain Sultan to AUTHORS |
| * Add `optimize_graph` keyword to multiprocessing get |
| * Add `optimize_graph` keyword to `compute` |
| * Add dd.info() ( |
| * Cleanup base tests |
| * Add groupby documentation stub |
| * pngmath is deprecated in sphinx 1.4 |
| * A few docfixes |
| * Extract dtype in dd.from_bcolz |
| * Throw NotImplementedError if old toolz.accumulate |
| * Add isnull and notnull for dataframe |
| * Add dask.bag.accumulate |
| * Fix categorical partitioning |
| * create single lock for glob read_hdf |
| * Fix failing from_url doctest |
| * Add missing api to bag docs |
| * Add Skipper Seabold to AUTHORS. |
| * Don't use mutable default argument |
| * Fix typo |
| * Ensure to_task_dasks always returns a task |
| * Fix dir for dataframe objects |
| * Infer metadata in dd.from_delayed |
| * Fix some closure issues in dask.dataframe |
| * Add storage_options keyword to read_csv |
| * Define finalize function for dask.dataframe.Scalar |
| * py26 compatibility |
| * add stacked logos to docs |
| * test from-array names |
| * rename from_array tasks |
| * add atop to array docs |
| * Add motivation and example to delayed docs |
| * splat out delayed values in compute docs |
| * Fix optimize docs |
| * add html page with logos |
| * add dask logo to documentation images |
| * Few pep8 cleanups to dask.dataframe.groupby |
| * Groupby aggregate works with list of columns |
| * Use different names for input and output in from_array |
| * Don't enforce same column names |
| * don't write header for first block in csv |
| * Add var and std to DataFrame groupby (#1159) |
| * Move conda recipe to conda-forge (#1162) |
| * Use function names in map_blocks and elemwise (#1163) |
| * add hyphen to delayed name (#1161) |
| * Avoid shuffles when merging with Pandas objects (#1154) |
| * Add DataFrame.eval |
| * Ensure future imports |
| * Add db.Bag.unzip |
| * Guard against shape attributes that are not sequences |
| * Add dask.array.multinomial |
| - update to version 0.9.0: |
| * No upstream changelog |
| - update to version 0.8.2: |
| * No upstream changelog |
| - update to version 0.8.1: |
| * No upstream changelog |
| - update to version 0.8.0: |
| * No upstream changelog |
| - update to version 0.7.5: |
| * No upstream changelog |
| - update to version 0.7.0: |
| * No upstream changelog |
| - update to version 0.6.1: |
| * No upstream changelog |
| |
| ------------------------------------------------------------------- |
| Tue Jul 14 13:33:53 UTC 2015 - toddrme2178@gmail.com |
| |
| - Update to 0.6.0 |
| * No upstream changelog |
| |
| ------------------------------------------------------------------- |
| Tue May 19 11:03:41 UTC 2015 - toddrme2178@gmail.com |
| |
| - Update to 0.5.0 |
| * No upstream changelog |
| |
| ------------------------------------------------------------------- |
| Thu Apr 9 16:57:59 UTC 2015 - toddrme2178@gmail.com |
| |
| - Initial version |
| |
| |