Import python venv for stability

2026-02-15 21:24:16 -08:00
parent 1343e93a59
commit 7d784705c9
4997 changed files with 1628270 additions and 0 deletions
@@ -0,0 +1,153 @@
+Metadata-Version: 2.4
+Name: fastparquet
+Version: 2025.12.0
+Summary: Python support for Parquet file format
+Home-page: https://github.com/dask/fastparquet/
+Author: Martin Durant
+Author-email: mdurant@anaconda.com
+License: Apache License 2.0
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: System Administrators
+Classifier: License :: OSI Approved :: Apache Software License
+Classifier: Programming Language :: Python
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
+Classifier: Programming Language :: Python :: Implementation :: CPython
+Requires-Python: >=3.10
+License-File: LICENSE
+Requires-Dist: pandas>=1.5.0
+Requires-Dist: numpy
+Requires-Dist: cramjam>=2.3
+Requires-Dist: fsspec
+Requires-Dist: packaging
+Provides-Extra: lzo
+Requires-Dist: python-lzo; extra == "lzo"
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: home-page
+Dynamic: license
+Dynamic: license-file
+Dynamic: provides-extra
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+fastparquet
+===========
+
+.. image:: https://github.com/dask/fastparquet/actions/workflows/main.yaml/badge.svg
+    :target: https://github.com/dask/fastparquet/actions/workflows/main.yaml
+
+.. image:: https://readthedocs.org/projects/fastparquet/badge/?version=latest
+    :target: https://fastparquet.readthedocs.io/en/latest/
+
+fastparquet is a Python implementation of the `parquet
+format <https://github.com/apache/parquet-format>`_, aiming to integrate
+into Python-based big data workflows. It is used implicitly by
+the projects Dask, Pandas and intake-parquet.
+
+We offer a high degree of support for the features of the parquet format, and
+very competitive performance, in a small install size and codebase.
+
+Details of this project, how to use it and comparisons to other work can be found in the documentation_.
+
+.. _documentation: https://fastparquet.readthedocs.io
+
+Requirements
+------------
+
+(all development is against recent versions in the default anaconda channels
+and/or conda-forge)
+
+Required:
+
+- numpy
+- pandas
+- cython >= 0.29.23 (if building from pyx files)
+- cramjam
+- fsspec
+
+Supported compression algorithms:
+
+- Available by default:
+
+  - gzip
+  - snappy
+  - brotli
+  - lz4
+  - zstandard
+
+- Optionally supported:
+
+  - `lzo <https://github.com/jd-boyd/python-lzo>`_
+
+
+Installation
+------------
+
+Install using conda, to get the latest compiled version::
+
+   conda install -c conda-forge fastparquet
+
+or install from PyPI::
+
+   pip install fastparquet
+
+You may wish to install numpy first, to help pip's resolver.
+The ``pip`` command may install a prebuilt wheel or compile from source. For the latter,
+you will need a suitable C compiler toolchain on your system.
+
+You can also install the latest version from GitHub::
+
+   pip install git+https://github.com/dask/fastparquet
+
+in which case you will also need ``cython`` installed to rebuild the C files.
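
After any of the install routes above, it can be useful to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library's ``importlib.metadata``; the helper names ``installed_version`` and ``declared_requirements`` are hypothetical and not part of fastparquet.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def declared_requirements(dist_name):
    """Return the Requires-Dist entries of a distribution, or [] if it is absent."""
    try:
        # metadata.requires() returns None when no requirements are declared.
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []


# Report fastparquet and the dependencies listed under "Required" above.
for name in ("fastparquet", "numpy", "pandas", "cramjam", "fsspec"):
    print(name, installed_version(name) or "not installed")

# Show what the installed fastparquet itself declares (empty if not installed).
for requirement in declared_requirements("fastparquet"):
    print("requires:", requirement)
```

The check reads package metadata rather than importing the packages, so it also works when a compiled extension fails to load.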
+
+Usage
+-----
+
+Please refer to the documentation_.
+
+*Reading*
+
+.. code-block:: python
+
+    from fastparquet import ParquetFile
+    pf = ParquetFile('myfile.parq')
+    df = pf.to_pandas()
+    df2 = pf.to_pandas(['col1', 'col2'], categories=['col1'])
+
+You may specify which columns to load and which of those to keep as categoricals
+(if the data uses dictionary encoding). The file path can be a single file,
+a metadata file pointing to other data files, or a directory (tree) containing
+data files. The latter is what is typically output by Hive/Spark.
+
+*Writing*
+
+.. code-block:: python
+
+    from fastparquet import write
+    write('outfile.parq', df)
+    write('outfile2.parq', df, row_group_offsets=[0, 10000, 20000],
+          compression='GZIP', file_scheme='hive')
+
+The default is to produce a single output file with a single row-group
+(i.e., logical segment) and no compression. At the moment, only simple
+data-types and plain encoding are supported, so expect performance to be
+similar to *numpy.savez*.
+
+History
+-------
+
+This project was forked in October 2016 from `parquet-python`_, which was not designed
+for vectorised loading of big data or parallel access.
+
+.. _parquet-python: https://github.com/jcrobak/parquet-python
+