pyreadstat 1.2.1__tar.gz → 1.2.3__tar.gz
- {pyreadstat-1.2.1/pyreadstat.egg-info → pyreadstat-1.2.3}/PKG-INFO +1 -1
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/README.md +14 -15
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/__init__.py +1 -1
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_parser.c +17444 -11859
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_parser.pxd +30 -24
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_parser.pyx +39 -40
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_writer.c +8057 -5415
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_writer.pxd +7 -9
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/_readstat_writer.pyx +21 -44
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/conditional_includes.h +3 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/pyreadstat.c +8739 -4086
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/pyreadstat.pyx +47 -20
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/readstat_api.pxd +9 -5
- {pyreadstat-1.2.1 → pyreadstat-1.2.3/pyreadstat.egg-info}/PKG-INFO +1 -1
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/setup.py +4 -4
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/LICENSE +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/MANIFEST.in +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyproject.toml +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/pyreadstat.pxd +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat/worker.py +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat.egg-info/SOURCES.txt +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat.egg-info/dependency_links.txt +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat.egg-info/requires.txt +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/pyreadstat.egg-info/top_level.txt +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/setup.cfg +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/CKHashTable.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/CKHashTable.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_bits.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_bits.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_convert.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_convert.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_error.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_iconv.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_io_unistd.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_io_unistd.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_malloc.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_malloc.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_metadata.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_parser.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_strings.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_value.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_variable.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_writer.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/readstat_writer.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/ieee.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/ieee.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas7bcat_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas7bcat_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas7bdat_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas7bdat_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas_rle.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_sas_rle.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport_parse_format.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport_parse_format.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/sas/readstat_xport_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por_parse.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por_parse.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_por_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_compress.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_compress.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_parse.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_parse.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_parse_timestamp.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_parse_timestamp.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_sav_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_spss.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_spss.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_spss_parse.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_spss_parse.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_compress.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_compress.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_read.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_write.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/spss/readstat_zsav_write.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta_parse_timestamp.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta_parse_timestamp.h +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta_read.c +0 -0
- {pyreadstat-1.2.1 → pyreadstat-1.2.3}/src/stata/readstat_dta_write.c +0 -0
@@ -194,9 +194,7 @@ You can also install from the github repo directly (without cloning). Use the fl
 pip install git+https://github.com/Roche/pyreadstat.git
 ```
 
-You need a working C compiler
-cython version >= 0.28 installed (see later Python 2.7 support). For python 3, cython
-is not necessary if compiling on unix, but if installed it will be used.
+You need a working C compiler and cython >=3.0.0.
 
 ### Compiling on Windows and Mac
 
@@ -330,7 +328,8 @@ df, meta = pyreadstat.read_sas7bdat('/path/to/a/file.sas7bdat', usecols=["variab
 #### Reading files in parallel processes
 
 A challenge when reading large files is the time consumed in the operation. In order to alleviate this
-pyreadstat provides a function "
+pyreadstat provides a function "read\_file\_multiprocessing" to read a file in parallel processes using
+the python multiprocessing library. As it reads the whole file in one go you need to have enough RAM for the operation. If
 that is not the case look at Reading rows in chunks (next section)
 
 Speed ups in the process will depend on a number of factors such as number of processes available, RAM,
@@ -351,6 +350,11 @@ import multiprocessing
 num_processes = multiprocessing.cpu_count()
 ```
 
+**Notes for Xport, Por and some defective SAV files not having the number of rows in the metadata**
+1. In all Xport, Por and some defective SAV files, the number of rows cannot be determined from the metadata. In such cases,
+you can use the parameter num\_rows to be equal or larger to the number of rows in the dataset. This number can be obtained
+reading the file without multiprocessing, reading in another application, etc.
+
 **Notes for windows**
 
 1. For this to work you must include a __name__ == "__main__" section in your script. See [this issue](#85)
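The note added in this hunk could translate into code along the following lines. This is a minimal sketch, not taken from the package: the helper names `read_xport_parallel`, `iter_xport_chunks`, and `main`, as well as the command-line arguments, are hypothetical. The `num_rows` keyword comes from the README text in the hunk; its use with `read_file_in_chunks` and `multiprocess=True` is an assumption based on the README's cross-reference between the two sections.

```python
import multiprocessing
import sys


def read_xport_parallel(path, num_rows, num_processes=None):
    """Read a whole Xport file in parallel processes.

    Hypothetical helper, not part of pyreadstat. Xport files do not store
    the row count in their metadata, so the caller passes num_rows, which
    should be equal to or larger than the real number of rows.
    """
    import pyreadstat  # lazy import: only needed when a file is actually read

    if num_processes is None:
        num_processes = multiprocessing.cpu_count()
    return pyreadstat.read_file_multiprocessing(
        pyreadstat.read_xport,
        path,
        num_processes=num_processes,
        num_rows=num_rows,
    )


def iter_xport_chunks(path, num_rows, chunksize=10000, num_processes=4):
    """Return an iterator of (df, meta) chunks, each read in parallel.

    Assumption: read_file_in_chunks accepts num_rows together with
    multiprocess=True, mirroring read_file_multiprocessing.
    """
    import pyreadstat

    return pyreadstat.read_file_in_chunks(
        pyreadstat.read_xport,
        path,
        chunksize=chunksize,
        multiprocess=True,
        num_processes=num_processes,
        num_rows=num_rows,
    )


def main(argv):
    """Tiny command-line entry point: script.py FILE.xpt NUM_ROWS."""
    if len(argv) != 3:
        print("usage: script.py FILE.xpt NUM_ROWS")
        return 2
    df, meta = read_xport_parallel(argv[1], int(argv[2]))
    print(df.shape)
    return 0


if __name__ == "__main__":
    # Needed on Windows (see the README note): worker processes re-import
    # this module, and the guard keeps them from re-running the read.
    main(sys.argv)
```

If the exact row count is unknown, the README suggests obtaining it by reading the file once without multiprocessing or from another application; passing a larger value than the real count is described as acceptable.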
@@ -410,6 +414,9 @@ for df, meta in reader:
 # do some cool calculations here for the chunk
 ```
 
+**If using multiprocessing, please read the notes in the previous section regarding Xport, Por and some defective SAV files not
+having the number of rows in the metadata**
+
 **For Windows, please check the notes on the previous section reading files in parallel processes**
 
 #### Reading value labels
@@ -861,17 +868,9 @@ Converting data types from foreign applications into python some times also brin
 
 ## Python 2.7 support.
 
-Python 2.7 is not
-
-
-At the moment of writing this document Python 2.7 does not work for windows.
-It does work for Mac and Linux. In Mac and Linux, files cannot be opened
-if the path contains international (non-ascii) characters. As mentioned
-before this bug is not going to be repaired (There is not such issue on
-Python 3).
-
-Starting on version 1.0.6 wheels are not produced for Python 2.7 anymore,
-but you can still compile on linux and mac.
+As version 1.2.3 Python 2.7 is not supported. In previous versions it was possible to compile it for
+mac and linux but not for windows, but no wheels were provided. In linux and mac it will fail if
+the path file contains non-ascii characters.
 
 ## Change log
 