Overview

PyECH is a simple package that streamlines the download, read and wrangling steps needed to analyze the Encuesta Continua de Hogares (ECH) survey carried out by the Instituto Nacional de Estadística (Uruguay).

Here's what PyECH can do:

Download survey compressed files.
Unrar, rename and move the SAV (SPSS) file to a specified path.
Read surveys from SAV files, keeping variable and value labels.
Download and process variable dictionaries.
Search through variable dictionaries.
Summarize variables.
Calculate variable n-tiles.
Convert variables to real terms or USD.

PyECH does not attempt to estimate any indicators in particular, or facilitate any kind of modelling, or concatenate surveys from multiple years. Instead, it aims at providing a hassle-free experience with as simple a syntax as possible.

Even with this narrow scope, PyECH covers much of what people tend to do with the ECH survey, and it does so without requiring licensed software.

For R users, check out ech.

Installation

pip install pyech

Dependencies

In order to unpack downloaded survey files you will need to have unrar on your system. This should be covered if you have WinRAR or 7zip installed. Otherwise, run sudo apt-get install unrar or whatever is appropriate for your system.
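Before loading a survey for the first time, it can help to confirm that unrar is reachable. The following is a minimal sketch using only the standard library. `find_tool` is a hypothetical helper, not part of PyECH, and it only inspects the system PATH, which may differ from how PyECH itself locates the extractor.

```python
import shutil
from typing import Optional


def find_tool(name: str = "unrar") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if absent."""
    # shutil.which mirrors what a shell does when resolving a command name.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("unrar was not found on PATH; install it before loading surveys.")
    else:
        print(f"unrar found at {path}")
```

If the check fails on a machine with WinRAR or 7zip installed, the installation directory is probably missing from PATH.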

Usage

Full documentation, including this readme.
Run the examples notebook in your browser.

Loading a survey is as simple as using ECH.load, which will download it if it cannot be found at dirpath (by default the current working directory).

from pyech import ECH

survey = ECH()
survey.load(year=2019, weights="pesoano")

Optionally, load accepts from_repo=True, which downloads survey data from the PyECH GitHub repository (HDFS+JSON). Loading data this way is significantly faster.

ECH.load also downloads the corresponding variable dictionary, which can be easily searched.

survey.search_dictionary("ingreso", ignore_case=True, regex=True)

This will return a pandas DataFrame where every row matches the search term in any of its columns.
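To illustrate what "matches the search term in any of its columns" means, here is a minimal standard-library sketch of the same idea. It is not PyECH's implementation: `search_rows` and the sample rows are hypothetical, and a real variable dictionary has different columns and content.

```python
import re
from typing import Dict, List

# Hypothetical dictionary rows, used only for illustration.
ROWS: List[Dict[str, str]] = [
    {"name": "v1", "label": "Ingreso del hogar"},
    {"name": "v2", "label": "Departamento"},
    {"name": "ingreso_v3", "label": "Otra variable"},
]


def search_rows(
    rows: List[Dict[str, str]],
    term: str,
    ignore_case: bool = True,
    regex: bool = True,
) -> List[Dict[str, str]]:
    """Keep the rows where at least one column value matches the term."""
    flags = re.IGNORECASE if ignore_case else 0
    # When regex is False, escape the term so it is matched literally.
    pattern = re.compile(term if regex else re.escape(term), flags)
    return [
        row for row in rows
        if any(pattern.search(str(value)) for value in row.values())
    ]


matches = search_rows(ROWS, "ingreso")
print([row["name"] for row in matches])  # prints ['v1', 'ingreso_v3']
```

The first row matches through its label and the third through its name, which is why searching across all columns tends to surface more candidates than searching variable names alone.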

Calculating aggregations is as simple as using ECH.summarize.

survey.summarize("ht11", by="dpto", aggfunc="mean", household_level=True)

This returns a pandas DataFrame with the mean of "ht11" grouped by ECH.splitter and by the by argument (both are optional). Cases are weighted by the weights column passed to ECH.load.
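The weighting step can be sketched in plain Python as a grouped weighted mean. This is only an illustration of the arithmetic under simple assumptions: `weighted_mean_by_group` and the sample records are hypothetical, and the sketch does not reproduce PyECH's internals or its household_level handling.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical records as (group, value, weight) tuples.
RECORDS = [
    ("A", 100.0, 1.0),
    ("A", 200.0, 3.0),
    ("B", 50.0, 2.0),
    ("B", 150.0, 2.0),
]


def weighted_mean_by_group(
    records: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Compute sum(value * weight) / sum(weight) separately for each group."""
    weighted_sums: Dict[str, float] = defaultdict(float)
    weight_totals: Dict[str, float] = defaultdict(float)
    for group, value, weight in records:
        weighted_sums[group] += value * weight
        weight_totals[group] += weight
    return {g: weighted_sums[g] / weight_totals[g] for g in weighted_sums}


result = weighted_mean_by_group(RECORDS)
print(result)  # prints {'A': 175.0, 'B': 100.0}
```

In group A the unweighted mean would be 150, but the heavier weight on the second case pulls the weighted mean up to 175, which is the effect survey weights are meant to have.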
