I've been tinkering with a small #Pandas-based package for non-standard #dataframe operations:
pandance.readthedocs.io
Specifically, it offers the following joins (implemented as efficiently as I could in pure Python):
- fuzzy join (for numerical and time columns)
- inequality join (for any comparable type)
- a generic join on arbitrary conditions ("theta-join")
I found these operations very useful in my research,
and since no efficient versions existed at the time, I decided to package them.
Maybe it's interesting for others as well.
The operations have been tested extensively, but I've been the sole user so far.
So feedback and any bug reports are welcome.
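For anyone unfamiliar with the join types listed above, here is a minimal sketch in plain pandas. It does not use pandance's API: the table and column names are made up for illustration, and the naive cross-product approach is the inefficient baseline that a dedicated package would presumably avoid.

```python
import pandas as pd

# Hypothetical example data (not from the package docs).
events = pd.DataFrame({"event": ["a", "b", "c"], "t": [1.0, 5.0, 9.0]})
windows = pd.DataFrame(
    {"window": ["w1", "w2"], "start": [0.0, 4.0], "end": [2.0, 10.0]}
)

# Naive theta-join / inequality join: build every row pair,
# then keep only the pairs that satisfy the condition start <= t <= end.
pairs = events.merge(windows, how="cross")
theta = pairs[(pairs["t"] >= pairs["start"]) & (pairs["t"] <= pairs["end"])]
theta = theta.reset_index(drop=True)

# Fuzzy join on a numeric column: match each reading to the nearest event
# whose "t" is within a tolerance of 0.5; unmatched rows get NaN.
readings = pd.DataFrame({"reading": ["x", "y"], "t": [1.2, 7.0]})
fuzzy = pd.merge_asof(
    readings.sort_values("t"),
    events.sort_values("t"),
    on="t",
    direction="nearest",
    tolerance=0.5,
)

print(theta[["event", "window"]])
print(fuzzy)
```

The cross merge materialises len(events) * len(windows) rows before filtering, so it only works for small inputs.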
Wrote up a benchmark comparing #python #pandas #dataframe `iloc` and `to_dict`
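A rough sketch of what such a comparison could look like, assuming the benchmark is about row-by-row access. The write-up's actual setup is not shown here, so the DataFrame shape, the function names, and the loop body are all assumptions.

```python
import timeit

import pandas as pd

# Hypothetical test data: 1000 rows, two integer columns.
df = pd.DataFrame({"a": range(1000), "b": range(1000, 2000)})


def sum_with_iloc(frame):
    """Visit each row through positional indexing with iloc."""
    total = 0
    for i in range(len(frame)):
        row = frame.iloc[i]  # builds a Series for every row
        total += row["a"] + row["b"]
    return total


def sum_with_to_dict(frame):
    """Convert once to a list of plain dicts, then loop over those."""
    total = 0
    for row in frame.to_dict("records"):
        total += row["a"] + row["b"]
    return total


# Both approaches compute the same value; only the access pattern differs.
t_iloc = timeit.timeit(lambda: sum_with_iloc(df), number=3)
t_dict = timeit.timeit(lambda: sum_with_to_dict(df), number=3)
print(f"iloc:    {t_iloc:.4f} s")
print(f"to_dict: {t_dict:.4f} s")
```

Timings will vary with the pandas version, the dtypes, and the frame size, so it is worth running this on your own data.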
Today I'm sharing a #Python library I hadn't heard of, sidetable. It may be well known, but I had never come across it.
:python: https://pypi.org/project/sidetable/
More information:
Image source: Avi Chawla.
#DataScience #Pandas #Dataframe #Tables #DataScientist #analysis #análisis #Librería #research #Investigación #Library