FedSearch - Federated network search engine

annaraven · @annaraven

897 followers · 4686 posts · Server sfba.social

Weird. I get:
#python
5 results
#PythonForDataAnalysis
0 people in the past 2 days
#python
124 people in the past 2 days
#pythonist
0 people in the past 2 days
...

#python #pythonfordataanalysis #pythonist

Last updated 2 years ago

Original post

Xiuwen · @icoder

110 followers · 465 posts · Server sfba.social

Learning from #EffectivePandas and #PythonForDataAnalysis.

Recipe for permuting (randomly reordering) the rows of a DataFrame or Series:
new_order = np.random.permutation(n)  # n = len(df)
df.iloc[new_order]  # or, equivalently:
df.take(new_order)
To permute the columns of a DataFrame instead, pass axis='columns' to .take() (with n set to the number of columns).

Method for selecting a random subset of the rows of a DataFrame or Series:
df.sample(n=) or df.sample(frac=)
Pass either n (a row count) or frac (a fraction of the rows), not both. To sample with replacement, add replace=True.
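
A minimal sketch of the two recipes above, on a small made-up DataFrame. The column names and the seeded generator (np.random.default_rng, used here in place of np.random.permutation so the run is repeatable) are assumptions for illustration, not part of the books' examples.

```python
import numpy as np
import pandas as pd

# Toy data, invented for this sketch.
df = pd.DataFrame({"a": [10, 20, 30, 40, 50], "b": list("vwxyz")})

# Seeded generator so the shuffle is reproducible.
rng = np.random.default_rng(0)
new_order = rng.permutation(len(df))

# Two equivalent ways to reorder the rows by integer position.
shuffled_iloc = df.iloc[new_order]
shuffled_take = df.take(new_order)

# Permute the columns instead of the rows.
col_order = rng.permutation(df.shape[1])
shuffled_cols = df.take(col_order, axis="columns")

# Random subset without replacement, then a same-size sample with replacement.
subset = df.sample(n=3, random_state=0)
resampled = df.sample(frac=1.0, replace=True, random_state=0)
```

Both .iloc[] and .take() keep the original index labels, so the shuffle can be undone with .sort_index().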

#LearnPython #ProgressToday

#effectivepandas #pythonfordataanalysis #learnpython #ProgressToday

Last updated 2 years ago

Original post

Xiuwen · @icoder

114 followers · 519 posts · Server sfba.social

@treyhunner I continued working through 2 books on pandas: #EffectivePandas and #PythonForDataAnalysis, and wrote a few toots as my notes

#effectivepandas #pythonfordataanalysis

Last updated 2 years ago

Original post

Xiuwen · @icoder

114 followers · 519 posts · Server sfba.social

Learning from #EffectivePandas and #PythonForDataAnalysis.

The preferred way to index and filter a Series or a DataFrame is either i) .loc[], which indexes by label, or ii) .iloc[], which indexes by integer position. Their call signatures are nearly identical:

.loc[rows]
.loc[:, cols]
.loc[rows, cols]

Their strength is that they make explicit what we are indexing on and what we are selecting, which helps us not be the problem 😂
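
A small sketch of the three call shapes, side by side for label-based and position-based indexing. The fruit data and variable names are made up for illustration.

```python
import pandas as pd

# Toy data, invented for this sketch; the index holds string labels.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 0, 7]},
    index=["apple", "pear", "plum"],
)

# Label-based: rows, columns, then both.
rows_by_label = df.loc[["apple", "plum"]]
cols_by_label = df.loc[:, ["qty"]]
cell_by_label = df.loc["pear", "price"]

# Position-based: the same selections by integer position.
rows_by_pos = df.iloc[[0, 2]]
cols_by_pos = df.iloc[:, [1]]
cell_by_pos = df.iloc[1, 0]

# .loc also accepts a boolean mask for filtering rows.
in_stock_prices = df.loc[df["qty"] > 0, "price"]
```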

#LearnPython #ProgressToday

#effectivepandas #pythonfordataanalysis #learnpython #ProgressToday

Last updated 2 years ago

Original post

Xiuwen · @icoder

96 followers · 398 posts · Server sfba.social

#ProgressToday Continued my way through #EffectivePandas and #PythonForDataAnalysis.

Element-wise transformation of a Series' values or an Index's labels can be done by passing a dictionary (a lookup table; elements with no matching key become NaN) or a function (applied to every element) to the method

.map(dict or func)
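
A quick sketch of .map() with a dictionary and with a function, on a made-up Series. Note how the value with no dictionary key comes back as NaN.

```python
import pandas as pd

# Toy data, invented for this sketch.
s = pd.Series(["cat", "dog", "bird"], index=["a", "b", "c"])

# Dictionary lookup: "bird" has no key, so it maps to NaN.
legs = s.map({"cat": 4, "dog": 4})

# Function: applied to every element of the Series.
shouted = s.map(str.upper)

# The same method works on the Index labels.
new_index = s.index.map(str.upper)
```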

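Binning can be done with the top-level functions

pd.cut(data, bins, right=, labels=, precision=)
pd.qcut(data, q)

Here bins is either a number of equal-width bins or a list of bin edges, and q is either a number of quantiles or a list of quantile cut points. pd.cut() bins by value, while pd.qcut() bins by sample quantile, so its bins hold roughly equal numbers of observations.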

#LearnPython

#ProgressToday #effectivepandas #pythonfordataanalysis #learnpython

Last updated 2 years ago

Original post

Xiuwen · @icoder

114 followers · 519 posts · Server sfba.social

Continued my way through #EffectivePandas and #PythonForDataAnalysis.

Element-wise transformation of a Series' values or an Index's labels can be done by passing a dictionary (a lookup table; elements with no matching key become NaN) or a function (applied to every element) to the method

.map(dict or func)

Binning of a Series or column can be done by i) the data values, or ii) the data quantiles:

pd.cut(data, bins, right=, labels=, precision=)
pd.qcut(data, q)
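
A short sketch contrasting the two binning approaches; both are top-level pandas functions (pd.cut and pd.qcut). The ages, bin edges, and labels are made up for illustration.

```python
import pandas as pd

# Toy data, invented for this sketch.
ages = pd.Series([18, 22, 25, 27, 35, 41, 58, 63])

# Bin by value with explicit edges; right=True (the default) makes bins (0, 25], (25, 40], (40, 100].
by_value = pd.cut(ages, bins=[0, 25, 40, 100], labels=["young", "middle", "senior"])

# Bin by value into three equal-width intervals.
equal_width = pd.cut(ages, bins=3)

# Bin by quantile: four bins, each holding a quarter of the observations.
by_quantile = pd.qcut(ages, q=4)
```

With these ages, the value-based bins are uneven (3, 2, and 3 people), while the quartile bins hold 2 people each.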

#LearnPython #ProgressToday

#ProgressToday #effectivepandas #pythonfordataanalysis #learnpython

Last updated 2 years ago

Original post

Xiuwen · @icoder

90 followers · 358 posts · Server sfba.social

#ProgressToday Finished the sections in #EffectivePandas and #PythonForDataAnalysis on converting the data types of a Series or column. Top methods:

.astype(dtype, copy=, errors=)
.convert_dtypes()
pd.to_datetime()
pd.CategoricalDtype(categories=, ordered=)

The first is the general-purpose conversion for Python and NumPy types, while the second infers and converts to pandas extension types that support pd.NA.

Before converting data types, be sure to deal with any placeholder codes that stand for missing or erroneous data.
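
A minimal sketch that touches each of the four tools. The column names and the "-999" sentinel for missing data are hypothetical, chosen only to show why the cleanup has to come before the conversion.

```python
import numpy as np
import pandas as pd

# Toy data, invented for this sketch; "-999" is a hypothetical missing-data code.
raw = pd.DataFrame({
    "count": ["1", "2", "-999"],
    "when": ["2023-01-05", "2023-02-10", "2023-03-15"],
    "size": ["S", "L", "M"],
})

# Replace the sentinel first, then convert the strings to floats.
counts = raw["count"].replace("-999", np.nan).astype("float64")

# Let pandas pick a nullable extension dtype (whole-number floats become Int64).
nullable_counts = counts.convert_dtypes()

# Parse ISO date strings into datetime64 values.
when = pd.to_datetime(raw["when"])

# Ordered categorical: S < M < L.
size_dtype = pd.CategoricalDtype(categories=["S", "M", "L"], ordered=True)
sizes = raw["size"].astype(size_dtype)
```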

#ProgressToday #effectivepandas #pythonfordataanalysis

Last updated 2 years ago

Original post

Xiuwen · @icoder

89 followers · 347 posts · Server sfba.social

#ProgressToday Finished going through sections in #EffectivePandas and #PythonForDataAnalysis related to duplicated data and cleaning. It's good that the two important methods apply to all three objects - Series, DataFrame, and Index:
.duplicated(subset=, keep=)
.drop_duplicates(subset=, keep=)
One difference is that the kwarg 'subset=' applies to DataFrame objects only, which can have multiple columns to choose from.
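
A small sketch of both methods on a DataFrame, a Series, and an Index. The names and cities are made up for illustration.

```python
import pandas as pd

# Toy data, invented for this sketch.
df = pd.DataFrame({
    "name": ["ann", "bob", "ann", "ann"],
    "city": ["SF", "LA", "SF", "NY"],
})

# Whole-row duplicates: only row 2 repeats an earlier row exactly.
full_dupes = df.duplicated()

# Duplicates judged on one column, marking all but the last occurrence.
name_dupes = df.duplicated(subset="name", keep="last")

# Keep the first row for each name.
deduped = df.drop_duplicates(subset=["name"], keep="first")

# Series and Index have the same methods, without subset=.
unique_names = df["name"].drop_duplicates()
idx = pd.Index(["x", "y", "x"])
unique_idx = idx.drop_duplicates()
```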

#ProgressToday #effectivepandas #pythonfordataanalysis

Last updated 2 years ago

Original post

Xiuwen · @icoder

88 followers · 341 posts · Server sfba.social

#ProgressToday Finished going over sections in #EffectivePandas and #PythonForDataAnalysis related to handling missing data. Here are useful methods on this topic:
.isna()
.notna()
.dropna(how=, thresh=, axis=)
.fillna(value=, method=, limit=, axis=) (from pandas 2.1 on, method= is deprecated in favor of .ffill() and .bfill())
.interpolate(method=, limit=, axis=)
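
A short sketch of these methods on a made-up DataFrame. It uses .ffill() for the forward fill, on the assumption of a recent pandas version where fillna(method=) is deprecated.

```python
import numpy as np
import pandas as pd

# Toy data, invented for this sketch.
df = pd.DataFrame({
    "x": [1.0, np.nan, 3.0, np.nan],
    "y": [np.nan, np.nan, 6.0, 8.0],
})

# Boolean mask of missing cells.
mask = df.isna()

# Drop rows with any missing value, then rows with fewer than 1 non-missing value.
complete = df.dropna()
mostly = df.dropna(thresh=1)

# Fill with a different constant per column.
filled = df.fillna(value={"x": 0.0, "y": -1.0})

# Forward-fill, carrying each value at most one row down.
forward = df.ffill(limit=1)

# Linear interpolation between known neighbours.
interp = df["x"].interpolate(method="linear")
```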

#ProgressToday #effectivepandas #pythonfordataanalysis

Last updated 2 years ago

Original post