FedSearch - Federated network search engine

Gilles San Martin · @sanmartin

8 followers · 27 posts · Server fosstodon.org

📊 I gave a seminar on best practices for data collection

Learn which data/metadata to collect systematically, plus tips to avoid hours of data-cleaning nightmares 😅

#DataCollection #DataScience #TidyData

📽️ Watch now on YT: https://youtu.be/zsyTlgAG_58?feature=shared

Last updated 1 year ago

datamaps :rickwhoah: · @datamaps

396 followers · 753 posts · Server social.linux.pizza

@sharoz @JASPStats

SPSS supposes that you have a "dataset", a collection of "observations" that relate to a particular subject/survey/domain: one row is one observation about a (sample) unit, one column is a variable collected over all units, and one cell is a single value. (This is essentially what is now called #tidydata because someone wrote an article naming it, but it has been part of the profession since long before that.)

Remember that SPSS stood for Statistical Package for the Social Sciences.
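
The row/column/cell layout described in this post can be sketched in a few lines of Python. This is only an illustration, not anything from SPSS itself: the survey data and the column names (`respondent_id`, `age`, `region`, `satisfaction`) are made up for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical survey data (an assumption for illustration):
# each row is one observation about one respondent.
RAW = """respondent_id,age,region,satisfaction
1,34,north,4
2,51,south,2
3,28,north,5
"""

rows = list(csv.DictReader(io.StringIO(RAW)))

# One column is one variable collected over all units.
ages = [int(row["age"]) for row in rows]

# Because every cell holds a single value, per-variable summaries
# and per-unit filters stay simple.
mean_age = mean(ages)
north_ids = [row["respondent_id"] for row in rows if row["region"] == "north"]
```

With this layout, adding a respondent means adding a row and adding a variable means adding a column; nothing else in the code has to change.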

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

74 followers · 423 posts · Server mstdn.social

New Blog post

✅ .data - The data being passed that will be augmented by the function.

✅ .dx_col - The column containing the Principal Diagnosis for the discharge.

✅ .px_col - The column containing the Principal Coded Procedure for the discharge. It is possible that this could be blank.

✅ .drg_col - The DRG Number coded to the inpatient discharge.

Post: https://www.spsanderson.com/steveondata/posts/weekly-rtip-healthyr-2023-01-27/

#healthcare #datascience #rstats #data #dataanalysis #analytics #dx #px #serviceline #drg #tidydata #opensource

Last updated 2 years ago

CodeRefinery · @coderefinery

54 followers · 32 posts · Server fosstodon.org

#PythonForSciComp

Day finished successfully, feedback was good. We resume tomorrow 9:50 EET / 8:50 CET with more #Pandas (going through practical usage), #visualization (#matplotlib as a base of the ecosystem), and then disk-based data formats.

If you want something to review for tomorrow, check out #TidyData as defined in this paper - useful for anyone organizing data, #Python or not:
https://vita.had.co.nz/papers/tidy-data.pdf

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

17 followers · 55 posts · Server mstdn.social

My #r #package {TidyDensity} is on its way to #cran @ramikrispin #distributions #randomdata #tidydata

Last updated 2 years ago

Zane Selvans · @ZaneSelvans

348 followers · 365 posts · Server social.coop

The last hard thing is trying to generalize the way we reshape the FERC data, which typically comes in a wide format (like... 500 columns sometimes) into #TidyData that's more relational.

We do have a nice way to concatenate the old DBF and new XBRL data, which also aligns all of the old data, whose row numbers changed meaning from year to year as new fields were added, split, or removed.
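
One common way to turn a wide table into a longer, more relational one is pandas' `DataFrame.melt`. The sketch below is not the poster's actual pipeline: the table, the column names (`utility_id`, `report_year`, `fuel_cost`, `labor_cost`), and the choice of a long "field/value" layout are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical wide table (an assumption for illustration): one row per
# utility and year, one column per reported field.
wide = pd.DataFrame(
    {
        "utility_id": [1, 2],
        "report_year": [2020, 2020],
        "fuel_cost": [10.0, 20.0],
        "labor_cost": [3.0, 4.0],
    }
)

# Keep the identifying columns and stack every other column into
# (field, value) pairs, giving one row per utility, year, and field.
tidy = wide.melt(
    id_vars=["utility_id", "report_year"],
    var_name="field",
    value_name="value",
)
```

With hundreds of wide columns, the same call works unchanged as long as the identifying columns are listed in `id_vars`; deciding which columns are identifiers and which are measurements is the part that is hard to generalize.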

Last updated 2 years ago
