FedSearch - Federated network search engine

Christos Argyropoulos MD, PhD · @ChristosArgyrop

372 followers · 587 posts · Server mstdn.science

I am developing a research application that requires very fast analysis of very large tabular data from sequencing experiments. While I eventually settled on #rstats #datatable, someone kindly suggested I check out what #porn IT does.
The porn servers handle a massive amount of data in real time, executing complex queries in response to what users are watching. That is a problem at least 3 orders of magnitude larger than mine. Here is what the pros do:
https://news.ycombinator.com/item?id=3597891
https://davidwalsh.name/pornhub-interview

#rstats #datatable #porn

Last updated 2 years ago

Taras Novak 🇺🇦 · @dataSamurai

176 followers · 318 posts · Server vis.social

Going back to our new #DataNotebook 📓 #ProDataTools 🛠️ development, and updating our generic #DataTableRenderers 🈸 for #VSCode Notebooks 📚 this week.

File your new feature requests and enhancements in our VS Code #DataTable ⊞ repo for now:

🗃️ https://github.com/RandomFractals/vscode-data-table

#datatable #vscode #datatablerenderers #prodatatools #datanotebook

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

162 followers · 759 posts · Server mstdn.social

Imagine you have a bunch of data points and you want to know how many belong to different categories. This is where grouped counting comes in. We've got three fantastic methods for you to explore, each with its own flair: **`aggregate()`**, **`dplyr`**, and **`data.table`**.
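As a rough illustration of grouped counting, here is a minimal sketch using base R's `aggregate()`. The toy data and the column name `category` are made up for this example and do not come from the linked post. The dplyr and data.table calls are shown only as comments, since they need those packages installed.

```r
# Toy data; the column name `category` is hypothetical.
df <- data.frame(category = c("a", "b", "a", "c", "b", "a"))

# Base R: add a helper column of ones and sum it within each category.
counts <- aggregate(n ~ category, data = transform(df, n = 1L), FUN = sum)
print(counts)

# Roughly equivalent calls in the other two packages (not run here):
#   dplyr::count(df, category)
#   data.table::as.data.table(df)[, .N, by = category]
```

All three approaches return one row per category with its count; they differ mainly in syntax and in the name of the count column.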

Happy counting, fellow data explorer! 🎉🔍 #DataAnalysis #RProgramming #ExploreData #dplyr #aggregate #baser #r #rstats #datatable

Post: https://www.spsanderson.com/steveondata/posts/2023-08-10/

#datatable #RStats #r #baser #aggregate #dplyr #exploredata #rprogramming #dataanalysis

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

146 followers · 703 posts · Server mstdn.social

Group percentages in R with #baser #dplyr and #datatable
#R #RStats #opensource

https://www.spsanderson.com/steveondata/posts/2023-07-24/
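
For a quick sense of what a group percentage looks like, here is a minimal base R sketch. The data and the column name `group` are assumptions for illustration, and the linked post may take a different route.

```r
# Toy data; the column name `group` is hypothetical.
df <- data.frame(group = c("a", "b", "a", "c", "b", "a", "a", "c"))

# Count rows per group, then convert the counts to percentages.
counts <- table(df$group)
pct <- round(100 * prop.table(counts), 1)

result <- data.frame(
  group = names(pct),
  n     = as.vector(counts),
  pct   = as.vector(pct)
)
print(result)
```

`prop.table()` divides each count by the total, so the percentages sum to 100 up to rounding.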

#OpenSource #RStats #r #datatable #dplyr #baser

Last updated 2 years ago

magljo · @magljo

15 followers · 18 posts · Server fosstodon.org

Spent 3 hours this evening trying to parse a deeply nested JSON file and convert it to an R data.table. Thought I'd encountered enough JSON data to be able to handle anything thrown at me but had to admit temporary defeat - I'll try again tomorrow. Maybe I'm just rusty with parsing JSON, or the two beers I had after work addled my brain. Anyone know of any good resources for handling deeply nested JSON? #rstats #json #datatable

#rstats #json #datatable

Last updated 2 years ago

devSJR :python: :rstats: · @devSJR

160 followers · 302 posts · Server fosstodon.org

Occasionally, I think about how to work effectively with #rstats. Currently, I am teaching my #bioinformatics courses with #RKWard again. I try to do most of it with packages from the base installation. #datatable is an exception. But otherwise, I like to use #within (very fast) instead of #mutate.
But there are more approaches, which are often simpler, faster, or more stable:

- https://github.com/matloff/TidyverseSkeptic/blob/master/RDesign.pdf
- https://davidhughjones.medium.com/dont-forget-non-tidyverse-solutions-979c870c7f3e

#rstats #bioinformatics #rkward #datatable #within #mutate

Last updated 2 years ago

Taras Novak 🇺🇦 · @dataSamurai

155 followers · 201 posts · Server vis.social

Running #SQL queries in our new #DataNotebook 📓 @code extension, rendering results with simple #DataTable, #DataSummary 🈷️ & #FlatDataGrid 中 from our #DataTableRenderers 🈸 + CSV #DataExport from VSCode Notebook cell output all in one go. We doubt #MalloyData is as flexible. 😎 #DataTools 🔬 ...

#datatools #MalloyData #dataexport #datatablerenderers #flatdatagrid #datasummary #datatable #datanotebook #sql

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

99 followers · 537 posts · Server mstdn.social

I recently posted a benchmark of reading in a .csv file, but over the weekend I received an email pointing out that I had omitted compressed files such as .csv.gz.

Functions tested in the benchmark:

✅ read.table
✅ read.csv
✅ fread
✅ vroom with altrep=false
✅ vroom with altrep=true
✅ read_csv
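
To give a feel for how such a timing comparison might be set up, here is a minimal sketch using only the two base R readers from the list. The toy data, file path, and object names are assumptions for illustration. This is not the benchmark from the linked post, and the other readers in the list are not covered here.

```r
# Hypothetical toy data written to a temporary gzip-compressed CSV.
set.seed(1)
toy <- data.frame(id = 1:1000, value = rnorm(1000))
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Base R readers detect gzip compression when given a file path.
t_csv   <- system.time(from_csv <- read.csv(path))
t_table <- system.time(from_table <- read.table(path, header = TRUE, sep = ","))

# User, system, and elapsed time for each reader.
print(rbind(read.csv = t_csv[1:3], read.table = t_table[1:3]))
```

A single `system.time()` call on a file this small is noisy, so a real comparison would repeat each read many times on a much larger file.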

Post: https://www.spsanderson.com/steveondata/posts/rtip-2023-03-27/

#data #help #softwaredevelopment #compression #gz #r #rstats #vroom #datatable #readr #tidyverse #baser #opensource #innovation #technology #software #benchmarking

#benchmarking #Software #Technology #innovation #OpenSource #baser #tidyverse #readr #datatable #vroom #RStats #r #gz #compression #softwaredevelopment #Help #Data

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

99 followers · 537 posts · Server mstdn.social

Today I wanted to share some out-of-the-box benchmarking for reading in a square #matrix in #r. The idea was to see how fast the default settings were for reading in these various files.

Post: https://www.spsanderson.com/steveondata/posts/rtip-2023-03-24/

#r #rstats #vroom #fst #arrow #datatable #opensource #opensourcesoftware #software #softwareengineering #technology #innovation

#innovation #Technology #softwareengineering #Software #opensourcesoftware #OpenSource #datatable #arrow #fst #vroom #RStats #r #Matrix

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

86 followers · 473 posts · Server mstdn.social

The original data.table function I wrote was slower than the original solution of tidy_bernoulli(), but with help from Reddit, LinkedIn, and Mastodon users, I got a few great improvements.

🙌 Reddit Help from: https://www.reddit.com/user/NewHere_Hi_everyone/

🙌 LinkedIn Help from: Chris Kypridemos

🙌 Mastodon Help from: @datamaps

Post: https://www.spsanderson.com/steveondata/posts/rtip-2023-03-09/

#innovation #opensource #opensourcesoftware #software #technology #datatable #benchmarking #RStats

#RStats #benchmarking #datatable #Technology #Software #opensourcesoftware #OpenSource #innovation

Last updated 2 years ago

Steven P. Sanderson II, MPH · @stevensanderson

85 followers · 458 posts · Server mstdn.social

A LinkedIn connection recently challenged me to get on with data.table. It was already on my radar, but now it has my interest and attention, so onward with it! Challenge accepted!

Post: https://www.spsanderson.com/steveondata/posts/rtip-2023-03-07/

#datatable #tidydensty #bernoulli #tibble #tidy #r #rstats #opensourcesoftware #opensource #software #softwareengineering #innovation #technology #distributions #improvement #engineering #data #bigdata #dataanalysis

#dataanalysis #bigdata #Data #engineering #improvement #distributions #Technology #innovation #softwareengineering #Software #OpenSource #opensourcesoftware #RStats #r #tidy #tibble #Bernoulli #tidydensty #datatable

Last updated 2 years ago

devSJR :python: :rstats: · @devSJR

154 followers · 280 posts · Server fosstodon.org

Everybody knows (hopefully) that data.table is great. Today, I noticed that it comes with its own update mechanism.

data.table::update_dev_pkg()

That is really useful if you want a feature that is only in the development version but still prefer a tested package.

#rstats #datatable

Last updated 3 years ago

Joris Meys · @JorisMeys

449 followers · 840 posts · Server mstdn.social

Well, that settles it then.

#RStats #dplyr #datatable #chatGPT

(Joking aside, it's spooky how well it responds to all kinds of questions students would throw at it.)

#chatgpt #datatable #dplyr #RStats

Last updated 3 years ago

Taras Novak 🇺🇦 · @dataSamurai

122 followers · 114 posts · Server vis.social

Our #DataTableRenderers 🈸 for #VSCode Notebooks 📚 has over 30,000 installs. It's one of the most widely used #dataNotebook 📓 extensions in the VS Marketplace. The extension includes scrollable #dataTable, #flatDataGrid & #dataSummary output renderers. Try it!
📥 https://marketplace.visualstudio.com/items?itemName=RandomFractalsInc.vscode-data-table

#dataTools 🛠️ 💎💎💎...

#datatools #datasummary #flatdatagrid #datatable #datanotebook #vscode #datatablerenderers

Last updated 3 years ago

Surya Teja K · @shanmukhateja

15 followers · 58 posts · Server social.linux.pizza

As part of improving documentation and encouraging best practices, I will be updating angular-datatables in the near future. There won't be major changes to the library source code.

Our goal is to shuffle the menu items and update GitHub Support templates to reflect these changes.

I'll share more details on my blog in a few weeks once I've made some progress. (hopefully!!!)

Happy new year (in advance) folks! 🎉

#Angular #datatable #opensource #newplans

#newplans #opensource #datatable #angular

Last updated 3 years ago

Ewan Donnachie · @ERDonnachie

508 followers · 704 posts · Server mstdn.social

Unpopular opinion: {data.table} syntax is:

1. Confusing in comparison with the indexing notation for base R data.frames

2. Cryptic with all those [, x := fn(a), by=var]

3. Difficult for the uninitiated to understand (unlike SQL and dplyr)

I know {data.table} is a good package with a lot of very happy users, but for some reason these disadvantages are rarely mentioned. Alongside the lack of database backend, they're the main reason I don't use the package much.

#RStats @rstats #datatable

#datatable #RStats

Last updated 3 years ago

devSJR :python: :rstats: · @devSJR

136 followers · 162 posts · Server fosstodon.org

Every time I work with data.table I think what a great package. The speed alone is great. What I also like is the tibble-like behavior when displaying data.

#tibble #datatable #rstats

Last updated 3 years ago

Steven P. Sanderson II, MPH · @stevensanderson

17 followers · 54 posts · Server mstdn.social

There might be times when you may want to get some sort of #summary #statistic like a #quantile or #IRQ on your #distribution data.

With my #r #package {TidyDensity} this is possible given the data comes from a tidy_ distribution function. If you have a vector of data you can use tidy_empirical() as a cheat.

With this function you can get output as #sapply #lapply #tibble or a #tibble where #datatable is doing the work.

Post: https://www.spsanderson.com/steveondata/posts/weekly-rtip-tidydensity-2022-11-23/

See attached!

#datatable #tibble #lapply #sapply #package #r #distribution #irq #quantile #statistic #summary

Last updated 3 years ago

docfleetwood · @docfleetwood

39 followers · 23 posts · Server fosstodon.org

@chrisadamsecon Definitely faster than #dplyr on my laptop. More vectorized functions. #tidyverse like syntax and you can mostly mix and match with dplyr as you require. Another package, #tidytable is quite interesting also. You can just type regular dplyr and it will convert to #datatable without worrying about any extra steps like dtplyr. #rstats

#dplyr #tidyverse #tidytable #datatable #rstats

Last updated 3 years ago
