Holden · @holden
543 followers · 66 posts · Server tech.lgbt

RT @jaceklaskowski
Trying to get a better grip on aggregation execution in Spark SQL, and wondering what to google to learn how to describe the topic in a more academic style.

Used "introduction aggregation" with and without "spark" and found some resources.

Any other recs? 🙏

#ApacheSpark #sparksql

Last updated 2 years ago

BirderScott · @sfraser
13 followers · 6 posts · Server mastodon.social

my cluster
is bigger than
your spark cluster

(hopefully no correlation to my inefficiency)

#spark #nerdboast #pyspark #sparksql

Last updated 2 years ago

Zvi · @Zvi
5 followers · 6 posts · Server data-folks.masto.host

Spark question:
Say you have a String field whose value is a JSON array (assume it is valid JSON). How would you select the rows whose array holds more than one JSON object?
Example: most rows have [{json}], but some rows have [{json1},{json2}]. How can we get these rows using Spark SQL?

#sparksql #spark

Last updated 2 years ago
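
One possible answer to Zvi's question above, as a minimal sketch rather than a definitive solution: Spark SQL 3.1 and later ship a built-in `json_array_length` function that returns the number of elements in the outermost JSON array, so filtering on a length greater than 1 should pick out the `[{json1},{json2}]` rows. The table name `events` and column name `payload` are hypothetical, as are the helper function names. The plain-Python helper only mirrors the predicate so sample strings can be checked without a cluster.

```python
import json

# Spark SQL query, assuming Spark 3.1+ where json_array_length is built in.
# `events` (table) and `payload` (string column) are hypothetical names.
MULTI_JSON_QUERY = """
SELECT *
FROM events
WHERE json_array_length(payload) > 1
"""


def has_multiple_elements(payload: str) -> bool:
    """Plain-Python mirror of the SQL predicate for checking sample rows locally."""
    parsed = json.loads(payload)
    # Only a JSON array with more than one element qualifies.
    return isinstance(parsed, list) and len(parsed) > 1


def select_multi_json_rows(spark):
    """Run the query on an existing SparkSession.

    Assumes pyspark is installed and a table or temp view named `events`
    is already registered; neither is set up here.
    """
    return spark.sql(MULTI_JSON_QUERY)
```

For strings that are not JSON arrays, `json_array_length` returns NULL, so those rows drop out of the filter. On older Spark versions, an alternative would be parsing with `from_json` and applying `size`, which needs a schema for the array.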