Yaro Ivanovsky · @YaroIvanovsky
0 followers · 3 posts · Server mas.to
Dylan Van Assche · @dylanvanassche
684 followers · 1016 posts · Server fosstodon.org

Claus Stadler is presenting their work behind SANSA: 'Scaling RML and SPARQL-based Knowledge Graph Construction with Apache Spark' now at the Knowledge Graph Construction Workshop!

@eswc_conf @aksw

#eswc2023 #kgcw2023 #rml #sparql #apachespark

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 57 posts · Server social.lfx.dev

📣 We are excited to announce the release of Delta Lake 2.4.0 on Apache Spark 3.4. Similar to Apache Spark™, we have released Maven artifacts for both Scala 2.12 and Scala 2.13! 🎉

Documentation: lnkd.in/eTD9ua_6
Python artifacts: lnkd.in/e65AeChW

⭐ View the release notes: lnkd.in/er2PDhjJ

#deltalake #opensource #oss #data #apachespark

Last updated 1 year ago

Tech news from Canada · @TechNews
454 followers · 12806 posts · Server mastodon.roitsystems.ca
IT News · @itnewsbot
3096 followers · 256330 posts · Server schleuss.online

“A really big deal”: Dolly is a free, open source, ChatGPT-style AI model

On Wednesday, Databricks released... - arstechnica.com/?p=1931693

#ai #meta #llama #dolly #pythia #biz #finetuning #eleutherai #databricks #apachespark #textsynthesis #machinelearning #largelanguagemodels

Last updated 2 years ago

Delta Lake · @deltalakeoss
38 followers · 37 posts · Server social.lfx.dev

Check out the latest installment of on which includes:

✔️ The fully packed delta-spark 2.3 release (it's not always about and ... just a lot of the time)

✔️ A great post by Will Girten on and

✔️ A shout to The Linux Foundation project

✔️ Our new archive via Linen, &..

✔️ A great blog by Nick Karpov on and

youtu.be/nILXJ-0aTPY

#lastweekinabyte #deltalake #rustlang #python #apachespark #deltasharing #streaming #finos #legend #slack #aws #lambda #opensource #linuxfoundation

Last updated 2 years ago

Carlos Peña · @capemo
30 followers · 200 posts · Server infosec.exchange

Yesterday we tried to upload 10M rows to an Azure Table using Apache Spark and Databricks. We hit 333K rows per minute at our best.

Wondering if anyone here has done something similar?

From what I read, the max transactions per second on an Azure Table is 20K so… I guess we can try to speed it up a bit further.

#azure #apachespark #databricks

Last updated 2 years ago
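
A back-of-the-envelope check of the Azure Table figures in the post above, as a minimal sketch. It assumes one row maps to one transaction (the post does not say so), and it takes the 20K transactions-per-second ceiling at face value, since the poster only cites it as "from what I read". The variable names are illustrative.

```python
# Figures quoted in the post.
total_rows = 10_000_000
rows_per_minute = 333_000

# Ceiling cited by the poster; unverified, treated here as an assumption.
max_transactions_per_second = 20_000

# Observed throughput converted to rows per second.
rows_per_second = rows_per_minute / 60

# Time for the full upload at the observed rate, in minutes.
minutes_at_observed_rate = total_rows / rows_per_minute

# How far the observed rate is from the cited ceiling,
# assuming one row equals one transaction.
headroom_factor = max_transactions_per_second / rows_per_second

# Best-case upload time if the ceiling were reached, in minutes.
minutes_at_ceiling = total_rows / max_transactions_per_second / 60

print(f"observed: {rows_per_second:.0f} rows/s, {minutes_at_observed_rate:.1f} min total")
print(f"headroom: {headroom_factor:.1f}x, {minutes_at_ceiling:.1f} min at the ceiling")
```

Under those assumptions the observed rate is roughly a quarter of the cited ceiling, which matches the poster's guess that there is still room to speed things up.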

Poda Black · @PodaBlack
4 followers · 27 posts · Server zbrx.org

Data Science Frameworks: This field involves the use of statistics, scientific methods, and algorithms to extract knowledge. Popular data science frameworks include TensorFlow, PyTorch, Apache Spark, and NumPy, with Python being the dominant programming language.

#tensorflow #pytorch #apachespark #numpy #python

Last updated 2 years ago

Carlos Peña · @capemo
30 followers · 195 posts · Server infosec.exchange

Wish someday we could get desktop computers instead of laptops for work.

I use Apache Spark on a daily basis and whenever I start running some functional tests the fans don’t like it. Yesterday the CPU was at 90°C. :blob_cry:

#apachespark

Last updated 2 years ago

Carlos Peña · @capemo
26 followers · 163 posts · Server infosec.exchange

Refactored some code today. Reduced execution time from ~13h to 5m.

Apache Spark is amazing.

#apachespark

Last updated 2 years ago

Jacek Laskowski · @jaceklaskowski
98 followers · 18 posts · Server fosstodon.org

@oleg Mostly source code (so I can learn even more at the same time). Worked great with Apache Spark, Delta Lake, and Apache Kafka (as they are all written in Scala). I'm thinking of Dask, as it's close to Spark but written in Python, which I'd like to know better. HTH

#apachespark #DeltaLake #apachekafka #scala #dask #python

Last updated 2 years ago

Wojtek Walczak · @wojtekwalczak
26 followers · 40 posts · Server mastodon.social

My Medium adventure enters a new phase: the first post for a Medium-held publication, Plumbers of Data Science, just got published :)

It's also more technical than my previous writings. The point is to introduce Apache Hudi in a softer way than the official documentation does at the moment. So, if you're interested in starting with Hudi, look no further :)

medium.com/plumbersofdatascien

#ApacheHudi #apachespark #dataengineering

Last updated 2 years ago

Paul King · @paulk
13 followers · 3 posts · Server foojay.social
Paul King · @paulk
13 followers · 4 posts · Server foojay.social
Alex Ott · @alexott
1 followers · 7 posts · Server infosec.exchange

Another post on the company blog: Build Reliable and Cost Effective Streaming Data Pipelines With Delta Live Tables’ Enhanced Autoscaling

Autoscaling is important for handling cybersecurity data, which is often spiky by nature. For this blog post I specifically selected Zeek logs as an example to demonstrate that it’s possible to build cost-efficient data ingestion pipelines.

databricks.com/blog/2022/12/08

#databricks #zeek #deltalivetables #cybersecurity #apachespark

Last updated 2 years ago

IT News · @itnewsbot
2274 followers · 240364 posts · Server schleuss.online

AWS Glue upgrades Spark engines, backs Ray framework - AWS Glue, a serverless data integration service provided by Amazon Web Services, showc... - infoworld.com/article/3681339/

#python #apachespark #cloudcomputing #dataintegration #amazonwebservices

Last updated 2 years ago

Ambarish Ganguly · @ambarish
1 followers · 7 posts · Server hachyderm.io

This is a video on Joins, Null Values, and Built-In Functions, the 8th video of the playlist Apache Spark Developer Associate [Databricks]

The material is from the Databricks Academy. Please subscribe and comment, and let's learn together as a community.

Objectives
✅ Apply built-in functions to generate data for new columns
✅ Apply DataFrame NA functions to handle null values
✅ Join DataFrames

youtu.be/oF3_AbndFcY

#azuredatabricks #apachespark #apachesparkdeveloper #certifications

Last updated 2 years ago

Alex Ott · @alexott
0 followers · 3 posts · Server infosec.exchange
Dirk Van den Poel · @dirkvandenpoel
61 followers · 14 posts · Server mastodon.online