I will be joining the #deltalake community at Data and AI Summit this year. I hope you will too!
Here are some of my recommended sessions:
https://www.buoyantdata.com/blog/2023-05-17-data-and-ai-summit.html
We are preparing a new release of #DeltaLake for #rustlang which includes a number of fixes and, most importantly, an upgrade of Arrow and DataFusion.
There are so many exciting things going on in the rust data processing ecosystem!
I've been taking more meetings with some startups and open source orgs lately to offer guidance on the open source data or infra landscape.
In exchange I have been asking for support in my #AIDSLifeCycle fundraising (https://giving.aidslifecycle.org/participant/rtyler)
If I can help you navigate the #AWS, #rustlang, or #deltalake ecosystems, let's chat! 👇
#AIDSLifeCycle #aws #rustlang #DeltaLake
Doing that Tiger Woods style fist pump alone in my office.
After weeks of off-and-on hacking, and some help from another #deltalake community member, I have a small and fast Lambda that streams data through SQS into a Delta table.
Referenced link: https://hubs.la/Q01HxY400
Originally posted by The Linux Foundation / @linuxfoundation@twitter.com: https://twitter.com/linuxfoundation/status/1637879367807188993#m
🗓️ Thursday, March 23rd
🕝 10:00AM PST
👥 Robert Thompson and Geoff Freeman, hosted by @dennylee
🦀 D3L2: Implementing a Data Lakehouse for Improved #DataScience and #Analytics
RSVP here ➡️ https://hubs.la/Q01HxY400
#datascience #analytics #opensource #DeltaLake #lakehouse #tmobile
Referenced link: https://hubs.la/Q01HvKZS0
Originally posted by The Linux Foundation / @linuxfoundation@twitter.com: https://twitter.com/linuxfoundation/status/1637831864248414211#m
The Python #deltalake 0.8.0 release is here! 😁🦀 In the 𝚠𝚛𝚒𝚝𝚎_𝚍𝚎𝚕𝚝𝚊𝚕𝚊𝚔𝚎 function you can use 𝚖𝚘𝚍𝚎='𝚘𝚟𝚎𝚛𝚠𝚛𝚒𝚝𝚎' in combination with 𝚙𝚊𝚛𝚝𝚒𝚝𝚒𝚘𝚗_𝚏𝚒𝚕𝚝𝚎𝚛𝚜 to overwrite part of a Delta Lake table.
View release notes ➡️ https://hubs.la/Q01HvKZS0
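For anyone wondering what a partial overwrite means in practice, here is a minimal sketch in plain Python. It is a conceptual model only, not the deltalake implementation: the table is just a list of dicts, and the helper names (matches, overwrite_partitions) and the "year" column are made up for illustration. The check that rejects new rows outside the filtered partitions is an assumption of this sketch. With the real library the call would look roughly like write_deltalake(path, data, mode='overwrite', partition_filters=[("year", "=", "2023")]), assuming a table partitioned by a "year" column.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
# (column, operator, value), the same tuple shape partition_filters takes
Filter = Tuple[str, str, Any]


def matches(row: Row, filters: List[Filter]) -> bool:
    """Return True when the row satisfies every filter (filters are AND-ed)."""
    for column, op, value in filters:
        if op == "=":
            ok = row[column] == value
        elif op == "!=":
            ok = row[column] != value
        else:
            # Hypothetical restriction: this sketch only models two operators.
            raise ValueError(f"unsupported operator in this sketch: {op}")
        if not ok:
            return False
    return True


def overwrite_partitions(
    table: List[Row], new_rows: List[Row], filters: List[Filter]
) -> List[Row]:
    """Replace only the rows whose partition values match the filters."""
    # Assumption of this sketch: new data must belong to the replaced partitions.
    stray = [row for row in new_rows if not matches(row, filters)]
    if stray:
        raise ValueError("new rows fall outside the partitions being overwritten")
    # Rows in untouched partitions survive; matching rows are dropped.
    kept = [row for row in table if not matches(row, filters)]
    return kept + new_rows


table = [
    {"year": "2022", "value": 1},
    {"year": "2023", "value": 2},
    {"year": "2023", "value": 3},
]
result = overwrite_partitions(
    table, [{"year": "2023", "value": 99}], [("year", "=", "2023")]
)
```

The 2022 row is left alone while both 2023 rows are swapped for the new one, which is the whole point of combining mode='overwrite' with partition_filters.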
I wrote a blog post sharing some example code on how to write #DeltaLake with #rustlang
Let's all start building data pipelines in Rust!
https://www.buoyantdata.com/blog/2023-02-09-rust-recordbatchwriter-example.html
Referenced link: https://hubs.la/Q01BT4JW0
Originally posted by The Linux Foundation / @linuxfoundation@twitter.com: https://twitter.com/linuxfoundation/status/1623758630519468032#m
We are excited to share that Delta Lake users can now use Conda natively to manage their delta-spark dependency!
Try it out today: https://hubs.la/Q01BT4JW0
#Conda #DeltaLake #OpenSource @DeltaLakeOSS
RT @DeltaLakeOSS
We are excited to share #DeltaLake users can now use Conda natively to manage their delta-spark dependency! 🦀 Try it out today ➡️ https://anaconda.org/conda-forge/delta-spark
#DeltaLake #opensource #dataengineering #conda #linuxfoundation #spark
I stepped into a podcast a few weeks ago to discuss the creation of delta-rs, which brings native #deltalake support to #rustlang and #python
Referenced link: https://hubs.la/Q01yFtPP0
Originally posted by The Linux Foundation / @linuxfoundation@twitter.com: https://twitter.com/linuxfoundation/status/1616181354504273927#m
We are excited to announce the release of #DeltaLake 2.0.2 on Apache Spark 3.2!! 🎊 This release contains important bug fixes and a few high-demand usability improvements over 2.0.1.
View the release notes: https://hubs.la/Q01yFtPP0
Referenced link: https://hubs.la/Q01xVX6S0
Originally posted by The Linux Foundation / @linuxfoundation@twitter.com: https://twitter.com/linuxfoundation/status/1613618177371475975#m
AWS Glue crawlers now have enhanced support for #DeltaLake tables, increasing operational efficiency to extract meaningful insights from analytics services such as Amazon #Athena, Amazon EMR, and #AWS Glue.
Learn more ➡️ https://hubs.la/Q01xVX6S0
#opensource @DeltaLakeOSS
#DeltaLake #Athena #aws #opensource
I did some #rustlang streaming tonight, and while I didn't get very far in building some #DeltaLake lambda ingestion code, I do believe I found and documented a bug in a piece of software that is handling millions of messages each day 🎉
@oleg Mostly source code (so I can learn even more at the same time). Worked great with #ApacheSpark #DeltaLake #ApacheKafka (as they are all written in #Scala). Thinking of #Dask next, as it's close to Spark but written in #Python, which I'd like to know better. HTH
#apachespark #DeltaLake #apachekafka #scala #dask #python
Pondering all the things I'd like to add to https://github.com/timvw/datafusion-gui ...
Here is a demo: https://youtu.be/fbwU8Lsp5FY
#datafusion #parquet #avro #DeltaLake #data #tools #tooling
Ended up cleaning up some RecordBatchWriter code and posting this example, which demonstrates a simple write of data to #DeltaLake in #rustlang.
I have some time this evening, debating whether I should live stream some #DeltaLake and #rustlang coding, so long as the power holds out.
@bflipp there are no protocol or transaction file changes with #DeltaLake 2.x, so you can safely run a job using the newer version just to run OPTIMIZE if your existing architecture supports that.
What kind of write workloads do ya have floating about?