Stephen Townshend · @thekiwisre
67 followers · 153 posts · Server hachyderm.io

This week on Slight Reliability I had the honour of interviewing Courtney Nash about why mean time to recover (MTTR) is an unhelpful metric, what she learned by analysing 10+ incident reports, and much more.

🕵🏽‍♀️ Instead of MTTR, let's focus on learning from incidents, observing patterns and themes, involving leadership, and adding an "accident investigator" lens after the fact to enhance the learning.

youtube.com/watch?v=k-tuE9aMg3

#MTTR #sre #devops #incidents #SlightReliability

Last updated 1 year ago

Stephen Townshend · @thekiwisre
65 followers · 146 posts · Server hachyderm.io

This week on Slight Reliability I chat with Martin Thwaites from Honeycomb.io about observability during development (ODD). Some of my takeaways:

💻 How observability in development frees up developers to spend less time debugging and more time writing code.

🤖 That manual instrumentation is where the power is.

💰 Keeping the cost of observability data down through a combination of head and tail based sampling. "Keeping every span of trace data is irresponsible".

youtube.com/watch?v=dsLVtqILbH

#SlightReliability #observability #development #odd

Last updated 1 year ago

Stephen Townshend · @thekiwisre
65 followers · 145 posts · Server hachyderm.io

This week on Slight Reliability... how do we prevent observability from only generating value for a small set of engineers? How do executives, product managers, and other stakeholders leverage its power?

youtube.com/watch?v=rH0U1sKr-T

(You can also listen to Slight Reliability via most podcast platforms, or check out slightreliability.com/)

#SlightReliability #observability

Last updated 1 year ago

Stephen Townshend · @thekiwisre
57 followers · 118 posts · Server hachyderm.io

Unfortunately there is no episode this week... So as is tradition, I have a haiku for you.

#SlightReliability #sre

Last updated 1 year ago

Stephen Townshend · @thekiwisre
57 followers · 114 posts · Server hachyderm.io

Who else is going to be at AWS Summit in London on June 7th? Would be great to meet some of the community in person. aws.amazon.com/events/summits/

#awssummit #aws #SlightReliability

Last updated 1 year ago

Stephen Townshend · @thekiwisre
50 followers · 84 posts · Server hachyderm.io

This week on Slight Reliability I chat to Ivan Merrill about his experiences implementing observability in the real world. We discuss making observability part of onboarding, discussing risk to get leadership buy-in, inviting over inflicting practices, and much more.

youtube.com/watch?v=6osDq8DSxc

#observability #sre #SlightReliability #reliability

Last updated 2 years ago

Stephen Townshend · @thekiwisre
47 followers · 79 posts · Server hachyderm.io

Yesterday Slight Reliability reached 1k subscribers on YouTube! Just wanted to say thank you to everyone who has listened and joined in the discussion about SRE!

#SlightReliability #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
47 followers · 75 posts · Server hachyderm.io

This week on Slight Reliability... what is "insight" in observability? Are tool vendors lying to us about being able to provide it? Is it science? Art? Or magic? youtube.com/watch?v=i2GFEobj2g

#SlightReliability #observability #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
45 followers · 64 posts · Server hachyderm.io

This week on Slight Reliability I reminisce about my performance testing days, when I used to analyse complete sets of raw data using scatterplots, and ponder how we could apply this in observability. youtube.com/watch?v=f1GSGWGUEG

#SlightReliability #performancetesting #observability #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
39 followers · 61 posts · Server hachyderm.io

Last week on Slight Reliability I chatted to Paige Cruz from Chronosphere about cognitive overload in SRE. We chatted about how SREs are often used as the Swiss army knives of the IT department, how as humans our RAM is maxed out, why you shouldn’t give your team a name like “The Lobsters”, and a whole lot more.

This was one of my very favourite interviews I've ever done. youtube.com/watch?v=CDhGgnIGGQ

#SlightReliability #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
39 followers · 59 posts · Server hachyderm.io

This week on Slight Reliability I talk about how I think observability promises more than what we're getting. I argue that it needs to look at more than technology in order to help us negotiate the ocean of chaos in the Digital Era. youtube.com/watch?v=da3o2QSxVe

#SlightReliability #observability #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
39 followers · 59 posts · Server hachyderm.io

This week on Slight Reliability... what do we do with all our telemetry data? Should we put it all in a data lake? Or is there another way we can pull insight together? youtube.com/watch?v=Mv55p1kXz6

#SlightReliability #telemetry #sre #observability

Last updated 2 years ago

Stephen Townshend · @thekiwisre
37 followers · 54 posts · Server hachyderm.io

My second official blog, focusing on my takeaways from AWS re:Invent. I explore serverless, observability data lakes, topologies (technology maps), FinOps, and more. Oh, and lots of art! squaredup.com/blog/slight-reli

#SlightReliability #sre #reinvent #mspaint

Last updated 2 years ago

Stephen Townshend · @thekiwisre
37 followers · 53 posts · Server hachyderm.io

What is the future of SRE? This week on Slight Reliability I'm joined by the hosts of the @oncallmemaybe podcast, @adrianamvillela and @anamedina, to discuss just this.

We discuss the role of observability in SRE, recruitment tactics, company culture and leadership buy-in, cognitive load, leveraging the scale of community, and more.

youtube.com/watch?v=WNOnq5Mc8C

#sre #SlightReliability #observability

Last updated 2 years ago

Stephen Townshend · @thekiwisre
37 followers · 53 posts · Server hachyderm.io

How do you improve yourself as an SRE, or in any other role in technology? This week on Slight Reliability I share the books I read in 2022 and what I gained from each. Perhaps one of them could be useful to you? youtube.com/watch?v=g54j6lTBbf

#sre #SlightReliability

Last updated 2 years ago

Stephen Townshend · @thekiwisre
32 followers · 46 posts · Server hachyderm.io

About to log off for the year. Thank you to SquaredUp for being an awesome employer, and to everyone who tuned into Slight Reliability (or read my articles) in 2022. Looking forward to hitting the ground running in 2023.

I hope you all have a well earned break, and if you're on call over the holiday period... may your incidents be few, and your MTTR extremely small. Oh wait, MTTR has been disproved or something hasn't it? How about, hope it goes smoothly?

#SlightReliability #sre #observability

Last updated 2 years ago

Stephen Townshend · @thekiwisre
29 followers · 45 posts · Server hachyderm.io

This week on Slight Reliability, Henrik Rexed (from Dynatrace) and I share our new year's resolutions. We chat about observability, continuous profiling, using distributed tracing in testing, and much more. youtube.com/watch?v=e5PzmBYsYN

#SlightReliability #observability #otel #profiling #tracing #testing #sre

Last updated 2 years ago

Stephen Townshend · @thekiwisre
27 followers · 35 posts · Server hachyderm.io

This week on Slight Reliability I chat to Gwen Berry and Steve Gill about starting an SRE team from scratch. We discuss failing at SLO adoption, being on-call as a junior engineer, single pane of glass observability, and much more. youtube.com/watch?v=o5zm_GgdbE

#SlightReliability #sre #slo #observability

Last updated 2 years ago