Last was an enjoyable talk by Jean-Baptiste Poline on the statistical and sociological components of #reproducibility at the University of Washington eScience Institute. I particularly liked the story on reproducibility at the start https://www.youtube.com/watch?v=HMctfZxU0mo (12/12) #statistics
Next was a fantastic conversation with @c on the history of #data and #statistics at Columbia Business School. Historical context is an instructive lens through which to view practices we take for granted, and Wiggins's take on recent #AI advances is refreshing. Highly recommend https://www.youtube.com/watch?v=sZLuhQkook8 (3/12)
“Better numbers exist to summarize location, association and spread: numbers that are easier to interpret and that don't act up with wonky data and outliers.”
(2017)
http://debrouwere.org/2017/02/01/unlearning-descriptive-statistics/
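A quick base-R illustration of the quote's point (toy numbers of my own, not from the post): a single outlier drags the mean and standard deviation far from the bulk of the data, while the median and MAD barely notice.

```r
# One wonky outlier is enough to wreck the usual summaries.
x <- c(2, 3, 3, 4, 5, 200)

mean(x)    # ~36.2: nowhere near the bulk of the data
sd(x)      # ~80.3: dominated by the single outlier
median(x)  # 3.5: unfazed
mad(x)     # ~1.48: unfazed
```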
@CPDPconferences @sfiscience @barbararlaz @aurelie Next was an excellent talk by Dan Feiler on the worst-first heuristic at Data Colada. Feiler shows that when people try to solve complex problems with interdependent parts, they often place excessive value on improving the weakest link, even when that is not the statistically optimal strategy. Highly recommend https://www.youtube.com/watch?v=Aac_1gwCLxI (5/7) #psychology #statistics #BehavioralEconomics
First was a great talk by Susan Murphy on inference for longitudinal #data after adaptive sampling. Using tooth brushing as the study setting, Murphy builds up a robust inference method for an important intervention, one that accounts for the many statistical complexities of the setting. It's an example of how, rather than ignoring that an algorithm's assumptions are violated, one can model those violations directly and end up with something much more likely to work https://www.youtube.com/watch?v=WpzVMgdYfxk (2/7) #statistics
Hey #rstats and #quartopub folks: I'd like to write up a #casestudy on a 300-level #stats class that I built in Quarto (originally in #Rmarkdown with #blogdown) and that heavily leverages R and Quarto to teach both the stats *and* R and Quarto.
Any suggestions for possible outlets for a piece like that? #academicchatter #academicpublishing #academia #publishing #Statistics
Next was an interesting talk by Monica Alexander on Bayesian #demography at the University of Washington. Alexander goes through the history of demography and how Bayesian methods first started to be used, in what contexts they can be effective, and how they should be used moving forward https://www.youtube.com/watch?v=2RtMYaBe1D0 (3/5) #statistics
Next was a fantastic talk by Juan Carlos Perdomo on performative prediction at the Vermont Complex Systems Center. This talk blew my mind: in many cases we know that predicting something changes the outcome, so what if we explicitly built that into the prediction model? I am immediately going to work on putting these ideas into practice. Highly recommend https://www.youtube.com/watch?v=-BUlUpzPyCU (10/11) #algorithms #statistics
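To get a feel for the idea, here's a toy R sketch of repeated retraining in a performative setting; the feedback model and all the numbers are my own invention, not from the talk. The deployed prediction shifts the data distribution, and refitting on the induced data converges to a performatively stable point.

```r
# Toy performative loop: the outcome distribution reacts to the
# deployed prediction theta via a simple location shift.
set.seed(1)
mu0 <- 2; eps <- 0.5  # eps = strength of the performative feedback
theta <- 0            # initial deployed prediction

for (t in 1:20) {
  y <- rnorm(5000, mean = mu0 + eps * theta)  # data induced by the model
  theta <- mean(y)                            # refit on the induced data
}

# The fixed point solves theta = mu0 + eps * theta, i.e. mu0 / (1 - eps) = 4.
theta  # ~4: a prediction that is stable under its own deployment
```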
Last was an absolute banger on the misuse of #statistics at the #RSS w/ Timandra Harkness, Anna Powell-Smith, Ed Humpherson, Paul Kiff, & Michael Blastland. What does it mean to mislead? How much authority should statisticians have over what is said publicly by authorities and politicians? Should medical, public health, or other officials be allowed to take a persuasive approach to statistics? All this & more is discussed in this incredible panel. Highly recommend https://www.youtube.com/watch?v=OhP6QzoTZO8 (10/10)
Next was a short talk by Claire McKay Bowen on providing a range of differentially private stats and analyses for tax #data. This was a wonderful breakdown of how to thoughtfully release meaningful, sensitive data to researchers. Highly recommend https://www.youtube.com/watch?v=ENvvF6jY8f4 (8/12) #statistics #DifferentialPrivacy #privacy
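For anyone new to #DifferentialPrivacy, the textbook building block (not necessarily Bowen's exact method) is the Laplace mechanism: add noise scaled to the query's sensitivity divided by epsilon. A minimal R sketch:

```r
# Laplace mechanism sketch for releasing a private count (illustrative).
# A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
dp_count <- function(true_count, epsilon) {
  u <- runif(1, -0.5, 0.5)  # inverse-CDF draw from Laplace(0, 1/epsilon)
  true_count - (1 / epsilon) * sign(u) * log(1 - 2 * abs(u))
}

dp_count(1234, epsilon = 0.1)  # noisy but privacy-preserving answer
```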
First was an excellent slate of talks on panel conditioning in longitudinal #survey #data by Ruben Bach, Fabienne Kraemer, and @floriankeusch at the #RoyalStatisticalSociety. As participants experience data collection through multiple rounds, it can change their attitudes and behaviors, leading to biased estimates, and the important work presented here explores this phenomenon. Highly recommend https://www.youtube.com/watch?v=tQ6G9MPJkW4 (2/8) #statistics
Next was a fabulous talk by Jerry Reiter on the future of public use data at #PennState #Statistics. Public agencies have been releasing #data for decades, but advances in #MachineLearning have made it increasingly easy to identify individuals in the data. This presents a number of issues with complex solutions that are examined in detail here. Highly recommend https://www.youtube.com/watch?v=sgUx1HkhzKc (7/9) #privacy
Last was a wonderful talk by Dave Hunter on clustering, model explainability, and societal outcomes at #PSU. Hunter developed the admissions algorithm at the center of a landmark Supreme Court case, and he goes into fascinating detail on the approach, logistics, and implications of his work, with some strong words for how modern ML #algorithms are designed and implemented. Highly recommend https://www.youtube.com/watch?v=gvWJsP3jbRM (7/7) #statistics #MachineLearning
Next was an interesting talk by Dylan Small on testing elaborate theories of causal hypotheses at the University of Washington. The real world is complicated, and this talk presents important methods, and ways of thinking, for testing more realistic models without breaking #statistics https://www.youtube.com/watch?v=tRUicNpsybk (6/7)
Next was a wonderful talk by Daniela Witten on double dipping problems in #statistics at the University of Washington. I love statistics talks that dig into problems affecting many analyses, and this one provides an elegant solution that folks in academia and #tech who work with large datasets should pay attention to. Highly recommend https://www.youtube.com/watch?v=0x_0uHu1JlM (6/9) #BigData #AI #MachineLearning
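If double dipping is new to you, here's a tiny self-contained R demo of the core problem (my own construction, not from the talk): cluster pure noise, then test the clusters you just found, and you get a wildly overconfident p-value.

```r
# Double dipping: use the same data to find groups and then test them.
set.seed(1)
x <- rnorm(100)                          # pure noise, no real clusters
cl <- kmeans(x, centers = 2)$cluster     # ...but k-means will "find" two
t.test(x[cl == 1], x[cl == 2])$p.value   # absurdly small p-value anyway
```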
Next was an ingenious talk by Ting Ye on bounding difference-in-differences analyses at the University of Washington. These analyses are used frequently for studying policy interventions, but Ye explores the problems this approach runs into in many cases and proposes an elegant solution. Highly recommend https://www.youtube.com/watch?v=FtOY14fmkBI (8/10) #statistics
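For the unfamiliar, the vanilla difference-in-differences estimate (before any of the bounding refinements in the talk) is just the interaction coefficient in a two-group, two-period regression. A simulated R sketch with made-up numbers:

```r
# Vanilla diff-in-diff on simulated data where the true effect is 2:
# (treated post - treated pre) - (control post - control pre).
set.seed(42)
n <- 1000
treated <- rbinom(n, 1, 0.5)
post    <- rbinom(n, 1, 0.5)
y <- 1 + 0.5 * treated + 0.3 * post + 2 * treated * post + rnorm(n)

coef(lm(y ~ treated * post))["treated:post"]  # recovers roughly 2
```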
“An ultra-Heywood case was detected”
Sounds awesome, is probably boring to fix. #Statistics #Rlang
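For the curious: a Heywood case is a factor-analysis solution where an estimated communality hits 1, and an ultra-Heywood case is one where it exceeds 1, implying a negative unique variance, which no real data can have. A sketch of checking for one in R, assuming the psych package; the simulated data are illustrative, and deliberately extracting more factors than exist is a classic way to provoke one.

```r
library(psych)  # assumed available

# Illustrative data: 9 noisy indicators driven by only 2 true factors.
set.seed(7)
f <- matrix(rnorm(200 * 2), ncol = 2)
my_data <- f %*% matrix(runif(2 * 9), 2, 9) +
  matrix(rnorm(200 * 9, sd = 0.5), ncol = 9)
colnames(my_data) <- paste0("v", 1:9)

fit <- fa(my_data, nfactors = 3, fm = "ml")  # over-factored on purpose

# Communality = share of each variable's variance explained by the factors.
# >= 1 is a Heywood case; > 1 is the impossible "ultra" variety.
bad <- fit$communality >= 1
if (any(bad)) message("Heywood case(s) in: ",
                      paste(names(which(bad)), collapse = ", "))
```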
Okay #ML, #AI, #Statistics folks, here's a challenge:
What's your favorite/most intuitive way to explain model precision, recall, and confusion matrices to beginners?
Are there hacks you use to keep things straight?
#AcademicChatter #research #PhD #artificialintelligence #MachineLearning #chatgpt
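Here's the toy worked example I'd reach for (hypothetical numbers): lay out the confusion matrix first, then read precision and recall straight off it.

```r
# Toy confusion matrix: 100 emails, with "spam" as the positive class.
predicted <- factor(c(rep("spam", 30), rep("ham", 70)))
actual    <- factor(c(rep("spam", 25), rep("ham", 5),    # 25 TP, 5 FP
                      rep("spam", 10), rep("ham", 60)))  # 10 FN, 60 TN
table(predicted, actual)

tp <- 25; fp <- 5; fn <- 10
tp / (tp + fp)  # precision ~0.83: of what we flagged, how much is spam?
tp / (tp + fn)  # recall ~0.71: of the real spam, how much did we catch?
```

The mnemonic that sticks for me: precision starts from the prediction, recall starts from the reality.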
First was a phenomenal talk by Seth Spielman on the rise of model-based #data at the #AlanTuringInstitute. Data has never been objective, but this talk explores how the injection of modern #statistics into dataset augmentation creates new issues. Highly recommend https://www.youtube.com/watch?v=waiyxVeQ_kI (2/12)
http://rtutor.ai/ <-- very fun to play with "generative stats", writing #R code through #GPT3 ... requires some knowledge of R and #statistics, but potentially great for people who use R every few months and then google for the commands and libraries again; thx @StevenXGe@twitter.com