Enjoying the discussion of cross-validation methods for air quality sensor data at the EPA air sensor QA workshop. It’s easy to overestimate how well your sensor data corrections or fusion applications perform unless you use a rigorous, independent test approach #airquality #airpollution #crossvalidation #lowcostsensors @dwestervelt https://www.epa.gov/amtic/2023-air-sensors-quality-assurance-workshop
#lowcostsensors #crossvalidation #airpollution #airquality
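One possible reading of an "independent test" for sensor data is to hold out whole sensors rather than random rows, so readings from the same device never sit on both sides of the split. A minimal sketch, assuming hypothetical sensor IDs (the workshop's actual protocol is not given in the post):

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), one split per distinct group."""
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical example: six readings from three sensors.
sensor_ids = ["A", "A", "B", "B", "B", "C"]
splits = list(leave_one_group_out(sensor_ids))

for held_out, train_idx, test_idx in splits:
    print(f"test on sensor {held_out}: train={train_idx} test={test_idx}")
```

A random row-level split would let the model see sensor "A" during fitting and then be scored on other readings from sensor "A", which tends to flatter the result.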
🔍 Cross-Validation (CV): A powerful technique in #MachineLearning! Validation Set Approach, Leave-One-Out CV, k-fold CV: methods to validate and fine-tune models. In our article, we present these techniques in detail. #DataScience #CrossValidation
Link:
https://medium.com/mlearning-ai/a-beginner-friendly-introduction-to-cross-validation-2e37e70a592c
🪄 Join our weekly Magic AI newsletter for the latest AI updates! It's free.
https://tinztwins.gumroad.com/l/magic-ai-newsletter
#machinelearning #datascience #crossvalidation
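For readers who want to see the splitting itself, here is a minimal sketch of k-fold index generation in plain Python. The function name `k_fold_indices` is made up for this illustration and is not taken from the linked article; leave-one-out CV falls out as the special case where k equals the number of samples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, non-overlapping folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 5))   # 5-fold CV on 10 samples
loo = list(k_fold_indices(4, 4))      # leave-one-out CV on 4 samples

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice the data would usually be shuffled first; the sketch keeps the order fixed so the folds are easy to read.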
3/
#Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special trending case: #CrossValidation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning math. tricks where many dimensions/parameters are tuned by using much less data
Without a deep understanding, black-box tools lead astray
#feynman #crossvalidation #machinelearning
3/
Again #Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special cases: #CrossValidation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning math. tricks where many dimensions/parameters are tuned by using much less data.
Without a deep understanding, black-box tools lead astray.
#feynman #crossvalidation #machinelearning
3/
Again #Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special cases: #CrossValidation (where data for selecting/tuning a certain model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning math tricks where many dimensions/parameters are tuned by using much less data
Without deep knowledge, black-box tools are so risky
#feynman #crossvalidation #machinelearning
3/
Again #Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
https://archive.org/details/meaningofitallth0000feyn_d8d3/page/80/mode/2up?q=%22it+doesn%E2%80%99t+make+any+sense+to+calculate+after+the+event%22&view=theater
Special cases: #CrossValidation (data for selecting/tuning a model also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning tricks where many dimensions/parameters are tuned by using much less data.
#feynman #crossvalidation #machinelearning
3/
A more specific problem (special case of the point #Feynman made so clear) deals with #CrossValidation (where data for selecting/tuning a model are also iteratively used to test it, with allegedly "clever" methods to avoid fooling oneself) and other #MachineLearning mathematical tricks where many dimensions/parameters are tuned by using much less data.
So easy to fall into #overfitting, or modelling not only the studied "signal" in these few data but also (or mostly) their useless noise.
#feynman #crossvalidation #machinelearning #overfitting
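The selection effect described here can be reproduced with pure noise. A minimal sketch, under made-up assumptions (200 random binary features, 40 samples for selection, labels independent of every feature): picking the feature that best matches the labels on the small set gives a flattering score, while the same feature scored on fresh data falls back to chance.

```python
import random


def make_noise_data(rng, n_samples, n_features):
    """Random binary features and random binary labels, independent of each other."""
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.randint(0, 1) for _ in range(n_samples)]
    return X, y


def accuracy_of_feature(X, y, j):
    """Accuracy of the rule 'predict the label to be feature j'."""
    return sum(row[j] == label for row, label in zip(X, y)) / len(y)


rng = random.Random(0)
n_features = 200
X_small, y_small = make_noise_data(rng, 40, n_features)    # used to select
X_fresh, y_fresh = make_noise_data(rng, 2000, n_features)  # never used to select

# Select the "best" feature on the small set, then score it on both sets.
best_j = max(range(n_features), key=lambda j: accuracy_of_feature(X_small, y_small, j))
selection_acc = accuracy_of_feature(X_small, y_small, best_j)
fresh_acc = accuracy_of_feature(X_fresh, y_fresh, best_j)

print(f"accuracy on the data used for selection: {selection_acc:.3f}")
print(f"accuracy on fresh data:                  {fresh_acc:.3f}")
```

There is no signal to find, so any accuracy well above 0.5 on the selection set is the "peculiar case" being picked after the event.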
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
#MachineLearning #ModelEvaluation #CrossValidation #HyperparameterOptimization
#machinelearning #modelevaluation #crossvalidation #hyperparameteroptimization
I have two binary classifiers, A and B, trained and tested through #crossvalidation on the same strongly unbalanced training set, in which positive-class samples make up 7% of the total.
The ROC-AUC of A and B is 0.950 and 0.949, respectively, while the area under the precision-recall curve is 0.716 and 0.717, respectively. Neither of these differences is statistically significant.
#datascience #machinelearning #artificialintelligence #statistics #classification
#classification #statistics #artificialintelligence #machinelearning #datascience #crossvalidation
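For context on why the two metrics sit so far apart, here is a minimal sketch that computes both from scores on a toy imbalanced sample. The labels and scores are invented, and average precision is used as one common estimate of the area under the precision-recall curve (an assumption, since the post does not say how that area was computed). A random ranker scores about 0.5 on ROC-AUC but only about the positive prevalence (0.07 in the post) on the precision-recall side.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data: 2 positives among 10 samples, one negative ranked above a positive.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC-AUC = {auc:.4f}, average precision = {ap:.4f}")
```

A single misranked negative costs little ROC-AUC when negatives are plentiful, but it lowers precision directly, which is why the precision-recall figure is usually the more sensitive of the two on unbalanced data.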