Nathan Pavlovic · @N_pavlovic
117 followers · 113 posts · Server fediscience.org

Enjoying the discussion of cross-validation methods for using sensor data in air quality applications at the EPA air sensor QA workshop. It’s easy to overestimate how well sensor data corrections or fusion applications perform unless a rigorous, independent test approach is used @dwestervelt epa.gov/amtic/2023-air-sensors

#lowcostsensors #crossvalidation #airpollution #airquality

Last updated 2 years ago

Tinz Twins · @tinztwins
0 followers · 7 posts · Server me.dm

🔍 Cross-Validation (CV): A powerful technique in machine learning! Validation Set Approach, Leave-One-Out CV, k-fold CV: methods to validate and fine-tune models. In our article, we present these techniques in detail.
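
A minimal sketch of the k-fold split named above, in plain Python. The helper name `kfold_indices` is hypothetical and not taken from the linked article; it does no shuffling, and setting k equal to the number of samples gives Leave-One-Out CV.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds_3 = list(kfold_indices(10, 3))   # 3-fold CV on 10 samples
loo = list(kfold_indices(10, 10))      # Leave-One-Out as the k = n special case
```

In practice the sample order would usually be shuffled first (or grouped, for dependent data) before splitting.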

Link:
medium.com/mlearning-ai/a-begi

🪄 Join our weekly Magic AI newsletter for the latest AI updates! It's free.
tinztwins.gumroad.com/l/magic-

#machinelearning #datascience #crossvalidation

Last updated 2 years ago

Daniele de Rigo · @dderigo
120 followers · 147 posts · Server hostux.social

3/
Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
archive.org/details/meaningofi

Special trending case: cross-validation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other mathematical tricks where many dimensions/parameters are tuned using much less data

Without a deep understanding, black-box tools lead astray

#feynman #crossvalidation #machinelearning

Last updated 2 years ago

Daniele de Rigo · @dderigo
120 followers · 147 posts · Server hostux.social

3/
Again, Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
archive.org/details/meaningofi

Special cases: cross-validation (where data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other mathematical tricks where many dimensions/parameters are tuned using much less data.

Without a deep understanding, black-box tools lead astray.

#feynman #crossvalidation #machinelearning

Last updated 2 years ago

Daniele de Rigo · @dderigo
120 followers · 147 posts · Server hostux.social

3/
Again, Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
archive.org/details/meaningofi

Special cases: cross-validation (where data for selecting/tuning a certain model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other math tricks where many dimensions/parameters are tuned using much less data

Without deep knowledge, black-box tools are so risky

#feynman #crossvalidation #machinelearning

Last updated 2 years ago

Daniele de Rigo · @dderigo
120 followers · 147 posts · Server hostux.social

3/
Again, Feynman: "it doesn’t make any sense to calculate after the event. You see, you found the peculiarity, and so you selected the peculiar case"
archive.org/details/meaningofi

Special cases: cross-validation (data for selecting/tuning a model are also used to test it, with allegedly "clever" methods to avoid fooling oneself) and other tricks where many dimensions/parameters are tuned using much less data.

#feynman #crossvalidation #machinelearning

Last updated 2 years ago

Daniele de Rigo · @dderigo
120 followers · 147 posts · Server hostux.social

3/
A more specific problem (a special case of the point Feynman made so clear) deals with cross-validation (where data for selecting/tuning a model are also iteratively used to test it, with allegedly "clever" methods to avoid fooling oneself) and other mathematical tricks where many dimensions/parameters are tuned using much less data.

So easy to fall into overfitting, that is, modelling not only the studied "signal" in these few data but also (or mostly) their useless noise.
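
A rough sketch of that trap on pure noise, assuming a deliberately simple "model" (one ±1 feature, possibly sign-flipped, used directly as the prediction). All names here are hypothetical illustrations. The model is selected among 1000 candidates using only 40 samples, then scored both on those same 40 samples and on fresh data that played no part in the selection.

```python
import random

def make_data(n_samples, n_features, rng):
    """Pure noise: random +/-1 features and random +/-1 labels, with no real signal."""
    X = [[rng.choice((-1, 1)) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.choice((-1, 1)) for _ in range(n_samples)]
    return X, y

def accuracy(X, y, feature, sign):
    """Fraction of samples where sign * X[i][feature] equals the label."""
    hits = sum(1 for row, label in zip(X, y) if sign * row[feature] == label)
    return hits / len(y)

def select_model(X, y):
    """'Tune' a model: pick the (feature, sign) pair with the best accuracy on X, y."""
    candidates = [(j, s) for j in range(len(X[0])) for s in (-1, 1)]
    return max(candidates, key=lambda c: accuracy(X, y, c[0], c[1]))

rng = random.Random(0)
X_small, y_small = make_data(40, 500, rng)     # few data, many tunable choices
X_fresh, y_fresh = make_data(1000, 500, rng)   # independent data, never used for tuning

feature, sign = select_model(X_small, y_small)
reused_score = accuracy(X_small, y_small, feature, sign)  # tested on the tuning data
fresh_score = accuracy(X_fresh, y_fresh, feature, sign)   # tested on independent data
print(f"score on tuning data: {reused_score:.2f}, score on fresh data: {fresh_score:.2f}")
```

Since there is nothing to learn, the fresh-data score should sit near chance (0.5), while the score on the reused data comes out well above it: the selected "peculiar case" is mostly fitted noise.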

#feynman #crossvalidation #machinelearning #overfitting

Last updated 2 years ago

Tiago Ribeiro · @tiago_ribeiro
42 followers · 208 posts · Server mastodon.social
saldan · @saldan
28 followers · 14 posts · Server mathstodon.xyz

I have two binary classifiers, A and B, trained and tested through cross-validation on the same training set, which is strongly unbalanced: the positive-class samples make up 7% of the total.

The ROC-AUC of A and B is 0.950 and 0.949 respectively, while the area under the precision-recall curve is 0.716 and 0.717 respectively. Neither of these differences is statistically significant.
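
For reference, a small plain-Python sketch of the two metrics. The post does not say how the precision-recall area was computed; average precision (a step-wise area) is assumed here, and the function names are hypothetical. The second part scores random guesses on a set with 7% positives, to show that the chance level is about 0.5 for ROC-AUC but roughly the positive rate for the precision-recall area, so the two numbers sit on different scales.

```python
import random

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (mean precision at each positive)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / hits

# Toy ranking: positives at ranks 1 and 3 out of 5.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_roc = roc_auc(toy_labels, toy_scores)           # 5/6
toy_ap = average_precision(toy_labels, toy_scores)  # (1/1 + 2/3) / 2 = 5/6

# Chance level with 7% positives: 140 positives, 1860 negatives, random scores.
rng = random.Random(0)
labels = [1] * 140 + [0] * 1860
scores = [rng.random() for _ in labels]
chance_roc = roc_auc(labels, scores)
chance_ap = average_precision(labels, scores)
print(f"chance ROC-AUC: {chance_roc:.3f}, chance average precision: {chance_ap:.3f}")
```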

#classification #statistics #artificialintelligence #machinelearning #datascience #crossvalidation

Last updated 3 years ago