Certain data mining algorithms (including k-means clustering and k-nearest neighbors) require a user-defined parameter k. Anyone using these algorithms has to select this value, which raises the question: what is the "best" value of k for the problem at hand?
This mini-episode explores the appropriate value of k to use when trying to estimate the cost of a house in Los Angeles based on the closest sales in its area.
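One common way to approach the question is to try several values of k and keep the one with the lowest held-out prediction error. The sketch below illustrates that idea with leave-one-out cross-validation for k-nearest-neighbors regression. It is not taken from the episode: the sales data is synthetic, and the names `knn_predict`, `loo_error`, and `best_k`, along with the price formula, are assumptions made up for illustration.

```python
import math
import random


def knn_predict(train, query, k):
    """Average the prices of the k training sales closest to `query`."""
    nearest = sorted(train, key=lambda sale: math.dist(sale[0], query))[:k]
    return sum(price for _, price in nearest) / len(nearest)


def loo_error(sales, k):
    """Mean absolute error when each sale is predicted from all the others."""
    total = 0.0
    for i, (location, price) in enumerate(sales):
        rest = sales[:i] + sales[i + 1:]  # hold out sale i
        total += abs(knn_predict(rest, location, k) - price)
    return total / len(sales)


def best_k(sales, candidates):
    """Return the candidate k with the lowest leave-one-out error."""
    errors = {k: loo_error(sales, k) for k in candidates}
    return min(errors, key=errors.get), errors


# Synthetic "sales": (x, y) location on a 10 x 10 grid plus a noisy price.
# These numbers are invented for the example, not real Los Angeles data.
rng = random.Random(0)
sales = []
for _ in range(200):
    x, y = rng.uniform(0, 10), rng.uniform(0, 10)
    price = 300_000 + 40_000 * x + 20_000 * y + rng.gauss(0, 50_000)
    sales.append(((x, y), price))

k, errors = best_k(sales, range(1, 31))
print(f"best k = {k}, mean absolute error = {errors[k]:,.0f}")
```

A very small k tends to chase noise in individual sales, while a very large k averages in houses from far-away neighborhoods, so the error usually bottoms out somewhere in between. The value that wins depends on the data, which is why it is estimated rather than fixed in advance.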
Data Skeptic with Kyle Polich is available on multiple platforms.
