Final thoughts on K Nearest Neighbors – Practical Machine Learning Tutorial with Python p.19



We’re going to cover a few final thoughts on the K Nearest Neighbors algorithm here, including the value for K, confidence, speed, and the pros and cons of the algorithm now that we understand more about how it works.

28 Comments

  1. Thanks so much for the videos! It's really helping me with some research in kinesiology at the moment. I had a question, however: Would it be possible to produce a third class that is neither of the classes of interest?

    My problem is that I don't want to train the algorithm with features and labels that are not in either of the two classes of interest, since this third class can vary wildly. I was hoping for some higher level way of saying: If (most of?) the k nearest neighbors of the test point are sufficiently far away, then the test point cannot be placed in either class.

    My best guess involves looking at the standard deviations away from the mean distances and deciding whether a point is an outlier by using that.
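One way to sketch the distance-cutoff idea from the comment above: reject a prediction when even the closest of the k neighbors is farther away than some threshold. The `max_dist` cutoff and the `knn_with_reject` helper are hypothetical, not part of the tutorial code; the threshold would need tuning (e.g. mean neighbor distance plus a few standard deviations, as the commenter suggests):

```python
import numpy as np
from collections import Counter

def knn_with_reject(data, predict, k=3, max_dist=None):
    """KNN classification, but return 'unknown' when the nearest
    neighbor is farther than max_dist (a cutoff you must tune)."""
    distances = []
    for group in data:
        for features in data[group]:
            d = np.linalg.norm(np.array(features) - np.array(predict))
            distances.append((d, group))
    distances.sort()
    nearest = distances[:k]
    # Reject: if even the closest neighbor is beyond the cutoff,
    # the point belongs to neither trained class.
    if max_dist is not None and nearest[0][0] > max_dist:
        return 'unknown'
    votes = [group for _, group in nearest]
    return Counter(votes).most_common(1)[0][0]

dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}
print(knn_with_reject(dataset, [5, 7], k=3))                # near the 'r' cluster
print(knn_with_reject(dataset, [50, 50], k=3, max_dist=5))  # far from both classes
```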

  2. Is it possible to give certain nodes a greater weight?

    My problem is placing new people into small groups based on the map locations of the people already in each group, as well as their ages and genders. That data tends to have a lot of noise, since not everyone who has chosen a group goes to the one they live closest to. So when placing a new person in a small group, I'd like to give extra weight to the actual group location, and if a group caters to a specific age or gender, I'd like to place that group's node in a more ideal spot along those dimensions too.
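A common variant that addresses weighting is distance-weighted KNN, where each neighbor's vote counts inversely to its distance; a per-node or per-class weight could be multiplied into the same vote total. This `weighted_knn` helper is a sketch, not from the tutorial:

```python
import numpy as np
from collections import Counter

def weighted_knn(data, predict, k=3):
    """KNN where each of the k nearest neighbors votes with weight
    1/distance, so closer neighbors count for more."""
    distances = []
    for group in data:
        for features in data[group]:
            d = np.linalg.norm(np.array(features) - np.array(predict))
            distances.append((d, group))
    distances.sort()
    weights = Counter()
    for d, group in distances[:k]:
        weights[group] += 1.0 / (d + 1e-9)  # epsilon avoids division by zero
    return weights.most_common(1)[0][0]

dataset = {'k': [[1, 2], [2, 3], [3, 1]], 'r': [[6, 5], [7, 7], [8, 6]]}
print(weighted_knn(dataset, [2.5, 2]))
```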

  3. If anyone is getting confidence values of only 1s and 0s (an integer-division artifact in Python 2), cast the denominator to a float, like:

    confidence = Counter(votes).most_common(1)[0][1] / float(k)

    🙂 thanks for the vids sentdex!
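The one-line fix in the comment above can be demonstrated on its own: confidence is the fraction of the k votes that went to the winning class, and the `float()` cast (or Python 3's true division) keeps it from being truncated to 0 or 1:

```python
from collections import Counter

votes = ['r', 'r', 'k']  # sample votes from k = 3 neighbors
k = len(votes)

# In Python 2, integer division would truncate this to 0;
# casting k to float gives the actual fraction.
confidence = Counter(votes).most_common(1)[0][1] / float(k)
print(confidence)  # two of three neighbors agree
```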

  4. hi sentdex~
    I found that the results of the "accuracy method" and the "accuracies method" are exactly the same.
    So why should we calculate accuracies again after having already calculated accuracy?

  5. Hi sentdex,
    I like your approach a lot, but there is a concern that needs to be resolved. In scikit-learn there are two steps for KNN: one is fit and the other is predict. The function you wrote seems to do everything together. Can you please tell me up to which part of your code does the fitting and which part does the prediction? I ask this just to correlate it with actual scikit-learn.
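For KNN the answer is that fit() does almost nothing: it is a lazy learner, so "fitting" just memorizes the training data, and all the real work (distance computation and voting) happens at predict time. The tutorial's function can be sketched in scikit-learn style like this (the `SimpleKNN` class is illustrative, not the tutorial's code):

```python
import numpy as np
from collections import Counter

class SimpleKNN:
    """The tutorial's KNN logic split into scikit-learn-style steps."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" is just storing the data -- KNN is a lazy learner.
        self.X = np.array(X)
        self.y = list(y)
        return self

    def predict(self, point):
        # All the actual computation happens here.
        dists = np.linalg.norm(self.X - np.array(point), axis=1)
        nearest = np.argsort(dists)[:self.k]
        votes = [self.y[i] for i in nearest]
        return Counter(votes).most_common(1)[0][0]

clf = SimpleKNN(k=3).fit([[1, 2], [2, 3], [6, 5], [7, 7]],
                         ['k', 'k', 'r', 'r'])
print(clf.predict([6, 6]))
```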

  6. You don't use 80% of the data to test, only the last 20%. I guess test_data = full_data[-int((1 - test_size) * len(full_data)):] is what you wanted to do.
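For reference, with test_size = 0.2 the tutorial's negative-index slicing puts the first 80% of the shuffled data in the train set and the last 20% in the test set, which can be checked directly (the stand-in data below is illustrative):

```python
import random

full_data = list(range(100))  # stand-in for the shuffled dataset rows
random.shuffle(full_data)

test_size = 0.2
split = int(test_size * len(full_data))
train_data = full_data[:-split]  # everything except the last 20% -> 80 rows
test_data = full_data[-split:]   # the last 20% -> 20 rows

print(len(train_data), len(test_data))  # 80 20
```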
