k-Nearest Neighbors (Cross-Validation Version)
k-Nearest Neighbors (kNN) is a widely used machine learning algorithm. You can choose the value for topK based on experience, or use cross-validation to optimize this hyperparameter. Our library provides leave-one-out cross-validation for selecting the optimal k. Given a k value, we run the algorithm repeatedly, using each vertex with a known label as the source vertex and predicting its label, and record the prediction accuracy for that value of k. We then repeat this for every k in the given range. The goal is to find the value of k with the highest prediction accuracy in the given range for that dataset, as illustrated by the sketch below.
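To make the procedure concrete, here is a minimal sketch of leave-one-out cross-validation for cosine-similarity kNN, written in Python with NumPy rather than GSQL. The feature-matrix data layout and the function and variable names are illustrative assumptions; the library itself operates on graph vertices and weighted edges.

```python
# Illustrative sketch only: knn_cosine_cv, X, and y are hypothetical names,
# not part of the TigerGraph library.
import numpy as np

def knn_cosine_cv(X, y, min_k, max_k):
    """Leave-one-out CV: return ({k: accuracy}, best_k) for k in [min_k, max_k]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise cosine similarity between all points.
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = unit @ unit.T
    # Leave-one-out: a point must never count itself as a neighbor.
    np.fill_diagonal(sim, -np.inf)

    accuracies = {}
    for k in range(min_k, max_k + 1):
        correct = 0
        for i in range(len(y)):
            # Indices of the k most similar other points.
            neighbors = np.argpartition(sim[i], -k)[-k:]
            # Predict the majority label among those neighbors.
            labels, counts = np.unique(y[neighbors], return_counts=True)
            correct += labels[np.argmax(counts)] == y[i]
        accuracies[k] = correct / len(y)

    return accuracies, max(accuracies, key=accuracies.get)
```

Running this on a small dataset returns one accuracy per k along with the best-performing k, mirroring the result format described in the table below.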
Specifications
tg_knn_cosine_cv(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type,
                 STRING weight, STRING label, INT min_k, INT max_k) RETURNS (INT)
Characteristic | Value
---|---
Result | A list of prediction accuracies for every k value in the given range, and the value of k with the highest prediction accuracy in that range. The result is available in JSON format.
Input Parameters | `v_type`: vertex types to use; `e_type`: edge types to traverse; `re_type`: reverse edge types to traverse (for directed edges); `weight`: edge attribute to use as the edge weight; `label`: vertex attribute holding the known label; `min_k`: lower bound of the k range; `max_k`: upper bound of the k range
Result Size | max_k - min_k + 1
Time Complexity | O(max_k * E^2 / V), where V = number of vertices, E = number of edges
Graph Types | Undirected or directed edges, weighted edges
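As a usage sketch, the installed query could be invoked from Python with pyTigerGraph's `runInstalledQuery`. The host, credentials, graph name, and the Person/Friend schema below are placeholder assumptions, not part of the library.

```python
# Hypothetical invocation; connection details and schema names are placeholders.
import pyTigerGraph as tg

conn = tg.TigerGraphConnection(
    host="https://your-instance.i.tgcloud.io",
    graphname="social",
    username="tigergraph",
    password="your-password",
)

results = conn.runInstalledQuery(
    "tg_knn_cosine_cv",
    params={
        "v_type": ["Person"],           # vertex types to classify
        "e_type": ["Friend"],           # edge types to traverse
        "re_type": ["reverse_Friend"],  # reverse edge types (directed graphs)
        "weight": "weight",             # edge attribute used as the weight
        "label": "label",               # vertex attribute holding known labels
        "min_k": 2,
        "max_k": 5,
    },
)
print(results)  # JSON: one accuracy per k, plus the best-performing k
```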