Given a singly linked list, select a random node from the list. The probability of picking any given node should be 1/N if there are N nodes in the list. You are given a random number generator.
Below is a Simple Solution
1) Count number of nodes by traversing the list.
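2) Traverse the list again and select the i’th node with probability 1/(N-i+1), given that no earlier node was selected; this makes the overall probability of each node 1/N. The selection can be done by generating a random number from 0 to N-i (inclusive) for the i’th node and selecting that node only if the generated number is equal to 0 (or any other fixed number from 0 to N-i). Stop at the first node selected.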
We get uniform probabilities with the above scheme.
i = 1, probability of selecting first node = 1/N
i = 2, probability of selecting second node = [probability that first node is not selected] * [probability that second node is selected] = ((N-1)/N) * (1/(N-1)) = 1/N
Similarly, the probability of selecting each of the other nodes is 1/N.
The above solution requires two traversals of linked list.
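The two-pass scheme above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the `Node` class, the `build_list` helper, and the function name `random_node_two_pass` are hypothetical names introduced here, and the standard `random` module stands in for the given random number generator.

```python
import random


class Node:
    """Hypothetical singly linked list node (names are illustrative)."""

    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def build_list(keys):
    """Helper (assumed for this sketch): build a linked list from a sequence."""
    head = None
    for key in reversed(keys):
        head = Node(key, head)
    return head


def random_node_two_pass(head, rng=random):
    """Return the key of a uniformly chosen node using two traversals."""
    if head is None:
        raise ValueError("list is empty")

    # Pass 1: count the nodes.
    n = 0
    node = head
    while node is not None:
        n += 1
        node = node.next

    # Pass 2: pick node i (1-based) with probability 1/(n-i+1),
    # given that no earlier node was picked.
    node = head
    for i in range(1, n + 1):
        # randrange(n - i + 1) is uniform over 0 .. n-i inclusive.
        if rng.randrange(n - i + 1) == 0:
            return node.key
        node = node.next


head = build_list([10, 20, 30, 40, 50])
print(random_node_two_pass(head))  # one of the five keys; varies per run
```

For the last node the range collapses to the single value 0, so the loop always returns by the time it reaches the end of the list.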
How to select a random node with only one traversal allowed?
The idea is to use Reservoir Sampling. This is a simpler version of Reservoir Sampling, as we need to select only one key instead of k keys. Following are the steps.
(1) Initialize result as first node: result = head->key
(2) Initialize n = 2 and current = head->next
(3) Now one by one consider all nodes from the 2nd node onward.
(3.a) Generate a random number from 0 to n-1. Let the generated random number be j.
(3.b) If j is equal to 0 (we could choose any other fixed number from 0 to n-1), then replace result with the current node's key.
(3.c) n = n+1
(3.d) current = current->next
Below is the implementation of the above algorithm.
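A minimal sketch of the steps above in Python, assuming a hypothetical `Node` class with `key` and `next` fields and using the standard `random` module as the random number generator. The names `build_list` and `random_node_one_pass` are illustrative only.

```python
import random


class Node:
    """Hypothetical singly linked list node (names are illustrative)."""

    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def build_list(keys):
    """Helper (assumed for this sketch): build a linked list from a sequence."""
    head = None
    for key in reversed(keys):
        head = Node(key, head)
    return head


def random_node_one_pass(head, rng=random):
    """Return the key of a uniformly chosen node using a single traversal."""
    if head is None:
        raise ValueError("list is empty")

    result = head.key        # (1) start with the first node as the result
    n = 2                    # (2) position of the node about to be considered
    current = head.next

    while current is not None:          # (3) nodes from the 2nd onward
        j = rng.randrange(n)            # (3.a) uniform over 0 .. n-1
        if j == 0:                      # (3.b) happens with probability 1/n
            result = current.key
        n += 1                          # (3.c)
        current = current.next          # (3.d)

    return result


head = build_list([5, 20, 4, 3, 30])
print("Randomly selected key is", random_node_one_pass(head))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can be convenient when testing.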
Note that the above program depends on the outcome of a random function, so it may produce a different output on each run.
How does this work?
Let there be a total of N nodes in the list. It is easier to understand starting from the last node.
The probability that the last node is the result is simply 1/N. [For the last or N’th node, we generate a random number from 0 to N-1 and make the last node the result if the generated number is 0 (or any other fixed number).]
The probability that the second last node is the result should also be 1/N.
The probability that the second last node is the result = [Probability that the second last node replaces result] * [Probability that the last node doesn't replace the result] = [1/(N-1)] * [(N-1)/N] = 1/N
Similarly, the probability that the third last node is the result = [1/(N-2)] * [(N-2)/(N-1)] * [(N-1)/N] = 1/N, and the same argument gives 1/N for every other node.
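The same argument can be checked with exact arithmetic for every position. The sketch below multiplies the probability that node i becomes the result (1/i) by the probability that each later node k leaves it in place ((k-1)/k). The function name `final_probability` and the choice N = 6 are illustrative assumptions.

```python
from fractions import Fraction


def final_probability(i, total):
    """Exact probability that node i (1-based) is the final result."""
    # Node i becomes the result with probability 1/i
    # (node 1 starts as the result, so its factor is 1/1).
    p = Fraction(1, i)
    # Each later node k replaces the result with probability 1/k,
    # so node i survives that step with probability (k-1)/k.
    for k in range(i + 1, total + 1):
        p *= Fraction(k - 1, k)
    return p


N = 6
for i in range(1, N + 1):
    print(i, final_probability(i, N))  # every probability is 1/6
```

The product telescopes: (1/i) * (i/(i+1)) * ... * ((N-1)/N) = 1/N, which is why every node ends up equally likely.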