Coding PYTHON Randomized Algorithms

Python Programming – Select a Random Node from a Singly Linked List

The idea is to use Reservoir Sampling. Following are the steps. This is a simpler version of Reservoir Sampling as we need to select only one key.

Given a singly linked list, select a random node from linked list (the probability of picking a node should be 1/N if there are N nodes in list). You are given a random number generator.

Below is a Simple Solution
1) Count number of nodes by traversing the list.
2) Traverse the list again and select every node with probability 1/N. The selection can be done by generating a random number from 0 to N-i for i’th node, and selecting the i’th node node only if generated number is equal to 0 (or any other fixed number from 0 to N-i).

We get uniform probabilities with above schemes.

i = 1, probability of selecting first node = 1/N
i = 2, probability of selecting second node =
                   [probability that first node is not selected] * 
                   [probability that second node is selected]
                  = ((N-1)/N)* 1/(N-1)
                  = 1/N

Similarly, probabilities of other selecting other nodes is 1/N

The above solution requires two traversals of linked list.

How to select a random node with only one traversal allowed?
The idea is to use Reservoir Sampling. Following are the steps. This is a simpler version of Reservoir Sampling as we need to select only one key instead of k keys.

(1) Initialize result as first node
   result = head->key 
(2) Initialize n = 2
(3) Now one by one consider all nodes from 2nd node onward.
    (3.a) Generate a random number from 0 to n-1. 
         Let the generated random number is j.
    (3.b) If j is equal to 0 (we could choose other fixed number 
          between 0 to n-1), then replace result with current node.
    (3.c) n = n+1
    (3.d) current = current->next

Below is the implementation of above algorithm.

Python Program
# Python program to randomly select a node from singly
# linked list 
 
import random
 
# Node class 
class Node:
 
    # Constructor to initialize the node object
    def __init__(self, data):
        self.data= data
        self.next = None
 
class LinkedList:
 
    # Function to initialize head
    def __init__(self):
        self.head = None
 
    # A reservoir sampling based function to print a
    # random node from a linkd list
    def printRandom(self):
 
        # If list is empty 
        if self.head is None:
            return
 
        # Use a different seed value so that we don't get 
        # same result each time we run this program
        random.seed()
 
        # Initialize result as first node
        result = self.head.data
 
        # Iterate from the (k+1)th element nth element
        current = self.head 
        n = 2
        while(current is not None):
             
            # change result with probability 1/n
            if (random.randrange(n) == 0 ):
                result = current.data 
 
            # Move to next node
            current = current.next
            n += 1
 
        print "Randomly selected key is %d" %(result)
         
    # Function to insert a new node at the beginning
    def push(self, new_data):
        new_node = Node(new_data)
        new_node.next = self.head
        self.head = new_node
 
    # Utility function to print the linked LinkedList
    def printList(self):
        temp = self.head
        while(temp):
            print temp.data,
            temp = temp.next
 
 
# Driver program to test above function
llist = LinkedList()
llist.push(5)
llist.push(20)
llist.push(4)
llist.push(3)
llist.push(30)
llist.printRandom()

Note that the above program is based on outcome of a random function and may produce different output.

READ  Python Algorithm-Shortest Path-Shortest Path in Directed Acyclic Graph

How does this work?
Let there be total N nodes in list. It is easier to understand from last node.

The probability that last node is result simply 1/N [For last or N’th node, we generate a random number between 0 to N-1 and make last node as result if the generated number is 0 (or any other fixed number]

The probability that second last node is result should also be 1/N.

The probability that the second last node is result 
          = [Probability that the second last node replaces result] X 
            [Probability that the last node doesn't replace the result] 
          = [1 / (N-1)] * [(N-1)/N]
          = 1/N

Similarly we can show probability for 3rd last node and other nodes.

About the author

Venkatesan Prabu

Venkatesan Prabu

Wikitechy Founder, Author, International Speaker, and Job Consultant. My role as the CEO of Wikitechy, I help businesses build their next generation digital platforms and help with their product innovation and growth strategy. I'm a frequent speaker at tech conferences and events.

Add Comment

Click here to post a comment