Huffman Coding with Unequal Letter Costs

Ninja
0/200
0 upvote
Asked in company
PhonePe

Problem statement

You are given a set of 'n' symbols, each with a given probability of occurrence. Your task is to devise a prefix-free binary code for these symbols using an alphabet {‘A’, ‘B’}.

However, the transmission time for the letters in the alphabet is unequal:


  • Transmitting 'A' takes 1 unit of time.


  • Transmitting 'B' takes 2 units of time.


    Your goal is to find an optimal set of codes that minimizes the total expected transmission time. The expected time is calculated as the sum over all symbols of (symbolprobability * timeofitscode).

    A prefix-free code is one where no symbol's code is a prefix of another symbol's code.


  • Detailed explanation ( Input/output format, Notes, Images )
    Input Format:
    The first line of input contains an integer 'n', the number of symbols.
    
    The second line contains 'n' space-separated floating-point numbers, representing the probabilities of the symbols. The sum of these probabilities will be 1.
    


    Output Format:
    Print a single floating-point number representing the minimum possible expected transmission time, rounded to a precision of 5 decimal places.
    


    Note:
    The standard Huffman coding algorithm greedily combines the two nodes with the lowest probabilities to form a new node. This works for uniform letter costs (e.g., '0' and '1' both costing 1 unit).
    
    For unequal letter costs, the greedy Huffman approach fails. The optimal substructure is more complex. A simple pairing might not be optimal because combining low-probability nodes might force the use of the expensive letter 'B' too often.
    
    The Larmore-Hirschberg algorithm provides a more sophisticated solution, but for the given constraints, a dynamic programming approach is more suitable.
    
    Sample Input 1:
    5
    0.1 0.1 0.2 0.2 0.4
    


    Sample Output 1:
    2.10000
    


    Explanation for Sample 1:
    For the probabilities {0.1, 0.1, 0.2, 0.2, 0.4}, a standard Huffman tree would be built. The adaptation for unequal costs involves a specific grouping strategy that a dynamic program would find. The code's calculated minimum is 2.1.
    


    Expected Time Complexity:
    The expected time complexity is O(N^2) using a standard DP approach for this problem variant.
    


    Constraints:
    2 <= n <= 200
    p[i] > 0
    Sum of p[i] = 1.0
    
    Time limit: 1 sec
    
    Approaches (1)
    Huffman Coding with Unequal Letter Costs
    Time Complexity
    Space Complexity
    Code Solution
    (100% EXP penalty)
    Huffman Coding with Unequal Letter Costs
    Full screen
    Console