You are a data engineer at a large tech company, tasked with analyzing server logs to measure system load. The log is a sequence of N events, where each event is identified by a task_id.
To analyze the logs, you must divide the entire sequence of N events into exactly K consecutive, non-empty "analysis windows" (segments).
The "processing cost" for a single window is a measure of its complexity. It is calculated as follows: for each unique task_id that appears in the window, find the distance between its first and last occurrence within that window and add it to the window's total cost.
Your goal is to find a division of the logs into K windows that minimizes the total processing cost across all windows.
The first line contains two space-separated integers, N (the number of log events) and K (the number of windows).
The second line contains N space-separated integers, representing the sequence of task_ids.
The output should be a single integer representing the minimum possible total processing cost.
The cost of a window [l, r] is sum(last_occurrence(x) - first_occurrence(x)) for all unique x in A[l...r].
Splitting the log is often beneficial because it can separate distant occurrences of the same task_id into different windows, reducing or eliminating their cost contribution.
5 2
1 1 3 3 1
1
We need to split [1, 1, 3, 3, 1] into 2 windows.
Split Option: [1, 1, 3] | [3, 1]
Cost of [1, 1, 3]:
- Unique elements: 1, 3.
- For 1: first is at index 0, last at 1. Cost = 1-0=1.
- For 3: first and last at index 2. Cost = 2-2=0.
- Segment Cost = 1 + 0 = 1.
Cost of [3, 1]:
- Unique elements: 3, 1. Both appear once.
- Segment Cost = 0.
=> Total Cost = 1 + 0 = 1. This is the minimum possible.
The expected time complexity for the clear solution provided is O(K * N^2).
1 <= K <= N <= 35000 (Typical for optimized solution)
1 <= A[i] <= N
Time limit: 2 sec