1722. Minimize Hamming Distance After Swap Operations
Group swappable indices into connected components, then compare the values inside each component to count the minimum possible mismatches.
Problem statement
You are given two integer arrays, source and target, both of length n. You are also given an array allowedSwaps where each allowedSwaps[i] = [ai, bi] means you can swap the elements at indices ai and bi in source.
You can use a specific allowed swap any number of times and in any order.
The Hamming distance between two arrays of the same length is the number of positions where the arrays differ. We need to return the minimum possible Hamming distance after performing any number of allowed swaps on source.
Examples
Example 1
Input: source = [1,2,3,4], target = [2,1,4,5], allowedSwaps = [[0,1],[2,3]]
Output: 1
Explanation:
- Swap indices
0and1:source = [2,1,3,4] - Swap indices
2and3:source = [2,1,4,3]
Now only index 3 is different, so the answer is 1.
Example 2
Input: source = [1,2,3,4], target = [1,3,2,4], allowedSwaps = []
Output: 2
There are no allowed swaps, so the Hamming distance stays 2.
Example 3
Input: source = [5,1,2,4,3], target = [1,5,4,2,3], allowedSwaps = [[0,4],[4,2],[1,3],[1,4]]
Output: 0
Constraints
n == source.length == target.length1 <= n <= 10^51 <= source[i], target[i] <= 10^50 <= allowedSwaps.length <= 10^5allowedSwaps[i].length == 20 <= ai, bi <= n - 1ai != bi
Explanation
Given allowedSwaps, source, and target, we want to minimize the Hamming distance, which is the number of indices where source[i] != target[i].
The first thing to notice is this:
- we are allowed to swap only the indices mentioned directly or indirectly through connections
- we are not asked to minimize the number of swaps
- we can do any number of swaps
So the actual order of swaps is not the important part here.
The real question is:
Which indices belong to the same group, so that values can move freely inside that group?
For example:
- If we have swaps
(a, b)and(c, d), then these are separate groups, so we cannot mix values between them. - If we have swaps
(a, b)and(b, c), thena,b, andcall become connected, so effectively we can rearrange values among them.
That means we should think of the indices as a graph:
- each index is a node
- each allowed swap is an edge
Now every connected component gives us one group where we can rearrange source values however we want.
So for each connected component:
- collect all values from
source - collect all values from
target - remove common values
- the values left unmatched contribute to the answer
You can think of it like this:
- inside one component, we can permute values freely
- so only the frequency of values matters
- anything that cannot be matched inside that component must stay mismatched
That is why the answer becomes the sum of unmatched values across all connected components.
Code
class Solution {
public:
int minimumHammingDistance(vector<int>& source, vector<int>& target, vector<vector<int>>& allowedSwaps) {
int n = source.size();
vector<vector<int>> graph(n + 1);
for (auto& swap : allowedSwaps) {
graph[swap[0]].push_back(swap[1]);
graph[swap[1]].push_back(swap[0]);
}
vector<int> vis(n);
multiset<int> A, B;
auto dfs = [&](auto&& dfs, int node) -> void {
// Collect one component where swaps can be performed freely.
A.insert(source[node]);
B.insert(target[node]);
vis[node] = 1;
for (auto& child : graph[node]) {
if (vis[child]) continue;
dfs(dfs, child);
}
};
int ans = 0;
for (int i = 0; i < n; i++) {
if (vis[i]) continue;
A.clear();
B.clear();
dfs(dfs, i);
// Count values from source that cannot be matched in target
// within the same connected component.
while (!A.empty()) {
if (B.find(*A.begin()) == B.end()) ans++;
B.extract(*A.begin());
A.erase(A.begin());
}
}
return ans;
}
};