1722. Minimize Hamming Distance After Swap Operations

Problem statement

You are given two integer arrays, source and target, both of length n. You are also given an array allowedSwaps where each allowedSwaps[i] = [ai, bi] means you can swap the elements at indices ai and bi in source.

You can use a specific allowed swap any number of times and in any order.

The Hamming distance between two arrays of the same length is the number of positions where the arrays differ. We need to return the minimum possible Hamming distance after performing any number of allowed swaps on source.

Examples

Example 1

Input: source = [1,2,3,4], target = [2,1,4,5], allowedSwaps = [[0,1],[2,3]]
Output: 1

Explanation:

Swap indices 0 and 1: source = [2,1,3,4]
Swap indices 2 and 3: source = [2,1,4,3]

Now only index 3 is different, so the answer is 1.

Example 2

Input: source = [1,2,3,4], target = [1,3,2,4], allowedSwaps = []
Output: 2

There are no allowed swaps, so the Hamming distance stays 2.

Example 3

Input: source = [5,1,2,4,3], target = [1,5,4,2,3], allowedSwaps = [[0,4],[4,2],[1,3],[1,4]]
Output: 0

Constraints

n == source.length == target.length
1 <= n <= 10^5
1 <= source[i], target[i] <= 10^5
0 <= allowedSwaps.length <= 10^5
allowedSwaps[i].length == 2
0 <= ai, bi <= n - 1
ai != bi

Explanation

Given allowedSwaps, source, and target, we want to minimize the Hamming distance, which is the number of indices where source[i] != target[i].

The first thing to notice is this:

we are allowed to swap only the indices mentioned directly or indirectly through connections
we are not asked to minimize the number of swaps
we can do any number of swaps

So the actual order of swaps is not the important part here.

The real question is:

Which indices belong to the same group, so that values can move freely inside that group?

For example:

If we have swaps (a, b) and (c, d), then these are separate groups, so we cannot mix values between them.
If we have swaps (a, b) and (b, c), then a, b, and c all become connected, so effectively we can rearrange values among them.

That means we should think of the indices as a graph:

each index is a node
each allowed swap is an edge

Now every connected component gives us one group where we can rearrange source values however we want.

So for each connected component:

collect all values from source
collect all values from target
remove common values
the values left unmatched contribute to the answer

You can think of it like this:

inside one component, we can permute values freely
so only the frequency of values matters
anything that cannot be matched inside that component must stay mismatched

That is why the answer becomes the sum of unmatched values across all connected components.

Code

class Solution {
public:
    int minimumHammingDistance(vector<int>& source, vector<int>& target, vector<vector<int>>& allowedSwaps) {
        int n = source.size();
        vector<vector<int>> graph(n + 1);

        for (auto& swap : allowedSwaps) {
            graph[swap[0]].push_back(swap[1]);
            graph[swap[1]].push_back(swap[0]);
        }

        vector<int> vis(n);
        multiset<int> A, B;

        auto dfs = [&](auto&& dfs, int node) -> void {
            // Collect one component where swaps can be performed freely.
            A.insert(source[node]);
            B.insert(target[node]);
            vis[node] = 1;

            for (auto& child : graph[node]) {
                if (vis[child]) continue;
                dfs(dfs, child);
            }
        };

        int ans = 0;
        for (int i = 0; i < n; i++) {
            if (vis[i]) continue;

            A.clear();
            B.clear();
            dfs(dfs, i);

            // Count values from source that cannot be matched in target
            // within the same connected component.
            while (!A.empty()) {
                if (B.find(*A.begin()) == B.end()) ans++;
                B.extract(*A.begin());
                A.erase(A.begin());
            }
        }

        return ans;
    }
};