How to Solve the Sock Merchant Problem in Java

Explore two approaches to solving the Sock Merchant Problem in Java. The post How to Solve the Sock Merchant Problem in Java first appeared on Baeldung.       

How to Solve the Sock Merchant Problem in Java

1. Introduction

In this tutorial, we’ll study two Java solutions for the Sock Merchant problem.

2. The Sock Merchant Problem

In the Sock Merchant problem, we have a merchant with a large pile of socks for sale that must be paired by color. We represent the socks’ colors by an array of integers, socks. The merchant’s goal is to determine how many pairs of matching socks there are.

Therefore, we have three inputs:

  • n: the number of socks in the collection
  • socks: the array holding the colors of n socks
  • k: the maximal color ID in this array, meaning 1 <= socks[i] <= k (for 0 <= i < n).

For example, let’s say we have n = 7 socks with socks = [11, 22, 22, 11, 111, 33, 222] and k=222:

Visualatizaton of Socks color

Here, we have only 5 colors. with one pair of color 11 and one of color 222. There are three odd socks left, one of each color. Hence, we return 2 as the number of pairs.

3. Array-Based Solution

An array-based solution works best when we know k is small. Here, we use an array counts to count the occurrences of each sock. Then, we iterate through it, dividing each count by two to calculate the total number of pairs:

public int countPairsWithArray(int n, int[] socks, int k) {
    int[] counts = new int[k];
    int pairs = 0;
    for (int i = 0; i < n; i++) {
        counts[socks[i]]++;
    }
    for (int count : counts) {
        pairs += count / 2;
    }
    return pairs;
}

Technically, we could omit n from the argument list and use socks.length in the code. We kept n in the method signature to follow the problem formulation more closely and make the solution easier to read.

This solution has time and space complexities of O(n + k). If k is O(n), the complexity simplifies to O(n).

4. HashSet-Based Solution

An array-based solution degrades performance when the range of color IDs is unbounded, unknown, or very large. For instance, the counts array for a very large k could not only be sparse (mostly 0s) if the number of unique colors is much smaller, but also too large to fit in the memory.

For such cases, we use HashSet. Here, our HashSet unmatchedSocks stores unmatched socks. As we iterate through the input array,we check if the current sock’s color is already in unmatchedSocks:

  • if it isn’t, we add it
  • otherwise, we have successfully found a pair!

Whenever we find a pair, we increment our pair counter and remove that color from  unmatchedSocks, so that the next pair of that color can be identified correctly:

public int countPairsWithSet(int[] socks) {
    Set unmatchedSocks = new HashSet<>();
    int pairs = 0;
    for (int sock : socks) {
        if (unmatchedSocks.contains(sock)) {
            pairs++;
            unmatchedSocks.remove(sock);
        } else {
            unmatchedSocks.add(sock);
        }
    }
    return pairs;
}

The HashSet-based solution’s average time complexity is O(n) since we iterate through the set exactly once, and HashSet operations add() and size() are O(1) on average.

However, HashSet suffers from collisions, so it has a log-linear O(n log n) worst-case time complexity for case where k is O(n). If k is O(1), the time complexity evaluates to O(n), which is same as that of  the array-based algorithm.

The space complexity is O(n + k).

5. Testing

5.1. Array-based Solution Testing

First, we test the array-based solution.

We create our socks array and pass it along with its length and the maximum color ID:

public void givenSockArray_whenUsingArray_thenReturnsCorrectPairCount() {
    SockMerchant merchant = new SockMerchant();
    int[] socks = {11, 22, 22, 11, 33, 3, 33, 111111, 33, 222222};
    int colorMax = 222223;
    int foundPairs = merchant.countPairsWithArray(socks.length, socks, colorMax);
    assertEquals(3, foundPairs);
}

We expect the test to return 3 for success.

5.2. HashSet-based Solution Testing

Let’s test the HashSet-based solution:

public void givenSockArray_whenUsingSet_thenReturnsCorrectPairCount() {
    SockMerchant merchant = new SockMerchant();
    int[] socks = {11, 22, 22, 11, 33, 3, 33, 111111, 33, 222222};
    int foundPairs = merchant.countPairsWithSet(socks);
    assertEquals(3, actualPairs);
}

As before, we expect the test to return 3 for success.

6. Conclusion

In this article, we studied two solutions, based on two different Java data structures, for the Sock Merchant problem.

The array-based method is fast and memory-efficient when the color domain is small and the color IDs are contiguous. However, it wastes memory and computation time if the color IDs are sparse or k is exceedingly large.

On the other hand, the HashSet approach is more versatile since it can handle arbitrarily large integers and sparse datasets without allocating massive contiguous blocks of memory. Furthermore, we can adapt the HashSet logic to group custom objects or strings.

As always, the complete code examples are available over on GitHub.

The post How to Solve the Sock Merchant Problem in Java first appeared on Baeldung.
       

What's Your Reaction?

like

dislike

love

funny

angry

sad

wow