Tag Archives: arrays

Find the Duplicate Number (Leetcode)

This is a reasonable warm-up question I think, but the performance-related requirement (constant space) is silly. Reading the editorial, well, I have thoughts to share. This is from the archives.

Using this as a warm-up question

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = set([])
        for n in nums:
            if n in seen: return n
            seen.add(n)

This presents an approachable-enough setting to discuss performance, and I think things like “find the duplicates” are related to real-world questions. Moreover, there are ready sequels: OK, if they use this “hash table approach”, what about one that can take more time, but less space?

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        prev = None
        for n in sorted(nums):
            if prev is None:
                prev = n
            elif prev == n:
                return n
            prev = n

Considering The Leetcode Discussion

The leetcode “editorial” (at least as of June 2023) presents 7 different approaches for the problem. Here are some of my reactions to the editorial. This is my experimental foray into providing a criticism to, what I think is the typical leetcode perspective on questions. I’m not meaning to pick on this particular question.

  • “[The problem] is a classic problem” — I’m not sure in what domain this question is a classic. Deduplication in general is very useful, but the particular of this question don’t strike me as something classic. Maybe it’s a common interview question for a certain company? I wish leetcode questions placed their evaluations in more context. It gives the air of secret knowledge (“oh, this is a classic question, and I didn’t know that!”) when I think they’re just… making things up.
  • Approach 3, in which they borrow a bit from each element in the array (and use the range-trick) they say is O(1) space. In practical terms, sure, but what if the array is, say, unsigned shorts and the size of 2^16? You have no bits to borrow; that’s just an example where this is sort of “punning” between theoretical concerns (asymptotic analysis of resource usage) versus practical concerns. Both are fine, but mixing them leads to confusion.
  • The other approaches are fun things to learn, but all (including approach 3) really rely on the particular structure of the input. It’s just unsatisfying.

Two Sum (Leetcode)

This is the classic, literally Leetcode question 1. It’s an OK question. It admits a “naive” solution of try-every-pair, and also enables some deeper thought. I don’t have a good read on whether that deeper thought should be considered expected in an interview setting.

Naive Solution

The naive solution is try-every-pair. This is an accessible example of a concept I like, where we have the explicit object (in our case, an array), but we’re really exploring an implicit object (in our case, a sequence of pairs of elements, those elements drawn from an array). We can be careful about precisely which pairs we’re exploring.

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        for i in range(len(nums)):
            for j in range(len(nums)):
                if i == j:
                    continue
                if nums[i]+nums[j] == target:
                    return [i, j]

Not-Better Solution

The interesting thing is how to derive the better solution. Let’s formulate it in terms of the condition we “care” about, nums[i]+nums[j] == target. With arithmetic, we can say that’s the same condition as: nums[i] == target – nums[j], (we subtracted nums[j] from each side).

So we can derive a new sequence, diffNums, where we assign diffNums[j] = target – nums[j]. Now what would our loop look like?

class Solution(object):
    def twoSum(self, nums, target):
        diffNums = [target-nums[i] for i in range(len(nums))]
        for i in range(len(nums)):
            for j in range(len(nums)):
                if i == j:
                    continue
                if nums[i] == diffNums[j]:
                    return [i, j]

This is not faster by any measure: asymptotically it’s still O(n^2), and now we have an extra intermediate array diffNums to instantiate. Sure enough, it’s too slow for Leetcode, haha.

However, a careful look at our condition now is enticing: we’re “just” seeing if the collection nums has any element in common with the collection diffNums. That’s set-membership! So let’s recast things:

Faster Solution

We ultimately need to return the indices. So let’s have diffNums be a mapping from target-nums[i] : i.

class Solution(object):
    def twoSum(self, nums, target):
        diffNums = {target-nums[i] : i for i in range(len(nums))}
        for i in range(len(nums)):
            v = nums[i]
            if v in diffNums and i != diffNums[v]:
                return [i, diffNums[v]]

A lot has changed! Crucially, we have the if v in diffNums condition: that captures the whole inner for-loop! Even better, because diffNums is a map, we can expect that lookup to be roughly constant time, instead of linear. This brings our algorithm down to roughly-linear time. (Note that diffNums[v] is basically j in this loop.)

Conclusion

This problem grew on me, in terms of introducing some intermediate programming thoughts. I think if we have to use this an interview question, assuming the candidate is stuck on getting the faster solution, I’d prompt them to that not-better solution and see where they take it. More introductory questions here.

Counting Bits (Leetcode)

Another from the archives! I had a lot of fun with this one. Would others? I don’t know! We can use a sort-of dynamic programming approach (maybe this is a good replacement for that frustrating “Fib” standard warmup?) to avoid calling popcount over and over again. Not sure what’s actually faster on hardware, but I like the principle here.

This is a good algorithmic-thinking question, I think. Requires some fluency with bit manipulation, though.

class Solution {
public:
    vector<int> countBits(int num) {
        if (num == 0) { return {0}; }
        if (num == 1) { return {0, 1}; }
        if (num == 2) { return {0, 1, 1}; }
        vector<int> res(num+1);
        res[0] = 0;
        res[1] = 1;
        for (int i = 2; i <= num; ++i) {
            const int lastBit = i % 2;
            res[i] = res[i/2] + lastBit;
        }
        return res;
    }
};

Notes

Additional implementation details? I don’t know. I found it reassuring that the length of the output is required to be num, because there’s always that potential psuedopolynomial trap. Fixing the output size (versus, say, we’d have to return the total numbers of 1s or something) helps make it unambiguous that we’d at least be able to iterate through all inputs.

Other bit-manipulating questions are collected here. Other dynamic-programming questions are collected here.

Sort Array By Parity 2 (Leetcode)

We’ve been here before. Quite explicitly in fact, this question is obviously marked as a sequel.

This really drills into the idea of thinking about values versus indices. I think that’s something my students would get a lot out of. As mentioned in my comments, I feel like the term “sort” here is a bit deceptive. I would call this more like merging or partitioning, a helper function within a sort function. “Classify” by party? “Cubbyhole” by parity?

class Solution {
public:
    // Helpers inspired by commented out block in the main function.
    // This sort of "given the index, find the next valid index" pattern
    // is quite common. This is inclusive -- we can return start.
    // Yes I can have them share a function with lamdbads or whatever,
    // but let's not overcomplicate implementation.
    int nextEvenIndex(vector<int>& A, int start) {
        assert(start % 2 == 0);
        for (; start < A.size(); start += 2) {
            if (A[start] % 2) return start;
        }
        return -1;
    }
    int nextOddIndex(vector<int>& A, int start) {
        assert(start % 2);
        for (; start < A.size(); start += 2) {
            if (A[start] % 2 == 0) return start;
        }
        return -1;
    }
    vector<int> sortArrayByParityII(vector<int>& A) {
        // One of the quadratic sorts? This is almost more like merging or pivoting, than
        // sorting. The name feels deceptive :).
        
        // initialize i as the first even index to swap;
        int i = nextEvenIndex(A, 0);
        // j as the first odd index to swap;
        int j = nextOddIndex(A, 1);
        while (i != -1) {
            assert(j != -1);
            swap(A[i], A[j]);
            i = nextEvenIndex(A, i); // we should be able to get past i here, as it's now legal.
            j = nextOddIndex(A, j); // ditto
        }
        assert(i == -1 && j == -1); // we should both be done here.
        return A;
    }
};

Notes

  • You can definitely see my merge-friendly perspective influencing my approach.
  • I really like drilling into these loops where the index variables aren’t necessarily always incremented by a constant, as is here.
  • I like how this has the implicit sub-arrays inside the main array.
  • Avoiding a new allocation is a nice extended requirement, I think.

I think this is an obviously-contrived interview question, but it exercises nice things and the question (while a bit detailed) isn’t tricky. It’s very clear what a successful output should look like, or at least clear enough.

Reshape the Matrix (Leetcode 30)

I decided to redo this problem, from the archives in Python. Previously I had it in C++. You’d think this post, then, of resizing a 2D array, would have been straightforward. Here are some notes:

  1. The key insight (in my approach, at least) is that assuming the resize is valid, there are r*c elements, and we can define helper functions like getRow and getColumn that take an i, and an r, c dimensions, and return the row and column of the ith element. In this way we can walk through the arrays “in parallel”
  2. I hit an issue with some terse way of allocating an array in python. It seems something like [[0]*c]*r leads to shallow copies in some cases, and that tripped me up.
  3. Then, of course, I managed to get R and C mixed up.

I think this is a good exercise for junior people who say they feel comfortable with 2D arrays.

Complete with debugging statements, here is my approach:

def getRow(i, r, c):
    #print('row of', i, '-th element in matrix ', r, '*', c, 'is', i // c)
    return i // c
def getCol(i, r, c):
    return i % c
class Solution:
    def matrixReshape(self, mat: List[List[int]], r: int, c: int) -> List[List[int]]:
        R = len(mat)
        if R is 0: return mat
        
        C = len(mat[0])
        
        if R * C != r * c:
            return mat
        
        res = []
        for i in range(r):
            res.append([])
            for j in range(c):
                res[i].append(0)
        
        for i in range(r*c):
            #print(getRow(i, r, c), getCol(i, r, c), getRow(i, R, C), getCol(i, R, C))
            old = mat[getRow(i, R, C)][getCol(i, R, C)]
            res[getRow(i, r, c)][getCol(i, r, c)] = old
                
        return res

Take-Aways

This covers interesting questions like allocation (pre-allocating the results array), bounds-checking in a nontrivial setting (numrows, numcols), and then an interesting question of mapping. That mapping also really requires a comfort with indices versus the values of an array, plus array-of-arrays instead of just arrays. I think this is a good “hard” phone-screen question.

For real-world applicability, I’ve never worked too much with 2D arrays, certainly not this explicitly. However, the challenge of iterating over 2 different objects/data-structures that don’t naively both map to the same for loop (so to speak), that’s something that comes up a lot, albeit not so directly as in this code. Just recently I was iterating over 2 similar (but not the same!) trees at work, and had to make sure the iteration remains in lockstep. Of course, this complex-iteration also comes up in my favorite merge-linked-list sort of questions. Again, not literally identical, but the same “genus”.

Normalize Array Indexes (Code Wars 19)

This was a problem a student brought to me. It was very interesting! It suggests 2 approaches of different “trickiness”.

The natural approach suggested by the problem’s description is to use a for-loop. This is complicated by that, in an unusual twist, the for loop (over the provided index parameter) isn’t really related to the length of the array. So while you have the i or whatever in your for loop, you want to resist the temptation to do a[i].

Ultimately you’ll want a secondary variable. While i tracks the provided index, j handles the “wrap-around” logic:

def norm_index_test(seq, ind):
    if seq == []:
        return None
    isneg = ind < 0
    ind = abs(ind)
    j = 0
    for i in range(ind):
        j += 1
        if j == len(seq):
            j = 0
    assert(j < len(seq))
    if isneg and j != 0:
        j = len(seq) - j
    return seq[j]

This is actually probably the hardest form of the question, complicated by the negative indexes. If I were asking this problem, I would not ask about negative indexes. Ugh!

Once you can trust modular arithmetic (and maybe Python makes it too easy? Not sure if all languages define module of a negative number the same), it’s a one-liner. Cute, but OK.

def norm_index_test(seq, ind): 
    if seq == []:
        return None
    return seq[ind%len(seq)]

Longest Common Prefix (Leetcode 12)

This is another good one, I think. Strings and arrays are commonly-requested topics.

  1. We can observe that “the first string must contain the longest common prefix”. That’s an interesting tool that we can leverage.
  2. So we know that we can iterate over that length. We want to “scan” all the strings in parallel, and that’s how.
  3. We have an interesting “return from a nested loop” potential. Helper functions may make this clearer. Potential sequel homework.
class Solution(object):
    def longestCommonPrefix(self, strs):
        if len(strs) == 0:
            return ""
        root = strs[0]
        if len(root) == 0:
            return ""
        
        prefixLen = 0
        for i in range(len(root)):
            for s in strs[1:]:
                if i >= len(s):
                    return root[:prefixLen]
                if s[i] != root[i]:
                    return root[:prefixLen]
            prefixLen += 1
        return root

Move Zeros (Leetcode 3)

This problem is a fun exercise on arrays. Given an array, move all the zeros to the end of the array. Maintain the relative order of all other elements.

I have some familiarity with the C++ standard library, which includes algorithms similar-to-this (but without the relative order). In languages with very explicit memory-management, this is removing elements from the array “in spirit”. In C++, in fact, this is the solution:

class Solution {
public:
    void moveZeroes(vector<int>& nums) {
        auto new_end = std::remove(std::begin(nums), std::end(nums), 0);
        std::fill(new_end, std::end(nums), 0);
    }
};

Really line 4 is the answer this problem is looking for. Really what happen is that remove “slides the contents down” to fill up the zeros. The new_end variable stores the end of the newly-shrunken array. However, the length of the original nums is unchanged — it’s up to the programmer (me) to really explicitly request that memory be freed, or otherwise messed with.

This also echoes, in a way even more directly, partitioning algorithms. We want to reorder the contents of nums such that one category of elements (those not equal to 0) are on one side, and the other category is at the other side. Again, C++ has this algorithm directly:

class Solution {
public:
    void moveZeroes(vector<int>& nums) {
        std::stable_partition(std::begin(nums), std::end(nums), [](int n) { return n != 0;});
    }
};

A common critique of code (which I agree with!) is “don’t get too clever with one-liners”. I’ll stress that the above solution, in my mind, is fine because it really is just using the fact that the algorithm is exactly what the problem is asking for. Akin to using the built-in sort method for a question that asks you to sort items. C++ is just a bit verbose.

Say, however, that we don’t have such a partition algorithm. Or that the problem is explicitly asking us to implement a partition algorithm. One trick we can do here is exploit that we know how to create new 0s. In contrast to when our array might container pointers or similar, we can just “remember how many zeros to add in later”. Here’s a Python version, basically of the first C++ solution:

class Solution(object):
    def moveZeroes(self, nums):
        nextWrite = 0
        for nextRead in range(len(nums)):
            x = nums[nextRead]
            if x == 0:
                continue
            nums[nextWrite] = x
            nextWrite += 1
        for zeroOut in range(nextWrite, len(nums)):
            nums[zeroOut] = 0

Basically nextWrite serves as new_end, the first for loop is remove, and the second for loop is fill.

Say, however, instead of an array of 0s, we have an array of objects, and we’ve been partitioning them based on whether their, I don’t know, books-checked-out-of-the-library-count was 0. We couldn’t really do the above approach because we couldn’t reconstruct a whole object after-the-fact. A nice sequel problem to consider, basically, how do we “really” implement partition.

PS: This problem can be found in codewars, too. I’d say related questions are collected under array questions, or sorting questions.

Tools and Takeaways

  1. Moving elements within an array is a common sort of “vocabulary” pattern. Remove, partition, etc., are often building blocks of larger algorithms.
  2. Sometimes we may break a for-loop up into multiple “ranges”. In our first approach, we go through the “tail” of the input array to zero-out those contents.
  3. Sometimes we can recreate the same values we care about, but later. We can do this when the values are integers. Sometimes we can’t get away with that, and have to do something more sophisticated.