Find the Duplicate Number (Leetcode)

This is a reasonable warm-up question I think, but the performance-related requirement (constant space) is silly. Reading the editorial, well, I have thoughts to share. This is from the archives.

Using this as a warm-up question

from typing import List

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = set()  # values encountered so far
        for n in nums:
            if n in seen:
                return n  # second sighting: this is the duplicate
            seen.add(n)

This presents an approachable-enough setting to discuss performance, and I think things like “find the duplicates” are related to real-world questions. Moreover, there are ready sequels: OK, if they use this “hash table approach”, what about one that can take more time, but less space?

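from typing import List

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        # After sorting, equal values sit next to each other.
        # Note: sorted() builds a new list; nums.sort() would sort in place
        # but mutates the input.
        prev = None
        for n in sorted(nums):
            if n == prev:
                return n
            prev = n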

Considering The Leetcode Discussion

The leetcode “editorial” (at least as of June 2023) presents 7 different approaches for the problem. Here are some of my reactions to the editorial. This is my experimental foray into offering a criticism of what I think is the typical leetcode perspective on questions. I don’t mean to pick on this particular question.

“[The problem] is a classic problem”: I’m not sure in what domain this question is a classic. Deduplication in general is very useful, but the particulars of this question don’t strike me as classic. Maybe it’s a common interview question for a certain company? I wish leetcode questions placed their evaluations in more context. It gives the air of secret knowledge (“oh, this is a classic question, and I didn’t know that!”) when I think they’re just… making things up.
They say approach 3, in which they borrow a bit from each element in the array (and use the range-trick), is O(1) space. In practical terms, sure, but what if the array holds, say, unsigned shorts and has 2^16 elements? Then you have no bits to borrow. That’s just one example where this is sort of “punning” between theoretical concerns (asymptotic analysis of resource usage) and practical concerns. Both are fine, but mixing them leads to confusion.
The other approaches are fun things to learn, but all (including approach 3) really rely on the particular structure of the input. It’s just unsatisfying.
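To make the "borrowed bit" point concrete, here is a rough sketch of how I understand that kind of approach to work. It is not the editorial's code. The function name is my own, and it assumes the usual setup for this problem: the list has n + 1 entries, every value lies in 1..n, and the elements are signed integers, so the sign is free to act as a "seen" flag.

```python
from typing import List


def find_duplicate_sign_marking(nums: List[int]) -> int:
    """Hypothetical sketch. Assumes len(nums) == n + 1, values in 1..n."""
    duplicate = -1
    for i in range(len(nums)):
        idx = abs(nums[i])  # the value, with any borrowed sign stripped off
        if nums[idx] < 0:
            # Slot idx was already flagged, so idx has been seen before.
            duplicate = idx
            break
        nums[idx] = -nums[idx]  # borrow the sign bit of slot idx as a flag

    # Give the borrowed bits back so the caller's list is unchanged.
    for i in range(len(nums)):
        nums[i] = abs(nums[i])
    return duplicate
```

Every value doubles as a valid index, and that is the only reason this works; it is also exactly the structure of the input the trick leans on. With a fixed-width unsigned element type whose values already use the full range, there is no spare sign to flip.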

Aaron and Algorithms
