All posts by Aaron

Middle of Linked List (Leetcode)

Another warm-up question from the archives.

The editorial solution, fast-and-slow-pointer, is very cute. I’m not sure how it’s preferable over separating the length (“fast”) calculation from the actual-find-calculation (“slow”). We’re doing the same traversals, but presumably with better locality?

class Solution:
    def middleNode(self, head: Optional[ListNode]) -> Optional[ListNode]:
        l = 0
        n = head
        while n is not None:
            n = n.next
            l += 1

        n = head
        for i in range(l//2):
            n = n.next
        
        return n

Find the Duplicate Number (Leetcode)

This is a reasonable warm-up question I think, but the performance-related requirement (constant space) is silly. Reading the editorial, well, I have thoughts to share. This is from the archives.

Using this as a warm-up question

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = set([])
        for n in nums:
            if n in seen: return n
            seen.add(n)

This presents an approachable-enough setting to discuss performance, and I think things like “find the duplicates” are related to real-world questions. Moreover, there are ready sequels: OK, if they use this “hash table approach”, what about one that can take more time, but less space?

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        prev = None
        for n in sorted(nums):
            if prev is None:
                prev = n
            elif prev == n:
                return n
            prev = n

Considering The Leetcode Discussion

The leetcode “editorial” (at least as of June 2023) presents 7 different approaches for the problem. Here are some of my reactions to the editorial. This is my experimental foray into providing a criticism to, what I think is the typical leetcode perspective on questions. I’m not meaning to pick on this particular question.

  • “[The problem] is a classic problem” — I’m not sure in what domain this question is a classic. Deduplication in general is very useful, but the particular of this question don’t strike me as something classic. Maybe it’s a common interview question for a certain company? I wish leetcode questions placed their evaluations in more context. It gives the air of secret knowledge (“oh, this is a classic question, and I didn’t know that!”) when I think they’re just… making things up.
  • Approach 3, in which they borrow a bit from each element in the array (and use the range-trick) they say is O(1) space. In practical terms, sure, but what if the array is, say, unsigned shorts and the size of 2^16? You have no bits to borrow; that’s just an example where this is sort of “punning” between theoretical concerns (asymptotic analysis of resource usage) versus practical concerns. Both are fine, but mixing them leads to confusion.
  • The other approaches are fun things to learn, but all (including approach 3) really rely on the particular structure of the input. It’s just unsatisfying.

Island Perimeter (Leetcode)

This is a nice application of DFS. I think this is a good question that exercises a few things. Briefly, the question is “given a grid, find the perimeter of the unique island”. To me the natural approach feels like, find the island, and then do a DFS exploration to touch its perimeter.

  • From a pedagogical standpoint, I think it’s nice that it uses DFS, and “stretches” its application. In the [[1, 1], [1, 1]] example it made it clear that I couldn’t “double-count” any nodes!
  • From an interview standpoint, it’s a nice mapping of intuitive(-enough) requirements to code.
  • As always it’s a bit contrived, but I like questions where you have to synthesize higher-level data from smaller parts. I think that’s a good theme.

Solution, and detailed discussion, below

Solution

class Solution:
    def findLand(self, grid: List[List[int]]):
        for i, row in enumerate(grid):
            for j, cell in enumerate(row):
                if cell == 1:
                    return (i, j)
        return (-1, -1) # error state

    def islandPerimeter(self, grid: List[List[int]]) -> int:
        startLoc = self.findLand(grid)
        assert(startLoc[0] != -1)

        p = 0
        # DFS
        seen = set([])
        willProcess = set([startLoc])
        workList = [startLoc]
        while workList:
            i, j = workList.pop()
            for x, y in [(i+1, j), (i-1, j), (i, j+1), (i, j-1)]:
                # case 1: we know there cannot be a neighbor (edge of map) or water!
                if x < 0 or y < 0 or x >= len(grid) or y >= len(grid[x]) or grid[x][y] == 0:
                    p += 1
                    continue
                # case 2: it's a neighbor we've seen:
                if (x, y) in seen:
                    continue
                # case 3: a new node to explore!
                if (x, y) not in willProcess:
                    willProcess.add((x, y))
                    workList.append((x, y))
            seen.add((i, j))
        return p

Discussion

  • There’s some redundancy with “willProcess” and “seen”. I think I can get rid of “seen”?
  • I make ready use of a lot of the conveniences of Python. Generating the list as the expression (in the for x, y in … loop), nevermind the ability to decompose those tuples, is really nice for these interviewing questions.
  • Implicit in the code is sort of 2 layers: there’s the “straight” DFS, i.e., case 3, and then there’s also the “when I’m at a node, how does it contribute to the perimeter calculation”? Something in my mind, but not explicitly listed out, is that we process each node exactly once, and then we know that the node will contribute between 0 and 4 to the perimeter (depending on the 4 neighbors we check.
  • I feel like it is crucial we don’t have lakes, as noted in the problem, but I can’t quite articulate why. Something to consider if/when I revisit this problem.

I like this! It touches on both matrix and graph problems (mainly graph problems) and hits a lot of “right notes”. More graph problems here.

Clone Graph (Leetcode)

What can I say, I like graphs. This problem is from the archives, but I think especially from a pedagogical standpoint these “copy the data structure” (or usually I like, “print the data structure”) are terrific educational exercises.

The approach I took for this problem is the commented helper function. I think I need to review DFS/BFS and make sure I’m using the right number of node-marking (typically I’ve seen it formulated as a node being “marked” done or workset, rather than added to a set — that is more efficient generally.

class Solution:
    def cloneGraph(self, node: 'Node') -> 'Node':
        if node is None:
            return None

        # We want to provide a function to get the
        # "corresponding" node, allocating it if needed.
        cor = {}
        def getCor(u):
            if u not in cor:
                cor[u] = Node(u.val)
            return cor[u]

        workset = set([])
        done = set([])
        workset.add(node)
        while len(workset) > 0:
            n = workset.pop()
            if n in done:
                continue
            m = getCor(n)
            for u in n.neighbors:
                v = getCor(u)
                m.neighbors.append(v)
                if u not in done:
                    if u not in workset:
                        workset.add(u)
            done.add(n)

        return getCor(node)

Notes

  • All graph problems (rare) are collected here.

Unique Morse Code Words (Leetcode)

Another from the archives. I had too much fun with this python approach. A nice introductory string question.

class Solution:
    def uniqueMorseRepresentations(self, words: List[str]) -> int:
        mapping = [".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--.."]
        def mapIndex(c):
            return ord(c) - ord('a')
        def mapWord(w):
            return ''.join(map(lambda c: mapping[mapIndex(c)], w))
        return len(set(list(map(mapWord, words))))

This is a fine introductory interview question. The exact motivation isn’t great, but “process this data and count unique ones” is fine. More strings — a popular request from my students — here.

Longest Substring Without Repeating Characters (Leetcode)

Another from the archives. In my old notes I tried a few different approaches. Ultimately this one feels a little “brute-force-y”, here is what I got:

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        int res = 0;
        for (int i = 0; i < s.size(); i++) {
            for (int l = std::max(res, 1); l + i <= s.size(); l++) {
                std::bitset<256> seen;
                bool bad = false;
                for (int j = i; j < i + l; j++) {
                    if (seen.test(s[j])) {
                        bad = true;
                        break;
                    }
                    seen.set(s[j]);
                }
                if (!bad) {
                    res = std::max(res, l);
                } else {
                    // i i our starting point, a longer l won't help here.
                    break;
                }
            }
        }
        return res;
    }
};

Thoughts

  1. The “length” iteration (how i, l, j indexes interact) is pretty interesting. Comes up rarely, but in thinking about many collections of substrings.
  2. The final “break” (with the comment above it) is necessary.
  3. I wanted to port this to Python—my old archives are in C++. But it lacks a bitset class, and its range loops aren’t amenable to the length-based iteration (or at least, I don’t know how to adapt them).

I doubt my solution is the right one. What is the official Leetcode solution? Oh I like it! Does this make the cut as an interview question? I don’t know, I am skeptical. More string questions here.

Minimum Distance Between BST Nodes (Leetcode)

This is a fun-enough question, I guess. I feel like I’ve seen a few questions with BSTs where the answer involves the idea that in-order traversal presents their content as a sorted list. Or should… Oh yeah, this would much-improve my answer for this older question.

Interlude: Yield Statements

I guess sometimes they’re called coroutines? Fun excuse to use those in Python. For our purposes they’re a “better” (?) way of collapsing the BST we have into a sorted list. The magic of that isn’t the coroutines of course, it’s that in-order traversal of a BST presents the contents as a sorted list. The yield statement just makes it nice, we don’t have to allocate a whole intermediate list.

Solution

class Solution(object):
    def inOrderYield(self, root):
        if root is None:
            return
        for x in self.inOrderYield(root.left):
            yield x
        yield root.val
        for y in self.inOrderYield(root.right):
            yield y
    def minDiffInBST(self, root):
        minDiff = None
        prev = None
        for v in self.inOrderYield(root):
            if prev is None:
                prev = v
                continue
            if minDiff is None:
                minDiff = v - prev
                prev = v
                continue
            minDiff = min(minDiff, v - prev)
            prev = v
        return minDiff

I bet there’s a better way of doing the final loop in our main function.

More BST goodness here.

Network Delay Time (Leetcode)

This is a classic algorithm, and from the archive! It is just Dijkstra’s algorithm, and in fact this sort of formulation is exactly how my undergrad algorithms professor tried to have us develop the intuition of how it worked. I appreciate that this formulation at least suggests something of an approachable motivation, measuring network latency and similar.

class Solution(object):
    def networkDelayTime(self, times, n, k):
        # Cache neighbors and their weights
        neighbors = {u : [] for u, _, _ in times}
        for u, v, w in times:
            neighbors[u].append((v, w))
            
        # Djikstra's
        worklist = [k]
        cost = {k: 0} # cost to reach k
        while worklist:
            u = worklist.pop()
            c = cost[u]
            if u not in neighbors:
                continue # its a sink
            for v, w in neighbors[u]:
                if v not in cost or cost[v] > w + c:
                    cost[v] = w + c
                    worklist.append(v)
        
        # Return requested result
        if len(cost) != n:
            return -1
        return max(cost.values())

Discussion

This was originally “easy” on Leetcode, according to my notes. I think with some thought this can be an approachable, or independently derivable algorithm. (A different professor I had pointed to Dijkstra as an advantage of being early on in a new field: if you are, you can name all the “obvious” results after yourself!) I wouldn’t say this is an obvious algorithm, but the idea that you “update your neighbors” when you receive new information is something that is worth getting a measurement on, interview-ish speaking.

Glad to finally get another graph algorithm on the books. The others are here.

Longest Univalue Path (Leetcode)

This leetcode question is interesting. It is from my archives. Originally, I misunderstood and thought you could only look for paths going “down”, but obviously as illustrated the question also admits “sideways” paths, see the second example.

Curiously, I was able to adapt my solution for the simpler interpretation and extend it to work for the more complicated interpretation too! Here is the code:

class Solution(object):
    def bottomUpBuilder(self, root):
        if root is None: return
        
        self.bottomUpBuilder(root.left)
        self.bottomUpBuilder(root.right)
        
        # can we continue a path from either of our children?
        self.longestPath[root] = 0
        if root.left and root.left.val == root.val:
            lval = self.longestPath[root.left]+1
            self.longestPath[root] = lval
        if root.right and root.right.val == root.val:
            rval = self.longestPath[root.right]+1
            self.longestPath[root] = max(self.longestPath[root], rval)
            
        # Can we join a path between our two children?
        # Note, crucially, that we're using "longestPath" for our child values;
        # we can't use asJoin because that would suggest a degree-3 node in our "path".
        self.asJoin[root] = 0
        if root.left and root.right and root.left.val == root.val and root.right.val == root.val:
            self.asJoin[root] = self.longestPath[root.left] + self.longestPath[root.right] + 2
    def longestUnivaluePath(self, root):
        self.longestPath = {}
        self.asJoin = {}
        self.bottomUpBuilder(root)
        m = 0
        for r in self.asJoin:
            m = max(m, self.asJoin[r])
        for r in self.longestPath:
            m = max(m, self.longestPath[r])
        return m

Discussion

You can see that there are basically 2 maps built, not just one. The asJoin map handles the case I forgot about.

So I can see this question as a nice one-two sort of question: first solve my simpler case, and then “can you extend that to the sideways case”? Well, the key insight that allows us to extend our code is hinted at in the comments: the asJoin map is defined in terms of longestPath, we don’t want to define it recursively.

This has a nice bottom-up iteration, and some neat invariants and applications. This question, on Leetcode, also has a certified solution. What is that solution? Let’s see.

Leetcode Solution Discussion

OK, it’s a more elegant recursive formulation (maybe?), and doesn’t admit my “extension” idea. Understanding why it’s linear-time is an interesting exercise.

More trees here, more recursion here.