Tag Archives: strings

Decimal to Roman Numerals (Leetcode)

A sequel to our previous effort, this time going the other way. That the greedy algorithm works here is not too surprising, but maybe there’s a way of concocting a variant of roman numerals where the greedy algorithm wouldn’t work? It’s basically the change-making algorithm. That it works is intuitive, perhaps, but it feels a bit unstable. If somehow we had to do something like this for production, I’d want to really spend out the time characterizing exactly why this works—why we can recast the whole subtractive thing as just a pile of coins. It’s sort of interesting, I guess…

def trySingleString(num, result, symbol, amount):
    if num >= amount:
        return num - amount, result + symbol
    return num, result


def tryRepeatedString(num, result, symbol, amount):
    while True:
        num2, result2 = trySingleString(num, result, symbol, amount)
        if num2 == num:
            return num, result
        num, result = num2, result2


class Solution:
    def intToRoman(self, num: int) -> str:
        assert num >= 0
        result = ""
        num, result = tryRepeatedString(num, result, "M", 1000)
        num, result = tryRepeatedString(num, result, "CM", 900)
        num, result = tryRepeatedString(num, result, "D", 500)
        num, result = tryRepeatedString(num, result, "CD", 400)
        num, result = tryRepeatedString(num, result, "C", 100)
        num, result = tryRepeatedString(num, result, "XC", 90)
        num, result = tryRepeatedString(num, result, "L", 50)
        num, result = tryRepeatedString(num, result, "XL", 40)
        num, result = tryRepeatedString(num, result, "X", 10)
        num, result = tryRepeatedString(num, result, "IX", 9)
        num, result = tryRepeatedString(num, result, "V", 5)
        num, result = tryRepeatedString(num, result, "IV", 4)
        num, result = tryRepeatedString(num, result, "I", 1)
        return result

Roman Numerals to Decimal (Leetcode)

This is apparently a popular question. It seems OK, it’s very easy. I think I already knew the rule of subtraction. I suppose questions that prompt the learner to consider “doing the work” of writing out a map is nice.

Maybe there’s something to it with the loop being backwards. What if I did it forwards? Seems like it’d end up in the same place: the condition in the if statement would be a bit different, that’s all.

MAP = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

class Solution:
    def romanToInt(self, s: str) -> int:
        val = 0
        prevC = None
        for c in reversed(s):
            if prevC is not None and MAP[c] < MAP[prevC]:
                val -= MAP[c]
            else:
                val += MAP[c]
            prevC = c
        return val

Demonstrative Implementation: LZW

There’s something of the “standard literature” in undergraduate algorithms: graph traversals, greedy and dynamic programming, sorts and data structures, and so on. One class of algorithms that seems oddly excluded, at least from my experience, is compression algorithms. So I wanted to explore more of that, in this case by implementing LZW.

I implemented (I think…) a more-or-less standard LZW algorithm here. Continuing my theme of bridging computer science and practice (started (?) with my Bloom filter write-up I wrote it in C, and

Some take-aways:

  • I didn’t really bother with literate coding. I think that was ultimately the right thing to do. Uh oh.
  • My biggest frustration with example implementations online (that I can find, mainly from Wikipedia: there’s an old Dr. Dobbs article, and then a million Rosetta Code implementations all doing the same thing) is that it doesn’t seem to reflect a practical implementation. A linear scan over the dictionary? I know (better than most, I think…) that linear scans are great, but it’s hard to believe that’s the right thing here.
  • The Sedgwick implementation for his undergrad book seems OK.
  • I think I can make an interesting short-course on how I iteratively improved performance.
  • I really dropped the ball implementing clear-codes, that was a mess and a half. But it’s exactly the complications of “really” implementing clear-codes, variable-length coding, and so on that I am frustrated being skimmed over by academic resources! Absent my efforts (such that they are) it seems the students’ options are to ignore them entirely, or dive into gzip’s source base.

I have a PDF with more thoughts here: https://github.com/agorenst/lzw/blob/master/lzw.pdf ; you can see how I’m trying to get the framing.

Unique Morse Code Words (Leetcode)

Another from the archives. I had too much fun with this python approach. A nice introductory string question.

class Solution:
    def uniqueMorseRepresentations(self, words: List[str]) -> int:
        mapping = [".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--.."]
        def mapIndex(c):
            return ord(c) - ord('a')
        def mapWord(w):
            return ''.join(map(lambda c: mapping[mapIndex(c)], w))
        return len(set(list(map(mapWord, words))))

This is a fine introductory interview question. The exact motivation isn’t great, but “process this data and count unique ones” is fine. More strings — a popular request from my students — here.

Jewels and Stones (Leetcode)

The hardest part of this question for me was the phrasing. However, I do sort of like this question: hopefully the student will realize they need to repeatedly check the string (set) of jewels, as they iterate over their stones. I don’t know how I feel about the fact that we’re using strings as sets, basically. It does happen in the real world.

Again, from the archives. This is me having too much fun with C++-isms. You can see my inline notes about other approaches.

class Solution {
public:
    int numJewelsInStones(string J, string S) {
        // let's do this very simply, no helper objects.
        int jewelCount = 0;
        for (auto&& stone : S) {
            if (std::find(std::begin(J), std::end(J), stone) != std::end(J))
            {
                jewelCount++;
            }
        }
        // We can do lots of things, really. We should cache J into a set, perhaps, and
        // then can just do a fancier std::algorithms (for_each) sort of thing, if that
        // strikes our fancy. filter etc.
        return jewelCount;
    }
};

We may even be able to collapse the outer for-loop into a count_if call or something… but that’s probably too clever. Who knows.

I guess this is a strings question? Even though it’s really treating strings as a set.

Isomorphic Strings (Leetcode 28)

I like this one a lot! Also from the archives. This one is listed as easy, I would say it’s on the harder side of easy. The word “isomorphism” is a bit gnarly, but you can ignore that. I’d rather this be introduced as a Caesar cipher, but maybe that’s also confusing, or makes the possible answer too clear. But if the question is “are you comfortable talking about isomorphism”, well, that’s not a very good question. So I suppose I’d phrase this as a Caesar cipher.

The interesting challenge is, well, distinguish an isomorphism from a homomorphism. Vaguely, the idea of distinguishing this “uniqueness” (invertibility) property is nice. Maybe to enable something like dictionary encoding? I’m trying to brainstorm how to motivate this question and give it some application-justification. In any case…

class Solution:
    def isIsomorphic(self, s: str, t: str) -> bool:
        # Easy cases
        if len(s) != len(t):
            return False
        
        mapping = {}
        mapped = set([])
        for i in range(len(s)):
            sc = s[i]
            tc = t[i]
            if sc in mapping:
                if mapping[sc] != tc:
                    return False
            elif tc in mapped:
                # uniqueness condition
                return False
            else:
                mapping[sc] = tc
                mapped.add(tc)
        return True

Group Anagrams (Leetcode 27)

Another from the archive. This is a nice warm-up or introductory interview question. It is a bit contrived, but I think it exercises a few common tools you’d want to use. You want to group things together that share a common property, and so you want to compare the property but group the things-actual.

Here I use defaultdict, but that’s just out of convenience. I don’t like how to create a sorted string in Python, I wonder if there’s a better way.

from collections import defaultdict
class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        
        d = defaultdict(list)
        for s in strs:
            key = "".join(sorted(s))
            d[key].append(s)
            
        res = []
        for k in d:
            res.append(d[k])
        return res

RLE compress string (Leetcode 19)

OK, a sequel!

I like:

  1. We have two views of the same array.
  2. We can start at 1 in the for-loop
  3. Opportunity for invariants to help our loop-based reasoning.
  4. Things get much simpler with the helper function (i.e., I made a lot of bugs until I decided to use a helper function).

I don’t think I’m biased to say this one requires a bit of careful thought!

class Solution(object):
    def commitLetter(self, chars, c, i, count):
        assert(count > 0)
        chars[i] = c
        i += 1
        if count > 1:
            w = str(count)
            for t in w:
                chars[i] = t
                i += 1
        return i
            
    def compress(self, chars):
        if len(chars) == 0: return []
        
        count = 1
        j = 0
        lastChar = chars[0]
        
        for i in range(1, len(chars)):
            if chars[i] == lastChar:
                count += 1
            else:
                j = self.commitLetter(chars, lastChar, j, count)
                lastChar = chars[i]
                count = 1
        j = self.commitLetter(chars, lastChar, j, count)
        return j

Longest Common Prefix (Leetcode 12)

This is another good one, I think. Strings and arrays are commonly-requested topics.

  1. We can observe that “the first string must contain the longest common prefix”. That’s an interesting tool that we can leverage.
  2. So we know that we can iterate over that length. We want to “scan” all the strings in parallel, and that’s how.
  3. We have an interesting “return from a nested loop” potential. Helper functions may make this clearer. Potential sequel homework.
class Solution(object):
    def longestCommonPrefix(self, strs):
        if len(strs) == 0:
            return ""
        root = strs[0]
        if len(root) == 0:
            return ""
        
        prefixLen = 0
        for i in range(len(root)):
            for s in strs[1:]:
                if i >= len(s):
                    return root[:prefixLen]
                if s[i] != root[i]:
                    return root[:prefixLen]
            prefixLen += 1
        return root