Roman Numerals to Decimal (Leetcode)

This is apparently a popular question. It seems OK, it’s very easy. I think I already knew the rule of subtraction. I suppose questions that prompt the learner to consider “doing the work” of writing out a map is nice.

Maybe there’s something to it with the loop being backwards. What if I did it forwards? Seems like it’d end up in the same place: the condition in the if statement would be a bit different, that’s all.

MAP = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

class Solution:
    def romanToInt(self, s: str) -> int:
        val = 0
        prevC = None
        for c in reversed(s):
            if prevC is not None and MAP[c] < MAP[prevC]:
                val -= MAP[c]
            else:
                val += MAP[c]
            prevC = c
        return val

Demonstrative Implementation: LZW

There’s something of the “standard literature” in undergraduate algorithms: graph traversals, greedy and dynamic programming, sorts and data structures, and so on. One class of algorithms that seems oddly excluded, at least from my experience, is compression algorithms. So I wanted to explore more of that, in this case by implementing LZW.

I implemented (I think…) a more-or-less standard LZW algorithm here. Continuing my theme of bridging computer science and practice (started (?) with my Bloom filter write-up I wrote it in C, and

Some take-aways:

  • I didn’t really bother with literate coding. I think that was ultimately the right thing to do. Uh oh.
  • My biggest frustration with example implementations online (that I can find, mainly from Wikipedia: there’s an old Dr. Dobbs article, and then a million Rosetta Code implementations all doing the same thing) is that it doesn’t seem to reflect a practical implementation. A linear scan over the dictionary? I know (better than most, I think…) that linear scans are great, but it’s hard to believe that’s the right thing here.
  • The Sedgwick implementation for his undergrad book seems OK.
  • I think I can make an interesting short-course on how I iteratively improved performance.
  • I really dropped the ball implementing clear-codes, that was a mess and a half. But it’s exactly the complications of “really” implementing clear-codes, variable-length coding, and so on that I am frustrated being skimmed over by academic resources! Absent my efforts (such that they are) it seems the students’ options are to ignore them entirely, or dive into gzip’s source base.

I have a PDF with more thoughts here: https://github.com/agorenst/lzw/blob/master/lzw.pdf ; you can see how I’m trying to get the framing.

Demonstrative Implementation: Bloom Filter

The Bloom filter seems to be a “sweet spot” for people doing hobby projects: simple enough to implement, unusual enough to be interesting, and complicated enough to feel satisfying.

I am interested in “postbaccalaureate” projects for self-study, and using those projects to bridge the transition from a recent grad to a professional programmer (still with a computer science bent). So I turned to this project for that purpose.

It went… OK!

  • I got really caught up in the literate coding stuff, and I think the code was, in effect, too simple. It was more the surrounding machinery that was interesting.
  • The driver code ended up being the complicated thing, which does suggest that it’s more running the experiments (which happens a lot in my experience!) that’s interesting, than the implementation-proper.
  • Things like the arbitrary-length-bit-vector, and the constituent pages, also seemed to be a bit more interesting.
  • I did really like the experiments, though. It was fun figuring out how to automatically make the charts.

I have a bigger write-up here: https://github.com/agorenst/bloom-filter/blob/master/bloomfilter.pdf