Another Tuesday, another pull request! 😀 This week we fly from the Philippines to the faraway land of Singapore.
Singapore is one of those fascinating city-states that is a Rorschach Test of essentially whatever you believe in. For the libertarians: No minimum wage and a flat 15% tax system (no capital gains tax!); but for the progressives: 80% of the housing is built by the government. There's a little bit of everything in Singapore for everyone! Fun fact: In Singapore, there are no overweight people! 😲
With that as the backdrop, this week I transitioned from the webdev world of React/NextJS to the wonderful world of Python and command line interface (CLI) tools. As a refresher for newbies: The world of software development has historically been separated into "frontend" and "backend" camps. (Though in 2021, as I've been increasingly finding, this differentiation is no longer so distinct!) But the general rule of thumb is that if you can see/touch/interact with it in a GUI kinda way-- that's frontend (React, HTML, CSS, etc). And if there's no GUI but you're ingesting/processing/transforming/serving data-- that's largely considered backend.
Back in March, one of the Discord servers that I hang out in had shared a link to an intriguing project that caught my eye: Yobi's CTM Summary Tool.
Starting several years ago, I got really interested in the Classic Tetris NES scene. YouTube's recommendation algo saw fit one day to put the (now famous) "Boom! Tetris for Jeff!" video into my homepage feed and I fatefully clicked on the intriguing, albeit funny-looking, thumbnail. And the rest is history. For whatever reason, I (and many millions of fellow humans) can contentedly watch blocks fall endlessly for hours-- there is something enormously satisfying in seeing these titans of godly dexterity get pieces left and right at insane speeds in order to clear lines from the screen. I can watch (and have watched) it for hours on end.
How does any of this relate to coding/software development? In one word: Motivation.
I think, in my own experience at least, motivation comes in two general flavors: The first is desperation-- you've lost your job or gotten divorced, can't pay the bills, can't feed your children, etc. It's do-or-die time; the rubber's met the road; and your back's up against the wall. There's no way out but forward. And if you can't go around an obstacle, you're simply going to need to go through it.
And while I've had a few rough patches, I harbor no delusions: I am immensely, profoundly grateful that I've led a largely privileged and fortunate life. I've never known (and obviously hope to never know) true poverty or any meaningful sense of difficulty. At the moment, my biggest daily "frustrations"/challenges largely revolve around fighting TypeScript's static type checker, learning Go, and trying to figure out all the various ways one can configure webpack.
So I can't speak to desperation. But I do know it's an immensely powerful driver.
A second driver of action though, and one less fraught, is curiosity/passion/interest.
However, the problem (or at least, my problem) is that curiosity is less powerful than desperation. My curiosity, at least historically, has largely wavered in the face of other competing life priorities (health/family/eating/sleeping) as well as just plain intimidation.
Like: If you're young and a newcomer, I know it can simply feel intimidating. Overwhelming. Like there's just too much and you don't know where to start.
And, speaking as an old, if you are an old, I know it can feel exhausting. Like it just never ends and that you simply can't keep up. That there's always some new stack/tech/framework/library that you need to know. And that every time you look at that AWS hamburger menu, the list seems to have impossibly grown by another half-dozen menu items. And that tweens half or even a third of your age are IPO'ing Unicorns, or retiring on Bitcoin, or founding blockchain startups and making more money in a single day than you'll likely ever see in your entire life.
I hear you. I feel you.
BUT: (Cue inspirational music!)
So if there's only ONE thing I hope that you will take away from this article today, it's this:
I didn't invent that. It's advice as old as all of time. Ben Popper over on the popular Stack Overflow podcast has been building an online reservation system for a dog park next to his house. A dog park! And I spent one weekend back in March working on extracting and calculating Tetris stats from a videogame released back in 1989.
Being able to read/write code is the new literacy of our times. It empowers you. If you can write code, you can do anything. Anything. After World War II, most of devastated Europe was a smoking crater and Japan was essentially a bombed-out, asphalt parking lot. America's had a seventy+ year head start because the war never touched our continental shores but that was seventy years ago. Everything's different now. In Bosnia, they don't have roads, but they have Facebook. The entire world's finally come online-- "It's time to build."
As the great GameScout explains, Classic Tetris NES videos have long been missing the "pace statistic" in their GUI overlays. By inventing "TreyVision", Trey Harrison, the Technical Director for the entire Classic Tetris World Championships, had already given the world "score lead":
and "Tetris Rate" stat overlays:
but as GameScout explains, knowing a player's pace as they play is actually a much more accurate measure of how two players are stacking up against each other. Since players are able to "push down", which makes their pieces drop faster, they're also essentially able to play the game faster. Thus, as you can imagine: Being ahead in score isn't actually the whole picture if you are also ahead in lines, because you've already used up more of your limited supply. (Every player only gets 230 lines until the Level 29 "kill screen".)
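To make that concrete, here is a deliberately naive sketch of a pace projection. It is NOT GameScout's actual formula (which isn't reproduced in this post); it simply assumes each player keeps scoring at their current points-per-line rate until line 230. The function name and the sample numbers are made up for illustration.

```python
KILL_SCREEN_LINES = 230  # lines available before Level 29, per the note above


def naive_pace(score: int, lines: int, total_lines: int = KILL_SCREEN_LINES) -> float:
    """Project a final score by linear extrapolation (an illustrative assumption)."""
    if lines <= 0:
        return 0.0  # nothing to extrapolate from yet
    return score / lines * total_lines


# Hypothetical mid-game snapshot: A leads on score but has used more lines.
player_a = naive_pace(score=300_000, lines=100)
player_b = naive_pace(score=280_000, lines=80)
```

In this made-up snapshot, player A is ahead on the scoreboard, yet player B projects higher because B has scored more per line and has more lines left to spend.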
So, long story short: Via Discord servers and by working with others in the Classic Tetris community from all around the world like Xeal, Fractal, and Huffulufugus (all hail the Kingslayer! 🙇♂️🙇♀️🙇♂️🙇♀️), GameScout was able to figure out a reasonable way to calculate a "pace statistic." Back in my day, as tweens and teens we met each other in bowling alleys and at movie theaters. But nowadays, "Discord servers are the new bowling alley" for the youngblood so I followed GameScout's invitation to Yobi's Discord server.
By the time I'd arrived in Yobi's server, he'd actually done virtually all of the main work. The main program was already fully functional: it could take in a video file via a command line argument and process the video (do the necessary "Optical Character Recognition" to read individual frames of the Tetris video and convert player scores (pixels) into actual numbers that Python could understand). Unlike Mithi's repo (the previous React/NextJS project that I'd contributed to), Yobi hadn't yet organized any issues in his repo. So I just went ahead and asked in the Discord if there were any good "first issues" that he wanted help with.
A quick aside here: Contributing to open-source comes in many forms! For my own personal "52 Weeks of Open Source" journey that I've undertaken, I'm setting the bar at "making nontrivial code changes", meaning I want to actually check out the code, make some change, build it, and see that change. But you could totally wander around open-source land and just help with other tasks like documentation and general organization.
(In fact, and this may shock some of you, but at many of the larger megacorps, there is actually a job role for this! It goes by many names, but the one I hear most often is a "Project Manager." For the newbies: In the world of software development, there are actually many roles at the bigger megacorps --the Googles, Microsofts, Apples etc-- where really highly paid people actually don't write any code at all! In fact, if you can believe this, once an organization gets big enough, it actually becomes a full-time job for the managerial-types. I know in the previous century, manager-types got a bad reputation (obligatory Dilbert reference here), but nowadays, a good manager is actually worth their weight in gold. (At least, this is my humble opinion.) A good project manager will help organize and dictate coding priorities (just because we can doesn’t mean we should), keep programmers happy, and help eliminate "blockers". Basically, a good manager will enable other skilled technical people to properly do their job (and keep them from quitting).
Even if you're not a manager-type though, there are still plenty of ways to make obscene truckloads of money at any sufficiently-large software company. For example, if you enjoy technical work and coding, you can go down other tracks like eventually becoming a Senior Developer or a Tech Lead. And at really big organizations, if you're super-technically inclined, you can also become a "Distinguished Tech Fellow" or other roles of a similar stripe. Anyway, org charts are a discussion for another time. For me personally, I just like coding. So let's get back to it!)
A few hours later, Yobi promptly replied to my Discord message with a list of good possible first issues:
And I went ahead and put all of those issues into his GitHub repo:
Now it was finally time to get to the good stuff: Diving into the code and getting our hands dirty! Coders code! 😁😄🎉✊
Yobi had said that one of his personal wants was to see "Tetris Rate" added back into the main GUI. He had already done this work for the "Tetris Rate Adder" that he'd coded for Chris Higgins's Best of Five Classic Tetris documentary. So my task here was pretty straightforward. All I needed to do was read the code in his Tetris Rate Adder repo, locate the exact lines that added the TRT box, and then copy that code over to the correct location in the CTM Summary Tool repo.
Whenever you paradrop into any foreign repository, the one thing you're always looking for first is the project's entry point. This is where all of the reasoning begins. Usually, the project maintainer will be kind enough to at least provide one line in their README on where to begin. Eg.
Alright, so we start with trt_movie.py in the Tetris Rate Adder project. Remember, all we need to do here is copy some code from one repo to another. No rocket surgery here. But we do need to put on our detective hats for a bit.
In trt_movie.py, I quickly scan the code and look for anything that looks like it might be related to "Tetris Rate"-- basically, anything with the word trt inside of it:
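That kind of scan can also be automated. Below is a minimal sketch of a grep-style helper, assuming a case-insensitive match on "trt" is good enough; the sample lines (and the variable and function names inside them) are hypothetical stand-ins, not Yobi's actual code.

```python
import re
from typing import Iterable, List, Tuple


def find_mentions(lines: Iterable[str], needle: str = "trt") -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing `needle`, ignoring case."""
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    return [
        (number, text.rstrip())
        for number, text in enumerate(lines, start=1)
        if pattern.search(text)
    ]


# Stand-in source text; in practice you could pass open("trt_movie.py") instead.
sample = [
    "import cv2\n",
    "trt_box = (10, 20, 80, 30)  # hypothetical variable\n",
    "frame_count = 0\n",
    "draw_TRT(frame, trt_box)  # hypothetical call\n",
]
hits = find_mentions(sample)
```

Each hit carries its line number, which makes it easy to jump straight to the candidate "organs" in an editor.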
By this point, I'm starting to form a preliminary mental model in my head of how Yobi's program works. Basically, it's just one big function. In the programming world, we actually call this "scripting" which is categorically different than "proper software development." It makes sense because Yobi wrote this very quickly in something like two days. When we're scripting, "proper software development" (Encapsulation! Composition! Design Patterns!) essentially gets tossed overboard because we're just trying to get something done. Quick-and-dirty, cowboy-style. Yeah! 😀
(I love myself a good quick-and-dirty script. It makes me feel alive. Like I've strayed across the tracks into that side of town. (Eg. If you're in Cincinnati, a good example would be wandering two blocks in any direction off the main drag in OTR, away from the Disneyfied redeveloped area.))
Alright, so at this point, I've identified the organs I essentially want to transplant back into the CTM Summary Tool.
Now I navigate back to the CTM Summary Tool and find its entry point:
So we go look at generate.py. Again, it's essentially one giant function. OpenCV (imported in Python as cv2) is the library for image processing; it's been around forever since the dawn of time and is basically the open-source go-to package anytime you need to process video (eg. compress video or extract frames/information, etc). As is always my first step, I dink around for a bit just to make sure I can "dent its universe." Basically: Just make some trivial change and see that change reflected in the program somewhere.
Yobi's program is pretty straightforward: You run the script and feed in your local filepath to the video. The program then crunches that video and generates another video-- this one with the stat overlays burned into it.
So, for example, you feed in this video:
And out pops this video:
But how does Yobi's program know what (x, y) regions to OCR in the video? And how does it know what regions to paint the generated stats?
config.json, of course!
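As a rough illustration of the idea, here is a sketch of what such a config could look like and how a script might turn it into pixel bounds. Every key name and number below is hypothetical; the real config.json in Yobi's repo may be laid out quite differently.

```python
import json

# Hypothetical config contents; the real key names in Yobi's config.json may differ.
EXAMPLE_CONFIG = """
{
  "player1": {
    "score_region": {"x": 100, "y": 50, "width": 120, "height": 20},
    "overlay_position": {"x": 100, "y": 300}
  }
}
"""


def region_bounds(region: dict) -> tuple:
    """Convert an x/y/width/height region into (left, top, right, bottom) pixel bounds."""
    left, top = region["x"], region["y"]
    return (left, top, left + region["width"], top + region["height"])


config = json.loads(EXAMPLE_CONFIG)
bounds = region_bounds(config["player1"]["score_region"])
```

Keeping the coordinates in a config file rather than in the code means the same script can handle videos with different layouts just by swapping the JSON.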
At this point, the documentation was a little unclear. The only thing I didn't have, and couldn't find anywhere, was the actual video that was used for this example config. But luckily, everyone on Yobi's Discord was super-helpful. I reached out and GameScout shortly replied with the Google Drive link of the video I needed to download to run against these example config values:
Alright, so that was everything I needed to get started! Well, for my first foray, let's just see if I can paint some custom overlay, eh?
So looking at how Yobi drew the "Tetrises" label, I followed that same pattern but just with my own custom message:
Ran it and:
Haha, booyeah! We're in business! I have successfully rendered this "Robert was here!!!" message on every single frame of the processed video. 😄
From here on out, the rest was straightforward. Like, once you get that initial "toehold" into the program and figure out how to "dent its universe", the rest is honestly elementary.
Oh wait, that's a lie. There was ONE hiccup. After I implemented adding the TRT overlays, I got the full video from GameScout. (The one he provided earlier was just a two-minute sample.) After I'd gotten the full video (eight minutes long), I decided to run my modified program against it. And to my surprise, the TRT rates that were being generated weren't what I expected! Several minutes into the game, the TRTs would be something unreasonably low. In fact, the more Tetrises players scored, the lower the rate dropped! (It's almost as if the numerator was staying the same while the denominator was growing ever larger… but why/how…?) What was going on??
So I scanned the code and on line 132 in
Oh, Yobi 🤦♂️😉.
So I made that fix:
And then the rest was elementary 😄.
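For anyone curious about the symptom, here is a minimal sketch. It uses the common definition of Tetris Rate (the share of cleared lines that came from Tetrises) and is not Yobi's code; the actual bug isn't reproduced here. It only shows how a numerator that stops updating makes the rate sink as the line count grows.

```python
def tetris_rate(tetrises: int, total_lines: int) -> float:
    """Share of cleared lines that came from Tetrises (4 lines each), from 0.0 to 1.0."""
    if total_lines <= 0:
        return 0.0  # avoid dividing by zero before the first line clear
    return 4 * tetrises / total_lines


# Healthy case: both counters advance together (a player scoring only Tetrises).
healthy = [tetris_rate(t, lines) for t, lines in [(1, 4), (2, 8), (3, 12)]]

# Symptom from the story: the Tetris count is stuck at 1 while lines keep growing.
stale = [tetris_rate(1, lines) for lines in (4, 8, 12)]
```

The healthy series stays flat while the stale one keeps dropping, which matches the "more Tetrises, lower rate" weirdness described above.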
After making the necessary code changes, I submitted my PR:
When you make a PR to a webdev project, nowadays, most projects have CI/CD hooks that'll immediately deploy a preview of your PR to Vercel, Netlify, Heroku, Firebase hosting, etc. But with non-webdev projects it's a bit tougher because none of that 21st-century-ease-of-collaboration-machinery/tooling exists. So for Yobi's project, I put a screenshot of what my PR does to give him a heads-up of what to expect. I also mention other salient notes:
- I've introduced new "trt-related" config parameters into
- A link to download the underlying video (the one that GameScout provided earlier) that the example config is built against.
And that's it! PR Week #2 in the bag! 🥳🎉
Working on Yobi's CTM Summary Tool was so much fun!
Just for kicks, I actually submitted another PR to Yobi's CTM Summary Tool. This one added a CLI Menu. Yobi hadn't requested it; it was just a fun side project I thought would be neat to whip up. 😄
In addition to introducing a rudimentary CLI menu, my PR also added the ability to extract the Tetris metadata as its own .srt file where Tetris stats were recorded in text-format and timestamped. Though I ended up not pursuing this further at the time, it was an idea I was playing around with: Instead of generating another video, how about just creating a text file with all of the metadata in it? This text file could then be imported into Excel, R, or a Jupyter notebook for further analysis and number-crunching with pandas or something similar.
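To give a flavor of the idea, here is a small sketch that renders timestamped stats in the SubRip (.srt) cue layout: an index, a start --> end timestamp line, then the text. It is not the code from my PR; the stat strings and the one-second cue duration are assumptions made up for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SubRip (.srt) files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(samples, duration: float = 1.0) -> str:
    """Render (start_seconds, text) pairs as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, text) in enumerate(samples, start=1):
        end = start + duration
        cues.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(cues)


# Hypothetical stats; the field names are made up for illustration.
srt_text = to_srt([(0.0, "score=0 lines=0"), (61.5, "score=12000 lines=4")])
```

Because the output is plain text, it can be written to disk with a single file write and parsed back later without touching the video at all.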
So many ideas and possibilities! So little time!
In the end: Though I was pleased with my own efforts, after some discussion, Yobi explained that he didn't like CLI menus so he didn't merge the PR into trunk. Chiefly (and this is a good point!), CLI menus are bad in the sense that you can't put them into other scripts when you're batching automation. For example, say you want to crunch a thousand videos. The way Yobi's program is currently written (accepting files via command line args) is much better for that because he can write another script that just calls generate.py a thousand times and feeds in an array of files that he wishes to crunch.
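A batch driver along those lines could be sketched like this. It assumes generate.py accepts the video path as a single positional argument (the exact arguments aren't shown in this post), and the file names are hypothetical.

```python
import sys
from pathlib import Path
from typing import Iterable, List


def build_commands(videos: Iterable[Path], script: str = "generate.py") -> List[List[str]]:
    """Build one `python generate.py <video>` command per input file."""
    return [[sys.executable, script, str(video)] for video in videos]


# Hypothetical file names; a real run might use Path("videos").glob("*.mp4").
commands = build_commands([Path("match1.mp4"), Path("match2.mp4")])

# To actually run them, uncomment:
# import subprocess
# for cmd in commands:
#     subprocess.run(cmd, check=True)
```

An interactive menu would stall a loop like this at the first prompt, which is exactly Yobi's objection.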
One beautiful thing about open-source is that the code can be forked! It can be branched! Everyone is contributing it to the commons, free for use by anyone else (under the terms of the project's license). While you can always submit a PR back to the original repo, you also always have the option to just take the code in a wildly different direction and modify/hack/tweak it to your heart's content. This is the way! Code wants to be free!!! 😁
Also!! To my complete surprise and pleasant delight, last week GameScout made a YouTube video and at the very end, I'm even mentioned! So awesome, this totally made my day! Thank you for the shoutout, GameScout!!!! 🙏🙏🙏
I got featured in GameScout's video!!!!!!!!!!!!!!!!!!!! 😀😁😄😊🤗🎉🌟✨
Alrighty, and that's all for this week! Tune in next week for open-source project #3! Next week, we'll return to webdev world. Teaser: For nearly ten years, I'd put off learning React. But it was time. Same time, same channel next week! Hopefully see you then! Until then, Open-Source Avenger out! 😄🚀