Finally Coming Around to TypeScript

I was a JavaScript junkie for a long time before trying out Golang. Then I was a statically typed junkie for a while, almost despising JavaScript for the lack of safety it provides out of the box, but still acknowledging and appreciating how ridiculously fast and easy it is to build with.

First off, Golang’s great. I’d use it in a heartbeat if performance or multi-threading were the bottleneck. But being a typed language isn’t special. There are plenty of statically typed languages, and there are static analysis tools for just about any dynamically typed language.

Additionally, one thing that has frustrated me about Golang is its intolerance for cyclical imports, which forces you to think very carefully about directory structure and package organization. On top of that, the brittle nature of third-party dependencies really sets in. Many libraries are deprecated or work awkwardly in combination with go dep.

Ultimately, the Golang ecosystem discourages third-party abstractions and prefers that you build from scratch. Even folks in the community tend to respond to such inquiries with “why not just build it yourself?” It’s not entirely bad, since Golang has nice data structures like maps and dynamically sized arrays (slices), but in the end, Golang is better for building “things” and not “features”. Until I need a high-performance data-crunching service, I’ll stick to TypeScript.

The thing is, I have to thank Golang. It taught me that there really isn’t a reason for “null” most of the time, since an empty state can be represented without it. It taught me that by making arguments required and typed, you can reduce runtime bugs to nearly zero. It also made me fall in love with static typing + TDD, as I had finally reached a level of robustness that I had striven for as an engineer.

So with a recent project that’s mostly written in Node.js but has a few Golang services, I decided to try out TypeScript as a theoretical best-of-both-worlds. And it did not disappoint.

By making both the arguments and return values explicitly typed, I’ve felt a great sense of relief knowing that the compiler will kick me in the shins whenever I inevitably use the wrong data type. The fact that TypeScript can even specify the type that a Promise resolves to is ridiculously impressive, as I’d expected it to only support primitive or user-defined types. I’ve seen too many “of undefined” exceptions in my day to continue using vanilla JS in a professional context. TypeScript makes me sleep better at night.
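To make that concrete, here is a minimal sketch of a function with a typed argument and a typed Promise result. The User model, the users table and the findUsers name are all made up for illustration. It also leans on the “no null” habit from Golang by returning an empty array when nothing matches.

```typescript
// Hypothetical model and lookup table, used only for illustration.
interface User {
  id: number;
  name: string;
}

const users: Record<number, User> = {
  1: { id: 1, name: "Ada" },
};

// Both the argument and the resolved value are explicitly typed.
// An empty array represents "nothing found", so no null is needed.
async function findUsers(ids: number[]): Promise<User[]> {
  return ids.filter((id) => id in users).map((id) => users[id]);
}

// The compiler rejects calls like these before the code ever runs:
//   findUsers("1");                                        // a string is not number[]
//   findUsers([1]).then((found) => found.toUpperCase());   // User[] has no toUpperCase

findUsers([1, 2]).then((found) => {
  // Only ID 1 exists in the table, so one user comes back.
  console.log(found.map((user) => user.name));
});
```

The interesting part is the return type: Promise<User[]> tells every caller what the awaited value will look like, so a typo on a property name is caught at compile time instead of surfacing as an “of undefined” error.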

The Age of Abstractions

Before 2019, I enjoyed building “things”. I’d tinker with algorithms, try out new libraries, and play with low-level code. I rarely do those things anymore.

It seems like there has been a shift in the dev world. We don’t build “things” anymore. We build “features”.

  • A long time ago, we’d set up a physical PC rig at home as a dedicated server. Now we use AWS or DigitalOcean. Sometimes we go “serverless”.
  • We used to create backend APIs. Now we use BaaS (Backend-as-a-Service) platforms like Firebase.
  • We used to run our apps directly on the OS. Now we run an entire isolated environment in Docker.
  • We used to manually back up different versions of our app. Now we use Git commits, branches and tags.
  • We used to string together a bunch of frontend libraries (Backbone, jQuery, Lodash, Handlebars) as the foundation of a Single Page Application. Now we just run create-react-app.

Many of the nitty-gritty technical hurdles are solved. By adopting modern abstractions, productivity can increase by orders of magnitude.

Blockchains and Neural Networks: Catching Up With The Hype Train

I’ve always avoided hyped technologies. But now that I’ve run out of shiny cool things to play with (and I can mess around with bytes without crapping myself), I’ve taken a committed step towards learning about and applying these new hot topics.

Blockchain. It’s the thing that most investors don’t understand, but will throw money at like it’s Stripper of the Year. After the Bitcoin bubble popped in 2018, I don’t care much about cryptocurrency. The nice thing about crypto is that there are no banks involved. The bad thing about crypto is that… there are no banks involved. Once you decide to cash out into a bank account, don’t expect Wells Fargo and Uncle Sam to be very nice about it.

But cryptocurrency is only a subset of blockchain, not its entirety. A blockchain, as I currently understand it, is really just a data structure where node N contains a hash that’s generated from its own data in addition to the data of node N-1. Each node can be validated by recalculating and comparing its hash, which makes every node intrinsically tied to the previous one, hence the “chain” part.

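Proof of Work builds on that hash. A block is only accepted if its hash meets a difficulty target (for example, a certain number of leading zeros), and the only way to find such a hash is to try many different nonce values. Ideally, it should be computationally expensive, in the same spirit as a good password hashing algorithm, so that rebuilding the entire blockchain would be expensive.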

I’m interested in the integrity aspect of blockchain. If you need to log non-trivial data like financial transactions, you want to be assured that each record is valid and that no records are missing or modified. Any linear dataset can be made into a blockchain by adding to each entry a hash that also covers the previous entry. It seems like a nice auditing feature.
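Here is a minimal sketch of that auditing idea, not the code from my experiments. It assumes Node.js (for the built-in crypto module) and chains each block to the previous block’s hash rather than to its raw data, which is the more common arrangement. Proof of Work is left out entirely, and the Block, append and isValid names are made up for illustration.

```typescript
import { createHash } from "node:crypto";

interface Block {
  data: string;
  prevHash: string; // hash of the previous block (all zeros for the first one)
  hash: string;     // SHA-256 over prevHash + data
}

const GENESIS_PREV = "0".repeat(64);

function hashBlock(data: string, prevHash: string): string {
  return createHash("sha256").update(prevHash).update(data).digest("hex");
}

// Returns a new chain with one more block linked to the current tail.
function append(chain: Block[], data: string): Block[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_PREV;
  return [...chain, { data, prevHash, hash: hashBlock(data, prevHash) }];
}

// Recalculates every hash and checks every link to the previous block.
function isValid(chain: Block[]): boolean {
  return chain.every((block, i) => {
    const expectedPrev = i === 0 ? GENESIS_PREV : chain[i - 1].hash;
    return (
      block.prevHash === expectedPrev &&
      block.hash === hashBlock(block.data, block.prevHash)
    );
  });
}

// Made-up ledger entries.
let ledger: Block[] = [];
for (const entry of ["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"]) {
  ledger = append(ledger, entry);
}

// Same chain, but with the middle record edited after the fact.
const tampered = ledger.map((block, i) =>
  i === 1 ? { ...block, data: "bob pays carol 200" } : block
);

console.log(isValid(ledger));   // true
console.log(isValid(tampered)); // false
```

One caveat of this sketch: it catches edited records and records removed from the middle, but chopping entries off the end still leaves a valid (shorter) chain, so a real audit log would also need to record the latest hash somewhere else.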

Beyond that, I’m still learning about how decentralization works with a blockchain. I see a few challenges with achieving decentralization, such as peer discovery, peer authentication, delegation and possible race conditions. Here’s the GitHub project that I’m using to do blockchain experiments:

Neural Networks. Last year, I took my first plunge into machine learning. I learned that the big picture of ML is to train a model (algorithm) with inputs and correct outputs, so that when given a new input, it can provide an accurate output. In school, we learned about the formula y = mx + b and we’d produce outputs from the formula. In the real world, we often have outputs, but no formula. Machine learning is about finding formulas, or in other words, identifying patterns.

I did my first “machine learning” project using linear regression, which is the granddaddy of all machine learning techniques. It would draw a line through historical movie ratings to predict future data.
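“Finding the formula” for a straight line can be sketched with ordinary least squares. This is not my movie ratings project, just a minimal illustration with made-up points; fitLine and predict are hypothetical names.

```typescript
interface Line {
  m: number; // slope
  b: number; // intercept
}

// Ordinary least squares for a single input variable:
// m = covariance(x, y) / variance(x), b = mean(y) - m * mean(x)
function fitLine(xs: number[], ys: number[]): Line {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need at least two (x, y) pairs");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;

  let covXY = 0;
  let varX = 0;
  for (let i = 0; i < n; i++) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) * (xs[i] - meanX);
  }
  if (varX === 0) throw new Error("all x values are identical");

  const m = covXY / varX;
  return { m, b: meanY - m * meanX };
}

// Made-up training data that lies exactly on y = 2x + 1.
const line = fitLine([1, 2, 3, 4], [3, 5, 7, 9]);
const predict = (x: number): number => line.m * x + line.b;

console.log(line);       // { m: 2, b: 1 }
console.log(predict(5)); // 11
```

Real data never sits perfectly on a line, so the fitted m and b are the values that minimize the squared error rather than an exact match.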

Neural Networks, at the core, allow us to train a model on multi-featured inputs and classify them. In the following example, the input is a photo of a banana. A “feature” of the input is a pixel value. The output is an array of weights (or confidence) for every possible class.

Illustration of a Neural Network

The above array is an example of what a neural network will take at the input layer and return at the output layer. The input is a 1-dimensional array of pixel values that represents a photo of a fruit. We humans see pictures as 2-dimensional objects, but the neural network is fine with a 1-D representation, under the condition that all the images are normalized to the same size.

The output layer is an array of all possible classes that the input photo might be, where each value is the weight (or confidence) of that class. This is awesome because it aligns with how humans think. If we see an object that we’ve never seen before, we don’t always have the exact answer to what it is, but can deduce the answer by comparing multiple possibilities. At the end of the day, we usually choose the best answer (the weight with the biggest value) and move on. In this case it’s the banana.

But sometimes it’s also useful to know the 2nd-best answer because it might also be correct. A neural network is flexible enough to give multiple answers to a single question, which is in my opinion the reason why it’s a big deal and something I’ll be focusing on for the foreseeable future. I made a program to recognize handwritten characters using a neural network:
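Reading the output layer boils down to pairing each weight with its class and sorting. A minimal sketch, with made-up labels and made-up output values (no actual network involved), and a hypothetical rankClasses helper:

```typescript
interface Prediction {
  label: string;
  confidence: number;
}

// Pairs each output value with its class label, best guess first.
function rankClasses(labels: string[], outputs: number[]): Prediction[] {
  if (labels.length !== outputs.length) {
    throw new Error("one output value is required per class");
  }
  return labels
    .map((label, i) => ({ label, confidence: outputs[i] }))
    .sort((a, b) => b.confidence - a.confidence);
}

// Hypothetical classes and output-layer values for a photo of a banana.
const ranked = rankClasses(["apple", "banana", "orange"], [0.07, 0.81, 0.12]);
const [best, runnerUp] = ranked;

console.log(best.label);     // banana
console.log(runnerUp.label); // orange
```

Keeping the whole ranked list around, instead of only the top entry, is what makes the “2nd-best answer” available when the top two confidences are close.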

Moving forward, I’ve found that there are numerous abstractions to help with machine learning, such as TensorFlow, scikit-learn, and more. Personally, I enjoy doing ML in C/C++, but I’ll inevitably need to get comfortable with Python if I’m going to take data science seriously.

My biggest challenges thus far:
– Figuring out dope problems to solve with machine learning
– Finding good data sources (I remember a friend told me to check out Kaggle)
– Formatting data to play nice with neural networks (OpenCV seems decent at this)

Machine learning is super fun and I look forward to posting more of my experiences with it.

Specification-Driven Development (and why it’s the future of software development)

TLDR: I generated code for a REST API server and a microservice using a single Swagger spec file.

Every once in a while, I take time to completely re-evaluate how I approach software engineering. Design patterns are cool, but aren’t a game-changer in terms of increasing productivity. New dev tools go out of style as soon as you get comfortable with them (I like to wait until they’re mature). And no one programming language can be proven objectively better than another.

But after my recent exploration of what I call “specification-driven development”, my approach to backend development is significantly different today than it was last week.

At my last company, we had started a new project and decided to try go-swagger, which is a tool that generates the code for an entire Golang API server based on a Swagger spec file. I was reluctant because it reminded me of the Adobe Dreamweaver days. But after the workflow finally clicked, it was exhilaratingly efficient.

I learned how to write a REST API schema according to the Swagger spec. In the Swagger file, I was not only able to specify API endpoints, but also add models (YAML can use references to keep it DRY), write descriptions for everything (which would be used for the generated documentation), and even add example values so that the documentation looked realistic. Then I ran a simple command and BOOM, all the server code was there. Another command resulted in my browser popping up with pretty docs and example requests/responses.

So recently I started a new project with the same formula: a code-generated API server. But I took it a step further by also having microservices that were code-generated via the same Swagger spec. It took a bit of shell script kung-fu, but I was enlightened to learn about Google’s Protobuf protocol and now I see what the future of software development will look like…

Backend developers have always dictated data models. A single declarative language will be used to define the endpoints and data models of a backend API, and code-generation tools will generate usable server and client code for all major languages. It’s already happening. There are a plethora of Swagger code generators. That saves a tremendous amount of time when iterating quickly and trying to keep the backend and frontend clients in sync. I’ve always believed that when working with data, it’s important to have a “single source of truth”, and now it’s possible to do that.

So now to the details. I started off with a swagger spec file:

At the end of the Swagger file would be the list of definitions that were referenced in the endpoints:

With the endpoints and models defined, there would now be enough information to generate the server with go-swagger:

swagger generate server -t ./generated

That’s it. Then all of a sudden, you see code show up in the generated directory.

The cmd directory contains the main function/entry point. You can go run that.

The models directory contains… models.

And the restapi directory contains the endpoint handlers. The configure_myappname.go file contains hooks so that you can plug in your logic.

So with the API and models generated, I now wanted to generate a microservice.

After looking at various microservice frameworks, I chose go-micro as the candidate of choice because it had lots of stars, the example code looked simple enough and the readme had cool buzzwords like “protobuf”, “service discovery” and “load balancing”. So why not?

The thing is, I wanted to generate code from my Swagger file. The go-micro framework instead used the protobuf language to describe models and interfaces. So before moving forward, I’d need to convert my swagger file to a proto file. To do so, I found a tool called openapi2proto, maintained by The New York Times. There wasn’t really another viable alternative, so I crossed my fingers… and it worked. I now had a myapp.proto file that contained all my swagger declarations, but in proto format, like so:

From there, the protoc tool was used to generate the microservice code in Go. I’ll be honest. There was a lot of tinkering required to create a shell script that generated working code, because Golang is very picky about directory/package structure, and code generation tools rely on that stuff to do their job. But after chaining together a few of the tools and adding additional hacks to tweak the generated files, I ended up with an ugly script that generated lovely results.

I ran into deep frustrations. I was on the cusp of saying “screw this” and building the microservices manually. But thankfully, with a bit more fiddling, I was able to automate the tedious parts. I can now create a new microservice in minutes, plug it in, and start implementing features. And if model changes are required, I just update the swagger spec, re-run the scripts and tweak a bit of code.

After about 2 days of foundation work, I now have an MVP backend. I try to keep things as simple as possible in the beginning, because once I add Docker and nginx to the mix, it’s only going to add additional complexity.

But for now, I’m going to jump into some React Native code. I’ll try to generate JS classes and maybe a JS client from the swagger spec to have generated code across the full stack.

Revisiting Swift

A long time ago, I wrote native iOS apps with Swift.

Then I didn’t for a while.

Now I do again.

I used to hate Swift. I thought that it was inconsistent because you could explicitly or implicitly declare a variable type. There were other reasons too, but my dislike for Swift was mostly because it was a bit different than other C-based languages.

But after getting comfortable with Golang, I can now appreciate Swift as a modern language that solves common issues in nice ways:

1.) Guard Statements
In any programming language, we have to check various conditions before doing work. Usually you’d check for a null value or attempt to cast a value to another type. Depending on your coding style, it might take several statements to do that. In Swift, it can be done in a single statement. And it’s specifically designed so that you can use the value outside the block scope, encouraging flatter code rather than nested chaos.

2.) Protocols
Nothing new here. They fill the same role as interfaces in Java or abstract classes in Java and C++. A base type contains method signatures that must be implemented in the derived type. Any good IDE will offer to fill in the stubs for you (Android Studio does for Java), and Xcode does for Swift.

I just find that Swift’s protocols are simpler to work with as they don’t require any special keywords (in a Java abstract class, you’d need to explicitly mark each method as abstract and in C++ you’d need to mark them as pure virtual). In Swift, you can make a protocol just like a class, but omitting the implementation.

3.) Tuples
It’s when you can treat multiple values as a single value. Golang has something similar with multiple return values. Sometimes it’s nice to not need a struct for tossing around multiple values.
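None of these three ideas is unique to Swift. As a rough analogy (this is TypeScript, not Swift, and every name in it is made up), an interface plays the role of a protocol, a fixed-length array type plays the role of a tuple, and an early return that narrows the type gives the same flat shape as a guard statement:

```typescript
// Protocol-like contract: method signatures only, no implementation.
interface Greeter {
  greet(name: string): string;
}

class FriendlyGreeter implements Greeter {
  greet(name: string): string {
    return `Hello, ${name}!`;
  }
}

// Tuple: two values travel together without declaring a struct or class.
// Takes the first two whitespace-separated words of the input.
function splitName(full: string): [string, string] {
  const [first = "", last = ""] = full.trim().split(/\s+/);
  return [first, last];
}

// Guard-style early exit: bail out first, then keep working with the
// narrowed value at the top level of the function instead of nesting.
function greetUser(input: string | null, greeter: Greeter): string {
  if (input === null || input.trim() === "") return "No name given";
  // From here on the compiler knows input is a string.
  const [first] = splitName(input);
  return greeter.greet(first);
}

console.log(greetUser("Grace Hopper", new FriendlyGreeter())); // Hello, Grace!
console.log(greetUser(null, new FriendlyGreeter()));           // No name given
```

Swift’s guard goes one step further than the early return here, because the compiler forces the else branch to leave the scope.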

It’s a shame that the language is so intrinsically associated with the Apple ecosystem (iOS and macOS). There are, of course, multi-platform solutions currently out there for working with Swift on Android, Linux and Windows… but those projects tend to resemble experiments rather than production-grade solutions. Swift is nice. But not nice enough to choose over other languages if given the chance.

The Making of Another App: Dank Nights

I like the nightlife. Living in LA, there’s no shortage of good night time spots. But finding the best spots isn’t easy. Even after 3 years of actively scouting and accumulating great recommendations, I felt like I barely scratched the surface because LA is a huge city.

So I made an app for it. The app is called Dank Nights. The app itself is rather simple. It just shows a bunch of markers on a map with colors to indicate how active a spot is on a given night. The red spots are the most active, while the orange spots are less active and gray spots even less.

There are currently 1000+ spots to choose from, filtered by day of the week to only show spots that are considered to be “dank” spots. What makes a spot “dank”? Ultimately, it’s a combination of how active the spot is on a given night (specified by the filter) and the rating of the spot. This app is strictly about night spots, so the venue must be active between 9pm and 2am.

The native app is built with React Native, but for the backend I decided not to use Node.js, which would normally be my default choice.

I’m using Golang for the backend REST API. The go-chi framework is about as simple as Node’s Express framework for setting up a server. Since Golang is statically typed (or at least close enough to it), most of the scary runtime bugs that I’d normally run into with Node are no longer an issue. The Plush package brings back the familiar syntax of EJS/Handlebars templates.

Since the activity of a spot is drastically different depending on the day of the week (and some of us are crazy enough to hit the bar on Taco Tuesday), I’ve added filters for each day of the week. The default filter is for “weekends”, which is an aggregate of Friday and Saturday nights.

There’s a barely-working search engine to further find desired spots, but creating a good search engine is very very hard because you have to infer the “intent” of the user rather than just match strings.

At this point, the core functionality of the app is “done”. Nonetheless, Apple rejected the app, stating (in fancier terms) that it wasn’t useful enough. It’s not unexpected. I’ve never had an app pass through the App Store review process without an initial rejection. That’s just part of their quality control process. And if I do ever pass an initial review, I’ll probably have a difficult time sleeping that night because of anxiety.

I guess I’ll add some kind of listview component just to get this thing into the App Store. Then I’ll move on to more impactful features, like cross-referencing Instagram activity with venues to further separate the dank spots from the not-dank spots.

Edit 2/20/19: I added some more visual spice as well as Yelp integration. Apple keeps rejecting the app for “lack of features/content”. I’ve pulled the plug on this project. It was fun building a new backend with Golang and scrubbing the rust off of my React Native skills, but Apple sucks nowadays.

R.I.P. “Dank Nights”

Golang Impressions

I had previously started dabbling in C for work. C/C++ from a language standpoint isn’t the hardest thing in the world. But getting all the dependencies linked properly in a C/C++ environment is indeed the hardest thing in the world.

So we started using Golang. The performance is great, considering how high level the language is. It feels like a combination of C and ES6, considering it takes a more functional approach and doesn’t support classical OOP, yet it has a simple package/import system and built-in data structures like map and dynamically sized arrays (called “slices”).

Testing is built-in via the go test command, provided you’ve created some files that end with _test.go. A single independent binary is compiled via go build. Golang is very picky about where a project is located, often requiring you to place it in $GOPATH/src/. When you start using the go get and go install commands, it makes more sense.

Concurrency was one of the huge selling points for me. Golang doesn’t exactly let you create “threads” per se, but “goroutines” instead, which are lightweight subroutines started with the go keyword. From there, the Go runtime’s scheduler decides how to multiplex them onto OS threads. An object called a “channel” can be safely read from and written to by multiple goroutines. There’s also the sync/atomic package, which helps with thread-safe operations.

Ultimately, I’ve been spending less time on toolchains and more time building features. Golang truly is a modern systems language.

I would post some code, but there’s nothing syntactically spectacular about the language. Some of my early Golang study code can be found here:

WebAssembly: C/C++ In The Browser

Since I’ve been immersed in C/C++, I thought it would be a good time to try WebAssembly.

What is WebAssembly? It’s a new browser API that allows you to send over bits of compiled code to the browser for execution. I was originally hesitant about pursuing WASM because the toolchain looked painful. But thanks to an online tool, that part was made incredibly simple:

We start with some hand-written C/C++ code like this:

Then that C/C++ code gets compiled to WAT format, which is a bit more descriptive about which data types will be used:

There are 2 notable things about the WAT file. First, it shows you that 32-bit ints (i32) will be used. That can be changed to i64 if you want to use 64-bit ints. Second, it exports a function called “fib”. Depending on your compilation settings, the name might look different, such as “_Z3fibi”, which is the C++ name-mangled form you get when the code is compiled as C++ rather than C. We need to use that exact name in the JS script.

Next up, the WAT file gets compiled into assembly, which you really don’t need to look at unless you enjoy self-inflicted torture. From there, we download the final WASM file which we’ll load via a JS script:

We fetch the wasm file as an array buffer, compile it into a module, then create an instance of the module. The module instance is an object with the exported functions. Our “fib” function is there. From there, you can call the function and get a return value just like any regular JS function.
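The compile, instantiate and call sequence can be sketched without any files on disk. To keep it self-contained, this example does not use my fib module. Instead it embeds the bytes of a tiny hand-assembled module that exports an add function, and it uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors (fine in Node.js and for very small modules). In a browser you would fetch the .wasm file and use the asynchronous WebAssembly.instantiate instead.

```typescript
// A minimal WebAssembly binary, assembled by hand for this example.
// It exports one function: add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile the bytes into a module, then create an instance of it.
const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule, {});

// The instance's exports object holds the exported functions by name.
const add = instance.exports.add as unknown as (a: number, b: number) => number;

console.log(add(2, 3)); // 5
```

The cast on the last step is needed because TypeScript cannot know the signature of an export from the binary alone, so the exported name and its parameter types have to match what the module really contains.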

I made a demo project in which the C and JS versions of the Fibonacci function were benchmarked. I saw a 1.7x to 1.8x improvement in speed in favor of the C version:

As mentioned earlier, my primary concern was the build process. In the same way game engines make the work of a game developer easier, build tools like Webpack make the job of a web developer easier. And by the looks of it, there’s already a WebAssembly plugin for Webpack: My fears have been put to rest.

My early impression of WebAssembly is that it’s a pretty awesome extension of the web. It’s not meant to replace JS, but it’s a way of improving performance for demanding parts of the client side app. Since the dawn of web applications, we’ve tended to put the heavy number crunching logic on the backend so that slower devices don’t suffer. Well with WebAssembly, perhaps we’ll see a paradigm shift where we load more work onto the client, which can result in both a snappier experience and less server load. Since C/C++ has excellent static analysis tools, the code would be potentially more stable too. While I initially felt “blah” about WASM, I think the ROI is certainly a net positive for the future of web.

This awesome article inspired me to give WASM a shot:

Low-Level Revelations

I’m learning C for work. I’ve always wanted to immerse myself in C/C++ but never had a reason to… until now.

I’ve previously played with C++. It has a huge amount of features. But I didn’t realize how big of a language it was until I started learning C. Here’s a list of things that C++ has that C doesn’t:

  • A built-in bool type (use an int of 0 or 1, or <stdbool.h> from C99 onward)
  • Classes (use structs)
  • Overloading operators
  • new and delete operators (use malloc and free)
  • Template functions

Yet even though many of those taken-for-granted features were missing, I found that it was easier to solve problems in C because I wasn’t burdened with the question of how to architect a program. There are no classes or namespaces, so it’s pretty simple: just use functions and structs.

It turns out that you really don’t need those advanced features or higher level datatypes to get stuff done. With some decent intuition about common data structures, you can solve most problems.

Here’s a function from a word-counting program I made:

Each “WordInfo” object contains a word string, count and pointer to the next word. The function iterates through each word and, if it finds a match, increments the counter for that word. Otherwise it creates an entry for the word at the tail of the list.
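The shape of that function can be sketched as follows. This is a TypeScript rendition for illustration, not the original C, so the malloc and free calls are replaced by object literals and the garbage collector. The field names and the countWord name are guesses based on the description above.

```typescript
interface WordInfo {
  word: string;
  count: number;
  next: WordInfo | null; // link to the next entry, null at the tail
}

// Walks the list looking for the word. Increments its counter on a match,
// otherwise appends a new entry at the tail. Returns the head of the list.
function countWord(head: WordInfo | null, word: string): WordInfo {
  if (head === null) return { word, count: 1, next: null };

  let node: WordInfo = head;
  for (;;) {
    if (node.word === word) {
      node.count += 1;
      return head;
    }
    const next = node.next;
    if (next === null) break; // reached the tail without a match
    node = next;
  }
  node.next = { word, count: 1, next: null };
  return head;
}

let head: WordInfo | null = null;
for (const word of "the cat saw the other cat".split(" ")) {
  head = countWord(head, word);
}

// Prints: the 2, cat 2, saw 1, other 1 (one per line, in first-seen order)
for (let node = head; node !== null; node = node.next) {
  console.log(node.word, node.count);
}
```

Every lookup walks the list from the head, so counting n words is quadratic in the worst case. That is fine for a small exercise and keeps the data structure down to a single struct.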

My key takeaway from this experience is that under the hood, lists of data are either 1.) aligned next to each other in memory (as an array) so that you can increment the pointer to get the next item or 2.) can be at any location in memory, but “linked” together by references to each other (linked list or tree structure).

I visualize a train with multiple cabins. To get to a particular cabin, you have to know the location of one of the connected cabins. But if you lose the location to all the cabins, you’re screwed. And if somehow the link between the cabins is lost, you cannot access the disconnected cabins. In a garbage-collected language like Java or JavaScript, the cabins would magically disappear once you lost them. But in C, the cabins will remain existing until explicitly “freed from memory”.

Another thing that I dig about C is how it gives you complete ownership over memory. Most of the functions in the standard library like string.h don’t allocate memory for objects, but instead ask you to pass your own object for mutation. I’m learning how to program in such a way that the number of allocations is reduced and memory is reused, and yet the code is still surprisingly compact.

I’m going deeper down the rabbit hole…

Test Driven Development + CRUD

I think about Test-Driven Development frequently nowadays. I use it for work every day. And I’m trying my darnedest to use it in all of my personal projects too.

It’s easy to end up writing tests that are either over-engineered or tests that don’t cover edge cases enough. For the most part, we just want to be able to sleep at night knowing that our software works how it’s intended to.

At one point, I thought that every callable function should be tested. Nowadays, I actually think that in most cases, only the outer-most interfaces need to be tested, since those are ultimately where the request will be received and response sent. Of course, for complex systems where data is transformed in numerous stages, creating tests for smaller components is very beneficial.

There’s also a good reason to focus more on testing interfaces rather than implementation: you have the freedom to do whatever you want to the guts of your app as long as it provides the right output for a given input. If you choose to refactor, there would be minimal to no modifications required for the tests.

With the nitty gritty out the way, this brings me to the topic of CRUD interfaces. Most software uses CRUD. Whether it’s a social platform, a news website or an ecommerce store, we need to be able to Create, Read, Update and Delete objects.

Over the years, I’ve tried to figure out the simplest way to write tests that handle a CRUD API. At this point, I think I have something that makes sense. Let’s say we have a video sharing platform with user-generated content like YouTube. Here’s how I would write a test for CRUDing videos. For simplicity, I would break it down into a singular test and a plural test:

Singular Test:

  • Get a video by a random ID: should return nothing
  • Create a video
  • Get the video by ID: should match initial values
  • Update the video by ID
  • Get the video by ID: should match updated values
  • Delete the video by ID
  • Get the video by ID: should return nothing

Plural Test:

  • Query list of videos: should return empty array
  • Create a video
  • Query list of videos: should return array containing the 1 video
  • Create a video
  • Query list of videos: should return array containing the 2 videos

And those two tests should handle basic CRUD for a single collection/table. The singular test handles operations performed on a single item. Sometimes an “update” might return the updated version of the object. Either way, we should explicitly get the object afterwards to make sure it’s updated. The plural test handles querying a list. It might seem redundant to query the list for zero, one and two items, but I have a good reason for it:

  1. I want to ensure that the system can handle an empty list
  2. I want to ensure that the system can return one item
  3. I want to ensure that the system can return multiple items

It’s possible for #1 to fail if some code always assumes items will be present. It’s possible for #2 to fail if an object cannot be created or the query simply doesn’t return anything. And it’s possible for #3 to fail if the system was only designed to store a single object rather than an array of objects. I can safely assume that if two objects are queried successfully, then a thousand objects will be queried successfully.
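The two tests can be sketched like this. The VideoStore class is a hypothetical in-memory stand-in for whatever API or database layer is actually being tested, and check is a tiny helper so that the example doesn’t depend on any particular test framework.

```typescript
interface Video {
  id: number;
  title: string;
}

// Hypothetical in-memory stand-in for the real API or database layer.
class VideoStore {
  private videos = new Map<number, Video>();
  private nextId = 1;

  create(title: string): Video {
    const video: Video = { id: this.nextId++, title };
    this.videos.set(video.id, video);
    return { ...video };
  }
  get(id: number): Video | undefined {
    const video = this.videos.get(id);
    return video ? { ...video } : undefined;
  }
  update(id: number, title: string): boolean {
    const video = this.videos.get(id);
    if (!video) return false;
    video.title = title;
    return true;
  }
  remove(id: number): boolean {
    return this.videos.delete(id);
  }
  list(): Video[] {
    return Array.from(this.videos.values()).map((video) => ({ ...video }));
  }
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

function singularTest(store: VideoStore): void {
  check(store.get(12345) === undefined, "random ID should return nothing");
  const created = store.create("First upload");
  check(store.get(created.id)?.title === "First upload", "should match initial values");
  check(store.update(created.id, "Renamed upload"), "update should succeed");
  // Explicitly re-fetch instead of trusting whatever update returned.
  check(store.get(created.id)?.title === "Renamed upload", "should match updated values");
  check(store.remove(created.id), "delete should succeed");
  check(store.get(created.id) === undefined, "deleted video should return nothing");
}

function pluralTest(store: VideoStore): void {
  check(store.list().length === 0, "should return an empty array");
  store.create("One");
  check(store.list().length === 1, "should return the 1 video");
  store.create("Two");
  const titles = store.list().map((video) => video.title);
  check(titles.length === 2, "should return the 2 videos");
  check(titles.includes("One") && titles.includes("Two"), "both videos should be listed");
}

// Each test gets a fresh store so neither depends on the other's leftovers.
singularTest(new VideoStore());
pluralTest(new VideoStore());
console.log("CRUD checks passed");
```

Because both tests only touch create, get, update, remove and list, the Map inside the store could be swapped for a real database without changing a single assertion.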

So I’m fairly happy with this approach to TDD + CRUD. I’m also mulling over more complex cases, like a nested object schema, but I don’t think that case requires any special philosophy… yet.