When we started Wealthfront, we wanted a framework that would allow us to build our dream systems in minutes. Available solutions did too much, and it was hard to see a guiding principle behind them. Furthermore, distributed computing seemed like an afterthought in their design rather than a core principle.
As an engineering organization, we have very strong sentiments about how software should be written. We are convinced that:
- Understandable code must be the norm, not the exception. Code is much more often read and reasoned about than written.
- Natural and type-safe APIs help avoid bugs.
- When a problem occurs, there should be enough information reported by the system to effortlessly track it down.
- Testable code does not happen by accident. A system must be designed with testability in mind from day one.
Our long-term vision for the QueryEngine is to make it a one-size-fits-many platform. It should be used to build services, command-line tools, and web servers. And it needs to lend itself to scaling teams.
Why Java?
Java is widely known in the industry. Most college students learn it. Most engineers know it. It runs on most architectures. If you need a tool or API, there is probably a high-quality open-source project for it. One caveat, though, is that it is sometimes hard to unlearn bad habits and break misconceptions.
The JVM is now the target of many beautiful languages such as Jython, JRuby and Rhino.
Scala is purposely omitted from the previous list as it commands its own paragraph. In summary, Scala is the most beautiful Turing-complete language known to man. It is built on the shoulders of many giants: David Ungar's Self, local type inference pioneered by Benjamin C. Pierce and functional programming à la ML. Scala interoperates perfectly with Java and we've been using both very successfully side by side in production.
Our team has a deep understanding of the Java world and is acutely familiar with the whole stack. Our team members have worked on nearly everything: compilers targeting the JVM, disassemblers, class loaders, bytecode instrumentation tools, static analysis, extreme performance, distributed clusters, machine learning. And after all this time, we're still as much in love with it as in the early days.
Tests for Breakfast, Lunch and Dinner
We pride ourselves on being a test-driven Agile team. We live tests. Over time, we got really good at testing everything (distributed computation, database operations, forward compatibility of services, and more). The full regression suite for our system runs in less than 5 minutes. And for all sub-systems, if the build is green we can push to production.
It is hard to tell whether the list of advantages of testing is countable. Here's a short sample.
- It allows quick iterations. It's green, push to your users and get feedback.
- It makes it easy to scale the team. First day in the code base, change something, get instant and automated feedback. Code, test, push.
- It is more cost-effective than debugging. You might have heard some developers argue that testing takes too much time. Think about the hours it takes to debug a system in production, fix the bug, prepare a new production-ready version, go through QA and release again. What about the impact of the bug on your brand?
- It removes the need for functional QA. Our engineer-to-QA ratio is currently infinite.
- It facilitates continuous refactoring, allowing the code to get better with age. Just like wine, you want your code to age well. Removing legacy decisions early without fear of breaking other parts in the system is key.
- It attracts the right kind of engineers. And if you're wondering, we have openings: jobs@kaching.com.
Syntactic Sugar In Your Java
My first interview question at Google was quite straightforward: "On a scale of 1 to 10, how well do you know Java?" I answered 9. The factual answer should have been closer to 5 or 6. (Uber self-confidence is a common trait in entrepreneurs, but that's a topic for another day.)
The world of Java is vast. The language is quite rich (and burdened by its age). The J2SE API is large. The virtual machine has over 100 opcodes. There are many open-source projects to choose from. And you can drink your Java with other languages too.
Most people I meet have no experience with writing bytecode directly, or with using Java from another language such as Scala. For some reason, at kaChing we've always had half the team composed of "compiler guys" and so we are very touchy about language issues.
The biggest complaint about Java is its verbosity. Whilst this is true, there are two ways to alleviate the problem: Eclipse and other IDEs, but mostly smart APIs. For instance, in Java you tend to repeat yourself:
Map<Foo, Bar> map = new HashMap<Foo, Bar>();
but with a smart API, in this case the Google Collections Library, you can write instead
Map<Foo, Bar> map = newHashMap();
which is much more pleasing.
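The trick behind such a factory is that the compiler infers the type arguments of a generic static method from the assignment target. The following is a minimal hand-rolled sketch of the idea, not the Google Collections source; the class names MapFactories and NewHashMapDemo are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class standing in for the library's static factory.
final class MapFactories {
  private MapFactories() {}

  // K and V are inferred from the type of the variable being assigned.
  static <K, V> HashMap<K, V> newHashMap() {
    return new HashMap<K, V>();
  }
}

class NewHashMapDemo {
  public static void main(String[] args) {
    // No type arguments are repeated on the right-hand side.
    Map<String, Integer> wordCounts = MapFactories.newHashMap();
    wordCounts.put("java", 1);
    System.out.println(wordCounts);
  }
}
```

With a static import of the factory method, the call site reads exactly like the one-liner above.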
Whilst some languages are naturally terse, the ability to express concepts and patterns simply is a function of the programmer's talent, not the tools. We write very compact code, such as the implementation of path parsing for our public API, which is as short as the spec (you can see a snippet here).
Static Typing
We told you about our love for compiler guys? Yeah. It turns out, kaChing's CTO Pascal-Louis Perez is a type fanatic (he worked on Type Inference in Presence of Positive Subtyping with Bounded Quantification at Stanford).
Over time, we've learned to use types to hold our hand in many of our programming tasks. For instance, consider the method whose parameter names have been purposely omitted
void createUser(String, String, String, int, int);
what does it do? What about
void createUser(Name, Address, EmailAddress, int, Id<Portfolio>);
This is clearer, and the compiler will prove that we never pass parameters in the wrong order. We call this the one-to-one (1:1) principle: each type is used at most once per method signature.
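As a rough sketch of what such wrapper types might look like: the type names Name, EmailAddress, Id and Portfolio come from the signature above, but their bodies, the UserDirectory class and the describeUser method are assumptions made for illustration.

```java
// Hypothetical wrapper types. The names mirror the signature above; the bodies are assumptions.
final class Name {
  final String value;
  Name(String value) { this.value = value; }
}

final class EmailAddress {
  final String value;
  EmailAddress(String value) {
    // Validate once at construction so callers never have to re-check.
    if (value.indexOf('@') < 0) {
      throw new IllegalArgumentException("not an email address: " + value);
    }
    this.value = value;
  }
}

// The type parameter is never stored; it only tells the compiler what the id refers to.
final class Id<T> {
  final long value;
  Id(long value) { this.value = value; }
}

final class Portfolio {}

final class UserDirectory {
  static String describeUser(Name name, EmailAddress email, Id<Portfolio> portfolioId) {
    return name.value + " <" + email.value + "> portfolio #" + portfolioId.value;
  }
}

class TinyTypesDemo {
  public static void main(String[] args) {
    Name name = new Name("Ada");
    EmailAddress email = new EmailAddress("ada@example.com");
    Id<Portfolio> portfolioId = new Id<Portfolio>(42L);
    System.out.println(UserDirectory.describeUser(name, email, portfolioId));
    // UserDirectory.describeUser(email, name, portfolioId);  // rejected by the compiler
  }
}
```

Because every parameter has a distinct type, swapping two arguments is a compile error rather than a bug found in production.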
By the nature of our business, we also deal with a lot of sensitive data and security is a top priority. Isn't it better to have an SSN type whose toString has no leaks rather than passing String objects around?
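A minimal sketch of such a type, assuming a masking format that shows only the last four digits; the class name Ssn, the lastFour method and the validation rule are hypothetical.

```java
// Hypothetical SSN wrapper; the masking format and validation are assumptions.
final class Ssn {
  private final String digits;

  Ssn(String raw) {
    String cleaned = raw.replace("-", "");
    if (!cleaned.matches("\\d{9}")) {
      // The message deliberately omits the raw input so it cannot leak into logs.
      throw new IllegalArgumentException("an SSN must have exactly nine digits");
    }
    this.digits = cleaned;
  }

  // Only the last four digits are exposed, and only on explicit request.
  String lastFour() {
    return digits.substring(5);
  }

  // Safe to log or concatenate: never contains the full number.
  @Override
  public String toString() {
    return "***-**-" + lastFour();
  }
}
```

Any accidental string concatenation or log statement involving an Ssn then prints the masked form, which a plain String cannot guarantee.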
Finally, type-driven development essentially looks at the workflow of the application to describe transformations, rather than implement them. The dependency injection framework Guice and its @Provides methods are an excellent example of composability via typing.
Stateless
Be it in Washington or in our software, we just hate state.
Global State: Global state is bad from a theoretical, maintainability, and understandability point of view, but is tolerable at run-time as long as you have one instance of your application. However, each test is a small instantiation of your application, in contrast to the single instance of the application in production. The global state persists from one test to the next and creates mass confusion. Tests pass in isolation but not together. Worse yet, tests fail together but the problems cannot be reproduced in isolation. The order of the tests matters. The APIs are not clear about the order of initialization and object instantiation, and so on. I hope that by now most developers agree that global state should be treated like GOTO.
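The test-ordering problem can be reproduced in a few lines. In this hypothetical sketch, GlobalSequence keeps its counter in a static field while Sequence keeps it in an instance that each check creates for itself; all names are invented for illustration.

```java
// Hypothetical example: a static counter shared by everything in the JVM.
final class GlobalSequence {
  static int next = 0; // global mutable state
  static int nextId() { return ++next; }
}

// The same behavior, with the state owned by an instance the caller creates.
final class Sequence {
  private int next = 0;
  int nextId() { return ++next; }
}

class GlobalStateDemo {
  // Two "tests" that each expect to see the very first id.
  static boolean firstIdIsOneGlobal() { return GlobalSequence.nextId() == 1; }
  static boolean firstIdIsOneLocal() { return new Sequence().nextId() == 1; }

  public static void main(String[] args) {
    System.out.println(firstIdIsOneGlobal()); // holds only if nothing touched the counter before
    System.out.println(firstIdIsOneGlobal()); // the previous call left state behind
    System.out.println(firstIdIsOneLocal());  // independent of whatever ran earlier
    System.out.println(firstIdIsOneLocal());
  }
}
```

The global variant gives a different answer depending on what ran before it, which is exactly the kind of failure that cannot be reproduced in isolation.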
Conceptually, one should strive to have state on the stack and immutable objects on the heap.
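One way to read this guideline, sketched with a hypothetical Money value type: the object on the heap never changes after construction, and the only thing that varies is a local variable inside the method doing the work. The names Money and Totals are assumptions.

```java
import java.util.List;

// Hypothetical immutable value type: the field is final and there are no setters.
final class Money {
  private final long cents;

  Money(long cents) { this.cents = cents; }

  long cents() { return cents; }

  // "Mutation" returns a new object and leaves this one untouched.
  Money plus(Money other) {
    return new Money(cents + other.cents);
  }
}

final class Totals {
  private Totals() {}

  static Money sum(List<Money> amounts) {
    Money total = new Money(0); // the only changing state is this local variable
    for (Money amount : amounts) {
      total = total.plus(amount);
    }
    return total;
  }
}
```

Since no Money instance can change, it can be shared freely between threads and tests without defensive copies.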