OCaml: I Swiped Right

Published January 14, 2022

Finding a programming language is like finding a partner. You survey the field of candidates, filter with some questionable heuristics (geography, age, appearance), assess each candidate's relative merits, then try for a date. A sequence of good dates might lead to a committed relationship. After a decade-long relationship with Go, I wondered if maybe we should see other people.

Questionable Heuristics

The field is so large that you have to exclude most candidates after cursory examination. I aim for heuristics that correlate with what I actually want (and hope that I'm wise enough to know what that is). The map is not the territory, but the map's approximation aids forward progress. I chose the following approximations:

I can build the compiler on OpenBSD

This heuristic hopes to approximate code quality and the language community's attention to detail.

OpenBSD is my main operating system. At a practical level, I need a language that works in that environment. The skillful porters of OpenBSD can build almost anything, so I could just pkg_add a language binary and be done. But if a crippled C hacker like myself can build the language from source, the language's community must have kept things exceptionally clean. Plus, if anything goes wrong, I need to be able to build and test fixes myself. The porters and language designers have more important work than holding my hand.

OpenBSD is intentionally hostile[1] to software. It doesn't work around bugs, and makes an active effort to crash programs that misbehave. For me, this isn't about security; it's about correctness. I want my programs to behave. If a language implementation can survive on OpenBSD, I can adjust my priors in favor of the implementation being sturdy.

Lastly, OpenBSD is a niche. It has relatively few users and acts differently.
If the language community pays sufficient attention to this niche, my priors move in favor of attention to other corner cases. The software that I respect most (SQLite, OpenBSD, Go) has a habit of running on strange hardware, weird operating systems, and unusual platforms. I don't think that's statistical noise.

Memory safe

I suck at memory management. Memory-related flaws are frequent enough in CVEs[2] to suggest that I'm not alone. I've never written a program that needed such precise control of memory that I couldn't outsource that work to a compiler or runtime.

This heuristic doesn't require a runtime garbage collector, but the language ecosystem needs to prevent me from freeing freed memory, writing past the end of a buffer, or dereferencing an invalid pointer. Maybe it's a compile-time GC, or a static analysis tool, or the spoils of a fiddle competition with the devil. I don't really care.

Less than median complexity

A wise man spends less than he earns, and System 2[3] is a limited resource. If I spend mental energy on syntactic nuances, remembering abstractions, or understanding tall stacks, I have no juice left for solving worthwhile problems.

It's probably a fool's errand to objectively calculate language complexity[4], but I want something closer to a Lua, Go, or Scheme; maybe a Standard ML or Haskell, but nothing like a Java, Ada, or C++. Those are more than my modest brain can handle. This implies no JVM and no CLR. They're impressive platforms that do too much.

Don't get me wrong. If I could eat steak and ice cream every day I would. I'm sure a Four Seasons resort in Bali makes a lovely home. I enjoy syntactic sugar as much as the real stuff, but they're all outside my budget. I loved Ada once, our fling was exciting, and her friends are good people, but she's too much for a simple man.

Basic static analysis

Normals are surprised when I describe how much of my career I've devoted to fixing typos that made it to production, or writing repetitive test cases to avoid those typos. Like memory management, this is a job for a machine. Make sure the i's and t's have appropriate dots and crosses. If you won't make it to dinner, please let me know before I drive to the restaurant and order appetizers.

And, when I say _basic_, that's an upper bound, not a lower one. I appreciate a warning if you'll miss a date, but you don't have to ask permission to use the restroom.

Fast start up time

I write a lot of command line tools. Waiting 20 ms for hello world is fine, 200 ms is not. Start up time should also scale well. If hello world is snappy but a linter or formatter is slow, it won't work between us.

It's Me Not You

With those heuristics in hand, I surveyed the field. The following languages looked nice from a distance, but they didn't make the cut. These are the first turn offs I encountered, but many would be excluded on other grounds as well.

• Ada, Common Lisp and Rust are too complex for me
• C and D aren't memory safe
• Crystal can't bootstrap on my laptop (it has to cross-compile)
• Clojure needs the JVM.
C# and F# need the CLR
• Native compilers for Dart, Kotlin, Scala, and Swift don't support OpenBSD
• Erlang and Elixir need BEAM, which starts too slowly
• Factor, Forth, Gravity, Lua, Perl, PHP, Prolog, Ruby, and Tcl have insufficient static analysis
• Bootstrapping GHC exhausts RAM on my laptop, and bootstrapping Free Pascal was beyond my skills
• Building Haxe (because of Neko), Node.js (for TypeScript), and Zig all proved more than my skills could muster
• Modern Racket (using Chez Scheme) won't compile under OpenBSD's W^X restrictions
• Standard ML has several good implementations. I could only build Poly/ML but had trouble using it for non-trivial programs.

Final Four

That left four candidates: Go, Nim, OCaml, and Python. To evaluate them somewhat objectively, I made a wishlist of about 30 entries. I distributed 100 points among those wishes based on how important they were to me. For each wish and each language, I gave a percentage for how well the language fulfilled the wish. The best languages for me, in order, ended up being: OCaml, Go, Python, then Nim. The spreadsheet is available[5], but here are some highlights:

Interface types

This is structural typing. I want to define a set of values based on the operations they allow. This paradigm follows my intuition from the real world: if the prong fits in the slot, my device will charge. It also lets me ignore irrelevant details when writing a function. I don't care what a value is named, or where it came from, or how it works inside; as long as it does what I need, the function can proceed. Structural typing, when embraced throughout a language ecosystem, provides excellent orthogonality between components.

Go does beautifully here. Its io.Reader and io.Writer interfaces are the canonical example.
fmt.Fprintf doesn't care if I'm writing to a file, a terminal, a network socket, or an encoded-and-compressed buffer in memory. If the thing does Write correctly, fmt.Fprintf can use it.

Modern languages usually have good support for interfaces, and all candidates scored well. Python scored the best because of mypy's Protocol feature. This makes sense: Pythonistas have been duck-typing since I was writing BASIC.

OCaml has two flavors of structural typing: objects (the interface is defined by method signatures), and modules (the interface is defined by module signatures). Structural typing for objects is especially cool since it infers an interface based on usage. That is, you don't have to define an interface explicitly or choose a name for it. Of course, you probably want to be explicit eventually, but it's a handy feature when prototyping.

I don't know why the OCaml ecosystem uses structural typing of objects so rarely. Module types seem more common, but even those don't seem widely used. It seems like the OCaml community is leaving money on the table here, so I'm probably missing something.

As best I can tell, structural typing in Nim is called _concepts_. The feature is experimental and seems complicated but usable.

Sum types

I dreamed of a language with disjoint unions (sum types, variant records, tagged unions, discriminated unions, whatever they're called today). For full points the language needs lightweight syntax and the static analysis tool needs exhaustive pattern matching.

Like interfaces, sum types feel natural. My mind is inclined to split a group into subgroups and then think about each subgroup independently. Lightweight syntax is important because I won't wade through heavy syntax. This is my own flawed psychology, eating candy when I should eat vegetables, but I hope for a partner who can balance my worst inclinations.

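To make that wish concrete, here is a minimal sketch in OCaml of a sum type with exhaustive pattern matching. The shape type and the area and describe functions are hypothetical names invented for this illustration; they are not taken from any library or from the spreadsheet.

```ocaml
(* A hypothetical sum type: a shape is exactly one of three cases. *)
type shape =
  | Circle of float             (* radius *)
  | Rectangle of float * float  (* width, height *)
  | Square of float             (* side *)

(* match ... with is checked against every constructor of shape.
   If a case is missing, the compiler warns that the match is
   not exhaustive. *)
let area s =
  match s with
  | Circle r -> 3.14159 *. r *. r
  | Rectangle (w, h) -> w *. h
  | Square side -> side *. side

(* The argument type is inferred from the patterns alone. *)
let describe s =
  match s with
  | Circle _ -> "circle"
  | Rectangle _ -> "rectangle"
  | Square _ -> "square"

let () =
  List.iter
    (fun s -> Printf.printf "%s: %.2f\n" (describe s) (area s))
    [ Circle 1.0; Rectangle (2.0, 3.0); Square 2.0 ]
```

Deleting the Square case from area should make the compiler emit its non-exhaustive match warning and name the missing constructor, which is the kind of static check this wish asks for.
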
As I've programmed in Go for the last ten years, I've wanted this feature more than any other; yes, even more than generics. On three occasions I created a small language that compiles sum types into Go interfaces. I gave up each time because the Go compiler doesn't have exhaustiveness checks, but sum types remained on my wishlist.

OCaml rocks this one. It has lovely sum types on a lithe syntactic frame. match … with does exhaustive pattern matching, infers types based on the patterns, and is the stuff of my dreams.

In Python, classes with a Literal tag attribute, checked by mypy, can fake sum types with sufficient effort. The syntax weighs less than it does in Go, but not by much.

Nim is similar to Go: you can fake sum types with sufficiently complex class hierarchies.

Good hygiene

I mean this in the sense of _cleanliness_ or _practices that preserve good health_, not hygienic macros. I want a language whose community emphasizes clean, high-quality code. It values correctness and security fixes over new features. The community would rather write their own small-but-correct library than use a third-party library with bugs and poor maintenance. This includes an ethos of avoiding dependencies where reasonable (not like JS or Perl where everything is solved by installing a shady, third-party package). Most software sucks, but I prefer software that sucks less[6].

For many years, I under-valued the importance of a programming language community. Like a natural language, much of the value in a programming language is the community it grants access to.

Go is the best example I know. They strive for clean language semantics, few pieces, rapid bug fixes, good security. They're glad to wait a dozen years before adding generics if it means they can do it right.
The core Go team takes bug reports seriously, adds regression tests religiously, and follows consistent policies toward releases and backwards compatibility. I've learned a lot from the Go team in this area.

The core Go team is also not afraid of writing code when needed: Go has its own code generator, its own TLS stack, its own regular expression engine. This isn't about _Not Invented Here_. These libraries are often better than the ones used elsewhere and they always fit better in the Go ecosystem. Incidentally, I also think the world would benefit from greater software diversity, but that's for another article.

OCaml seems solid too. The community is pretty focused on correctness and quality; that's what drew many of them to OCaml in the first place. Public discussions emphasize correctness, and language design decisions often cover the theoretical foundations of language semantics. Xavier Leroy even found an Intel CPU bug[7] through persistently debugging the OCaml compiler. The community leans toward dependency bloat a little, but it's not awful.

I had a hard time evaluating Python on this metric. The community is so large that I found it difficult to pin down. The main implementation seems solid, the community has some dependency bloat, and the language is adding new features rapidly. I don't know what to make of it.

Nim falls on the exploration side of the exploration-exploitation spectrum[8]. The language designers add new features rapidly, but those features often don't work consistently. A quick perusal turns up regular users who encounter frequent problems. Don't get me wrong, exploration is great. The programming world needs more explorers and I think languages have a lot of room for improvement. I'm inclined to exploration myself. However, I want my daily-driver to be on the _exploitation_ side: do what works and do it really well.

Large community

I only mention this because I ignored it for most of my career: I want a language with a large and active community.

Some developers are more productive and creative than others. As far as I can tell, those developers are evenly distributed among all the language communities. So the probability of finding one in the Python community is far higher than the probability of finding one in the OCaml community, just based on size. I wish this wasn't so and the best language could win on its own merits. But it's the same reason China and the US win so many Olympic medals compared to Honduras and Angola.

In the programming world, the classic way around this is to call C. C has a huge community and a small language can fall back to C when its own community is too small. JVM languages use the same trick.

Because I'm interested in _active_ developers, not users, I based this metric on the number of unique committers to the language's main repository. That puts Go and Python in a tie, followed by Nim, then OCaml. I was surprised that Go was even remotely close to Python on this metric. Are Gophers really that much more likely than Pythonistas to contribute to the main repository? OCaml got dinged here because its standard library is so small: most libraries live in separate repositories so they don't count in this metric. Why is the truth so hard to come by?

Conclusion

Anyway, that's enough words. You can look at the spreadsheet[5] for details. The end result was OCaml with 90 points, Go with 78, Python with 74, and Nim with 69. Based on the discussion above, you may not see how OCaml won. Never underestimate the long tail.

I think OCaml has remained in relative obscurity partly because it's functional and partly because it doesn't have one big, flashy marketing point. C gave you access to Unix. Perl had the best of sh and awk. Python had clean syntax. Ruby had Rails. Go did concurrency.
OCaml does many things well, but it's not the best at any one thing. That's great for getting things done, but suboptimal for capturing market share.

I fell prey to the same thinking. I liked OCaml when we first met 15 years ago, I'm not scared of lambdas, and I even enjoy OCaml's family (hi, JoCaml[9]), but I still didn't expect OCaml to look so good. I figured Go would triumph and I'd remain in my current, happy relationship. Instead, I swiped right on OCaml. I'm looking forward to our first date.

1: https://www.youtube.com/watch?v=_q8zzWHj15Q
2: https://cve.mitre.org/
3: https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow#Two_systems
4: https://mndrix.blogspot.com/2017/03/programming-languages-by-spec-size.html
5: https://docs.google.com/spreadsheets/d/1nEoQeRj4OegRgqCeMmy0JQt9xrEIT2issVGQj7Z5Uho/edit?usp=sharing
6: https://suckless.org/philosophy/
7: http://gallium.inria.fr/blog/intel-skylake-bug/
8: http://strategy.sjsu.edu/www.stable/pdf/March,%20J.%20G.%20%281991%29.%20Organization%20Science%202%281%29%2071-87.pdf
9: http://jocaml.inria.fr/