HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Co-dfns Live Stream: Compiler Design and Architecture

Aaron Hsu · YouTube · 345 HN points · 8 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Aaron Hsu's video "Co-dfns Live Stream: Compiler Design and Architecture".
YouTube Summary
Welcome to the live stream discussing all things Co-dfns. I'll be talking about the architecture, design history, and overall aesthetic of the compiler today.

Hacker News Q/A thread: https://news.ycombinator.com/item?id=13638086

The second Hacker News AMA: https://news.ycombinator.com/item?id=13797797

Hacker News Originating Thread: https://news.ycombinator.com/item?id=13565743

GitHub repository for the compiler: https://github.com/arcfide/Co-dfns

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
One of the most memorable and interesting things I've ever seen on HN was an AMA from the creator of the Co-dfns parallel APL compiler. Here is Co-dfns [0] and the AMA by the creator [1]. The video may give you some insight, and at the very least it's extremely interesting to hear the APL code explained (jump to around 13:30 for the beginning of the explanation of the structure and what the compiler is doing). He explains it very well; it's quite easy to follow along.

[0] https://github.com/Co-dfns/Co-dfns

[1] https://www.youtube.com/watch?v=gcUWTa16Jc0

Nov 15, 2019 · 9214 on The 80/24 Rule
Old Forths with block-based I/O used 64/16 (so that a definition takes ≤ 1024 bytes and fits into a disk block).

Makes no sense in the modern world, of course, but in the context of block editors it creates a development environment where making complex and bloated word definitions is actively discouraged -- an idea that Aaron Hsu briefly mentions in his APL talks [1].

[1]: https://www.youtube.com/watch?v=gcUWTa16Jc0

zentiggr
Of course, in the Forth environment, the more any word definition bloats, the harder the stack tracing you have to do in order to understand the operation flow, so keeping things neat and to the point is even more important.

It might be a failing of my own, but I've always wanted a concatenative language IDE to have a stack window, with manual labeling, so I could see what all is on the stack and follow the information flow better.

I remember being floored by APL a few months ago when I learned about Co-dfns and watched the stream by Aaron Hsu [1], with quotes like "This code is perfectly readable, it just isn't in English".

Thing is, when I was thinking about trying it, I never figured out a good toy project. With most other languages I wanted to try, I knew I'd at least be able to cobble together a web service. Haskell, Erlang, even Prolog.

I understand Co-dfns to be self-hosting, so maybe a toy programming language?

[1] https://www.youtube.com/watch?v=gcUWTa16Jc0

nickpeterson
Incidentally, I wouldn't be shocked if you reached out to Aaron Hsu and he actually helped you. I sent him a related email and found him to be extremely cordial and informative.
icen
Do some deep learning. It's very straightforward, and you only need some trivial IO.

Writing a programming language in APL is not for the faint of heart - Co-dfns exists, alongside its performance goals, to show that compilers aren't necessarily a bad fit for APL.

What use is Chinese for someone who doesn't read Chinese? The S combinator written like that is perfectly legible and fine for someone who is already familiar with the S combinator. I don't see the problem there, and I don't even grok the S combinator myself. I do appreciate writing code under the assumption that the people working on it will have some level of prerequisite knowledge.
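
For reference, one way the S combinator (S f g x = f x (g x), i.e. x f (g x) in infix) renders in APL - a sketch in Dyalog syntax, my example rather than anything from the thread: the derived function f∘g⍨, applied monadically, computes x f (g x), so a palindrome test falls out directly:

      ≡∘⌽⍨ 'racecar'   ⍝ evaluates as 'racecar' ≡ ⌽'racecar'
1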

I'm sure you'd get a kick out of Co-dfns [0]. But once you learn to "speak" APL programming it makes a lot more sense. I'd highly recommend, if you have the time, to listen to the author explaining the code for the compiler [1].

[0] https://github.com/Co-dfns/Co-dfns/blob/master/codfns.pdf

[1] https://www.youtube.com/watch?v=gcUWTa16Jc0 / HN discussion: https://news.ycombinator.com/item?id=13638086

Mar 05, 2017 · 335 points, 155 comments · submitted by arcfide
losvedir
Heh, as I tweeted a couple years ago[0]:

> Ah, APL/J/K. Time for my annual crisis of thinking everything I've ever learned about programming is wrong...

Seriously, as I'm watching this video it's equal parts fascinating, confusing, and frightening.

The world of these array languages is so foreign to my day-to-day web development work that I'm not sure what to make of it. I'm impressed that you can accomplish so much in so little space, but I have a few questions as it relates to how I work:

1. How understandable is the code if you come back to it months later? Is it "write once" or are you able to pick it up again easily?

2. Have you worked on this or other code with collaborators? How well can someone else pick up the code and start contributing?

3. It seems APL is well-suited for GPU programming like you're doing here, but do you think it's more broadly applicable? Do you think all software should be written with it, including say web development and the like?

[0] https://twitter.com/losvedir/status/636034419359289344

arcfide
Thanks for the questions!

1. I've addressed this a bit more in the previous threads on this video and the discussion that led up to the video. It's actually less write once than other stuff. However, to go with that, it's also more disposable. That means that basically, I have a low tolerance for code not continuing to make sense to me. So if I come back to a piece of code and it doesn't make sense to me, I just throw it out, because it was clearly wrong if I can't make sense of it.

I find it's quite natural to read through the code at this point. Even if I haven't thought about a particular compiler pass for a while, I just start at the beginning, and in a few seconds it's easy to see the structure and overall approach, then I go deeper and see what's actually being done. It's exceptionally easy to work with. Part of this is also because I don't have to jump all over the place to know what the code is doing. If I forgot what a helper does, I just glance a half page up and read the definition of the helper, and I know what's going on then. I don't have to keep a documentation page open to remind myself of the APIs, or the intentions or the like, because all call sites are visible, and the definition of the code is all there for me to see. A big part of this is relying on idioms rather than black box abstractions, patterns instead of information hiding.

2. Yes, I work with a number of other people who have varying levels of responsibility for the code. A good deal of them only care about the high-level concerns, and so they either use the compiler and don't read the code, or they make little tweaks here and there to suit their needs. They don't require a full understanding. A few others do work through the entire code base, and we use the codebase as our single source of information when discussing design decisions and everything else. We don't have any need for supporting docs, because the code is so easy to work with that separate design docs add nothing when discussing architectural elements.

As for picking it up and contributing...that depends. There's a lot of background information you need to know. That's just the reality. The code itself isn't hard, but it is relying on a host of shared understanding. This includes mainly the Nanopass style of compiler design and construction, as well as a firm understanding of array programming, together with a firm understanding of the fundamental idiomatic algorithms being used in the compiler, which are the main "new thing" for people to understand after they understand how to build a compiler and how to write array programs.

Unfortunately, most schools teach neither Nanopass-style compiler design nor array programming. So, the completely uninitiated have a few things to learn. The code itself, though, is no worse than expecting someone to understand C semantics and OOP tree-oriented programming patterns in a monolithic compiler when working with something like Clang, or tree traversal written in C when working in a simple compiler like PCC.

3. Actually, looks can be deceiving. I'm doing GPU programming with APL, but over a domain that is considered far outside of the standard APL comfort zone. In fact, a key research contribution of this compiler is that I have developed algorithms and a holistic set of techniques for doing tree/compiler transformations on a GPU, which was previously a very open problem. This compiler/domain would have been considered one of the quintessential examples of a domain that is not well suited to APL. In fact, that's not true; it's just not suited to APL when approached with the traditional way of thinking about this domain.

As for things like web development and the like, you can see MiServer, which is Dyalog's Open Source Web framework. There are some Dyalog User Meeting discussions about the framework, and tutorials, documentation, and the like to get you through it.

APL isn't a perfect language, but there are very few languages pushing the boundaries of human-computer interaction with language in the way that APL does, and I think this is sad.

I think in some ways we might be forced to write programs more like APL in the future, because parallelism is simply that important to the future of computing. I'm also doing research into the early stages of learning computer science, and some initial research suggests that it may in fact be better to teach students parallel programming in an APL style first, and later transition to the more traditional stuff, rather than doing it the opposite way, which so far has just convinced people that they shouldn't write parallel programs at all. In contrast, APLers prefer parallel programming and regularly transform code from sequential to parallel and back again. They tend to be comfortable with both modes of thinking, which cannot be said for most other languages in common use. Agent/concurrency-heavy languages might be an exception to this general rule, but they also get a bad rap.

Finally, I'd like to make a small note about bringing collaborators onboard. Many collaborators I've had were either inexperienced and learning for one reason or another, or experienced with one specific thing they wanted to fix. What often happens is that the most inexperienced spend their time, as they should, learning important basics, and don't ever work on the compiler proper. I think that's the way it should be. With the more experienced people, I often find that I fix the problem they wanted fixed as I'm explaining the compiler to them. Because of the way the compiler is designed, I usually find myself saying, "So, to do this, you would have to make this change here [edit] and this here [edit] and ...oh wait, that's it, okay, it's done for you. Enjoy the patch!"

joe_the_user
Thank you very much for answering questions here.

I have a question about your "key operator" - it sounds similar to an idea I've played with: a way to run heterogeneous operations on a GPU by running a very small bytecode interpreter in each kernel (something similar to what's outlined here [1]).

Does that seem at all related to you?

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.321...

arcfide
That's an interesting idea, and I can see there being some overlap, but the Key operator is actually not my invention; it is a member of a class of operators sometimes called classifiers. You can think of the GROUP BY operator in SQL, or Henglein's work on discriminators. Key is relatively new in Dyalog and came from J and some other APLs, I think; I'd have to go back to my paper to remember.

http://dl.acm.org/citation.cfm?id=2935331

klettow
http://code.jsoftware.com/wiki/Essays/Key#History
RodgerTheGreat
If I follow correctly, the k6 equivalent would be:

    {x'y@=z}
"X of each of Y indexed by the group of Z." As in:

    key[+/;1 2 5;`x`z`z]
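For comparison, the same grouping via Dyalog's Key operator (⌸) - a sketch, assuming current Dyalog semantics, where ⍺ is the key and ⍵ the group of corresponding items:

      'xzz' {⍺ (+/⍵)}⌸ 1 2 5
x 1
z 7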
joe_the_user
I'm scanning the documents.

Basically, group nodes and map/operate on them.

It seems like that would be fairly simple if you are dynamically creating kernels.

-- Edit: Which implies you're doing a pretty "bare metal" interface to the GPU which still lets you do that, and there's an interesting story there, if I'm not totally off base.

arcfide
I try to avoid going bare metal as much as I can because of the difficulty involved with working with GPUs at that layer. I try to focus on higher level concerns, and use the excellent low level work of people such as the ArrayFire developers to handle the low level stuff. I still have to understand it and work with it at some level sometimes, but for the most part I stick to my core abstractions.
joe_the_user
So ArrayFire allows the creation of kernels/the calling of functions in parallel/ on the fly?
arcfide
In some limited, restricted cases, yes.
jstimpfle
The conciseness is fascinating. One of the common criticisms of combinator-style programming is the difficulty of producing good error messages, or generally of making small fixes in the pipeline that can't be expressed using the abstraction (i.e., the current compositional context). A concrete example: in Haskell, when you develop pure code, you can't easily debug with a few print statements (more so since it's a lazy language).

I think this criticism definitely applies in the real world of ever-changing programs without a very clear scope. I assume your focus is more on the academic / "poetic" side of things, but nevertheless: How did such concerns influence the development of your compiler? How usable is the compiler from a dfns programmer's perspective?

And since APL is a very special beast: How transferable is your style to writing a compiler for a more conventional language (for instance, a procedural one), especially one with irregularities and such?

(I apologize if these questions are covered in the video. I haven't finished it yet.)

Y_Y
I assume you've come across Debug.Trace. That lets you do "printf-style" debugging in Haskell without much fuss.

    trace ("calling f with x = " ++ show x) (f x)
jstimpfle
Yeah, that's possible, but there's still laziness. Anyway, I didn't want to shift attention much; I just used it as an example to clarify what I meant.
arcfide
I cover how I debug my program in the video. I use two utilities, pp and px, which are identity functions that print their results as side effects. I'm happy that I'm working in a non-lazy language.
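
A minimal sketch of what such a utility can look like as a dfn (my guess at the shape; the actual definitions in the compiler may differ):

      pp←{⎕←⍵ ⋄ ⍵}   ⍝ print the argument as a side effect, then return it unchanged

Because it is an identity function, pp can be spliced into a points-free pipeline without changing the result.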

This compiler is poetically driven, but it is still a commercially funded and commercially marketed product. It's just that in the APL community, you can pursue both and have them be compatible with one another.

In particular, the desire to make the code as "disposable" as possible has led me to use a strictly combinatory/points-free style in the core compiler. It makes it very easy to change things, because it's easy to see the whole picture and easier for me to delete any given piece of code. Many "fixes" in my code are single-character APL fixes, or maybe two-character APL fixes.

One reason for not being able to stick to the same core abstraction throughout your whole code is that you may have an inadequate core abstraction. Part of the design goal in the compiler, which I am trying to point out in the video, is that the design is meant to keep you in the same "context" or abstraction level throughout, so making a change will always fit the current compositional context. I do this by focusing on simplification as much as possible. I also keep the code as stupid as I can get away with.

This whole discussion got started because of the amount of change that has happened with this code base (hundreds of thousands of lines). You can read more about that in the other historical threads referenced by dang. I've rewritten the compiler at least five times.

This need to make lots of changes all the time and adapt means that I actually push more towards the points-free style, not away from it. By sort of diving head first into it, I can gain the benefits but lose the disadvantages that appear in the "middle ground", where you are jumping back and forth between points-free and non-points-free.

The complexity of this compiler is about the same as what you would normally encounter if you were going to write a core compiler for Scheme (not the syntax expander). It's perfectly suitable for procedural programming languages, or any other, but obviously you'd need to tweak things. One of the general strategies in the compiler is to eliminate as much irregularity as early as you can get away with. A lot of it depends on how neatly your underlying IR/ASM language that you're targeting matches the semantic needs of the language you're compiling. For example, I don't demonstrate how to do tail call optimization here, but if you are familiar with a Kent Dybvig style tail call optimization pass such as found in Nanopass compiler examples, you could adapt those ideas to this data-parallel style.

Basically, if you already know how to write a Nanopass compiler, the transition is easier. If you are used to writing monolithic compilers such as the clang front end or PCC, it will be harder, since you have to absorb the Nanopass style of thinking.

As for how usable the compiler is, that depends on what you mean. If you mean, is the compiler usable right now as a product, then it covers most of the bases for functional array programming, but if you see the LIMITATIONS file, you'll note a few things that it's missing. Ironically, things like if statements and recursion are important, but much less so to a dfns programmer than having the right primitives already available. As a dfns programmer myself, I almost never use recursion, for instance, though it's good to have it when you want it.

The next few releases of the compiler are focusing on eliminating some of these last issues of usability, but for the most part the compiler is already completely functional.

As for whether a dfns programmer could readily get working on the code base, the answer is yes and no. They would have no problem reading the code, as it's dead simple to a dfns programmer. The overall structure would also be easy to see. For them, the trick would be understanding the "big picture" of how a compiler works. Most dfns programmers are not also compiler writers. Understanding the Nanopass strategy of compilation, and then understanding why a given pass is there in the compiler, and mapping a given set of array operations (which they would easily understand) to the domain space of "compiler operations" would be the struggle. Sort of the reverse. The compiler writer would struggle to understand how compiler operations map to array operations, and the array programmer would struggle to perceive the larger meaning of the array operations in terms of traditional compiler concepts. Printing out the AST throughout the whole compiler (easy to do by editing tt) goes a long way towards aiding that intuition.

Once the core idioms used in the compiler are understood, the rest is pretty straightforward.

tigershark
I wish that your posts could be as succinct as your code and your code as expressive as your posts... that would open a universe for me, and probably for a lot of people.
arcfide
I'm not sure what that would look like.
arcfide
Personally, I wouldn't want to edit my posts on a regular basis. I consider them too verbose, but I struggle to condense them.
arcfide
Thanks everyone for the great discussions and interest. I'm going to get back to this thread later, but won't be actively commenting for the rest of the day. Please feel free to continue discussion and continue asking questions and I'll get to them when/if I can.

Also, feel free to contact me directly to discuss anything at a higher bandwidth level. And as always, see the GitHub repository for more information on the project and ways that you can contribute (help out, fund the project, contribute code, bug reports, or patches).

https://github.com/arcfide/Co-dfns

If I can "shill" a little bit, you can fund the Open Source/Open Work aspects of this project through micro-funding at Gratipay, which helps to support and keep the Open Source and community aspects of the project going. Dyalog is a great company, but ultimately, as some have mentioned, it's the community that helps to keep these ideas moving forward and improving.

javajosh
Does working on your project, or running it, require a Dyalog license?
arcfide
You'll need to get at least a free copy of Dyalog to use the compiler. They have a number of different free options. If you intend to use the compiler, it is available for use by all under an AGPL license. However, the generated code also uses Co-dfns code and is likewise under an AGPL license. If that license is unsuitable for you, then you'll need to arrange for an alternative license through Dyalog. Normally this will only be required for people interested in using Co-dfns commercially in their closed source projects.

The Co-dfns work is run like an Open Company however. You are welcome to submit contributions (see the README for more information) as pull requests and can onboard yourself at your leisure. I'm working to encourage more people to support the funding of Open Work on Co-dfns, but that's a long, slow process.

So, yes, generally speaking the compiler is designed to be used in conjunction with the host of additional tool support you get from Dyalog APL (the interpreter). However, Dyalog makes it as easy as possible to get a copy of the interpreter.

javajosh
If I understand correctly, you can use the free, non-commercial-use Dyalog to run the AGPL Co-dfns compiler to produce programs. And also, the Co-dfns output is not independent of the compiler itself, meaning that the output must also be AGPL.

I'm speculating, but I suspect that anyone using Co-dfns output will be running it on a private bank of GPUs over a large private data-set, so I suppose the AGPL won't ever matter for them, in practice.

mkesper
The AGPL is not a "non-commercial" license. Freedom zero guarantees you can use it for any purpose. AGPL only becomes relevant when distributing.
arcfide
If that works for them, more power to them! They'll still likely need a commercial Co-dfns license if they are doing that, not because of the produced code, but because of the programming environment in which they do the development, unless they want to slice Co-dfns out of its "host" environment.

You are right, however, that you can use the free, non-commercial Dyalog to run the AGPL Co-dfns. Co-dfns is both a commercial and a research enterprise, and so it is important to me that the ideas in the compiler are manifest openly to the public, hence the open-source nature of the project. However, it also needs to fund that research somehow (have you ever tried to get a grant to write APL? Hahahaha!), and so commercial uses need to be managed in a reasonable way, namely, something mutually equitable.

Some interested parties are working to use Co-dfns in public facing services, and not just private data sets. One group is very slow moving, but we're exploring the possibility of a rewrite of the cryptographic stack.

jarpineh
I had some exposure to an APL program for statistics a few years back, but never really dived in. The largest problem was that the tools I encountered then didn't lend themselves well to intermittent exploration (indeed, Dyalog APL had a Linux version that required Wine, which was very buggy). Could somebody give pointers on how to start with APL in 2017? I mean things pertaining to developer experience: which APL implementation is easiest to start with on macOS/Linux, how to find libraries and frameworks, good example apps on GitHub, etc.

Though I don't want to hijack this thread from questions specifically about the compiler + runtime. In the earlier thread (https://news.ycombinator.com/item?id=13565743) you mentioned two things I'd like to know more about:

- function trains: any examples? Could one do those in a Lisp such as Scheme?

- what is modern dfns-based APL? (something about a better APL for functional programming)

Thank you for your comments so far. I tried to watch the live stream, but there were some audio skips now and then. I'll have to try again on the next long train ride.

minxomat
For what it's worth, you can try ngn and Dyalog online, since it's licensed to TIO here: https://tio.run/nexus

Other than that there are great book recommendations in the last APL data types thread on HN.

jarpineh
With the Dyalog APL book running to 800 pages, I hope more won't be required to get a feel for the language ;)

I tried meaning of life from FinnAplIdioms on TIO. Sadly it didn't produce any output...

arcfide
Remember that the FinnAPL library expects that you know your APL environment and are able to adapt the idioms to suit.

You should also consider "Notation as a Tool of Thought" by Kenneth Iverson (his Turing Award lecture) to be required reading.

jarpineh
I think it's this: http://www.jsoftware.com/papers/tot.htm

I thought I had read it at some point, but now it doesn't ring a bell. Well, it's on the list now. It starts with a Babbage quote and uses the phrase "tool of thought", which coincides with the early computer-history books I've been reading.

arcfide
Yes, that would be the one.
Avshalom
If it's a comfort, at least 200 of those pages are going over libraries that are shipped with Dyalog but have nothing to do with the actual language.
arcfide
I just looked at the meaning of life. That's unlikely to work with many "hosted" APLs because of the use of eval, which is a security concern in an online REPL.

However, if you remove the execute you can still get a string version:

      ⊖⍕⊃⊂|⌊-*+○⌈×÷!⌽⍉⌹~⍴⍋⍒,⍟?⍳0
42

You can paste this into the REPL at TryAPL.org.

throwaway7645
+5 bad@$$ points sir.
jarpineh
Well, it seemed interesting and I suspected it would result in 42, however else it would do its work :) Perhaps that's one way to frame the Question.
arcfide
What's really interesting about the above is that it is "seeded" with a random number. :-)
klettow
Background on trains: http://code.jsoftware.com/wiki/Essays/Trains
arcfide
In 2017 I can mostly only recommend Dyalog APL. Keep in mind that I am highly biased, because my compiler is written in Dyalog APL and they sell my compiler commercially. However, IMO, Dyalog APL is the place to start with APL in 2017.

They now have Mac/Linux versions that include the "RIDE" (remote IDE) which allows you to work with the code similarly to the IDE/Interface you see me using in this video. I don't use any of the IDE features, such as the code editor, debugger, explorer, or the like, but for a new user it will be quite useful, and it will allow you to "point and click" your way through, including the entering of the APL symbols.

Dyalog APL for Linux also has a slick, but old school console IDE, which includes a built-in full-screen text editor, debugger, and the like, which is built around console interactions. If you're a hard core CLI guy, then this may appeal to you.

There is an excellent book called "Mastering Dyalog APL" which covers all the major features, built-in frameworks, and the like, for working with Dyalog APL. Dyalog comes with built-in interfaces for things like R, .NET/Mono, ODBC/relational databases, sockets and networking, cryptography, &c. It also ships with things like vecdb, a sophisticated web framework, and a GUI framework (Windows only).

Additionally APLers should all have a copy of the FinnAPL Idiom Library at the ready. It serves as the "core library" for a lot of common tasks which takes the place of "example app" and "library calls" in many cases. Additionally, Dyalog has a large suite of dfns to use that they document at dfns.dyalog.com. Dyalog also comes with document processing and graphing libraries that can be used interactively or programmatically to produce all sorts of documents and graphs.

My general recommendation would be to stick to using dfns as much as you can, programming without trying to use external libraries. It's a common "mistake" for people learning APL to try to "find a library function to do X" when it's often a "half a line" APL program that would be considered idiomatic in APL, and thus not something APLers would have a library function for. It's better to master the core language, because you'll find your need for libraries diminishes quite a bit once you become comfortable with the language. Not that there aren't libraries, but if you really need a library function, you'll know it, and then you'll know if you can find it.
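
To illustrate with one classic instance (my example, not arcfide's): where other languages reach for a sort() library call, the idiomatic APL is a half line built from grade-up and indexing:

      {⍵[⍋⍵]} 3 1 4 1 5   ⍝ ⍋⍵ gives the ascending ordering; ⍵[...] applies it
1 1 3 4 5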

The best example apps for writing dfns would be the dfns library by Dyalog at dfns.dyalog.com which comes shipped as a dfns workspace for use by Dyalog APL programmers. These are well commented, and cover a vast array (snicker) of topics.

Function trains are one of the neatest features of the Dyalog APL language, and represent an extension of traditional APL2 into the "Rationalized APL" space from which comes the likes of J. For an example, see the above video: the 90-line compiler component that I discuss there is written almost exclusively as a function train. I believe I dedicate some time to talking about function trains in the video.
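
To make the idea concrete, here is the canonical fork - a sketch, not compiler code: a train (f g h) applied to ⍵ evaluates as (f ⍵) g (h ⍵), so sum-divided-by-count is an average:

      avg ← +⌿÷≢   ⍝ fork: (+⌿⍵) ÷ (≢⍵)
      avg 3 1 4 1 5
2.8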

You cannot do these things in Scheme, as they are fundamentally a syntax that is incompatible with S-expressions. You could do a form of points-free programming in Scheme, but the nature of S-expressions means that the structure of the points-free code is explicit, and thus, if you wanted function trains, you would have to implement a macro to construct them for you. That could be done, but at that point you would be well on your way to implementing APL in Scheme, rather than using function trains in Scheme.

Function trains also don't work well for functions whose arity exceeds 2.

dfns is a syntax invented by John Scholes for implementing the APL notation with lexically scoped function definition and control flow. dfns replace the more traditional flat-namespace, dynamically scoped style of programming, and the OOP programming model, both of which are still supported and widely used in Dyalog APL. However, to a Scheme programmer, dfns are a God-send, because they closely mirror the core constructs and concepts familiar to a Scheme programmer, namely: lambda, recursion, and if statements/exception guarding.

When I don't use function trains in my compiler discussed and demonstrated in the linked presentation, I'm using the more explicit dfns syntax.
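
A small illustration of those constructs in dfns form (a sketch, not code from the compiler): braces are the lambda, ∇ is self-reference for recursion, and a guard (condition:result) plays the role of the if statement:

      fib←{⍵<2:⍵ ⋄ (∇⍵-1)+∇⍵-2}
      fib 10
55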

jarpineh
Thank you for the comprehensive comment. Hopefully I can understand enough of the video to get the gist: the idea of Scheme-style functional features with APL's power tools. If you get around to writing more about these things, I'll definitely be reading.

Since most of the Python and JavaScript I do at work is mostly searching for and using examples and libraries from wherever, having a full complement already included with the system feels impossible. On the other hand, APL implementations have had some time to get things right. I'll take a look at that big APL book and see what I can gather (my favorite way to learn).

BTW: I got my first taste of APL from one of the FinnAPL fellows. I saw some incredible things by him I've since tried to replicate after a fashion.

arcfide
The Mastering Dyalog APL book is available as a free PDF, too. Though the paper version is more enjoyable, and serves as an excellent monitor stand in times of need, as well.

As a Scheme programmer, one of the first things I did when learning APL was to create a new REPL/environment in Chez Scheme called Sapling that replicated APL in Scheme using Scheme implementations of the primitives.

I quickly discovered that at least for my meager mind, there was no way to combine the two as one. I could use one within the other without trouble, but I couldn't mix the two easily.

If you're interested, the Co-dfns compiler is designed to integrate with existing languages. If your language has a C FFI, then you'll be able to integrate with Co-dfns compiled code. The workflow is basically for you to produce a namespace of the functions you want to use; you can then compile them and link against that in your Python or C or C++ code. You can then just call those functions with the appropriate arrays and things are a go. The main thing you need to do is write functions to convert between Co-dfns arrays and the data types you are using in your program.

jarpineh
This got me a bit confused. Is the purpose of Co-dfns to compile apps that run on a GPU? Or does it only itself compile on the GPU, for apps that can run on any target architecture?

Other than that, the idea of using my domain-specific Python stuff (and a few general libraries) with these data processing tools is intriguing.

arcfide
These two goals are not mutually exclusive. The Co-dfns compiler is a compiler that is designed to self-host on the GPU, but it is a compiler for the Co-dfns language, which is a lexically scoped syntax in Dyalog APL. The compiler compiles dfns programs and supports Mac, Windows, and Linux, targeting CUDA, OpenCL, and Intel CPUs. The compiler itself is written as a dfns program, hence the idea of self-hosting on the GPU.

Target applications for the compiler include document processors, cryptographic libraries, neural networks programming, data analytics, high-performance web applications, vector databases, financial algorithms, bioinformatics, and HPC/Scientific computing tasks.

So, in short, yes, it compiles apps to run on the GPU, or the CPU, whichever you prefer. And yes, it is meant to compile itself on the GPU (so, you compile a GPU program on the GPU).

arcfide
This is an AMA for the linked compiler presentation by request. It is not live video, but I will be answering your questions directly here on Hacker News. This was requested due to some logistical issues and so forth during the live presentation. The content is currently being edited to start at the appropriate place, but at the moment you may have to skip ahead 17 minutes to catch the start of the presentation.
bakul
This is fantastic! I can sort of grok your code at a high level but only enough to appreciate it; not to do much with it as that would require investing a bunch of time.

Questions:

1. Did you find anything really surprising? Some new algorithm or idiom or something that was surprisingly easy?

2. Is it worth applying some of the same tricks in compilers for other languages?

3. Have you considered using the same "frontend" to create an APL->Scheme translator? That might have more pedagogical value. Scheme has a great macro system and it is a HLL, so it can be a good IL at the very least -- I can then translate the compiler itself to Scheme and study :-)

arcfide
1. There was a lot surprising throughout this whole process. For a while I didn't think it could be done. That it could be done was surprising in itself. That using trains actually improved my life was also surprising. There are some "research class" algorithms that I have published as a result of this work, so, there are some new idioms and algorithms that came out of this work. Additionally, this whole compiler is essentially a new treatise on the construction of compilers in general, providing a set of idiomatic methods for accomplishing tasks that would have been traditionally accomplished with other algorithms more commonly taught (such as the way you would normally do a depth-first tree traversal vs. the use of a path matrix to compute parent-child relationships and using key to transform the tree). Some compiler passes turned out to be really surprisingly easy, such as function inlining, which is a recent addition not shown here. After being familiar with this code, it took very little time to explore, write, and implement. Certain things that are multi-part algorithms in other languages turn out to be a single, well-known APL primitive, such as Key (for tree transformation) or Tally for AST/tree "count".

2. It depends on how you write your compiler and what your core abstraction level is. The compiler design here is language agnostic, and you could implement a compiler for any dynamically typed, lexically scoped programming language using this compiler as a template. Adding a type-checker on top would not be difficult, but I haven't established what I consider to be "idiomatic" type-checking algorithms yet, though I did have a version of type inference that was okay that was only recently removed.

However, chasing concision or using micro-level tricks without understanding the macro-level issues being addressed by those tricks will probably lead to unreadable code. The main point of the compiler design here is the importance of prioritizing macro-level thinking and optimizing the micro level to support better macro-level reasoning. How you can apply this to your own compilers depends on how much you're willing to alter the core "aesthetic/HCI philosophy" of your code base.

3. It would be almost trivial to do so, and would have probably zero pedagogical value in the way you're thinking. To do so you would change the "gc" function to spit out Scheme instead of C code, which would be trivial, and it would require changing a bit of the syntax of the outputted code on the following pages. If you wanted to reimplement the runtime as a Scheme library, that could also be done. You'd lose parallelism, of course. The C code and the Scheme code would probably be equivalently clear.

The reason it wouldn't be much use is that there would be nothing idiomatic about the Scheme code upon which you could "rest your hat." Reading APL written with S-expressions rather than APL notation is harder, not easier. To a Scheme programmer (I was one before this; see R7RS Small), it looks no more intuitive than the APL.

Keep in mind that the core compiler is literally just a huge string of function calls; it's nothing but function composition.

    (2=+⌿0=X∘.|X)/X←⍳20

vs.

    (let ([X (apl-iota 20)]) (apl-filter X (apl-equal 2 (apl-reduce 1 apl-plus (apl-equal 0 (apl-outer-product apl-modulus X X))))))

Now, if you're really familiar with Scheme, then the terms iota, filter, and reduce should suddenly pop out at you, but that's dangerous, because Scheme's implementations of these don't at all behave the way APL's do, though they are in the same general algorithmic family. So, this might help your study for about an hour, until you had memorized all the most common operators for which Scheme has a similar primitive that does something a little like it. However, we just about covered all of them in the above code.

Given the above code, what does it do? Was the Scheme significantly easier to understand? Now what if you had 90 such lines? The benefit of such translation starts to die out after about the first line of APL code that you read.

gwu78
This is a real gem of a comment: https://news.ycombinator.com/item?id=13571160

"petty abstractions" to assist reusability

Layer upon layer upon layer of indirection.

I want to be liberated from this mindlessness.

It is everywhere in the programming world.

Large sums of money are spent to maintain and propagate this way of thinking.

Large companies have been built upon it.

Is it possible to escape this?

These contributions from arcfide are heartening.

ycmbntrthrwaway
The http://suckless.org/ guys decided to give up building libraries and abstractions
nodesocket
I'm a complete noob at APL (I come from web development and ops), but reading the Wikipedia article, there is a section that says:

The following expression finds all prime numbers from 1 to R. In both time and space, the calculation complexity is O(R^2).

    (~R∊R∘.×R)/R←1↓ιR
How? What? I feel stupid now. ;-)
arcfide
Try this one:

    (2=+⌿0=X∘.|X)/X←⍳20

:-) They're equivalent, but the second should give you a bit more intuition based on your understanding of the definition of prime numbers.

Use TryAPL.org to explore the code. Evaluation is right to left, with parentheses having their standard meaning. Try seeing what the left and right sides of the / expression give you, and then try removing pieces from the left-side expression of / incrementally to see how it's built up.

If you need help seeing what the functions do, see the "Language Reference" at help.dyalog.com. That will show you what the symbols do.
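
For instance, working outward from the right (a session sketch; TryAPL will show you these intermediate arrays):

      X←⍳20
      +⌿0=X∘.|X      ⍝ for each number, how many of 1..20 divide it
1 2 2 3 2 4 2 4 3 4 2 6 2 4 4 5 2 6 2 6
      2=+⌿0=X∘.|X    ⍝ exactly two divisors (1 and itself): prime
0 1 1 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0
      (2=+⌿0=X∘.|X)/X
2 3 5 7 11 13 17 19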

arcfide
You may also want to see "Game of Life in APL" on YouTube.

https://www.youtube.com/watch?v=a9xAKttWgP4

arcfide
Here's an English reading of the above code:

Filter the numbers from [2,R], removing any number that is a member of the set of products of pairs of elements in [2,R].

Avshalom
ιR creates an array 1...R

1↓ drops the first element of whatever is to the right (1...R)

R← just assigns the thing to the right to R, so now R is a variable containing the integers 2...R

R∘.×R creates a 2d matrix of every element in R multiplied by every other element in R

R∊ takes R (on the left) and checks, element by element, whether each value of R appears in that matrix on the right. It spits out an array like 1 0 0 1 0 1 1 0, where the ones correspond to elements in R that are also in the matrix

~ just negates that list, so now we have an array that's like 0 1 1 0 1 0 0 1, which corresponds to elements in R that can't be created by multiplying any two elements of R together

/ goes through and only returns values of R that correspond to those 1's

the flow is like this:

  R←5
  ιR
    1 2 3 4 5
  1↓1 2 3 4 5
    2 3 4 5
  R←2 3 4 5
  R∘.×R
     4  6  8 10 
     6  9 12 15 
     8 12 16 20
    10 15 20 25
  R∊R∘.×R
    0 0 1 0
  ~R∊R∘.×R
    1 1 0 1
  1 1 0 1 / 2 3 4 5
    2 3 5
arcfide
Thanks!
gwu78
From reading your blog, I detect that you like working in MS Windows.

Can your compiler be ported to BSD?

Here is the rationale for why this can be useful: BSD can in turn be ported to new hardware with reduced amount of effort, sometimes a project that is manageable by one person.

(Compare this to what is required for porting Windows, Linux, etc. to new hardware).

Imagine an APL computer with an open source kernel that is easily ported to new hardware.

arcfide
Oh, another thing I thought about: the BSD people probably won't like that it's AGPL dual-licensed (commercial licenses available). Unfortunately, that's the only reasonable license for this right now. One day, if the project is successful enough, I'd like to see an ISC-licensed version, but that requires a good business model around the ISC license for compilers.
arcfide
Actually, I'm an old school OpenBSD user who still programs with ed(1). Check the misc@ archives from a number of years back. I've had to move away because OpenBSD isn't the greatest platform for doing high-performance GPU research. :-(

The compiler is already portable to BSD. However, the underlying library I use to do GPU programming (ArrayFire) is not ported to BSD, though it is for Linux. The Dyalog interpreter is available for AIX as well as Linux, and that's what you need to host the compiler right now. If Dyalog received the interest, it would be pretty straightforward to port Dyalog APL to BSD.

If one wanted to provide a self-hosting Co-dfns compiler, you could implement a minimal dfns interpreter and then run the Co-dfns compiler over that. To my knowledge no such interpreter exists yet. However, porting either ArrayFire or the Co-dfns runtime would go a long way to doing this, because you could then share the runtime library between the interpreter and the compiler.

So, yes, this is quite possible, and probably not even that much work. I'd welcome such a contribution. :-)

kovek
How do you manage to still work with ed? Where can we read about that? I'm curious to see how useful it can be
arcfide
I don't have it anymore, but I did a live coding session in ed(1) where I edited thousands of lines of XML code as a demonstration of using ed. It's a very powerful editor, almost too powerful for my "simple tools" aesthetic.

Just a few things you can do with Ed:

- Structured/outline refactoring
- Rapid code catenation and splitting

It's one of my favorite editors. I don't think I have a video of it, but prior versions of the code were optimized for ed(1) hacking, allowing me to do structured/outline editing easily with ed(1).

breck
Really different and fascinating video, thanks!

Two surface-level questions: 1) The "trains" design pattern you mention: do you have a link (to Wikipedia or something) with more info? I might know the pattern by a different name. 2) I liked the tree printout seen at 1:42. Is there a library or algo you used for that?

breck
Edit: Oops, typo. I meant 1:24 (https://youtu.be/gcUWTa16Jc0?t=5210).
arcfide
1. http://www.jsoftware.com/papers/fork.htm See Hook and Fork. This was the first introduction of trains programming in APL, I think. You can see help.dyalog.com for more information on the current implementation in Dyalog.

2. I think the tree printout of trains is a built-in library feature of Dyalog APL. The dfns library (dfns.dyalog.com) has a tree-printing dfn that may be of use in implementing it, and I think you can read the actual source code for that in the user command for display or the like.

I may implement a version of this for my ASTs to augment the Xml and px utilities, as it is a nice format and one that might be of use for printing out the AST of the compiler. I have not had need of it yet, though, so I'll probably get to it the next time I have to do a fancy demo for which I can prepare more adequately.

breck
Thanks! Also found this link which had more info on Trains: http://help.dyalog.com/14.1/Content/Language/Introduction/Tr...
arcfide
Ah, there it is, thanks!
psandersen
This looks interesting, but getting a feel for the implications is pretty difficult without spending a lot of time.

A discussion of the implications in terms of speed, robustness, maintainability and future hardware would be really useful.

E.g., if some team did the crazy undertaking of rewriting all the software (kernel included) in a base Ubuntu install using APL on the GPU, what would the positives vs. negatives be? Is this even possible/desirable, or does it require research breakthroughs? Is this more suited for DSP functions, or even extendable to probabilistic programming?

What is the best hardware match? Does this offer advantages for synthesizing programs to FPGAs, or some other hardware (can't remember the name: many small dumb cores with limited memory; maybe a bit like the Cell processor)?

arcfide
Robustness and maintainability are as much a product of the minds working on the code as of the code itself. However, I can say that, for me at least, this was not even a conceivable enterprise, much less one that could be done by one person or, really, a few people, without the approach I have taken.

I am interested in this partly because I feel that future hardware will more and more reward code written in this style than in other styles, and I want to make it possible for people, especially non-computer scientists, to learn and produce good code and "think" about problems more rigorously, which, as problem sizes increase, makes this sort of approach important for scaling.

As for speed, one nice, but currently unexplored feature of the compiler as it is written above is that the entire complexity (algorithmic complexity, Big O) is computable, even trivially derived from the code itself, because the compiler is just a string of function compositions on primitives with well known algorithmic complexity. A future feature plan for the compiler is to produce Big-O complexity analysis for code of this sort to help developers spot performance regression points earlier in their code, since the compiler can now assist you in doing so symbolically, and not just with test cases.

I have thought a bit about how easy it would be to compile code like this to an FPGA, and I think there are a lot of possibilities here, but I've not been able to make time to study it completely. My current scope is limited to GPUs and CPUs of the more standard ilk.

Rewriting the kernel of an operating system would be very challenging if you expected good performance on the GPU. GPUs are not designed to make context switching of this sort (as found in Linux) easy, or fast. It's highly likely that you will run into divergence problems, at the very least.

However, people have explored the implementation of operating systems on top of these concepts (see kOS). I expect those to be very practical, and it would be an interesting future project to bootstrap Co-dfns programs to run natively on bare metal.

A more interesting initial idea would be to explore rewriting the base userland in OpenBSD or Ubuntu in something like Co-dfns. This would open up interesting challenges relating to file systems and interacting with GPUs and the file system, but it is at least conceivable that many userland functions would become much easier to write and maintain (maybe) by playing with this approach. However, it would require additional infrastructure support and the like which Co-dfns currently does not have. I imagine the first step would be to identify the core file system APIs that would need to be provided in some reasonable analogue in Co-dfns, and then try implementing the code on the CPU first. I would expect that the code would be simpler, and could potentially better utilize the CPU, but I am not sure you would see fundamental improvements in performance without going way up in scale and increasing computational needs to the point that the speed of the filesystem is not the limiting factor.

psandersen
Thanks heaps for the detailed reply! The Big-O complexity analysis "for free" sounds pretty interesting.

I'm wondering if Co-dfns would be a natural fit for algorithmic design of programs through deep learning. We're already seeing some attempts at using an AlphaGo-like architecture (marrying traditional AI, such as Monte Carlo tree search, with modern deep learning to learn useful heuristics; or, in the case of a recent paper on program design I can't find right now, combining the network with an SMT solver).

If the compiler naturally gives the complexity of a program, creating a feedback loop with a speed prior ( http://people.idsia.ch/~juergen/speedprior.html ) might work very well for AGI.

Sorry if this far out speculation is off topic, I'm just really keen on exploring tech that could be valuable in AI development.

arcfide
If that's something of interest to you, get in touch with me via email and we can see if we can't get Co-dfns supporting what you need to make it a reality! There are a number of things that could make this nice in Co-dfns, including the possibility of using idioms instead of library calls, potentially making adaptation of algorithms easier.
hzhou321
I understand that speed may not be an issue here, but I'd still like to ask: won't so many nano-passes affect speed?
arcfide
See Andy Keep's dissertation on using Nanopass to reimplement the Chez Scheme compiler. The short answer is: not really. Nanopass can add some constant factors, but it is essentially a linear/constant addition, whereas the majority of compilers are quadratic or worse in complexity. Chez Scheme was/is N log N in most cases, and even then Nanopass did not significantly harm performance.

It actually has benefits on the GPU beyond just the CPU, because the passes tend to be smaller, and thus more readily executed efficiently on the GPU, meaning that you would likely see a large net gain in performance by using Nanopass. Since I believe this might be the only compiler of its kind, it's hard to say for sure on that mark, though.

hzhou321
Does N here refer to the length of the source code or the number of passes? When the source code is big, even an identity pass takes time, right? I don't understand how it becomes a constant addition.

Could you quickly comment on the majority of compilers being quadratic or greater?

By the benefits on the GPU, do you mean the size of the code for the passes or the size of the source code? I assume the passes are run serially, right?

[I'll read Andy Keep's paper.]

arcfide
When I spoke about benefits on the GPU, I was speaking of the benefits of executing the compiler passes on the GPU, because with a Nanopass architecture, or really, the Micropass architecture I'm using here, each pass is simpler than the equivalent set of passes that do the same thing in a traditional compiler. This means that the passes are generally each simpler, and thus, easier to parallelize.
arcfide
In compilers the N value is commonly considered to be the number of nodes in an AST. Compiler performance is generally evaluated based on its performance over the code, which is generally bound by the number of nodes in the AST.

The identity pass is linear in the size of the AST in almost all functional compiler designs. Optimizations would then make this constant time for most functional languages with adequate optimizations. The identity pass in Co-dfns would be the single character ⊢. The Co-dfns compiler would produce code for this pass which is constant time.
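
As a quick illustration (my sketch, not compiler output): ⊢ is the right tack, which simply returns its right argument, so as a pass it hands the AST through untouched.

      ⊢ 2 3 5 7
2 3 5 7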

Likewise, with a bit of work, this would also be true for a Nanopass compiler function, unless you insisted on traversing the AST and rebuilding it yourself, in which case the framework could not optimize this away.

The majority of compilers use algorithms that are quadratic in the size of the AST (number of nodes) for a number of passes. Many compilers use a variety of graph coloring register allocation, which is NP-complete. However, better register allocation algorithms exist that are quadratic or quadratic log in the number of nodes in the AST. The Nanopass research has demonstrated that for real life compilers, the cost of certain compiler passes generally overwhelms most of the cost of the rest of the compiler, and so adding another linear factor by splitting a pass into multiple passes, generally does not affect the big-O complexity of the entire compiler, and thus, the overall performance at scale, though it can affect the constant factors by a small amount, depending on how good your allocation engine is for your host language.

This Nanopass research has only been done on compilers implemented on the CPU. My research is exploring whole compilers on the GPU, which is an area that shows promise, since even memory bound computations such as most compiler core compiler transformations, can see significant performance speedups, even if they move from a linear to log linear complexity.

The passes are not run serially. Each pass is data parallel, and while there is a serial dependency in that each pass is dependent on the results of the previous pass, nothing in the way the compiler is written forces a hard barrier (such as a join in fork-join programming) at the compiler pass boundaries. It is conceivable that optimizations techniques for stencils such as Trapezoidal decomposition could be applied to compilers written in this fashion to allow intra-pass optimization to occur, but this is future work.

The passes themselves execute in parallel on the GPU, or at least, that is their design. They are expressed as natively data parallel, or vectorized, algorithms. If run on the CPU through the interpreter, then they will execute using the Intel Vectorized code units using the optimized primitives in the Dyalog APL interpreter.

hzhou321
Thanks. Your work is inspiring.
panic
How much abstraction overhead does the ArrayFire library introduce? Do you think you could get better performance and simplify the overall system by generating CUDA or shader code yourself?
pavanky
Not the author, but as someone who worked on the ArrayFire library I think I have some input on this question.

Currently a lot of math functions are already compiled using a fairly simple just-in-time compiler inside ArrayFire. This results in complex mathematical operations being fused into single kernels.

That said, the number of functions that are supported this way is fairly limited, and I am going to try to add more of them.

Anyway, here are a couple of links that may be helpful in trying to understand the overheads involved. The code shown in [2] compiles to a single CUDA/OpenCL kernel.

[1] https://arrayfire.com/benchmarking-parallel-vector-libraries...

[2] https://arrayfire.com/performance-improvements-to-jit-in-arr...

arcfide
I would consider ArrayFire a "hard abstraction" that is basically the bottoming out of the system. Before this I was using OpenACC; see my talk at the Dyalog User Meeting last year for some of the struggles there with Scan.

In short, ArrayFire is handling a lot of the core GPU algorithms for me, and that's its main contribution. It saves me from needing to implement these myself, which is itself a full-time job. Much better to put a hard cutoff point there and work on the high-level questions than to spend time trying to support three different architectures by myself. There is basically no good way of working at that level that isn't ugly and very, very painful. I am glad to have someone else do that for me.

The benefit is that I can take advantage of their platform-independence work and their performance tweaks for individual GPUs (which is still a lot of work given the state of GPU programming today). This leaves me free to focus on higher-level concerns that are more relevant to the APL programmer.

pavanky
Hi Aaron, I saw the link earlier but did not get the time to check it out. Someone from ArrayFire pointed out that you were using it, and now I recognize your name/id from the ArrayFire forums :)

Anyway, I am giving it a listen now.

If I had to ask a question: what is something that frustrated you about ArrayFire, and what do you think needs fixing?

arcfide
If I were to expand that idea a bit more and remark on the biggest class of bug that causes segfaults for me in ArrayFire, it would be the requirement for explicit data type casting using as() to avoid problems. I wouldn't want this to go away at the cost of performance, but....
pavanky
Hmm, this is the downside of using a dynamically typed array class. While the functions to handle the type checking of the host pointers can be done in C++, it would not be so easy to implement in the C layer.

And since it looks like you are using the C++ API anyway, can you open an issue on GitHub so this will be looked at?

arcfide
I will see if I can remember to do this. Thanks for the contributions!
arcfide
ArrayFire was designed primarily as a user library for scientists, it seems, and so there is a clear bias there. My co-opting it for use as an IR language for a compiler front-end is perhaps outside ArrayFire's original scope. The good thing is that ArrayFire manages to capture a few key ideas that are often absent but critical to APL, so that was good.

On the other hand, it is non-trivial to work with more advanced concepts and workflows/algorithms in a way that gives me enough control over my data to get the performance I want when dealing with nesting or other nuanced execution patterns. A trivial example would be not having more control over how Scan works. ArrayFire is unfortunately not unique in its very limited support for scan operations that execute in parallel. I have worked around this, but it's not ideal.

Additionally, the rank limitation of 4 in ArrayFire is infuriating. Given that we're working with C++, it seems like there should be no reason that I shouldn't be able to have arbitrarily many dimensions.

The design of ArrayFire also makes certain code compositions difficult (this goes back to the issue with nested execution), meaning that I'm spending a lot of effort working around this limitation in the compiler, adding new compiler passes and optimizations to massage the output code so as to avoid the syntactic and semantic limitations of the ArrayFire primitives. I'm not sure ArrayFire can do better in this regard, and this is the sort of optimization that a compiler front-end can make plenty of use of, but it would have been a nice touch to have some built-in support for mixed CPU/GPU statement overlap, to save me from needing to separate out certain code and possibly introduce extra stages to make sure that all the computation can be properly fused and executed. This limitation is especially felt with GFOR.

So, if I had my way, ArrayFire would be less user friendly to the scientists (that's why Co-dfns is here! ;-) ) and more friendly as an IR for implementing APL runtimes. :-)

pavanky
> On the other hand, it is non-trivial to work with more advanced concepts and workflows/algorithms in a way that gives me enough control over my data to get the performance I want when dealing with nesting or other nuanced execution patterns. A trivial example would be not having more control over how Scan works. ArrayFire is unfortunately not unique in its very limited support for scan operations that execute in parallel. I have worked around this, but it's not ideal.

Can you give me an example? If you think this is not the forum to discuss it, you can always email me (check my profile) or ping me on Gitter (https://gitter.im/arrayfire/arrayfire).

> Additionally, the rank limitation of 4 in ArrayFire is infuriating. Given that we're working with C++, it seems like there should be no reason that I shouldn't be able to have arbitrarily many dimensions.

I want to solve this as well, but it requires a lot of tedious work, so it keeps getting postponed. But rest assured, this will be fixed!

> So, if I had my way, ArrayFire would be less user friendly to the scientists (that's why Co-dfns is here! ;-) ) and more friendly as an IR for implementing APL runtimes. :-)

ArrayFire development has been request driven, so the features requested and used the most got implemented first. But I don't see why the features you're asking for can't be implemented in ArrayFire!

arcfide
I'll try to contact you out of stream. Thanks for getting involved here!
jonahx
arcfide,

What are your thoughts on J vs APL? Also, is APL your preferred language for most kinds of problems? When it is not, why not?

arcfide
J's ideas are terrific. Dyalog is actively incorporating the ideas in J back into APL. J's choice to use ASCII characters instead of Unicode is a historically understandable design decision that I feel should no longer constrain the APL community. Otherwise, Dyalog APL and J are my two favorite APLs.

If I can't write something in APL, I would probably write it in Scheme. I haven't found anything yet that wasn't addressable in APL, and the one thing that seemed like it might not be (writing compilers) I figured out how to do, obviously.

However, if you want to create a DSL or otherwise play with notational elements in a DSL style for which macros are the "right choice", then APL is unlikely to be your friend, because I have yet to find a way to build good macro systems on top of APL. Scheme simply wins here. Vocabulary-based DSLs (combinatory DSLs and the like), however, are very well suited to APL.

Outside of these, I'm actively exploring the idea of theorem proving based programming, and currently things like Agda, Idris, ACL2, or Isabelle are going to be better for this right now. Not inherently, but because of tool support. I hope to change this in the future.

throwaway7645
Is parallel programming really that easy in Dyalog APL? Could someone provide a trivial example?
arcfide
Currently, Dyalog APL provides green-threaded concurrent programming using a traditional "fork" model, as well as a CSP-style distributed computing model called "isolates." In addition, Dyalog will utilize vectorized operations on the CPU for a number of primitives and code patterns. Co-dfns adds further functionality: vector-level parallelism on the GPU/CPU and a true threading "fork" (the latter is only part of the language extension and has not yet been implemented). So, if you write your code in a Co-dfns compatible manner (a closed single namespace of dfns, plus some limitations in the current version which are being lifted rapidly), then you will be able to execute it on GPU, CPU, and SMP machines.

As a trivial example, say you wanted to compute the quadratic prime numbers algorithm in parallel:

    Primes←{(2=+⌿0=X∘.|X)/X←⍳⍵}
You would simply compile this function inside a namespace using Co-dfns, and you could then run it on the GPU to improve performance. The compiled code runs on the CPU, GPU, or SMP machines.
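
A sketch of that setup in plain Dyalog (the anonymous namespace and the use of ⎕FIX here are illustrative; the exact Co-dfns compile call is not shown in this thread):

    ns←⎕FIX ':Namespace' 'Primes←{(2=+⌿0=X∘.|X)/X←⍳⍵}' ':EndNamespace'
    ns.Primes 30      ⍝ 2 3 5 7 11 13 17 19 23 29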
throwaway7645
It just does it automatically?
arcfide
Yes.
Yahivin
I love the fearlessness. You're now one of my heroes.
arcfide
Acquire the APL, and then be the hero in your own code every day.
erikpukinskis
Look at this guy's code and you start to understand how we'll get to the Matrix situation of being able to watch scrolls of random-looking symbols and picture people and food and cities in our head.
arcfide
If you'd like to view the "Matrix" in APL:

    {⎕←(156+?72⍴54)⊃¨⊂⎕AV}⍣≡⊢⍬   ⍝ print a line of 72 random glyphs from ⎕AV, over and over until interrupted
This assumes you have at least a 72 character screen display in your Dyalog APL session.
arcfide
And here's a slightly more sophisticated version that mimics the animation you see on the Matrix:

      T←41 80⍴' '                                      ⍝ a 41×80 character canvas
      )ed T                                            ⍝ open T in an edit window to watch it update
      {⎕DL÷16⊣T∘←⎕AV[156+?80⍴54]⍪1↓(-0=?80⍴5)⊖T}⍣≡⊢0   ⍝ ~16 frames/sec: push a fresh random row in at the top and scroll
In your Dyalog APL session, type the first two lines, put the opened edit window somewhere you can see it, then go back to the session REPL and enter the third line. On the Linux terminal, use the APL+Tab key to switch between windows on the console.

For best effect, change your edit window theme to use green on black. :-)

zzzcpan
Is there a transcript?
arcfide
Unfortunately, no. There's a lot of code demonstration, some REPL work, and interaction with the source, so I'm not sure a transcript would communicate things adequately without significant editing and adaptation. I would, however, like to put this into some sort of guided documentation at some point in the future.
beagle3
Haven't had time to watch the youtube video ...., but here's a question related to APLing in general:

Do you find that the GPU angle makes people more receptive to the ideas behind APL? In my own experience, APL/K/J quickly get a SEP ("somebody else's problem") field applied to them (this is also visible in the discussions linked by dang): some people dismiss them outright, and some say "well, it might have been easier to grasp if the syntax wasn't so obtuse" (which is demonstrably wrong). I've given up on evangelising APL/K, but since they're such a perfect match for GPU semantics, maybe there's hope for more acceptance? What's your take?

coldtea
>and some say "well, it might have been easier to grasp if the syntax wasn't so obtuse" (which is demonstrably wrong).

Can we have that demonstration then?

beagle3
One example I like for K is Maximum Subarray Sum [0]. The canonical K solution (sorry, my APL is too rusty to translate) is

    |/0(0|+)\
Yes, this is the complete function, and it is the efficient O(n) solution; a K programmer is likely to recognize it immediately (using their eyes for pattern recognition). Now, if I were to rewrite it in infix/paren syntax, it would become:

    mss(x):=fold(max,0,scan(lambda(a,b) max(0,a+b), 0, x))
This is not at all more readable (arguably less readable) than the K version, and that's inherent, in the sense that the K uses your organic pattern matcher (supplied by eyes and brain) much better than a Haskell/Lisp version ever will.
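
To make that concrete (assuming K's seeded scan, which keeps the intermediate result of every step): for the input -2 1 -3 4 -1 2 1 -5 4, the scan 0(0|+)\ yields 0 1 0 4 3 5 6 1 5 - the best sum of a subarray ending at each position, clamped at zero - and |/ then takes the maximum, 6, which comes from the subarray 4 -1 2 1.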

Similarly,

    ,//
is "flatten". If you were to write it verbosely, it would be

    flatten(x):=converge(fold(concatenate,x))
While this may be more readable, the K syntax (as does the APL one) gives you a concise, well-defined, pattern-matchable version. (Which doesn't actually need documentation: the implementation is 3 characters, so if you want to know how it behaves on an empty list, just read the implementation!)

[0] https://en.wikipedia.org/wiki/Maximum_subarray_problem

kazinator
Why don't you just learn the vocabulary of patterns used in a compressed representation, like LZ77? Then you can condense any language to make it more readable for your "organic pattern matcher" just by passing the source through gzip.

Here is the thing: I have a pretty good idea of what

    flatten(x):=converge(fold(concatenate,x))
does without knowing what exact language it is written in, or having to crack open a reference manual for it. This is not so of your sample of puerile line noise, which is completely arbitrary. If we feed it to, say, GolfScript, it will do something completely different.

Also, if I delete a ) character from the above expression, I can guess that it is probably not well-formed:

    flatten(x:=converge(fold(concatenate,x))
           ^ not matched

    flatten(x):=converge(fold(concatenate,x)
                        ^ not matched
Then, names like flatten and converge are atoms. We could condense this code by giving them single-character names (using some Unicode glyphs, perhaps, or whatever). Let's just use Roman capitals:

    F(x):=V(R(C,x))
It's not actually that long in the number of tokens or atoms.

Anyway, most computer scientists and software engineers stop measuring code by character count by the time they get out of puberty.

beagle3
arcfide's answer is excellent; there's another aspect I like to present in response to this kind of rant (which invariably comes up in discussions of APL-like languages):

When people write code in Pascal/Java/C/C++, it is considered bad style if it is not indented properly (where "properly" is the subject of religious wars, but there is near-universal agreement that deeper scopes should have more indentation). That's not an issue with the language per se, but it is pervasive among all those languages that did not eliminate it a priori (like Python and Nim did).

The compiler doesn't care; but programmers do, because we read code a lot more times than we write it, and the visual structure of the code informs us of (some of) its semantics, e.g. "this indented block is dominated by a condition specified in the lines above and below it; now let's see what the condition is, and whether it's an "if" or a loop of some kind".

And the real reason programmers care is that it makes the code easier to read "at a glance", using our efficient visual system to quickly recognize patterns.

Using APL/J/K takes this principle to the extreme: because it allows whole (useful!) functions to be compressed down to a few characters, it cuts out at least one, and often two or more, bottom levels of abstraction. One doesn't need to abstract the computation of an average into a routine called "average" when the entire implementation (e.g. in J) " +/ % # " is shorter than the word "average".

One might argue that it is still better to name this; however, +/%# has the distinct advantage that it also conveys the behavior on an array of length 0 (what does your "average" function do there? you have to read the source code or the often-incomplete documentation) and on a matrix (it gives a column-by-column average), etc.
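
In Dyalog APL the corresponding train is +⌿÷≢; a quick check of the behaviors just mentioned (assuming the default ⎕DIV, under which 0÷0 is 1):

    avg←+⌿÷≢         ⍝ a fork: sum down the first axis, divided by the tally
    avg 1 2 3 4      ⍝ 2.5
    avg 2 3⍴⍳6       ⍝ 2.5 3.5 4.5 - column-by-column averages of a matrix
    avg ⍬            ⍝ 1 - the empty case is 0÷0, visible right in the definition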

There is a learning curve, for sure, but just like mathematical notation, once you have learned it and used it, it is concise, readable, and pattern-matchable with the eyes, and no one goes back to "let us multiply x by itself y times" once they are familiar with the notation x^y.

Also, C++ is significantly more popular than the programming language "ADD ONE TO COBOL GIVING COBOL" for the same reason.

kazinator
You have somewhat of a small point there: if a calculation is made up of a four-symbol idiom, it bears repeating in the code rather than hiding behind a definition. However, that doesn't extend to arbitrary length. There is a threshold at which it's better to have a definition. I wouldn't copy and paste a sequence of 15 symbols, say. Somewhere between 4 and 15, there is a "knee".

Fact is, large software is made up of definitions. Vast numbers of them.

You can copy and paste everything only in tiny programs for your own use.

Programs are written across multiple lines not just because of indentation, but because we reference code changes to lines. Version control tools currently, by and large, work with line granularity. In code reviews, we refer to lines: "everyone, please look at line 72".

> using our efficient visual system to quickly recognize patterns immediately.

Not everyone has this; maybe only a few otherwise unemployable code-golfing idiot savants.

Maybe I could recognize +/%# if it is on its own. In real code, it will be in the middle of other such cuneiform, because it has to obtain inputs from somewhere, and there is other code expecting its outputs. I will not easily spot +/%# in a ream of similar $%#={|: type stuff.

> what does your "average" function do? you have to read the source code or the often-incomplete documentation

That just shows you're coming from an environment where you expect user-defined code to be poorly documented.

Of course, people don't overly document their page long code-golfed line noise that is for their personal use.

arcfide
Nobody wants to write proofs as English prose like they did when first proving basic algebraic properties. Yet those proofs are much easier to guess at the meaning of, even if you don't fully get it, than using our standard mathematical notation. Of course, it just so happens that our mathematical notation allows middle schoolers and high schoolers to think of these proofs as obvious and trivial, whereas they would have no idea what to make of the English proofs that are more "natural" to the English speaker. Of course, that notation that they use with such facility (even if they aren't particularly competent with it and just learned it last week) has no relation and no intuition at all to the English language, and makes no attempt to do so.

Yet, I challenge you to back up your statement about character counting by showing me a Computer Scientist and a Software Engineer who no longer writes or uses traditional mathematical notation for anything, but has systematically moved over to the more intuitive and natural programming language of their choice, such as Scheme, C, or Haskell.

APL is a suitable replacement for Mathematical notation and was designed as such. Other programming languages are not. That's a very important element.

It's not about character counting. See Iverson's Notation as a Tool of Thought Turing Award lecture.

You might have a stronger case if, since the advent of programming languages, which are demonstrably capable of representing a whole host of mathematical ideas and concepts directly, people had begun flocking to programming languages to replace their mathematical notation when communicating ideas in non-computer-science contexts.

However, people rarely consider a programming-language equivalent to be superior to traditional mathematical notation when they are talking amongst themselves in front of a whiteboard.

coldtea
>Nobody wants to write proofs as English prose like they did when first proving basic algebraic properties.

You'd be surprised. Mathematical proofs contain lots of "english prose".

>Yet, I challenge you to back up your statement about character counting by showing me a Computer Scientist and a Software Engineer who no longer writes or uses traditional mathematical notation for anything, but has systematically moved over to the more intuitive and natural programming language of their choice, such as Scheme, C, or Haskell.

That's irrelevant. Best tool for the job et al.

>APL is a suitable replacement for Mathematical notation and was designed as such.

That doesn't make it suitable for programming in general.

And mathematical notation is much richer than APL, it's not constrained by mere sequential characters on a line.

beagle3
Some mathematical proofs contain lots of "English prose", but not in fields that have nailed down the semantics; and no field that has done so goes back to prose.

Mathematical notation is not automatically suitable for programming in general, but it is not automatically unsuitable either.

APL was devised as a notation for algorithms, not for general math; Iverson noticed that at the time (the 1950s) everyone was inventing their own algorithm notation when giving a mathematical proof, and he tried to unify that. And since it was well enough specified, someone later wrote an interpreter, and that's how APL came to be more than a notation. I urge everyone to read Iverson's "Notation as a Tool of Thought".

> Best tool for the job et al.

Best tool depends on context. No programming language is the best tool if the person reading/writing/maintaining it does not know the basics of how to program.

APL is suitable for programming in general for people who take the time to learn it, and is often the best tool for the job, as long as the context is "to be read/maintained by other people who know APL". IMHO, this is an unfortunately rare context.

kazinator
No software development in any language should be undertaken by people who don't understand the language and surrounding tooling and libraries. No program in any language should be written with the expectation that it can be maintained by people who don't know the language. This is not some APL predicament.
arcfide
I believe the translation of |/0(0|+)\ is ⌈/0,(0⌈+)\
Avshalom
K does a thing (like J, I think?) where its scan/reduce returns each subsequence/intermediate value as it goes, which is what makes that problem so trivial in K.
arcfide
Ah, if we have to preserve the sequence:

((⊃∘⍒0⌷⍉)⌷⊢)(+\,∘⍪,\)

RodgerTheGreat
The K "over" adverb (/) is like a conventional reduce/accumulate/fold. The K "scan" adverb (\) produces a list of the intermediate results. Having both is extremely handy.

If the verb argument of over/scan takes a single argument (rather than two), the adverbs will iterate to a fixed point. As before, "/" will return the final result and "\" returns all the intermediates.

For example:

      +/1 2 3
    6
    
      +\1 2 3
    1 3 6
    
      *:/,,,0
    0
    
      *:\,,,0
    (,,,0
     ,,0
     ,0
     0)
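
For comparison, a rough Dyalog APL analogue: / and \ are reduce and scan, and the power operator ⍣ with ≡ as its right operand iterates a monadic function to a fixed point (keeping only the final result, unlike K's \):

    +/1 2 3        ⍝ 6
    +\1 2 3        ⍝ 1 3 6
    ⊃⍣≡⊂⊂,0        ⍝ 0 - disclose repeatedly until the result stops changing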
arcfide
Oh wait, so you're not preserving the subsequence, you're just preserving the sum. So it's a standard scan, and my original translation is correct; my subsequence-preserving version does a bit more than this by preserving the subarray as well as the sum.
beagle3
I think this gives the correct answer, but the computation is different (which is fine - we're only looking at the answer); K's over/scan go left to right, and if I'm not mistaken, APL and J go right to left. That is, in K (where !5 is 0 1 2 3 4):

    +/!5 associates as ((((0+1)+2)+3)+4)

whereas in APL/J

    +/⍳5 associates as 1+(2+(3+(4+5))), matching the conventional reading of 1+2+3+4+5
I find the APL approach mathematically consistent and elegant, but the K approach is more practical: it makes over and scan useful for state machines, parsing, and many other uses without having to reverse everything twice.

But often, as in this case (or the flatten case), it doesn't matter which direction you start folding from.
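
The direction is easy to see in Dyalog APL with a non-associative function:

    -/1 2 3        ⍝ 2, i.e. 1-(2-3): reduction folds right to left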

arcfide
Going further with this idea: at the beginning of this video I point out an important "metric" that I use to evaluate the simplicity of the code, namely the ease with which I can manipulate the code base using "ascetical" tools - your operating system's equivalent of Notepad. In the case of Unix, it would be Ed (not Vi, but Ed), and on Mac it would be TextEdit. In this presentation I use Notepad.

We can define "easiness to grasp" along one dimension by asking "how much tool support is required to work with the code?" Code that requires, for example, a Visual Studio IDE to manipulate and work with effectively could be argued to be more inherently complex or less easy to grasp, than a code base that can be worked with easily using only Vi (the original Unix version, say, or Vim, if you want to get fancy).

A large part of this code (over time) was written, debugged, and programmed using Ed, although some of that code has since been deleted and replaced with newer versions. You can ask yourself, in your "easier to grasp" syntax, would you feel comfortable restricting yourself to using Notepad to edit the code? What about Ed?

I think most people would agree that they may go insane if they had to use Notepad to edit their code or work with it as the exclusive means of interacting with the source code. They rely on other tools besides just the syntax to understand the code and work with it.

Because one can easily manipulate this code without strain using a simple editor like Notepad, I would argue that in at least this one dimension it is easier to grasp. Change to some of the other commonly preferred syntaxes, such as S-expressions, M-expressions, long names for the primitives, or a non-point-free style, and I think you'd want a more sophisticated editor.

Cyph0n
You are starting to grab at straws with this argument. I understand that you're trying to advocate for more widespread use of APL, but constructing supporting arguments out of thin air really doesn't help your case.

If I follow your argument, I come to the conclusion that ASM is less complex than Scala because I can easily view and edit ASM in Notepad/TextEdit.

jacquesm
Just like Chinese and English are different, so APL and, for instance, C, Ruby, or Basic are different.

By defining symbols for operations more complex than just the basics, APL has a learning curve, but once you have gotten past that, it is actually more readable than the alternative, because you have less to hold in your head that does not actually help to solve the problem.

All that syntactic sugar and symbol stuff requires some expenditure of cycles, and the cognitive load spent on it won't be available for solving the problem at hand.

That's where your definition of 'readable' differs from what the APL people consider 'readable'. They are talking about reading the problem; you are talking about being able to 'parse' the code into (re)representing the problem.

In APL the code is what it does. In most other languages you first have to figure out what it is before you can figure out what it does.

In essence, it's the difference between the word 'cat' and a picture of a cat, or an ideogram representing 'cat' (assuming one exists), or even an actual cat.

arcfide
😸

A quick search reveals at least 12 ideographs for cat.

dang
Such ASM programs would be much larger, and thus harder to read without tool support, so your objection to arcfide's argument is actually a point in its favor.

On a side note, please don't break HN's guideline against name-calling in arguments: https://news.ycombinator.com/newsguidelines.html. It's disconcerting to be faced with something that can't make sense according to standard assumptions, but that's exactly what's interesting here. We want curious responses that open discussion further, not dismissive ones that shut it down.

Cyph0n
> But that's exactly what's interesting here. We want curious responses that open discussion further, not dismissive ones that shut it down.

I'm feeling a bit of moderation bias here. I don't think it's fair to give submitters special treatment because their content is "interesting". When you post on HN, you open yourself to criticism. I felt that his argument was weak - using "Notepad readability" as a point in favor of APL - so I called him out on it.

kbenson
Whether or not the moderation was slightly biased, I think dang's critique is fair on its own.

> You are starting to grab at straws with this argument.

> but constructing supporting arguments out of thin air really doesn't help your case.

Both of those could be stated in a more constructive manner, such as asking for substantiation rather than stating there isn't any. Discussing in good faith requires you to provide an argument to the contrary if you disagree, and to give the other side a chance to explain their assertions.

You followed up the first paragraph with a clarifying question, which is good. That by itself would have been sufficient.

dang
I'm definitely biased!

Criticism is welcome but it should be substantive and you should try to understand what you're criticizing. All you've done so far is repeat the leading generic objection without apparently realizing how common it is, or that the current thread originated as a (highly) substantive answer to it.

throwaway7645
I would tend to agree with arcfide here. It would be easier for me to learn 1/5 of a page of APL than the equivalent 15 pages of C#, and I don't need the massively bloated VS and the .NET framework to do it. Not that APL is perfect or anything.
arcfide
And actually, at some level, it is. If ASM is the right level of abstraction for your core problem, using ASM to solve the problem is probably infinitely simpler than using Scala to implement ASM semantics.

But saying that you can easily edit ASM in Notepad is not what I am saying above. If you write your Scala code correctly, then you can easily work on some Scala code with Notepad as well.

However, as you continue to add complexity, ASM begins to be the wrong core abstraction and begins to have usability issues that make it very difficult to edit in Notepad. How would you easily deal with refactoring in Notepad with ASM? Changing a single-line Scala function declaration is probably a lot easier in Notepad than changing 10-20 lines of ASM enforcing a given calling convention. Thus, many people who write ASM make heavy use of macros to improve the usability of the syntax.

However, it is fundamentally easier to edit a single line of ASM than a single line of most of the Scala I've seen. But the ASM line isn't doing enough to make that editing efficiency pan out.

It's not a single absolute; it's an aesthetic and a design trade-off happening at multiple levels, one that has to balance a number of factors that are not easy to measure. However, ease of editing is a secondary metric that can assist in this, the same way that line count or token count can help us understand the "complexity" of a code base.

To take a more specific example: as the distance between the call site and the definition site of a given name increases, it becomes increasingly difficult to use Notepad to make edits around those two related points in the code, because Notepad is a poor tool for traveling far distances through code. Thus, Notepad and other editors like it will generally encourage you to avoid putting your call sites and your definition sites so far apart in the code base that you cannot get from one to the other easily.

In my case, the distance between call site and definition site in my core compiler is usually less than 10 lines, and most often just a single line. There are a few that are one click/page away.

So again, I'm not saying that you can determine code complexity solely from ease of editing in Notepad, but that Notepad has a simplifying effect on the code base and is a useful metric for evaluating complexity.

And as for your specific example: if you take a program written in Scala and a program written in ASM that do equivalent things, I think most people would find the Scala program easier to edit with Notepad than the ASM program, until the point where the ASM program is very small, at which point the ASM program would most likely become the favorite.

Cyph0n
What you fail to mention is that there is a very clear trade-off here: you usually end up sacrificing readability (and sometimes performance) when you "shrink" your source code. Since APL is a concise language by design, using the "Notepad readability" metric as an advantage is redundant at best.

Furthermore, I'd argue that the more "Notepad readable" your code is, the fewer people can read it, which ironically ends up decreasing your code's value in the OSS world. I could write a web framework whose source is easy to open/edit in Notepad, but it likely won't be accessible enough for people to learn from or contribute to.

There's definitely a threshold at which concise-but-hard-to-grok becomes an issue, and I definitely feel that APL is one language where this applies.

arcfide
Define readability.

The whole point of this video presentation was to show people how to navigate the source code, and then to decide whether they still felt the design was needlessly obtuse.

APL programs benefit from efforts to simplify just as much as any other language does. However, the current version of the compiler is more readable precisely because of its concision. APL is concise, but concision itself isn't everything; Notepad readability covers more than concision.

I also disagree that shrinking code leads to less readable code. Shrinking code usually happens through "tricks" or micro-level obfuscatory design decisions. Where the objective is to shrink code at the micro level without forcing simplification of the macro-level, underlying structure of the codebase as a whole, it tends to mess with the semantic density and uniformity of the code, making it harder to read. However, I'm arguing against that.

Did you listen to the above presentation and watch my demonstration of the architecture of the compiler? Do you find the architecture hard to understand? Do you find the naming conventions to be arbitrary? Do you find the structure obtuse? The presentation came about specifically to address this very claim.

If you've watched the presentation and feel that the architecture and design of the compiler are hard to follow, and that you would not be able to begin learning the details of the code, where is it hard to understand? I welcome a counterexample of any other compiler on the planet that has a simpler architecture that is easier to describe.

People make the claim that APL is "hard to grok", and they think it's hard to grok because it's concise. I don't think I've ever met someone who understands the syntax and can read APL who believes that the code is hard to grok because of concision.

The problem here is that there is a point at which you cannot reasonably shrink your code any further. For most mainstream programming languages, Notepad readability is meaningless because they couldn't make their source code notepad readable if they tried. See my other comments on the previous threads for my thoughts and definition of readability.

Readability to those who don't "read the language" is, IMO, a bad metric. Readability should be judged by the degree to which programmers working with the source can comprehend and understand the entire system as a whole, and thereby have the best confidence in making large, wide, sweeping changes. I assert that in the trade-off between local readability of a single function and macro readability of the whole system, the trade-off should always be made toward readability of the whole system.

The goal is to be able to work with the code, make changes to the code, and continue to keep the code moving. This means that I should be minimizing technical debt and minimizing the difficulty of making changes.

Put another way: given that my entire core compiler fits on a single computer screen as two columns of large-print text in two Notepad files, I would assert that it is more readable than most other compilers out there. The entire macro architecture is immediately apparent, without any need for additional documentation or diagrams. You don't have to ask yourself, or use search to figure out, where a given design pattern is used, because they are all literally there on your screen at one time.

Some people would say that this is not scalable, except that I have scaled this to a non-trivial compiler, which is designed to self-host on the GPU. Most people would have said this was beyond the capabilities of most current programming techniques and languages.

I would rather be able to see the entire piece of code than have to guess at things, or have a source code that holds my hand through more of the details but which I cannot keep in my head all at once.

If your definition of readability is that the code should be accessible to the uninitiated, who neither know the language nor have experience with the domain or the programming techniques involved, then I agree: a more verbose style is appropriate. That's what I'm writing a dissertation, guides, and videos for. But people who are interested in working with a piece of code require a different kind of readability, one that conflicts with the verbosity required to explain everything to the uninitiated.

This video presentation serves as a simple introduction to the entire working mechanisms of the compiler. It is more or less complete. The only thing it lacks is detailed expositions of each compiler pass and runtime function, which have nothing to do with understanding the architecture of the compiler or reading the code. The linked video presentation should give you everything you would need to work with this code on a daily basis and everything you need to begin studying each individual compiler pass. If it does not, or if you think it is unclear in some way, I would like to improve the code, and would appreciate some specific suggestions of where it fails to be clear.

Cyph0n
I would define the readability of a codebase as the amount of effort required by a newcomer to the language to understand that codebase. This includes learning the language (tutorials, docs), learning how to use the standard library, understanding the dependencies of the codebase in question, and finally figuring out how everything fits together.

For example: how long would it take a Java dev to understand GCC? As an APL dev, you would probably argue that it's not a fair comparison, since Java and C/C++ are much more similar than Java and APL. But that's the whole point: you have to consider the overhead of the paradigm itself (OOP vs. matrix-based) as well as the syntax. Otherwise, you're making inaccurate assumptions about how the majority of programmers learn and what tools they already use.

> Did you listen to the above presentation and watch my demonstration of the architecture of the compiler? Do you find the architecture hard to understand? Do you find the naming conventions to be arbitrary? Do you find the structure obtuse? The presentation came about specifically to address this very claim.

I currently do not have the time to watch it in its entirety, but I've added it to my YouTube queue. However, I'm arguing that APL is hard to read without context. Any language or architecture can be explained by a capable teacher, so the clarity of your presentation of the compiler cannot be used as a good indicator of APL's readability.

> I don't think I've ever met someone who understands the syntax and can read APL who believes that the code is hard to grok because of concision.

Of course you wouldn't. Once you can read Java, you can understand most Java codebases.

> The problem here is that there is a point at which you cannot reasonably shrink your code any further. For most mainstream programming languages, Notepad readability is meaningless because they couldn't make their source code notepad readable if they tried. See my other comments on the previous threads for my thoughts and definition of readability.

That has more to do with the codebase itself rather than the language. But I agree that the language used is a small part of it.

> Readability to those who don't "read the language" is, IMO, a bad metric. Readability should be judged by the degree to which programmers working with the source can comprehend and understand the entire system as a whole, and thereby have the best confidence in making large, wide, sweeping changes. I assert that in the trade-off between local readability of a single function and macro readability of the whole system, the trade-off should always be made toward readability of the whole system.

It's a bad metric once your language is established and programmers who can work with it are in abundance. But a smaller language like APL or Nim needs to be able to attract people who aren't familiar with the syntax. Why, you ask? Well, who is going to develop and maintain useful libraries? Who is going to write tutorials and documentation, or answer SO questions?

In other words, a language needs a community, and the way to grow one is to attract programmers who aren't familiar with the syntax and/or paradigm. Look at what Elixir has done over the past few years. Erlang was a fringe language even though it was technically quite impressive. Elixir has made BEAM mainstream. Now web developers are considering building their apps with Elixir instead of Ruby or Python.

> Some people would say that this is not scalable, except that I have scaled this to a non-trivial compiler[1], which is designed to self-host on the GPU. Most people would have said this was beyond the capabilities of most current programming techniques and languages[2].

I cannot really confirm or deny your claims unless you provide references for [1] and [2]. If your compiler is non-trivial, what do we call clang?

> I would rather be able to see the entire piece of code than have to guess at things, or have a source code that holds my hand through more of the details but which I cannot keep in my head all at once.

And now we go back to community. If you had a more complex compiler (or project in general) where you couldn't fit it all in your head -- even APL has limits! -- you'd need one or more other people to work with you. But since APL is not a readable language (as defined above), finding such people is incredibly difficult relative to other more readable languages. You may not have faced this issue yet, but like I said, every language has limits.

Now, I know that J (or K?) is used heavily in the finance sector, so it seems to be a language that can be used in the real world, which is a good thing. But again, that doesn't make it readable!

> If it does not, or if you think it is unclear in some way, I would like to improve the code, and would appreciate some specific suggestions of where it fails to be clear.

I'll definitely be sure to let you know after I watch your presentation. I have zero experience with APL -- other than watching the infamous Conway's Game of Life demo -- so I think I fit the target audience.

arcfide
Good points here. I think we agree at some level here. Putting aside practical readability, we can focus on "onboarding overhead." You're also right about community.

Obviously moving from C to Java or Java to C++ is going to be easier than moving to APL. We can both agree there, and agree that part of the reason is the relative similarity between Java and C. But it's actually much worse than it appears on the surface. Let me address some of your other points before getting back to this.

Every language and code snippet is hard to read without context. Indeed, some learning theories would argue that context is such an integral part of learning that the absorption of skills and understanding is deeply contextualized and so an isomorphic task may be impossible to someone simply because the task appears in an alien context. But that shared context is HUGE and people don't realize how much they rely on it.

APL is one of the most established languages around. It was designed initially in the 1950s and has been actively used since then. It has always struggled with the computing community, despite its success with non-computer scientists, in part, I argue, because it fundamentally challenges, and has challenged, the perspectives of the status quo. Should APL become less APL-ish simply to become more popular? J was an attempt to preserve the ideas and fundamental concepts without the funny symbols people complained about; it didn't change anything. People continued to dislike it and argue against it, because they are fighting a challenge to their conception of how things are done. This is not so much a marketing problem as a groupthink problem. You cannot provide a new way of looking at something by changing your new way into the old way to make people more comfortable with it. That's just a new name, not a new idea.

The syntax is a fundamentally good contribution to the state of computing; it was, and it still is. Remove that approach, or the underlying principles of Iverson's Notation as a Tool of Thought, and you are no longer contributing to the positive growth of the community's ideas. It is the fundamental idea of APL that people rebel against; they simply use the syntax as a scapegoat.

If you read "notation as a tool of thought" it should become clear that the Elixir example you are presenting, which is one of the most common suggestions for APLers, doesn't work, and there is a reason that you don't see it around.

Syntax matters. That's a fundamental assertion of APL and its community, and it's an obvious conclusion from HCI research, too. It is not possible to experience the full contribution of APL while ignoring its syntax and substituting one that makes you more comfortable. I tried. As a Schemer, I did this. I realized that what I was using had none of the things I valued APL for. By trying to use my own "syntax" for APL, I had lost the benefits and gains that APL provides. Syntax isn't just a method of communicating the ideas of APL; it is one of the fundamental ideas of APL, an expression of foundational principles for communicating computational ideas.

For more on this idea, especially as it relates to the claim that people need "readable languages", please see some papers on direct development, a distinctly APL development methodology. Here are a couple:

http://archive.vector.org.uk/art10009900

http://archive.vector.org.uk/art10500550

Particularly in the case of pair programming with users, they provide direct experiential evidence against the unreadability of APL for the new user. I have also done a small exploratory study on this subject:

https://www.youtube.com/watch?v=kIItfQJEVdM

The tentative conclusion we can draw from the above is that APL is not unreadable to the uninitiated; it is unreadable to computer scientists and programmers who are attached to their programming languages.

APL has a strong history of being readable and usable by the uninitiated non-programmer. It has a horrible reputation among programmers who regularly use other programming languages.

And this brings us back to the first point. Our entire computing culture, from initial education onwards, teaches us that APL's very ideas are "hard to use" and hard to understand. It's not just that APL lacks a shared context with Java and C; it's that the foundational execution model used when thinking about APL programs lacks a shared context with the rest of traditional computing society. Everything you learn in school about programming tells you that you should think one way about code, and APL proceeds to completely marginalize all of the things that you are used to considering important.

It is not hard to learn or use APL. People do so regularly and without struggle. It is hard to reject a way of thinking about the world that is constantly reinforced by C-style programming models for sequential processors.

It is not hard to read; it is hard, or impossible, to fit into the boxes that most programmers use when evaluating things. To use a bit of hyperbole: give up everything you think you know about computing, and then APL is easy to read and learn.

It's akin to an assembly language programmer being told to program in Agda, and that Agda really will give them a new, interesting, and useful way of looking at the world. But then imagine everyone else being ASM programmers too, and schools never teaching anything about Agda, only ASM (maybe one teaches PowerPC, one x86, but otherwise they're basically the same, with slightly different instructions and assembler syntax). Agda uses these weird Greek symbols, there are no instructions, no one provides a simple Agda-to-ASM Rosetta stone, and no one explains Agda in ASM-programmer concepts so that we can understand the essence of Agda. And so on and so forth.

The evidence and research suggests that APL is distinctly hard for existing programmers to understand and learn, but not for most everyone else.

Cyph0n
Thanks for the informative response. Your lecture looks very interesting, so I'll definitely be watching it. I think that the topic of CS education and rethinking the fundamentals is a very interesting problem.

I really enjoyed this discussion, so thanks again.

arcfide
Good luck, and please feel free to provide feedback. I'm always looking for ways of improving the situation or approaching it from a different angle.
Karrot_Kream
Thank you both for a thread of informative, intellectual, and highly civil discussion. If only more of the internet engaged in this form of discourse.
arcfide
This question has been presented to me about this research, and the presentation I give in the video above is actually a demonstration implicitly addressing just such an issue. If you watch the video, you'll see some examples of how the "one-lining" features of the language assist me in exploration and discovery. I also go over how I leverage the concise nature of the notation to keep my entire compiler visible on the screen at once.

You can ask yourself, after seeing how I put this compiler together and how it is architected, whether the features that I consider important here and how they help me work with the code, could be achieved using a different syntax. In fact, the code for the core compiler is only about 90 lines long, so you could just take that code and invent any syntax you want and then try it out and see for yourself whether you can improve the syntax in some practical way that preserves the important elements of the design and makes it easier for people to read/learn/absorb.

As a simple first step, take the code that I demo in the video (the 90-line compiler) and replace each symbol with its name, with spaces on the left and right of the name. Then look at the code again and ask yourself if it's any easier to read. My experience is that it will be fundamentally harder to read. Not just harder, but that it will get progressively harder to read the more familiar you are with the language, relative to the other notation. It's not just harder by a constant factor: the more you work with the language, the more difficult it becomes to work with these "less obtuse" syntaxes in these situations.

kbenson
> My experience is that it will be fundamentally harder to read. Not just harder, but that it will get progressively harder to read the more familiar you are with the language, relative to the other notation.

This fits my experience, and I've brought up that point many times here. There's a trade-off languages make in accessibility that often reduces usability for experts. If everyone is already an expert, or there's some other compelling reason people will use your language (e.g. C and UNIX), then you can get away with being less accessible (and C isn't accessible, it just seems that way to some because we all either learn it or some language that shares many features with it).

On one end of the spectrum, you have BASIC/Visual Basic, where learning the basics is easy and accomplishing simple things is simple, but accomplishing hard things is hampered by some of those same features that promote accessibility, and experts feel constrained. Adoption is high though, because people feel productive quickly.

On the other end of the spectrum, you have APL and its ilk, which are fairly inaccessible to novices and require a lot of time to become fluent in. After you've spent that time, easy things are easy, and hard things are also relatively easy (within the bounds of the domain you are targeting).

I come to this from Perl, which is interesting in that it's actually situated on both ends of the spectrum. It has a very accessible syntax (except for a few well-known warts) that lets people familiar with Algol-like languages use it for simple things easily, without grasping some of the fundamental aspects of the language (context). Unfortunately, without continuing to learn those fundamental Perl concepts, they will run up against what seem like very odd and annoying design choices, which cause friction. That's not to explain away all of Perl's design problems, because there are a few, but fundamentally what I see is a lot of people who really haven't spent the time to understand the language, and so speak from a place of misunderstanding.

This has actually always made me want to learn APL. I'm interested in learning languages that have a somewhat high learning curve, to see how well it pays off in the end. I suspect it pays off quite well.

jacquesm
This is why we don't start people out on chainsaws or table saws but with simple hand operated saws.

But cutting down a tree with a hand-operated saw is a painful operation (even if it can be done; see the era of logging before chainsaws), and making very straight cuts without the aid of a table saw and a guide is similarly painful.

APL is a power tool; BASIC is a hand tool.

arcfide
I'm not convinced of this. See my talk:

https://www.youtube.com/watch?v=kIItfQJEVdM

jacquesm
That's likely because, for you, starting out on power tools would have worked. But as modest as you are in this thread, you are - to me at least - clearly extremely gifted, and what works for you may not work for us mortals.

Or maybe you only appear to be extremely gifted because of the tools that you use but that's a hard one to prove without a large and preferably controlled experiment.

arcfide
:-) Thanks, but see my talk above. I'm slowly attempting to build just such controlled experiments. My first exploratory study on such is discussed in the above talk.
jacquesm
It's 6 am here and I'm still awake; I will watch your talk after I've done some sleeping and the day's work. Thanks for the link, and thank you for all your work - super interesting stuff.
kbenson
What I find interesting is that if we move slightly beyond BASIC, we get to a middle ground, where things like Python live. The question then becomes: at what point in the learning cycle do Python's choices in accessibility become liabilities, if ever? Does optimizing your code for the legibility of novices end up hurting you at some point? What if society shifts and the majority of people have some programming experience? What if the journeyman of today is tomorrow's novice? Does Python end up being tomorrow's BASIC?

This is of course focusing on specific aspects of Python, and not Python as a whole, which I think is a very capable language (if not my cup of tea).

jacquesm
> The question then becomes, at what point in the learning cycle do Python's choices in accessibility become liabilities, if ever?

Fairly quickly, in my experience. Muddled data types, all kinds of conversion rules and non-obvious trickery required to get the job done.

Ok, you do get the job done but it almost never feels clean.

arcfide
My take is that, yes, people don't like APL, and are biased against it. It's a case of not understanding/knowing what they don't know. I was in the same boat, which is why my discovery of APL was so interesting.

APL on the GPU isn't really a "big deal" to most people either, because they think they could do the same thing with a library. I'm taking it a step further than that.

The biggest reason, I think, that people are slightly (but not totally) more receptive to the work I'm doing in my own research field is because I'm almost literally doing the impossible. And that's sort of the barrier to entry. Until you demonstrate that you can do the impossible, they don't take you seriously. In my case, it's producing a compiler that people have said was practically impossible to do, and doing it with APL, in such a way that it becomes evident that APL is a key contributing factor to my ability to solve the problem.

Basically, what I've done is create a compiler that is natively expressed in a format/style that is as compatible with the GPU as C or C++ is with the CPU. There is no one else who has done this, or even, to my knowledge, made the attempt. And unfortunately, it's that madness of the attempt that makes the whole thing interesting.

It's easier for me to convince people (on some level), because if you ask them to imagine how they would do this with their own language, they can't. With almost every other domain, people can convince themselves that they could do the same thing more productively in their own language, even if evidence of people trying and failing suggests otherwise.

Another dimension to this problem is that APL is usually presented over domains that seem "obviously suited" to APL, and that makes people think of it as a niche domain-specific language. The field of compilers is an obviously not-APL-friendly domain, and yet the results are just as compelling, and thus more interesting to people.

BuuQu9hu
I don't know any languages in the APL family. Which should I learn and why? Which should I definitely avoid and why?
arcfide
Among the APL family of languages, the ones I would recommend are Dyalog APL and J. J is great for some of the ideas it has implemented and for gaining a flavor of Iverson's vision of APL as it evolved. Dyalog APL is, IMO, more practical, and we are quickly integrating as much of J into APL as is reasonable.

Personally, I am highly biased towards using the APL symbols, and against getting "sucked in" to the idea that using ASCII symbols is easier. I think the APL symbols provide important visual and mnemonic information that actually makes learning better over time than trying to use ASCII. There is built-in APL keyboard support on Linux, and Dyalog APL comes with a good IME for Windows and Mac, so there's no reason not to use the symbols anymore.

I don't know that you should avoid any of them, but I would encourage you to use dfns if you can (see the short example below). Using a lexically scoped syntax is going to give you a better idea of how "good modern" APL is written. A lot of APLs don't include the J innovations, or they lack structured programming features or lexical scope, which is a bummer.

So, overall, go Dyalog. I'm highly biased.
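
For readers who haven't seen dfns before, here is a minimal sketch in Dyalog APL; the names mean and norm are just illustrative:

      mean ← {(+/⍵)÷≢⍵}             ⍝ a dfn: ⍵ is the right argument
      mean 1 2 3 4
2.5
      norm ← {m←mean ⍵ ⋄ {⍵-m}¨⍵}   ⍝ the inner dfn captures m lexically
      norm 1 2 3 4
¯1.5 ¯0.5 0.5 1.5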

jxy
I would be sold if Dyalog had a graphical environment equivalent to J's jqt, something in which you can at least plot data. Everything I can find for Dyalog is .NET and Windows. RIDE seems to be the right step forward, but it is still much inferior to jhs.

On the other hand, Dyalog can't beat J on being open source, but I guess that's something much more difficult to change.

arcfide
One problem with cross-platform graphical environments, IMO, is that they fail the integration standards that I've come to expect from GUI apps. I want Mac apps to feel like Mac apps, Linux apps like Linux apps, and Windows apps like Windows apps. This means making heavy use of each platform's specific interaction paradigms to achieve the best level of integration. That isn't possible with shared graphical environments.
jxy
RIDE is already quite good. It is simple and effective, and in a sense better than Jupyter. Throw in some D3 and they would get the graphical capabilities, just like jhs does.

A commercial example? Mathematica.

Imagine a Mathematica clone with APL...

arcfide
I'm actually designing something better. :-) It's probably a few years out from being truly production ready (maybe more; who am I kidding, probably more), but it's definitely something cool.
bjterry
I think this expectation has mostly dropped away in the last couple of years. A lot of apps now have interfaces inspired by mobile apps (e.g. Airmail on Mac OS X acts like an iPad app) or by web apps (all the Electron apps). Obviously this isn't to say that a personal dislike for it is wrong; I just think that it's becoming less important in the market.
arcfide
This might be true for standard consumer applications that can fit within the web paradigm. However, you then fail to be able to leverage vendor-specific interaction paradigms, which is something I want to be able to do: the Mac touch strip, the Windows Surface Dial and Pen, or the Gnome title bar/notifications/messaging interface. I like to interact with voice services, so you need something for that as well, along with handling the various HiDPI features of each system.
arcfide
Co-dfns is Open Source, and funded/licensed by Dyalog. They also produce MiServer and a host of other tools as Open Source. The problem is that they have to fund that work somehow, and I've yet to see a business model for Open Source interpreters and compilers that works well enough for Dyalog to take the risk. If I can successfully market and fund Co-dfns through an open community funding model, I think it would go a long way towards encouraging Dyalog in openness (they are already quite open to it).

Dyalog does have a Windows windowing environment, as well as their web framework, for designing interfaces. They are actively working to improve their cross-platform situation with things like RIDE, MiServer, and the like.

As long as you can live without GUI plotting, you can use the shipped plotting and documentation tools (SharpPlot?), which give you graphing and plotting capabilities and produce images that you can open in a viewer. This is what I use when generating graphs and plots on Linux. SharpPlot, or the Plot Wizard, is very nice in the Windows IDE, but the purely command-line version works great on Linux and Mac.
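
As a rough illustration of that headless workflow (the class and method names below, Causeway.SharpPlot, DrawLineGraph, and SaveSvg, are written from memory and should be verified against the SharpPlot documentation shipped with Dyalog):

      ⍝ illustrative sketch only; verify the names against the SharpPlot docs
      ⎕USING←'Causeway,sharpplot.dll'
      sp←⎕NEW SharpPlot
      sp.DrawLineGraph ⊂1 3 2 5 4                 ⍝ a simple line graph
      sp.SaveSvg 'plot.svg' SvgMode.FixedAspect   ⍝ open the SVG in any viewer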

Cross platform GUI applications for Dyalog are best done either with individually coded backends or making it a hosted web application through the MiServer Web Framework, which has some support for doing this.

jxy
Open source is a chicken-and-egg problem. If you want success stories, look at C/C++/Fortran/Perl/Python/R/Julia, old and new. I don't have a solution. I just blame IBM and all the others who monetized APL over a short dozen years and didn't manage to hold their ground.

In an alternate universe we would see current fortran users writing APL.

More realistically, had anyone written an APL compiler that got some applications running on any of the top 50 supercomputers at even one percent of peak performance, APL would still have a chance.

arcfide
We're working on it! We're getting close already. If you have a naively parallel algorithm that lends itself to CSP-style decomposition, then you can compile your individual unit programs using Co-dfns and then use Dyalog APL Isolates to distribute the bulk computation across a distributed computing cloud. It's more work than it should be right now, but it does work for the right kinds of problems.
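
A hedged sketch of what that can look like today: the Isolate calls here (isolate.New and parenthesized remote evaluation returning futures) are written from memory, so consult Dyalog's Isolate documentation for the exact interface, and imagine the remote expression standing in for a unit program compiled with Co-dfns.

      ⍝ assumption-heavy sketch; check the Isolate docs for exact syntax
      )copy isolate                ⍝ the Isolate workspace
      is ← isolate.New¨ 4⍴⊂''      ⍝ four fresh isolates (worker processes)
      f ← is.(+/2*⍨⍳1e6)           ⍝ each evaluates asynchronously; f holds futures
      +/f                          ⍝ touching the values blocks until all are done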
arcfide
I also don't know if I would consider C/C++ or Python or the others success stories. They all have an ecosystem around them that allows for developers to be funded doing work on these projects, but only because of other products that are out there. Other products produce money, and are making enough to allow some companies and individuals to fund core technology improvements. That's different and a harder model to work with, because it requires massive scale to be effective.
jxy
APL had 50 years.

Julia had 5.

RodgerTheGreat
If you want polished tooling, support and documentation, learn Dyalog. The company freely distributes great learning materials and trial licenses. Apart from the fact that you will have to (eventually) learn a new keyboard layout, it's a pretty beginner-friendly choice.

If you want to get a job working with a vector language, learn Q. It's very popular in finance these days. If you live in the right cities and you have a strong grasp of KDB+ you can practically name your price.

If you want something simple and beautiful: learn K. It's the most aggressively compact (in terms of scope of language features as well as concrete on-disk size) in the APL family. The whole language fits in your head easily. There are deep similarities between the dfns dialect of Dyalog and K. (It's also my personal favorite.)

If you want to see features you haven't been exposed to before, learn J. It's probably the most alien-looking of these four (Dyalog's special symbols notwithstanding), but it's overflowing with curious and interesting ideas. Rich support for tacit programming. An elaborate numeric tower. A primitive for expressing arbitrary state machines. Much, much more.

dang
This discussion originated here: https://news.ycombinator.com/item?id=13565743

...and the live stream offered there happened here: https://news.ycombinator.com/item?id=13638086.

We asked the author to do one more thread about it because the latter didn't get much attention, and we thought you might find it interesting. It's rare for work this technically sophisticated to also be this deeply counterintuitive.

gankedfrank
The one time you abuse censorship for something positive! Thanks! APL is pretty neat.
arcfide
Thanks for giving me a chance to chat a bit more!
FullyFunctional
Thanks for sharing all these insights, it's definitely an eye opener, both regarding APL as well as conventional wisdom.
Feb 13, 2017 · 10 points, 6 comments · submitted by arcfide
RodgerTheGreat
This has been very informative so far; looking forward to reviewing the footage later to see the parts I missed!

Do you have time to walk us through how one of the passes works step by step? Feel free to choose whichever you think is shortest or easiest to explain.

RodgerTheGreat
Thanks!
fourier
What is your workflow while developing the compiler?

- Are you experimenting with a particular function in the REPL/interpreter and then adding it to the code?

- How do you debug the code?

- How do you test it? What are the test suites, etc.?

fourier
yes thanks!
arcfide
Here are some things that I'll try to remember to get to:

1. Disposable code
2. History of the design of the compiler, including all the other stuff I've tried
3. My own programming background
4. Comments in the code
5. Smaller code vs. readable code, and quality vs. size
6. Abstractions and their proper use
7. The aesthetics of programming

arcfide
This is the live stream that was talked about here:

https://news.ycombinator.com/item?id=13565743

I'll be doing this in a sort of AMA style. Please ask your technical questions here. Please use the YouTube Live chat to discuss video, stream quality, or audio questions or comments.
